Unnest two vector columns

  •  1
  • Nick Criswell  · Tech Community  · 6 years ago

    I have a tibble with two columns. Each column holds ordered latitude and longitude pairs. The structure is as follows:

    library(dplyr)
    library(tidyr)
    
    > my_df
    # A tibble: 3 x 2
      V1        V2       
      <list>    <list>   
    1 <dbl [2]> <dbl [2]>
    2 <dbl [2]> <dbl [2]>
    3 <dbl [2]> <dbl [2]>
    
    my_df = structure(list(V1 = list(c(44.0252714, -88.1536451), c(42.9856117, 
    -87.9355419), c(42.8600366, -87.9541568)), V2 = list(c(44.9535298, 
    -90.9188588), c(45.4864422, -89.7339536), c(43.0743635, -87.9765372
    ))), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"
    ))
    

    I would like to turn it into a four-column data frame that looks like this:

    > my_df2
            y1        x1       y2        x2
    1 44.02527 -88.15365 44.95353 -90.91886
    2 42.98561 -87.93554 45.48644 -89.73395
    3 42.86004 -87.95416 43.07436 -87.97654
    

    I tried using unnest from tidyr, but without success.

    > my_df %>% unnest()
    # A tibble: 6 x 2
         V1    V2
      <dbl> <dbl>
    1  44.0  45.0
    2 -88.2 -90.9
    3  43.0  45.5
    4 -87.9 -89.7
    5  42.9  43.1
    6 -88.0 -88.0
    
    > my_df %>% unnest(V1, V2)
    # A tibble: 6 x 2
         V1    V2
      <dbl> <dbl>
    1  44.0  45.0
    2 -88.2 -90.9
    3  43.0  45.5
    4 -87.9 -89.7
    5  42.9  43.1
    6 -88.0 -88.0
    

    I need some way to control how the unnesting happens, but I don't know how to do that.

    3 replies  |  until 6 years ago
        1
  •  3
  •   s_baldur    6 years ago

    Here is a trick that first converts each vector to a string:

    my_df %>%
      rowwise() %>%
      mutate_all(funs(toString(.))) %>%
      separate(V1, c("y1", "x1"), ", ") %>%
      separate(V2, c("y2", "x2"), ", ") %>%
      mutate_all(funs(as.numeric(.)))
    # A tibble: 3 x 4
         y1    x1    y2    x2
      <dbl> <dbl> <dbl> <dbl>
    1  44.0 -88.2  45.0 -90.9
    2  43.0 -87.9  45.5 -89.7
    3  42.9 -88.0  43.1 -88.0
    

    Edit: a more base-R approach:

    my_df2 <- 
      do.call(cbind, lapply(my_df, function(x) do.call(rbind, x))) %>% 
      as_tibble()   # as.tibble() is deprecated in favor of as_tibble()
    names(my_df2) <- c("y1", "x1", "y2", "x2")
    my_df2
    # A tibble: 3 x 4
         y1    x1    y2    x2
      <dbl> <dbl> <dbl> <dbl>
    1  44.0 -88.2  45.0 -90.9
    2  43.0 -87.9  45.5 -89.7
    3  42.9 -88.0  43.1 -88.0
    
        2
  •  2
  •   AntoniosK    6 years ago

    One possible solution is:

    library(dplyr)
    library(tidyr)
    
    my_df %>% 
      rowwise() %>%                                  # for each row
      mutate_all(funs(list(data.frame(t(.))))) %>%   # transpose your vector and create a dataframe
      unnest()                                       # unnest
    
    # # A tibble: 3 x 4
    #      X1    X2   X11   X21
    #   <dbl> <dbl> <dbl> <dbl>
    # 1  44.0 -88.2  45.0 -90.9
    # 2  43.0 -87.9  45.5 -89.7
    # 3  42.9 -88.0  43.1 -88.0
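
    The answers above rely on funs() and a bare unnest(), both of which may warn or fail on newer dplyr and tidyr releases. As a rough sketch only, assuming tidyr 1.0.0 or later is installed, unnest_wider() with names_sep can spread each unnamed vector into its own columns. The intermediate names (V1_1, V1_2, ...) come from tidyr; the final y1/x1/y2/x2 names are assigned by position to match the question, and the variable name wide is made up for this illustration.

```r
library(tidyr)

# Same data as in the question, rebuilt so the snippet is self-contained.
my_df <- tibble::tibble(
  V1 = list(c(44.0252714, -88.1536451), c(42.9856117, -87.9355419),
            c(42.8600366, -87.9541568)),
  V2 = list(c(44.9535298, -90.9188588), c(45.4864422, -89.7339536),
            c(43.0743635, -87.9765372))
)

# Spread each list column into one column per vector element.
# names_sep is needed because the vectors have no element names.
wide <- unnest_wider(my_df, V1, names_sep = "_")
wide <- unnest_wider(wide, V2, names_sep = "_")

# Rename by position: latitude is y, longitude is x, as in the question.
names(wide) <- c("y1", "x1", "y2", "x2")
wide
```

    Calling unnest_wider() once per column keeps the sketch independent of whether the installed tidyr version accepts several columns in a single call.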
    
        3
  •  1
  •   Frank    6 years ago

    data.table::transpose ...

    rn_df = function(x, suff = 0, cols = c("x","y"))
      setNames(x, paste0(cols, suff))
    
    my_df %>% 
      lapply(data.table::transpose) %>%
      unname %>%
      Map(rn_df, ., seq_along(.)) %>%
      unlist(recursive=FALSE) %>% 
      data.frame
    
            x1        y1       x2        y2
    1 44.02527 -88.15365 44.95353 -90.91886
    2 42.98561 -87.93554 45.48644 -89.73395
    3 42.86004 -87.95416 43.07436 -87.97654
    

    This should extend to more columns in the original my_df, assuming they all turn into columns with the same naming pattern.
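
    To illustrate that naming pattern with more than two list columns, here is a minimal base R sketch. It is a swap-in for the data.table pipeline above rather than the same code, and it makes some assumptions: a plain named list stands in for the tibble, the V3 column and its values are invented purely for illustration, and cols3, mats and wide are hypothetical names.

```r
# Two pairs per column; V3 is made-up data, not from the question.
cols3 <- list(
  V1 = list(c(44.0252714, -88.1536451), c(42.9856117, -87.9355419)),
  V2 = list(c(44.9535298, -90.9188588), c(45.4864422, -89.7339536)),
  V3 = list(c(10, 20), c(30, 40))
)

# Stack each list column into an n x 2 matrix, one row per pair.
mats <- lapply(cols3, function(col) do.call(rbind, col))

# Bind the matrices side by side into a single data frame.
wide <- as.data.frame(do.call(cbind, mats))

# Names follow the x<i>, y<i> pattern, where i is the column position.
names(wide) <- paste0(c("x", "y"), rep(seq_along(cols3), each = 2))
wide
```

    The suffix is driven by seq_along(), so adding a fourth or fifth list column needs no change to the naming step.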


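    In my opinion, you would be better off keeping the data in long format rather than generating column names that you will have to parse later: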

    library(data.table)
    res = my_df %>% lapply(. %>% (data.table::transpose) %>% setDT) %>%
      rbindlist(id = "src") %>%
      setnames(-1, c("x", "y"))
    
       src        x         y
    1:  V1 44.02527 -88.15365
    2:  V1 42.98561 -87.93554
    3:  V1 42.86004 -87.95416
    4:  V2 44.95353 -90.91886
    5:  V2 45.48644 -89.73395
    6:  V2 43.07436 -87.97654
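
    One thing the long table does not record is which original row each pair came from. If that matters, a row id can be carried along. The following is only a base R sketch under assumptions, not the data.table code above: a plain named list (cols) stands in for my_df, and the row column and the long variable are names introduced here for illustration.

```r
# Same values as in the question, held in a plain named list.
cols <- list(
  V1 = list(c(44.0252714, -88.1536451), c(42.9856117, -87.9355419),
            c(42.8600366, -87.9541568)),
  V2 = list(c(44.9535298, -90.9188588), c(45.4864422, -89.7339536),
            c(43.0743635, -87.9765372))
)

# Build one block per source column, then stack the blocks.
long <- do.call(rbind, lapply(names(cols), function(nm) {
  m <- do.call(rbind, cols[[nm]])        # n x 2 matrix, one row per pair
  data.frame(src = nm,
             row = seq_len(nrow(m)),     # position in the original table
             x = m[, 1], y = m[, 2],     # same x/y labels as the answer
             stringsAsFactors = FALSE)
}))
long
```

    With src and row both present, pairs from V1 and V2 that belonged to the same original row can be matched up again without parsing column names.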