Collapse duplicated row values and pivot the non-duplicated values wider with dplyr

  • TheGoat  · 4 years ago

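    I want to combine rows that share the same values in some columns (ID:Var2) but differ in others (Var3:var4), and spread the differing values into separate columns according to the Var5 variable.

    My data looks like this:

    foo <- data.frame(ID = c(1,1,2,2), Var1 = c("A","A","C","C"), Var2 = c("B","B","D","D"), Var3 = c("X","Y","Z",NA), var4 = c("S","T","U","V"), Var5 = c("RF","SJ","RF","SJ"))
    

    I would like my data to look like this:

    bar <- data.frame(ID = c(1,2), Var1 = c("A","C"), Var2 = c("B","D"), RF_Var3 = c("X","Z"), RF_Var4 = c("S","U"), SJ_var3 = c("Y",NA), SJ_Var4 = c("T","V"))
    

    It is important that the SJ variables and the RF variables are ordered together, so that all columns for one Var5 value sit side by side.

    Any help would be much appreciated.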

  •   ThomasIsCoding    4 years ago

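    A dplyr option (the pivoting functions come from tidyr):

    library(dplyr)
    library(tidyr)

    foo %>%
      # stack Var3 and var4 into name/value pairs
      pivot_longer(cols = Var3:var4) %>%
      arrange(ID, Var1, Var2, Var5) %>%
      # build the new column names from Var5 and the original variable name
      pivot_wider(id_cols = ID:Var2, values_from = value, names_from = c(Var5, name))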
    

    which gives

         ID Var1  Var2  RF_Var3 RF_var4 SJ_Var3 SJ_var4
      <dbl> <chr> <chr> <chr>   <chr>   <chr>   <chr>
    1     1 A     B     X       S       Y       T
    2     2 C     D     Z       U       <NA>    V
    

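    Similarly, a data.table option with dcast might also help. Note that setDT() converts foo to a data.table by reference.

    library(data.table)

    dcast(
      melt(setDT(foo), id.vars = c("ID", "Var1", "Var2", "Var5"))[order(ID, Var1, Var2, Var5)],
      ID + Var1 + Var2 ~ Var5 + variable,
      value.var = "value"
    )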
    

    which gives

       ID Var1 Var2 RF_Var3 RF_var4 SJ_Var3 SJ_var4
    1:  1    A    B       X       S       Y       T
    2:  2    C    D       Z       U    <NA>       V
    
  •   nyk    4 years ago

    library(tidyverse)
    
    # Note: this uses Var4 with a capital V, which differs slightly from the var4 in your sample data
    foo %>% pivot_wider(id_cols = c(ID:Var2), values_from = Var3:Var4, names_from = Var5, names_glue = "{Var5}_{.value}")
    
    # A tibble: 2 x 7
         ID Var1  Var2  RF_Var3 SJ_Var3 RF_Var4 SJ_Var4
      <dbl> <chr> <chr> <chr>   <chr>   <chr>   <chr>  
    1     1 A     B     X       Y       S       T      
    2     2 C     D     Z       NA      U       V      
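
The output above places the two Var3 columns before the two Var4 columns, whereas the question asks for the RF and SJ columns to be grouped together. A minimal sketch of one way to get that ordering from a single `pivot_wider()` call follows. It assumes tidyr 1.2.0 or later, where the `names_vary` argument is available, and it uses the lowercase `var4` from the question's sample data.

```r
library(tidyr)

# Sample data copied from the question (note the lowercase `var4`)
foo <- data.frame(
  ID   = c(1, 1, 2, 2),
  Var1 = c("A", "A", "C", "C"),
  Var2 = c("B", "B", "D", "D"),
  Var3 = c("X", "Y", "Z", NA),
  var4 = c("S", "T", "U", "V"),
  Var5 = c("RF", "SJ", "RF", "SJ")
)

# names_vary = "slowest" keeps all value columns for one Var5 level
# next to each other (assumption: tidyr >= 1.2.0 is installed)
res <- pivot_wider(
  foo,
  id_cols     = ID:Var2,
  names_from  = Var5,
  values_from = c(Var3, var4),
  names_glue  = "{Var5}_{.value}",
  names_vary  = "slowest"
)

names(res)
```

With the default `names_vary = "fastest"`, the same call would produce the Var3/Var4 grouping shown in the output above.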
  •   hello_friend    4 years ago

    Here is a multi-step base R approach; I am sure it could also be done with reshape():

    # Unique combinations of the identifying columns that the transposed
    # data is keyed by: dimensions => data.frame
    dimensions <- unique(foo[,1:3])
    
    # Name of the column whose values become part of the new column names:
    # transpose_vec => character scalar
    transpose_vec <- "Var5" 
     
    # Names of the columns whose values are spread across the new columns: 
    # vecs_to_spread => character vector
    vecs_to_spread <- setdiff(names(foo), c(names(dimensions), transpose_vec))
    
    # Pair each value of the transpose column with each column to spread,
    # giving the new column names: tpd_foo_vecs => character vector
    tpd_foo_vecs <- unique(paste(rep(foo[,transpose_vec], each = length(vecs_to_spread)), 
                                 vecs_to_spread, sep = "_"))
    
    # Transpose the columns to spread and flatten the values row by row: 
    # vals => character vector
    vals <- c(t(foo[,vecs_to_spread]))
    
    # Name the values with the new column names, recycled once per group
    # (this relies on every ID having its Var5 values in the same order):
    # named_vals => named character vector
    named_vals <- setNames(vals, rep(tpd_foo_vecs, length(vals)/length(tpd_foo_vecs)))
    
    # Split the vector into a list by name, turn it into a data.frame and
    # column bind it with the dimensions: res => data.frame
    res <- cbind(dimensions, data.frame(split(named_vals, names(named_vals))))
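
The answer above mentions that reshape() could do the same job. A rough sketch of that idea is given here, under the assumption that each combination of ID, Var1 and Var2 has at most one row per Var5 value. The names `wide`, `lvls` and `pattern` are introduced only for illustration. reshape() names the new columns as variable first and Var5 value second, so the sketch swaps the two parts afterwards to match the naming used in the question.

```r
# Sample data copied from the question (note the lowercase `var4`)
foo <- data.frame(
  ID   = c(1, 1, 2, 2),
  Var1 = c("A", "A", "C", "C"),
  Var2 = c("B", "B", "D", "D"),
  Var3 = c("X", "Y", "Z", NA),
  var4 = c("S", "T", "U", "V"),
  Var5 = c("RF", "SJ", "RF", "SJ")
)

# Long to wide: one row per ID/Var1/Var2, one block of columns per Var5 value
wide <- reshape(
  foo,
  idvar     = c("ID", "Var1", "Var2"),
  timevar   = "Var5",
  direction = "wide",
  sep       = "_"
)

# reshape() produces names such as Var3_RF; move the Var5 value to the front
lvls    <- paste(unique(foo$Var5), collapse = "|")
pattern <- paste0("^(.+)_(", lvls, ")$")
names(wide) <- sub(pattern, "\\2_\\1", names(wide))

# Drop the original row names left over from foo
rownames(wide) <- NULL
wide
```

Because reshape() emits the columns for one Var5 value before moving on to the next, the RF and SJ columns should come out grouped together without extra sorting.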