
R data.table: joining multiple columns in place (by reference)

  •  1
  • Nick Criswell  · Tech Community  · 6 years ago

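    data.table is awesome. Following on from this question, I want to add several columns from one table to another in a join, by reference.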

    library(data.table)
    dt1 <- data.table(col1 = c("a", "b", "c"), 
                      col2 = 1:3, 
                      col3 = c(TRUE, FALSE, FALSE))
    
    setkey(dt1, col1)
    
    set.seed(1)
    dt2 <- data.table(col1 = sample(c("a", "b", "c"), size = 10, replace = TRUE), 
                      another_col = sample(1:10, size = 10, replace = TRUE), 
                      and_anouther = sample(c(TRUE, FALSE), size = 10, replace = TRUE))
    
    setkey(dt2, col1)
    
    # I want to stick the columns from dt1 onto dt2
    
    # this works
    dt3 <- dt2[dt1]
    dt3
        col1 another_col and_anouther col2  col3
     1:    a           9        FALSE    1  TRUE
     2:    b           2        FALSE    2 FALSE
     3:    b           9        FALSE    2 FALSE
     4:    b           6        FALSE    2 FALSE
     5:    b           5         TRUE    2 FALSE
     6:    b           8        FALSE    2 FALSE
     7:    c           9         TRUE    3 FALSE
     8:    c           5        FALSE    3 FALSE
     9:    c           7        FALSE    3 FALSE
    10:    c           6        FALSE    3 FALSE
    
    # but I want to do this by reference
    
    # this works for one column
    dt2[dt1, col2 := i.col2]
    dt2
    
        col1 another_col and_anouther col2
     1:    a           3        FALSE    1
     2:    a           8         TRUE    1
     3:    a           8         TRUE    1
     4:    b           2         TRUE    2
     5:    b           7        FALSE    2
     6:    b          10         TRUE    2
     7:    b           4        FALSE    2
     8:    c           4         TRUE    3
     9:    c           5         TRUE    3
    10:    c           8         TRUE    3
    
    # ok, remove that column
    dt2[, col2 := NULL]
    
    # now try to join multiple columns 
    # this doesn't work: it is a syntax error, because ( ) cannot hold two comma-separated expressions
    dt2[dt1, (col2 := i.col2, 
              col3 := i.col3)]
    
    # neither does this: := cannot be nested inside .()
    dt2[dt1, .(col2 := i.col2, 
              col3 := i.col3)]
    
    # this just gives me the two columns
    dt2[dt1, .(col2 = i.col2, 
               col3 = i.col3)]
    dt2
       col2  col3
     1:    1  TRUE
     2:    1  TRUE
     3:    1  TRUE
     4:    2 FALSE
     5:    2 FALSE
     6:    2 FALSE
     7:    2 FALSE
     8:    3 FALSE
     9:    3 FALSE
    10:    3 FALSE  
    
    

    Created on 2018-10-30 by the reprex package

    Basically, I want to end up with the result in dt3, but by updating dt2 by reference. Thanks!

    2 replies  |  as of 6 years ago
        1
  •  7
  •   Nick Criswell    6 years ago

    I should have looked at one more question and this awesome reference. All I had to do was use the := operator in its functional form.

    dt2[dt1, `:=`(col2 = i.col2,
                  col3 = i.col3)]
    
    dt2
        col1 another_col and_anouther col2  col3
     1:    a           3        FALSE    1  TRUE
     2:    a           8         TRUE    1  TRUE
     3:    a           8         TRUE    1  TRUE
     4:    b           2         TRUE    2 FALSE
     5:    b           7        FALSE    2 FALSE
     6:    b          10         TRUE    2 FALSE
     7:    b           4        FALSE    2 FALSE
     8:    c           4         TRUE    3 FALSE
     9:    c           5         TRUE    3 FALSE
    10:    c           8         TRUE    3 FALSE
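
    When there are many columns to bring over, typing each col = i.col pair gets tedious. The following is a minimal sketch of one way to build the assignment from a vector of column names. The tables lookup and target and the vector cols are made up for this illustration and are not from the question. It assumes the mget(paste0("i.", cols)) idiom, which fetches the i.-prefixed columns by name, behaves in your data.table version as it does in recent releases.

```r
library(data.table)

# Small deterministic tables (hypothetical, for illustration only)
lookup <- data.table(col1 = c("a", "b", "c"),
                     col2 = 1:3,
                     col3 = c(TRUE, FALSE, FALSE))
target <- data.table(col1 = c("c", "a", "b", "a"),
                     another_col = c(10L, 20L, 30L, 40L))

# Columns to copy over: everything in lookup except the join column
cols <- setdiff(names(lookup), "col1")

# Update join: each row of target receives the values from the matching
# row of lookup. (cols) on the left is evaluated as a character vector of
# column names; mget() on the right returns the i.-prefixed columns as a list.
target[lookup, (cols) := mget(paste0("i.", cols)), on = "col1"]

print(target)
```

    Because on = is used instead of setkey(), target keeps its original row order; only the new columns are added.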
    
        2
  •  4
  •   Anonymous coward    6 years ago

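    The functional syntax is more concise than the standard approach. For comparison, here is the standard LHS := RHS form: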

    # the i. prefix makes sure the values come from dt1, even if dt2 already has col2 or col3
    dt2[dt1, c("col2", "col3") := .(i.col2, i.col3), on = "col1"][order(col1)]
    
        col1 another_col and_anouther col2  col3
     1:    a           3        FALSE    1  TRUE
     2:    a           8         TRUE    1  TRUE
     3:    a           8         TRUE    1  TRUE
     4:    b           2         TRUE    2 FALSE
     5:    b           7        FALSE    2 FALSE
     6:    b          10         TRUE    2 FALSE
     7:    b           4        FALSE    2 FALSE
     8:    c           4         TRUE    3 FALSE
     9:    c           5         TRUE    3 FALSE
    10:    c           8         TRUE    3 FALSE
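
    To compare the two answers side by side, here is a small sketch that runs both forms on identical inputs and checks that they give the same result without copying the table. The helper make_target and the tables lookup, t1 and t2 are hypothetical names made up for this example. address() is the data.table helper that reports an object's memory address, and the check assumes the table still has spare column slots, which is normally the case for a freshly created data.table.

```r
library(data.table)

# Hypothetical helper: builds a fresh copy of the table to be updated
make_target <- function() data.table(col1 = c("b", "a", "c"), x = 1:3)

lookup <- data.table(col1 = c("a", "b", "c"),
                     col2 = 1:3,
                     col3 = c(TRUE, FALSE, FALSE))

# Functional form: `:=`(name = value, ...)
t1 <- make_target()
addr_before <- address(t1)
t1[lookup, `:=`(col2 = i.col2, col3 = i.col3), on = "col1"]
same_object <- identical(addr_before, address(t1))  # was t1 modified in place?

# LHS := RHS form: character vector of names := list of values
t2 <- make_target()
t2[lookup, c("col2", "col3") := .(i.col2, i.col3), on = "col1"]

# Do both forms produce the same table?
forms_agree <- isTRUE(all.equal(t1, t2))

print(same_object)
print(forms_agree)
```

    Which form to use is mostly a matter of taste: the functional form keeps each name next to its value, while the LHS := RHS form is convenient when the column names are already held in a character vector.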