代码之家  ›  专栏  ›  技术社区  ›  SeGa

如何在分组列中按行添加值

  •  0
  • SeGa  · 技术社区  · 6 年前

    我有一些每秒100个数据输入的传感器数据。最后一列是毫秒,现在都是10毫秒。如何将毫秒按时间和日期分组并在一起?

    testdata <- structure(list(local_date = c("26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017", "26-06-2017",  "26-06-2017", "26-06-2017"), 
                               local_time = c("13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23", "13:58:23",  "13:58:23", "13:58:23", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24",  "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24", "13:58:24" ), 
                               ms = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,  10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10)), 
                          .Names = c("local_date",  "local_time", "ms"), row.names = c(NA, -200L), class = c("data.table", "data.frame"))
    

    前100行共享同一时间(13:58:23)和日期(26-06-2017),但它们都有10毫秒。结果应该只有一个每秒钟10毫秒的条目,下面的毫秒将添加到前面的条目中。

    此代码段将使用以下序列创建结果:

    testdata$ms = rep(seq(from = 10, to = 1000, by = 10), 2)
    

    但由于原始数据不是那么干净,所以我必须按日期和时间对数据进行分组,然后按行方式将毫秒相加。

    我想要一个 data.table 解决方案,但 dplyr 也可以。

    2 回复  |  直到 6 年前
        1
  •  1
  •   bouncyball    6 年前

    听起来你需要分组 cumsum :

    library(dplyr) 
    
    testdata$ms2 = rep(seq(from = 10, to = 1000, by = 10), 2)
    
    testdata %>%
        group_by(local_date, local_time) %>%
        mutate(cumsum_ms = cumsum(ms))
    
       local_date local_time    ms   ms2 cumsum_ms
       <chr>      <chr>      <dbl> <dbl>     <dbl>
     1 26-06-2017 13:58:23      10    10        10
     2 26-06-2017 13:58:23      10    20        20
     3 26-06-2017 13:58:23      10    30        30
     4 26-06-2017 13:58:23      10    40        40
     5 26-06-2017 13:58:23      10    50        50
    
        2
  •  1
  •   SeGa    6 年前

    并添加一个 data.table 版本:

    testdata[, ms := cumsum(ms), by = .(local_time, local_date)]