
How best to perform this particular row operation in R?

  •  -1
  • Tim Wilcox  · 1 year ago

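    For this task, my real dataset has 18 rows with indcode = 000000 and ownership code = 10; the distinguishing factor is area. Likewise, it has 18 rows with indcode = 4911 and ownership code = 10. The sample data below shrinks each set to 4 rows to make the arithmetic easier. Some context: in my real dataset I have monthly data from January 2002 to June 2023, with columns labeled by year (02) and month (Jan). 910 is a new indcode. It represents total federal government employment for a specific area and time. Federal employment is defined as indcode = 000000 minus indcode = 4911. The indcode = 55 row is only there to make the sample more realistic.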

    P.S. I had some trouble with "02 Jan" as a column name, so feel free to rename it to Jan. I just wanted to keep it consistent with the real data.

    indcode   <- c("000000", "000000", "000000", "000000", "55", "4911", "4911", "4911", "4911")
    ownership <- c("10", "10", "10", "10", "10", "10", "10", "10", "10")
    area      <- c("000000", "031", "029", "017", "029", "000000", "031", "029", "017")
    `02-Jan`  <- c(1000, 600, 300, 100, 50, 100, 50, 40, 10)
    `02-Feb`  <- c(1003, 601, 301, 101, 51, 101, 51, 41, 11)

    first <- data.frame(indcode, ownership, area, `02-Jan`, `02-Feb`)
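
    On the column names: by default `data.frame()` passes names through `make.names()`, so `02-Jan` ends up as `X02.Jan`. A minimal sketch, assuming the original labels should be kept, is to name the columns explicitly and set `check.names = FALSE`. The name `first_named` is only illustrative and does not appear in the question.

```r
# Same sample data as above, with the month columns named explicitly.
indcode   <- c("000000", "000000", "000000", "000000", "55",
               "4911", "4911", "4911", "4911")
ownership <- rep("10", 9)
area      <- c("000000", "031", "029", "017", "029",
               "000000", "031", "029", "017")

# check.names = FALSE stops data.frame() from rewriting "02-Jan" to "X02.Jan".
# `first_named` is a hypothetical name used only for this sketch.
first_named <- data.frame(
  indcode, ownership, area,
  "02-Jan" = c(1000, 600, 300, 100, 50, 100, 50, 40, 10),
  "02-Feb" = c(1003, 601, 301, 101, 51, 101, 51, 41, 11),
  check.names = FALSE
)

names(first_named)

# Non-syntactic names need backticks when used as symbols.
first_named$`02-Jan`
```

    The trade-off is that every later reference to these columns needs backticks, which may be why renaming them is the simpler route.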
    

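    Desired result for each area; here is one example. The actual 02-Jan value would be 900 rather than 1000-100, but I thought writing it this way would make it clearer.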

        indcode    ownership    area     02-Jan    02-Feb
          910          10        000000    1000-100     1003-101  
          910          10        031       600-50       601-51
    
    1 answer
  •  3
  •   Jon Spring    1 year ago
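    library(dplyr)  # the .by argument requires dplyr 1.1.0 or later

    first |>
      # Within each area, subtract the smallest value from the largest.
      # "3:4" refers to the 3rd and 4th columns once the area grouping column is set aside.
      # We could alternatively specify the columns by name, e.g. X02.Jan:X02.Feb
      # OLD: summarize(across(3:4, ~paste(rev(range(.)), collapse = "-")),
      summarize(across(3:4, ~ max(.) - min(.)),
                .by = area) |>
      # Add the new indcode and the ownership code as the first columns.
      mutate(indcode = 910, ownership = 10, .before = 1)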
    

    Result

      indcode ownership   area X02.Jan X02.Feb
    1     910        10 000000     900     902
    2     910        10    031     550     550
    3     910        10    029     260     260
    4     910        10    017      90      90
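
    Note that `max(.) - min(.)` matches the stated definition only while indcode 000000 holds the largest value and indcode 4911 the smallest within each area. Any other indcode with a value below the 4911 row would change the result. A hedged sketch of a more explicit variant, assuming each area has exactly one row for each of the two indcodes, filters to those indcodes and subtracts by sign. The helper column `sgn` and the name `federal` are illustrative and not part of the original answer.

```r
library(dplyr)

# Sample data from the question (month columns named as data.frame() renames them).
first <- data.frame(
  indcode   = c("000000", "000000", "000000", "000000", "55",
                "4911", "4911", "4911", "4911"),
  ownership = "10",
  area      = c("000000", "031", "029", "017", "029",
                "000000", "031", "029", "017"),
  X02.Jan   = c(1000, 600, 300, 100, 50, 100, 50, 40, 10),
  X02.Feb   = c(1003, 601, 301, 101, 51, 101, 51, 41, 11)
)

federal <- first |>
  # Keep only the two indcodes that enter the definition.
  filter(indcode %in% c("000000", "4911")) |>
  # +1 for the total (000000), -1 for the part to subtract (4911).
  mutate(sgn = if_else(indcode == "000000", 1, -1)) |>
  # Signed sum per area = value(000000) - value(4911).
  summarize(across(starts_with("X"), ~ sum(.x * sgn)),
            .by = c(ownership, area)) |>
  # Label the derived rows with the new indcode.
  mutate(indcode = "910", .before = 1)

federal
```

    For the sample data this gives the same four rows as the result above. Because the month columns are selected with `starts_with("X")`, the same pipeline should extend to the full run of monthly columns, provided they all carry that prefix.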