
Replace NA values with the next value in a column in R

  • Artur Vidaurre de Almeida  · 2 years ago

    I'm trying to use lag() without producing NA values. Here is my data:

    df <- data.frame("Score" = as.numeric(c("20", "10", "15", "30", "15", "10")),
                 "Time" = c("1", "2", "1", "2", "1", "2"),
                 "Team" = c("A", "A", "B", "B", "C", "C"))
    

    I then create a Diff column, which calculates the score difference within each team:

     library(dplyr)

     df <- df %>% 
       group_by(Team) %>% 
       mutate(Diff = Score - lag(Score))
    

    My problem is that this approach, unsurprisingly, produces NA values in the first row of each team:

      Score Time  Team   Diff
      20     1     A        NA
      10     2     A       -10
      15     1     B        NA
      30     2     B        15
      15     1     C        NA
      10     2     C        -5
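
    As a minimal sketch of where these NA values come from (the vector name `scores` is only illustrative and not part of the original post): lag() shifts values down by one position, so the first element of each group has no predecessor, while lead() shifts the other way.

```r
library(dplyr)

# One team's scores in time order (illustrative values taken from Team A)
scores <- c(20, 10)

lag(scores)    # NA 20 : nothing precedes the first element
lead(scores)   # 10 NA : nothing follows the last element

# Subtracting the lagged vector therefore leaves NA in the first position
diff_lag <- scores - lag(scores)
diff_lag       # NA -10
```

    Because the calculation is done per team, every team's first row ends up with NA.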
    

    My goal is to end up with this:

      Score Time  Team   Diff
      20     1     A       -10
      10     2     A       -10
      15     1     B        15
      30     2     B        15
      15     1     C        -5
      10     2     C        -5
    

    I tried to use the case_when() function to replace the NA values, without success:

     df %>% 
     group_by(Team) %>% 
     mutate(Diff = Score - lag(Score)) %>% 
     mutate(Diff = case_when(
     NA ~ lead(Diff)
     ))
    

    Is there any way to replace each NA value with the next Diff value?

    1 answer  |  2 years ago
  •   Peter_Evan    2 years ago

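    Just use fill() from tidyr with .direction = 'up'. On a grouped data frame it replaces each NA with the next non-missing value within the same group: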

    library(tidyverse)
    
    df <- data.frame("Score" = as.numeric(c("20", "10", "15", "30", "15", "10")),
                     "Time" = c("1", "2", "1", "2", "1", "2"),
                     "Team" = c("A", "A", "B", "B", "C", "C"))
    df <- df %>% 
      group_by(Team) %>% 
      mutate(Diff = Score - lag(Score)) %>% 
      fill(Diff, .direction = 'up')
    
    df
    # output
    #   Score Time  Team   Diff
    #   <dbl> <chr> <chr> <dbl>
    #1    20 1     A       -10
    #2    10 2     A       -10
    #3    15 1     B        15
    #4    30 2     B        15
    #5    15 1     C        -5
    #6    10 2     C        -5
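
    For comparison, here is a hedged sketch of two alternatives that stay within dplyr. They are not part of the answer above; the names `result_case` and `result_coalesce` are made up for illustration. The case_when() attempt in the question fails because the left-hand side of `~` must be a logical test such as is.na(Diff), not a bare NA. coalesce() is a shorter way to say "use Diff, or the next Diff where it is missing".

```r
library(dplyr)

df <- data.frame(Score = c(20, 10, 15, 30, 15, 10),
                 Time  = c("1", "2", "1", "2", "1", "2"),
                 Team  = c("A", "A", "B", "B", "C", "C"))

# Variant 1: case_when() with an explicit logical condition
result_case <- df %>%
  group_by(Team) %>%
  mutate(Diff = Score - lag(Score),
         Diff = case_when(is.na(Diff) ~ lead(Diff),  # first row: take the next Diff
                          TRUE ~ Diff)) %>%          # other rows: keep Diff
  ungroup()

# Variant 2: coalesce() returns the first non-missing value position by position
result_coalesce <- df %>%
  group_by(Team) %>%
  mutate(Diff = Score - lag(Score),
         Diff = coalesce(Diff, lead(Diff))) %>%
  ungroup()

result_case$Diff       # -10 -10  15  15  -5  -5
result_coalesce$Diff   # -10 -10  15  15  -5  -5
```

    Both variants only look one row ahead, so they assume the NA is immediately followed by a non-missing value in the same team. A team with a single row would keep its NA, and fill() is the more general choice when several consecutive values can be missing.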