
case_when ignoring some arguments when used with group_by

  •  2
  • J.Sabree  · Tech Community  · 2 years ago

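    I want to display the total for the "Mid Low" group only if both the Mid and Low SES groups have at least 5 people in that month's observations. Otherwise, I want it to show up as NA.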

    library(dplyr)
    
    #Sample dataset
    test_data <- tibble(month = c(rep(c("Jan"), 4), rep(c("Feb"), 4)),
                        ses = c(rep(c("High", "Mid", "Mid Low", "Low"), 2)),
                        total = c(10, 20, 4, 30, 9, 11, 40, 60),
                        total_selected = c(9, 10, 8, 3, 8, 6, 8, 6))
    
    #Failed attempt
    wrong <- test_data %>%
      group_by(month) %>%
      mutate(adjusted_total = case_when(
        ses == "Mid Low" & total[ses == "Mid"] < 5 | total[ses == "Low"] < 5 ~ NA_real_,
        TRUE ~ total
      ))
    

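    Edit with solution

    I realized my code had a mistake. First, I meant an OR statement, not an AND. Second, the threshold was too low for my data. After switching to an OR statement and raising the cutoff to 15: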

    
    correct <- tibble(month = c(rep(c("Jan"), 4), rep(c("Feb"), 4)),
                        ses = c(rep(c("High", "Mid", "Mid Low", "Low"), 2)),
                        total = c(10, 20, 4, 30, 9, 11, 40, 60),
                        total_selected = c(9, 10, 8, 3, 8, 6, 8, 6)) %>%
      group_by(month) %>%
      mutate(adjusted_total = case_when(
        ses == "Mid Low" & total[ses == "Mid"] < 15 | total[ses == "Low"] < 15 ~ NA_real_,
        TRUE ~ total
      ))
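
One thing worth checking in a condition of this shape: in R, `&` binds more tightly than `|`, so `A & B | C` is read as `(A & B) | C`. The sketch below uses base R only and a hypothetical single-month group (the `total` values are made up for illustration, not taken from the data above) in which the "Low" total falls below the cutoff. It suggests that, without parentheses, every row in the group would be flagged, not just the "Mid Low" row.

```r
# Hypothetical single-month group (values invented for illustration):
# the "Low" total is below the cutoff of 15, the "Mid" total is not.
ses   <- c("High", "Mid", "Mid Low", "Low")
total <- c(10, 20, 4, 12)

# As written in the post: parsed as (A & B) | C, so C alone can flag every row
unparenthesized <- ses == "Mid Low" & total[ses == "Mid"] < 15 | total[ses == "Low"] < 15

# With explicit parentheses: only the "Mid Low" row can be flagged
parenthesized <- ses == "Mid Low" & (total[ses == "Mid"] < 15 | total[ses == "Low"] < 15)

print(unparenthesized)  # TRUE TRUE TRUE TRUE
print(parenthesized)    # FALSE FALSE TRUE FALSE
```

With the sample data in the question the two forms happen to agree, because neither month has a "Low" total under 15.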
    
    
    1 Answer
  •  1
  •   akrun    2 years ago

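    case_when/ifelse/if_else all require arguments of the same length. Here, one of the logical expressions has a different length. The correct approach is to wrap the comparison on the subset of "total" in any():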

    test_data %>%
      group_by(month) %>%
      mutate(adjusted_total = case_when(
        ses == "Mid Low" & any(total[ses %in% c("Mid", "Low")] < 15) ~ NA_real_,
        TRUE ~ total
      )) %>%
      ungroup()
    

    -output

    # A tibble: 8 × 5
      month ses     total total_selected adjusted_total
      <chr> <chr>   <dbl>          <dbl>          <dbl>
    1 Jan   High       10              9             10
    2 Jan   Mid        20             10             20
    3 Jan   Mid Low     4              8              4
    4 Jan   Low        30              3             30
    5 Feb   High        9              8              9
    6 Feb   Mid        11              6             11
    7 Feb   Mid Low    40              8             NA
    8 Feb   Low        60              6             60
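
To make the length point concrete, here is a minimal base R sketch of what the pieces of the condition look like inside a single group. It reuses the Feb rows from `test_data`; the variable names `row_test`, `subset_cmp`, and `group_flag` are introduced here purely for illustration.

```r
# The Feb rows from test_data, as they appear inside one group
ses   <- c("High", "Mid", "Mid Low", "Low")
total <- c(9, 11, 40, 60)

row_test   <- ses == "Mid Low"                       # one value per row
subset_cmp <- total[ses %in% c("Mid", "Low")] < 15   # one value per matching row
group_flag <- any(subset_cmp)                        # a single TRUE/FALSE for the group

length(row_test)    # 4
length(subset_cmp)  # 2
length(group_flag)  # 1

# The length-1 flag recycles cleanly against the per-row test
row_test & group_flag  # FALSE FALSE TRUE FALSE
```

Because `any()` always returns a single value, the condition keeps one entry per row regardless of how many rows the subset matches.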
    

    Or use replace:

    test_data %>%
      group_by(month) %>%
      mutate(adjusted_total = replace(total,
        ses == "Mid Low" & any(total[ses %in% c("Mid", "Low")] < 15),
        NA)) %>%
      ungroup()
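
As a rough illustration of what `replace` does within one group, the base R sketch below applies it to the Feb rows of `test_data` and compares it with plain logical-index assignment. The names `cond`, `via_replace`, and `via_index` are hypothetical and exist only for this example.

```r
# The Feb rows from test_data, as they appear inside one group
ses   <- c("High", "Mid", "Mid Low", "Low")
total <- c(9, 11, 40, 60)

# Per-row condition: TRUE only for "Mid Low", and only if Mid or Low is under 15
cond <- ses == "Mid Low" & any(total[ses %in% c("Mid", "Low")] < 15)

# replace() returns a copy of total with the flagged positions set to NA
via_replace <- replace(total, cond, NA)

# Equivalent logical-index assignment on a copy
via_index <- total
via_index[cond] <- NA

print(via_replace)  # 9 11 NA 60
```

Unlike `case_when`, `replace` needs no `TRUE ~ total` fallback, since unflagged positions keep their original values.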