
Remove rows from a df using multiple conditions in R

  •  1
  • Nate  · Tech Community  · 2 years ago

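    Is it possible to delete rows of data by referencing specific character strings or factor levels in 2 or more columns? For small datasets this is easy, because I can scroll through the data frame and delete the rows I want, but how can this be done for larger datasets without endlessly scrolling to see which rows meet my criteria?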

    Dummy data:

    df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                      month = rep(c("March", "October"), each = 1), 
                      site = rep(c("1", "2", "3", "4", "5"), each = 2),
                      common_name = rep(c("Tuna", "shark"), each = 1),
                      num = sample(x = 0:2, size  = 20, replace = TRUE))
    

    For example: how can I delete only site "1" in March 2019 with a single line of code, without looking up which row it is in?

    3 answers  |  as of 2 years ago
        1
  •  2
  •   DaveArmstrong    2 years ago

    You can use subset():

    df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                      month = rep(c("March", "October"), each = 1), 
                      site = rep(c("1", "2", "3", "4", "5"), each = 2),
                      common_name = rep(c("Tuna", "shark"), each = 1),
                      num = sample(x = 0:2, size  = 20, replace = TRUE))
    
    subset(df1, !(site == "1" & year == 2019 & month == "March"))
    #>    year   month site common_name num
    #> 2  2019 October    1       shark   0
    #> 3  2019   March    2        Tuna   1
    #> 4  2019 October    2       shark   0
    #> 5  2019   March    3        Tuna   0
    #> 6  2019 October    3       shark   0
    #> 7  2019   March    4        Tuna   2
    #> 8  2019 October    4       shark   2
    #> 9  2019   March    5        Tuna   0
    #> 10 2019 October    5       shark   2
    #> 11 2020   March    1        Tuna   1
    #> 12 2020 October    1       shark   1
    #> 13 2020   March    2        Tuna   2
    #> 14 2020 October    2       shark   2
    #> 15 2020   March    3        Tuna   1
    #> 16 2020 October    3       shark   0
    #> 17 2020   March    4        Tuna   1
    #> 18 2020 October    4       shark   0
    #> 19 2020   March    5        Tuna   0
    #> 20 2020 October    5       shark   2
    

    Created on 2022-05-31 by the reprex package (v2.0.1)
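
    Before deleting anything from a large data frame, it can help to check how many rows the condition matches and where they are, rather than scrolling. The following is a minimal sketch in base R using the question's dummy data; the names to_drop and df1_clean are made up for illustration.

```r
# Dummy data from the question (shorter vectors are recycled to 20 rows)
df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                  month = rep(c("March", "October"), each = 1),
                  site = rep(c("1", "2", "3", "4", "5"), each = 2),
                  common_name = rep(c("Tuna", "shark"), each = 1),
                  num = sample(x = 0:2, size = 20, replace = TRUE))

# Logical vector: TRUE only where all three conditions hold
to_drop <- df1$site == "1" & df1$year == 2019 & df1$month == "March"

sum(to_drop)    # number of rows that would be removed: 1
which(to_drop)  # their positions: 1

# Keep everything that does not match (to_drop is a hypothetical name)
df1_clean <- subset(df1, !to_drop)
nrow(df1_clean) # 19
```

    Storing the condition in a variable first makes it possible to inspect the match with sum() and which() before the result is assigned back.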

        2
  •  1
  •   akrun    2 years ago

    We can use paste:

    subset(df1, paste(year, month, site) != '2019 March 1')
    

    -output

       year   month site common_name num
    2  2019 October    1       shark   1
    3  2019   March    2        Tuna   1
    4  2019 October    2       shark   2
    5  2019   March    3        Tuna   0
    6  2019 October    3       shark   0
    7  2019   March    4        Tuna   2
    8  2019 October    4       shark   1
    9  2019   March    5        Tuna   1
    10 2019 October    5       shark   1
    11 2020   March    1        Tuna   1
    12 2020 October    1       shark   1
    13 2020   March    2        Tuna   1
    14 2020 October    2       shark   2
    15 2020   March    3        Tuna   1
    16 2020 October    3       shark   0
    17 2020   March    4        Tuna   1
    18 2020 October    4       shark   1
    19 2020   March    5        Tuna   1
    20 2020 October    5       shark   2
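
    The paste approach also seems to extend naturally to several combinations at once by pairing it with %in%. The sketch below assumes a second combination, "2020 October 5", which is purely illustrative and not part of the question; keys, remove_keys and df1_kept are hypothetical names. It relies on none of the values containing a space, since otherwise the pasted keys could become ambiguous.

```r
# Dummy data from the question (shorter vectors are recycled to 20 rows)
df1 <- data.frame(year = rep(c(2019, 2020), each = 10),
                  month = rep(c("March", "October"), each = 1),
                  site = rep(c("1", "2", "3", "4", "5"), each = 2),
                  common_name = rep(c("Tuna", "shark"), each = 1),
                  num = sample(x = 0:2, size = 20, replace = TRUE))

# One key per row, built from the three columns of interest
keys <- paste(df1$year, df1$month, df1$site)

# Combinations to remove; the second one is an assumed example
remove_keys <- c("2019 March 1", "2020 October 5")

# Keep rows whose key is not in the removal list
df1_kept <- df1[!(keys %in% remove_keys), ]
nrow(df1_kept)  # 18
```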
    
        3
  •  1
  •   M.Viking    2 years ago

    A one-line alternative to subset or dplyr::filter, using R's bracket notation:

    df2 <- df1[!(df1$site=="1" & df1$year==2019 & df1$month=="March"),]
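
    One point to watch with bracket notation is missing values: if any of the compared columns contains NA, the condition evaluates to NA for that row, and indexing with NA yields an all-NA row, whereas subset() treats NA as FALSE and drops the row. The sketch below uses a small made-up data frame (df_na, not from the question) to show the difference and one possible guard, assuming rows with an undetermined condition should be kept.

```r
# Hypothetical three-row example with a missing year
df_na <- data.frame(year = c(2019, 2019, NA),
                    month = c("March", "October", "March"),
                    site = c("1", "1", "1"))

cond <- df_na$site == "1" & df_na$year == 2019 & df_na$month == "March"
cond  # TRUE FALSE NA

# Plain negation: the NA condition produces a row filled with NA
res_naive <- df_na[!cond, ]

# Guarded version: only rows where cond is definitely TRUE are dropped
res_safe <- df_na[!(cond %in% TRUE), ]
```

    Here res_safe keeps the original rows 2 and 3 intact, while res_naive contains row 2 followed by a row of NA values.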