
How to remove columns and rows that sum to 0 while keeping non-numeric columns

  •  1
  • ecology  · Tech Community  · 6 years ago

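    Below is a subset of my data. I'm trying to remove the columns and rows that sum to 0. The catch is that I want to keep columns 1 to 8 in the resulting output. Any ideas? I've tried quite a few times. A tidy solution would be best.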

    Site    Date    Mon Day Yr          Szn SznYr       A   B   C   D   E   F   G
    B0001   7/29/97 7   29  1997    Summer  1997-Summer 0   0   0   0   0   0   0
    B0001   7/29/97 7   29  1997    Summer  1997-Summer 0   0   1   0   0   0   0
    B0001   7/29/97 7   29  1997    Summer  1997-Summer 0   0   0   3   0   0   0
    B0001   7/29/97 7   29  1997    Summer  1997-Summer 0   0   0   0   0   0   10
    B0002   7/28/97 7   28  1997    Summer  1997-Summer 0   0   0   0   5   0   0
    B0002   7/28/97 7   28  1997    Summer  1997-Summer 0   0   0   0   0   0   0
    B0002   7/28/97 7   28  1997    Summer  1997-Summer 0   0   0   0   0   6   0
    B0002   7/28/97 7   28  1997    Summer  1997-Summer 0   0   0   0   0   0   0
    B0002   7/28/97 7   28  1997    Summer  1997-Summer 0   0   0   0   0   0   0
    B0002   7/28/97 7   28  1997    Summer  1997-Summer 0   0   0   0   0   0   8
    B0002   6/28/07 6   28  2007    Summer  2007-Summer 0   3   6   1   7   0   1
    
    2 replies  |  as of 6 years ago
        1
  •  2
  •   moodymudskipper    6 years ago

    Try this:

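    # remove rows whose count columns (everything after the first 7) sum to 0
    df <- df[rowSums(df[-(1:7)]) != 0, ]
    # remove count columns that sum to 0, keeping columns 1 to 7
    df <- df[c(1:7, 7 + which(colSums(df[-(1:7)]) != 0))]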
    #     Site    Date Mon Day   Yr    Szn       SznYr B C D E F  G
    # 2  B0001 7/29/97   7  29 1997 Summer 1997-Summer 0 1 0 0 0  0
    # 3  B0001 7/29/97   7  29 1997 Summer 1997-Summer 0 0 3 0 0  0
    # 4  B0001 7/29/97   7  29 1997 Summer 1997-Summer 0 0 0 0 0 10
    # 5  B0002 7/28/97   7  28 1997 Summer 1997-Summer 0 0 0 5 0  0
    # 7  B0002 7/28/97   7  28 1997 Summer 1997-Summer 0 0 0 0 6  0
    # 10 B0002 7/28/97   7  28 1997 Summer 1997-Summer 0 0 0 0 0  8
    # 11 B0002 6/28/07   6  28 2007 Summer 2007-Summer 3 6 1 7 0  1
    

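    You can also do this in one step. On this data it gives the same output as the two-step version and as @dan-y's answer, but the result can differ if your real data contain negative values: a row can then sum to 0 while still holding non-zero entries, and in this version the column sums are computed before such rows are removed: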

        df <- df[rowSums(df[-(1:7)]) !=0,
                 c(1:7,7 + which(colSums(df[-(1:7)]) !=0))]
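
To make the negative-value caveat concrete, here is a minimal base R sketch. The data frame `toy` and its columns `id`, `X`, `Y` are made up for illustration and are not part of the question's data; there is a single non-numeric column, so `1` plays the role of `1:7`.

```r
# Hypothetical toy data: row r1 sums to 0 but is not all zeros
toy <- data.frame(
  id = c("r1", "r2", "r3"),
  X  = c(2, 0, 0),
  Y  = c(-2, 3, 1),
  stringsAsFactors = FALSE
)

# Two-step version: column sums are computed AFTER the zero-sum rows are gone
two_step <- toy[rowSums(toy[-1]) != 0, ]
two_step <- two_step[c(1, 1 + which(colSums(two_step[-1]) != 0))]

# One-step version: column sums are computed on the ORIGINAL rows
one_step <- toy[rowSums(toy[-1]) != 0,
                c(1, 1 + which(colSums(toy[-1]) != 0))]

names(two_step)  # "id" "Y"
names(one_step)  # "id" "X" "Y"
```

Once row r1 is dropped, column `X` contains only zeros, so the two-step version removes it, while the one-step version keeps it because `X` summed to 2 over the original rows. With non-negative counts, a row that sums to 0 is all zeros, so both versions agree.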
    
        2
  •  2
  •   DanY    6 years ago

    This isn't fancy, but it's explicit and easy to modify:

    # generate example data
    df <- data.frame(
        site = c(rep("B1", 4), rep("B2", 7)),
        szn  = rep("Summer", 11),
        A= c(0,0,0,0,0,0,0,0,0,0,0),
        B= c(0,0,0,0,0,0,0,0,0,0,3),
        C= c(0,1,0,0,0,0,0,0,0,0,6),
        D= c(0,0,3,0,0,0,0,0,0,0,1),
        E= c(0,0,0,0,5,0,0,0,0,0,7),
        F= c(0,0,0,0,0,0,6,0,0,0,0),
        G= c(0,0,0,10,0,0,0,0,0,8,1),
        stringsAsFactors = FALSE
    )
    
    # get names of cols you want to check for 0s
    other_cols <- names(df)[1:2]
    num_cols   <- names(df)[3:9]
    
    # check rowsum and colsum
    rows_to_keep <- rowSums(df[ , num_cols]) != 0
    cols_to_keep <- colSums(df[ , num_cols]) != 0
    
    # keep (1) rows that don't sum to zero 
    #      (2) numeric cols that don't sum to zero, and
    #      (3) the "other" cols that are non-numeric
    df[rows_to_keep , c(other_cols, num_cols[cols_to_keep])]
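
Since the steps above are meant to be easy to modify, one possible way to reuse them is to wrap them in a function. This is only a sketch: the name `drop_zero_sum`, its arguments, and the `example` data are hypothetical and not part of the original answer. It also adds `drop = FALSE`, on the assumption that you may sometimes pass a single numeric column, in which case `df[ , num_cols]` would otherwise return a plain vector and `rowSums()` would fail.

```r
# Hypothetical helper: same logic as above, with column names passed in
drop_zero_sum <- function(df, num_cols) {
  # every column not listed in num_cols is treated as an "other" column
  other_cols <- setdiff(names(df), num_cols)

  # drop = FALSE keeps a data frame even when num_cols has length 1
  nums <- df[, num_cols, drop = FALSE]

  rows_to_keep <- rowSums(nums) != 0
  cols_to_keep <- colSums(nums) != 0

  df[rows_to_keep, c(other_cols, num_cols[cols_to_keep]), drop = FALSE]
}

# Small made-up example
example <- data.frame(
  site = c("B1", "B1", "B2"),
  A    = c(0, 0, 0),
  B    = c(0, 1, 5),
  stringsAsFactors = FALSE
)

result <- drop_zero_sum(example, c("A", "B"))
names(result)  # "site" "B"
nrow(result)   # 2
```

Note that the result lists the "other" columns first and the surviving numeric columns after them, so the column order can change if the numeric columns are not already at the end.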