
Transpose data based on a column and keep the repeated data (not quite the same as long-to-wide format) [duplicate]

  •  1
  • Cina  · 6 years ago

    This is slightly different from a long-to-wide reshape. (Please don't flag it as a duplicate.)

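    I have data like the following. I want to transpose it so that each term becomes a column holding the corresponding subject values. The result should look like DF_result: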

    DF <- data.frame(ID = c("10", "10", "10", "10", "10", "11", "11", "11", "12", "12"),
                 term = c("1", "1", "2", "2", "3", "1", "1", "2", "1", "1"),
                 subject = c("math1", "phys1", "math2", "chem1", "cmp1", "math1", "phys1", "math2", "math1", "phys1"),
                     graduation = c("grad", "grad", "grad", "grad", "grad", "drop", "drop", "drop", "enrolled", "enrolled"))
    
    DF
    
    ID   term   subject   graduation
    10    1      math1      grad
    10    1      phys1      grad
    10    2      math2      grad
    10    2      chem1      grad
    10    3      cmp1       grad
    11    1      math1      drop
    11    1      phys1      drop
    11    2      math2      drop
    12    1      math1      enrolled
    12    1      phys1      enrolled
    

    DF_result:

    ID  term1  term2  term3   graduation
    10  math1  math2  cmp1     grad
    10  phys1  chem1  NA       grad
    11  math1  math2  NA       drop
    11  phys1   NA    NA       drop
    12  math1   NA    NA       enrolled
    12  phys1   NA    NA       enrolled
    

    Using reshape gets close to what I want, but it keeps only the first match.

    reshape(DF, idvar = c("ID", "graduation"), timevar = "term", direction = "wide")
    

    It produces:

      ID graduation subject.1 subject.2 subject.3
    1 10       grad     math1     math2      cmp1
    6 11       drop     math1     math2      <NA>
    9 12   enrolled     math1      <NA>      <NA>
    

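    The problem is that reshape keeps only the first match for each value of timevar. Using dcast and melt just fills the cells with the result of the length function.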
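    Both symptoms come from the same cause: several rows share the same ID and term, so that pair does not identify a single cell in the wide table. A minimal base-R sketch to confirm this (the names key_counts, n_subjects and dup_keys are only illustrative) counts the rows per (ID, term) pair; these counts are the kind of value a length aggregation would put in each cell:

```r
# Same data as in the question
DF <- data.frame(
  ID = c("10", "10", "10", "10", "10", "11", "11", "11", "12", "12"),
  term = c("1", "1", "2", "2", "3", "1", "1", "2", "1", "1"),
  subject = c("math1", "phys1", "math2", "chem1", "cmp1",
              "math1", "phys1", "math2", "math1", "phys1"),
  graduation = c("grad", "grad", "grad", "grad", "grad",
                 "drop", "drop", "drop", "enrolled", "enrolled"),
  stringsAsFactors = FALSE
)

# Count the subjects recorded for each (ID, term) pair
key_counts <- aggregate(subject ~ ID + term, data = DF, FUN = length)
names(key_counts)[3] <- "n_subjects"

# Pairs with more than one subject cannot fit into a single wide cell
dup_keys <- key_counts[key_counts$n_subjects > 1, ]
print(dup_keys)
```

    Any pair listed in dup_keys needs an extra identifier before the data can be reshaped without losing rows.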

    How can I solve this in R?

    1 answer  |  6 years ago
  •  2
  •   DanY    6 years ago

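    This is the same as a long-to-wide reshape, but you need a new variable that helps uniquely identify the rows in the new format. I call this variable classnum below, and I use data.table syntax to create it: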

    # add helper variable "classnum"
    library(data.table)
    setDT(DF)
    DF[ , classnum := 1:.N, by=.(ID, term)]
    
    # reshape long-to-wide
    tidyr::spread(DF, term, subject)
    

    Result:

       ID graduation classnum     1     2    3
    1: 10       grad        1 math1 math2 cmp1
    2: 10       grad        2 phys1 chem1 <NA>
    3: 11       drop        1 math1 math2 <NA>
    4: 11       drop        2 phys1  <NA> <NA>
    5: 12   enrolled        1 math1  <NA> <NA>
    6: 12   enrolled        2 phys1  <NA> <NA>
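
    The same idea can also be sketched in base R alone, without data.table or tidyr. This is an alternative sketch, not the code used above: it builds classnum with ave() and passes it to reshape() as an extra idvar. The name wide and the renaming of the columns to term1, term2 and term3 are my own choices to match the layout requested in the question.

```r
# Same data as in the question
DF <- data.frame(
  ID = c("10", "10", "10", "10", "10", "11", "11", "11", "12", "12"),
  term = c("1", "1", "2", "2", "3", "1", "1", "2", "1", "1"),
  subject = c("math1", "phys1", "math2", "chem1", "cmp1",
              "math1", "phys1", "math2", "math1", "phys1"),
  graduation = c("grad", "grad", "grad", "grad", "grad",
                 "drop", "drop", "drop", "enrolled", "enrolled"),
  stringsAsFactors = FALSE
)

# Helper variable: running index of each subject within its (ID, term) group
DF$classnum <- ave(seq_along(DF$ID), DF$ID, DF$term, FUN = seq_along)

# With classnum among the id variables, every (id, term) cell is unique,
# so reshape() no longer has to drop the repeated rows
wide <- reshape(DF,
                idvar = c("ID", "graduation", "classnum"),
                timevar = "term",
                direction = "wide")

# Rename subject.1, subject.2, ... to term1, term2, ... and sort the rows
names(wide) <- sub("^subject\\.", "term", names(wide))
wide <- wide[order(wide$ID, wide$classnum), ]
print(wide)
```

    If the helper column is not wanted in the final table, it can be dropped afterwards with wide$classnum <- NULL.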