
Some problems with pivot_wider and pivot_longer in R

  • Tim Wilcox  · 3 years ago

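    Below are the sample data and one manipulation I have done. I have done something similar before and code like this did the trick, but now it does not. First question: do I even need pivot_longer? Second: why am I getting NAs?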

    library(dplyr)
    library(tidyr)
    library(gt)

    areaname<-c("Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Clark County","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace","Someplace")
    periodyear<-c(2011,2012,2013,2014,2015,2016,2017,2018,2019,2020,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020)
    annualavg<-c(17.56,18.66,19.25,20.35,21.45,22.33,22.44,32.15,33.14,47.555,17.59,18.99,19.33,2.35,88.45,2.33,29.44,36.15,39.14,47.51)
    
    table<-data.frame(areaname,periodyear,annualavg)
    
    table$annualavgr <- round(table$annualavg,digits = 0)
    
    chart17 <- table %>%
      dplyr::select("areaname", "periodyear", "annualavg", "annualavgr") %>%
      ungroup() %>%
      pivot_longer(cols = annualavgr, names_to = "measure", values_to = "value") %>%
      group_by(areaname, measure) %>%
      pivot_wider(names_from = periodyear, values_from = value) %>%
      gt()
    

    Desired end result (or something similar)

                     2011    2012    2013     2014  and so on.... 
      Clark County    18      19      19       20
    
                     2011    2012    2013     2014
      Someplace       18      19      19       2
    
    1 Answer

  • akrun  · 3 years ago

    We need to use both columns in pivot_longer

    library(dplyr)
    library(tidyr)
    table %>%
      dplyr::select("areaname","periodyear","annualavg","annualavgr")%>%
      ungroup() %>% 
      pivot_longer(cols = c(annualavg, annualavgr),
            names_to = "measure", values_to = "value") %>% 
      pivot_wider(names_from = periodyear, values_from = value)
    

    # A tibble: 4 x 12
      areaname     measure    `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2019` `2020`
      <chr>        <chr>       <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
    1 Clark County annualavg    17.6   18.7   19.2  20.4    21.4  22.3    22.4   32.2   33.1   47.6
    2 Clark County annualavgr   18     19     19    20      21    22      22     32     33     48  
    3 Someplace    annualavg    17.6   19.0   19.3   2.35   88.4   2.33   29.4   36.2   39.1   47.5
    4 Someplace    annualavgr   18     19     19     2      88     2      29     36     39     48  
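
    One likely reading of why the question's code produced NAs, shown as a minimal sketch on a made-up four-row subset (the names demo, wide_na and wide_ok are illustrative, not from the original post): when only annualavgr goes into pivot_longer, annualavg stays behind as an ordinary column, pivot_wider treats it as an identifier, and because every annualavg value is distinct each row keeps a value in only one year column.

```r
library(dplyr)
library(tidyr)

# Hypothetical subset in the same shape as the question's data
demo <- data.frame(
  areaname   = c("Clark County", "Clark County", "Someplace", "Someplace"),
  periodyear = c(2011, 2012, 2011, 2012),
  annualavg  = c(17.56, 18.66, 17.59, 18.99)
)
demo$annualavgr <- round(demo$annualavg, digits = 0)

# annualavg is still present, so pivot_wider uses it as an id column:
# every row stays separate and the other year columns are filled with NA
wide_na <- demo %>%
  pivot_longer(cols = annualavgr, names_to = "measure", values_to = "value") %>%
  pivot_wider(names_from = periodyear, values_from = value)

# Dropping annualavg first leaves only areaname and measure as id columns,
# so the rows collapse to one per area
wide_ok <- demo %>%
  select(-annualavg) %>%
  pivot_longer(cols = annualavgr, names_to = "measure", values_to = "value") %>%
  pivot_wider(names_from = periodyear, values_from = value)

print(wide_na)
print(wide_ok)
```

    Pivoting both columns, as in the answer above, avoids the problem in a different way: annualavg is moved into the measure/value pair instead of being left behind as an identifier.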
    

    # Average the two measures within each area, year by year
    table %>%
      dplyr::select("areaname","periodyear","annualavg","annualavgr")%>%
      ungroup() %>% 
      pivot_longer(cols = c(annualavg, annualavgr), 
          names_to = "measure", values_to = "value") %>%   
      pivot_wider(names_from = periodyear, values_from = value) %>% 
      group_by(areaname) %>%
      summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))
    

    -output

    # A tibble: 2 x 11
      areaname     `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2019` `2020`
      <chr>         <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
    1 Clark County   17.8   18.8   19.1  20.2    21.2  22.2    22.2   32.1   33.1   47.8
    2 Someplace      17.8   19.0   19.2   2.17   88.2   2.16   29.2   36.1   39.1   47.8
    

    If we only need the single column 'annualavgr', there is no need for pivot_longer; instead, leave 'annualavg' out of the select

    table %>%
      dplyr::select("areaname","periodyear","annualavgr")%>%
      ungroup() %>% 
      pivot_wider(names_from = periodyear, values_from = annualavgr)
    # A tibble: 2 x 11
      areaname     `2011` `2012` `2013` `2014` `2015` `2016` `2017` `2018` `2019` `2020`
      <chr>         <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
    1 Clark County     18     19     19     20     21     22     22     32     33     48
    2 Someplace        18     19     19      2     88      2     29     36     39     48
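
    As a variation on this last step, and assuming a tidyr version in which pivot_wider() has the id_cols argument, the select() can be skipped by naming the identifier column directly; columns that are not in id_cols, names_from or values_from are then dropped. The sketch below uses a made-up four-row subset, and the names demo and wide are illustrative only.

```r
library(tidyr)

# Hypothetical subset in the same shape as the question's data
demo <- data.frame(
  areaname   = c("Clark County", "Clark County", "Someplace", "Someplace"),
  periodyear = c(2011, 2012, 2011, 2012),
  annualavg  = c(17.56, 18.66, 17.59, 18.99)
)
demo$annualavgr <- round(demo$annualavg, digits = 0)

# id_cols restricts the identifier columns to areaname, so the leftover
# annualavg column is not used to distinguish rows
wide <- pivot_wider(
  demo,
  id_cols     = areaname,
  names_from  = periodyear,
  values_from = annualavgr
)

print(wide)
```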