
Pivoting population data with multiple sex-specific age groups

  • Sebastian · 2 years ago

    My population data is structured as follows:

    tibble(
      female = c("0-10", "10-20"),
      female_population = c(30000, 50000),
      male = c("0-10", "10-20"),
      male_population = c(33000, 45000),
      total = c("0-10", "10-20"),
      total_population = female_population + male_population
    )
    
    # A tibble: 2 x 6
      female female_population male  male_population total total_population
      <chr>              <dbl> <chr>           <dbl> <chr>            <dbl>
    1 0-10               30000 0-10            33000 0-10             63000
    2 10-20              50000 10-20           45000 10-20            95000
    

    I would like to pivot it so that I get a single column each for sex, age and population, like this:

    tibble(
      sex = rep(c("female", "male", "total"), each = 2),
      age = rep(c("0-10", "10-20"), 3),
      population = c(30000, 50000, 33000, 45000, 63000, 95000)
    )
    
    # A tibble: 6 x 3
      sex    age   population
      <chr>  <chr>      <dbl>
    1 female 0-10       30000
    2 female 10-20      50000
    3 male   0-10       33000
    4 male   10-20      45000
    5 total  0-10       63000
    6 total  10-20      95000
    

    Any ideas on how to do this elegantly, perhaps using pivot_longer?

    1 Answer
  • Carl · 2 years ago

    One approach using pivot_longer:

    (%>% can be used in place of |>; the latter is the native pipe, available in base R since version 4.1.)

    library(tidyverse)
    
    df <- tibble(
      female = c("0-10", "10-20"),
      female_population = c(30000, 50000),
      male = c("0-10", "10-20"),
      male_population = c(33000, 45000),
      total = c("0-10", "10-20"),
      total_population = female_population + male_population
    )
    
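    df |>
      # Stack the three age columns: column name goes to sex, cell value to age
      pivot_longer(!contains("_"), names_to = "sex", values_to = "age") |>
      # Pick the population column that matches each row's sex
      mutate(population = case_when(
        sex == "female" ~ female_population,
        sex == "male"   ~ male_population,
        sex == "total"  ~ total_population
      )) |>
      # Drop the wide *_population columns, which are no longer needed
      select(-contains("_")) |>
      arrange(sex)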
    
    #> # A tibble: 6 × 3
    #>   sex    age   population
    #>   <chr>  <chr>      <dbl>
    #> 1 female 0-10       30000
    #> 2 female 10-20      50000
    #> 3 male   0-10       33000
    #> 4 male   10-20      45000
    #> 5 total  0-10       63000
    #> 6 total  10-20      95000
    

    Created on 2022-07-02 by the reprex package (v2.0.1)
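
As a possible alternative to the case_when step, here is a minimal sketch that uses the special ".value" sentinel of pivot_longer. It assumes the three age columns may first be renamed with an "_age" suffix (a suffix chosen here purely for illustration, not part of the original data) so that every column name follows a sex_variable pattern.

```r
library(dplyr)
library(tidyr)
library(tibble)

df <- tibble(
  female = c("0-10", "10-20"),
  female_population = c(30000, 50000),
  male = c("0-10", "10-20"),
  male_population = c(33000, 45000),
  total = c("0-10", "10-20"),
  total_population = female_population + male_population
)

result <- df |>
  # Assumption: append "_age" to the columns without an underscore,
  # giving female_age, male_age and total_age
  rename_with(~ paste0(.x, "_age"), !contains("_")) |>
  # Split each name at "_": the first part becomes the sex column,
  # the second part (".value") names the output columns age and population
  pivot_longer(
    everything(),
    names_to = c("sex", ".value"),
    names_sep = "_"
  ) |>
  arrange(sex)

result
```

This should give the same six-row tibble with the columns sex, age and population. The trade-off is an extra rename step in exchange for not having to list each sex by hand, which may scale better if more groups are added.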