
Extracting elements from different levels of a nested list

  •  0
  • Marco C  · 6 years ago

    > str(content)
    List of 3
     $ author-retrieval-response:List of 1
      ..$ :List of 6
      .. ..$ @status       : chr "found"
      .. ..$ @_fa          : chr "true"
      .. ..$ coredata      :List of 3
      .. .. ..$ dc:identifier : chr "AUTHOR_ID:55604964500"
      .. .. ..$ document-count: chr "6"
      .. .. ..$ cited-by-count: chr "13"
      .. ..$ h-index       : chr "3"
      .. ..$ coauthor-count: chr "7"
      .. ..$ preferred-name:List of 2
      .. .. ..$ surname   : chr "García Cruz"
      .. .. ..$ given-name: chr "Gustavo Adolfo"
     $ author-retrieval-response:List of 1
      ..$ :List of 6
      .. ..$ @status       : chr "found"
      .. ..$ @_fa          : chr "true"
      .. ..$ coredata      :List of 3
      .. .. ..$ dc:identifier : chr "AUTHOR_ID:56595713900"
      .. .. ..$ document-count: chr "4"
      .. .. ..$ cited-by-count: chr "21"
      .. ..$ h-index       : chr "3"
      .. ..$ coauthor-count: chr "5"
      .. ..$ preferred-name:List of 2
      .. .. ..$ surname   : chr "Akimov"
      .. .. ..$ given-name: chr "Alexey"
     $ author-retrieval-response:List of 1
      ..$ :List of 6
      .. ..$ @status       : chr "found"
      .. ..$ @_fa          : chr "true"
      .. ..$ coredata      :List of 3
      .. .. ..$ dc:identifier : chr "AUTHOR_ID:12792624600"
      .. .. ..$ document-count: chr "10"
      .. .. ..$ cited-by-count: chr "117"
      .. ..$ h-index       : chr "6"
      .. ..$ coauthor-count: chr "7"
      .. ..$ preferred-name:List of 2
      .. .. ..$ surname   : chr "Alecke"
      .. .. ..$ given-name: chr "Björn"
    

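    I am interested in extracting the following values:

    dc:identifier, document-count, cited-by-count, h-index, coauthor-count, surname, given-name

    and parsing them into a data frame structure.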

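    I have two problems. The first is that I cannot access the different levels of the list. Indeed, although content[[3]] works, indexing one level deeper fails: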

    > content[[3]][[2]]
    Error in content[[3]][[2]] : subscript out of bounds
    

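    I also suspect that, once I can access the elements, I cannot simply use sapply, because the elements I want to parse out of the list are not all at the same level.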

    Here is the dput output:

    structure(list(`author-retrieval-response` = list(structure(list(
        `@status` = "found", `@_fa` = "true", coredata = structure(list(
            `dc:identifier` = "AUTHOR_ID:55604964500", `document-count` = "6", 
            `cited-by-count` = "13"), .Names = c("dc:identifier", 
        "document-count", "cited-by-count")), `h-index` = "3", `coauthor-count` = "7", 
        `preferred-name` = structure(list(surname = "García Cruz", 
            `given-name` = "Gustavo Adolfo"), .Names = c("surname", 
        "given-name"))), .Names = c("@status", "@_fa", "coredata", 
    "h-index", "coauthor-count", "preferred-name"))), `author-retrieval-response` = list(
        structure(list(`@status` = "found", `@_fa` = "true", coredata = structure(list(
            `dc:identifier` = "AUTHOR_ID:56595713900", `document-count` = "4", 
            `cited-by-count` = "21"), .Names = c("dc:identifier", 
        "document-count", "cited-by-count")), `h-index` = "3", `coauthor-count` = "5", 
            `preferred-name` = structure(list(surname = "Akimov", 
                `given-name` = "Alexey"), .Names = c("surname", "given-name"
            ))), .Names = c("@status", "@_fa", "coredata", "h-index", 
        "coauthor-count", "preferred-name"))), `author-retrieval-response` = list(
        structure(list(`@status` = "found", `@_fa` = "true", coredata = structure(list(
            `dc:identifier` = "AUTHOR_ID:12792624600", `document-count` = "10", 
            `cited-by-count` = "117"), .Names = c("dc:identifier", 
        "document-count", "cited-by-count")), `h-index` = "6", `coauthor-count` = "7", 
            `preferred-name` = structure(list(surname = "Alecke", 
                `given-name` = "Björn"), .Names = c("surname", "given-name"
            ))), .Names = c("@status", "@_fa", "coredata", "h-index", 
        "coauthor-count", "preferred-name")))), .Names = c("author-retrieval-response", 
    "author-retrieval-response", "author-retrieval-response"))
    

    Thank you very much for your help!

    1 answer  |  6 years ago
  •  2
  •   Parfait    6 years ago

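    Consider rapply (the recursive apply function) to flatten the nesting, called inside lapply across the three top-level parent elements. Then transpose each result with t() and pass it to a data.frame() constructor call.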

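    # Flatten each record into a named character vector, then turn it into a one-row data frame
    flat_list <- lapply(content, function(x) data.frame(t(rapply(x, function(x) x[1]))))
    
    # Stack the rows; unname() drops the repeated list names so the rows are simply numbered
    final_df <- do.call(rbind, unname(flat_list))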
    

    Output

    final_df
    
    #   X.status X._fa coredata.dc.identifier coredata.document.count coredata.cited.by.count h.index coauthor.count preferred.name.surname preferred.name.given.name
    # 1    found  true  AUTHOR_ID:55604964500                       6                      13       3              7            García Cruz            Gustavo Adolfo
    # 2    found  true  AUTHOR_ID:56595713900                       4                      21       3              5                 Akimov                    Alexey
    # 3    found  true  AUTHOR_ID:12792624600                      10                     117       6              7                 Alecke                     Björn
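
    On the first problem in the question: each top-level element is a list of one unnamed element, so the record itself sits one level further down, which is why content[[3]][[2]] is out of bounds while content[[3]][[1]][[2]] is not. As an alternative to flattening everything, the sketch below picks only the seven requested fields by name and converts the counts to integers. The helpers make_record and extract_author and the output column names are hypothetical, introduced only for illustration, and the sample data is a trimmed two-record copy of the structure shown in the question.

```r
# Hypothetical helper: builds one element with the same shape as in the question,
# including the extra unnamed one-element wrapper list.
make_record <- function(id, docs, cited, h, coauthors, surname, given) {
  list(list(
    `@status` = "found",
    `@_fa` = "true",
    coredata = list(`dc:identifier` = id,
                    `document-count` = docs,
                    `cited-by-count` = cited),
    `h-index` = h,
    `coauthor-count` = coauthors,
    `preferred-name` = list(surname = surname, `given-name` = given)
  ))
}

# Trimmed stand-in for `content` (two of the three records from the question).
content <- list(
  `author-retrieval-response` = make_record("AUTHOR_ID:56595713900", "4", "21",
                                            "3", "5", "Akimov", "Alexey"),
  `author-retrieval-response` = make_record("AUTHOR_ID:12792624600", "10", "117",
                                            "6", "7", "Alecke", "Björn")
)

# Each top-level element has length 1, so [[2]] at this level is out of bounds.
length(content[[2]])

# Hypothetical helper: step through the wrapper, then address each field by name,
# whatever level it lives at.
extract_author <- function(resp) {
  rec <- resp[[1]]
  data.frame(
    identifier     = rec$coredata$`dc:identifier`,
    document_count = as.integer(rec$coredata$`document-count`),
    cited_by_count = as.integer(rec$coredata$`cited-by-count`),
    h_index        = as.integer(rec$`h-index`),
    coauthor_count = as.integer(rec$`coauthor-count`),
    surname        = rec$`preferred-name`$surname,
    given_name     = rec$`preferred-name`$`given-name`,
    stringsAsFactors = FALSE
  )
}

# One row per author; unname() keeps the repeated list names out of the row names.
authors <- do.call(rbind, lapply(unname(content), extract_author))
```

    Addressing fields by name is more verbose than rapply, but a record with a missing or reordered field fails loudly at rbind() rather than silently shifting columns.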