How to revert the "unstack" method of a "DataFrame" in "pandas" back to the original object?

  • O.rka · 6 years ago

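    I want to convert a symmetric similarity matrix (pd.DataFrame) into an unstacked pd.Series with a pd.MultiIndex, and then back into a pd.DataFrame.

    Below is how I get the symmetric pd.DataFrame, along with my attempt at reversing the operation.

    Do I need pivot? I want to end up with another pd.DataFrame, df_sqr_revert, that is the same as the original df_sqr.

    Does anyone know how to reverse this operation?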

    import pandas as pd

    # Symmetric similarity matrix as a nested dict: {column: {row: value}}
    data = {
        'sepal_length': {'sepal_length': 1.0, 'sepal_width': 0.44531537502467533, 'petal_length': 0.935877078652436, 'petal_width': 0.9089768166845817},
        'sepal_width': {'sepal_length': 0.44531537502467533, 'sepal_width': 1.0, 'petal_length': 0.2897419517994226, 'petal_width': 0.32172795519309727},
        'petal_length': {'sepal_length': 0.935877078652436, 'sepal_width': 0.2897419517994226, 'petal_length': 1.0, 'petal_width': 0.9813785485254833},
        'petal_width': {'sepal_length': 0.9089768166845817, 'sepal_width': 0.32172795519309727, 'petal_length': 0.9813785485254833, 'petal_width': 1.0},
    }
    
    df_sqr = pd.DataFrame(data)
    #               petal_length  petal_width  sepal_length  sepal_width
    # petal_length      1.000000     0.981379      0.935877     0.289742
    # petal_width       0.981379     1.000000      0.908977     0.321728
    # sepal_length      0.935877     0.908977      1.000000     0.445315
    # sepal_width       0.289742     0.321728      0.445315     1.000000
    Se_vertical = df_sqr.unstack()
    # petal_length  petal_length    1.000000
    #               petal_width     0.981379
    #               sepal_length    0.935877
    #               sepal_width     0.289742
    # petal_width   petal_length    0.981379
    #               petal_width     1.000000
    #               sepal_length    0.908977
    #               sepal_width     0.321728
    # sepal_length  petal_length    0.935877
    #               petal_width     0.908977
    #               sepal_length    1.000000
    #               sepal_width     0.445315
    # sepal_width   petal_length    0.289742
    #               petal_width     0.321728
    #               sepal_length    0.445315
    #               sepal_width     1.000000
    # dtype: float64
    
    # df_sqr_revert = Se_vertical.stack()
    # AttributeError: 'Series' object has no attribute 'stack'
    
    1 answer

  • tobsecret · 6 years ago

    Paradoxically, what you want is a second unstack call:

    In [14]: df
    Out[14]: 
                  sepal_length  sepal_width  petal_length  petal_width
    petal_length      0.935877     0.289742      1.000000     0.981379
    petal_width       0.908977     0.321728      0.981379     1.000000
    sepal_length      1.000000     0.445315      0.935877     0.908977
    sepal_width       0.445315     1.000000      0.289742     0.321728
    
    
    In [13]: df.unstack().unstack()
    Out[13]: 
                  petal_length  petal_width  sepal_length  sepal_width
    sepal_length      0.935877     0.908977      1.000000     0.445315
    sepal_width       0.289742     0.321728      0.445315     1.000000
    petal_length      1.000000     0.981379      0.935877     0.289742
    petal_width       0.981379     1.000000      0.908977     0.321728
    

    The documentation mentions that, for a Series, unstack is also known as pivot, as you suspected in your question.
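Comparing the two outputs above, the row and column labels have swapped places: the double unstack returns the transpose of the input. For a symmetric matrix the values are unaffected, but for a general frame the orientation matters. The following is a minimal sketch, using a small hypothetical asymmetric frame (not the question's data), of how passing level=0 to the second unstack appears to restore the original orientation directly:

```python
import pandas as pd

# Hypothetical asymmetric frame, chosen so that a transpose is visible.
df = pd.DataFrame(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    index=["a", "b", "c"],
    columns=["x", "y"],
)

s = df.unstack()  # Series indexed by (column label, row label)

transposed = s.unstack()       # default level=-1: row labels become the columns
restored = s.unstack(level=0)  # outer level: column labels become columns again

# Reindex defensively so that label order cannot affect the comparison.
restored = restored.reindex(index=df.index, columns=df.columns)
transposed = transposed.reindex(index=df.columns, columns=df.index)

pd.testing.assert_frame_equal(restored, df)
pd.testing.assert_frame_equal(transposed, df.T)
```

An equivalent option under the same assumptions is to keep the default call and transpose the result with .T.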


    Just because I was curious: the difference between stack and unstack becomes more apparent when we add a prefix to the column and index labels:

    In [17]: df.columns = [f'columns_{i}' for i in df.columns]
    
    In [18]: df.index = [f'index_{i}' for i in df.index]
    

    .stack() sets the row index as the leftmost level of the MultiIndex:

    In [20]: df.stack()
    Out[20]: 
    index_petal_length  columns_sepal_length    0.935877
                        columns_sepal_width     0.289742
                        columns_petal_length    1.000000
                        columns_petal_width     0.981379
    index_petal_width   columns_sepal_length    0.908977
                        columns_sepal_width     0.321728
                        columns_petal_length    0.981379
                        columns_petal_width     1.000000
    index_sepal_length  columns_sepal_length    1.000000
                        columns_sepal_width     0.445315
                        columns_petal_length    0.935877
                        columns_petal_width     0.908977
    index_sepal_width   columns_sepal_length    0.445315
                        columns_sepal_width     1.000000
                        columns_petal_length    0.289742
                        columns_petal_width     0.321728
    dtype: float64
    

    .unstack() sets the column index as the leftmost level of the MultiIndex:

    In [21]: df.unstack()
    Out[21]: 
    columns_sepal_length  index_petal_length    0.935877
                          index_petal_width     0.908977
                          index_sepal_length    1.000000
                          index_sepal_width     0.445315
    columns_sepal_width   index_petal_length    0.289742
                          index_petal_width     0.321728
                          index_sepal_length    0.445315
                          index_sepal_width     1.000000
    columns_petal_length  index_petal_length    1.000000
                          index_petal_width     0.981379
                          index_sepal_length    0.935877
                          index_sepal_width     0.289742
    columns_petal_width   index_petal_length    0.981379
                          index_petal_width     1.000000
                          index_sepal_length    0.908977
                          index_sepal_width     0.321728
    dtype: float64
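The relationship between the two results can also be checked programmatically. The sketch below uses a small hypothetical frame (the labels r0, c0 and so on are illustrative, not from the answer) and assumes no missing values; it shows that the same cell is addressed by (row, column) after stack and by (column, row) after unstack, and that swapping the levels of one result gives the other:

```python
import pandas as pd

# Hypothetical frame with distinct row and column labels.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["r0", "r1"],
    columns=["c0", "c1", "c2"],
)

stacked = df.stack()      # MultiIndex levels: (row label, column label)
unstacked = df.unstack()  # MultiIndex levels: (column label, row label)

# The same cell, addressed with the two level orders.
assert stacked[("r1", "c2")] == unstacked[("c2", "r1")] == 6

# Swapping the levels of one result gives the other, up to row order.
swapped = unstacked.swaplevel().sort_index()
assert swapped.equals(stacked.sort_index())
```

Depending on the pandas version, calling stack may emit a FutureWarning about its newer implementation; that does not change the result for a frame like this one.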