
Pandas concat of similar DataFrames and Series

  •  0
  • natemcintosh  · 5 years ago

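    I have a list of DataFrames that all have the same columns. Sometimes a DataFrame has only one row and is therefore a Series. When I try to concatenate this list with pd.concat and it contains a Series, the column labels I want end up in the index instead. See the minimal working example below.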

    In [1]: import pandas as pd                                                                                                                                                                                              
    
    In [2]: import numpy as np                                                                                                                                                                                               
    
    In [3]: d = {'a':np.random.randn(100), 'b':np.random.randn(100)}                                                                                                                                                         
    
    In [4]: df = pd.DataFrame(d)                                                                                                                                                                                             
    
    In [5]: thing1 = df.iloc[:10, :]                                                                                                                                                                                         
    
    In [6]: thing1                                                                                                                                                                                                           
    Out[6]: 
              a         b
    0 -0.505268 -1.109089
    1 -1.792729 -0.580566
    2 -0.478042  0.410095
    3 -0.758376  0.558772
    4  0.112519  0.556316
    5 -1.015813 -0.568148
    6  1.234858 -1.062879
    7 -0.455796 -0.107942
    8  1.231422  0.780694
    9 -1.082461 -1.809412
    
    In [7]: thing2 = df.iloc[10,:]                                                                                                                                                                                           
    
    In [8]: thing2                                                                                                                                                                                                           
    Out[8]: 
    a   -1.527836
    b    0.653610
    Name: 10, dtype: float64
    
    In [9]: thing3 = df.iloc[11:, :]                                                                                                                                                                                         
    
    In [10]: thing3                                                                                                                                                                                                          
    Out[10]: 
               a         b
    11 -1.247939 -0.694491
    12  1.359737  0.625284
    13 -0.491533 -0.230665
    14  1.360465  0.472451
    15  0.691532 -1.822708
    16  0.938316  1.310101
    17  0.485776 -0.313206
    18  1.398189 -0.232446
    19 -0.626278  0.714052
    20 -1.292272 -1.299580
    21 -1.521746 -1.615611
    22  1.464332  2.839602
    23  0.707370 -0.162056
    24 -1.825903  0.000278
    25  0.917284 -0.094716
    26 -0.239839  0.132572
    27 -0.463240 -0.805458
    28  1.174125  0.131057
    29  0.183503  0.328603
    30  0.045839 -0.244965
    31  0.449265  0.642082
    32  2.381600 -0.417044
    33  0.276217 -0.257426
    34  0.755067  0.012898
    35  0.130339 -0.094300
    36 -1.643097  0.038982
    37  0.895719  0.789494
    38  0.701480 -0.668440
    39 -0.201400  1.441928
    40 -2.018043 -0.106764
    ..       ...       ...
    70  0.971799  0.298164
    71  1.307070 -2.093075
    72 -1.049177  2.183065
    73 -0.469273 -0.739449
    74  0.685838  2.579547
    75  1.994485  0.783204
    76 -0.414760 -0.285766
    77 -1.005873 -0.783886
    78  1.486588 -0.349575
    79  1.417006 -0.676501
    80  1.284611 -0.817505
    81 -0.624406 -1.659931
    82 -0.921061  0.424663
    83 -0.645472 -0.769509
    84 -1.217172 -0.943542
    85 -0.184948  0.482977
    86 -0.253972 -0.080682
    87 -0.699122  0.368751
    88  1.391163  0.042899
    89 -0.075512  0.019728
    90  0.449151  0.486462
    91 -0.182553  0.876379
    92 -0.209162  0.390093
    93  0.789094  1.570251
    94 -1.018724 -0.084603
    95  1.109534  1.840739
    96  0.774806 -0.380387
    97  0.534344  1.165343
    98  1.003597 -0.221899
    99 -0.659863 -1.061590
    
    [89 rows x 2 columns]
    
    In [11]: pd.concat([thing1, thing2, thing3])                                                                                                                                                                             
    Out[11]: 
               a         b         0
    0  -0.505268 -1.109089       NaN
    1  -1.792729 -0.580566       NaN
    2  -0.478042  0.410095       NaN
    3  -0.758376  0.558772       NaN
    4   0.112519  0.556316       NaN
    5  -1.015813 -0.568148       NaN
    6   1.234858 -1.062879       NaN
    7  -0.455796 -0.107942       NaN
    8   1.231422  0.780694       NaN
    9  -1.082461 -1.809412       NaN
    a        NaN       NaN -1.527836
    b        NaN       NaN  0.653610
    11 -1.247939 -0.694491       NaN
    12  1.359737  0.625284       NaN
    13 -0.491533 -0.230665       NaN
    14  1.360465  0.472451       NaN
    15  0.691532 -1.822708       NaN
    16  0.938316  1.310101       NaN
    17  0.485776 -0.313206       NaN
    18  1.398189 -0.232446       NaN
    19 -0.626278  0.714052       NaN
    20 -1.292272 -1.299580       NaN
    21 -1.521746 -1.615611       NaN
    22  1.464332  2.839602       NaN
    23  0.707370 -0.162056       NaN
    24 -1.825903  0.000278       NaN
    25  0.917284 -0.094716       NaN
    26 -0.239839  0.132572       NaN
    27 -0.463240 -0.805458       NaN
    28  1.174125  0.131057       NaN
    ..       ...       ...       ...
    70  0.971799  0.298164       NaN
    71  1.307070 -2.093075       NaN
    72 -1.049177  2.183065       NaN
    73 -0.469273 -0.739449       NaN
    74  0.685838  2.579547       NaN
    75  1.994485  0.783204       NaN
    76 -0.414760 -0.285766       NaN
    77 -1.005873 -0.783886       NaN
    78  1.486588 -0.349575       NaN
    79  1.417006 -0.676501       NaN
    80  1.284611 -0.817505       NaN
    81 -0.624406 -1.659931       NaN
    82 -0.921061  0.424663       NaN
    83 -0.645472 -0.769509       NaN
    84 -1.217172 -0.943542       NaN
    85 -0.184948  0.482977       NaN
    86 -0.253972 -0.080682       NaN
    87 -0.699122  0.368751       NaN
    88  1.391163  0.042899       NaN
    89 -0.075512  0.019728       NaN
    90  0.449151  0.486462       NaN
    91 -0.182553  0.876379       NaN
    92 -0.209162  0.390093       NaN
    93  0.789094  1.570251       NaN
    94 -1.018724 -0.084603       NaN
    95  1.109534  1.840739       NaN
    96  0.774806 -0.380387       NaN
    97  0.534344  1.165343       NaN
    98  1.003597 -0.221899       NaN
    99 -0.659863 -1.061590       NaN
    
    [101 rows x 3 columns]
    

    Note that for this problem I need to keep the original index.

    I have spent a long time going through the documentation but can't seem to solve my problem. Is there a simple way to fix this?

    1 answer  |  as of 5 years ago
        1
  •  2
  •   Poojan    5 years ago
    # Turn the Series into a one-row DataFrame so its labels become columns
    thing2 = pd.DataFrame(thing2).transpose()
    pd.concat([thing1, thing2, thing3])
    

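    In your case, transpose() turns the index of the pandas Series into columns, giving a one-row DataFrame whose row label is the Series name (10 here), so the original index is kept. After that, the concat works as expected.
    Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.transpose.html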
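
    Since the question mentions a whole list of pieces, the same idea can be applied to every element before concatenating. The following is only a sketch under a few assumptions: the helper name as_frame and the list name pieces are made up for illustration, and Series.to_frame().T is used as an equivalent of pd.DataFrame(thing2).transpose().

```python
import numpy as np
import pandas as pd


def as_frame(obj):
    """Hypothetical helper: pass DataFrames through, turn a Series into a one-row DataFrame."""
    if isinstance(obj, pd.Series):
        # to_frame() yields a single column named after the Series;
        # .T flips it into a single row whose index label is that name.
        return obj.to_frame().T
    return obj


rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.standard_normal(100), "b": rng.standard_normal(100)})

# The middle piece is a Series, as in the question.
pieces = [df.iloc[:10, :], df.iloc[10, :], df.iloc[11:, :]]

# Normalise every piece, then concatenate; the original row labels are preserved.
result = pd.concat([as_frame(p) for p in pieces])
print(result.shape)
```

    If you control how the pieces are produced, selecting with a list or a slice, for example df.iloc[[10], :] or df.iloc[10:11, :], returns a one-row DataFrame rather than a Series, so no conversion is needed at all.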