
How to append the columns of two DataFrames side by side in pandas

  •  2
  • Mohamed Thasin ah  · Tech community  · 6 years ago

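    I have a df in which every column is numeric. I want to compute the cumprod of each column and place each result right next to the column it came from, so that the two can be compared side by side. How can I do this?

    For example:

    My input df: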

            col1      col2      col3
    0   1.000000  1.000000  1.000000
    1   0.998766  0.999490  0.998892
    2   0.997779  0.999081  0.998005
    3   0.996299  0.998469  0.996676
    4   0.994573  0.997754  0.995126
    5   0.993095  0.997140  0.993797
    6   0.991125  0.996322  0.992027
    7   0.989648  0.995708  0.990699
    8   0.988171  0.995094  0.989372
    9   0.986695  0.994480  0.988045
    10  0.984729  0.993660  0.986276
    11  0.983010  0.992943  0.984730
    

    cumprod of the df:

            col1      col2      col3
    0   1.000000  1.000000  1.000000
    1   0.998766  0.999490  0.998892
    2   0.996547  0.998572  0.996899
    3   0.992859  0.997043  0.993585
    4   0.987471  0.994803  0.988742
    5   0.980653  0.991958  0.982609
    6   0.971949  0.988310  0.974775
    7   0.961887  0.984069  0.965708
    8   0.950509  0.979241  0.955444
    9   0.937863  0.973836  0.944022
    10  0.923541  0.967662  0.931066
    11  0.907850  0.960833  0.916849
    

    Expected output:

            col1      col1      col2      col2      col3      col3
    0   1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
    1   0.998766  0.998766  0.999490  0.999490  0.998892  0.998892
    2   0.997779  0.996547  0.999081  0.998572  0.998005  0.996899
    3   0.996299  0.992859  0.998469  0.997043  0.996676  0.993585
    4   0.994573  0.987471  0.997754  0.994803  0.995126  0.988742
    5   0.993095  0.980653  0.997140  0.991958  0.993797  0.982609
    6   0.991125  0.971949  0.996322  0.988310  0.992027  0.974775
    7   0.989648  0.961887  0.995708  0.984069  0.990699  0.965708
    8   0.988171  0.950509  0.995094  0.979241  0.989372  0.955444
    9   0.986695  0.937863  0.994480  0.973836  0.988045  0.944022
    10  0.984729  0.923541  0.993660  0.967662  0.986276  0.931066
    11  0.983010  0.907850  0.992943  0.960833  0.984730  0.916849
    

    Note: I would prefer cum_of_coln rather than coln as the name of each cumulative column.

    The code I use to get the cumprod:

    print(df)
    print(df.cumprod())
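
For anyone who wants to reproduce the question, a minimal setup might look like the sketch below. It assumes only the first four rows of the input shown above (rounded to six decimals), so the products can differ from the printed tables in the last digit.

```python
import pandas as pd

# First four rows of the input shown above, enough to reproduce the pattern.
df = pd.DataFrame({
    "col1": [1.000000, 0.998766, 0.997779, 0.996299],
    "col2": [1.000000, 0.999490, 0.999081, 0.998469],
    "col3": [1.000000, 0.998892, 0.998005, 0.996676],
})

# Running product down each column: row i holds the product of rows 0..i.
cum = df.cumprod()
print(cum)
```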
    
    3 answers  |  up to 6 years ago
        1
  •  2
  •   cs95 abhishek58g    6 years ago

    Compute the cumprod, then use interleave from toolz (cytoolz provides the same function) to interleave the column headers:

    import pandas as pd
    from toolz import interleave
    
    df2 = df.cumprod().add_prefix('cum_of_')
    df3 = pd.concat([df, df2], axis=1)[list(interleave([df, df2]))]
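
If toolz is not available, the same interleaving can probably be done with the standard library. The sketch below is an assumption on my part, not part of the answer: it pairs the two sets of column names with zip and flattens the pairs with itertools.chain, which only works when both frames have the same number of columns in the same order. The small two-column frame is made up for illustration.

```python
from itertools import chain

import pandas as pd

# Small illustrative frame (values taken from the first rows of the question).
df = pd.DataFrame({"col1": [1.0, 0.998766, 0.997779],
                   "col2": [1.0, 0.999490, 0.999081]})
df2 = df.cumprod().add_prefix("cum_of_")

# zip pairs each original name with its cumulative name; chain flattens the pairs.
order = list(chain.from_iterable(zip(df.columns, df2.columns)))
df3 = pd.concat([df, df2], axis=1)[order]
print(df3.columns.tolist())
```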
    

    Alternatively, you can use sorted:

    df2 = df.cumprod().add_prefix('cum_of_')
    df3 = pd.concat([df, df2], axis=1)
    df3 = df3[sorted(df3, key=lambda x: x.split('_')[-1])]
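
The key above relies on the text after the last underscore being unique for each column. When that does not hold (for example two columns that both end in _x), a key based on the position of the source column may be safer. The following is only a sketch under that assumption; PREFIX, pos and source_position are names invented here, and the frame is made up for illustration.

```python
import pandas as pd

PREFIX = "cum_of_"  # same prefix as in the answer above

# Illustrative frame whose names share the same last token ("x").
df = pd.DataFrame({"b_x": [1.0, 0.5], "a_x": [1.0, 0.25]})
df3 = pd.concat([df, df.cumprod().add_prefix(PREFIX)], axis=1)

# Position of every original column, so the original order is preserved.
pos = {name: i for i, name in enumerate(df.columns)}

def source_position(name):
    # Original names map to themselves; prefixed names map to their source column.
    base = name if name in pos else name[len(PREFIX):]
    return pos[base]

# sorted() is stable, so each original column stays ahead of its cumulative twin.
df3 = df3[sorted(df3.columns, key=source_position)]
print(df3.columns.tolist())
```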
    

    A third option is to rename the column headers after sorting. This should be quite efficient.

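    # A stable sort keeps each original column ahead of its same-named cumprod column.
    df3 = pd.concat([df, df.cumprod()], axis=1).sort_index(axis=1, kind='mergesort')
    c = df3.columns.to_numpy(copy=True)  # writable copy of the column labels
    c[1::2] = 'cum_of_' + c[1::2]        # every second column is a cumprod
    df3.columns = c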
    

    df3.head()
            col1  cum_of_col1      col2  cum_of_col2      col3  cum_of_col3
    0   1.000000     1.000000  1.000000     1.000000  1.000000     1.000000
    1   0.998766     0.998766  0.999490     0.999490  0.998892     0.998892
    2   0.997779     0.996548  0.999081     0.998571  0.998005     0.996899
    3   0.996299     0.992860  0.998469     0.997043  0.996676     0.993586
    4   0.994573     0.987471  0.997754     0.994803  0.995126     0.988743
    
        2
  •  1
  •   jezrael    6 years ago

    Use concat, then reorder the columns with a list built by a list comprehension:

    cols = [item for x in df.columns for item in (x, 'cum_of_' + x)]
    df = pd.concat([df, df.cumprod().add_prefix('cum_of_')], axis=1)[cols]
    
    print(df)
            col1  cum_of_col1      col2  cum_of_col2      col3  cum_of_col3
    0   1.000000     1.000000  1.000000     1.000000  1.000000     1.000000
    1   0.998766     0.998766  0.999490     0.999490  0.998892     0.998892
    2   0.997779     0.996548  0.999081     0.998571  0.998005     0.996899
    3   0.996299     0.992860  0.998469     0.997043  0.996676     0.993586
    4   0.994573     0.987471  0.997754     0.994803  0.995126     0.988743
    5   0.993095     0.980653  0.997140     0.991958  0.993797     0.982610
    6   0.991125     0.971949  0.996322     0.988310  0.992027     0.974775
    7   0.989648     0.961888  0.995708     0.984068  0.990699     0.965709
    8   0.988171     0.950510  0.995094     0.979240  0.989372     0.955445
    9   0.986695     0.937863  0.994480     0.973835  0.988045     0.944023
    10  0.984729     0.923541  0.993660     0.967661  0.986276     0.931067
    11  0.983010     0.907850  0.992943     0.960832  0.984730     0.916850
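
The same idea can be wrapped in a small helper so the original frame is left untouched. This is just a sketch: with_cumprod and its prefix parameter are hypothetical names, and it assumes the column labels are strings.

```python
import pandas as pd

def with_cumprod(frame, prefix="cum_of_"):
    """Return a new frame with each column followed by its cumulative product."""
    cum = frame.cumprod().add_prefix(prefix)
    # Pair every original label with its prefixed counterpart, in original order.
    order = [name for col in frame.columns for name in (col, prefix + col)]
    return pd.concat([frame, cum], axis=1)[order]

# Illustrative frame (values taken from the first rows of the question).
df = pd.DataFrame({"col1": [1.0, 0.998766, 0.997779],
                   "col2": [1.0, 0.999490, 0.999081]})
out = with_cumprod(df)
print(out.columns.tolist())
```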
    
        3
  •  1
  •   yatu Sayali Sonawane    6 years ago

    Append the columns directly with DataFrame.assign:

    df.assign(**df.cumprod().add_prefix('cumprod_'))
    
            col1      col2      col3  cumprod_col1  cumprod_col2  cumprod_col3
    0   1.000000  1.000000  1.000000      1.000000      1.000000      1.000000
    1   0.998766  0.999490  0.998892      0.998766      0.999490      0.998892
    2   0.997779  0.999081  0.998005      0.996548      0.998571      0.996899
    3   0.996299  0.998469  0.996676      0.992860      0.997043      0.993586
    4   0.994573  0.997754  0.995126      0.987471      0.994803      0.988743
    5   0.993095  0.997140  0.993797      0.980653      0.991958      0.982610
    6   0.991125  0.996322  0.992027      0.971949      0.988310      0.974775
    7   0.989648  0.995708  0.990699      0.961888      0.984068      0.965709
    8   0.988171  0.995094  0.989372      0.950510      0.979240      0.955445
    9   0.986695  0.994480  0.988045      0.937863      0.973835      0.944023
    10  0.984729  0.993660  0.986276      0.923541      0.967661      0.931067
    11  0.983010  0.992943  0.984730      0.907850      0.960832      0.916850
    

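    If you want each column followed by its cumulative product (col1, col1_cumprod, ...), you can sort the columns alphabetically with reindex. Older pandas versions used reindex_axis for this, but it was deprecated and removed in later releases. Note that this example adds a suffix with add_suffix instead of a prefix:

    df = df.assign(**df.cumprod().add_suffix('_cumprod'))
    df = df.reindex(sorted(df.columns), axis=1)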
    
            col1  col1_cumprod      col2  col2_cumprod      col3  col3_cumprod
    0   1.000000      1.000000  1.000000      1.000000  1.000000      1.000000
    1   0.998766      0.998766  0.999490      0.999490  0.998892      0.998892
    2   0.997779      0.996548  0.999081      0.998571  0.998005      0.996899
    3   0.996299      0.992860  0.998469      0.997043  0.996676      0.993586
    4   0.994573      0.987471  0.997754      0.994803  0.995126      0.988743
    5   0.993095      0.980653  0.997140      0.991958  0.993797      0.982610
    6   0.991125      0.971949  0.996322      0.988310  0.992027      0.974775
    7   0.989648      0.961888  0.995708      0.984068  0.990699      0.965709
    8   0.988171      0.950510  0.995094      0.979240  0.989372      0.955445
    9   0.986695      0.937863  0.994480      0.973835  0.988045      0.944023
    10  0.984729      0.923541  0.993660      0.967661  0.986276      0.931067
    11  0.983010      0.907850  0.992943      0.960832  0.984730      0.916850
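
One caveat with alphabetical sorting: it keeps the pairs together for col1..col3, but not necessarily for every set of names. The sketch below uses a made-up frame with col1 and col10 to show the problem, and then a possible alternative that places each cumulative column directly after its source with DataFrame.insert. It assumes the column labels are unique strings.

```python
import pandas as pd

# Illustrative frame; "col10" is a made-up name to trigger the ordering issue.
df = pd.DataFrame({"col1": [1.0, 0.5], "col10": [1.0, 0.25]})

# Alphabetical order separates col1 from col1_cumprod, because "0" sorts before "_".
alphabetical = sorted(list(df.columns) + [c + "_cumprod" for c in df.columns])
print(alphabetical)

# Alternative: insert each cumulative column directly after its source column.
out = df.copy()
for col in df.columns:
    out.insert(out.columns.get_loc(col) + 1, col + "_cumprod", df[col].cumprod())
print(out.columns.tolist())
```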