
Add a column to a pandas DataFrame using values looked up from a second DataFrame [duplicate]

  • steff  · 6 years ago

    I want to add a new column to a DataFrame (df), with its values looked up from a second DataFrame (df2). df:

                 code       date  settlement  strike type
    0  CBT_21_G2015_S 2015-01-02    1.343750   126.0    C
    1  CBT_21_G2015_S 2015-01-02    4.359375   131.5    P
    2  CBT_21_G2015_S 2015-01-02   24.671875   102.5    C
    3  CBT_21_G2015_S 2015-01-02    0.015625   110.5    P
    4  CBT_21_G2015_S 2015-01-02    0.015625   101.0    P
    5  CBT_21_G2015_S 2015-01-02    0.015625   140.5    C
    6  CBT_21_G2015_S 2015-01-02   10.671875   116.5    C
    7  CBT_21_G2015_S 2015-01-02    0.015625   123.5    P
    8  CBT_21_F2015_S 2015-01-02    3.875000   131.0    P
    9  CBT_21_F2015_S 2015-01-02    0.015625   145.0    C
    

    The second DataFrame (df2):

                   code expiry_date
    id                             
    319  CBT_21_F2013_S  2012-12-21
    320  CBT_21_F2014_S  2013-12-27
    321  CBT_21_F2015_S  2014-12-26
    324  CBT_21_G2012_S  2012-01-27
    325  CBT_21_G2013_S  2013-01-25
    326  CBT_21_G2014_S  2014-01-24
    327  CBT_21_G2015_S  2015-01-23
    330  CBT_21_H2012_S  2012-02-24
    331  CBT_21_H2013_S  2013-02-22
    332  CBT_21_H2014_S  2014-02-21
    

    The column to add to df is the 'expiry_date' that matches each row's 'code'. To look up the expiry date: df2.loc[df2.code == df.code].expiry_date

    So the expected output should look like this:

                 code       date  settlement  strike type     expiry
    0  CBT_21_G2015_S 2015-01-02    1.343750   126.0    C 2015-01-23
    1  CBT_21_G2015_S 2015-01-02    4.359375   131.5    P 2015-01-23
    2  CBT_21_G2015_S 2015-01-02   24.671875   102.5    C 2015-01-23
    3  CBT_21_G2015_S 2015-01-02    0.015625   110.5    P 2015-01-23
    4  CBT_21_G2015_S 2015-01-02    0.015625   101.0    P 2015-01-23
    5  CBT_21_G2015_S 2015-01-02    0.015625   140.5    C 2015-01-23
    6  CBT_21_G2015_S 2015-01-02   10.671875   116.5    C 2015-01-23
    7  CBT_21_G2015_S 2015-01-02    0.015625   123.5    P 2015-01-23
    8  CBT_21_F2015_S 2015-01-02    3.875000   131.0    P 2014-12-26
    9  CBT_21_F2015_S 2015-01-02    0.015625   145.0    C 2014-12-26
    

    What is the simplest way to do this?

    1 answer
  •   rafaelc    6 years ago

    IIUC, you can use index alignment:

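    df = df.set_index('code')
    # Assignment aligns on the index; 'code' must be unique in df2
    df['expiry'] = df2.set_index('code')['expiry_date']
    df = df.reset_index()  # reset_index returns a new DataFrame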
    
                 code       date  settlement  strike type     expiry
    0  CBT_21_G2015_S 2015-01-02    1.343750   126.0    C 2015-01-23
    1  CBT_21_G2015_S 2015-01-02    4.359375   131.5    P 2015-01-23
    2  CBT_21_G2015_S 2015-01-02   24.671875   102.5    C 2015-01-23
    3  CBT_21_G2015_S 2015-01-02    0.015625   110.5    P 2015-01-23
    4  CBT_21_G2015_S 2015-01-02    0.015625   101.0    P 2015-01-23
    5  CBT_21_G2015_S 2015-01-02    0.015625   140.5    C 2015-01-23
    6  CBT_21_G2015_S 2015-01-02   10.671875   116.5    C 2015-01-23
    7  CBT_21_G2015_S 2015-01-02    0.015625   123.5    P 2015-01-23
    8  CBT_21_F2015_S 2015-01-02    3.875000   131.0    P 2014-12-26
    9  CBT_21_F2015_S 2015-01-02    0.015625   145.0    C 2014-12-26
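
    For comparison, here is a minimal self-contained sketch of two other common ways to do the same lookup, Series.map and a left merge. These are not part of the answer above. The sample frames are a three-row subset rebuilt from the question's tables, and the names lookup, mapped and merged are made up for illustration. It assumes 'code' is unique in df2.

```python
import pandas as pd

# Small sample reconstructed from the question's tables (subset of rows)
df = pd.DataFrame({
    "code": ["CBT_21_G2015_S", "CBT_21_G2015_S", "CBT_21_F2015_S"],
    "date": pd.to_datetime(["2015-01-02"] * 3),
    "settlement": [1.343750, 4.359375, 3.875000],
    "strike": [126.0, 131.5, 131.0],
    "type": ["C", "P", "P"],
})
df2 = pd.DataFrame(
    {
        "code": ["CBT_21_F2015_S", "CBT_21_G2015_S"],
        "expiry_date": pd.to_datetime(["2014-12-26", "2015-01-23"]),
    },
    index=pd.Index([321, 327], name="id"),
)

# Option 1: map each code through a code -> expiry_date Series.
# Assumes 'code' is unique in df2; codes with no match become NaT.
lookup = df2.set_index("code")["expiry_date"]
mapped = df.assign(expiry=df["code"].map(lookup))

# Option 2: left merge on 'code', which keeps every row of df in order.
merged = df.merge(
    df2.rename(columns={"expiry_date": "expiry"}), on="code", how="left"
)

print(mapped)
```

    Both options leave df's existing index untouched, so no set_index/reset_index round trip is needed. map is convenient for a single looked-up column, while merge extends more naturally to several columns at once.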