
Problem assigning column values based on an existing column in a DataFrame

  • Todd Burus  · 5 years ago

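    I am trying to create a new column in a DataFrame whose values are assigned based on the values in another column. The code I am using does assign values, but not the ones I expect, and I can't see what I am missing.

    Sample code: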

    #define track styles
    short = [4,6,8,9,11,20,24,28,30,33,35]
    inter = [2,3,7,12,13,17,19,25,27,32,34,36]
    long = [5,14,15,21,23,26]
    plate = [1,10,18,31]
    road = [16,22,29]
    
    #input driver and stat info    
    driver1 = input('Choose driver: ')
    
    #read driver data to dataframe
    df = pd.read_csv(driver1 + '_2018.csv')
    
    #add track type
    df['Type'] = ''
    
    for i in range(len(df)):
        if df['Race'][i] in short:
            df['Type'][i] = 'short'
        elif df['Race'][i] in inter:
            df['Type'] = 'intermediate'
        elif df['Race'][i] in long:
            df['Type'] = 'long'
        elif df['Race'][i] in plate:
            df['Type'] = 'plate'
        else:
            df['Type'] = 'road'
    
    print(df.head())
    

    I get the following output:

    C:\EclipseWorkspace\csse120\Personal\NASCAR_Projects\Other\driver_review.py:45: SettingWithCopyWarning: 
    A value is trying to be set on a copy of a slice from a DataFrame
    
    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
      df['Type'][i] = 'short'
       Race  Start  Mid Race      ...       Total Laps  DRIVER RATING          Type
    0     1      5        23      ...              207          105.2  intermediate
    1     2     16         7      ...              325           94.2  intermediate
    2     3     10         2      ...              267          106.1  intermediate
    3     4      5        11      ...              311           80.0  intermediate
    4     5      6         3      ...              200          113.0  intermediate
    
    [5 rows x 20 columns]
    

    Note that the 'Type' column comes back as 'intermediate' in every row, when it should contain ['plate', 'intermediate', 'intermediate', 'short', 'long'].

    1 answer

  •  1
  •   jezrael    5 years ago

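    Use map with a dictionary. First create a dict with the new names as keys and the lists as values, then invert it in a dict comprehension into a flat dictionary that maps each race number to its type name: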

    d = {'short':short, 
         'intermediate':inter,
         'long':long,
         'plate':plate,
         'road':road}
    
    d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}
    df['Type'] = df['Race'].map(d1)
    print (df)
       Race  Start  Mid Race  Total Laps  DRIVER RATING          Type
    0     1      5        23         207          105.2         plate
    1     2     16         7         325           94.2  intermediate
    2     3     10         2         267          106.1  intermediate
    3     4      5        11         311           80.0         short
    4     5      6         3         200          113.0          long
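
The snippet above relies on the lists and `df` from the question. As a minimal, self-contained sketch of the same idea, the following builds a made-up five-row frame (only a `Race` column, an assumption for illustration) and renames `long` to `long_tracks` purely for readability:

```python
import pandas as pd

# Track-style lists copied from the question.
short = [4, 6, 8, 9, 11, 20, 24, 28, 30, 33, 35]
inter = [2, 3, 7, 12, 13, 17, 19, 25, 27, 32, 34, 36]
long_tracks = [5, 14, 15, 21, 23, 26]
plate = [1, 10, 18, 31]
road = [16, 22, 29]

# Hypothetical stand-in for the CSV data: only the 'Race' column matters here.
df = pd.DataFrame({"Race": [1, 2, 3, 4, 5]})

# Type name -> list of race numbers.
d = {
    "short": short,
    "intermediate": inter,
    "long": long_tracks,
    "plate": plate,
    "road": road,
}

# Invert into a flat lookup: race number -> type name.
d1 = {race: name for name, races in d.items() for race in races}

# Series.map looks up each race number in the flat dictionary.
df["Type"] = df["Race"].map(d1)
print(df)
```

Because the lookup is applied to the whole column at once, there is no per-row assignment and no SettingWithCopyWarning.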
    

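    If you want every value that does not match any of the first 4 categories to be set to road, remove road from the first dictionary and add fillna to replace all the unmatched values: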

    d = {'short':short, 
         'intermediate':inter,
         'long':long,
         'plate':plate}
    
    d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}
    df['Type'] = df['Race'].map(d1).fillna('road')
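
To make the effect of `fillna` visible, here is a small sketch under the same assumptions as the question's lists. The four-row frame and the race number 99 are made up for illustration; `map` leaves unmatched keys as NaN, and `fillna` then replaces them:

```python
import pandas as pd

# Track-style lists copied from the question ('road' deliberately left out).
short = [4, 6, 8, 9, 11, 20, 24, 28, 30, 33, 35]
inter = [2, 3, 7, 12, 13, 17, 19, 25, 27, 32, 34, 36]
long_tracks = [5, 14, 15, 21, 23, 26]
plate = [1, 10, 18, 31]

d = {"short": short, "intermediate": inter, "long": long_tracks, "plate": plate}
d1 = {race: name for name, races in d.items() for race in races}

# Hypothetical data: 16 is a road course, 99 is not in any list at all.
df = pd.DataFrame({"Race": [1, 4, 16, 99]})

mapped = df["Race"].map(d1)          # unmatched race numbers become NaN
df["Type"] = mapped.fillna("road")   # every NaN is replaced by 'road'

print(mapped.isna().tolist())
print(df)
```

One consequence of this design is that any race number missing from the four lists, including a typo such as 99 here, is silently labelled road. Keeping the explicit `road` list avoids that at the cost of leaving NaN for unknown races.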
    

    Details:

    print (d1)
    
    {
        4: 'short', 6: 'short', 8: 'short', 9: 'short', 11: 'short', 20: 'short',
        24: 'short', 28: 'short', 30: 'short', 33: 'short', 35: 'short',
        2: 'intermediate', 3: 'intermediate', 7: 'intermediate', 12: 'intermediate',
        13: 'intermediate', 17: 'intermediate', 19: 'intermediate', 25: 'intermediate',
        27: 'intermediate', 32: 'intermediate', 34: 'intermediate', 36: 'intermediate',
        5: 'long', 14: 'long', 15: 'long', 21: 'long', 23: 'long', 26: 'long',
        1: 'plate', 10: 'plate', 18: 'plate', 31: 'plate',
        16: 'road', 22: 'road', 29: 'road'
    }
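
As a side note on why the loop in the question fills the whole column: only the first branch assigns to a single cell (`df['Type'][i] = ...`). The other branches assign `df['Type'] = ...`, which overwrites the entire column with one value each time they run, so the column ends up holding whatever such a branch assigned last. The chained indexing in the first branch is also what triggers the SettingWithCopyWarning. If a loop is preferred over `map`, a minimal sketch that assigns one cell at a time with `df.loc` could look like this (the five-row frame is made up, and `long` is renamed `long_tracks`):

```python
import pandas as pd

# Track-style lists copied from the question.
short = [4, 6, 8, 9, 11, 20, 24, 28, 30, 33, 35]
inter = [2, 3, 7, 12, 13, 17, 19, 25, 27, 32, 34, 36]
long_tracks = [5, 14, 15, 21, 23, 26]
plate = [1, 10, 18, 31]

# Hypothetical stand-in for the CSV data.
df = pd.DataFrame({"Race": [1, 2, 3, 4, 5]})
df["Type"] = ""

for i in df.index:
    race = df.loc[i, "Race"]
    # df.loc[row, column] writes to exactly one cell of the original frame.
    if race in short:
        df.loc[i, "Type"] = "short"
    elif race in inter:
        df.loc[i, "Type"] = "intermediate"
    elif race in long_tracks:
        df.loc[i, "Type"] = "long"
    elif race in plate:
        df.loc[i, "Type"] = "plate"
    else:
        df.loc[i, "Type"] = "road"

print(df)
```

This gives the same result as the `map` approach on this sample, but it does one Python-level lookup per row, so `map` is usually the better fit for larger frames.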