
KeyError when using the mean function after groupby

  •  1
  • Karn Kumar  · Tech Community  · 6 years ago

    I get KeyError: 'BasePay' when I try to apply mean() to the BasePay column.

    My pandas version is '0.23.3' with Python 3.6.3.

        >>> import numpy as np
        >>> import pandas as pd
        >>> salDataF = pd.read_csv('Salaries.csv', low_memory=False)
        >>> salDataF.head()
           Id       EmployeeName                                        JobTitle    BasePay OvertimePay   OtherPay  ...     TotalPay  TotalPayBenefits  Year  Notes         Agency Status
        0   1     NATHANIEL FORD  GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY  167411.18         0.0  400184.25  ...    567595.43         567595.43  2011    NaN  San Francisco    NaN
        1   2       GARY JIMENEZ                 CAPTAIN III (POLICE DEPARTMENT)  155966.02   245131.88  137811.38  ...    538909.28         538909.28  2011    NaN  San Francisco    NaN
        2   3     ALBERT PARDINI                 CAPTAIN III (POLICE DEPARTMENT)  212739.13   106088.18    16452.6  ...    335279.91         335279.91  2011    NaN  San Francisco    NaN
        3   4  CHRISTOPHER CHONG            WIRE ROPE CABLE MAINTENANCE MECHANIC    77916.0    56120.71   198306.9  ...    332343.61         332343.61  2011    NaN  San Francisco    NaN
        4   5    PATRICK GARDNER    DEPUTY CHIEF OF DEPARTMENT,(FIRE DEPARTMENT)   134401.6      9737.0  182234.59  ...    326373.19         326373.19  2011    NaN  San Francisco    NaN
    
        [5 rows x 13 columns]
        >>> EmpSal = salDataF.groupby('Year').mean()
        KeyboardInterrupt
        >>> salDataF.groupby('Year').mean()
                    Id      TotalPay  TotalPayBenefits  Notes
        Year
        2011   18080.0  71744.103871      71744.103871    NaN
        2012   54542.5  74113.262265     100553.229232    NaN
        2013   91728.5  77611.443142     101440.519714    NaN
        2014  129593.0  75463.918140     100250.918884    NaN
        >>> EmpSal = salDataF.groupby('Year').mean()['BasePay']
    

    Error: KeyError: 'BasePay'

    1 answer  |  as of 6 years ago
        1
  •  0
  •   jezrael    6 years ago

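    The problem is that BasePay is not numeric, so salDataF.groupby('Year').mean() excludes it: all non-numeric columns are dropped by design. That is why BasePay is missing from the output above and indexing the result with ['BasePay'] raises KeyError.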
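To confirm this diagnosis before converting anything, it can help to look at the column dtypes. The snippet below is only a sketch: the small DataFrame is a hypothetical stand-in for Salaries.csv, and the "Not Provided" value is an assumed example of a non-numeric entry, not something taken from the real file.

```python
import pandas as pd

# Hypothetical miniature of Salaries.csv: BasePay holds strings, not floats
df = pd.DataFrame({
    "Year": [2011, 2011, 2012],
    "BasePay": ["167411.18", "155966.02", "Not Provided"],
    "TotalPay": [567595.43, 538909.28, 74113.26],
})

# Inspect the dtype of every column; BasePay will not show up as float64
print(df.dtypes)

# List only the columns pandas treats as numeric
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

If BasePay is absent from the list of numeric columns, a numeric-only aggregation has nothing to work with for that column.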

    One solution is to convert the column with astype:

    salDataF['BasePay'] = salDataF['BasePay'].astype(float)
    

    If that fails because the column contains some non-numeric values, use to_numeric with errors='coerce' to convert those values to NaN:

    salDataF['BasePay'] = pd.to_numeric(salDataF['BasePay'], errors='coerce')
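
As a rough illustration of the difference between the two conversions, the sketch below uses a hypothetical Series (the values, including "Not Provided", are made up for the example). It shows astype(float) failing on a bad entry while to_numeric with errors='coerce' turns it into NaN, and counts how many values were lost that way.

```python
import pandas as pd

# Hypothetical raw column: two numeric strings and one junk entry
raw = pd.Series(["167411.18", "77916.0", "Not Provided"], dtype=object)

# astype(float) raises as soon as it meets a value it cannot parse
astype_failed = False
try:
    raw.astype(float)
except ValueError as exc:
    astype_failed = True
    print("astype failed:", exc)

# errors='coerce' replaces unparseable values with NaN instead of raising
converted = pd.to_numeric(raw, errors="coerce")

# Count the entries that became NaN during the conversion
n_bad = int(converted.isna().sum() - raw.isna().sum())
print(converted)
print("values coerced to NaN:", n_bad)
```

Checking the count of coerced values is worthwhile, since a large number may indicate a formatting problem rather than a few stray entries.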
    

    Then select the BasePay column before calling mean, so that only this column is aggregated:

    EmpSal = salDataF.groupby('Year')['BasePay'].mean()
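
Putting the steps together, a minimal end-to-end sketch might look like this. The data is hypothetical and deliberately tiny so the result is easy to verify by hand; the names df and emp_sal are illustrative and not from the question.

```python
import pandas as pd

# Hypothetical stand-in for Salaries.csv with BasePay stored as strings
df = pd.DataFrame({
    "Year": [2011, 2011, 2012, 2012],
    "BasePay": pd.Series(["100.0", "200.0", "300.0", "Not Provided"], dtype=object),
})

# Step 1: make BasePay numeric; unparseable entries become NaN
df["BasePay"] = pd.to_numeric(df["BasePay"], errors="coerce")

# Step 2: select the column first, then aggregate per year
emp_sal = df.groupby("Year")["BasePay"].mean()
print(emp_sal)
```

Here mean() skips NaN by default, so 2012 averages only the one valid value. As far as I know, on pandas 2.0 and later calling mean() on the whole grouped frame raises a TypeError for object columns instead of silently dropping them, so selecting the column first is the safer habit on newer versions as well.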