
DataError: No numeric types to aggregate when creating a pivot_table in pandas

  • user288609  · 6 years ago

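    The two periods are period 1 (1998-01-01 to 1998-02-01) and period 2 (1998-02-02 to 1998-05-02). I want to convert this into another data frame, as shown in the second figure. Each entry of the second data frame is the mean value for that ID in the corresponding period.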

    As a friend here suggested, this is the code:

    import pandas as pd
    
    df=pd.DataFrame({"ID": ["1002", "2001", "1002", "2003", "1002"],
                    "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-01-20"],
                     "Price": ["34", "45", "23", "30","60"]})
    print(df)
    
    df.Time=pd.to_datetime(df.Time)
    period2=pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')
    df['Price'].apply(pd.to_numeric)
    df['New']='period1'
    
    df.loc[df.Time.apply(lambda x : x in period2),'New']='period2'
    df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')
    

    But I get the error message below. What could be the cause?

    ---------------------------------------------------------------------------
    DataError                                 Traceback (most recent call last)
    <ipython-input-62-1d40b807c09c> in <module>()
    ----> 1 df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')
    
    ~\Anaconda3\lib\site-packages\pandas\core\frame.py in pivot_table(self, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name)
       5298                            aggfunc=aggfunc, fill_value=fill_value,
       5299                            margins=margins, dropna=dropna,
    -> 5300                            margins_name=margins_name)
       5301 
       5302     def stack(self, level=-1, dropna=True):
    
    ~\Anaconda3\lib\site-packages\pandas\core\reshape\pivot.py in pivot_table(data, values, index, columns, aggfunc, fill_value, margins, dropna, margins_name)
         81 
         82     grouped = data.groupby(keys, observed=dropna)
    ---> 83     agged = grouped.agg(aggfunc)
         84 
         85     table = agged
    
    ~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in aggregate(self, arg, *args, **kwargs)
       4656         axis=''))
       4657     def aggregate(self, arg, *args, **kwargs):
    -> 4658         return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
       4659 
       4660     agg = aggregate
    
    ~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in aggregate(self, arg, *args, **kwargs)
       4087 
       4088         _level = kwargs.pop('_level', None)
    -> 4089         result, how = self._aggregate(arg, _level=_level, *args, **kwargs)
       4090         if how is None:
       4091             return result
    
    ~\Anaconda3\lib\site-packages\pandas\core\base.py in _aggregate(self, arg, *args, **kwargs)
        346         if isinstance(arg, compat.string_types):
        347             return self._try_aggregate_string_function(arg, *args,
    --> 348                                                        **kwargs), None
        349 
        350         if isinstance(arg, dict):
    
    ~\Anaconda3\lib\site-packages\pandas\core\base.py in _try_aggregate_string_function(self, arg, *args, **kwargs)
        302         if f is not None:
        303             if callable(f):
    --> 304                 return f(*args, **kwargs)
        305 
        306             # people may try to aggregate on a non-callable attribute
    
    ~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in mean(self, *args, **kwargs)
       1304         nv.validate_groupby_func('mean', args, kwargs, ['numeric_only'])
       1305         try:
    -> 1306             return self._cython_agg_general('mean', **kwargs)
       1307         except GroupByError:
       1308             raise
    
    ~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
       3972                             min_count=-1):
       3973         new_items, new_blocks = self._cython_agg_blocks(
    -> 3974             how, alt=alt, numeric_only=numeric_only, min_count=min_count)
       3975         return self._wrap_agged_blocks(new_items, new_blocks)
       3976 
    
    ~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
       4044 
       4045         if len(new_blocks) == 0:
    -> 4046             raise DataError('No numeric types to aggregate')
       4047 
       4048         # reset the locs in the blocks to correspond to our
    
    DataError: No numeric types to aggregate
    
    1 answer
  •   jezrael    6 years ago

    The problem is that you forgot to assign the converted column back, so Price still holds strings:

    df['Price'] = df['Price'].apply(pd.to_numeric)

    Or:

    df['Price'] = df['Price'].astype(int)
    

    Or:

    df['Price'] = pd.to_numeric(df['Price'])
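
    To see why the assignment matters, here is a minimal sketch on a cut-down copy of the question's frame. The variable names (converted, unchanged_is_numeric and so on) are only for illustration. It checks the column with pandas.api.types.is_numeric_dtype before and after assigning back:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Cut-down version of the question's data: prices are stored as strings.
df = pd.DataFrame({"ID": ["1002", "2001", "1002"],
                   "Price": ["34", "45", "60"]})

# The conversion returns a NEW Series; df itself is left untouched.
converted = df["Price"].apply(pd.to_numeric)
unchanged_is_numeric = is_numeric_dtype(df["Price"])   # column still holds strings
converted_is_numeric = is_numeric_dtype(converted)     # the returned copy is numeric

# Assigning the result back is what actually changes the column.
df["Price"] = pd.to_numeric(df["Price"])
fixed_is_numeric = is_numeric_dtype(df["Price"])

print(unchanged_is_numeric, converted_is_numeric, fixed_is_numeric)
```

    Only the last check reports a numeric column inside df, which is what pivot_table needs in order to compute a mean.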
    

    Also, your code can be simplified with numpy.where and Series.between:

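    import numpy as np

    df.Time = pd.to_datetime(df.Time)

    # Convert the string prices to integers and assign the result back
    df['Price'] = df['Price'].astype(int)
    # Label rows whose Time falls in period 2; everything else is period 1
    df['New'] = np.where(df['Time'].between('1998-02-02', '1998-05-02'), 'period2', 'period1')
    df1 = df.pivot_table(index='ID', columns='New', values='Price', aggfunc='mean')
    print(df1)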
    New   period1  period2
    ID                    
    1002     47.0     23.0
    2001      NaN     45.0
    2003      NaN     30.0
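
    As a rough check that the simplification labels rows the same way as the question's pd.Interval test, the sketch below compares the two on a few sample dates, including both endpoints. The sample dates and variable names are made up for illustration; Series.between includes both endpoints by default, which corresponds to closed='both'.

```python
import numpy as np
import pandas as pd

# Illustrative dates: one before period 2, both endpoints, and one inside.
times = pd.to_datetime(pd.Series(["1998-01-02", "1998-02-02",
                                  "1998-04-05", "1998-05-02"]))

# Question's approach: element-wise membership in a closed interval.
period2 = pd.Interval(pd.Timestamp("1998-02-02"),
                      pd.Timestamp("1998-05-02"), closed="both")
via_interval = times.apply(lambda t: t in period2)

# Answer's approach: vectorised Series.between (endpoints included by default).
via_between = times.between("1998-02-02", "1998-05-02")

# Both masks should agree on every row.
same = bool((via_interval == via_between).all())

# np.where turns the boolean mask into the period labels.
labels = np.where(via_between, "period2", "period1")
print(same, labels.tolist())
```

    The vectorised version avoids the Python-level lambda and the separate step that first fills the column with 'period1'.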