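I have a DataFrame indexed by date, and I am trying to give each accountid a score based on its category, provided a value for that category exists on the index date. The DataFrame would look like this: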
            accountid category  Smooth  Hard  Sharp  Narrow
timestamp
2018-03-29        101   Smooth       1   NaN    NaN     NaN
2018-03-29        102     Hard     NaN     1    NaN     NaN
2018-03-30        103   Narrow     NaN   NaN    NaN       1
2018-04-30        104    Sharp     NaN   NaN      1     NaN
2018-04-21        105   Narrow     NaN   NaN    NaN       1
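What is the best way to loop over the DataFrame for each accountid and assign a score to each of the unstacked category columns?
Here is the script that creates the DataFrame: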
import numpy as np
import pandas as pd

# Date range of interest (defined here, but not used further in this script)
idx = pd.date_range('02-28-2018', '04-29-2018')

# The score columns start out empty: use real NaN values, not the string 'NaN'
df = pd.DataFrame(
    [['101', '2018-03-29', 'Smooth', np.nan, np.nan, np.nan, np.nan],
     ['102', '2018-03-29', 'Hard', np.nan, np.nan, np.nan, np.nan],
     ['103', '2018-03-30', 'Narrow', np.nan, np.nan, np.nan, np.nan],
     ['104', '2018-04-30', 'Sharp', np.nan, np.nan, np.nan, np.nan],
     ['105', '2018-04-21', 'Narrow', np.nan, np.nan, np.nan, np.nan]],
    columns=['accountid', 'timestamp', 'category',
             'Smooth', 'Hard', 'Sharp', 'Narrow'])

# Parse the timestamps and use them as the index
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.set_index('timestamp')
print(df)
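For reference, here is a minimal sketch of one way the target layout could be produced without looping over rows. It assumes the score is simply a fixed 1 in the column that matches each row's category and NaN everywhere else; the `categories` list is a name introduced here for illustration, and the real scoring rule may well be different.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data so this sketch runs on its own.
df = pd.DataFrame(
    {
        "accountid": ["101", "102", "103", "104", "105"],
        "timestamp": pd.to_datetime(
            ["2018-03-29", "2018-03-29", "2018-03-30", "2018-04-30", "2018-04-21"]
        ),
        "category": ["Smooth", "Hard", "Narrow", "Sharp", "Narrow"],
    }
).set_index("timestamp")

# Hypothetical list of score columns, in the order shown in the question.
categories = ["Smooth", "Hard", "Sharp", "Narrow"]

# One vectorized comparison per category: 1.0 where the row's category
# matches the column name, NaN otherwise (assumed scoring rule).
for cat in categories:
    df[cat] = np.where(df["category"] == cat, 1.0, np.nan)

print(df)
```

This loops over the four category names rather than over accounts or rows, so the per-row work stays inside NumPy. If the score should be something other than 1, the `1.0` in `np.where` is the place to substitute it.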