
Resampling a MultiIndex

multi-index dataframe pandas python

piRSquared · 7 years ago

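I have a DataFrame with a MultiIndex. The first level is a DatetimeIndex with weekly frequency. The second level is not consistent across groups of the first level.

I want to group the first level by month and take the rows belonging to the first week of each month.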

Setup

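import numpy as np
import pandas as pd

# Ten weekly (Sunday) dates, each repeated twice, paired with letters A..J
midx = pd.MultiIndex.from_arrays([
    pd.date_range('2018-01-01', freq='W', periods=10).repeat(2),
    list('ABCDEFGHIJ' * 2)
], names=['Date', 'Thing'])

df = pd.DataFrame(dict(Col=np.arange(10, 30)), midx)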

Expected result

df

                  Col    
Date       Thing     
2018-01-07 A       10    # This is the first week
           B       11    # of January 2018 
2018-01-14 C       12
           D       13
2018-01-21 E       14
           F       15
2018-01-28 G       16
           H       17
2018-02-04 I       18    # This is the first week
           J       19    # of February 2018
2018-02-11 A       20
           B       21
2018-02-18 C       22
           D       23
2018-02-25 E       24
           F       25
2018-03-04 G       26    # This is the first week
           H       27    # of March 2018
2018-03-11 I       28
           J       29

I would like to reduce this to:

                  Col    
Date       Thing     
2018-01-07 A       10    # This is the first week
           B       11    # of January 2018 
2018-02-04 I       18    # This is the first week
           J       19    # of February 2018
2018-03-04 G       26    # This is the first week
           H       27    # of March 2018

Attempt

df.unstack().asfreq('M', 'ffill').stack()

                   Col
Date       Thing      
2018-01-31 G      16.0
           H      17.0
2018-02-28 E      24.0
           F      25.0

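This is wrong on several levels.

Things do not start on the right date. Note that I want ['A', 'B'] from '2018-01-07', not ['G', 'H'] from '2018-01-31'.
I used asfreq, but that introduces NaN and converts the values to float.
I have no idea what happened to March 2018.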
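
One possible reading of the attempt's output, sketched under the assumption that asfreq conforms the data to a regular date range running from the first to the last index value: month-end dates are not in the weekly index, so ffill carries the last week of each month forward, and 2018-03-31 lies beyond the last weekly date, so March never appears. The names weeks, target, weekly, and wide are illustrative only, and pd.offsets.MonthEnd() stands in for the 'M' alias.

```python
import numpy as np
import pandas as pd

# Same setup as in the question
midx = pd.MultiIndex.from_arrays([
    pd.date_range('2018-01-01', freq='W', periods=10).repeat(2),
    list('ABCDEFGHIJ' * 2)
], names=['Date', 'Thing'])
df = pd.DataFrame(dict(Col=np.arange(10, 30)), midx)

# The ten distinct weekly dates, 2018-01-07 through 2018-03-11
weeks = pd.date_range('2018-01-01', freq='W', periods=10)

# Month-end dates between the first and last weekly date:
# only 2018-01-31 and 2018-02-28 fit, since 2018-03-31 is past 2018-03-11
target = pd.date_range(weeks.min(), weeks.max(), freq=pd.offsets.MonthEnd())

# Forward-filling onto those dates picks the LAST week on or before each
# month end (positions 3 and 7, i.e. 2018-01-28 and 2018-02-25)
weekly = pd.Series(range(10), index=weeks)
picked = weekly.reindex(target, method='ffill')

# unstack alone already produces NaN, because each Thing occurs only on
# some dates, and that forces the columns to float
wide = df.unstack()
```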

2 answers

Zero · 7 years ago

You can do this:

In [384]: date = df.index.get_level_values('Date')

In [385]: firstweek = date.to_frame().groupby(date.strftime('%Y-%m')).min()['Date']

In [386]: df[date.isin(firstweek)]
Out[386]:
                  Col
Date       Thing
2018-01-07 A       10
           B       11
2018-02-04 I       18
           J       19
2018-03-04 G       26
           H       27

Details

In [387]: date.to_frame().groupby(date.strftime('%Y-%m')).min()
Out[387]:
              Date
2018-01 2018-01-07
2018-02 2018-02-04
2018-03 2018-03-04

Alternatively, using resample:

In [400]: fweek = df.assign(dt=date).resample('M', level='Date')['dt'].min()

In [401]: df[date.isin(fweek)]
Out[401]:
                  Col
Date       Thing
2018-01-07 A       10
           B       11
2018-02-04 I       18
           J       19
2018-03-04 G       26
           H       27
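
Both variants above rest on the same idea: find the earliest date in each calendar month, then keep the rows that carry that date. A minimal self-contained sketch of that idea follows, assuming the df from the question. Grouping on monthly periods with to_period('M') and the names s, first_in_month, and result are choices made here for illustration, not part of the answer.

```python
import numpy as np
import pandas as pd

# Same setup as in the question
midx = pd.MultiIndex.from_arrays([
    pd.date_range('2018-01-01', freq='W', periods=10).repeat(2),
    list('ABCDEFGHIJ' * 2)
], names=['Date', 'Thing'])
df = pd.DataFrame(dict(Col=np.arange(10, 30)), midx)

# One date per row, as a plain Series with a default integer index
s = pd.Series(df.index.get_level_values('Date'))

# For every row, the earliest date that occurs in that row's calendar month
first_in_month = s.groupby(s.dt.to_period('M')).transform('min')

# Keep only the rows whose date equals the earliest date of its month
result = df[(s == first_in_month).to_numpy()]
print(result)
```

Because the mask is built row by row, the original integer dtype of Col is preserved and no reshaping with unstack is needed.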

DJK · 7 years ago

If the first week of a month is simply the first seven days of the month, you can filter like this:

df[df.index.get_level_values(0).day <= 7]

                Col
Date       Thing     
2018-01-07 A       10
           B       11
2018-02-04 I       18
           J       19
2018-03-04 G       26
           H       27

This does not work unless the first week you are looking for is one that ends on a Sunday.
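
A related limitation, sketched here with a hypothetical gap in the data: the day filter keeps only dates that fall on days 1 to 7, so a month whose first week is absent drops out entirely, whereas taking the earliest date per month falls back to the earliest week that is present. The frame gappy and the names by_day and by_min are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Same setup as in the question
midx = pd.MultiIndex.from_arrays([
    pd.date_range('2018-01-01', freq='W', periods=10).repeat(2),
    list('ABCDEFGHIJ' * 2)
], names=['Date', 'Thing'])
df = pd.DataFrame(dict(Col=np.arange(10, 30)), midx)

# Hypothetical gap: pretend the week ending 2018-02-04 was never recorded
keep = df.index.get_level_values('Date') != pd.Timestamp('2018-02-04')
gappy = df[keep]
dates = gappy.index.get_level_values('Date')

# Filter on day of month: February has no date with day <= 7 any more
by_day = gappy[dates.day <= 7]

# Earliest date per month: February falls back to 2018-02-11
s = pd.Series(dates)
earliest = s.groupby(s.dt.to_period('M')).transform('min')
by_min = gappy[(s == earliest).to_numpy()]

print(by_day)
print(by_min)
```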
