Pivot a melted pandas DataFrame back to wide format?

pivot pandas python-3.x

Alison LT · 7 years ago

I have a DataFrame called df that looks like this (all values are strings):

        id        type       variable
---------------------------------------------
         A         a          item_1
         A         a          item_2
         A         a          item_3
         A         b          item_4
         A         b          item_5
         A         b          item_6
         A         c          item_7
         A         c          item_8
         A         c          item_9

I would like to convert it to:

type  a                     |b                       |c
id
------------------------------------------------------------------------------

A     item_1|item_2|item_3 | item_4 | item_5 |item_6| item_7 |item_8 | item_9

Basically, I want the columns type and variable arranged as multi-level columns. This is obviously just a snapshot; there are 9 different values for each id in df.

I tried the following code:

df.pivot(index = 'id', columns = 'type', values = 'variable')

but got the following error:

ValueError: Index contains duplicate entries, cannot reshape

I'm sure there is a fairly simple solution that I'm just not thinking of! Any help would be appreciated. Thanks.

1 answer

BENY · 7 years ago

Create a helper key with cumcount. It makes every index entry unique, which removes the error Index contains duplicate entries.

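# Count rows within each (id, type) group so the helper key stays 0..2 when df holds several ids
df.assign(helpkey=df.groupby(['id','type']).cumcount()).set_index(['id','type','helpkey']).variable.unstack([-2,-1])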
Out[138]: 
type          a                       b                       c          \
helpkey       0       1       2       0       1       2       0       1   
id                                                                        
A        item_1  item_2  item_3  item_4  item_5  item_6  item_7  item_8   
type             
helpkey       2  
id               
A        item_9
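
As a self-contained check of the helper-key idea, here is a minimal sketch. It rebuilds the sample frame from the question and adds a second id, `B`, with values `item_10` to `item_18`; that second id is an assumption made only to exercise the multi-id case and is not part of the original data. The level names passed to `unstack` are equivalent to the positions `[-2, -1]` used above.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical second id "B"
# (an assumption, added only to show the multi-id case).
df = pd.DataFrame({
    "id": ["A"] * 9 + ["B"] * 9,
    "type": list("aaabbbccc") * 2,
    "variable": [f"item_{i}" for i in range(1, 19)],
})

# Each (id, type) pair occurs three times, which is why a plain
# pivot on id/type finds duplicate index entries.
dup_counts = df.groupby(["id", "type"]).size()

# Number the rows inside each (id, type) group: 0, 1, 2.
helpkey = df.groupby(["id", "type"]).cumcount()

# (id, type, helpkey) is now unique, so unstack can reshape.
wide = (
    df.assign(helpkey=helpkey)
      .set_index(["id", "type", "helpkey"])["variable"]
      .unstack(["type", "helpkey"])
)
print(wide)
```

Grouping on both `id` and `type` keeps the result at nine columns; counting within `type` alone would keep numbering across ids and produce extra, mostly empty columns.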

We can also use crosstab:

pd.crosstab(index=df.id,columns=[df.type,df.groupby(['id','type']).cumcount()],values=df.variable,aggfunc='first')
Out[144]: 
type        a                       b                       c                
col_1       0       1       2       0       1       2       0       1       2
id                                                                           
A      item_1  item_2  item_3  item_4  item_5  item_6  item_7  item_8  item_9
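
The `col_1` label in the output above appears because the helper Series has no name. A possible variant, sketched here on the question's single-id sample, names the Series with `rename` so the second column level reads `helpkey`. The use of `aggfunc="first"` is a choice made for this sketch: each cell holds exactly one string, so it simply picks that string.

```python
import pandas as pd

# Sample data from the question (single id "A").
df = pd.DataFrame({
    "id": ["A"] * 9,
    "type": list("aaabbbccc"),
    "variable": [f"item_{i}" for i in range(1, 10)],
})

# Named helper key; crosstab uses the Series name as the level name.
helpkey = df.groupby(["id", "type"]).cumcount().rename("helpkey")

wide = pd.crosstab(
    index=df["id"],
    columns=[df["type"], helpkey],
    values=df["variable"],
    aggfunc="first",  # one value per cell, so "first" just selects it
)
print(wide)
```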

Or pivot_table:

df.assign(helpkey=df.groupby(['id','type']).cumcount()).pivot_table(index='id',columns=['type','helpkey'],values='variable', aggfunc='first')
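
Once the helper key exists, the `pivot` call from the question can also be made to work, because the (id, type, helpkey) combination is unique. The sketch below assumes pandas 1.1 or later, where `DataFrame.pivot` accepts a list for `columns`, and compares it with the `pivot_table` form on the question's sample data.

```python
import pandas as pd

# Sample data from the question (single id "A").
df = pd.DataFrame({
    "id": ["A"] * 9,
    "type": list("aaabbbccc"),
    "variable": [f"item_{i}" for i in range(1, 10)],
})

keyed = df.assign(helpkey=df.groupby(["id", "type"]).cumcount())

# pivot_table aggregates; with one value per cell, "first" returns it unchanged.
by_table = keyed.pivot_table(
    index="id", columns=["type", "helpkey"], values="variable", aggfunc="first"
)

# pivot does no aggregation; it needs unique index/column combinations,
# which the helper key provides. List-valued `columns` needs pandas >= 1.1.
by_pivot = keyed.pivot(index="id", columns=["type", "helpkey"], values="variable")

print(by_pivot)
```

`pivot` is the stricter of the two: it raises on duplicates instead of silently aggregating them, which can be useful as a sanity check on the helper key.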
