I have a DataFrame:
+-----+--------+------+
|topic| emotion|counts|
+-----+--------+------+
|  dog| sadness|     4|
|  cat|surprise|     1|
| bird|    fear|     3|
|  cat|     joy|     2|
|  dog|surprise|    10|
|  dog|surprise|     3|
+-----+--------+------+
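I want to create one column for each distinct emotion, summing the counts for each topic and emotion, so that I end up with the following output: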
+-----+----+-------+---+--------+
|topic|fear|sadness|joy|surprise|
+-----+----+-------+---+--------+
|  dog|   0|      4|  0|      13|
|  cat|   0|      0|  2|       1|
| bird|   3|      0|  0|       0|
+-----+----+-------+---+--------+
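This is what I have tried so far for the fear column, but the other emotions still show up as separate rows for every topic. How can I get a result like the one above?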
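from pyspark.sql import functions as F

# My attempt: builds only the 'fear' column, grouped by topic and emotion
agg_emotion = df.groupby("topic", "emotion") \
    .agg(F.sum(F.when(F.col("emotion").eqNullSafe("fear"), 1)
          .otherwise(0)).alias('fear'))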