Counting values of elements in lists within a column

itertools list-comprehension pandas python-3.x

MarkS · Tech Community · 2 years ago

I have a column that contains lists of varying sizes, but drawn from a limited set of items.

print(df['channels'].value_counts(), '\n')

Output:

[web, email, mobile, social]    77733
[web, email, mobile]            43730
[email, mobile, social]         32367
[web, email]                    13751

So what I want is the total number of times web, email, mobile, and social each appear.

These should be:

web =    77733 + 43730 + 13751            135,214
email =  77733 + 43730 + 13751 + 32367    167,581
mobile = 77733 + 43730 + 32367            153,830
social = 77733 + 32367                    110,100
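As a cross-check of the totals above, here is a minimal sketch that weights each channel by the row count of every combination it appears in. The counts are copied from the `value_counts()` output; the names `combo_counts` and `totals` are only illustrative.

```python
from collections import Counter

# Row counts per channel combination, copied from the value_counts() output above.
combo_counts = {
    ("web", "email", "mobile", "social"): 77733,
    ("web", "email", "mobile"): 43730,
    ("email", "mobile", "social"): 32367,
    ("web", "email"): 13751,
}

# Add each combination's row count to every channel it contains.
totals = Counter()
for channels, n in combo_counts.items():
    for channel in channels:
        totals[channel] += n

print(dict(totals))
```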

I tried the following two approaches:

sum_channels_items = pd.Series([x for item in df['channels'] for x in item]).value_counts()
print(sum_channels_items)

from itertools import chain
test = pd.Series(list(chain.from_iterable(df['channels']))).value_counts()
print(test)

Both approaches fail with the same error (only the traceback for the second is shown).

Traceback (most recent call last):
  File "C:/Users/Mark/PycharmProjects/main/main.py", line 416, in <module>
    test = pd.Series(list(chain.from_iterable(df['channels']))).value_counts()
TypeError: 'float' object is not iterable

1 reply | as of 2 years ago

enke 2 years ago

One option is to explode the lists and then count the values:

out = df['channels'].explode().value_counts()
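A small self-contained sketch of the `explode` approach, assuming pandas 0.25 or later (where `Series.explode` is available). The sample `df` below is hypothetical; its `NaN` row stands in for whatever missing value the real column contains.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN row imitates a missing value in the column.
df = pd.DataFrame({
    "channels": [
        ["web", "email", "mobile", "social"],
        ["web", "email"],
        np.nan,
        ["email", "mobile", "social"],
    ]
})

# explode() gives every list element its own row; the NaN row stays a single NaN,
# and value_counts() leaves NaN out by default (dropna=True).
out = df["channels"].explode().value_counts()
print(out)
```

Because `value_counts()` ignores `NaN` by default, this route should not need an explicit `dropna()`.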

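Another possibility is to use collections.Counter. Note that your error suggests the column contains missing values (NaN is a float, hence "'float' object is not iterable"), so drop them first: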

from itertools import chain
from collections import Counter
out = pd.Series(Counter(chain.from_iterable(df['channels'].dropna())))
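The `Counter` route can be tried end to end with the sketch below. The sample `df` is hypothetical, and the final `sort_values` call is an optional addition that mimics the descending order `value_counts()` would give.

```python
from collections import Counter
from itertools import chain

import numpy as np
import pandas as pd

# Hypothetical sample data; the NaN row imitates a missing value in the column.
df = pd.DataFrame({
    "channels": [
        ["web", "email", "mobile", "social"],
        ["web", "email"],
        np.nan,
        ["email", "mobile", "social"],
    ]
})

# NaN is a float, so chaining over it without dropna() raises the TypeError.
lists_only = df["channels"].dropna()

# Flatten all lists into one stream of channel names and count them.
out = pd.Series(Counter(chain.from_iterable(lists_only))).sort_values(ascending=False)
print(out)
```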
