
df['X'].unique() and TypeError: unhashable type: 'numpy.ndarray'

group-by pandas python

SBad · 6 years ago

All,

I have a column in a DataFrame that looks like this:

allHoldingsFund['BrokerMixed']
Out[419]: 
78         ML
81       CITI
92         ML
173      CITI
235        ML
262        ML
264        ML
25617      GS
25621    CITI
25644    CITI
25723      GS
25778    CITI
25786    CITI
25793      GS
25797    CITI
Name: BrokerMixed, Length: 2554, dtype: object

Even though the column has dtype object, I cannot group by it, or even extract its unique values. For example, when I run:

allHoldingsFund['BrokerMixed'].unique()

I get this error:

     uniques = table.unique(values)
  File "pandas/_libs/hashtable_class_helper.pxi", line 1340, in pandas._libs.hashtable.PyObjectHashTable.unique
TypeError: unhashable type: 'numpy.ndarray'

I get the same kind of error when grouping.

Any help is welcome. Thank you.

3 answers

Hari_pb · 6 years ago

First, I suggest checking the type of your column. You can try

print(type(allHoldingsFund['BrokerMixed']))

If it is a pandas Series, you can try

allHoldingsFund['BrokerMixed'].reset_index()['BrokerMixed'].unique()

and see whether that works for you.
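Because `type()` on the column only reports that it is a Series, it may also help to look at the type of each element. The following is a minimal sketch, assuming a small stand-in Series (the values below are hypothetical, not the asker's data) in which one cell holds a NumPy array:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for allHoldingsFund['BrokerMixed'];
# one cell holds a NumPy array instead of a plain string.
s = pd.Series(["ML", "CITI", np.array(["GS"]), "ML"],
              dtype=object, name="BrokerMixed")

# Count elements by the name of their Python type.
type_counts = s.map(lambda x: type(x).__name__).value_counts()
print(type_counts)

# Boolean mask that selects the rows holding arrays or lists.
is_array = s.map(lambda x: isinstance(x, (np.ndarray, list)))
print(s[is_array])
```

Any type other than `str` in the counts points at the rows that `unique()` and `groupby` are likely to choke on.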

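jpp · 6 years ago

It looks like your series contains a NumPy array. NumPy arrays cannot be hashed, and pd.Series.unique, like set, relies on hashing.

If you cannot guarantee that your series consists only of strings, you can convert NumPy arrays to tuples before calling pd.Series.unique: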
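To illustrate the hashing point, here is a small sketch that reproduces the failure on a made-up Series (the data is an assumption for demonstration only):

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])

# NumPy arrays are mutable, so hash() refuses them.
try:
    hash(arr)
    hashable = True
except TypeError as exc:
    hashable = False
    message = str(exc)

# An object Series containing that array fails in unique() for the same reason.
s = pd.Series(["ML", arr, "ML"], dtype=object)
try:
    s.unique()
    unique_failed = False
except TypeError:
    unique_failed = True

print(hashable, message, unique_failed)
```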

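import numpy as np
import pandas as pd

# A mixed Series: one NumPy array among scalars and strings.
s = pd.Series([np.array([1, 2, 3]), 1, 'hello', 'test', 1, 'test'])

def tuplizer(x):
    # Tuples are hashable; arrays and lists are not.
    return tuple(x) if isinstance(x, (np.ndarray, list)) else x

res = s.apply(tuplizer).unique()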

print(res)

array([(1, 2, 3), 1, 'hello', 'test'], dtype=object)

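Of course, this means the array dtype information is lost in the result, but at least you can see your "unique" NumPy arrays, provided they are one-dimensional.

Sahil Puri · 6 years ago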
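For arrays with more than one dimension, tuple(x) yields a tuple of arrays, which is still unhashable. One possible extension, offered only as a sketch, is to convert recursively into nested tuples; the helper name `to_hashable` and the sample data are hypothetical:

```python
import numpy as np
import pandas as pd

def to_hashable(x):
    """Recursively turn arrays and lists into nested tuples."""
    if isinstance(x, np.ndarray):
        return to_hashable(x.tolist())
    if isinstance(x, list):
        return tuple(to_hashable(item) for item in x)
    return x

# Hypothetical data: two equal 2-D arrays mixed with strings.
grid = np.array([[1, 2], [3, 4]])
s = pd.Series(["ML", grid, "ML", grid.copy()], dtype=object)

res = s.apply(to_hashable).unique()
print(res)
```

The two equal arrays collapse into a single nested tuple, so the result has two distinct values.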

Your data column contains arrays. You can try the following:

allHoldingsFund['BrokerMixed'].apply(lambda x: str(x)).unique()
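One caveat with str(x) is that an array such as array(['CITI']) becomes the string "['CITI']", which is distinct from plain 'CITI'. If the arrays turn out to be single-element wrappers around a broker name (an assumption, not something the question confirms), a sketch like the one below unwraps them so that grouping works. The `Quantity` column, the `BrokerClean` column, and the `unwrap` helper are all hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; one broker value is wrapped in a 1-element array.
df = pd.DataFrame({
    "BrokerMixed": pd.Series(["ML", np.array(["CITI"]), "ML", "GS"], dtype=object),
    "Quantity": [10, 20, 30, 40],
})

# str() makes every value hashable, but keeps the array brackets.
as_text = df["BrokerMixed"].apply(str)
print(as_text.unique())

def unwrap(x):
    # Replace a single-element array with the scalar it contains.
    if isinstance(x, np.ndarray) and x.size == 1:
        return x.item()
    return x

df["BrokerClean"] = df["BrokerMixed"].apply(unwrap)
totals = df.groupby("BrokerClean")["Quantity"].sum()
print(totals)
```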
