代码之家 › 专栏 › 技术社区 › Casebash

在字典中使用非哈希python对象作为键

python

Casebash · 技术社区 · 15 年前

Python不允许将非哈希对象用作其他字典中的键。正如Andrey Vlasovski所指出的,对于使用非嵌套字典作为键的特殊情况,有一个很好的解决方法:

frozenset(a.items())#Can be put in the dictionary instead

有没有一种方法可以使用任意对象作为字典中的键?

例子 :

如何将其用作密钥?

{"a":1, "b":{"c":10}}

在您的代码中,实际上必须使用类似的东西是非常罕见的。如果您认为情况如此,请考虑首先更改数据模型。

精确使用案例

用例正在缓存对仅包含关键字的任意函数的调用。字典中的每个键都是一个字符串(参数的名称),对象可能非常复杂,由分层字典、列表、元组等组成。

相关问题

此子问题已从 the problem here . 这里的解决方案处理字典不分层的情况。

8 回复 | 直到 10 年前

Lennart Regebro 15 年前

不要。我同意安德烈对前面的问题的评论,即把字典作为键,尤其是不嵌套的键是没有意义的。您的数据模型显然非常复杂,字典可能不是正确的答案。你应该试试OO。

reubano 10 年前

基于Chris Lutz的解决方案。

import collections

def hashable(obj):
    if isinstance(obj, collections.Hashable):
        items = obj
    elif isinstance(obj, collections.Mapping):
        items = frozenset((k, hashable(v)) for k, v in obj.iteritems())
    elif isinstance(obj, collections.Iterable):
        items = tuple(hashable(item) for item in obj)
    else:
        raise TypeError(type(obj))

    return items

Casebash 14 年前

基于Chris Lutz的解决方案。请注意,这不会处理由迭代更改的对象,例如流,也不会处理循环。

import collections

def make_hashable(obj):
    """WARNING: This function only works on a limited subset of objects
    Make a range of objects hashable. 
    Accepts embedded dictionaries, lists or tuples (including namedtuples)"""
    if isinstance(obj, collections.Hashable):
        #Fine to be hashed without any changes
        return obj
    elif isinstance(obj, collections.Mapping):
        #Convert into a frozenset instead
        items=list(obj.items())
        for i, item in enumerate(items):
                items[i]=make_hashable(item)
        return frozenset(items)
    elif isinstance(obj, collections.Iterable):
        #Convert into a tuple instead
        ret=[type(obj)]
        for i, item in enumerate(obj):
                ret.append(make_hashable(item))
        return tuple(ret)
    #Use the id of the object
    return id(obj)

pi. 15 年前

如果您真的必须这样做,请使您的对象可以散列。将要放入的内容作为键进行子类化,并提供一个uuu hash_uuuu函数,该函数将返回此对象的唯一键。

举例说明:

>>> ("a",).__hash__()
986073539
>>> {'a': 'b'}.__hash__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable

如果散列值不够唯一,则会发生冲突。可能也很慢。

Laurent Giroud 15 年前

我完全不同意评论和回答说,这不应该因为数据模型的纯粹原因而这样做。

字典将一个对象与另一个使用前一个对象作为键的对象相关联。字典不能用作键,因为它们不可哈希。这并不会降低将字典映射到其他对象的意义/实用性/必要性。

正如我理解的python绑定系统,您可以将任何字典绑定到多个变量(或者反过来,取决于您的术语),这意味着这些变量都知道该字典的唯一“指针”。是否可以将该标识符用作哈希键? 如果您的数据模型确保/强制您不能使用两个内容相同的字典作为键,那么这对我来说似乎是一种安全的技术。

我应该补充一点,我不知道该怎么做。

我不完全知道这应该是一个答案还是一个评论。如果需要,请纠正我。

Community CDub 8 年前

用 recursion 你说什么?

def make_hashable(h):
    items = h.items()
    for item in items:
        if type(items) == dict:
            item = make_hashable(item)
    return frozenset(items)

您可以为任何其他可变类型添加其他类型测试,使其成为可哈希的。这不应该很难。

wberry 14 年前

我在使用基于调用签名缓存以前调用结果的装饰器时遇到了这个问题。我不同意这里关于“你不应该这样做”的评论/回答,但我认为重要的是要认识到当沿着这条路走下去时,可能会出现令人惊讶和意想不到的行为。我的想法是,由于实例是可变的和可散列的,并且似乎不实际地改变这一点,所以创建非散列类型或对象的可散列等价物并没有本质上的错误。当然,这只是我的观点。

对于任何需要Python2.5兼容性的人来说,下面的内容可能会很有用。我是根据先前的答案得出的。

from itertools import imap
tuplemap = lambda f, data: tuple(imap(f, data))
def make_hashable(obj):
  u"Returns a deep, non-destructive conversion of given object to an equivalent hashable object"
  if isinstance(obj, list):
    return tuplemap(make_hashable, iter(obj))
  elif isinstance(obj, dict):
    return frozenset(tuplemap(make_hashable, obj.iteritems()))
  elif hasattr(obj, '__hash__') and callable(obj.__hash__):
    try:
      obj.__hash__()
    except:
      if hasattr(obj, '__iter__') and callable(obj.__iter__):
        return tuplemap(make_hashable, iter(obj))
      else:
        raise NotImplementedError, 'object of type %s cannot be made hashable' % (type(obj),)
    else:
      return obj
  elif hasattr(obj, '__iter__') and callable(obj.__iter__):
    return tuplemap(make_hashable, iter(obj))
  else:
    raise NotImplementedError, 'object of type %s cannot be made hashable' % (type, obj)

koo 12 年前

我同意LennartRegebro的观点,但我经常发现缓存一些函数调用、可调用对象和/或Flyweight对象很有用,因为它们可能使用关键字参数。

但如果你真的想要,试试看 pickle.dumps (或) cPickle 如果python 2.6)是一个快速而肮脏的黑客。它比使用递归调用使项不可变的任何答案都快得多,而且字符串是可哈希的。

import pickle
hashable_str = pickle.dumps(unhashable_object)