
Python multiprocessing Pool map: AttributeError: Can't pickle local object

  •  4
  • Amit  ·  6 years ago

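    I have a method inside a class that needs to do a lot of work in a loop, and I would like to spread the work over all of my cores.

    I wrote the following code, which works if I use the normal map(), but with pool.map() it returns an error.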

    import multiprocessing
    pool = multiprocessing.Pool(multiprocessing.cpu_count() - 1)
    
    class OtherClass:
      def run(sentence, graph):
        return False
    
    class SomeClass:
      def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]
    
      def some_method(self):
          other = OtherClass()
    
          def single(params):
              sentences, graph = params
              return [other.run(sentence, graph) for sentence in sentences]
    
          return list(pool.map(single, zip(self.sentences, self.graphs)))
    
    
    SomeClass().some_method()
    

    Error 1:

    AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'

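    Why can't it pickle single()? I even tried moving single() to the global module scope (not inside the class, which makes it independent of the context):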

    import multiprocessing
    pool = multiprocessing.Pool(multiprocessing.cpu_count() - 1)
    
    class OtherClass:
      def run(sentence, graph):
        return False
    
    
    def single(params):
        other = OtherClass()
        sentences, graph = params
        return [other.run(sentence, graph) for sentence in sentences]
    
    class SomeClass:
      def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]
    
      def some_method(self):
          return list(pool.map(single, zip(self.sentences, self.graphs)))
    
    
    SomeClass().some_method()
    

    Error 2:

    AttributeError: Can't get attribute 'single' on <module '__main__' from '.../test.py'>

        1
  •  65
  •   Darkonaut    5 years ago

    Error 1:

    AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'

    You solved this error yourself by moving the nested target function single() to the top level.

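    Background:

    Pool needs to pickle (serialize) everything it sends to its worker processes (IPC). Pickling only saves the name of a function, and unpickling requires re-importing the function by name. For that to work, the function must be defined at the top level of a module. Nested functions cannot be imported by the child process, and the attempt to pickle them already raises an exception.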
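
    The by-name behaviour described above can be observed directly with the pickle module. The following is a minimal sketch, not part of the original answer: math.sqrt stands in for any importable top-level function, and make_nested and try_pickle are hypothetical helpers. The exact exception type and message for the nested case may vary between Python versions, so the sketch catches both AttributeError and pickle.PicklingError.

```python
import math
import pickle

# A top-level, importable function is pickled by reference: the payload
# holds the module name and the function name, not the function's code.
payload = pickle.dumps(math.sqrt)
names_in_payload = b"math" in payload and b"sqrt" in payload

# Unpickling looks the name up again and returns the very same object.
same_object = pickle.loads(payload) is math.sqrt


def make_nested():
    """Return a function defined inside another function (hypothetical helper)."""
    def nested(x):
        return x * 2
    return nested


def try_pickle(obj):
    """Return None if obj pickles, otherwise the exception that was raised."""
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError) as exc:
        return exc
    return None


# A nested function has no importable name, so pickling it fails.
nested_error = try_pickle(make_nested())

if __name__ == "__main__":
    print(names_in_payload, same_object)
    print(type(nested_error).__name__, nested_error)
```

    Pool.map hits the same failure when its target function is nested, because the function has to go through pickle before it reaches a worker.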


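    Error 2:

    AttributeError: Can't get attribute 'single' on <module '__main__' from '.../test.py'>

    You are starting the pool before your function and classes are defined, so the child processes cannot inherit that code. Move the pool creation to the bottom of the module and protect it with if __name__ == '__main__':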

    import multiprocessing
    
    class OtherClass:
      def run(self, sentence, graph):
        return False
    
    
    def single(params):
        other = OtherClass()
        sentences, graph = params
        return [other.run(sentence, graph) for sentence in sentences]
    
    class SomeClass:
       def __init__(self):
           self.sentences = [["Some string"]]
           self.graphs = ["string"]
    
       def some_method(self):
          return list(pool.map(single, zip(self.sentences, self.graphs)))
    
    if __name__ == '__main__':  # <- prevent RuntimeError for 'spawn'
        # and 'forkserver' start_methods
        with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
            print(SomeClass().some_method())
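
    The question's nested single() captured an `other` instance created by the caller. If that state really has to come from the caller, one possible variation (an assumption, not something the original answer proposes) is to keep the worker at the top level and bind the instance with functools.partial, which pickles as long as the function and the bound arguments do. The names run_batch and map_batches are hypothetical.

```python
import functools
import multiprocessing


class OtherClass:
    def run(self, sentence, graph):
        return False


def run_batch(other, params):
    """Top-level worker; `other` arrives as an explicit, picklable argument."""
    sentences, graph = params
    return [other.run(sentence, graph) for sentence in sentences]


def map_batches(other, batches, processes=2):
    """Map run_batch over batches in a pool, binding `other` via partial."""
    worker = functools.partial(run_batch, other)
    with multiprocessing.Pool(processes) as pool:
        return pool.map(worker, batches)


if __name__ == "__main__":
    # Same toy data as in the question: one list of sentences, one graph.
    batches = list(zip([["Some string"]], ["string"]))
    print(map_batches(OtherClass(), batches))
```

    The bound instance is pickled and sent along with the tasks, so this suits small objects; large state is usually better created once per worker.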
    

    Appendix

    …I want to spread the work over all of my cores.

    Potentially helpful background on how multiprocessing.Pool chunks work:

    Python multiprocessing: understanding logic behind chunksize
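
    As a rough illustration of the chunking, the sketch below approximates how Pool.map chooses a chunksize when none is passed. It assumes the heuristic used inside CPython's pool implementation (about four chunks per worker), which is an implementation detail and not a documented guarantee. The function names default_chunksize and chunk_count are hypothetical.

```python
import math


def default_chunksize(n_items, n_workers):
    """Approximate the chunksize Pool.map picks when none is given.

    Assumption: mirrors CPython's internal heuristic, which aims for
    roughly four chunks per worker process.
    """
    if n_items == 0:
        return 0
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1  # round up so every item lands in some chunk
    return chunksize


def chunk_count(n_items, chunksize):
    """Number of tasks the input is split into for a given chunksize."""
    if chunksize == 0:
        return 0
    return math.ceil(n_items / chunksize)


if __name__ == "__main__":
    for n_items, n_workers in [(1, 3), (16, 4), (100, 4)]:
        size = default_chunksize(n_items, n_workers)
        print(n_items, n_workers, size, chunk_count(n_items, size))
```

    With a one-element input like the one in the question, everything ends up in a single chunk, so only one worker has anything to do regardless of the pool size.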

        2
  •  13
  •   Marcell Pigniczki    4 years ago

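    This works as long as you use a def statement. If you declare the function you want to use in Pool.map with the global keyword at the beginning of the enclosing method, the pickling error goes away: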

    import multiprocessing
    
    class OtherClass:
      def run(self, sentence, graph):
        return False
    
    class SomeClass:
      def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]
    
      def some_method(self):
          global single  # This is ugly, but does the trick XD
    
          other = OtherClass()
    
          def single(params):
              sentences, graph = params
              return [other.run(sentence, graph) for sentence in sentences]
    
          # Create the pool only after `single` exists as a module global, so
          # that forked workers inherit it (this relies on the 'fork' start method).
          with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
              return list(pool.map(single, zip(self.sentences, self.graphs)))
    
    
    SomeClass().some_method()
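
    One way to see why the global declaration changes the outcome is to compare the `__qualname__` of the two kinds of function, since that is the name pickle uses for the lookup. This is a small sketch with hypothetical names (make_local, make_global, promoted_worker) and no pool involved:

```python
def make_local():
    """Define and return an ordinary nested function."""
    def worker(x):
        return x
    return worker


def make_global():
    """Define a nested function whose name is declared global."""
    global promoted_worker

    def promoted_worker(x):
        return x
    return promoted_worker


local_fn = make_local()
global_fn = make_global()

# The nested function keeps a '<locals>' path that cannot be imported.
local_qualname = local_fn.__qualname__

# The global-declared function gets a plain, module-level qualified name ...
global_qualname = global_fn.__qualname__

# ... and is bound in the module namespace once make_global() has run.
is_module_global = globals().get("promoted_worker") is global_fn

if __name__ == "__main__":
    print(local_qualname)
    print(global_qualname, is_module_global)
```

    The name only exists after the enclosing function has run in that process, so a worker that never executed it cannot resolve the name unless it inherited the parent's memory through fork.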