
Python multithreading (concurrent.futures) resulting in recursive results, how do I set up the multithreading correctly?

  •  Jose R  ·  2 years ago

    I wrote the following Python script, which uses multithreading to run a function that returns a dictionary (my real application loads and parses files, but I have simplified it here to string manipulation to make it easier to show).

    The only way I have found to make the multithreading work on Windows is to use if "__main__" == __name__: before the execution. However, this seems to create a problem where everything after the actual function gets repeated several times, even the parts that are outside the function and outside the script-execution section.

    How can I update the script so that it does not behave recursively like this? (I want the function to return the dictionary only once.) What am I doing wrong?

    Here is my repurposed script:

    import concurrent.futures
    from itertools import product
    from time import process_time
    
    # This function generates a dictionary with the string as key and a list of its letters as the value
    def genDict (in_value):
        out_dict = {}
        out_dict[in_value] = list(in_value)
        return(out_dict)
    
    # Generate a list of all combinations of three alphabet letter strings
    # this is not necesarily a best example for multithreading, but makes the point
    # an io example would really accelerate under multithreading
    alphabets = ['a', 'b', 'c', 'd', 'e']
    listToProcess = [''.join(i) for i in product(alphabets, repeat = 4)]
    print('Lenght of List to Process:', len(listToProcess))
    
    # Send the list which is sent to the genDict function multithreaded
    t1_start = process_time()
    dictResult = {}
    if "__main__" == __name__:
        with concurrent.futures.ProcessPoolExecutor(4) as executor:
            futures = [executor.submit(genDict, elem) for elem in listToProcess]
            for future in futures:
                dictResult.update(future.result())
    t1_stop = process_time()
    print('Multithreaded Completion time =', t1_stop-t1_start, 'sec.')
    
    print('\nThis print statement is outside the loop and function but still gets wrapped in')
    print('This is the size of the dictionary: ', len(dictResult))
    

    Here is the output I get (notice that the timing calculation, as well as the final print statements, are "executed" several times). Output:

    PS >> & C://multithread_test.py
    Lenght of List to Process: 625
    Lenght of List to Process: 625
    Lenght of List to Process: 625
    Multithreaded Completion time = 0.0 sec.
    Multithreaded Completion time = 0.0 sec.
    
    This print statement is outside the loop and function but still gets wrapped in
    This print statement is outside the loop and function but still gets wrapped in
    
    This is the size of the dictionary:  0
    This is the size of the dictionary:  0
    Lenght of List to Process: 625
    Multithreaded Completion time = 0.0 sec.
    
    This print statement is outside the loop and function but still gets wrapped in
    This is the size of the dictionary:  0
    Lenght of List to Process: 625
    Multithreaded Completion time = 0.0 sec.
    
    This print statement is outside the loop and function but still gets wrapped in
    This is the size of the dictionary:  0
    Multithreaded Completion time = 0.140625 sec.
    
    This print statement is outside the loop and function but still gets wrapped in
    This is the size of the dictionary:  625
    PS >>
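
    An aside on the timings above (an observation added for clarity, not part of the original question): process_time() measures only the CPU time of the process that calls it. In the re-imported worker processes the guarded block is skipped, so the two timestamps are taken back to back and the difference is 0.0; even in the parent, time spent waiting for the workers is not counted. For wall-clock timing of a pool, time.perf_counter() is the usual choice. A minimal sketch:

    from time import perf_counter, process_time
    
    # process_time() counts only this process's CPU time: time spent blocked
    # waiting on the workers, and CPU time burned inside the workers, never
    # shows up in the parent's reading. perf_counter() measures wall-clock time.
    t_wall_start = perf_counter()
    t_cpu_start = process_time()
    # ... submit the jobs and collect the results here ...
    print('Wall clock time =', perf_counter() - t_wall_start, 'sec.')
    print('Parent CPU time =', process_time() - t_cpu_start, 'sec.')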
    
    1 Answer
  •  Tim Roberts  ·  2 years ago

    The only things that should be inside your if __name__ guard are the setup of your global inputs and the function calls you want executed. That's it. Remember that, with multiprocessing, each new process starts a brand-new interpreter, which re-runs your file, but with __name__ set to a different value. Anything outside the guard gets executed again in every process.
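
    To see the re-import in action, here is a minimal sketch (the file name guard_demo.py is hypothetical): on Windows, each spawned worker imports the script again, typically with __name__ set to '__mp_main__', so the module-level print fires once per worker, while the guarded block runs only in the parent.

    # guard_demo.py -- minimal sketch of the spawn/re-import behaviour
    import concurrent.futures
    import os
    
    # Module-level code: runs in the parent AND again in every spawned worker
    print('module level, pid', os.getpid(), ', __name__ =', __name__)
    
    def work(x):
        return x * x
    
    if __name__ == '__main__':
        # Guarded code: runs only in the parent process
        with concurrent.futures.ProcessPoolExecutor(2) as executor:
            print(list(executor.map(work, range(4))))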

    Here is how to organize this kind of code. This works.

    import concurrent.futures
    from itertools import product
    from time import process_time
    
    # This function generates a dictionary with the string as key and a list of its letters as the value
    def genDict (in_value):
        out_dict = {}
        out_dict[in_value] = list(in_value)
        return(out_dict)
    
    def main():
        # Generate a list of all combinations of three alphabet letter strings
        # this is not necesarily a best example for multithreading, but makes the point
        # an io example would really accelerate under multithreading
        alphabets = ['a', 'b', 'c', 'd', 'e']
        listToProcess = [''.join(i) for i in product(alphabets, repeat = 4)]
        print('Lenght of List to Process:', len(listToProcess))
    
        # Send the list which is sent to the genDict function multithreaded
        t1_start = process_time()
        dictResult = {}
        with concurrent.futures.ProcessPoolExecutor(4) as executor:
            futures = [executor.submit(genDict, elem) for elem in listToProcess]
            for future in futures:
                dictResult.update(future.result())
        t1_stop = process_time()
        print('Multithreaded Completion time =', t1_stop-t1_start, 'sec.')
    
        print('\nThis print statement is outside the loop and function but still gets wrapped in')
        print('This is the size of the dictionary: ', len(dictResult))
    
    if "__main__" == __name__:
        main()
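
    A side note, separate from the answer above: ProcessPoolExecutor is multiprocessing rather than multithreading. If the real workload is I/O-bound, as the question suggests, a ThreadPoolExecutor sidesteps the guard issue entirely, because threads share the parent interpreter and the module is never re-imported. A minimal sketch of the same job using threads and executor.map:

    import concurrent.futures
    from itertools import product
    
    # Same toy worker as above: map each string to the list of its letters
    def genDict(in_value):
        return {in_value: list(in_value)}
    
    alphabets = ['a', 'b', 'c', 'd', 'e']
    listToProcess = [''.join(i) for i in product(alphabets, repeat=4)]
    
    dictResult = {}
    # Threads run inside the parent interpreter, so no __name__ guard is needed
    # and nothing at module level gets executed twice.
    with concurrent.futures.ThreadPoolExecutor(4) as executor:
        for partial in executor.map(genDict, listToProcess):
            dictResult.update(partial)
    
    print('This is the size of the dictionary: ', len(dictResult))   # 625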