• Python multithreading approaches


    Introduction

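    • Multiprocessing: Process (multiprocessing module)
      • Pros: runs in parallel on multiple CPU cores
      • Cons: uses the most resources; fewer processes than threads can be started
      • Best for: CPU-bound work
    • Multithreading: Thread (threading module)
      • Pros: lighter than a process and uses fewer resources
      • Cons:
        • Compared with processes: because of the GIL, only one thread executes Python bytecode at a time, so threads cannot use multiple CPUs; they still speed a program up when it spends time waiting on IO
        • Compared with coroutines: fewer can be started, each one uses memory, and switching between threads has a cost
      • Best for: IO-bound work with a modest number of concurrent tasks
    • Coroutines: Coroutine (asyncio module)
      • Pros: lowest memory overhead; a very large number of coroutines can be started
      • Cons: fewer supporting libraries; more complex to implement
      • Best for: IO-bound work with a very large number of tasks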
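
The coroutine option can be sketched as follows. This is a minimal illustration, assuming `asyncio.sleep` as a stand-in for real network IO; the names `fake_request` and `run_many` are hypothetical and not taken from the article.

```python
import asyncio


async def fake_request(n, delay=0.01):
    """Hypothetical stand-in for a network call: wait, then return n."""
    await asyncio.sleep(delay)
    return n


async def run_many(count):
    # gather() runs all coroutines concurrently on one thread
    # and returns their results in submission order
    return await asyncio.gather(*(fake_request(n) for n in range(count)))


if __name__ == '__main__':
    results = asyncio.run(run_many(1000))
    print(len(results), results[:3])
```

All one thousand tasks share a single thread here, which is why the memory cost per task stays small compared with one thread per task.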

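    IO means input/output. It covers file IO and network IO, for example reading and writing files, database reads and writes, and network requests (web crawlers).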
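
To illustrate why threads help with IO-bound work despite the GIL, here is a small sketch that assumes `time.sleep` as a stand-in for a blocking IO call (a sleeping thread releases the GIL). The function names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(n, delay=0.2):
    """Hypothetical IO task: block for `delay` seconds, then return n."""
    time.sleep(delay)
    return n


def run_sequential(tasks):
    return [fake_io(n) for n in tasks]


def run_threaded(tasks):
    # One worker per task, so all the waits overlap
    with ThreadPoolExecutor(max_workers=len(tasks)) as executor:
        return list(executor.map(fake_io, tasks))


def timed(func, tasks):
    """Return (result, elapsed seconds) for func(tasks)."""
    start = time.perf_counter()
    result = func(tasks)
    return result, time.perf_counter() - start


if __name__ == '__main__':
    tasks = [0, 1, 2, 3]
    print('sequential', timed(run_sequential, tasks))
    print('threaded', timed(run_threaded, tasks))
```

The sequential run waits for each task in turn, while the threaded run waits for all of them at once, so its elapsed time should be close to a single delay.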

    Goals for a practical multithreading approach:

    • Fast
    • Returns values
    • Keeps data synchronized




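    Comparison

    | Approach | Pros | Cons | Time (s) |
    |---|---|---|---|
    | Baseline | | | 33.05 |
    | _thread | 1. Runs in the background; 2. Suits GUI programs | 1. The program must keep running; 2. Return values are hard to get | 142.75 |
    | Thread class | | 1. Getting return values is somewhat cumbersome; 2. Data synchronization requires Lock or Queue | 29.22 |
    | multiprocessing.dummy | 1. Easy to start; 2. Returns values; 3. Keeps data synchronized | Arguments must be collected first, so the logic is written a little differently | 28.81 |
    | Thread pool | 1. Easy to start; 2. Returns values; 3. Keeps data synchronized | Arguments must be collected first, so the logic is written a little differently | 30.09 |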




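    Baseline

    A simple file write is used to simulate an IO operation. The later listings import this function from a module named tool.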

    def benchmark(n):
        """Baseline function for the multithreading benchmarks."""
        i = 0
        with open('{}.txt'.format(n), 'w') as f:
            for i in range(n * 1000000):
                f.write(str(i) + '\n')
        return i
    
    
    if __name__ == '__main__':
        from timeit import timeit
    
    
        def f():
            for n in range(10):
                print(benchmark(n))
    
    
        print(timeit(f, number=1))
    




    _thread

    import _thread
    
    from tool import benchmark
    
    
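    def f():
        for n in range(10):
            # start_new_thread returns the new thread's identifier, not benchmark's result
            print(_thread.start_new_thread(benchmark, (n,)))
    
    
    if __name__ == '__main__':
        f()
        # Keep the main thread alive: when it exits, the process ends and the workers are killed
        while True:
            pass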
    

    Drawbacks:

    1. The program must keep running
    2. Return values are hard to get
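
Both drawbacks can be worked around with the module's own locks. The following is a sketch under stated assumptions, not the article's method: `square` is a hypothetical stand-in for `benchmark`, and each thread gets a lock that it releases when it finishes, so the main thread can block on the locks instead of spinning.

```python
import _thread


def square(n):
    """Hypothetical task standing in for benchmark."""
    return n * n


def worker(n, results, done):
    try:
        results[n] = square(n)  # store the return value under the task's key
    finally:
        done.release()  # signal that this thread has finished


def run_all(count):
    results = {}
    locks = []
    for n in range(count):
        done = _thread.allocate_lock()
        done.acquire()  # held until the worker releases it
        locks.append(done)
        _thread.start_new_thread(worker, (n, results, done))
    for done in locks:
        done.acquire()  # blocks until the matching worker is done
    return [results[n] for n in range(count)]


if __name__ == '__main__':
    print(run_all(5))
```

This is essentially what the higher-level threading module does for you with `join()`, which is one reason to prefer it over `_thread`.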




    Thread class

    import threading
    
    from tool import benchmark
    
    
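    class MyThread(threading.Thread):
        """Thread subclass whose join() returns the target's return value."""
    
        _return = None  # default if there is no target or the target raised
    
        def run(self):
            # _target, _args and _kwargs are private attributes of threading.Thread
            if self._target is not None:
                self._return = self._target(*self._args, **self._kwargs)
    
        def join(self, timeout=None):
            super().join(timeout)
            return self._return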
    
    
    def f():
        threads = []
        for n in range(10):
            threads.append(MyThread(target=benchmark, args=(n,)))
        for thread in threads:
            thread.start()
        for thread in threads:
            print(thread.join())
    
    
    if __name__ == '__main__':
        from timeit import timeit
    
        print(timeit(f, number=1))
    

    Drawbacks:

    1. Getting return values is somewhat cumbersome
    2. Data synchronization requires Lock or Queue
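
One alternative for collecting return values, sketched here under the assumption that a shared list is acceptable, avoids subclassing Thread and touching its private attributes. The helper `run_and_collect` and the task `square` are hypothetical names.

```python
import threading


def square(n):
    """Hypothetical task standing in for benchmark."""
    return n * n


def run_and_collect(func, args_list):
    """Run func(*args) in one thread per entry and return results in order."""
    results = [None] * len(args_list)

    def call(index, args):
        # Each thread writes only to its own slot, so no lock is needed
        results[index] = func(*args)

    threads = [
        threading.Thread(target=call, args=(index, args))
        for index, args in enumerate(args_list)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == '__main__':
    print(run_and_collect(square, [(n,) for n in range(5)]))
```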



    Lock

    import time
    import threading
    from threading import Thread, Lock
    
    lock = Lock()
    
    
    class Account:
        def __init__(self, balance):
            self.balance = balance
    
    
    def draw(account, amount):
        with lock:
            if account.balance >= amount:
                time.sleep(0.1)
                print(threading.current_thread().name, 'withdrawal succeeded')
                account.balance -= amount
                print(threading.current_thread().name, 'balance', account.balance)
            else:
                print(threading.current_thread().name, 'withdrawal failed: insufficient balance')
    
    
    if __name__ == '__main__':
        account = Account(1000)
        ta = Thread(target=draw, args=(account, 800), name='ta')
        tb = Thread(target=draw, args=(account, 800), name='tb')
        ta.start()
        tb.start()
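
The two threads in the listing above are started but never joined, so the outcome is only visible in the printed output. A variant of the same check-then-withdraw pattern that returns the outcome might look like this; it assumes one lock per account rather than a module-level lock, and `withdraw` and `run_withdrawals` are hypothetical names.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()  # assumption: one lock per account

    def withdraw(self, amount):
        """Withdraw atomically; return True on success, False otherwise."""
        with self.lock:
            # The check and the update happen under the same lock,
            # so two threads cannot both pass the check
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False


def run_withdrawals(balance, amounts):
    account = Account(balance)
    outcomes = [None] * len(amounts)

    def task(index, amount):
        outcomes[index] = account.withdraw(amount)

    threads = [
        threading.Thread(target=task, args=(index, amount))
        for index, amount in enumerate(amounts)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait so the final balance is stable
    return account.balance, outcomes


if __name__ == '__main__':
    print(run_withdrawals(1000, [800, 800]))
```

Whichever thread acquires the lock first succeeds and the other fails, so exactly one withdrawal of 800 goes through.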
    



    Queue

    import threading
    from queue import Queue
    
    from tool import benchmark
    
    
    def f(queue):
        n = queue.get()
        print(benchmark(n))
    
    
    if __name__ == '__main__':
        queue = Queue()
        for n in range(10):
            queue.put(n)
    
        for n in range(10):
            thread = threading.Thread(target=f, args=(queue,))
            thread.start()
    

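    Written this way, the output is not synchronized: each thread prints its result whenever it finishes.

    Elapsed time: 26.44 s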
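
A common way to get ordered results back from a Queue-based design is a fixed set of workers plus a second queue for results. The sketch below assumes distinct task values and uses `None` as a stop sentinel; `square`, `worker` and `run_workers` are hypothetical names.

```python
import threading
from queue import Queue


def square(n):
    """Hypothetical task standing in for benchmark."""
    return n * n


def worker(tasks, results):
    while True:
        n = tasks.get()
        if n is None:  # sentinel: no more work for this worker
            break
        results.put((n, square(n)))


def run_workers(numbers, worker_count=4):
    tasks, results = Queue(), Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    # Results arrive in completion order; re-key them by input
    collected = dict(results.get() for _ in numbers)
    return [collected[n] for n in numbers]


if __name__ == '__main__':
    print(run_workers(list(range(10))))
```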




    multiprocessing.dummy

    from multiprocessing.dummy import Pool
    
    from tool import benchmark
    
    
    def f():
        n_list = [n for n in range(10)]
        pool = Pool(processes=8)
        results = pool.map(benchmark, n_list)
        pool.close()
        pool.join()
        print(results)
    
    
    if __name__ == '__main__':
        from timeit import timeit
    
        print(timeit(f, number=1))
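
Despite the package name, `multiprocessing.dummy` provides the Pool API on top of threads, not processes. A small sketch that makes this visible, using a hypothetical `where_am_i` task, records the process id and thread name for each item:

```python
import os
import threading
from multiprocessing.dummy import Pool


def where_am_i(n):
    """Return the task number, process id and thread name that handled it."""
    return n, os.getpid(), threading.current_thread().name


def run(count, workers=4):
    # The pool can be used as a context manager; map() blocks until all
    # results are ready, so leaving the block afterwards is safe
    with Pool(workers) as pool:
        return pool.map(where_am_i, range(count))


if __name__ == '__main__':
    for row in run(8):
        print(row)
```

Every row reports the same process id as the main program, which confirms that the work is done by threads within one process.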
    




    Thread pool (recommended)

    from concurrent.futures import ThreadPoolExecutor
    
    from tool import benchmark
    
    
    def f():
        n_list = [n for n in range(10)]
        with ThreadPoolExecutor() as executor:
            results = list(executor.map(benchmark, n_list))
            print(results)
    
    
    if __name__ == '__main__':
        from timeit import timeit
    
        print(timeit(f, number=1))
    

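    When the function takes several arguments, pass one iterable per argument to executor.map, or wrap the call in a lambda to set only some of them, for example: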

    import time
    
    from concurrent.futures import ThreadPoolExecutor
    
    
    def f(x=1, y=2):
        time.sleep(1)
        return x * y
    
    
    x_list = [1, 2, 3]
    y_list = [4, 5, 6]
    
    with ThreadPoolExecutor() as executor:
        results = list(executor.map(f, x_list, y_list))
        print(results)  # [4, 10, 18]
        results = list(executor.map(lambda y: f(y=y), y_list))
        print(results)  # [4, 5, 6]
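
As an alternative to a lambda, `functools.partial` can fix some arguments ahead of time. A minimal sketch, with hypothetical names `scale` and `scale_all`:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def scale(value, factor=1):
    """Hypothetical two-argument task."""
    return value * factor


def scale_all(values, factor):
    with ThreadPoolExecutor() as executor:
        # partial() fixes factor by keyword; map() supplies value positionally
        return list(executor.map(partial(scale, factor=factor), values))


if __name__ == '__main__':
    print(scale_all([1, 2, 3], 10))
```

Fixing the argument by keyword matters here: the values from map() arrive positionally, so a fixed positional argument would shift them to the next parameter.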
    



    Progress bar

    from concurrent.futures import ThreadPoolExecutor
    
    from tool import benchmark
    
    
    def f():
        n_list = [n for n in range(10)]
        with ThreadPoolExecutor() as executor:
            results = list(executor.map(benchmark, n_list))
            print(results)
    
    
    if __name__ == '__main__':
        from timeit import timeit
    
        print(timeit(f, number=1))
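
The listing above runs the pool but does not itself report progress. One way to do so with the standard library alone is sketched below. It assumes distinct inputs and a hypothetical `slow_square` task, and it is not necessarily what the article intended; a third-party progress-bar library such as tqdm could be used instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(n):
    """Hypothetical task standing in for benchmark."""
    time.sleep(0.05)
    return n * n


def run_with_progress(numbers, report=print):
    results = {}
    with ThreadPoolExecutor() as executor:
        # Map each future back to its input so results can be reordered
        futures = {executor.submit(slow_square, n): n for n in numbers}
        # as_completed() yields each future as soon as it finishes
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            report('{}/{} done'.format(done, len(futures)))
    return [results[n] for n in numbers]


if __name__ == '__main__':
    print(run_with_progress([1, 2, 3, 4]))
```

Unlike `executor.map`, `as_completed` reports tasks in completion order, so the counter advances as soon as any task finishes.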
    




  • Original article: https://blog.csdn.net/lly1122334/article/details/127011043