• Counter, defaultdict, OrderedDict, namedtuple, and deque in Python's collections module


    1 Counter

    Counter counts things. The example below finds how many times each element occurs in a list:

    from collections import Counter
    
    device_temperatures = [13.5, 14.0, 14.0, 14.5, 14.5, 14.5, 15.0, 16.0]
    
    temperature_counter = Counter(device_temperatures)
    
    # Counter({14.5: 3, 14.0: 2, 13.5: 1, 15.0: 1, 16.0: 1})
    print(temperature_counter) 
    print(type(temperature_counter)) # <class 'collections.Counter'>
    print(temperature_counter[13.5]) # 1
    print(temperature_counter[14.0]) # 2
    print(temperature_counter[14.5]) # 3
    print(temperature_counter[15.0]) # 1
    print(temperature_counter[16.0]) # 1
    
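    Beyond simple lookups, a Counter offers a few helpers that are often useful. The sketch below reuses the temperature list from above; the reading 99.9 is an illustrative value that is not in the data.

```python
from collections import Counter

device_temperatures = [13.5, 14.0, 14.0, 14.5, 14.5, 14.5, 15.0, 16.0]
temperature_counter = Counter(device_temperatures)

# The two most frequent readings, as (value, count) pairs
top_two = temperature_counter.most_common(2)
print(top_two)  # [(14.5, 3), (14.0, 2)]

# A missing key yields 0 instead of raising KeyError, and is not added
missing_count = temperature_counter[99.9]
print(missing_count)  # 0
print(99.9 in temperature_counter)  # False

# update() adds the counts from another iterable
temperature_counter.update([16.0, 16.0])
print(temperature_counter[16.0])  # 3
```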

    2 defaultdict

    Example 1

    The following code raises a KeyError, because my_dict has no key 'hi':

    my_dict = {'hello': 5}
    print(my_dict['hi'])
    
    # Traceback (most recent call last):
    #   File "<string>", line 4, in <module>
    # KeyError: 'hi'

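    By contrast, a defaultdict with a default factory does not raise KeyError for a missing key; it creates a default value instead.
    Below is a list of tuples, coworkers, recording the schools each person attended: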

    coworkers = [('Rolf', 'MIT'), ('Jen', 'Oxford'), ('Rolf', 'Cambridge'), ('Charlie', 'Manchester')] 
    

    Now we want to build the following dictionary:

    {
    	'Rolf': ['MIT', 'Cambridge'],
    	'Jen': ['Oxford'],
    	'Charlie': ['Manchester']
    }
    

    The usual way to write it:

    coworkers = [('Rolf', 'MIT'), ('Jen', 'Oxford'), ('Rolf', 'Cambridge'), ('Charlie', 'Manchester')] 
    alma_maters = {}  # (alma mater: the school a person attended)
    for coworker in coworkers:
        if coworker[0] not in alma_maters:
            alma_maters[coworker[0]] = []
        alma_maters[coworker[0]].append(coworker[1])
    

    Or, unpacking each tuple in the loop:

    coworkers = [('Rolf', 'MIT'), ('Jen', 'Oxford'), ('Rolf', 'Cambridge'), ('Charlie', 'Manchester')] 
    alma_maters = {}  # (alma mater: the school a person attended)
    for coworker, place in coworkers:
        if coworker not in alma_maters:
            alma_maters[coworker] = [] # default value
        alma_maters[coworker].append(place)
        
    print(alma_maters)
    

    Rewritten with defaultdict:

    from collections import defaultdict
    
    coworkers = [('Rolf', 'MIT'), ('Jen', 'Oxford'), ('Rolf', 'Cambridge'), ('Charlie', 'Manchester')] 
    # If a key is missing from the dictionary, the function passed as the argument is called
    # Here the function is list, so list() is called and returns an empty list []
    alma_maters = defaultdict(list)  
    for coworker, place in coworkers:
        alma_maters[coworker].append(place)
    
    # To raise an exception when a missing key is accessed, add the line below
    # alma_maters.default_factory = None
    # (with int instead of None, missing keys get the value 0)
            
    print(alma_maters['Rolf'])  # ['MIT', 'Cambridge']
    print(alma_maters['Jen'])  # ['Oxford']
    print(alma_maters['Charlie'])  # ['Manchester']
    print(alma_maters['Anne'])  # []
    
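    The commented-out default_factory line above can be tried in isolation. The following is a minimal sketch, assuming a made-up list of school names, that shows int as the factory and then switches the factory off with None:

```python
from collections import defaultdict

# Illustrative data, not taken from the coworkers list above
schools = ['MIT', 'Oxford', 'MIT', 'Cambridge', 'MIT']

# int() returns 0, so every new key starts counting from 0
school_counts = defaultdict(int)
for school in schools:
    school_counts[school] += 1
print(school_counts['MIT'])  # 3

# With default_factory set to None, a missing key raises KeyError again
school_counts.default_factory = None
try:
    school_counts['Harvard']
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```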

    Example 2

    from collections import defaultdict
    
    my_company = 'Teclado'
    
    coworkers = ['Jen', 'Li', 'Charlie', 'Rhys']
    other_coworkers = [('Rolf', 'Apple Inc.'), ('Anna', 'Google')]
    
    # We cannot pass my_company directly, because defaultdict takes a function (a callable) as its argument
    # lambda: my_company returns my_company
    coworker_companies = defaultdict(lambda: my_company)
    
    for person, company in other_coworkers:
        coworker_companies[person] = company
        
    # coworkers[1] is 'Li'; prints the default value Teclado
    print(coworker_companies[coworkers[1]])  
    
    # other_coworkers[0][0] is 'Rolf'; prints Apple Inc.
    print(coworker_companies[other_coworkers[0][0]])  
    
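    One side effect of Example 2 is worth spelling out: looking up a missing key with [] does not just return the default, it also stores it in the dictionary. The sketch below reuses the names from Example 2 to show the difference between [] and the read-only checks in and get():

```python
from collections import defaultdict

my_company = 'Teclado'
coworker_companies = defaultdict(lambda: my_company)
coworker_companies['Rolf'] = 'Apple Inc.'

# Membership tests and get() do not call the default factory
known_before = 'Li' in coworker_companies
via_get = coworker_companies.get('Li')
size_before = len(coworker_companies)
print(known_before, via_get, size_before)  # False None 1

# Indexing with [] calls the factory and stores the result under the key
via_index = coworker_companies['Li']
known_after = 'Li' in coworker_companies
size_after = len(coworker_companies)
print(via_index, known_after, size_after)  # Teclado True 2
```

    If the dictionary must not grow on lookups, get() with an explicit fallback value is the safer choice.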

    3 OrderedDict

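    Note that the class name is written in Pascal case.

    As the name suggests, OrderedDict is an ordered dictionary: its key-value pairs are kept in the order in which they were inserted.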

    from collections import OrderedDict
    
    o = OrderedDict()
    o['Rolf'] = 6
    o['Jose'] = 10
    o['Jen'] = 3
    
    # keys are always in the order in which they were inserted
    # OrderedDict([('Rolf', 6), ('Jose', 10), ('Jen', 3)])
    print(o)  
    
    o.move_to_end('Rolf')  # move to the end
    # OrderedDict([('Jose', 10), ('Jen', 3), ('Rolf', 6)])
    print(o)  
    
    o.move_to_end('Rolf', last=False)  # move to the opposite end, i.e. the beginning
    # OrderedDict([('Rolf', 6), ('Jose', 10), ('Jen', 3)])
    print(o)
    
    o.popitem()
    
    # OrderedDict([('Rolf', 6), ('Jose', 10)])
    print(o)
    

    o.popitem(last=False) removes the first item instead of the last one.
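
    A short sketch of both directions, using the same three entries as the example above:

```python
from collections import OrderedDict

o = OrderedDict([('Rolf', 6), ('Jose', 10), ('Jen', 3)])

# last=False removes and returns the oldest entry (FIFO order)
first = o.popitem(last=False)
print(first)  # ('Rolf', 6)

# The default, last=True, removes and returns the newest entry (LIFO order)
last = o.popitem()
print(last)  # ('Jen', 3)

remaining = list(o.items())
print(remaining)  # [('Jose', 10)]
```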

    However, since Python 3.7 the built-in dict is guaranteed to preserve insertion order, so OrderedDict is needed much less often.
    Further reading: "Are dictionaries ordered in Python 3.6+?"
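
    OrderedDict still behaves differently from dict in a few ways. The sketch below shows two of them: equality between two OrderedDict objects is order-sensitive, and move_to_end exists only on OrderedDict.

```python
from collections import OrderedDict

a = {'Rolf': 6, 'Jose': 10}
b = {'Jose': 10, 'Rolf': 6}

# Plain dicts compare equal regardless of insertion order
dicts_equal = (a == b)
print(dicts_equal)  # True

# Two OrderedDicts with the same items in a different order are not equal
ordered_equal = (OrderedDict(a) == OrderedDict(b))
print(ordered_equal)  # False

# Comparing an OrderedDict with a plain dict ignores order
mixed_equal = (OrderedDict(a) == b)
print(mixed_equal)  # True

# Reordering helpers are only available on OrderedDict
dict_has_move = hasattr(a, 'move_to_end')
print(dict_has_move)  # False
```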

    4 namedtuple

    namedtuple: gives a name to the tuple type and to each element of the tuple.

    In the code below, it is not obvious what account[0] and account[1] refer to:

    account = ('checking', 1850.90)
    
    print(account[0])  # name
    print(account[1])  # balance
    

    Using namedtuple:

    from collections import namedtuple
    
    account = ('checking', 1850.90)
    
    # The 1st argument is the tuple type's name, matching the variable it is assigned to
    # The 2nd argument is the list of field names
    Account = namedtuple("Account", ['name', 'balance'])
    
    accountNamedTuple_1 = Account('checking', 1850.90)
    print(accountNamedTuple_1.name, accountNamedTuple_1.balance)  # checking 1850.9
    
    accountNamedTuple_2 = Account._make(account)
    account_name_2, account_balance_2 = accountNamedTuple_2
    print(account_name_2, account_balance_2) # checking 1850.9
    
    accountNamedTuple_3 = Account(*account)
    account_name_3, account_balance_3 = accountNamedTuple_3
    print(account_name_3, account_balance_3) # checking 1850.9
    
    print(accountNamedTuple_1._asdict()['balance']) # 1850.9
    print(accountNamedTuple_2._asdict()['balance']) # 1850.9
    print(accountNamedTuple_3._asdict()['balance']) # 1850.9
    

    When reading data from a CSV file or a database, using namedtuple makes the code easier to understand.
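
    As a rough illustration of the CSV case, the sketch below parses an in-memory string in place of a real file. The csv_text contents and the second account are made up for the example.

```python
import csv
import io
from collections import namedtuple

Account = namedtuple('Account', ['name', 'balance'])

# Stand-in for a real CSV file; the rows are hypothetical sample data
csv_text = "name,balance\nchecking,1850.90\nsavings,320.00\n"

reader = csv.reader(io.StringIO(csv_text))
next(reader)  # skip the header row

# Each row becomes an Account, so fields are read by name instead of by index
accounts = [Account(name, float(balance)) for name, balance in reader]

for acc in accounts:
    print(acc.name, acc.balance)

# namedtuples are immutable; _replace returns a modified copy
updated = accounts[0]._replace(balance=2000.0)
print(updated.balance)  # 2000.0
```

    Compared with row[0] and row[1], acc.name and acc.balance document themselves at the point of use.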

    5 deque

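    deque: double-ended queue. There are two main reasons to use a deque instead of a list. First, efficiency: appending and popping at either end of a deque takes O(1) time, whereas inserting or removing at the front of a list takes O(n). Second, thread safety: a deque's appends and pops at either end are thread-safe, so a deque can be shared between threads.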

    from collections import deque
    
    friends = deque(('Rolf', 'Charlie', 'Jen', 'Anna'))
    
    friends.append('Jose')
    friends.appendleft('Anthony')
    print(friends)  # deque(['Anthony', 'Rolf', 'Charlie', 'Jen', 'Anna', 'Jose'])
    
    friends.pop()
    print(friends)  # deque(['Anthony', 'Rolf', 'Charlie', 'Jen', 'Anna'])
    
    friends.popleft()  
    print(friends)  # deque(['Rolf', 'Charlie', 'Jen', 'Anna'])
    
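    A few more deque features, sketched under simple assumptions: maxlen for a bounded buffer, rotate for shifting elements, and two threads appending to one shared deque. The producer function and its tags are hypothetical names used only for this example.

```python
import threading
from collections import deque

# Bounded deque: once full, appending drops the item at the opposite end
recent = deque(maxlen=3)
for name in ['Rolf', 'Charlie', 'Jen', 'Anna']:
    recent.append(name)
print(recent)  # deque(['Charlie', 'Jen', 'Anna'], maxlen=3)

# rotate(1) moves the last element to the front
friends = deque(['Rolf', 'Charlie', 'Jen', 'Anna'])
friends.rotate(1)
print(friends)  # deque(['Anna', 'Rolf', 'Charlie', 'Jen'])

# Two threads append to the same deque without an explicit lock
shared = deque()

def producer(tag, count):
    for i in range(count):
        shared.append((tag, i))

threads = [threading.Thread(target=producer, args=(tag, 1000)) for tag in ('a', 'b')]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(shared))  # 2000
```

    The interleaving of items from the two threads is not deterministic, but no append is lost.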
  • Original article: https://blog.csdn.net/ftell/article/details/125425278