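54_Pandas: Converting a DataFrame or Series to a dictionary (to_dict)

A pandas.DataFrame or pandas.Series can be converted to a dictionary (a dict object) with the to_dict() method.

For pandas.DataFrame, the orient argument specifies how the row labels (index), the column labels (columns), and the values are assigned to the keys and values of the dictionary.

A pandas.Series is converted to a dictionary whose keys are the labels.

This article covers the following topics.

The pandas.DataFrame to_dict() method
- Specifying the dictionary format: the orient argument
- Converting to a type other than dict: the into argument
Creating a dictionary from any two columns of a pandas.DataFrame
Converting a pandas.Series to a dict with to_dict()
- Converting to a type other than dict: the into argument

The following pandas.DataFrame is used as an example.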

import pandas as pd
import pprint
from collections import OrderedDict

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

print(df)
#       col1 col2
# row1     1    a
# row2     2    x
# row3     3    啊

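pprint is imported to make the output easier to read, and OrderedDict is imported to demonstrate specifying the result type with the into argument.

The pandas.DataFrame to_dict() method

Calling to_dict() on a pandas.DataFrame converts it, by default, to a dictionary (a dict object) of the following form.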

d = df.to_dict()

pprint.pprint(d)
# {'col1': {'row1': 1, 'row2': 2, 'row3': 3},
#  'col2': {'row1': 'a', 'row2': 'x', 'row3': '啊'}}

print(type(d))
# <class 'dict'>

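Specifying the dictionary format: the orient argument

The orient argument specifies the output format, that is, how the row labels (index), the column labels (columns), and the values of the pandas.DataFrame are assigned to the keys and values of the dictionary.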
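
As a quick overview before going through each format, the following sketch loops over the six orient values discussed in this article and prints the type of the returned object. The loop and the variable names are only illustrative, and the example DataFrame is rebuilt so that the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# The orient values covered in this article
orients = ['dict', 'list', 'series', 'split', 'records', 'index']

# Map each orient to the type name of the object that to_dict() returns
result_types = {o: type(df.to_dict(orient=o)).__name__ for o in orients}

for orient, type_name in result_types.items():
    print(orient, type_name)
# dict dict
# list dict
# series dict
# split dict
# records list
# index dict
```

Only orient='records' returns a list at the top level; every other format shown here returns a dict.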

dict

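With orient='dict', each key is a column label and each value is a dictionary mapping row labels to values. This is the default format when the orient argument is omitted.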

{column -> {index -> value}}

d_dict = df.to_dict(orient='dict')

pprint.pprint(d_dict)
# {'col1': {'row1': 1, 'row2': 2, 'row3': 3},
#  'col2': {'row1': 'a', 'row2': 'x', 'row3': '啊'}}

print(d_dict['col1'])
# {'row1': 1, 'row2': 2, 'row3': 3}

print(type(d_dict['col1']))
# <class 'dict'>

list

With orient='list', each key is a column label and each value is a list of that column's values. The row labels are lost.

{column -> [values]}

d_list = df.to_dict(orient='list')

pprint.pprint(d_list)
# {'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']}

print(d_list['col1'])
# [1, 2, 3]

print(type(d_list['col1']))
# <class 'list'>

series

With orient='series', each key is a column label and each value is a pandas.Series holding the row labels and values.

{column -> Series(values)}

d_series = df.to_dict(orient='series')

pprint.pprint(d_series)
# {'col1': row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64,
#  'col2': row1    a
# row2    x
# row3    啊
# Name: col2, dtype: object}

print(d_series['col1'])
# row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64

print(type(d_series['col1']))
# <class 'pandas.core.series.Series'>

split

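With orient='split', the keys are 'index', 'columns', and 'data', and the values are the list of row labels, the list of column labels, and the list of rows (each row being a list of values), respectively.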

{index -> [index], columns -> [columns], data -> [values]}

d_split = df.to_dict(orient='split')

pprint.pprint(d_split)
# {'columns': ['col1', 'col2'],
#  'data': [[1, 'a'], [2, 'x'], [3, '啊']],
#  'index': ['row1', 'row2', 'row3']}

print(d_split['columns'])
# ['col1', 'col2']

print(type(d_split['columns']))
# <class 'list'>
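
Because the keys 'index', 'columns', and 'data' have the same names as keyword arguments of the pandas.DataFrame constructor, a split dictionary can be turned back into a DataFrame by unpacking it. The following is a minimal sketch of that round trip; the name df_restored is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

d_split = df.to_dict(orient='split')

# Unpack the dict as keyword arguments: index=..., columns=..., data=...
df_restored = pd.DataFrame(**d_split)

print(df_restored)
#       col1 col2
# row1     1    a
# row2     2    x
# row3     3    啊
```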

records

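With orient='records', the result is a list whose elements are dictionaries, one per row, in which each key is a column label and each value is the corresponding value. The row labels are lost.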

[{column -> value}, ... , {column -> value}]

l_records = df.to_dict(orient='records')

pprint.pprint(l_records)
# [{'col1': 1, 'col2': 'a'}, {'col1': 2, 'col2': 'x'}, {'col1': 3, 'col2': '啊'}]

print(type(l_records))
# <class 'list'>

print(l_records[0])
# {'col1': 1, 'col2': 'a'}

print(type(l_records[0]))
# <class 'dict'>
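
If the row labels need to survive in the records format, one possible approach (not part of to_dict() itself) is to move the index into an ordinary column with reset_index() first. For an unnamed index, the new column is called 'index'. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# reset_index() turns the row labels into a regular column named 'index',
# so they appear in every record.
l_records_with_index = df.reset_index().to_dict(orient='records')

print(l_records_with_index[0])
# {'index': 'row1', 'col1': 1, 'col2': 'a'}
```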

index

With orient='index', each key is a row label and each value is a dictionary mapping column labels to values.

{index -> {column -> value}}

d_index = df.to_dict(orient='index')

pprint.pprint(d_index)
# {'row1': {'col1': 1, 'col2': 'a'},
#  'row2': {'col1': 2, 'col2': 'x'},
#  'row3': {'col1': 3, 'col2': '啊'}}

print(d_index['row1'])
# {'col1': 1, 'col2': 'a'}

print(type(d_index['row1']))
# <class 'dict'>
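
The reverse direction for this format can be sketched with pandas.DataFrame.from_dict(), which also accepts orient='index' and treats the outer keys as row labels. The name df_restored is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

d_index = df.to_dict(orient='index')

# Outer keys become row labels, inner keys become column labels
df_restored = pd.DataFrame.from_dict(d_index, orient='index')

print(df_restored)
#       col1 col2
# row1     1    a
# row2     2    x
# row3     3    啊
```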

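Converting to a type other than dict: the into argument

By passing a type to the into argument, the result can be a subclass such as OrderedDict instead of a plain dictionary (dict).

The dictionaries stored as values of the outer dictionary are also of the specified type.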

od = df.to_dict(into=OrderedDict)

pprint.pprint(od)
# OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2), ('row3', 3)])),
#              ('col2',
#               OrderedDict([('row1', 'a'), ('row2', 'x'), ('row3', '啊')]))])

print(type(od))
# <class 'collections.OrderedDict'>

print(od['col1'])
# OrderedDict([('row1', 1), ('row2', 2), ('row3', 3)])

print(type(od['col1']))
# <class 'collections.OrderedDict'>
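
One case that behaves slightly differently is collections.defaultdict: according to the pandas documentation, it has to be passed as an initialized instance rather than as a class, because the default factory must be known. The sketch below assumes list as the default factory purely for illustration.

```python
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# Pass an initialized defaultdict, not the bare class
dd = df.to_dict(into=defaultdict(list))

print(type(dd))
# <class 'collections.defaultdict'>

print(type(dd['col1']))
# <class 'collections.defaultdict'>

print(dd['col1']['row1'])
# 1
```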

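Creating a dictionary from any two columns of a pandas.DataFrame

A dictionary can also be built from any two columns, chosen from the index and the data columns, by combining dict() and zip().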

print(df.index)
# Index(['row1', 'row2', 'row3'], dtype='object')

print(df['col1'])
# row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64

d_col = dict(zip(df.index, df['col1']))

print(d_col)
# {'row1': 1, 'row2': 2, 'row3': 3}
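
The same pattern works when both the keys and the values come from data columns. The sketch below uses col2 as keys and col1 as values, and also shows an alternative that reaches the same result through set_index() and the Series to_dict() method. The names d_cols and d_alt are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# Keys from col2, values from col1
d_cols = dict(zip(df['col2'], df['col1']))
print(d_cols)
# {'a': 1, 'x': 2, '啊': 3}

# Alternative: make col2 the index, then convert the col1 Series
d_alt = df.set_index('col2')['col1'].to_dict()
print(d_alt == d_cols)
# True
```

In either variant, if the key column contains duplicate values, later rows overwrite earlier ones, since a dictionary cannot hold the same key twice.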

Converting a pandas.Series to a dict with to_dict()

The following pandas.Series is used as an example.

print(df)
#       col1 col2
# row1     1    a
# row2     2    x
# row3     3    啊

s = df['col1']
print(s)
# row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64

print(type(s))
# <class 'pandas.core.series.Series'>

Calling to_dict() on a pandas.Series creates a dictionary whose keys are the labels and whose values are the Series values.

d = s.to_dict()
print(d)
# {'row1': 1, 'row2': 2, 'row3': 3}

print(type(d))
# <class 'dict'>
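
Since the labels become dictionary keys, a Series with duplicate labels cannot be represented losslessly. The following sketch uses a hypothetical Series, s_dup, to show what happens in that case: only the last value for each repeated label is kept.

```python
import pandas as pd

# Hypothetical Series in which the label 'a' appears twice
s_dup = pd.Series([10, 20, 30], index=['a', 'a', 'b'])

d_dup = s_dup.to_dict()
print(d_dup)
# {'a': 20, 'b': 30}
```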

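Converting to a type other than dict: the into argument

With the to_dict() method of pandas.Series as well, passing a type to the into argument converts the result to a subclass such as OrderedDict instead of a plain dictionary (dict).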

od = df['col1'].to_dict(into=OrderedDict)
print(od)
# OrderedDict([('row1', 1), ('row2', 2), ('row3', 3)])

print(type(od))
# <class 'collections.OrderedDict'>

Original article: https://blog.csdn.net/qq_18351157/article/details/128018119
