• 云贝教育 |【技术文章】pg缓存插件介绍


    一、pg_buffercache

    主要作用是查看pg的共享池中缓存的对象信息

    1.1 创建扩展

    1. postgres=# create extension pg_buffercache;
    2. CREATE EXTENSION

    1.2 查看视图pg_buffercache

    1. postgres=# \d pg_buffercache
    2. View "public.pg_buffercache"
    3. Column | Type | Collation | Nullable | Default
    4. ------------------+----------+-----------+----------+---------
    5. bufferid | integer | | |
    6. relfilenode | oid | | |
    7. reltablespace | oid | | |
    8. reldatabase | oid | | |
    9. relforknumber | smallint | | |
    10. relblocknumber | bigint | | |
    11. isdirty | boolean | | |
    12. usagecount | smallint | | |
    13. pinning_backends | integer | | |

    1.3 要看当前模式下的表​​​​​​​

    1. postgres=# \d
    2. List of relations
    3. Schema | Name | Type | Owner
    4. --------+----------------+-------+----------
    5. public | pg_buffercache | view | postgres
    6. public | t1 | table | postgres
    7. (2 rows)

    1.4 查看t1表在当前pg缓存中的信息

    1. postgres=# select count(1) from pg_buffercache where relfilenode='t1'::regclass;
    2. count
    3. -------
    4. 0
    5. (1 row)

    ​​​​​​​0表示没有缓存

    1.5 查询一次t1表​​​​​​​

    1. postgres=# select count(1) from t1;
    2. count
    3. -------
    4. 1
    5. (1 row)
    6. postgres=# select count(1) from pg_buffercache where relfilenode='t1'::regclass;
    7. count
    8. -------
    9. 1
    10. (1 row)

    1.6 查看T1表缓存情况​​​​​​​

    1. postgres=# select * from pg_buffercache where relfilenode='t1'::regclass;
    2. bufferid | relfilenode | reltablespace | reldatabase | relforknumber | relblocknumber | isdirty | usagecount | pinning_backends
    3. ----------+-------------+---------------+-------------+---------------+----------------+---------+------------+------------------
    4. 1006 | 16388 | 1663 | 5 | 0 | 0 | f | 1 | 0
    5. (1 row)

    有记录表示被缓存

    isdirty :f表示不是脏块

    1.7 修改表t1数据​​​​​​​

    1. postgres=# update t1 set id=22 where id=1;
    2. UPDATE 1

    1.8 对比缓存的块是否变脏​​​​​​​

    1. postgres=# select * from pg_buffercache where relfilenode='t1'::regclass;
    2. bufferid | relfilenode | reltablespace | reldatabase | relforknumber | relblocknumber | isdirty | usagecount | pinning_backends
    3. ----------+-------------+---------------+-------------+---------------+----------------+---------+------------+------------------
    4. 1006 | 16388 | 1663 | 5 | 0 | 0 | t | 2 | 0
    5. (1 row)

    isdirty :t表示脏块

    二、pg_prevarm

    预热功能使用pg_prevarm函数,方便将数据缓存到OS缓存中或PG缓存中

    比如生产系统中,数据库重启了,此时发起的业务SQL,就会发生物理读

    语法​​​​​​​

    1. pg_prewarm(
    2. regclass, --预热的relation
    3. mode text default 'buffer', --使用预热的方法
    4. fork text derfault 'main', --relation fork被预热
    5. first_block int8 default null, --预热的第一块号
    6. last_block int8 default null --预热的最后一个块号
    7. ) return int8;
    8. prefetch/read:缓存到os cache
    9. buffer:缓存到os cache和pg shared buffers

    参数说明:

    regclass :数据库对像,通常情况为表名

    mode :加载模式

    • prefetch:异步地将数据预加载到操作系统缓存

    • read:最终结果和 prefetch 一样,但它是同步方式,支持所有平台

    • buffer:将数据预加载到数据库缓存

    fork

    main :主表

    fsm:空间空间地图

    vm:可见性地图

    first_block :开始prewarm的数据块

    last_block :最后 prewarm 的数据块

    2.1 创建prewarm插件

    create EXTENSION pg_prewarm

    2.2 在默认shared_buffer参数,创建一张大表

    1. testdb=# show shared_buffers;
    2. shared_buffers
    3. ----------------
    4. 128MB
    5. (1 row)
    6. testdb=# create table t1 ( id int,name varchar(100),c1 varchar(200),c2 varchar(200));
    7. CREATE TABLE
    8. testdb=# insert into t1 select id,md5(id::varchar),md5(md5(id::varchar)),md5(md5(md5(id::varchar))) from generate_series(1,10000000) as id;
    9. INSERT 0 10000000

    2.3 在没有OS和PG缓存的情况下

    (一)不进行prewarm

    1、查看表体积
    1. testdb=# \d t1
    2. Table "public.t1"
    3. Column | Type | Collation | Nullable | Default
    4. --------+------------------------+-----------+----------+---------
    5. id | integer | | |
    6. name | character varying(100) | | |
    7. c1 | character varying(200) | | |
    8. c2 | character varying(200) | | |
    9. testdb=# \dt+ t1
    10. List of relations
    11. Schema | Name | Type | Owner | Persistence | Access method | Size | Descr
    12. iption
    13. --------+------+-------+----------+-------------+---------------+---------+-------------
    14. public | t1 | table | postgres | permanent | heap | 1281 MB |
    15. (1 row)
    16. testdb=# SELECT pg_size_pretty(pg_total_relation_size('t1'));
    17. pg_size_pretty
    18. ----------------
    19. 1281 MB
    20. (1 row)
    2、重启数据库并消除OS缓存​​​​​​​
    1. pg_ctl restart
    2. echo 3 > /proc/sys/vm/drop_caches
    3、查看执行计划​​​​​​​
    1. testdb=# explain analyze select count(*) from t1;
    2. QUERY PLAN
    3. -------------------------------------------------------------------------------------------------------------------------------------------
    4. Finalize Aggregate (cost=217018.73..217018.74 rows=1 width=8) (actual time=3584.950..3585.012 rows=1 loops=1)
    5. -> Gather (cost=217018.52..217018.73 rows=2 width=8) (actual time=3584.897..3584.981 rows=3 loops=1)
    6. Workers Planned: 2
    7. Workers Launched: 2
    8. -> Partial Aggregate (cost=216018.52..216018.53 rows=1 width=8) (actual time=3559.160..3559.160 rows=1 loops=3)
    9. -> Parallel Seq Scan on t1 (cost=0.00..205601.81 rows=4166681 width=0) (actual time=0.193..3373.351 rows=3333333 loops=3)
    10. Planning Time: 4.743 ms
    11. Execution Time: 3586.400 ms
    12. (8 rows)

    缓存1G的数据,耗时3586.400 ms

    (二)进行prewarm

    分别测试read和buffer模式下的效果

    1、重启数据库并消除OS缓存​​​​​​​
    1. pg_ctl restart
    2. echo 3 > /proc/sys/vm/drop_caches
    2、预热数据到OS缓存
    1. testdb=# select pg_prewarm('t1', 'read', 'main');
    2. pg_prewarm
    3. ------------
    4. 163935
    5. (1 row)
    3、查看执行计划
    1. testdb=# explain analyze select count(*) from t1;
    2. QUERY PLAN
    3. ------------------------------------------------------------------------------------------------------------------------------------------
    4. Finalize Aggregate (cost=217018.73..217018.74 rows=1 width=8) (actual time=657.884..658.970 rows=1 loops=1)
    5. -> Gather (cost=217018.52..217018.73 rows=2 width=8) (actual time=657.516..658.959 rows=3 loops=1)
    6. Workers Planned: 2
    7. Workers Launched: 2
    8. -> Partial Aggregate (cost=216018.52..216018.53 rows=1 width=8) (actual time=652.264..652.265 rows=1 loops=3)
    9. -> Parallel Seq Scan on t1 (cost=0.00..205601.81 rows=4166681 width=0) (actual time=0.092..405.615 rows=3333333 loops=3)
    10. Planning Time: 0.126 ms
    11. Execution Time: 658.997 ms
    12. (8 rows)
    ​​​​​​​4、预热到数据库缓存中​​​​​​​
    1. testdb=# select pg_prewarm('t1', 'buffer', 'main');
    2. pg_prewarm
    3. ------------
    4. 163935
    5. (1 row)
    5、查看执行计划
    1. testdb=# explain analyze select count(*) from t1;
    2. QUERY PLAN
    3. ------------------------------------------------------------------------------------------------------------------------------------------
    4. Finalize Aggregate (cost=217018.73..217018.74 rows=1 width=8) (actual time=681.629..683.325 rows=1 loops=1)
    5. -> Gather (cost=217018.52..217018.73 rows=2 width=8) (actual time=681.485..683.319 rows=3 loops=1)
    6. Workers Planned: 2
    7. Workers Launched: 2
    8. -> Partial Aggregate (cost=216018.52..216018.53 rows=1 width=8) (actual time=674.079..674.080 rows=1 loops=3)
    9. -> Parallel Seq Scan on t1 (cost=0.00..205601.81 rows=4166681 width=0) (actual time=0.025..445.632 rows=3333333 loops=3)
    10. Planning Time: 0.039 ms
    11. Execution Time: 683.353 ms
    12. (8 rows)

    缓存1G的数据,耗时683.353 ms

    总结:缓存到OS缓存中和PG缓存中,两者性能差异不大。但比不预热的情况下效果提升明显

    预热

    参数

    耗时

    3586.400 ms

    read

    658.997 ms

    buffer

    683.353 ms

    三、pgfincore( )

    将数据库对象CACHE到OS层面的缓存

    3.1 安装插件​​​​​​​

    1. testdb=# CREATE EXTENSION pgfincore;
    2. CREATE EXTENSION

    3.2 查看对象缓存信息​​​​​​

    1. testdb=# select * from pgfincore ('t1');
    2. relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
    3. --------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
    4. base/16391/33102 | 0 | 4096 | 262144 | 262144 | 1 | 1191325 | | 0 | 0
    5. base/16391/33102.1 | 1 | 4096 | 65726 | 65726 | 1 | 1191325 | | 0 | 0
    6. (2 rows)

    参数说明

    • relpath:文件位置及名称

    • segment:文件段编号

    • os_page_size:OS page或block大小

    • rel_os_pages:对象占用系统缓存需要的页面个数

    • pages_mem:对象已经占用缓存页面个数

    • group_mem:在缓存中连续的页面组的个数

    • os_pages_free:OS剩余的page数

    • databit:加载信息的位图

    pgsysconf与pgsysconf_pretty

    查看当前OS块大小及使用情况

    1. testdb=# select * from pgsysconf();
    2. os_page_size | os_pages_free | os_total_pages
    3. --------------+---------------+----------------
    4. 4096 | 1190139 | 1997572
    5. (1 row)
    6. testdb=# select * from pgsysconf_pretty();
    7. os_page_size | os_pages_free | os_total_pages
    8. --------------+---------------+----------------
    9. 4096 bytes | 4649 MB | 7803 MB
    10. (1 row)

    pgfadvise_willneed

    将数据库对象缓存到OS CACHE

    1. testdb=# select * from pgfadvise_willneed('t1');
    2. relpath | os_page_size | rel_os_pages | os_pages_free
    3. --------------------+--------------+--------------+---------------
    4. base/16391/33102 | 4096 | 262144 | 1190033
    5. base/16391/33102.1 | 4096 | 65726 | 1190033
    6. (2 rows)

    pgfadvise_dontneed

    将数据库对象刷出OS CACHE

    对当前对象设置dontneed标记。dontneed标记的意思就是当操作系统需要释放内存时优先释放标记为dontneed的pages。

    1. testdb=# select * from pgfadvise_dontneed('t1');
    2. relpath | os_page_size | rel_os_pages | os_pages_free
    3. --------------------+--------------+--------------+---------------
    4. base/16391/33102 | 4096 | 262144 | 1452085
    5. base/16391/33102.1 | 4096 | 65726 | 1517801
    6. (2 rows)
    7. testdb=# select * from pgfincore ('t1');
    8. relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit | pages_dirty | group_dirty
    9. --------------------+---------+--------------+--------------+-----------+-----------+---------------+---------+-------------+-------------
    10. base/16391/33102 | 0 | 4096 | 262144 | 0 | 0 | 1517805 | | 0 | 0
    11. base/16391/33102.1 | 1 | 4096 | 65726 | 0 | 0 | 1517805 | | 0 | 0
    12. (2 rows)

  • 相关阅读:
    Spring 中 Bean 的配置细节
    关于HashCode的问题
    ICPC+CCPC写题日记~
    REGION IN TRANSITION问题
    记录一次失败的Ubuntu 20.04 OpenSSL3.0.5升级记录
    国外LEAD收款渠道介绍:Wise收款教程
    linux周六串讲
    全量、增量数据在HBase迁移的多种技巧实践
    保研复习数据结构记(8)--排序
    对话芯动科技 | 助力云游戏 4K级服务器显卡的探索与创新
  • 原文地址:https://blog.csdn.net/yunbee666/article/details/134421177