• 在Bcache上启动OSD报unable to read osd superblock错误


    环境信息

    环境具体信息
    架构LoongArch
    处理器Loongson-3C5000
    内核版本4.19
    操作系统版本lns8
    Ceph版本Nautilus 14.2.22
    Ceph Cluster单机最小集群,一个Monitor,两个OSD,一个Manager
    PAGESIZE16384
    [root@ceph01 ~]# getconf PAGESIZE
    16384
    
    • 1
    • 2

    问题描述

    使用Bcache加速块设备,在上述环境中创建Bcache,并在Bcache上创建OSD。但是systemctl restart ceph-osd@0.service时失败,/var/log/ceph/ceph-osd.0.log日志如下:

    2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
    2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
    2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
    2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
    2023-10-13 05:26:42.705 fff37c0030 -1 osd.0 0 OSD::init() : unable to read osd superblock
    2023-10-13 05:26:42.705 fff37c0030  1 bluestore(/var/lib/ceph/osd/ceph-0) umount
    2023-10-13 05:26:42.705 fff37c0030  4 rocksdb: [db/db_impl.cc:390] Shutdown: canceling all background work
    2023-10-13 05:26:42.705 fff37c0030  4 rocksdb: [db/db_impl.cc:563] Shutdown complete
    2023-10-13 05:26:42.709 fff37c0030  1 bluefs umount
    2023-10-13 05:26:42.709 fff37c0030  1 bdev(0xaac6157500 /var/lib/ceph/osd/ceph-0/block.wal) close
    2023-10-13 05:26:42.989 fff37c0030  1 bdev(0xaac6157880 /var/lib/ceph/osd/ceph-0/block.db) close
    2023-10-13 05:26:43.273 fff37c0030  1 bdev(0xaac6157c00 /var/lib/ceph/osd/ceph-0/block) close
    2023-10-13 05:26:43.509 fff37c0030  1 freelist shutdown
    2023-10-13 05:26:43.509 fff37c0030  1 bdev(0xaac6156000 /var/lib/ceph/osd/ceph-0/block) close
    2023-10-13 05:26:43.709 fff37c0030 -1  ** ERROR: osd init failed: (22) Invalid argument
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15

    可以看到OSD::init() : unable to read osd superblock,在OSD初始化时,无法读取OSD superblock。

    解决方法

    有两种解决办法:

    1. 将内核参数——PAGESIZE修改为4K。在鲲鹏BoostKit分布式存储使能套件文档中提供了将内核参数——PAGESIZE修改为4K的方法。
    2. (推荐)在loongarch平台16K页大小情况下,OSD采用direct write写superblock到地址8K-12K,采用buffer write写设备标签到地址0-4K,对buffer write操作系统会按页对齐刷盘,superblock和设备标签刚好在同一个页上,刷盘导致superblock被覆盖,无法读出正确的数据。将写设备标签改成direct write修复此问题。

    参考

  • 相关阅读:
    高可用系统架构——关于语雀宕机的思考
    35、CSS进阶——画布背景
    基于Springboot的校园求职招聘系统(有报告)。Javaee项目,springboot项目。
    Decoupled Knowledge Distillation论文阅读+代码解析
    c高级day2 linux指令的补充和shell脚本
    LeetCode 70. 爬楼梯
    src学习记录
    娄底锂电池隔膜研发实验室建设资料科普
    Nginx平滑升级的详细操作方法
    arm64页表tlbi指令api接口
  • 原文地址:https://blog.csdn.net/gengduc/article/details/133970133