• MySQL text 能存多少个字符


    MySQL text 能存多少个字符

    前言

    今天测试给提了个bug, 排查原因是插入数据时字段长度不够导致。我使用的是MySQL8的数据库,在给某个表新增一条数据,其中一个字段submit_info使用的是text的类型预设用来存储大的json字符串。之前对text到底能存多长的字符串没概念,恰好这次预到问题决定调查一下。这里记录一下

    一、字符集字符长度和字节长度

    首先需要知道字符长度和字节长度、和字符集,不同的字符集存储的中文字符时占的字节长度不一样。

    MySQL在当前库中通过 show variables like '%char%';可查看当前数据库的字符集character_set_database->utf8mb4

    show variables like ‘%char%’;

    Variable_nameValue
    character_set_clientutf8mb4
    character_set_connectionutf8mb4
    character_set_databaseutf8mb4
    character_set_filesystembinary
    character_set_resultsutf8mb4
    character_set_serverutf8mb4
    character_set_systemutf8
    character_sets_dirC:\Program Files\MySQL\MySQL Server 8.0\share\charsets\

    utf8mb4字符集下英文占用1个字节长度,一般汉字占3-4个字节长度。可用 length(字段)char_length(字段)进行区分

    select version(), submit_info,length(submit_info),char_length(submit_info) from appro_flow_main where id in(35,36);

    version()submit_infolength(submit_info)char_length(submit_info)
    8.0.18吕吕吕93
    8.0.18aaa33

    根据MySQL官网的资料显示,text的长度是 L < 2^16 = 65536个字节长度,注意这里是字节长度,而不是字符长度(varchar(n)这里的n是字符长度),所以说 text 在 character_set_database->utf8mb4 字符集下,大约能存 65535 / 3 = 21845个汉字 并不多,如果存json串实际上并存储不了多大的json对象尤其是包含中文比较多的情况下。

    String Type Storage Requirements

    In the following table, M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value.

    Data TypeStorage Required
    CHAR(*M*)The compact family of InnoDB row formats optimize storage for variable-length character sets. See COMPACT Row Format Storage Characteristics. Otherwise, M × w bytes, <= *M* <= 255, where w is the number of bytes required for the maximum-length character in the character set.
    BINARY(*M*)M bytes, 0 <= *M* <= 255
    VARCHAR(*M*), VARBINARY(*M*)L + 1 bytes if column values require 0 − 255 bytes, L + 2 bytes if values may require more than 255 bytes
    TINYBLOB, TINYTEXTL + 1 bytes, where L < 2^8
    BLOB, TEXTL + 2 bytes, where L < 2^16
    MEDIUMBLOB, MEDIUMTEXTL + 3 bytes, where L < 2^24
    LONGBLOB, LONGTEXTL + 4 bytes, where L < 2^32
    ENUM('*value1*','*value2*',...)1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum)
    SET('*value1*','*value2*',...)1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum)

    二、验证案例

    (一)表结构DDL和Java Entity

    CREATE TABLE `appro_flow_main` (
      `id` int(9) NOT NULL AUTO_INCREMENT COMMENT '物理主键id',
      `flow_instance_no` varchar(64) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程实例号',
      `flow_instance_name` varchar(128) COLLATE utf8mb4_general_ci NOT NULL COMMENT '流程实例名称',
      `flow_no` varchar(32) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程号',
      `appro_status` char(1) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程审核状态(进行中:1、已结束:0、保存:2、撤销)',
      `submit_info` text COLLATE utf8mb4_general_ci COMMENT '提交信息json格式【元数据格式:name、key、value、type】',
      `sponsor` varchar(128) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程发起人',
      `initi_date` int(8) DEFAULT NULL COMMENT '流程发起日期',
      `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
      `update_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '更新时间',
      `table_id` varchar(128) COLLATE utf8mb4_general_ci DEFAULT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `uk_fin_fn` (`flow_instance_no`,`flow_no`)
    ) ENGINE=InnoDB AUTO_INCREMENT=37 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='审批流程主表'
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    /**
     * 流程主表
     *
     * @author lvzb
     * @date 2022/08/09  10:40
     **/
    @Data
    @Builder
    @NoArgsConstructor
    @AllArgsConstructor
    @TableName("appro_flow_main")
    public class FlowMain implements Serializable {
    
        @TableId(type = IdType.AUTO)
        private Integer id;
        private String flowInstanceNo;
        private String flowInstanceName;
        private String flowNo;
        private String approStatus;
        private String submitInfo;
        private String sponsor;
        private Integer initiateDate;
        private Integer branch;
        private Date createTime;
        private Date updateTime;
    
    }
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27

    (二)Java代码

    使用的mybatis-plus 这里只给出核心代码。注意如果for循环里 i 从 0.0开始的话,第一条insert就会直接抛异常 text的最大长度是65535个字节长度,2^16 = 65536

        @Test
        void createWorld() {
            double num = Math.pow(2, 16);
            System.out.println("长度:" + num);
            StringBuilder sb = new StringBuilder();
            StringBuilder cb = new StringBuilder();
            for (double i = 1.0; i < num; i++) {
                sb.append("吕");
                cb.append("a");
            }
    
            flowMainMapper.insert(FlowMain.builder()
                    .flowNo("" + System.nanoTime())
                    .flowInstanceName("a")
                    .flowInstanceNo("aaa")
                    .submitInfo(cb.toString())
                    .build());
    
            flowMainMapper.insert(FlowMain.builder()
                    .flowNo("" + System.nanoTime())
                    .flowInstanceName("a")
                    .flowInstanceNo("aaa")
                    .submitInfo(sb.toString())
                    .build());
        }
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25

    控制台打印如下:

    长度:65536.0
    Creating a new SqlSession
    SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@38f502fc] was not registered for synchronization because synchronization is not active
    2022-10-27 15:57:53.448  INFO 58032 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Starting...
    2022-10-27 15:57:53.922  INFO 58032 --- [           main] com.zaxxer.hikari.HikariDataSource       : HikariPool-1 - Start completed.
    JDBC Connection [HikariProxyConnection@1440325059 wrapping com.mysql.cj.jdbc.ConnectionImpl@521a506c] will not be managed by Spring
    ==>  Preparing: INSERT INTO appro_flow_main ( flow_instance_no, flow_instance_name, flow_no, submit_info ) VALUES ( ?, ?, ?, ? )
    ==> Parameters: aaa(String), a(String), 360639398372500(String), aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...........
    <==    Updates: 1
    Closing non transactional SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@38f502fc]
    Creating a new SqlSession
    SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@4fc3529] was not registered for synchronization because synchronization is not active
    JDBC Connection [HikariProxyConnection@954492773 wrapping com.mysql.cj.jdbc.ConnectionImpl@521a506c] will not be managed by Spring
    ==>  Preparing: INSERT INTO appro_flow_main ( flow_instance_no, flow_instance_name, flow_no, submit_info ) VALUES ( ?, ?, ?, ? )
    ==> Parameters: aaa(String), a(String), 360640025716500(String), 吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕..............
    Closing non transactional SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@4fc3529]
    
    
    org.springframework.dao.DataIntegrityViolationException: 
    ### Error updating database.  Cause: com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'submit_info' at row 1
    ### The error may exist in demo/mybatis/plus/dao/mapper/FlowMainMapper.java (best guess)
    ### The error may involve demo.mybatis.plus.dao.mapper.FlowMainMapper.insert-Inline
    ### The error occurred while setting parameters
    ### SQL: INSERT INTO appro_flow_main  ( flow_instance_no, flow_instance_name, flow_no,  submit_info )  VALUES  ( ?, ?, ?,  ? )
    ### Cause: com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'submit_info' at row 1
    ; Data truncation: Data too long for column 'submit_info' at row 1; nested exception is com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'submit_info' at row 1
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26

    三、结论

    实际上根据 以上的分析

    • text 可以存储 65535个字节=> 65535 (byte)/1024 ≈ 64KB 大小
    • 在utf8mb4的字符集下 text 最多可以存储 65535 / 3(一个汉字的占用字节长度) = 21845, 2万个左右的中文字符

    由上可知,比较大的JSON串尤其是含中文较多的不适合使用text进行存储。根据需求使用 更长的 MEDIUMTEXTLONGTEXT

    参考资料

  • 相关阅读:
    在Vue中使用vant的方法,踩坑in .node_modulesvantesdialogDialog.js
    Ubuntu编译运行socket.io
    讯飞开放平台--星火认知大模型--开发技术文档--js实例代码详解
    HTML5期末大作业【红色的电影售票平台网站】web前端 html+css+javascript网页设计实例 企业网站制作
    如何使用ARM协处理器CP15在32位ARRCH模式下操作64位寄存器)
    『外卖好评(通用50字)』
    thinkphp5.1 单元测试phpunit使用和常见问题
    buildroot中将编译好的库(Qt,tslib)放入嵌入式linux文件系统
    GBase 8c V3.0.0数据类型——模式可见性查询函数
    CentOS7.9 下修改MariaDB访问端口不能访问
  • 原文地址:https://blog.csdn.net/LvQiFen/article/details/127557657