• 采用python正则表达式修改替换一行文本中的内容


    现在已知一种格式的SQL语句:

    update resource_enus `TREE_CODE` = '1002020', `CODE` = 'Web_3D', `NAME` = 'junjun' where `REFID` = 'a1034ef6632ad';
    update resource_enus `TREE_CODE` = '1003333', `CODE` = 'PR', `NAME` = 'xuanxuan' where `REFID` = 'a1034ef6632ad';
    update resource_enus `TREE_CODE` = '100420', `CODE` = 'Dddddd', `NAME` = 'chichi' where `REFID` = 'a1034ef6632ad';
    
    • 1
    • 2
    • 3

    该SQL语句通过REFID查询到对应的数据,然后对该行数据进行更新,又因为查询数据的条件是REFID属性,但是REFID属性并不是唯一的,所以REFID不能查询到具体的哪一条数据记录。

    通过搜集信息知道了由TREE_CODECODE两个字段可以查询到唯一数据。

    因为上述的SQL语句有几千条,通过手动的方式效率太多低下,所以现在采用python写正则表达式脚本的方式进行替换。

    import os, regex
    #str = "update resource_enus `TREE_CODE` = '100420', `CODE` = 'Dddd11_dd', `NAME` = 'chichi' where `REFID` = '4ef6632ad321';"
    
    pattren1 = regex.compile(r"`TREE_CODE` = '[0-9]*'")
    pattren2 = regex.compile(r"`CODE` = '[0-9a-zA-Z_]*'")
    pattren3 = regex.compile(r"`REFID` = '[0-9a-zA-Z]*'")
    
    def getNewSQL(tmpSQL):
        res1 = regex.findall(pattren1, tmpSQL)
        res2 = regex.findall(pattren2, tmpSQL)
        Query = res1[0] + " AND " + res2[0]
        return regex.sub(pattren3, Query, tmpSQL)
    
    res = ""
    with open(r"D:\VS Code Project\Regrex_Demo\oldSQL.sql", 'r', encoding='utf-8') as f: #以只读的方式打开旧SQL存放的文件
        for line in f.readlines():
            oldSQL = line.strip() #strip()方法用于消除读取到的每行数据末尾的换行符
            res += (getNewSQL(oldSQL) + '\n')
    f.close()
    
    with open(r"D:\VS Code Project\Regrex_Demo\newSQL.sql", 'w', encoding='utf-8') as w: #以只写的方式打开新SQL存放的文件
        w.write(res) #将新SQL存入该文件
    w.close()
    
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24

    上述代码中,通过创建三种正则匹配模式,pattren1 用于匹配到类似于TREE_CODE = '100420'的文本;pattren2 于匹配到类似于CODE = 'Dddd11_dd';pattren3用于匹配到类似于REFID = '4ef6632ad321'
    然后分别通过regex.findall()方法找到每一行SQL语句中的TREE_CODE = '100420'部分和CODE = 'Dddd11_dd'部分,之后将他们拼接成Query = TREE_CODE = '100420' AND CODE = 'Dddd11_dd',之后通过regex.sub()方法将Query语句替换掉SQL中的REFID = '4ef6632ad321'部分,从而组成新的SQL语句。

    上诉代码执行完毕之后的在newSQL.sql文件中生成了如下SQL代码:

    update resource_enus `TREE_CODE` = '1002020', `CODE` = 'Web_3D', `NAME` = 'junjun' where `TREE_CODE` = '1002020'AND`CODE` = 'Web_3D';
    update resource_enus `TREE_CODE` = '1003333', `CODE` = 'PR', `NAME` = 'xuanxuan' where `TREE_CODE` = '1003333'AND`CODE` = 'PR';
    update resource_enus `TREE_CODE` = '100420', `CODE` = 'Dddddd', `NAME` = 'chichi' where `TREE_CODE` = '100420'AND`CODE` = 'Dddddd';
    
    • 1
    • 2
    • 3
  • 相关阅读:
    NVMe开发——PCIe复位
    使用Jupyter Notebook调试PySpark程序错误总结
    Neutron — API Service Web 开发框架
    配置两个网关之间通过IPSec VPN互联并通过总部IPSec网关进行NAT后上网
    [高性能] 关于如何高效的往本地写入视频
    论文翻译解读:Logmap:Logic-based and scalable ontology matching
    四、Shell编程_2Shell变量
    14 C++ 二叉树创建 1 头文件 The binary tree creates1 header file
    elasticsearch-7.9.3 单节点启动配置
    centos在docker中搭建靶场
  • 原文地址:https://blog.csdn.net/qq_44887733/article/details/126111349