Regular Expressions for Data Analysis

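Basic matching rules

Symbol	Example	Description
literal	hello	Matches the literal characters
re1|re2	he|she	Matches either expression re1 or expression re2
.	a.b	Matches any single character except \n
[x-y]	[A-Z]	Matches a single character within the given range
[^…]	[^abc], [^a-z]	Matches a single character that is not in the set
^	^hello world	Matches only at the start of the string
$	hello world$	Matches only at the end of the string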
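
To make the table concrete, here is a minimal sketch that tries each basic rule with Python's standard `re` module. The sample strings and variable names are made up for illustration and do not come from the original article.

```python
import re

# Alternation: matches either "he" or "she".
alternation = re.findall(r"he|she", "he said she left")

# Dot: matches any single character except a newline, so "a\nb" is skipped.
dot = re.findall(r"a.b", "acb a\nb a-b")

# Character range and negated set.
uppercase = re.findall(r"[A-Z]", "Hello World")
not_abc = re.findall(r"[^abc]", "abcde")

# Anchors: ^ ties the pattern to the start, $ to the end of the string.
starts = re.search(r"^hello world", "hello world!") is not None
ends = re.search(r"hello world$", "say hello world") is not None
not_at_start = re.search(r"^hello world", "say hello world") is None

print(alternation, dot, uppercase, not_abc, starts, ends, not_at_start)
```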

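Controlling the number of repetitions

Symbol	Example	Description
*	[A-Za-z0-9]*	Matches zero or more repetitions of the preceding expression (here, a letter or digit)
+	[a-z]+\.com	Matches one or more repetitions of the preceding expression (here, a lowercase letter), followed by .com
?	[hello]?	Matches zero or one occurrence of the preceding expression
{N}	[0-9]{3}	Matches exactly N repetitions of the preceding expression (here, three digits)
{M,N}	[0-9]{1,3}	Matches between M and N repetitions of the preceding expression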
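
The quantifiers can be checked the same way. This is only a sketch: the input strings are invented, and `colou?r` is an extra illustration of `?` that is not in the table.

```python
import re

# *: zero or more, so even the empty string matches in full.
star_empty = re.fullmatch(r"[A-Za-z0-9]*", "") is not None

# +: at least one lowercase letter before ".com".
plus = re.findall(r"[a-z]+\.com", "see example.com and test.com")

# ?: the preceding "u" is optional.
optional = re.findall(r"colou?r", "color colour")

# {N}: exactly three digits; {M,N}: between one and three digits (greedy).
exactly_three = re.findall(r"[0-9]{3}", "12 345 6789")
one_to_three = re.findall(r"[0-9]{1,3}", "192.168.1.1000")

print(star_empty, plus, optional, exactly_three, one_to_three)
```

Note that `{1,3}` splits `1000` into `100` and `0`, because the quantifier stops after three digits and the scan resumes from there.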

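Matching by surrounding context (lookbehind and lookahead)

Symbol	Example	Description
(?<=…)	(?<=hel)[a-z]{2}	Matches the two characters that follow hel (lo in hello)
(?=…)	[a-z]{2}(?=rld)	Matches the two characters that precede rld (wo in world)
(?<!…)	(?<!192\.168\.)…	Matches text that is not preceded by 192.168.
(?!…)	(?!\.cn)	Matches only where the text that follows is not .cn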
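
The four assertions can be sketched as follows. The first two patterns are taken from the table. The negative-lookbehind and negative-lookahead patterns are hypothetical illustrations written for this example, since the original rows do not show a complete pattern.

```python
import re

# Positive lookbehind / lookahead, as in the table.
after_hel = re.findall(r"(?<=hel)[a-z]{2}", "hello")
before_rld = re.findall(r"[a-z]{2}(?=rld)", "world")

# Negative lookbehind (hypothetical pattern): the trailing "digits.digits"
# is accepted only when it is not preceded by "192.168.".
tail = re.compile(r"(?<!192\.168\.)\b\d+\.\d+$")
private_tail = tail.search("192.168.1.1")
other_tail = tail.search("10.0.1.1")

# Negative lookahead (hypothetical pattern): a name followed by a suffix,
# rejected when that suffix starts with ".cn".
non_cn = re.findall(r"[a-z]+(?!\.cn)\.[a-z]+", "example.com example.cn")

print(after_hel, before_rld, private_tail, other_tail.group(), non_cn)
```

Lookarounds only test the neighbouring text; they do not consume it, which is why `after_hel` contains `lo` and not `hello`.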

Docstring from the re module source

The special characters are:
    "."      Matches any character except a newline.
    "^"      Matches the start of the string.
    "$"      Matches the end of the string or just before the newline at
             the end of the string.
    "*"      Matches 0 or more (greedy) repetitions of the preceding RE.
             Greedy means that it will match as many repetitions as possible.
    "+"      Matches 1 or more (greedy) repetitions of the preceding RE.
    "?"      Matches 0 or 1 (greedy) of the preceding RE.
    *?,+?,?? Non-greedy versions of the previous three special characters.
    {m,n}    Matches from m to n repetitions of the preceding RE.
    {m,n}?   Non-greedy version of the above.
    "\\"     Either escapes special characters or signals a special sequence.
    []       Indicates a set of characters.
             A "^" as the first character indicates a complementing set.
    "|"      A|B, creates an RE that will match either A or B.
    (...)    Matches the RE inside the parentheses.
             The contents can be retrieved or matched later in the string.
    (?aiLmsux) Set the A, I, L, M, S, U, or X flag for the RE (see below).
    (?:...)  Non-grouping version of regular parentheses.
    (?P<name>...) The substring matched by the group is accessible by name.
    (?P=name)     Matches the text matched earlier by the group named name.
    (?#...)  A comment; ignored.
    (?=...)  Matches if ... matches next, but doesn't consume the string.
    (?!...)  Matches if ... doesn't match next.
    (?<=...) Matches if preceded by ... (must be fixed length).
    (?<!...) Matches if not preceded by ... (must be fixed length).
    (?(id/name)yes|no) Matches yes pattern if the group with id/name matched,
                       the (optional) no pattern otherwise.
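
A few of the entries above are easier to see in action. The following is a small sketch with invented sample strings, covering greedy versus non-greedy repetition, non-capturing groups, named groups, and named backreferences.

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy .* runs to the last </b>; non-greedy .*? stops at the first one.
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

# (?:...) groups without capturing, so findall returns whole matches.
repeated = re.findall(r"(?:ab)+", "ababab")

# (?P<name>...) makes a group accessible by name.
date = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2022-11")
year, month = date.group("year"), date.group("month")

# (?P=name) matches the same text the named group matched earlier.
doubled = re.search(r"(?P<word>\b\w+)\s+(?P=word)\b", "it is is fine")

print(greedy, lazy, repeated, year, month, doubled.group())
```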

re.compile()

re.compile(pattern, flags=0)

Compiles a regular expression into a pattern object that can be reused for repeated matching.
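
A minimal sketch of reusing a compiled pattern; the pattern and the sample strings are illustrative only.

```python
import re

# Compile once, then reuse the pattern object across many strings.
pattern = re.compile(r"[a-z]+\.com", flags=re.IGNORECASE)

candidates = ["Example.COM", "example.org"]
hits = [pattern.search(s) is not None for s in candidates]

print(hits)
```

The pattern object exposes the same operations as the module-level functions (`match`, `search`, `sub`, and so on), with the pattern and flags already bound.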

re.match()

re.match(pattern, string, flags=0)

Tries to match the pattern at the start of the string; returns None if the match fails.
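
For example (a sketch with an invented input string), a pattern that occurs later in the string is not found by `re.match`:

```python
import re

text = "hello world"

hit = re.match(r"hello", text)    # pattern is at position 0
miss = re.match(r"world", text)   # pattern exists, but not at the start

print(hit.group(), hit.span(), miss)
```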

re.search()

re.search(pattern, string, flags=0)

Scans the entire string and returns the first successful match, or None if there is no match anywhere.
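
A short sketch of the difference from `re.match`, using an invented input string: the match may start anywhere, and only the first one is returned.

```python
import re

first = re.search(r"\d+", "order 66 shipped in 3 days")
nothing = re.search(r"\d+", "no digits here")

print(first.group(), first.span(), nothing)
```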

re.sub()

re.sub(pattern, repl, string, count=0, flags=0)
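
The original gives no description for this signature. In the standard library, `re.sub` returns the string obtained by replacing the non-overlapping matches of `pattern` in `string` with `repl`; `count=0` means all matches are replaced, and `repl` may be a string (with backreferences such as `\1`) or a function. A minimal sketch with invented inputs:

```python
import re

phone = "tel: 555-1234"

# Replace every digit, then only the first one.
masked = re.sub(r"\d", "#", phone)
first_only = re.sub(r"\d", "#", phone, count=1)

# Backreferences in the replacement string refer to captured groups.
swapped = re.sub(r"(\w+)@(\w+)", r"\2 at \1", "user@example")

# repl can also be a function that receives the match object.
upper = re.sub(r"[a-z]+", lambda m: m.group().upper(), "a1b2")

print(masked, first_only, swapped, upper)
```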

菜鸟教程 (Runoob tutorial)


Original article: https://blog.csdn.net/weixin_42213421/article/details/127967125