6.9.re

import re
from toolkit.Help import Help as H

Windows 10
Python 3.7.3 @ MSC v.1915 64 bit (AMD64)
Latest build date 2020.04.11
re version:  2.2.1

Functions of the re module

The re module contains 12 functions:

h = H(re)
d = h.dicts
d["function"]

module

['compile',
 'escape',
 'findall',
 'finditer',
 'fullmatch',
 'match',
 'purge',
 'search',
 'split',
 'sub',
 'subn',
 'template']

`match`

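Tries to match the regular expression at the start of the string and returns a Match object, or None if there is no match.

Note: match does not scan ahead. If the pattern fails to match at the very start of the string, the result is None, even when a match exists later in the string.

# Match part of the string
content = 'Hello 123'
result = re.match(r'^Hello\s{1}\d{2}', content)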
result

<re.Match object; span=(0, 8), match='Hello 12'>

# The match fails at the very start of the string (lowercase 'h')
content = 'Hello 123'
result = re.match(r'^hello\s{1}\d{2}', content)
print(result)

None

The Match object provides several methods and attributes for reading the match result:

[i for i in dir(re.Match) if "_" not in i]

['end',
 'endpos',
 'expand',
 'group',
 'groupdict',
 'groups',
 'lastgroup',
 'lastindex',
 'pos',
 're',
 'regs',
 'span',
 'start',
 'string']
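
A short sketch of the most commonly used accessors from the list above. The pattern and the group names `word` and `number` are illustrative choices, not part of the original notes.

```python
import re

# Two named groups: a word, one whitespace character, then a run of digits
m = re.match(r'(?P<word>\w+)\s(?P<number>\d+)', 'Hello 123')

print(m.group())             # whole match: 'Hello 123'
print(m.group(1))            # first group by index: 'Hello'
print(m.group('number'))     # group by name: '123'
print(m.groups())            # all groups: ('Hello', '123')
print(m.groupdict())         # {'word': 'Hello', 'number': '123'}
print(m.span())              # (start, end) of the whole match: (0, 9)
print(m.start(2), m.end(2))  # bounds of the second group: 6 9
```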

`fullmatch`

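Tries to match the regular expression against the whole string and returns a Match object, or None if there is no match.

Note: the pattern must match the entire string; otherwise the result is None.

# The pattern covers only part of the string
content = 'Hello 123'
result = re.fullmatch(r'^Hello\s{1}\d{2}', content)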
print(result)

None

# The pattern covers the whole string
content = 'Hello 123'
result = re.fullmatch(r'^Hello\s{1}\d{3}', content)
result

<re.Match object; span=(0, 9), match='Hello 123'>
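
Because fullmatch only succeeds when the pattern covers the entire string, it is a natural fit for input validation. A minimal sketch, assuming a hypothetical helper `is_date` that only checks the shape YYYY-MM-DD (not whether the date is a real calendar date):

```python
import re

def is_date(text):
    """Return True only if the whole string has the shape YYYY-MM-DD."""
    return re.fullmatch(r'\d{4}-\d{2}-\d{2}', text) is not None

print(is_date('2016-12-15'))        # True
print(is_date('2016-12-15 12:00'))  # False: trailing text is not allowed
```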

`search`

Scans the string for a match of the regular expression and returns a Match object, or None if no match is found.

Note: only the first match found is returned.

content = 'Hello 123'
result = re.search(r'\d', content)
print(result)

<re.Match object; span=(6, 7), match='1'>
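
To make the difference from match concrete, the sketch below runs the same pattern through both functions. The pattern `\d+` is an illustrative choice.

```python
import re

content = 'Hello 123'

# match only tries position 0, where 'H' is not a digit
print(re.match(r'\d+', content))   # None

# search scans forward until it finds the first match
print(re.search(r'\d+', content))  # <re.Match object; span=(6, 9), match='123'>
```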

`findall`

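Scans the string for matches of the regular expression and returns a list of the matched strings (not Match objects), or an empty list if nothing matches.

Note: all non-overlapping matches are returned.

content = 'Hello 123'
result = re.findall(r'\d', content)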
print(result)

['1', '2', '3']
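
The shape of the findall result depends on capturing groups in the pattern: with no group it returns whole matches, with one group it returns that group's text, and with several groups it returns tuples. A small sketch with an illustrative input string:

```python
import re

content = '2016-12-15 12:00'

# No groups: a list of whole matches
print(re.findall(r'\d+', content))          # ['2016', '12', '15', '12', '00']

# One group: a list of that group's text only
print(re.findall(r'(\d+):', content))       # ['12']

# Several groups: a list of tuples, one tuple per match
print(re.findall(r'(\d+):(\d+)', content))  # [('12', '00')]
```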

`finditer`

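Scans the string for matches of the regular expression and returns an iterator that yields Match objects.

Note: all non-overlapping matches are returned.

content = 'Hello 123'
result = re.finditer(r'\d', content)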
list(result)

[<re.Match object; span=(6, 7), match='1'>,
 <re.Match object; span=(7, 8), match='2'>,
 <re.Match object; span=(8, 9), match='3'>]
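
Unlike findall, finditer gives access to the position of each match, and it produces matches lazily. A minimal sketch; the input string and the name `positions` are illustrative.

```python
import re

content = 'Hello 123 and 45'

# Collect each matched text together with its start offset
positions = [(m.group(), m.start()) for m in re.finditer(r'\d+', content)]
print(positions)  # [('123', 6), ('45', 14)]
```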

`sub`

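Scans the string for matches of the regular expression, replaces each match with the given string, and returns the new string.

Note: every match found is replaced.

content = 'Hello 123'
result = re.sub(r'\d', "A", content)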
result

'Hello AAA'
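
sub also supports a `count` limit, backreferences in the replacement string, and a function as the replacement. The following sketch shows the three variants on the same input; the specific replacements are illustrative.

```python
import re

content = 'Hello 123'

# Replace only the first match
print(re.sub(r'\d', 'A', content, count=1))        # 'Hello A23'

# \1 and \2 in the replacement refer to the captured groups
print(re.sub(r'(\w+) (\d+)', r'\2 \1', content))   # '123 Hello'

# A function receives each Match object and returns the replacement text
print(re.sub(r'\d', lambda m: str(int(m.group()) * 2), content))  # 'Hello 246'
```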

`subn`

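Scans the string for matches of the regular expression, replaces each match with the given string, and returns a tuple of the new string and the number of substitutions made.

Note: every match found is replaced.

content = 'Hello 123'
result = re.subn(r'\d', "A", content)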
result

('Hello AAA', 3)

`split`

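Splits the string by the regular expression; the string is cut at each match, and the matched text itself is dropped.

Note: the string is split at every match.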

content = 'Hello 123'
result = re.split('[e2]', content)
result

['H', 'llo 1', '3']
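
Two variations are worth knowing: `maxsplit` limits the number of cuts, and wrapping the pattern in a capturing group keeps the separators in the result. A short sketch on the same input:

```python
import re

content = 'Hello 123'

# Stop after the first split; the rest of the string is left intact
print(re.split(r'[e2]', content, maxsplit=1))  # ['H', 'llo 123']

# A capturing group keeps the matched separators in the output list
print(re.split(r'([e2])', content))            # ['H', 'e', 'llo 1', '2', '3']
```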

`compile`

The compile function compiles a pattern string into a Pattern object so that it can be reused in later matching:

content1 = '2016-12-15 12:00'
content2 = '2016-12-17 12:55'
content3 = '2016-12-22 13:21'
pattern = re.compile(r'\d{2}:\d{2}')

result1 = re.sub(pattern, '', content1)
result2 = re.sub(pattern, '', content2)
result3 = re.sub(pattern, '', content3)
print(result1, result2, result3)

2016-12-15  2016-12-17  2016-12-22

compile also accepts flags such as re.S, so they do not have to be passed again when calling search, findall, and the other methods.
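
A minimal sketch of a flag carried by a compiled pattern. With re.S the dot also matches a newline, and the flag applies to every method called on the Pattern object. The input string is illustrative.

```python
import re

content = 'Hello\n123'

# re.S is stored in the compiled pattern
pattern = re.compile(r'Hello.123', re.S)

print(pattern.search(content))           # matches: '.' covers the newline
print(pattern.findall(content))          # ['Hello\n123']

# Without the flag, '.' does not match '\n'
print(re.search(r'Hello.123', content))  # None
```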

`escape`

Escapes the special characters in a pattern string so that they are matched literally.

print(re.escape(r"\d"))

\\d
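
escape is mostly useful when text that should be matched literally, such as user input, contains regex metacharacters. A small sketch, where `keyword` stands in for hypothetical user input:

```python
import re

keyword = '1+1=2'  # hypothetical user input; '+' is a regex metacharacter
text = 'We know that 1+1=2.'

# Unescaped, '1+' means "one or more 1s", so the literal text is not found
print(re.search(keyword, text))                     # None

# Escaped, every character is matched literally
print(re.search(re.escape(keyword), text).group())  # '1+1=2'
```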

`purge`

Clear the regular expression caches.
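
The module-level functions keep recently compiled patterns in an internal cache; purge empties it. A minimal sketch, showing only that matching keeps working afterwards (patterns are simply recompiled on demand):

```python
import re

re.match(r'\d+', '123')  # the compiled pattern is now cached internally
re.purge()               # drop all cached compiled patterns

# The pattern is recompiled transparently on the next call
print(re.match(r'\d+', '123').group())  # '123'
```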

`template`

Compile a template pattern, returning a Pattern object.