I often forget these, and the official documentation is too long to check every time, so this is a memo to myself. I plan to add more as needed.

import re

By the way, note that `from sympy import *` rebinds the name `re` to SymPy's function that returns the real part, which hides the regular expression module.

Match object

To extract the matched parts ...

filename = 'Back_in_the_U.S.S.R.m4a'
m = re.match(r'([\w\.-]+?)\.(\w+)$', filename)
print(m.group(0)) # The whole match (equivalent to m.group())
print(m.group(1)) # First group
print(m.group(2)) # Second group
print(m.groups()) # All groups as a tuple

`out`


Back_in_the_U.S.S.R.m4a
Back_in_the_U.S.S.R
m4a
('Back_in_the_U.S.S.R', 'm4a')
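One thing worth keeping in mind here: `re.match` returns `None` when the pattern does not match, so calling `m.group()` directly raises an `AttributeError` in that case. The following is a minimal sketch of guarding against that; the helper name `split_filename` is made up for illustration and is not part of `re`.

```python
import re

def split_filename(filename):
    """Hypothetical helper: return (basename, ext), or None if there is no match."""
    m = re.match(r'([\w\.-]+?)\.(\w+)$', filename)
    if m is None:          # re.match returns None when nothing matches
        return None
    return m.groups()      # tuple of all groups

print(split_filename('Back_in_the_U.S.S.R.m4a'))  # ('Back_in_the_U.S.S.R', 'm4a')
print(split_filename('README'))                   # None (no extension to match)
```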

Name a group with (?P<groupname>...) and access it by that name.

filename = 'Back_in_the_U.S.S.R.m4a'
m = re.match(r'(?P<basename>[\w\.]+?)\.(?P<ext>\w+)$', filename)
print(m.group('basename')) # String matched by (?P<basename>...)
print(m.group('ext')) # String matched by (?P<ext>...)
print(m.groupdict()) # Named groups as a dictionary

`out`


Back_in_the_U.S.S.R
m4a
{'basename': 'Back_in_the_U.S.S.R', 'ext': 'm4a'}
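As a small supplement, named groups are still numbered, so both access styles can be mixed. The sketch below also uses the subscript form `m['ext']` (available on match objects since Python 3.6) and `m.span('ext')` to get the position of a group; the variable names are only for illustration.

```python
import re

filename = 'Back_in_the_U.S.S.R.m4a'
m = re.match(r'(?P<basename>[\w\.]+?)\.(?P<ext>\w+)$', filename)

# A named group can also be reached by its number.
print(m.group(1) == m.group('basename'))  # True

# Subscript access is shorthand for m.group('ext').
print(m['ext'])                           # m4a

# span() gives the (start, end) indices of the group in the original string.
start, end = m.span('ext')
print(filename[start:end])                # m4a
```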

Use the matched string for replacement

For example, to replace \ruby{text}{reading} in $\LaTeX$ with the HTML <ruby>text<rt>reading</rt></ruby> ...

# [^{}]+ is used instead of \w+ so that arguments containing spaces also match
print(re.sub(r'\\ruby\{([^{}]+)\}\{([^{}]+)\}',
             r'<ruby>\1<rt>\2</rt></ruby>',
             r'\ruby{Reductio ad absurdum}{View Plaza}'))

`out`


<ruby>Reductio ad absurdum<rt>View Plaza</rt></ruby>

To refer to a group named with (?P<groupname>...) in the replacement string, use \g<groupname>.

# [^{}]+ is used instead of \w+ so that arguments containing spaces also match
print(re.sub(r'\\ruby\{(?P<rb>[^{}]+)\}\{(?P<rt>[^{}]+)\}',
             r'<ruby>\g<rb><rt>\g<rt></rt></ruby>',
             r'\ruby{Reductio ad absurdum}{View Plaza}'))

`out`


<ruby>Reductio ad absurdum<rt>View Plaza</rt></ruby>

The \g<rt> reference is easy to confuse with the HTML <rt> tag, but the result is the same.
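The `\g<...>` form also accepts a group number, which helps when a backreference is immediately followed by a digit. Below is a minimal sketch of that case; the function name `cm_to_mm` is invented for this example.

```python
import re

def cm_to_mm(text):
    """Hypothetical helper: append a zero to each cm value and relabel it as mm."""
    # r'\g<1>0' means "group 1, then a literal 0".
    # r'\10' would instead be read as a reference to group 10.
    return re.sub(r'(\d+)cm', r'\g<1>0mm', text)

print(cm_to_mm('27cm'))  # 270mm
```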

List all matched parts

To retrieve the contents of every HTML `em` or `strong` element ...

src = r'<em>Axiom of choice</em>Assuming, to any set<strong>You can put the order</strong>．'
print(re.findall(r'<(em|strong)>(.*?)</\1>', src))

`out`


[('em', 'Axiom of choice'), ('strong', 'You can put the order')]
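When the positions of the matches are needed as well, `re.finditer` yields match objects instead of tuples. The following is a sketch along those lines; the group names `tag` and `body` are my own choice, and `(?P=tag)` is the named counterpart of `\1`.

```python
import re

src = '<em>Axiom of choice</em>Assuming, to any set<strong>You can put the order</strong>.'
pattern = re.compile(r'<(?P<tag>em|strong)>(?P<body>.*?)</(?P=tag)>')

# finditer yields one match object per non-overlapping match, left to right.
for m in pattern.finditer(src):
    print(m.group('tag'), m.start(), m.group('body'))
```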

Replace by passing the matched part to the function

To convert the cm values in a text to m ...

def cm2m(m): # A function that takes a match object as its argument
    value = m.group(1)
    return str(float(value)/100) + 'm'
print(re.sub(r'(\d+)cm', cm2m, '271cm + 314cm is 585cm.'))

`out`


2.71m + 3.14m is 5.85m.

For simple processing that is not worth defining a named function for, a lambda is probably fine.

print(re.sub(r'(\d+)cm', lambda m: str(float(m.group(1))/100) + 'm', '271cm + 314cm is 585cm.'))

`out`


2.71m + 3.14m is 5.85m.
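The pattern `(\d+)cm` only matches when the digits are directly followed by `cm`. As a sketch of a slightly more tolerant variant, the version below also accepts decimals and optional whitespace before the unit; the helper name `cm_to_m` and the extended pattern are assumptions of mine, not part of the original memo.

```python
import re

def cm_to_m(text):
    """Hypothetical helper: rewrite every '<number> cm' in text as metres."""
    def repl(m):
        # m.group(1) is the numeric part, with or without a decimal point
        return str(float(m.group(1)) / 100) + 'm'
    # \s* allows spaces between the number and the unit
    return re.sub(r'(\d+(?:\.\d+)?)\s*cm', repl, text)

print(cm_to_m('271cm + 314 cm is 585 cm.'))  # 2.71m + 3.14m is 5.85m.
```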

Nested parentheses

Use recursion. The standard `re` module does not support it, so use the third-party `regex` module instead. If it is not installed, install it with `pip install regex` or similar. The following (probably) matches \frac{}{} in $\LaTeX$.

import regex
pattern_frac = r'\\frac(?<rec>\{(?:[^{}]+|(?&rec))*\}){2}'
m = regex.search(pattern_frac, r'1 + \frac{\int_{a}^{b} f(x)\,dx }{\sum_{k=1}^{n}a_{k}}')
print(m.group())

`out`


\frac{\int_{a}^{b} f(x)\,dx }{\sum_{k=1}^{n}a_{k}}
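If installing `regex` is not an option, the same balanced-brace matching can be approximated without regular expressions by counting brace depth by hand. This is a different technique from the recursive pattern above, shown only as a rough stdlib-only sketch; `read_braced` and `find_frac` are hypothetical helper names.

```python
def read_braced(text, start):
    """Return the index just past the '}' that closes the '{' at text[start], or None."""
    if start >= len(text) or text[start] != '{':
        return None
    depth = 0
    for i in range(start, len(text)):
        if text[i] == '{':
            depth += 1
        elif text[i] == '}':
            depth -= 1
            if depth == 0:
                return i + 1
    return None  # braces never balanced

def find_frac(text):
    """Return the first frac command followed by two balanced brace groups, or None."""
    command = r'\frac'
    pos = text.find(command)
    while pos != -1:
        end = pos + len(command)
        for _ in range(2):  # numerator, then denominator
            end = read_braced(text, end) if end is not None else None
        if end is not None:
            return text[pos:end]
        pos = text.find(command, pos + 1)
    return None

print(find_frac(r'1 + \frac{\int_{a}^{b} f(x)\,dx }{\sum_{k=1}^{n}a_{k}}'))
```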

Features of regular expression modules that I often use personally in Python
