Introduction

I often forget how to work with regular expressions in Python, so I am making a note of the things I look up most. I will add more as the list of frequently searched items grows. For comprehensive information, see the official documentation. Don't forget to import re. For regular expression syntax, I left a separate note for reference.

Overview

A summary of the functions introduced. Here, pattern represents an arbitrary regular expression object, and match represents an arbitrary match object.

Function	Description
re.compile(r"regular expression")	Creates a regular expression object
pattern.search(string)	The first match object for pattern in string
pattern.finditer(string)	An iterator over all match objects for pattern in string
match.start()	Start index of the match in the searched string
match.end()	End index of the match in the searched string (exclusive)
match[0]	The matched string

Search for strings (single)

For example, suppose you want to know whether the string "I'm a Python beginner A" contains an uppercase letter.

`search_exam_1.py`


import re
string = "I'm a Python beginner A"  # String definition
pattern = re.compile(r"[A-Z]")  # Regular expression pattern definition
result = pattern.search(string)  # Search for the first match

print(result)
print(result.start())
print(result.end())

<re.Match object; span=(0, 1), match='I'>
0
1

A regular expression is compiled into a regular expression object before it is used for matching; re.compile() does this. You could combine the 3rd and 4th lines into re.search(r"[A-Z]", string), but when the same pattern is applied to many strings, compiling it once and reusing the object is more efficient. Also, if the regular expression contains a \ (backslash, shown as a yen sign in some Japanese environments), write the pattern as a raw string r"..." so that Python passes the backslash to the regular expression engine unchanged. For details, see the beginning of the official documentation.
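As a minimal sketch of the reuse described above, the following compiles a pattern once and applies it to several strings. The list `texts` and the other variable names are made up for illustration and are not part of the original script. The last lines show that a raw string is simply a more readable way to write a backslash.

```python
import re

# Hypothetical sample inputs, made up for this illustration
texts = ["no capitals here", "one Capital", "ALL CAPS"]

# Compile once, then reuse the same pattern object for every string
pattern = re.compile(r"[A-Z]")

first_hits = []
for text in texts:
    result = pattern.search(text)  # None when nothing matches
    first_hits.append(result[0] if result else None)

print(first_hits)  # [None, 'C', 'A']

# r"\d" and "\\d" are the same two characters: a backslash followed by "d"
raw_pattern = r"\d"
escaped_pattern = "\\d"
print(raw_pattern == escaped_pattern)  # True
```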

The object assigned to result is called a match object; it holds information about the first match in the searched string. Conveniently, its start() and end() methods return the start and end indexes of the match.
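One point worth keeping in mind: search returns None when nothing matches, so calling start() on the result would then raise an AttributeError. The sketch below checks for that case first and then uses start() and end() to slice the original string. The helper name `first_uppercase` is hypothetical and only serves this example.

```python
import re

pattern = re.compile(r"[A-Z]")


def first_uppercase(text):
    """Return (start, end, matched_text) for the first uppercase letter, or None."""
    result = pattern.search(text)
    if result is None:  # no uppercase letter in text
        return None
    start, end = result.start(), result.end()
    # end is exclusive, so text[start:end] is exactly the matched text
    return start, end, text[start:end]


print(first_uppercase("I'm a Python beginner A"))  # (0, 1, 'I')
print(first_uppercase("all lowercase"))  # None
```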

Search for strings (multiple)

Use this when you want to get **all** of the uppercase letters in the string "I'm a Python beginner A".

`search_exam_2.py`


import re
string = "I'm a Python beginner A"  # String definition
pattern = re.compile(r"[A-Z]")  # Regular expression pattern definition
results = pattern.finditer(string)  # Search for all matches

for result in results:  # Take each match object out of the results iterator
    print(result[0])  # Matched text (group 0)
    print(result.start(), result.end())

Searching with finditer returns an iterator of match objects, so take them out one at a time with a for statement. This is how a multiple search is done. What is result[0]? It accesses group 0 of the match object, which is the entire matched text (the same as result.group(0)). If the pattern contains parenthesized groups, the match object also has groups 1, 2, and so on, one for each pair of parentheses.
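To illustrate groups, here is a small sketch using a pattern with one pair of parentheses. The pattern is an assumption chosen for this example and is not from the original script. In each match object, index 0 is the whole match and index 1 is the text captured by the first parenthesized group.

```python
import re

string = "I'm a Python beginner A"
# ([A-Z]) captures the uppercase letter; [a-z]* matches the lowercase letters after it
pattern = re.compile(r"([A-Z])[a-z]*")

found = []
for result in pattern.finditer(string):
    # result[0]: whole match, result[1]: first group, span(): (start, end)
    found.append((result[0], result[1], result.span()))

print(found)
# [('I', 'I', (0, 1)), ('Python', 'P', (6, 12)), ('A', 'A', (22, 23))]
```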

Summary

Here, pattern represents an arbitrary regular expression object, and match represents an arbitrary match object.

Function	Description
re.compile(r"regular expression")	Creates a regular expression object
pattern.search(string)	The first match object for pattern in string
pattern.finditer(string)	An iterator over all match objects for pattern in string
match.start()	Start index of the match in the searched string
match.end()	End index of the match in the searched string (exclusive)
match[0]	The matched string

For now, only the items related to search and finditer are listed. I will add more in the future.

Regular expression manipulation with Python