I often forget how to work with regular expressions in Python, so this is a note of the things I look up most often. I will add to it as that list grows. For comprehensive information, see the official documentation. Don't forget to `import re`. For the syntax of regular expressions themselves, I left a separate note for reference.
Here is a summary of the functions introduced. `pattern` stands for any regular expression object, and `match` stands for any match object.
Function | Description |
---|---|
re.compile(r"regular expression") | Creates a regular expression object |
pattern.search(string) | Returns a match object for the first place in string that matches pattern |
pattern.finditer(string) | Returns an iterator over match objects for every place in string that matches pattern |
match.start() | Start index of the match within the searched string |
match.end() | End index of the match within the searched string (exclusive) |
match[0] | The matched string |
For example, suppose you want to know whether the string `I'm a Python beginner A` contains an uppercase letter.
search_exam_1.py
import re
string = "I'm a Python beginner A"  # String to search
pattern = re.compile(r"[A-Z]")  # Compile the regular expression pattern
result = pattern.search(string)  # Search for the first match
print(result)
print(result.start())
print(result.end())
<re.Match object; span=(0, 1), match='I'>
0
1
To use a regular expression, the pattern string first has to be turned into a regular expression object. `re.compile()` does this. You could combine the 3rd and 4th lines into `re.search(r"[A-Z]", string)`, but when you search many texts with the same pattern, that form hands the pattern string to `re` again on every call. (The `re` module caches recently compiled patterns, so the cost is small, but compiling once keeps the reuse explicit.)
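As a small sketch of that reuse, the pattern below is compiled once and then applied to several strings. The list `texts` and its contents are made up for illustration.

```python
import re

# Hypothetical sample data, not from the article
texts = ["no capitals here", "one Capital", "ALL CAPS"]

pattern = re.compile(r"[A-Z]")  # compile once

# Reuse the same compiled pattern for every text
has_upper = [pattern.search(text) is not None for text in texts]
print(has_upper)  # [False, True, True]
```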
Also, if the regular expression contains a backslash `\` (shown as a yen sign in some Japanese environments), it is best to write the pattern as a raw string, `r"..."`, so that the backslash reaches the regular expression engine unchanged. The `r` prefix tells Python to treat the string as raw. For details, refer to the beginning of the official documentation.
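The difference is easy to see with `\b`, which is a backspace character in a normal Python string but a word boundary in a regular expression. The following is only a sketch; the sample text and variable names are assumptions for illustration.

```python
import re

# In a normal string, "\b" is a single backspace character
normal_length = len("\b")
# In a raw string, r"\b" is a backslash followed by the letter b
raw_length = len(r"\b")
print(normal_length, raw_length)  # 1 2

text = "a Python beginner"  # hypothetical sample text
raw_pattern = re.compile(r"\bbeginner\b")   # \b reaches the regex engine as a word boundary
plain_pattern = re.compile("\bbeginner\b")  # Python already turned \b into backspace characters

raw_found = raw_pattern.search(text) is not None
plain_found = plain_pattern.search(text) is not None
print(raw_found, plain_found)  # True False
```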
The object assigned to `result` is called a match object and holds information about the first match in the searched string. Conveniently, its `start()` and `end()` methods return the start and end indexes of the match, respectively.
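One thing to watch for: when nothing matches, `search` returns `None` instead of a match object, so calling `start()` on the result would fail. A minimal sketch of guarding against that, where the helper name `first_upper` and the sample strings are made up for illustration:

```python
import re

pattern = re.compile(r"[A-Z]")

def first_upper(text):
    """Return (start, end, matched_text) for the first uppercase letter, or None."""
    result = pattern.search(text)
    if result is None:  # no uppercase letter found
        return None
    # Slicing with start()/end() gives the same string as result[0]
    return result.start(), result.end(), text[result.start():result.end()]

print(first_upper("hello World"))    # (6, 7, 'W')
print(first_upper("all lowercase"))  # None
```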
Next, suppose you want to get **all** of the uppercase letters in the string `I'm a Python beginner A`.
search_exam_2.py
import re
string = "I'm a Python beginner A"  # String to search
pattern = re.compile(r"[A-Z]")  # Compile the regular expression pattern
results = pattern.finditer(string)  # Iterator of match objects
for result in results:  # Take the match objects out one at a time
    print(result[0])  # The matched string
    print(result.start(), result.end())
I
0 1
P
6 7
A
22 23
Searching with `finditer` returns an iterator of match objects, so take them out one at a time with a for statement. This is how you find multiple matches.
What is `result[0]`? It accesses group 0 of the match object, which is the whole matched string. If the pattern contains capture groups in parentheses, the match object holds those groups as well, and you can access them as `result[1]`, `result[2]`, and so on.
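To illustrate groups, here is a sketch with a pattern that has two capture groups. The pattern and the sample text are assumptions chosen for illustration, not part of the examples above.

```python
import re

# Hypothetical pattern with two capture groups: a name and a number
pattern = re.compile(r"(\w+)=(\d+)")
text = "width=640 height=480"

rows = []
for result in pattern.finditer(text):
    # result[0] is the whole match; result[1] and result[2] are the capture groups
    rows.append((result[0], result[1], result[2]))

print(rows)
# [('width=640', 'width', '640'), ('height=480', 'height', '480')]
```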
For now, I have only listed items related to `search` and `finditer`. I will add more in the future.