re.search(pattern, string) Finds the first part of the string that matches the pattern
The r prefix on the pattern makes it a raw string, so backslashes are passed to re.search unchanged instead of being interpreted as Python escape sequences.
In [11]: import re
In [12]: str = 'Less but better'
In [13]: match = re.search(r'\s\w\w\w\s',str)
In [14]: print 'found:', match.group()
found: but
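To make the effect of the r prefix concrete, here is a minimal sketch. It is written for Python 3 (print as a function), unlike the Python 2 session above, and the variable names are only illustrative.

```python
import re

# A raw string keeps the backslash: r'\s' is two characters, a backslash and 's'.
raw_pattern = r'\s\w\w\w\s'
escaped_pattern = '\\s\\w\\w\\w\\s'  # the same pattern written without the r prefix

# Both spellings produce the identical pattern string.
same = raw_pattern == escaped_pattern

# Without the r prefix, '\b' is a single backspace character, not a word boundary.
plain_b = '\b'
raw_b = r'\b'

text = 'Less but better'
with_raw = re.findall(raw_b + 'but' + raw_b, text)        # word boundaries around 'but'
with_plain = re.findall(plain_b + 'but' + plain_b, text)  # literal backspaces around 'but'

print(same, len(plain_b), len(raw_b))
print(with_raw, with_plain)
```

The raw form is easier to read and avoids surprises with sequences such as \b that Python itself would otherwise interpret.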
If nothing in the string matches, re.search returns None, so check the result with an if statement before calling group().
str = 'Less but better'
match = re.search(r'\s\w\w\w\s',str)
if match:
    print 'found:', match.group()
else:
    print 'Not found'
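The same None check can be wrapped in a small helper. This is only a sketch in Python 3 syntax; find_first is a hypothetical name, not part of the re module, and the second input string is made up to show the no-match case.

```python
import re

def find_first(pattern, text):
    """Return the first matching substring, or None if nothing matches."""
    match = re.search(pattern, text)
    if match:
        return match.group()
    return None

hit = find_first(r'\s\w\w\w\s', 'Less but better')   # ' but ' (spaces included)
miss = find_first(r'\s\w\w\w\s', 'Less is more')     # no three-letter word between spaces

print(repr(hit))
print(repr(miss))
```

Calling match.group() directly on a None result would raise AttributeError, which is why the check comes first.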
re.findall(pattern, string) Returns all matches
str = 'He who moves not forward, goes backward.'
matches = re.findall(r'\w{3}', str)
This takes out three word characters at a time. After each match, the search resumes right after it, so the matches do not overlap.
In [15]: print matches
['who', 'mov', 'not', 'for', 'war', 'goe', 'bac', 'kwa']
To extract whole words instead, use \w+ as follows.
In [16]: matches = re.findall(r'\w+', str)
In [17]: matches
Out[17]: ['He', 'who', 'moves', 'not', 'forward', 'goes', 'backward']
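The three patterns below can be compared side by side. This is a sketch in Python 3 syntax. The \b\w{3}\b variant is my own addition, assuming the goal were to keep only words that are exactly three letters long.

```python
import re

text = 'He who moves not forward, goes backward.'

# \w{3} consumes three characters per match; scanning resumes after each match.
chunks = re.findall(r'\w{3}', text)

# Assumption for illustration: word boundaries keep only exact three-letter words.
three_letter_words = re.findall(r'\b\w{3}\b', text)

# \w+ takes each full run of word characters, i.e. whole words.
words = re.findall(r'\w+', text)

print(chunks)
print(three_letter_words)
print(words)
```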
re.match(pattern, string) Checks whether the pattern matches at the beginning of the string
In [133]: str = 'Information is not knowledge.'
In [134]: match = re.match(r'I',str, re.M|re.I)
In [135]: print match
<_sre.SRE_Match object at 0x0000000018D774A8>
In [136]: print match.group()
I
In [137]: match = re.match(r'i',str)
In [138]: print match
None
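The session above shows that re.match is case-sensitive unless re.I is passed. The sketch below (Python 3 syntax) adds a comparison with re.search to show that re.match only looks at the start of the string. Note that re.M only changes how ^ and $ behave, so it has no effect on these patterns.

```python
import re

text = 'Information is not knowledge.'

at_start = re.match(r'I', text)         # 'I' is at index 0, so this matches
case_miss = re.match(r'i', text)        # lowercase 'i' does not equal 'I': None
case_ok = re.match(r'i', text, re.I)    # re.I ignores case, so this matches 'I'
not_at_start = re.match(r'not', text)   # 'not' is not at index 0: None
anywhere = re.search(r'not', text)      # re.search scans the whole string

print(at_start.group(), case_miss, case_ok.group())
print(not_at_start, anywhere.group())
```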
Regular expressions | meaning |
---|---|
a, A, 9 | An ordinary character matches itself |
. | Any single character except a newline |
\w | Word character (a-z, A-Z, 0-9, _) |
\W | Any character other than a word character |
\s | Whitespace character (space, tab, newline, return) |
\S | Any character other than whitespace |
\d | Digit (0-9) |
\t | Tab |
\n | Newline |
\r | Carriage return |
\b | Word boundary: the position between a word character and a non-word character. It matches no character itself |
^ | The beginning of the string |
$ | The end of the string |
\ | Escapes a special character, e.g. \. matches a literal period |
[ ] | Character set. [abc] matches a, b, or c |
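A few entries from the table can be tried out as follows. This is a sketch in Python 3 syntax, and the sample strings are made up for illustration.

```python
import re

text = 'Tel: 03-1234-5678 (office), cat.jpg'

digits = re.findall(r'\d+', text)                  # runs of digits
starts_with_tel = bool(re.search(r'^Tel', text))   # ^ anchors at the beginning
ends_with_jpg = bool(re.search(r'jpg$', text))     # $ anchors at the end
literal_dot = re.findall(r'\w+\.\w+', text)        # \. is a literal period
any_char = re.findall(r'c.t', text)                # . is any single character
char_set = re.findall(r'[ct]', 'cat')              # either 'c' or 't'
whole_word = re.findall(r'\bcat\b', 'cat concat cat.')  # 'cat' only as a whole word

print(digits, starts_with_tel, ends_with_jpg)
print(literal_dot, any_char, char_set, whole_word)
```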
Regular expressions | meaning |
---|---|
* | Repeat 0 or more times |
+ | Repeat one or more times |
? | Repeat 0 or 1 times |
{n} | Repeat n times |
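The four repetition operators can be compared on one small input. This is a sketch in Python 3 syntax, and the sample string is made up for illustration.

```python
import re

text = 'a ab abb abbb'

zero_or_more = re.findall(r'ab*', text)    # 'a' followed by any number of 'b'
one_or_more = re.findall(r'ab+', text)     # 'a' followed by at least one 'b'
zero_or_one = re.findall(r'ab?', text)     # 'a' followed by at most one 'b'
exactly_two = re.findall(r'ab{2}', text)   # 'a' followed by exactly two 'b'

print(zero_or_more)
print(one_or_more)
print(zero_or_one)
print(exactly_two)
```

Each operator applies only to the item immediately before it, here the single character b.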
group Enclosing part of a pattern in () makes it a group, so that part of the match can be taken out on its own.
str = 'Change before you have to.'
match = re.search(r'(\w+)\s(\w+)\s(\w+)\s(\w+)\s([\w.]+)',str)
if match:
    print 'found:', match.group()
else:
    print 'Not found'
group() ... the whole matched part
groups() ... a tuple of all the groups
group(n) ... the nth group. Numbering starts at group(1).
In [15]: print match.group()
Change before you have to.
In [16]: print match.groups()
('Change', 'before', 'you', 'have', 'to.')
In [17]: print match.group(1)
Change
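The three accessors can be put together in one place. This is a sketch in Python 3 syntax. Passing several numbers to group() at once is something I added here; it returns the selected groups as a tuple.

```python
import re

text = 'Change before you have to.'
match = re.search(r'(\w+)\s(\w+)\s(\w+)\s(\w+)\s([\w.]+)', text)

if match:
    whole = match.group()      # the entire match, same as match.group(0)
    parts = match.groups()     # tuple holding all five groups
    first = match.group(1)     # the first group
    last = match.group(5)      # the fifth group, including the period
    pair = match.group(1, 2)   # several groups at once, returned as a tuple
    print(whole)
    print(parts)
    print(first, last, pair)
else:
    print('Not found')
```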
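When the pattern contains groups, findall returns one tuple of groups per match. Here str is a string containing name:number pairs such as aaa:111, bbbb:2222 and ccccc:33333.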
In [18]: matches = re.findall(r'(\w+):(\d+)',str)
In [19]: matches
Out[19]: [('aaa', '111'), ('bbbb', '2222'), ('ccccc', '33333')]
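The exact input string is not shown in the session above, so the text variable below is an assumption chosen to reproduce that output. The sketch (Python 3 syntax) also shows that a pattern with a single group returns plain strings rather than tuples.

```python
import re

# Assumed input: made up to be consistent with the output shown above.
text = 'aaa:111 bbbb:2222 ccccc:33333'

# Two groups: each match becomes a (name, number) tuple.
pairs = re.findall(r'(\w+):(\d+)', text)

# One group: each match becomes a plain string holding that group.
names = re.findall(r'(\w+):\d+', text)

# The list of pairs converts directly into a dictionary.
lookup = dict(pairs)

print(pairs)
print(names)
print(lookup['bbbb'])
```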
For reference, I consulted this page: http://www.tutorialspoint.com/python/python_reg_expressions.htm