[Python] Regular Expressions

String search

re.search(pattern, string) Finds the first part of the string that matches the pattern

The r prefix on the pattern makes it a raw string, so Python does not process the backslashes as escape characters and they are passed to search as they are.
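To see what the r prefix changes, it can help to compare a raw and a non-raw string literal directly. The following is a minimal sketch written for Python 3 (where print is a function); the variable names are chosen only for illustration.

```python
import re

# In a normal string literal '\n' is one newline character;
# in a raw string literal r'\n' is two characters: a backslash and 'n'.
normal = '\n'
raw = r'\n'
print(len(normal), len(raw))

# Without the r prefix, every backslash meant for the regex engine must be doubled.
same_pattern = (r'\s\w\w\w\s' == '\\s\\w\\w\\w\\s')
print(same_pattern)

# Both spellings describe the same pattern, so they find the same match.
text = 'Less but better'
found_raw = re.search(r'\s\w\w\w\s', text).group()
found_escaped = re.search('\\s\\w\\w\\w\\s', text).group()
print(repr(found_raw), repr(found_escaped))
```

The raw form is usually easier to read, which is why regular expression patterns are normally written with the r prefix.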

In [11]: import re

In [12]: str = 'Less but better'

In [13]: match = re.search(r'\s\w\w\w\s',str)

In [14]: print 'found:', match.group()
found:  but

If nothing in the string matches, re.search returns None, so check the result with an if statement before calling group().

str = 'Less but better'
match = re.search(r'\s\w\w\w\s',str)
if match:
    print 'found:', match.group()
else:
    print 'Not found'
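The None check can also be wrapped in a small helper so that callers never have to handle None themselves. This is only a sketch for Python 3; find_first is a hypothetical helper name, not a function of the re module.

```python
import re

def find_first(pattern, text, default=None):
    """Return the first match of pattern in text, or default if there is none."""
    match = re.search(pattern, text)
    # re.search returns None when nothing matches, so test before calling group().
    if match:
        return match.group()
    return default

text = 'Less but better'
# A three-letter word with whitespace on both sides.
print(repr(find_first(r'\s\w\w\w\s', text)))
# The text contains no digits, so the default is returned.
print(find_first(r'\d+', text, 'Not found'))
```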

re.findall(pattern, string) Returns all matches as a list

str = 'He who moves not forward, goes backward.'
matches = re.findall(r'\w{3}', str)

This takes out runs of three word characters. After each match, the search resumes right after the end of that match, so the matches never overlap.

In [15]: print matches
['who', 'mov', 'not', 'for', 'war', 'goe', 'bac', 'kwa']
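The non-overlapping behavior is easier to see on a short string. The sketch below (Python 3, with an illustrative input string) also uses re.finditer, which yields match objects instead of strings, to show where each match starts and ends.

```python
import re

text = 'abcdef'

# The scan resumes after the end of each match,
# so overlapping candidates such as 'bcd' or 'cde' are never reported.
chunks = re.findall(r'\w{3}', text)
print(chunks)

# finditer yields match objects, which expose the (start, end) position of each match.
spans = [m.span() for m in re.finditer(r'\w{3}', text)]
print(spans)
```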

To extract whole words instead, use \w+ (one or more word characters).

In [16]: matches = re.findall(r'\w+', str)

In [17]: matches
Out[17]: ['He', 'who', 'moves', 'not', 'forward', 'goes', 'backward']

re.match(pattern, string) Checks whether the regular expression matches at the beginning of the string

In [133]: str = 'Information is not knowledge.'

In [134]: match = re.match(r'I',str, re.M|re.I)

In [135]: print match
<_sre.SRE_Match object at 0x0000000018D774A8>

In [136]: print match.group()
I

In [137]: match = re.match(r'i',str)

In [138]: print match
None
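The difference between re.match and re.search, and the effect of the re.I flag, can be put side by side. This is a sketch for Python 3 using the same sentence as above; the variable names are illustrative.

```python
import re

text = 'Information is not knowledge.'

# re.match only tries the pattern at the start of the string.
at_start = re.match(r'Information', text)
not_at_start = re.match(r'knowledge', text)

# re.search scans the whole string, so it finds the later word.
anywhere = re.search(r'knowledge', text)

# re.I (re.IGNORECASE) lets the lowercase pattern match the uppercase 'I'.
ignore_case = re.match(r'i', text, re.I)

print(at_start.group())
print(not_at_start)
print(anywhere.group())
print(ignore_case.group())
```

Without re.I the pattern is case-sensitive, which is why matching r'i' against this sentence gives None.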

Regular expressions

Regular expression	Meaning
a, A, 9	Ordinary characters match themselves
.	Any single character except a newline
\w	Word character (a-zA-Z0-9_)
\W	Any character other than a word character
\s	Whitespace character (space, tab, newline, return)
\S	Any character other than whitespace
\d	Digit (0-9)
\t	Tab
\n	Newline
\r	Carriage return
\b	Word boundary (the position between a word character and a non-word character)
^	The beginning of the string
$	The end of the string
\	Escapes a special character (for example, \. matches a literal period)

[ ] Represents a character set. [abc] matches a, b, or c.
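A few of these patterns are easier to understand when tried on small inputs. The following is a sketch for Python 3; the sample strings are made up for illustration.

```python
import re

text = 'cat concat 42'

# \d matches one digit; + (one or more) is added to take the whole number.
numbers = re.findall(r'\d+', text)

# \b is a word boundary, so the 'cat' inside 'concat' is not matched.
whole_word = re.findall(r'\bcat\b', text)

# [abc] matches a single character that is a, b, or c.
abc_chars = re.findall(r'[abc]', 'a.b')

# An unescaped . matches any character; \. matches only a literal period.
any_char = re.findall(r'a.b', 'a.b axb')
literal_dot = re.findall(r'a\.b', 'a.b axb')

print(numbers, whole_word, abc_chars, any_char, literal_dot)
```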

Repetition

Regular expression	Meaning
*	Repeat 0 or more times
+	Repeat one or more times
?	Repeat 0 or 1 times
{n}	Repeat n times
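The four repetition operators can be compared by testing the same inputs against each one. This is a sketch for Python 3; matches is a hypothetical helper built on re.fullmatch, which succeeds only when the entire string matches the pattern.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

samples = ('a', 'ab', 'abbb')

star = [matches(r'ab*', s) for s in samples]      # b zero or more times
plus = [matches(r'ab+', s) for s in samples]      # b one or more times
optional = [matches(r'ab?', s) for s in samples]  # b zero or one time
exact = [matches(r'ab{3}', s) for s in samples]   # b exactly three times

print(star, plus, optional, exact)
```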

Groups

Enclosing part of a pattern in () makes it a group, so that part of the match can be taken out on its own.

str = 'Change before you have to.' 

match = re.search(r'(\w+)\s(\w+)\s(\w+)\s(\w+)\s([\w.]+)',str)
if match:
    print 'found:', match.group()
else:
    print 'Not found'

group() ... the entire matched part
groups() ... a tuple of all the groups
group(n) ... the nth group. The first group is group(1).

In [15]: print match.group()
Change before you have to.

In [16]: print match.groups()
('Change', 'before', 'you', 'have', 'to.')

In [17]: print match.group(1)
Change
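The group-access methods can be tried on a shorter pattern with only two groups. This is a sketch for Python 3 that reuses the sentence from above; the variable names are illustrative.

```python
import re

text = 'Change before you have to.'
match = re.search(r'(\w+)\s(\w+)', text)

if match:
    whole = match.group()       # same as match.group(0): the entire match
    first = match.group(1)      # the first parenthesized group
    second = match.group(2)     # the second parenthesized group
    pair = match.group(1, 2)    # several groups at once, returned as a tuple
    print(whole, first, second, pair)
```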

Using groups with findall: when the pattern contains more than one group, findall returns a list of tuples, one tuple per match.

In [18]: matches = re.findall(r'(\w+):(\d+)',str)

In [19]: matches
Out[19]: [('aaa', '111'), ('bbbb', '2222'), ('ccccc', '33333')]
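The input string for this findall example is not shown, so the sketch below (Python 3) uses a hypothetical string of name:number pairs that is assumed, not taken from the original. It also shows one way the resulting tuples might be used.

```python
import re

# Hypothetical input: the original string is not shown, so this value is an assumption.
text = 'aaa:111 bbbb:2222 ccccc:33333'

# With two groups in the pattern, each match becomes a (name, number) tuple.
pairs = re.findall(r'(\w+):(\d+)', text)
print(pairs)

# Each tuple unpacks naturally, for example to build a dictionary.
lookup = {name: int(number) for name, number in pairs}
print(lookup)
```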

I referred to http://www.tutorialspoint.com/python/python_reg_expressions.htm