Search for strings in Python

This article explains how to check whether a Python string contains a given substring and how to get that substring's position. For more flexible searches, you can use regular expressions through the re module of the standard library.

Check whether a string contains a substring

Use the in operator. It returns True if the substring is included and False if it is not. The comparison is case-sensitive.

text = "I like TWICE"

print("TWICE" in text)
# True

print("twice" in text)
# False

print('TWICE' in text and 'I' in text)
# True
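Because the in operator is case-sensitive, "twice" is not found in the example above. A minimal sketch of a case-insensitive check follows. The helper name contains_ignore_case is hypothetical, introduced only for illustration, and it assumes that comparing case-folded copies of both strings is acceptable for your data.

```python
text = "I like TWICE"

# Lower-casing the target string lets a lower-case substring match.
print("twice" in text.lower())
# True


def contains_ignore_case(needle, haystack):
    """Hypothetical helper: case-insensitive version of `needle in haystack`."""
    # casefold() is a more aggressive form of lower() intended for caseless matching.
    return needle.casefold() in haystack.casefold()


print(contains_ignore_case("Twice", text))
# True
```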

Get the position of a substring

Use the find() method to get the position of a substring within a string.

If the string passed as the first argument is contained in the string on which find() is called, the position of its first character is returned; otherwise -1 is returned.

text = "I like TWICE"

print(text.find('TWICE'))
# 7

print(text.find('XXX'))
# -1

Even if the substring occurs more than once in the original string, find() returns only the position of the first occurrence. If you want to get all the positions, use a regular expression.
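For plain substrings, an alternative to a regular expression is to call find() repeatedly with its optional start argument. The following is only a sketch: the function name find_all is hypothetical, and it assumes that non-overlapping occurrences are what you want.

```python
def find_all(text, sub):
    """Hypothetical helper: start positions of every non-overlapping occurrence of sub."""
    if not sub:
        # An empty substring would never advance the search position.
        raise ValueError("sub must not be empty")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search just after the occurrence that was found.
        start = text.find(sub, start + len(sub))
    return positions


print(find_all("I am Sam Adams", "am"))
# [2, 6, 11]

print(find_all("I am Sam Adams", "XXX"))
# []
```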

Check for a match and get its position with a regular expression

Use re.search() to check whether a string contains a match for a regular expression pattern.

Specify the regular expression pattern as the first argument and the target string as the second argument. The pattern may contain metacharacters, but a plain search string can also be used as it is.

A match object is returned if there is a match, and None is returned if there is no match.

import re

text = 'I like TWICE'

print(re.search('TWICE', text))
# <re.Match object; span=(7, 12), match='TWICE'>

print(re.search('XXX', text))
# None

Since a match object is always evaluated as True and None is evaluated as False, you can use re.search() or its return value directly as the condition of an if statement.

The match object's group() method returns the matched string. The start(), end(), and span() methods return the start position, the end position, and a tuple of (start position, end position), respectively.
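The conditional branch described here could look like the following sketch. The helper name has_match is hypothetical and exists only to keep the example short.

```python
import re

text = 'I like TWICE'

# The return value of re.search() is used directly as the condition.
if re.search('TWICE', text):
    print('match')
else:
    print('no match')
# match


def has_match(pattern, target):
    """Hypothetical helper: True if pattern matches anywhere in target."""
    # bool() of a match object is True; bool(None) is False.
    return bool(re.search(pattern, target))


print(has_match('XXX', text))
# False
```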

m = re.search('TWICE', text)

print(m.group())
# TWICE

print(m.start())
# 7

print(m.end())
# 12

print(m.span())
# (7, 12)

Get all results with a regular expression

re.findall() returns all matching parts as a list of strings. The number of elements in that list (obtained with the built-in function len()) is the number of matching parts.

text = "I am dream"  # Ignore the meaning of the sentence

print(re.findall('am', text))
# ['am', 'am']

print(len(re.findall('am', text)))
# 2

If you want to get the positions of all matching parts, combine re.finditer() with a list comprehension.

print([m.span() for m in re.finditer('am', text)])
# [(2, 4), (8, 10)]

Search for multiple strings with regular expressions

The regular expression pattern A|B matches either A or B. A and B can be plain strings. The same applies to three or more alternatives: A|B|C is sufficient.

text = 'I am Sam Adams'

print(re.findall('Sam|Adams', text))
# ['Sam', 'Adams']

print([m.span() for m in re.finditer('Sam|Adams', text)])
# [(5, 8), (9, 14)]
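When the strings to search for are held in a list, the A|B pattern can be built programmatically. The sketch below is one possible approach, not the only one. The function name find_any is hypothetical, and it assumes the keywords should be matched literally, which is why each one is passed through re.escape() before being joined with |.

```python
import re


def find_any(keywords, text):
    """Hypothetical helper: (matched string, span) for every keyword occurrence."""
    # re.escape() neutralizes metacharacters such as "." or "+" inside a keyword.
    pattern = '|'.join(re.escape(k) for k in keywords)
    return [(m.group(), m.span()) for m in re.finditer(pattern, text)]


print(find_any(['Sam', 'Adams'], 'I am Sam Adams'))
# [('Sam', (5, 8)), ('Adams', (9, 14))]
```

Note that an empty keyword list would produce an empty pattern, which matches the empty string at every position, so callers may want to check for that case first.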

Take advantage of regular expression patterns

Regular expression metacharacters and special sequences allow more flexible searches.

text = 'I am Sam Adams'

print(re.findall('am', text))
# ['am', 'am', 'am']

print(re.findall('[a-zA-Z]+am[a-z]*', text))
# ['Sam', 'Adams']
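As a further illustration of special sequences and flags, the sketch below uses \b (a word boundary) to match "am" only as a whole word, and the re.IGNORECASE flag to ignore case. The variable names whole_words and ignore_case are chosen only for this example.

```python
import re

text = 'I am Sam Adams'

# \b matches at a word boundary, so "am" inside "Sam" or "Adams" is skipped.
whole_words = re.findall(r'\bam\b', text)
print(whole_words)
# ['am']

# re.IGNORECASE makes the pattern match regardless of letter case.
ignore_case = re.findall('sam', text, flags=re.IGNORECASE)
print(ignore_case)
# ['Sam']
```

The pattern with \b is written as a raw string (r'...') because in an ordinary string literal \b would be interpreted as a backspace character.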