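Here is the finished regular expression for matching a "time" string in Python. It accepts both the colon form ("12:34") and the Japanese form ("12時34分", where 時 means hour and 分 means minute).
pattern_date = r'((0?|1)[0-9]|2[0-3])[:時][0-5][0-9]分?'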
# OK
#1:10
# 1:10
# 01:56
# 10:06
# 12:34
# NG
# 99:99
The code was run on Google Colaboratory. The Python version is shown below.
import platform
print("python " + platform.python_version())
# python 3.6.9
I used https://regex101.com/ to check the regular expressions. We will build each regular expression there and then implement it in code.
For a general introduction to regular expressions in Python, this article is easy to follow: https://qiita.com/luohao0404/items/7135b2b96f9b0b196bf3
Let's get straight to the code. First, import the library for regular expressions.
import re
First, let's create a regular expression that matches the string "12:34".
pattern = r'12:34'
The pattern is identical to the target string, so of course it matches. Let's check with code.
pattern = r'12:34'
string = r'12:34'
prog = re.compile(pattern)
result = prog.match(string)
if result:
    print(result.group())
# 12:34
The matched string is displayed. From here on, for the sake of simplicity, only the regular expression pattern is shown.
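One detail worth keeping in mind: `match` only checks that the pattern fits at the start of the string, so trailing characters are ignored. The following is a minimal sketch of that behavior, with `re.fullmatch` from the standard library shown for comparison. The sample strings are made up for illustration.

```python
import re

prog = re.compile(r'12:34')

# match() only requires the pattern to fit at the start of the string.
at_start = prog.match('12:34')       # matches
trailing = prog.match('12:345')      # still matches; the trailing "5" is ignored
not_first = prog.match('at 12:34')   # no match; the pattern is not at the start

# fullmatch() requires the whole string to fit the pattern.
full_ok = prog.fullmatch('12:34')    # matches
full_ng = prog.fullmatch('12:345')   # no match

for name, m in [('at_start', at_start), ('trailing', trailing),
                ('not_first', not_first), ('full_ok', full_ok),
                ('full_ng', full_ng)]:
    print(name, bool(m))
```

If the whole input must be a time and nothing else, `fullmatch` may be the safer choice; the rest of this article continues to describe only the patterns.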
Besides "12:34", there are other times such as "01:56" and "10:06". A regular expression that matches all of these is as follows.
pattern = r'\d\d:\d\d'
The syntax used here is:

Syntax | Description |
---|---|
\d | Any single digit |

Example | Matching string |
---|---|
\d\d | 12, 34, 01, 56, 10, 06 |

The regular expression above can be written more concisely.
pattern = r'\d{2}:\d{2}'
The newly used syntax is:

Syntax | Description |
---|---|
{m} | Repeat the preceding element exactly m times |

Example | Matching string |
---|---|
\d{2} | 12, 34, 01, 56, 10, 06 |

However, this pattern also matches strings such as "99:99" that are not valid times.
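The problem can be confirmed with a short check. This is a minimal sketch using the same `compile` and `match` calls as earlier; the list of sample strings is chosen here for illustration.

```python
import re

prog = re.compile(r'\d{2}:\d{2}')

samples = ['12:34', '01:56', '10:06', '99:99', '1:10']

# Map each sample to whether the pattern matches it.
results = {s: bool(prog.match(s)) for s in samples}
for s in samples:
    print(s, results[s])
```

"99:99" is accepted because `\d` allows any digit in every position, while "1:10" is rejected because the hour part requires exactly two digits.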
This time, we will accept only strings in hh:mm format where hh is 00 to 23 and mm is 00 to 59.
The modified regular expression is as follows.
pattern = r'([01][0-9]|2[0-3]):[0-5][0-9]'
The newly used syntax is:

Syntax | Description |
---|---|
[abc] | Any one of the characters a, b, or c |

Example | Matching string |
---|---|
[01][0-9] | 00 to 09 and 10 to 19; that is, 00 to 19 |
2[0-3] | 20 to 23 |
[0-5][0-9] | 00 to 09, 10 to 19, …, 50 to 59; that is, 00 to 59 |

I also used the following syntax:

Syntax | Description |
---|---|
(abc|efg) | Either the string abc or the string efg |

Example | Matching string |
---|---|
([01][0-9]|2[0-3]) | 00 to 19 or 20 to 23; that is, 00 to 23 |

We now have a regular expression that matches only hours 00 to 23 and minutes 00 to 59.
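As a rough check of this claim, the sketch below tries the pattern against a few boundary cases and then against every two-digit hour and minute combination. The helper name `is_time` is hypothetical, and the use of `fullmatch` (rather than `match`) is an assumption made here so that trailing characters cannot slip through.

```python
import re

# Pattern from the article: hours 00-23, minutes 00-59, both zero-padded.
prog = re.compile(r'([01][0-9]|2[0-3]):[0-5][0-9]')

def is_time(s):
    """Hypothetical helper: True if the whole string is a valid hh:mm time."""
    return prog.fullmatch(s) is not None

# Boundary cases chosen for illustration.
for s in ['00:00', '23:59', '24:00', '12:60', '99:99', '1:10']:
    print(s, is_time(s))

# Exhaustive check: count accepted strings among "00:00" .. "99:99".
valid = sum(is_time('%02d:%02d' % (h, m))
            for h in range(100) for m in range(100))
print(valid)
```

The count of accepted strings should equal 24 × 60, the number of minutes in a day.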
However, this does not match times that are not zero-padded, such as "1:10". The modified regular expression is as follows.
pattern = r'((0?[0-9]|1[0-9])|2[0-3]):[0-5][0-9]'
The newly used syntax is:

Syntax | Description |
---|---|
? | Zero or one occurrence of the preceding element |

Example | Matching string |
---|---|
0?[0-9] | 0 to 9 or 00 to 09 |

This can also be written a little shorter, like this:
pattern = r'((0?|1)[0-9]|2[0-3]):[0-5][0-9]'
With this, times without zero padding can also be handled.
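To see that the shorter form behaves the same as the longer one, the two can be compared over a range of hour strings. This is only a sketch: the names `long_form`, `short_form`, `hours`, `mismatches`, and `accepted` are introduced here for illustration, the minute part is fixed to "10", and `fullmatch` is used so that the whole string is checked.

```python
import re

long_form = re.compile(r'((0?[0-9]|1[0-9])|2[0-3]):[0-5][0-9]')
short_form = re.compile(r'((0?|1)[0-9]|2[0-3]):[0-5][0-9]')

# Candidate hours: "0".."99" without padding, plus zero-padded "00".."09".
hours = [str(h) for h in range(100)] + ['%02d' % h for h in range(10)]

# Hours on which the two patterns disagree (expected to be empty).
mismatches = [h for h in hours
              if bool(long_form.fullmatch(h + ':10'))
              != bool(short_form.fullmatch(h + ':10'))]

# Hours accepted by the shorter pattern.
accepted = [h for h in hours if short_form.fullmatch(h + ':10')]

print(mismatches)
print(accepted)
```

The accepted hours are "0" to "23" plus the padded "00" to "09", so both "1:10" and "01:10" pass while "24:10" does not.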
Furthermore, let's modify it so that it matches not only the ":" (colon) form but also the Japanese "時" (hour) and "分" (minute) notation, as in "12時34分".
pattern = r'((0?|1)[0-9]|2[0-3])[:時][0-5][0-9]分?'
In this article, I used Python to build a regular expression for "time".
Strings with a fixed pattern, such as dates, times, and amounts of money, are well suited to regular expressions. Try extracting various kinds of strings with regular expressions.
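As one possible starting point for extraction, the sketch below pulls times out of a longer sentence with `finditer`. It is not the pattern from the article as is: the groups are made non-capturing with `(?:...)`, and the lookarounds `(?<!\d)` and `(?!\d)` are added here as an assumption so that part of a longer digit run (for example the "5:30" inside "25:30") is not picked up. The sample text is made up.

```python
import re

# Colon-form time pattern, adapted for searching inside longer text.
prog = re.compile(r'(?<!\d)(?:(?:0?|1)[0-9]|2[0-3]):[0-5][0-9](?!\d)')

text = 'Open 9:00 to 17:30, lunch at 12:15. Invalid: 25:30 and 12:345.'

# Collect every non-overlapping match in order of appearance.
times = [m.group() for m in prog.finditer(text)]
print(times)
```

Without the lookarounds, a search would report "5:30" from "25:30" and "12:34" from "12:345", which is usually not what is wanted when extracting times.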