It seems to be surprisingly little known, but in regular expressions of the Python 3 standard re module, \d also matches so-called full-width digits (such as 0, 1, and 2).
Let's actually try it.
python
>>> import re
>>> re.findall(r"\d", "012012")
['0', '1', '2', '0', '1', '2']
>>>
As you can see, \d also matches the full-width digits '0', '1', and '2'.
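The following is a minimal sketch of why this happens: in str patterns, \d matches any character in the Unicode "decimal digit" category (Nd), not only ASCII 0-9. The sample characters and the helper name describe are chosen here purely for illustration.

```python
import re
import unicodedata

# Illustrative samples: ASCII 0, fullwidth 0, Arabic-Indic 3, circled 1.
samples = ["0", "\uff10", "\u0663", "\u2460"]


def describe(ch):
    # Report the code point, the Unicode general category,
    # and whether \d (without re.ASCII) matches the character.
    matched = re.fullmatch(r"\d", ch) is not None
    return f"U+{ord(ch):04X}", unicodedata.category(ch), matched


for ch in samples:
    print(describe(ch))
```

Characters whose category is Nd are matched, while something like a circled digit (category No) is not.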
It is not clear why the fact that \d matches so-called full-width digits is so little known; there may be several reasons.
The re module documentation also states that, to match only [0-9], using [0-9] instead of \d is recommended. However, if you want to keep \d, for example because you want to reuse long regular expressions written for other languages as they are, you can either pass flags=re.ASCII as an argument or put (?a) at the beginning of the regular expression.
python
>>> import re
>>> re.findall(r"\d", "012012", flags=re.ASCII)
['0', '1', '2']
>>> re.findall(r"(?a)\d", "012012")
['0', '1', '2']
>>>
However, these flags affect the entire regular expression. For more information, please read the re module documentation.
Note that the flags= keyword can be omitted: you can also write re.findall(r"\d", "012012", re.ASCII). However, omitting it carelessly can trip you up, so it is strongly recommended not to omit it.
By the way, for reasons of my own, I tend to put (?a) and similar flags at the beginning of the pattern, as in the following:
python
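import regex  # third-party "regex" package, not the standard re module

# (?x) enables verbose mode; (?a) restricts \d to ASCII digits.
RE_DIGITS = regex.compile(r"""(?xa)
    \A\d+\Z""")


def is_digits(digits):
    if RE_DIGITS.match(digits) is not None:
        return True
    else:
        return False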
I like to write it this way, but I still feel uneasy when using \d, so I try to write [0-9] as much as possible. (Just in case: is_digits() is only an example.)