When the LANG of the environment running a Python script changes from ja_JP.UTF-8 to C, the script raises an error. This is a note on how to deal with that.
#!/usr/bin/python3
# -*- coding: utf8 -*-
import sys
from logging import getLogger, StreamHandler, DEBUG
handler = StreamHandler()
logger = getLogger(__name__)
logger.setLevel(DEBUG)
logger.addHandler(handler)
text = "Hoge"
# No encoding is given here, so open() falls back to the locale's default
with open('utf8.txt') as tf:
    s = tf.read()
print(text)
print(s)
logger.debug(text)
logger.debug(s)
This normally runs fine with LANG=ja_JP.UTF-8, because Python looks at LANG at startup and sets the default I/O encoding accordingly. With LANG=C, however, it tries to handle the UTF-8 data as ASCII and raises an exception.
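To check which encodings the interpreter actually derived from the environment, a small diagnostic like the following can be run under each LANG setting. This is only a sketch; the helper name `report_encodings` is made up for illustration and is not part of the original script.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python derived from the environment."""
    return {
        # Encoding open() falls back to when no encoding argument is given
        "preferred": locale.getpreferredencoding(False),
        # Encodings of the standard streams used by print() and logging
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
    }


if __name__ == "__main__":
    for name, encoding in report_encodings().items():
        print(f"{name}: {encoding}")
```

Comparing the output under LANG=ja_JP.UTF-8 and LANG=C shows whether the defaults change. Note that Python 3.7 and later may coerce the C locale to UTF-8, so the result can depend on the interpreter version.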
First, specify the encoding explicitly when opening the file.
tf = open('utf8.txt', encoding='utf8')
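As a quick check that an explicit encoding makes file I/O independent of LANG, the following sketch writes a non-ASCII string to a temporary file and reads it back. The temporary directory and the sample string are assumptions for illustration; only the `encoding='utf8'` argument comes from the line above.

```python
import os
import tempfile

text = "ほげ"  # sample non-ASCII text (illustrative)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "utf8.txt")
    # Write and read with an explicit encoding, so LANG is never consulted
    with open(path, "w", encoding="utf8") as f:
        f.write(text)
    with open(path, encoding="utf8") as f:
        read_back = f.read()

# Print only a boolean so the check itself cannot fail on a non-UTF-8 stdout
print(read_back == text)
```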
Next, rewrap sys.stdout, which print writes to, so that it outputs UTF-8, and do the same for sys.stderr. This requires import codecs.
sys.stdout = codecs.getwriter("utf8")(sys.stdout.detach())
sys.stderr = codecs.getwriter("utf8")(sys.stderr.detach())
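To see what this wrapper does without touching the real standard streams, the sketch below applies the same `codecs.getwriter("utf8")` call to an in-memory byte buffer. The `io.BytesIO` object is a stand-in for the binary stream returned by `sys.stdout.detach()`, which is an assumption made for the demonstration.

```python
import codecs
import io

raw = io.BytesIO()                      # stands in for sys.stdout.detach()
writer = codecs.getwriter("utf8")(raw)  # same wrapper as in the lines above

print("ほげ", file=writer)              # print() hands str to the wrapper
encoded = raw.getvalue()                # bytes that reached the binary stream

print(encoded)  # → b'\xe3\x81\xbb\xe3\x81\x92\n'
```

The wrapper always encodes to UTF-8 before writing bytes, which is why the output no longer depends on LANG. On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8")` is another way to get a similar effect, though it is not the approach used in this note.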
The logger works as is, but its output is escaped, so pass it the rewrapped sys.stderr. Create the handler after sys.stderr has been replaced.
handler = StreamHandler(sys.stderr)
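The effect of passing an explicit UTF-8 stream to the handler can be checked in isolation with a sketch like this one. Here too an `io.BytesIO` buffer stands in for the detached sys.stderr, and the logger name `encoding_demo` is hypothetical.

```python
import codecs
import io
from logging import getLogger, StreamHandler, DEBUG

raw = io.BytesIO()                      # stands in for the detached sys.stderr
stream = codecs.getwriter("utf8")(raw)  # UTF-8 writer, as for sys.stderr above

logger = getLogger("encoding_demo")     # hypothetical logger name
logger.setLevel(DEBUG)
logger.addHandler(StreamHandler(stream))
logger.propagate = False                # keep the demo output in our buffer only

logger.debug("ほげ")
logged = raw.getvalue()                 # UTF-8 bytes, not backslash escapes
```

Because the handler writes to the UTF-8 writer it was given, the message arrives as real UTF-8 bytes rather than escaped text.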
In summary, it looks like this.
#!/usr/bin/python3
# -*- coding: utf8 -*-
import codecs
import sys
from logging import getLogger, StreamHandler, DEBUG
# Rewrap the standard streams so they always emit UTF-8, regardless of LANG
sys.stdout = codecs.getwriter("utf8")(sys.stdout.detach())
sys.stderr = codecs.getwriter("utf8")(sys.stderr.detach())
# Create the handler after the rewrap so it uses the UTF-8 sys.stderr
handler = StreamHandler(sys.stderr)
logger = getLogger(__name__)
logger.setLevel(DEBUG)
logger.addHandler(handler)
text = "Hoge"
# Read the file with an explicit encoding and close it when done
with open('utf8.txt', encoding='utf8') as tf:
    s = tf.read()
print(text)
print(s)
logger.debug(text)
logger.debug(s)