When a UnicodeEncodeError occurs in Python 3
I've been using Python 2 for about 7 years now, but I'm thinking about moving to 3 soon.
I had heard that Python 3 unified character strings as Unicode, which makes things more convenient, but I immediately ran into a UnicodeEncodeError.
The OS is Ubuntu 14.04.4 LTS, and Python is 3.5.2 installed with pyenv. The code I ran is the Hello World below.
hello_ja.py
# coding: utf-8
print("こんにちは世界")
result
% python hello_ja.py
Traceback (most recent call last):
File "sample.py", line 4, in <module>
print("\u3053\u3093\u306b\u3061\u306f\u4e16\u754c")
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)
In Python 2 I had to spend a lot of effort juggling codecs, but in Python 3 this should just work ...
According to a memo on forcing the encoding of sys.std(in|out|err) in Python 3, Python 3 appears to look at the LANG environment variable when choosing the character encoding.
When I checked the environment variable, it was set to Japanese UTF-8.
% export | grep LANG
LANG=ja_JP.UTF-8
If LANG had been C, that would have been the cause, but this time the problem seems to lie elsewhere.
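To see which encoding Python actually chose, regardless of what LANG says, a small check like the following can help. This is only a sketch: the helper names describe_output_encoding and can_encode are made up for illustration and are not part of the original investigation.

```python
import locale
import sys


def describe_output_encoding():
    """Return the encodings Python has picked for this process."""
    return {
        # Encoding used when print() writes to standard output.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding derived from the current locale settings.
        "preferred": locale.getpreferredencoding(),
    }


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


info = describe_output_encoding()
print(info)
```

If the reported stdout encoding is an ASCII variant even though LANG is ja_JP.UTF-8, the locale named in LANG is probably not usable on the system.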
Going back to basics, I checked whether the Japanese environment was installed in the first place, and found that language-pack-ja was missing. I had installed the system from the English image and apparently forgot to add it. Since the ja_JP.UTF-8 locale was not actually available, the LANG setting presumably could not take effect and Python fell back to ASCII.
I installed the package and set the default locale, following the article "Change the default locale of Debian / Ubuntu".
% sudo apt-get install language-pack-ja
% sudo update-locale LANG=ja_JP.UTF-8
Now you can print Japanese correctly.
% python hello_ja.py
こんにちは世界
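When the system locale cannot be changed, another option is to force UTF-8 on the output stream from inside the script. The following is a minimal sketch, not something I used here. The helper name utf8_text_stream is hypothetical, and a BytesIO object stands in for the real standard output so the result can be inspected.

```python
import io


def utf8_text_stream(binary_stream):
    """Wrap a binary stream so text written to it is encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


# Demonstration with an in-memory buffer standing in for standard output.
demo = io.BytesIO()
stream = utf8_text_stream(demo)
print("\u3053\u3093\u306b\u3061\u306f\u4e16\u754c", file=stream)
stream.flush()
encoded = demo.getvalue()  # UTF-8 bytes, independent of the locale
```

In a real script the same wrapper could be applied as sys.stdout = utf8_text_stream(sys.stdout.buffer). Setting the environment variable PYTHONIOENCODING=utf-8 before starting Python has a similar effect without any code changes.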