When drawing a graph with matplotlib or seaborn, Japanese characters in the graph may be garbled and rendered as tofu (□□□), as shown below.
This is an issue specific to matplotlib / seaborn that occurs when the environment has no Japanese font installed or is not configured to use one. General solutions are explained in other articles; this article describes how to solve the problem on Cloud Pak for Data (hereinafter CP4D).
Environment: CP4D v2.5, v3.0LA
In CP4D, the Python environment used by a Notebook is prepared in advance and returns to its initial state every time the runtime is started, so a change made to the Python environment once does not persist. As a temporary measure, we apply the fix (font download and configuration change) inside the Notebook itself.
Run the following code at the beginning of your notebook. This example uses the IPA font mentioned in the article referred to above.
# download and install a Japanese font
!cd /tmp; curl -O https://ipafont.ipa.go.jp/IPAexfont/ipaexg00401.zip
!unzip -jo /tmp/ipaexg00401.zip -d ~/.fonts
# register the font
!fc-cache -fv; fc-list
# reset the matplotlib cache
!rm -rf ~/.cache/matplotlib
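The download-and-unpack step above can also be written in plain Python, which may be convenient when shell escapes are not available in the notebook. The following is only a minimal sketch and not part of the original procedure: the helper names `download_font` and `install_font_zip` are hypothetical, it assumes the runtime can reach the font URL, and unlike `unzip -jo` it copies only `.ttf` / `.otf` files.

```python
import urllib.request
import zipfile
from pathlib import Path

# Same archive as in the shell commands above.
FONT_URL = "https://ipafont.ipa.go.jp/IPAexfont/ipaexg00401.zip"


def download_font(url=FONT_URL, dest="/tmp/ipaexg00401.zip"):
    """Download the font archive unless it is already present (hypothetical helper)."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def install_font_zip(zip_path, font_dir=Path.home() / ".fonts"):
    """Extract font files from the archive into font_dir without their
    directory components, overwriting existing files (hypothetical helper)."""
    font_dir = Path(font_dir)
    font_dir.mkdir(parents=True, exist_ok=True)
    installed = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = Path(info.filename).name
            # Skip directories and anything that is not a font file.
            if info.is_dir() or not name.lower().endswith((".ttf", ".otf")):
                continue
            target = font_dir / name
            target.write_bytes(zf.read(info))
            installed.append(target)
    return installed
```

In a notebook this would be called as `install_font_zip(download_font())`. The font registration and matplotlib cache reset from the shell commands are still needed afterwards.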
(Optional) After running the above, use the code below to verify that IPAex Gothic has been added to the fonts that matplotlib recognizes. The output also shows that CP4D's default Python environment originally contained only DejaVu Sans.
import matplotlib.font_manager
[matplotlib.font_manager.FontProperties(fname=fname).get_name() for fname in matplotlib.font_manager.get_fontconfig_fonts()]
# -output-
#['DejaVu Sans',
# 'DejaVu Sans',
# 'DejaVu Sans',
# 'DejaVu Sans',
# 'DejaVu Sans',
# 'DejaVu Sans',
# 'DejaVu Sans',
# 'DejaVu Sans',
# 'DejaVu Sans',
# 'IPAexGothic']
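`get_fontconfig_fonts()` may not be available in every matplotlib version, so as an alternative the family names can be read from the font manager's own list. This is a hedged sketch: it assumes `matplotlib.font_manager.fontManager.ttflist` is populated in the running kernel, and the helper names `summarize_families` and `registered_families` are hypothetical.

```python
def summarize_families(names):
    """Collapse a list of font names into sorted, unique family names."""
    return sorted(set(names))


def registered_families():
    """Return the font families matplotlib currently knows about.

    matplotlib is imported lazily so that the helper above can be used
    without it; this function itself requires matplotlib to be installed.
    """
    from matplotlib import font_manager
    return summarize_families(f.name for f in font_manager.fontManager.ttflist)
```

After the installation steps, `'IPAexGothic' in registered_families()` is the check of interest. If matplotlib was already imported before the font was installed, the in-memory list may be stale, and restarting the kernel may be necessary before the new font shows up.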
Before drawing the graph, specify the installed font as font.family in rcParams.
from matplotlib import pyplot as plt
# Use the installed Japanese font for all subsequent plots
plt.rcParams['font.family'] = 'IPAexGothic'
Worked example: install the font (same commands as above)
# download and install a Japanese font
!cd /tmp; curl -O https://ipafont.ipa.go.jp/IPAexfont/ipaexg00401.zip
!unzip -jo /tmp/ipaexg00401.zip -d ~/.fonts
# register the font
!fc-cache -fv; fc-list
# reset the matplotlib cache
!rm -rf ~/.cache/matplotlib
# -output-
# (omitted)
Prepare sample data
import pandas as pd
# Japanese column names, so that the plot labels exercise the font
df = pd.DataFrame({
    'あいうえお' : [1,2,3,4,5],
    'かきくけこ' : [0.1,0.2,0.3,0.4,0.5],
    'さしすせそ' : [10,20,30,40,50],
    '漢字' : [100.1,100.2,100.3,100.4,100.5]
})
df
# -output-
#    あいうえお  かきくけこ  さしすせそ     漢字
# 0      1    0.1     10  100.1
# 1      2    0.2     20  100.2
# 2      3    0.3     30  100.3
# 3      4    0.4     40  100.4
# 4      5    0.5     50  100.5
Draw graph
%matplotlib inline
from matplotlib import pyplot as plt
import seaborn as sns
# Specify font
plt.rcParams['font.family'] = 'IPAexGothic'
sns.pairplot(df)
result:
(Addition) From CP4D v3.0.1, it appears that you can build a custom image of the Python environment with the fonts already installed. Hopefully this can serve as a permanent fix. I will try it when I have the opportunity.