Here are two ways to convert the character encoding of a file with Python 3.
As an example, I have a CSV file encoded in Shift_JIS, and I will write code that converts it to UTF-8.
The first way uses the codecs module, which lets you read and write files with an explicit character encoding.
utf8_test1.py
# -*- coding:utf-8 -*-
import codecs


def main():
    # Shift_JIS file path
    shiftjis_csv_path = './download/shift_jis_data.csv'
    # UTF-8 file path
    utf8_csv_path = './download/utf8_data.csv'

    # Convert the character encoding to UTF-8 and save
    fin = codecs.open(shiftjis_csv_path, "r", "shift_jis")
    fout_utf = codecs.open(utf8_csv_path, "w", "utf-8")
    for row in fin:
        fout_utf.write(row)
    fin.close()
    fout_utf.close()


if __name__ == '__main__':
    main()
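As a side note, in Python 3 the built-in open() also accepts an encoding argument, so the same conversion can be sketched without codecs. The helper name convert_encoding and its parameters below are my own, chosen only for illustration, and this is a minimal sketch rather than a drop-in replacement for the script above.

```python
def convert_encoding(src_path, dst_path,
                     src_encoding="shift_jis", dst_encoding="utf-8"):
    """Re-encode a text file line by line (hypothetical helper)."""
    # newline="" turns off newline translation, so the original
    # line endings (for example CRLF in a CSV file) are kept as they are.
    with open(src_path, "r", encoding=src_encoding, newline="") as fin, \
            open(dst_path, "w", encoding=dst_encoding, newline="") as fout:
        for row in fin:
            fout.write(row)
```

It could be called as convert_encoding('./download/shift_jis_data.csv', './download/utf8_data.csv'). The with statement closes both files even if decoding fails partway through. If a file created on Windows raises UnicodeDecodeError with "shift_jis", passing src_encoding="cp932" may help, since cp932 covers vendor extension characters that the strict codec rejects.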
The second way is to convert the character encoding by calling the nkf command from Python.
utf8_test2.py
# -*- coding:utf-8 -*-
import subprocess


def main():
    # Shift_JIS file path
    shiftjis_csv_path = './download/shift_jis_data.csv'
    # UTF-8 file path
    utf8_csv_path = './download/utf8_data.csv'

    # Run nkf with -w (output as UTF-8) and redirect the result to the new file
    cmd = "nkf -w %s > %s" % (shiftjis_csv_path, utf8_csv_path)
    subprocess.call(cmd, shell=True)


if __name__ == '__main__':
    main()
Note that nkf is an external command, not part of Python, so it must be installed beforehand. For example, with Homebrew:
$ brew install nkf
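If you are not sure whether nkf is installed, one way to check from Python before calling it is shutil.which from the standard library. The function name nkf_available is hypothetical and only for illustration; this is a small sketch, not a required step.

```python
import shutil


def nkf_available(command="nkf"):
    """Return True if the command is found on PATH (hypothetical helper)."""
    # shutil.which returns the full path of the executable, or None if it is not found.
    return shutil.which(command) is not None
```

For example, the script could print a message and exit when nkf_available() returns False, instead of letting the shell report "command not found".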