I stumbled on character encoding when converting CSV to JSON in Python

Overview

When I wrote a tool in Python to convert a CSV file to a JSON file, the Japanese text in the generated JSON file came out unreadable, so I am recording the fix here.

Source files

CSV file → JSON file conversion script (before the fix)

csv_to_json.py


import json
import csv

json_list = []

# Read the CSV file
with open('./csv_sample.csv', 'r') as f:
    for row in csv.DictReader(f):
        json_list.append(row)

# Write the JSON file
with open('./json_sample.json', 'w') as f:
    json.dump(json_list, f)

# Load the JSON file
with open('./json_sample.json', 'r') as f:
    json_output = json.load(f)

CSV file

csv_sample.csv


名前,順位,出身国
ナダル,2,スペイン
フェデラー,3,スイス
ジョコビッチ,1,セルビア

Run csv_to_json.py to generate a JSON file

Run csv_to_json.py

(csv_to_json_tool) bash-3.2$ python csv_to_json.py

JSON file generated after execution

[{"\u540d\u524d": "\u30ca\u30c0\u30eb", "\u9806\u4f4d": "2", "\u51fa\u8eab\u56fd": "\u30b9\u30da\u30a4\u30f3"}, {"\u540d\u524d": "\u30d5\u30a7\u30c7\u30e9\u30fc", "\u9806\u4f4d": "3", "\u51fa\u8eab\u56fd": "\u30b9\u30a4\u30b9"}, {"\u540d\u524d": "\u30b8\u30e7\u30b3\u30d3\u30c3\u30c1", "\u9806\u4f4d": "1", "\u51fa\u8eab\u56fd": "\u30bb\u30eb\u30d3\u30a2"}]

The Japanese text is unreadable: every character has been written as a \uXXXX escape sequence. Check the character encoding of the JSON file with the file --mime command.

(csv_to_json_tool) bash-3.2$ file --mime json_sample.json 
json_sample.json: application/json; charset=us-ascii

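The encoding is reported as us-ascii, which means the file contains only ASCII characters: json.dump wrote every non-ASCII character as a Unicode escape. This is the default behavior (ensure_ascii=True). The file is still valid JSON, but it is hard for a person to read.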
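The escaping can be reproduced in memory with json.dumps. The following is a minimal sketch; the single record is the first key and value decoded from the escaped output above, and the variable names are only for illustration.

```python
import json

# First key/value pair of the sample data, decoded from the escaped output.
record = {"名前": "ナダル"}

# Default behavior (ensure_ascii=True): non-ASCII characters become \uXXXX.
escaped = json.dumps(record)

# ensure_ascii=False: non-ASCII characters are kept as they are.
readable = json.dumps(record, ensure_ascii=False)

print(escaped)   # {"\u540d\u524d": "\u30ca\u30c0\u30eb"}
print(readable)  # {"名前": "ナダル"}

# Both strings are valid JSON and decode to the same data.
assert json.loads(escaped) == json.loads(readable) == record
```

In other words, the escaped file is not corrupted. Loading it with json.load gives back the original Japanese strings, so the choice only affects how the file looks when opened directly.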

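Fixing the escaped JSON file

Passing the keyword argument ensure_ascii=False to json.dump when writing the JSON file makes it write non-ASCII characters as they are instead of escaping them. The encoding of the file itself is decided by open, so encoding='utf-8' is specified there as well to make sure the output is UTF-8 regardless of the platform default.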
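The write step on its own can be sketched as a small helper. The function name write_json_utf8 and the temporary directory are hypothetical and used only so the example runs anywhere; the row is one record of the sample data.

```python
import json
import os
import tempfile


def write_json_utf8(data, path):
    """Write data as JSON, keeping non-ASCII characters unescaped."""
    # encoding='utf-8' is explicit so the result does not depend on the
    # platform's default encoding.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)


rows = [{"名前": "ナダル", "順位": "2", "出身国": "スペイン"}]

# Hypothetical output location: a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "json_sample.json")
write_json_utf8(rows, path)

# Read the raw bytes back to confirm nothing was escaped.
with open(path, "rb") as f:
    raw = f.read()

print(raw.decode("utf-8"))
```

Setting ensure_ascii=False without an explicit encoding can raise UnicodeEncodeError on platforms whose default encoding cannot represent Japanese, which is why the two settings are used together here.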

csv_to_json.py


import json
import csv

json_list = []

# Read the CSV file (assuming it is saved as UTF-8; newline='' is what the csv module recommends)
with open('./csv_sample.csv', 'r', encoding='utf-8', newline='') as f:
    for row in csv.DictReader(f):
        json_list.append(row)

# Write the JSON file as UTF-8 without escaping non-ASCII characters
with open('./json_sample.json', 'w', encoding='utf-8') as f:
    json.dump(json_list, f, ensure_ascii=False)  # specify ensure_ascii=False

# Load the JSON file
with open('./json_sample.json', 'r', encoding='utf-8') as f:
    json_output = json.load(f)
    print(json_output)

Check the character encoding of the JSON file again

(csv_to_json_tool) bash-3.2$ file --mime json_sample.json
json_sample.json: application/json; charset=utf-8

JSON file generated after execution

json_sample.json


[{"名前": "ナダル", "順位": "2", "出身国": "スペイン"}, {"名前": "フェデラー", "順位": "3", "出身国": "スイス"}, {"名前": "ジョコビッチ", "順位": "1", "出身国": "セルビア"}]

The Japanese text is now written to the file as it is!
