What to do when a UnicodeDecodeError occurs during read_csv in pandas (pd.read_table())

Reading a CSV file with pandas is very convenient, because a single call to read_csv is all you need.

import pandas as pd
pd.read_csv("file/to/path")

Normally this works without any problem, but if the CSV contains bytes that are not valid UTF-8, the following error is raised.

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 0: invalid start byte

pandas is complaining that it cannot decode the file.
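The same message can be reproduced without pandas. This is a minimal sketch, assuming the file holds Shift-JIS text; the sample string is made up for illustration and is not taken from the original file.

```python
# Hypothetical sample: katakana text encoded as Shift-JIS,
# standing in for the first bytes of the CSV file.
raw = "テスト".encode("shift_jis")

try:
    # read_csv decodes as UTF-8 by default, which is what fails here.
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The first byte of the sample is 0x83, which UTF-8 only allows as a continuation byte, so decoding stops at position 0.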

CSV files created in Excel on a Japanese system are usually encoded as Shift-JIS, so the first thing to try is passing that to the `encoding` argument of read_csv:

import pandas as pd
pd.read_csv("file/to/path", encoding="shift-jis")

This still ends in an error, just a different one.

UnicodeDecodeError: 'shift_jis' codec can't decode byte 0x87 in position 0: illegal multibyte sequence
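One possible cause of this second error, which the post does not confirm for its own file, is that Excel on Windows writes cp932, Microsoft's extension of Shift-JIS. Python's strict shift_jis codec rejects the extra characters cp932 adds, such as circled numbers whose first byte is 0x87. The sketch below uses a made-up two-byte sample to show the difference.

```python
# Hypothetical sample: 0x87 0x40 is the circled digit one in cp932,
# but it has no mapping in Python's strict shift_jis codec.
raw = b"\x87\x40"

try:
    raw.decode("shift_jis")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = str(exc)

print(strict_error)
print(raw.decode("cp932"))
```

If the file really is cp932, passing encoding="cp932" to read_csv may be worth trying before falling back to discarding bytes.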

As a workaround, you can open the file with codecs.open, passing `ignore` as the error handler so that undecodable bytes are skipped, and then hand the opened file to pd.read_table.

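import codecs
import pandas as pd
# "ignore" silently drops any bytes that cannot be decoded as Shift-JIS
with codecs.open("file/to/path", "r", "Shift-JIS", "ignore") as file:
    df = pd.read_table(file, delimiter=",")
    print(df)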

The StreamReaderWriter object returned by codecs.open can be passed to pandas as it is, without calling file.read() first.
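To see the whole flow end to end, here is a small sketch that writes a temporary CSV and reads it back the same way. The file contents, including the stray 0xff byte, are invented for illustration.

```python
import codecs
import os
import tempfile

import pandas as pd

# Hypothetical data: a Shift-JIS CSV with one stray byte (0xff)
# that the shift_jis codec cannot decode.
raw = (
    "name,value\n".encode("shift_jis")
    + "テスト".encode("shift_jis")
    + b"\xff,1\n"
)

with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

try:
    # The stream object goes straight into pandas; no file.read() needed.
    with codecs.open(path, "r", "shift_jis", "ignore") as file:
        df = pd.read_table(file, delimiter=",")
finally:
    os.remove(path)

print(df)
```

Keep in mind that `ignore` discards the offending bytes without any warning, so the resulting DataFrame can be missing characters that were present in the file.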

I got stuck on this for a while, so I wrote it down.
