- Since it is tedious to check and set the character encoding every time a file is read, I wrote a module that detects it automatically.
- It is especially useful when importing CSV files containing Japanese that were created in Excel.
- It also supports files on the web.
- Passing the return value as the `encoding` argument when opening the file has worked without problems so far.
```python
def check_encoding(file_path):
    '''Return the character encoding detected for a local file or URL.'''
    from chardet.universaldetector import UniversalDetector
    import requests

    detector = UniversalDetector()
    if file_path[:4] == 'http':
        # Stream the response so we can stop downloading once detection is done
        r = requests.get(file_path, stream=True)
        for binary in r.iter_content(chunk_size=1024):
            detector.feed(binary)
            if detector.done:
                break
        detector.close()
    else:
        with open(file_path, mode='rb') as f:
            # Feed the file line by line until the detector is confident
            for binary in f:
                detector.feed(binary)
                if detector.done:
                    break
        detector.close()
    print(" ", detector.result, end=' => ')
    print(detector.result['encoding'], end='\n')
    return detector.result['encoding']
```
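As a quick sketch of how this is used, the following self-contained example detects the encoding of a local file and opens it with the detected value. The file name and its contents are hypothetical, and only the local-file branch of the function is reproduced here (no network access):

```python
from chardet.universaldetector import UniversalDetector

def check_encoding(file_path):
    '''Detect the encoding of a local file (local-file branch only).'''
    detector = UniversalDetector()
    with open(file_path, mode='rb') as f:
        for binary in f:
            detector.feed(binary)
            if detector.done:
                break
    detector.close()
    return detector.result['encoding']

# Hypothetical sample: a Shift_JIS (cp932) CSV like one saved from Japanese Excel.
with open('sample.csv', mode='w', encoding='cp932') as f:
    f.write('名前,年齢\n佐藤,二十\n鈴木,三十\n高橋,四十\n田中,五十\n')

encoding = check_encoding('sample.csv')
print(encoding)

# Open the file with the detected encoding, as described in the article.
with open('sample.csv', encoding=encoding) as f:
    print(f.read())
```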
- CSV files containing Japanese are often detected as Shift_JIS, so the next function converts such labels to cp932, Microsoft's superset of Shift_JIS that also covers the vendor extensions common in files saved from Excel.
- Pass the return value of the first function as the argument, and it returns the most suitable encoding name.
```python
def change_encoding(encoding):
    '''Convert Shift_JIS-family labels to cp932'''
    if encoding in ['Shift_JIS', 'SHIFT_JIS', 'shift_jis', 'sjis', 's_jis']:
        encoding = 'cp932'
    return encoding
```
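For reference, a minimal self-contained sketch of the conversion, showing one label that is normalized and one that passes through unchanged:

```python
def change_encoding(encoding):
    '''Map Shift_JIS-style labels to cp932; pass anything else through.'''
    if encoding in ['Shift_JIS', 'SHIFT_JIS', 'shift_jis', 'sjis', 's_jis']:
        encoding = 'cp932'
    return encoding

print(change_encoding('SHIFT_JIS'))  # cp932
print(change_encoding('utf-8'))      # utf-8 (unchanged)
```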
Comments and corrections are welcome. Thank you for reading.