I want to read CSV line by line while converting the field type (while displaying the progress bar) and process it.

This is an example of reading a CSV file line by line, converting the type of each field and then processing the row, all while displaying a progress bar. The progress bar comes from click.

I'll just paste the code for now. (It also happens to be an example of how to write an iterator that can be used in a with statement.) The code is written for Python 2.

field_converter.py is here: https://gist.github.com/naoyat/3db8cd96c8dcecb5caea It is the one from my previous article, 'I want to batch convert the result of "string".split() in Python'.
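For readers who do not want to open the gist, the idea of a per-column converter can be sketched roughly as follows. This is a minimal Python 3 sketch and not the code from the gist: the class name SimpleFieldConverter is hypothetical, and it only accepts callables. The usage example in this article also passes encoding names such as 'iso-8859-1' to the real FieldConverter; that handling is not reproduced here.

```python
class SimpleFieldConverter:
    """Hypothetical stand-in for FieldConverter: one callable per column."""

    def __init__(self, *converters):
        self.converters = converters

    def convert(self, fields):
        # Refuse rows whose width does not match the declared columns.
        if len(fields) != len(self.converters):
            raise ValueError('expected %d fields, got %d'
                             % (len(self.converters), len(fields)))
        # Apply the n-th converter to the n-th field.
        return [conv(field) for conv, field in zip(self.converters, fields)]


converter = SimpleFieldConverter(int, str, float)
row = converter.convert(['42', 'title', '0.5'])
```

The only thing the iterator below relies on is that the converter object has a convert(fields) method returning the converted list.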

csv_iterator.py


import sys
import csv
import click
from field_converter import FieldConverter

class CSV_Iterator:
    def __init__(self, path, skip_header=False, with_progress_bar=False,
                 field_converter=None):
        self.path = path
        self.with_progress_bar = with_progress_bar
        self.field_converter = field_converter

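        self.f = open(path, 'r')
        # First pass: count physical lines so the progress bar knows its length.
        # (A quoted field that contains a newline is counted more than once.)
        self.line_count = sum(1 for line in self.f)

        self.f.seek(0)  # rewind for the second pass, the actual read
        self.r = csv.reader(self.f, dialect='excel')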
        if skip_header:
            self.r.next()
            self.line_count -= 1

        print '(%d lines)' % (self.line_count,)

        if self.with_progress_bar:
            self.bar = click.progressbar(self.r, self.line_count)

    def __iter__(self):
        return self

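    def next(self):
        # StopIteration from the reader (or the progress bar) propagates on
        # its own; other errors are no longer silently turned into it.
        if self.with_progress_bar:
            fields = self.bar.next()
        else:
            fields = self.r.next()
        if self.field_converter:
            try:
                fields = self.field_converter.convert(fields)
            except Exception:
                # Report the conversion failure and return the raw fields.
                print sys.exc_info()
        return fields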

    def __enter__(self):
        return self

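    def __exit__(self, exc_type, exc_value, traceback):
        # Always close the file, even when the with block raised.
        self.f.close()
        if exc_type is None and self.with_progress_bar:
            print  # finish the progress bar line
        return False  # never suppress an exception from the with block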

It is also available as a gist: https://gist.github.com/naoyat/b1290d917638c412e140
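The listing above is Python 2. As a rough illustration of the same idea in Python 3, here is a minimal sketch of an object that is both an iterator and a context manager. It uses only the standard library, so the progress bar itself is left out and replaced by a simple counter. The names CsvRows, convert, done and to_types are made up for this sketch and do not appear in the gist.

```python
import csv
import os
import tempfile


class CsvRows:
    """Iterator over CSV rows that also works as a context manager."""

    def __init__(self, path, skip_header=False, convert=None):
        self.convert = convert
        self.done = 0  # rows handed out so far, usable for a progress display
        # newline='' is what the csv module expects for text-mode files
        self.f = open(path, newline='')
        self.line_count = sum(1 for _ in self.f)  # first pass: count lines
        self.f.seek(0)                            # rewind for the real read
        self.reader = csv.reader(self.f)
        if skip_header:
            next(self.reader)
            self.line_count -= 1

    def __iter__(self):
        return self

    def __next__(self):
        fields = next(self.reader)  # StopIteration here ends the for loop
        self.done += 1
        return self.convert(fields) if self.convert else fields

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.f.close()  # close even if the with block raised
        return False    # do not suppress exceptions


# Demo with a throwaway file; the column layout is made up for illustration.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 newline='') as tmp:
    tmp.write('id,title,score\n1,foo,0.5\n2,bar,1.5\n')


def to_types(fields):
    return [int(fields[0]), fields[1], float(fields[2])]


with CsvRows(tmp.name, skip_header=True, convert=to_types) as rows:
    total = rows.line_count
    result = list(rows)

os.remove(tmp.name)
```

To get an actual bar, an object like this could be handed to click in the usual way, that is, "with click.progressbar(rows, length=rows.line_count) as bar:" followed by a loop over bar. That is the pattern click documents, and it differs slightly from how the Python 2 listing drives the bar.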

Usage example:

example.py


from csv_iterator import CSV_Iterator
from field_converter import FieldConverter

def foobar(csv_path):
    with CSV_Iterator(csv_path,
                      skip_header=True,
                      with_progress_bar=True,
                      field_converter=FieldConverter(int, int, 'iso-8859-1', 'iso-8859-1', float)) as line:
        # each iteration yields one row whose fields are already converted
        for id, uid, title, query, target in line:
            ...
