I saw someone struggling to hand-write a nested loop to read a CSV file like the one below in Python, so here is a summary of four typical ways to read it.
sample.csv
1,2,3
4,5,6
7,8,9
10,11,12
Note that StringIO is used here to emulate reading a file. In real code, pass a file object returned by open() or the file name as a string, as appropriate.
This method uses the csv module from the Python standard library. Its advantage is that it needs nothing beyond the standard library, but in practice you will rarely handle purely numeric data as a list of lists.
Note that each element is still a string, so map(int, row) is applied to each row to convert its values to numbers.
from io import StringIO
import csv
s = """1,2,3
4,5,6
7,8,9
10,11,12"""
with StringIO(s) as csvfile:
    csvreader = csv.reader(csvfile)
    rows = []
    for row in csvreader:
        rows.append(list(map(int, row)))
# or, as a list comprehension
with StringIO(s) as csvfile:
    csvreader = csv.reader(csvfile)
    rows = [list(map(int, row)) for row in csvreader]
import numpy as np
arr = np.array(rows)
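As noted at the start, StringIO only stands in for a real file. The following is a minimal sketch of the same csv approach against an actual file on disk; the temporary directory and the way sample.csv is created are assumptions made purely so the example is self-contained.

```python
import csv
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdir:
    # Create sample.csv in a temporary directory (illustrative setup only).
    path = Path(tmpdir) / "sample.csv"
    path.write_text("1,2,3\n4,5,6\n7,8,9\n10,11,12\n")

    # csv.reader needs an open file object; the csv docs recommend newline="".
    with open(path, newline="") as csvfile:
        rows = [list(map(int, row)) for row in csv.reader(csvfile)]

print(rows)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
```

The csv module needs the file opened first, whereas np.loadtxt, np.genfromtxt, and pd.read_csv also accept the file name as a string directly.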
numpy
NumPy provides loadtxt and [genfromtxt](https://numpy.org/doc/stable/reference/generated/numpy.genfromtxt.html?highlight=genfromtxt#numpy.genfromtxt). Both have similar interfaces, but genfromtxt is a little more capable because it can fill in missing values.
numpy.loadtxt
from io import StringIO
import numpy as np
s = """1,2,3
4,5,6
7,8,9
10,11,12"""
with StringIO(s) as csvfile:
    arr = np.loadtxt(csvfile, delimiter=",", dtype=int)
numpy.genfromtxt
from io import StringIO
import numpy as np
s = """1,2,3
4,5,6
7,8,9
10,11,12"""
with StringIO(s) as csvfile:
    arr = np.genfromtxt(csvfile, delimiter=",", dtype=int)
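To illustrate the point about missing values, here is a small sketch in which two fields of the sample are left empty. The modified data and the choice of 0 as the fill value are assumptions for illustration; filling_values is the genfromtxt parameter that sets the replacement.

```python
from io import StringIO
import numpy as np

# Same sample as above, but with two fields left empty (illustrative data).
s = """1,2,3
4,,6
7,8,9
10,11,"""

with StringIO(s) as csvfile:
    # Empty fields are treated as missing and replaced by filling_values.
    arr = np.genfromtxt(csvfile, delimiter=",", dtype=int, filling_values=0)

print(arr.tolist())  # [[1, 2, 3], [4, 0, 6], [7, 8, 9], [10, 11, 0]]
```

Pick a fill value that cannot be confused with real data; 0 is used here only to keep the output easy to read.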
pandas
np.genfromtxt can also handle missing values, but information about pandas may be easier to find.
pandas.read_csv
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
This time the data is treated as a CSV without a header row, so pass the option header=None.
from io import StringIO
import pandas as pd
s = """1,2,3
4,5,6
7,8,9
10,11,12"""
with StringIO(s) as csvfile:
    df = pd.read_csv(csvfile, header=None, dtype=int)
arr = df.values
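For comparison, the sketch below shows how pandas might be used when some fields are empty. The modified sample and the clean-up with fillna(0) followed by astype(int) are assumptions for illustration, not the only way to deal with gaps; dtype=int is dropped here because a column containing NaN is read as float.

```python
from io import StringIO
import pandas as pd

# Same sample as above, but with two fields left empty (illustrative data).
s = """1,2,3
4,,6
7,8,9
10,11,"""

with StringIO(s) as csvfile:
    # Without dtype=int, columns that contain gaps are read as float with NaN.
    df = pd.read_csv(csvfile, header=None)

missing_per_column = df.isna().sum().tolist()  # count the gaps in each column

# One possible clean-up: replace NaN with 0, then convert back to integers.
arr = df.fillna(0).astype(int).to_numpy()

print(missing_per_column)  # [0, 1, 1]
print(arr.tolist())        # [[1, 2, 3], [4, 0, 6], [7, 8, 9], [10, 11, 0]]
```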
Please let me know if you know of another convenient method.