This is my first post, and I'm new to Python. I wanted to combine multiple CSV files that share a common format, such as those output by measurement equipment, into a single file. Here is the code I put together after researching various approaches.
Postscript (2020/04/06): This works when the header is a single line and all the data is clean. Note that an error may occur if there is a row with no data.
The code assumes that the test folder in the directory where the script is run contains many CSV files with the same format.
import pandas as pd
import glob

# Get the matching file paths as a list (here, the CSV files directly inside the test folder)
csv_files = glob.glob('test/*.csv')

# Show the list of files to read
for a in csv_files:
    print(a)

# Prepare a list to hold the contents of each CSV file
data_list = []

# Read each file in the list (the first line of each file is treated as the header)
for file in csv_files:
    data_list.append(pd.read_csv(file))

# Concatenate all DataFrames in the row direction
# axis=0: stack rows; sort=True: sort the column names when they do not line up across files
df = pd.concat(data_list, axis=0, sort=True)

# Note: the output is written into the same test folder, so running the script again also reads total1.csv
df.to_csv("test/total1.csv", index=False)
- Get the list of files with glob.glob()
- Read each file with pd.read_csv() (the first line is read as the header, equivalent to header=0)
- Append the contents of every file to a list in a for loop
- Combine everything in the list with pd.concat()
- Finally, write the result to a CSV file with df.to_csv()
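The steps above can also be wrapped in a small function. This is only a minimal sketch, not the code from the post: the function name `combine_csv`, its parameters, and the decision to skip a previous output file are assumptions added for illustration.

```python
from pathlib import Path

import pandas as pd


def combine_csv(folder, output_name="total1.csv"):
    """Concatenate every CSV in `folder` and write the result into the same folder.

    `combine_csv` and `output_name` are hypothetical names for this sketch.
    """
    folder = Path(folder)
    output_path = folder / output_name

    # Sort for a reproducible order, and skip the output of a previous run
    # so that it is not merged into itself.
    csv_files = sorted(p for p in folder.glob("*.csv") if p != output_path)
    if not csv_files:
        raise FileNotFoundError(f"no CSV files found in {folder}")

    # Read each file (first line = header) and stack the rows.
    frames = [pd.read_csv(p) for p in csv_files]
    combined = pd.concat(frames, axis=0, ignore_index=True, sort=False)

    combined.to_csv(output_path, index=False)
    return combined


# Usage (assumes a folder named "test" exists next to the script):
# combined = combine_csv("test")
```

Skipping the output file means the function can be run repeatedly on the same folder without the row count growing each time.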
I was worried about how the headers would be handled, but pandas makes this convenient: it matches columns by name and joins only the data rows, so the header of each file is not repeated in the output. Working through this also helped me understand how pandas behaves.
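To see the column matching in isolation, here is a small sketch that uses in-memory DataFrames instead of files. The column names `time`, `temp`, and `pressure` are made up for the example.

```python
import pandas as pd

# Hypothetical data standing in for the contents of three CSV files.
a = pd.DataFrame({"time": [1, 2], "temp": [20.5, 21.0]})
b = pd.DataFrame({"temp": [19.8], "time": [3]})        # same columns, different order
c = pd.DataFrame({"time": [4], "pressure": [101.3]})   # one column differs

# Columns are matched by name, not by position.
same_cols = pd.concat([a, b], axis=0, ignore_index=True)

# Columns missing from one input are filled with NaN;
# sort=True orders the resulting column names alphabetically.
mixed_cols = pd.concat([a, c], axis=0, ignore_index=True, sort=True)

print(same_cols)
print(mixed_cols)
```

In the second result, each row has NaN in the columns that its source frame did not contain, which is one way to notice that the files do not really share the same format.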
- With this approach, files whose number of columns appears to vary could not be read properly.
- I solved that by reading them line by line. (Separate article)
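The separate article is not reproduced here, so the following is only one possible way to read line by line, using the standard csv module. The function name `read_ragged_csv` and the generated column names `col0`, `col1`, ... are assumptions for this sketch.

```python
import csv
import io

import pandas as pd


def read_ragged_csv(source):
    """Read CSV text whose rows have differing numbers of fields.

    `source` is any iterable of lines, such as an open file.
    All values are kept as strings; short rows are padded with "".
    """
    # csv.reader returns an empty list for a blank line, so those are skipped.
    rows = [row for row in csv.reader(source) if row]
    if not rows:
        return pd.DataFrame()

    width = max(len(row) for row in rows)
    padded = [row + [""] * (width - len(row)) for row in rows]

    # Hypothetical column names, because no single header line fits every row.
    return pd.DataFrame(padded, columns=[f"col{i}" for i in range(width)])


# In-memory sample with 2, 3, 0, and 1 fields per line.
sample = io.StringIO("a,b\n1,2,3\n\n4\n")
df_ragged = read_ragged_csv(sample)
print(df_ragged)
```

For a real file, the same function could be called as `read_ragged_csv(open(path, newline=""))`, where `path` is the file to read.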