How to calculate the sum or average of time series csv data in an instant
TL;DR
- Use pandas to handle time series csv
Thing you want to do
- https://qiita.com/kazutxt/items/a23412625fa3e1f35443
- I want to find the daily and monthly averages from the data (CSV time series data) accumulated here.
input.csv
#date(Up to minutes),USD rate,JPY rate,EUR rate
#There are 144 rows of data per day every 10 minutes
# 2020/2/24 18:00 ~ 2020/6/11 23:10
yyyy/mm/dd HH:MM,NN.N(data),NN.N,NN.N
yyyy/mm/dd HH:MM,NN.N(data),NN.N,NN.N
yyyy/mm/dd HH:MM,NN.N(data),NN.N,NN.N
Trying to reinvent the wheel
First, write the code and ask (try to ask)
- Open the file
- Consider the first column as a date and time and convert it to date type
- Accumulate values until the boundary between the same month and day is crossed
- Divide the accumulated data by the number of accumulated data to calculate the average
- File output
Elements that seem to be troublesome
- The first and last days are not 144
- 30 days and 31 days are mixed in one month
- Type conversion from character to date required
Find the invented wheel
- I found a wheel when I searched for
python csv time series
csvcalc.py
import pandas as pd
df = pd.read_csv("input.csv",index_col=0,parse_dates=True)
df.resample('D').mean().to_csv("output.csv")
- D is daily. M for monthly
- mean is the average calculation. Sum to calculate the sum
- to_csv is a file output
out.csv
2020/2/25,13.99777778,4.343472222,13.19909722
2020/2/26,13.55291667,3.572986111,7.519791667
2020/2/27,12.31902778,4.954513889,9.513611111
・ ・ ・
The wheels even output the graph
- It seems that you can do it with df.plot ()
- It doesn't work on mac. Sorry
Conclusion
- Do not reinvent the wheel (except for study purposes)
- Find the wheel