If you use various libraries with python, I thought, "You can do a little thing with a little code, and you can make a little script with a little 5 steps, which is convenient." So I just listed python and other commands. I may come up with this, but I will post a 10-step script on an irregular basis.
As ** 5th **, I would like to post a resample of time series data using pandas. </ font>
I think there are many situations where you want to aggregate monthly or quarterly data on sales, financial product prices, and various other time-series data. Knowing that it can be done using pandas resample, I would like to post an example of totaling annually and quarterly using exchange rate data as an example, without forgetting myself.
【environment】 Linux: debian10 python: 3.7.3 pandas: 1.0.3 jupyterlab: 2.1.0
For the data, I would like to use fred's exchange rate data.
If you have the following data
DATEDEXJ,PUS 2015-01-01,NaN 2015-01-02,120.20 2015-01-05,119.64
** Resample syntax on average ** data frame .resample (rule ='frequency code'). mean () </ font> -Frequency code: Annual ('Y'), Quarterly ('Q'), Monthly ('M'), Weekly ('W') -As an example, when storing the annual average of the data frame "df_dexjpus" in the data frame "df_year" df_year = df_dexjpus.resample(rule='Y').mean()
The code ran in jupyter.
python3.7
#Exchange rate resample(Annual / Quarterly / Monthly / Weekly)
#Forex data frame(df_dexjpus)
# infile = ('./dexjpus_20200417.csv')
import pandas as pd
import matplotlib.pyplot as plt
df_dexjpus = pd.read_csv('./dexjpus_20200417.csv' ,index_col='DATE' ,parse_dates=['DATE'])
#Annual
df_year = df_dexjpus.resample(rule='Y').mean()
#2017 data
df_2017 = df_dexjpus.loc['2017-01-01' : '2017-12-31']
#2017 quarterly exchange rate average
df_quarter = df_2017.resample(rule='Q').mean()
** 2-1. Import library, read data **
python3.7
#Exchange rate resample(Annual / Quarterly / Monthly / Weekly)
#Forex data frame(df_dexjpus)
# infile = ('./dexjpus_20200417.csv')
import pandas as pd
import matplotlib.pyplot as plt
df_dexjpus = pd.read_csv('./dexjpus_20200417.csv' ,index_col='DATE' ,parse_dates=['DATE'])
df_dexjpus
DEXJPUS
DATE
2015-01-01 NaN
2015-01-02 120.20
2015-01-05 119.64
2015-01-06 118.26
2015-01-07 119.5
** 2-2. I'll take a look every year. ** **
python3.7
#Annual
df_year = df_dexjpus.resample(rule='Y').mean()
df_year
DEXJPUS
DATE
2015-12-31 121.049084
2016-12-31 108.656932
2017-12-31 112.098554
2018-12-31 110.397390
2019-12-31 109.018835
2020-12-31 108.775333
** 2-3. Let's take a look at 2017 as an example. ** **
python3.7
#2017 data
df_2017 = df_dexjpus.loc['2017-01-01' : '2017-12-31']
DEXJPUS
DATE
2017-01-31 114.872105
2017-02-28 112.911579
2017-03-31 112.916522
2017-04-30 110.091000
2017-05-31 112.243636
2017-06-30 110.914091
2017-07-31 112.417000
2017-08-31 109.826957
2017-09-30 110.776000
2017-10-31 112.914762
2017-11-30 112.819000
2017-12-31 112.940500
** 2-4. Finally, I would like to look at the data for each quarter of 2017. ** **
python3.7
#2017 quarterly exchange rate average
df_quarter = df_2017.resample(rule='Q').mean()
df_quarter
DEXJPUS
DATE
2017-03-31 113.524098
2017-06-30 111.113906
2017-09-30 110.950476
2017-12-31 112.891803
** This is a resample of time series data using 5.pandas. ** **
Recommended Posts