While reading http://pandas.pydata.org/pandas-docs/stable/groupby.html, I suddenly ran into a method called `rolling()`. The API Reference alone left me confused, so I'll work through a simple example to get a feel for what kind of method it is.
First, let's just try it out.
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: s = pd.Series(range(0,7))
In [4]: s
Out[4]:
0 0
1 1
2 2
3 3
4 4
5 5
6 6
dtype: int64
In [5]: s.rolling(window=3, min_periods=3).mean()
Out[5]:
0 NaN
1 NaN
2 1.0
3 2.0
4 3.0
5 4.0
6 5.0
dtype: float64
As you can see from Out[5], combining `rolling()` with `mean()` calculates, for each element of the Series `s`, the average of three elements: that element and the two before it. In other words, it calculates a moving average.
For example, at `index = 3`, the elements at `index = 1, 2, 3` have the values `1, 2, 3` respectively, so the average of those three, `2.0`, is output.
For the elements at `index = 0, 1`, there are not enough preceding elements, so the calculation is not performed and `NaN` is output.
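To check my understanding of the explanation above, here is a minimal sketch that recomputes the same moving average by hand and compares it with `rolling()`. The helper `manual_rolling_mean` is a name I made up for illustration; it is not part of pandas.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(0, 7))
window = 3


def manual_rolling_mean(series, window):
    """Hypothetical helper: trailing mean computed with a plain loop."""
    values = []
    for i in range(len(series)):
        if i + 1 < window:
            # Fewer than `window` elements exist up to this position.
            values.append(np.nan)
        else:
            # Average of the current element and the (window - 1) before it.
            values.append(series.iloc[i - window + 1 : i + 1].mean())
    return pd.Series(values, index=series.index)


manual = manual_rolling_mean(s, window)
builtin = s.rolling(window=window, min_periods=window).mean()

# Compare with a tolerance, treating NaN in the same position as equal.
same = np.allclose(manual.to_numpy(), builtin.to_numpy(), equal_nan=True)
print(same)
```

The loop is far slower than `rolling()`, so it is only meant as a way to see which elements fall inside each window.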
`window` determines how many elements, counting back from the current index, are included in the calculation, and `min_periods` specifies the minimum number of elements required to obtain a valid result. So, if you want the average of 4 elements but still want a result whenever at least 2 elements are available, you can specify the following.
In [6]: s.rolling(window=4, min_periods=2).mean()
Out[6]:
0 NaN
1 0.5
2 1.0
3 1.5
4 2.5
5 3.5
6 4.5
dtype: float64
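To see why only `index = 0` becomes `NaN` here, the following sketch lists which elements are actually available in each window and whether that count reaches `min_periods`. The column names `elements`, `enough`, and `mean` are just labels I chose for this illustration.

```python
import pandas as pd

s = pd.Series(range(0, 7))
window, min_periods = 4, 2

result = s.rolling(window=window, min_periods=min_periods).mean()

rows = []
for i in range(len(s)):
    # Elements available in the trailing window that ends at position i.
    # Near the start of the Series the window is shorter than `window`.
    available = s.iloc[max(0, i - window + 1) : i + 1].tolist()
    rows.append(
        {
            "elements": available,
            "enough": len(available) >= min_periods,
            "mean": result.iloc[i],
        }
    )

print(pd.DataFrame(rows))
```

At `index = 1` only two elements (`0` and `1`) are available, but that already satisfies `min_periods=2`, so their average `0.5` is output.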
Also, if `center=True` is set, the calculation uses `window` elements centered on the current index, that is, the element itself together with the elements before and after it.
In [7]: s.rolling(window=3, min_periods=3, center=True).mean()
Out[7]:
0 NaN
1 1.0
2 2.0
3 3.0
4 4.0
5 5.0
6 NaN
dtype: float64
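One way to think about `center=True`, at least for this example with an odd `window` and `min_periods` equal to `window`, is that it gives the trailing result moved back by `(window - 1) // 2` positions. The sketch below checks that assumption with `shift()`; I have not verified that it holds for other combinations of `window` and `min_periods`.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(0, 7))
window = 3

centered = s.rolling(window=window, min_periods=window, center=True).mean()

# Trailing moving average, then moved back by (window - 1) // 2 positions.
offset = (window - 1) // 2
shifted = s.rolling(window=window, min_periods=window).mean().shift(-offset)

# Compare with a tolerance, treating NaN in the same position as equal.
same = np.allclose(centered.to_numpy(), shifted.to_numpy(), equal_nan=True)
print(same)
```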