About Python code for simple moving average assuming the use of Numba

The moving average is the most basic of the technical indicators. The simple moving average (SMA) is nothing more than an arithmetic mean, but it is also used to calculate many technical indicators other than SMA itself. In fact, of the 30 or so technical indicators posted on GitHub, 40% use SMA.

This time, I would like to focus on SMA and compare several Python implementations.

Preparation

Since we have updated the complete set of Python packages, the versions of Python and the packages used are as follows.

First, generate a random walk of 100,000 samples, with reference to "Random walk in Python". This is the input data for SMA.

import numpy as np
import pandas as pd
from numba import jit

dn = np.random.randint(2, size=100000)*2-1
gwalk = np.cumprod(np.exp(dn*0.01))*100
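
For repeatable benchmarks, the same kind of walk can be generated from a seeded generator. The following is a minimal sketch, assuming NumPy's `default_rng` API; the function name `make_walk` and the seed value 0 are arbitrary choices for illustration and are not part of the original code.

```python
import numpy as np

def make_walk(n=100000, step=0.01, start=100.0, seed=0):
    """Geometric random walk: each sample moves up or down by a factor exp(step)."""
    rng = np.random.default_rng(seed)      # seeded so repeated runs give the same data
    dn = rng.integers(2, size=n) * 2 - 1   # random steps of -1 or +1
    return np.cumprod(np.exp(dn * step)) * start

gwalk = make_walk()
```

With a fixed seed, every implementation is timed on exactly the same input, which makes the results easier to compare between sessions.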

Implementation with pandas rolling and mean

The simplest implementation of SMA is with pandas. It can be easily written using the Series methods rolling and mean.

def SMA1(x, period):
    return pd.Series(x).rolling(period).mean()

As a common specification, every implementation takes the input time series and the SMA period as arguments. To see how the period affects speed, each one is measured at period = 20 and period = 200.

%timeit y1_20 = SMA1(gwalk, 20)
%timeit y1_200 = SMA1(gwalk, 200)
100 loops, best of 3: 6.02 ms per loop
100 loops, best of 3: 6.01 ms per loop

In the case of pandas, there seems to be no difference in execution speed depending on the period.
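
To see what this function returns, here is a small check on data that can be verified by hand. Note that the first period-1 outputs are NaN, because a full window is not available yet.

```python
import numpy as np
import pandas as pd

def SMA1(x, period):
    return pd.Series(x).rolling(period).mean()

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = SMA1(x, 3).to_numpy()
print(y)  # the first two entries are NaN; the rest are means of 3 consecutive samples
```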

Implementation with scipy's lfilter

Next, with reference to "Comparison of moving average calculation time written in Python", let's implement it using scipy's filter function lfilter.

from scipy.signal import lfilter
def SMA2(x, period):
    return lfilter(np.ones(period), 1, x)/period

Let's measure the execution time in the same way.

%timeit y2_20 = SMA2(gwalk, 20)
%timeit y2_200 = SMA2(gwalk, 200)
100 loops, best of 3: 5.53 ms per loop
100 loops, best of 3: 10.4 ms per loop

Since lfilter is a general-purpose filter function rather than one dedicated to SMA, the execution time changes with the period. With the shorter period it is slightly faster than pandas, but with the longer period it is slower.
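
One difference from the pandas version is worth checking. lfilter starts from a zero initial state by default, so the first period-1 outputs are partial sums divided by the period rather than NaN. The sketch below compares SMA2 with a direct convolution on a small input; the reference via `np.convolve` is my own choice for illustration.

```python
import numpy as np
from scipy.signal import lfilter

def SMA2(x, period):
    return lfilter(np.ones(period), 1, x) / period

x = np.arange(1.0, 11.0)   # 1, 2, ..., 10
period = 4
y = SMA2(x, period)

# Reference: mean over each full window, computed by direct convolution.
ref = np.convolve(x, np.ones(period) / period, mode="valid")

full_match = np.allclose(y[period - 1:], ref)   # outputs with a full window agree
warmup = y[:period - 1]                         # partial sums divided by period, not NaN
```

If the outputs are compared across implementations, only the values from index period-1 onward should be expected to match.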

Implementation by for statement (1)

Let's write the SMA formula directly using for loops. Left as plain Python this would obviously be slow, so, as the title says, numba is used to speed it up.

@jit
def SMA3(x, period):
    y = np.zeros(len(x))
    y[:period-1] = np.nan  # no full window yet; avoids x[i-j] wrapping to the end of x
    for i in range(period-1, len(y)):
        for j in range(period):
            y[i] += x[i-j]
    return y/period

%timeit y3_20 = SMA3(gwalk, 20)
%timeit y3_200 = SMA3(gwalk, 200)
100 loops, best of 3: 3.07 ms per loop
10 loops, best of 3: 32.3 ms per loop

Even though it uses for loops, with period 20 it is the fastest so far thanks to numba. However, the execution time is proportional to the period, so at period 200 it is about 10 times slower and becomes the slowest of all.
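
One detail of a double loop of this form deserves attention. When the outer index i is smaller than period-1, the index i-j becomes negative, and NumPy indexing treats a negative index as counting from the end of the array. Outputs before index period-1 are therefore only meaningful once a full window exists. A minimal sketch in plain Python, where `window_sum` is a hypothetical helper written for this illustration:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
period = 3

def window_sum(x, i, period):
    """Sum x[i], x[i-1], ..., x[i-period+1] exactly as the inner loop does."""
    total = 0.0
    for j in range(period):
        total += x[i - j]   # a negative i - j indexes from the end of the array
    return total

first = window_sum(x, 0, period)   # reads x[0], x[-1], x[-2]: mixes in the last samples
full = window_sum(x, 2, period)    # reads x[2], x[1], x[0]: a proper full window
```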

Implementation by for statement (2)

The final implementation takes advantage of a property of SMA. Since SMA is just a sum of samples, each output can be obtained from the result one sample earlier by adding the newest sample value and subtracting the oldest one.

@jit
def SMA4(x, period):
    y = np.empty(len(x))
    y[:period-1] = np.nan
    y[period-1] = np.sum(x[:period])
    for i in range(period, len(x)):
        y[i] = y[i-1]+x[i]-x[i-period]
    return y/period

The samples are summed directly until the first full period is available; after that, each output needs only three values: the previous sum, the new sample, and the sample that leaves the window. The execution speed is as follows.
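
Written as a formula, the update is S[i] = S[i-1] + x[i] - x[i-period], where S[i] is the sum of the latest period samples. The sketch below checks this recurrence against the direct definition on a small input, in plain NumPy without numba; the names `sma_recurrence` and `sma_direct` are hypothetical and chosen only for this comparison.

```python
import numpy as np

def sma_recurrence(x, period):
    """Running-sum SMA: each step adds the newest sample and drops the oldest."""
    y = np.empty(len(x))
    y[:period - 1] = np.nan                # no full window yet
    y[period - 1] = np.sum(x[:period])     # first full window, summed directly
    for i in range(period, len(x)):
        y[i] = y[i - 1] + x[i] - x[i - period]
    return y / period

def sma_direct(x, period):
    """Definition of SMA: mean of each window, recomputed from scratch."""
    y = np.full(len(x), np.nan)
    for i in range(period - 1, len(x)):
        y[i] = np.mean(x[i - period + 1:i + 1])
    return y

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
a = sma_recurrence(x, 3)
b = sma_direct(x, 3)
agree = np.allclose(a, b, equal_nan=True)
```

Because the running sum carries rounding error forward from one sample to the next, the two versions may differ in the last few digits on long floating-point series, so a tolerance-based comparison such as `np.allclose` is the safer check.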

%timeit y4_20 = SMA4(gwalk, 20)
%timeit y4_200 = SMA4(gwalk, 200)
1 loop, best of 3: 727 µs per loop
1000 loops, best of 3: 780 µs per loop

This is the fastest of all the implementations so far, and the result is almost the same even when the period is extended.
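
When timing numba functions, keep in mind that the first call also includes JIT compilation, so it can be far slower than later calls. The following sketch separates the two. It assumes numba is installed and falls back to plain Python otherwise; `sma_running` is a hypothetical name, and the input array is arbitrary.

```python
import time
import numpy as np

try:
    from numba import njit
except ImportError:        # assumption: fall back to plain Python if numba is missing
    def njit(func):
        return func

@njit
def sma_running(x, period):
    y = np.empty(len(x))
    y[:period - 1] = np.nan
    y[period - 1] = np.sum(x[:period])
    for i in range(period, len(x)):
        y[i] = y[i - 1] + x[i] - x[i - period]
    return y / period

x = np.cumsum(np.ones(100000))   # any float array works as timing input

t0 = time.perf_counter()
y_first = sma_running(x, 20)     # with numba, this call also compiles the function
t1 = time.perf_counter()
y_second = sma_running(x, 20)    # compiled code is reused from here on
t2 = time.perf_counter()

print("first call :", t1 - t0, "s")
print("second call:", t2 - t1, "s")
```

Calling the function once before measuring keeps the compilation cost out of the reported time.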

To summarize, with numba providing the speedup, SMA can be computed quite fast even with for loops.
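
As a point of comparison outside the measurements above, the same running-sum idea can also be written without an explicit loop by taking differences of a cumulative sum. This is a different technique from the numba loops, and its speed was not measured here; `sma_cumsum` is a hypothetical name.

```python
import numpy as np

def sma_cumsum(x, period):
    """SMA from differences of a cumulative sum (no explicit Python loop)."""
    # c[k] holds x[0] + ... + x[k-1], with c[0] = 0
    c = np.cumsum(np.insert(np.asarray(x, dtype=float), 0, 0.0))
    y = np.full(len(x), np.nan)            # NaN until a full window exists
    y[period - 1:] = (c[period:] - c[:-period]) / period
    return y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = sma_cumsum(x, 3)
```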
