Introduction

In my earlier notes on speeding up Python code with Numba, I found that Numba is effective at speeding up technical indicator functions built around for statements. Other indicators, however, make heavy use of if statements.

One of them is Parabolic SAR, which is not an unusual indicator but a rather popular one. However, because it switches between an ascending mode and a descending mode and its step width changes along the way, it cannot be written with a for statement alone. I ported it to Python last time, when I ported MetaTrader's technical indicators.

This post is a memo on speeding it up.
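To make the mode switching concrete, here is a minimal sketch of the two pieces that change inside the loop: the SAR update toward the extreme point, and the step-width (acceleration) update. The helper names `next_sar` and `next_acc` are illustrative only and do not appear in the code below.

```python
def next_sar(prev_sar, extreme_point, acc):
    """Move the SAR a fraction `acc` of the way toward the extreme point."""
    return prev_sar + acc * (extreme_point - prev_sar)


def next_acc(acc, step, maximum, new_extreme):
    """Raise the acceleration by `step` on a new extreme, capped at `maximum`."""
    if new_extreme and acc + step <= maximum:
        return acc + step
    return acc


# Ascending mode: previous SAR 1.0, highest high so far 1.1, acceleration 0.02
print(next_sar(1.0, 1.1, 0.02))
# A new high was made, so the acceleration grows by one step
print(next_acc(0.02, 0.02, 0.2, new_extreme=True))
```

On a reversal the real loop additionally resets the acceleration to `step` and restarts the SAR from the last extreme point.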

Parabolic SAR Python code

import numpy as np
import pandas as pd
dataM1 = pd.read_csv('DAT_ASCII_EURUSD_M1_2015.csv', sep=';',
                     names=('Time','Open','High','Low','Close', ''),
                     index_col='Time', parse_dates=True)

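def iSAR(df, step, maximum):
    last_period = 0   # bars elapsed since the last reversal
    dir_long = True   # True: ascending mode, False: descending mode
    ACC = step        # acceleration factor, reset to step on each reversal
    SAR = df['Close'].copy()
    for i in range(1, len(df)):
        last_period += 1
        if dir_long:
            # extreme point: highest high since the last reversal
            # (.iloc makes the positional indexing explicit)
            Ep1 = df['High'].iloc[i-last_period:i].max()
            SAR.iloc[i] = SAR.iloc[i-1] + ACC*(Ep1 - SAR.iloc[i-1])
            Ep0 = max(Ep1, df['High'].iloc[i])
            # a new high raises the acceleration, capped at maximum
            if Ep0 > Ep1 and ACC+step <= maximum:
                ACC += step
            # price fell through the SAR: switch to descending mode
            if SAR.iloc[i] > df['Low'].iloc[i]:
                dir_long = False
                SAR.iloc[i] = Ep0
                last_period = 0
                ACC = step
        else:
            # extreme point: lowest low since the last reversal
            Ep1 = df['Low'].iloc[i-last_period:i].min()
            SAR.iloc[i] = SAR.iloc[i-1] + ACC*(Ep1 - SAR.iloc[i-1])
            Ep0 = min(Ep1, df['Low'].iloc[i])
            # a new low raises the acceleration, capped at maximum
            if Ep0 < Ep1 and ACC+step <= maximum:
                ACC += step
            # price rose through the SAR: switch to ascending mode
            if SAR.iloc[i] < df['High'].iloc[i]:
                dir_long = True
                SAR.iloc[i] = Ep0
                last_period = 0
                ACC = step
    return SAR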

%timeit y = iSAR(dataM1, 0.02, 0.2)

There is only a single for statement, but it still takes a while.

1 loop, best of 3: 1min 19s per loop
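`%timeit` is an IPython magic, so it only works in Jupyter or IPython. Outside a notebook, a rough equivalent can be sketched with the standard `timeit` module. The helper name `best_time` is hypothetical, and `repeat=3, number=1` is chosen to mirror the "1 loop, best of 3" output above.

```python
import timeit


def best_time(func, *args, repeat=3, number=1):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Example with a cheap stand-in workload; for the indicator this would be
# best_time(iSAR, dataM1, 0.02, 0.2)
print(best_time(sum, range(1_000_000)))
```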

Speed up with Numba

First, let's speed it up with Numba. Just convert the pandas Series to NumPy arrays and add @jit.

from numba import jit
@jit
def iSARjit(df, step, maximum):
    last_period = 0
    dir_long = True
    ACC = step
    SAR = df['Close'].values.copy()
    High = df['High'].values
    Low = df['Low'].values
    for i in range(1,len(SAR)):
        last_period += 1    
        if dir_long == True:
            Ep1 = High[i-last_period:i].max()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = max([Ep1, High[i]])
            if Ep0 > Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] > Low[i]:
                dir_long = False
                SAR[i] = Ep0
                last_period = 0
                ACC = step
        else:
            Ep1 = Low[i-last_period:i].min()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = min([Ep1, Low[i]])
            if Ep0 < Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] < High[i]:
                dir_long = True
                SAR[i] = Ep0
                last_period = 0
                ACC = step
    return SAR

%timeit y = iSARjit(dataM1, 0.02, 0.2)

1 loop, best of 3: 1.43 s per loop

That's about 55 times faster. Since it needs only a few code changes, it's a decent result.
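One caveat, as far as I know: recent Numba releases (0.59 and later) compile `@jit` functions in nopython mode by default, and nopython mode cannot take a pandas DataFrame as an argument. A version that should work there splits the function in two: a thin wrapper that pulls out the NumPy arrays, and a compiled kernel that sees only arrays and scalars. The following is a sketch under that assumption. The names `sar_kernel` and `iSAR_njit` are made up for illustration, and the `try`/`except` fallback exists only so the sketch also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption): run as plain Python when Numba is missing
    def njit(func):
        return func


@njit
def sar_kernel(high, low, close, step, maximum):
    """Same loop as iSARjit, but on plain float64 arrays only."""
    sar = close.copy()
    dir_long = True
    acc = step
    last_period = 0
    for i in range(1, len(sar)):
        last_period += 1
        if dir_long:
            ep1 = high[i - last_period:i].max()
            sar[i] = sar[i - 1] + acc * (ep1 - sar[i - 1])
            ep0 = max(ep1, high[i])
            if ep0 > ep1 and acc + step <= maximum:
                acc += step
            if sar[i] > low[i]:
                dir_long = False
                sar[i] = ep0
                last_period = 0
                acc = step
        else:
            ep1 = low[i - last_period:i].min()
            sar[i] = sar[i - 1] + acc * (ep1 - sar[i - 1])
            ep0 = min(ep1, low[i])
            if ep0 < ep1 and acc + step <= maximum:
                acc += step
            if sar[i] < high[i]:
                dir_long = True
                sar[i] = ep0
                last_period = 0
                acc = step
    return sar


def iSAR_njit(df, step, maximum):
    """Wrapper: extract arrays so the kernel never sees pandas objects."""
    return sar_kernel(df['High'].values, df['Low'].values,
                      df['Close'].values, step, maximum)


# Tiny made-up price series, just to check that the kernel runs
high = np.array([1.0, 1.1, 1.2])
low = np.array([0.9, 1.0, 1.1])
close = np.array([0.95, 1.05, 1.15])
print(sar_kernel(high, low, close, 0.02, 0.2))
```

With the data from this post the call would be `iSAR_njit(dataM1, 0.02, 0.2)`. Note that the first call includes compilation time, so it is worth calling the function once before timing it.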

Accelerate with Cython

Next, let's try speeding it up with Cython. I had thought Cython was a hassle to set up, but in a Jupyter notebook it was fairly easy to get working. However, because it uses an external compiler, you need to install Visual C++. The compiler has to match the version that Anaconda was built with, so I installed the following one this time.

Visual Studio Community 2015

First, the case where the code is simply compiled with Cython, without any changes.

%load_ext Cython

%%cython
cimport numpy
cimport cython
def iSAR_c0(df, step, maximum):
    last_period = 0
    dir_long = True
    ACC = step
    SAR = df['Close'].values.copy()
    High = df['High'].values
    Low = df['Low'].values
    for i in range(1,len(SAR)):
        last_period += 1    
        if dir_long == True:
            Ep1 = High[i-last_period:i].max()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = max([Ep1, High[i]])
            if Ep0 > Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] > Low[i]:
                dir_long = False
                SAR[i] = Ep0
                last_period = 0
                ACC = step
        else:
            Ep1 = Low[i-last_period:i].min()
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = min([Ep1, Low[i]])
            if Ep0 < Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] < High[i]:
                dir_long = True
                SAR[i] = Ep0
                last_period = 0
                ACC = step
    return SAR

%timeit y = iSAR_c0(dataM1, 0.02, 0.2)

The result:

1 loop, best of 3: 1.07 s per loop

With the same code, Cython is a little faster than Numba.

Next, the case where variable type declarations are added with cdef.

%%cython
cimport numpy
cimport cython
def iSARnew(df, double step, double maximum):
    cdef int last_period = 0
    dir_long = True
    cdef double ACC = step
    cdef numpy.ndarray[numpy.float64_t, ndim=1] SAR = df['Close'].values.copy()
    cdef numpy.ndarray[numpy.float64_t, ndim=1] High = df['High'].values
    cdef numpy.ndarray[numpy.float64_t, ndim=1] Low = df['Low'].values
    cdef double Ep0, Ep1
    cdef int i, N=len(SAR)
    for i in range(1,N):
        last_period += 1    
        if dir_long == True:
            Ep1 = max(High[i-last_period:i])
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = max([Ep1, High[i]])
            if Ep0 > Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] > Low[i]:
                dir_long = False
                SAR[i] = Ep0
                last_period = 0
                ACC = step
        else:
            Ep1 = min(Low[i-last_period:i])
            SAR[i] = SAR[i-1]+ACC*(Ep1-SAR[i-1])
            Ep0 = min([Ep1, Low[i]])
            if Ep0 < Ep1 and ACC+step <= maximum: ACC+=step
            if SAR[i] < High[i]:
                dir_long = True
                SAR[i] = Ep0
                last_period = 0
                ACC = step
    return SAR

%timeit y = iSARnew(dataM1, 0.02, 0.2)

The result:

1 loop, best of 3: 533 ms per loop

That's about twice as fast as the untyped Cython version. Further tuning might make it faster still, but it can also make the code less readable, so I'll leave it here.
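One tuning that might be worth trying, and which the timings in this post do not cover, is removing the slice `High[i-last_period:i].max()`. That call rescans every bar since the last reversal on each iteration, although the extreme point can be carried forward as a running value. Below is a pure-Python sketch on plain lists. The function names `sar_slice` and `sar_running` are hypothetical: `sar_slice` mirrors the loop used throughout this post, and `sar_running` is the variant with the running extreme point. It is not benchmarked here, so treat it as an idea rather than a measured result.

```python
import random


def sar_slice(high, low, close, step, maximum):
    """Reference: recompute the extreme point from a slice on every bar."""
    sar = list(close)
    dir_long, acc, last_period = True, step, 0
    for i in range(1, len(sar)):
        last_period += 1
        if dir_long:
            ep1 = max(high[i - last_period:i])
            sar[i] = sar[i - 1] + acc * (ep1 - sar[i - 1])
            ep0 = max(ep1, high[i])
            if ep0 > ep1 and acc + step <= maximum:
                acc += step
            if sar[i] > low[i]:
                dir_long, sar[i], last_period, acc = False, ep0, 0, step
        else:
            ep1 = min(low[i - last_period:i])
            sar[i] = sar[i - 1] + acc * (ep1 - sar[i - 1])
            ep0 = min(ep1, low[i])
            if ep0 < ep1 and acc + step <= maximum:
                acc += step
            if sar[i] < high[i]:
                dir_long, sar[i], last_period, acc = True, ep0, 0, step
    return sar


def sar_running(high, low, close, step, maximum):
    """Variant: carry the extreme point forward instead of slicing."""
    sar = list(close)
    dir_long, acc = True, step
    ep = high[0]  # extreme point of the window that starts at bar 0
    for i in range(1, len(sar)):
        sar[i] = sar[i - 1] + acc * (ep - sar[i - 1])
        if dir_long:
            ep0 = max(ep, high[i])
            if ep0 > ep and acc + step <= maximum:
                acc += step
            if sar[i] > low[i]:
                # reversal: the new window starts at bar i, descending mode
                dir_long, sar[i], acc, ep = False, ep0, step, low[i]
            else:
                ep = ep0
        else:
            ep0 = min(ep, low[i])
            if ep0 < ep and acc + step <= maximum:
                acc += step
            if sar[i] < high[i]:
                # reversal: the new window starts at bar i, ascending mode
                dir_long, sar[i], acc, ep = True, ep0, step, high[i]
            else:
                ep = ep0
    return sar


# Made-up random-walk prices, only to compare the two variants
rng = random.Random(0)
close = [1.0]
for _ in range(999):
    close.append(close[-1] + rng.uniform(-0.001, 0.001))
high = [c + 0.0005 for c in close]
low = [c - 0.0005 for c in close]
a = sar_slice(high, low, close, 0.02, 0.2)
b = sar_running(high, low, close, 0.02, 0.2)
print(all(abs(x - y) < 1e-12 for x, y in zip(a, b)))
```

The running variant uses only scalar operations and indexing inside the loop, so the same idea should carry over to the Numba and Cython versions without much change.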

Summary

When the code consists only of for statements, Numba gives a considerable speedup, but when if statements are involved as well, the gain is smaller. If you want it a little faster, Cython may be worth using, at the cost of some code modifications.

I tried speeding up Python code including if statements with Numba and Cython
