An FFI library called DragonFFI has been released and introduced on the LLVM project blog. The blog post shows an example of calling C from Python.
By the way, FFI stands for Foreign Function Interface, a mechanism that lets you call functions defined in one programming language from another. For example, Rust, which aims to be a replacement for C, provides an FFI for calling into C.
I played with DragonFFI's Python wrapper, pydffi.
For Python, I used "Python 3.6.3 :: Anaconda, Inc." installed via pyenv. The library installs easily with pip:
pip install pydffi
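A quick import is enough to confirm the installation (this is just a generic Python sanity check, nothing pydffi-specific):

import pydffi
print("pydffi imported:", pydffi.__name__)   # any importable module exposes __name__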
I had nothing particular in mind, so I implemented a Fibonacci function in both Python and C. A function written in C should run faster, so I was curious to see the actual numbers. First, the pure Python version:
def f(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return f(n-1) + f(n-2)

print(f(30))
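The timings in the table below can be reproduced by wrapping the call with time.time(), the same pattern the matmul section uses later; a minimal sketch, assuming the f defined above (the harness itself is my own reconstruction):

import time

start = time.time()
print(f(30))   # pure-Python recursive version defined above
print("Python:{:.4f}[sec]".format(time.time() - start))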
pydffi
Define the same function f in C like this:
int f(int n) {
    if (n == 0) {
        return 0;
    } else if (n == 1) {
        return 1;
    } else {
        return f(n-1) + f(n-2);
    }
}
Then call the C-defined f from Python. With pydffi this is very simple to write:
import pydffi

with open("cfunc/fibonacci_opt.c") as f:
    c_src = "".join(f.readlines())

F = pydffi.FFI()
CU = F.compile(c_src)
print(int(CU.funcs.f(30)))
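The C-side timing can be taken the same way once the compile unit exists; a minimal sketch, assuming the CU object from the snippet above (again, the measurement harness is my reconstruction):

import time

start = time.time()
result = int(CU.funcs.f(30))   # call into the compiled C function
print("C(FFI):{:.4f}[sec], result={}".format(time.time() - start, result))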
I ran both with n = 30. Neither implementation is optimized at all, so it is only natural that the C version is faster.
Python function | C function (pydffi) |
---|---|
0.3381 [sec] | 0.0496 [sec] |
By the way, if you optimize the Python version with memoization, it looks like this (fast!). The algorithm is what really matters.
Python (memoization) |
---|
0.00005 [sec] |
memo = [0] * 1000

def f(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    if memo[n]:
        return memo[n]
    m = f(n-1) + f(n-2)
    memo[n] = m
    return m

print(f(30))
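As a side note, the standard library's functools.lru_cache gives the same memoization without a hand-rolled table; a minimal equivalent sketch:

from functools import lru_cache

@lru_cache(maxsize=None)   # cache the result for every distinct argument
def f(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return f(n - 1) + f(n - 2)

print(f(30))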
I wanted another sample, so I also implemented matrix multiplication, the so-called matmul. This time numpy joins the comparison as well. First, the plain Python version:
import random

N = 256  # matrix size (also measured with N = 1024 below)

# flat 1-D lists holding N x N matrices in row-major order
A = [random.random() for _ in range(N) for _ in range(N)]
B = [random.random() for _ in range(N) for _ in range(N)]
C = [0.0 for _ in range(N) for _ in range(N)]

for i in range(N):
    for j in range(N):
        for k in range(N):
            C[i * N + j] += A[i * N + k] * B[k * N + j]
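The pure-Python figure in the tables below was presumably measured with the same time.time() pattern as the C call; a sketch, where py_matmul is just a hypothetical wrapper around the triple loop above:

import time

def py_matmul(A, B, C, N):
    # identical to the triple loop above, wrapped so it can be timed as one call
    for i in range(N):
        for j in range(N):
            for k in range(N):
                C[i * N + j] += A[i * N + k] * B[k * N + j]

start = time.time()
py_matmul(A, B, C, N)
print("Python:{:.4f}[sec]".format(time.time() - start))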
pydffi
void matmul(double *A, double *B, double *C, int N) {
    int i, j, k;
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            for (k = 0; k < N; k++) {
                C[i * N + j] += A[i * N + k] * B[k * N + j];
            }
        }
    }
}
import time
import pydffi

# read c source
with open("cfunc/matmul.c") as f:
    c_src = "".join(f.readlines())

# initialize
FFI = pydffi.FFI()
CU = FFI.compile(c_src)

# create array objects & set values
arr_A = pydffi.CArrayObj(FFI.arrayType(FFI.DoubleTy, N*N))
arr_B = pydffi.CArrayObj(FFI.arrayType(FFI.DoubleTy, N*N))
arr_C = pydffi.CArrayObj(FFI.arrayType(FFI.DoubleTy, N*N))
for i in range(N*N):
    arr_A.set(i, A[i])
    arr_B.set(i, B[i])
    arr_C.set(i, 0.0)

# execute c matmul
start = time.time()
CU.funcs.matmul(arr_A, arr_B, arr_C, N)
print("C(FFI):{:.5f}[sec]".format(time.time() - start))
numpy
import numpy as np

np_A = np.array(A).reshape(N, N)
np_B = np.array(B).reshape(N, N)
np_C = np.matmul(np_A, np_B)
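The numpy column can be measured the same way, timing only the np.matmul call (a sketch; np_A and np_B are the arrays created above):

import time

start = time.time()
np_C = np.matmul(np_A, np_B)   # numpy's matrix multiply
print("numpy:{:.5f}[sec]".format(time.time() - start))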
N=256
As expected, the C function is fast, and numpy is the fastest of all. Still, the C code is just a naive triple loop, so there is plenty of room for further optimization. (⇒ For those interested in optimizing matmul, the reference here will be helpful.)
Python function | C function (pydffi) | numpy |
---|---|---|
7.1067 [sec] | 0.0329 [sec] | 0.0281 [sec] |
N=1024
Python function | C function (pydffi) | numpy |
---|---|---|
Not measured | 7.4422 [sec] | 0.0769 [sec] |