There are many ways to speed up Python, and one of them is to expose C code through the Python C API.
There may be other reasons as well, but here is my motivation for wrapping C code as an API.
Recent setuptools has evolved and made it easier to build extensions and wheels, so there seems to be room to reconsider the C API.
I feel there is demand in niches, for example when someone like me who cannot write C at all wants to make effective use of a small C program provided by an expert.
As a sample, let's use an algorithm called the [Sieve of Eratosthenes](https://ja.wikipedia.org/wiki/Eratosthenes Sieve). I don't know the details, so please refer to the link. It is one of the primality-testing algorithms and finds all prime numbers less than or equal to a specified integer. For example, if the given integer is 10, it finds [2, 3, 5, 7].
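To make the idea concrete, here is a minimal Python sketch of the sieve that returns the primes themselves. The function name `primes_up_to` is my own choice for illustration and does not appear in the sample code of this article.

```python
def primes_up_to(n):
    """Return all primes <= n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    # is_composite[k] becomes True once k is known to be a multiple of a prime
    is_composite = [False] * (n + 1)
    primes = []
    for i in range(2, n + 1):
        if is_composite[i]:
            continue
        primes.append(i)
        # Cross out every multiple of i, starting from i * i
        for multiple in range(i * i, n + 1, i):
            is_composite[multiple] = True
    return primes


print(primes_up_to(10))  # prints [2, 3, 5, 7]
```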
For the C code, I referred to here.
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#define MAX 1000000

/* Current wall-clock time in seconds, with microsecond resolution. */
double gettimeofday_sec()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + (double)tv.tv_usec*1e-6;
}

/* Run the sieve over the integers 1..n and return the elapsed seconds. */
double prim(int n)
{
    double start, end;
    start = gettimeofday_sec();
    int i, j;
    int p[n];  /* p[k-1] == 1 means that k is not prime */
    for(i=0 ; i<n ; i++) p[i] = 0;
    p[0] = 1;  /* 1 is not prime */
    for(i=2 ; i<=n/2 ; i++){
        for(j=2 ; i*j<=n ; j++){
            if(p[i*j-1] == 0)
                p[i*j-1] = 1;
        }
    }
    end = gettimeofday_sec();
    return end - start;
}
Since printing all the search results would be too much, I made it a function that returns only the processing time.
virtualenv
It is strongly recommended to create a virtual environment in advance.
This time, we will proceed on the assumption that virtualenvwrapper is installed.
The virtual environment name is `api_test`. Specify the Python version you want to build for with the `--python=` option of `mkvirtualenv`. This greatly affects the package build (building the wheel) described later, so specify the Python that the package will be distributed for. In my case the path is `/usr/bin/python3.5`, but please adjust it to your own environment.
mkvirtualenv api_test --python=/usr/bin/python3.5
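If you want to confirm from Python that the virtual environment is really active, something like the following sketch should work. The helper name `in_virtualenv` is hypothetical, and the `sys.real_prefix` check is only there on the assumption that an older virtualenv release may be in use.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtualenv active:", in_virtualenv())
print("Python version: %d.%d" % sys.version_info[:2])
```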
Create a directory of your choice. In the end, it will have the following layout.
.
├── api_sample
│   ├── __init__.py
│   └── py
│       └── __init__.py
├── prim.c
└── setup.py
If you are not familiar with creating files and directories, you can also clone my GitHub repository.
git clone https://github.com/drillan/python3_c_api_sample.git
Create and edit `prim.c` directly under the directory.
Include the Python API header.
#include <Python.h>
All user-visible symbols defined in `Python.h` have the prefix `Py` or `PY`.
Some systems have preprocessor definitions that affect the standard headers, so `Python.h` must be included before any standard header.
This time, we will expose the `prim()` function of the above C sample code as a function of a Python module.
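static PyObject *
prim(PyObject *self, PyObject *args)
{
    int n;
    /* Convert the Python argument tuple into a C int. */
    if(!PyArg_ParseTuple(args, "i", &n))
        return NULL;
    /* Reject n < 1: the array below needs at least one element. */
    if(n < 1){
        PyErr_SetString(PyExc_ValueError, "n must be a positive integer");
        return NULL;
    }
    double start, end;
    start = gettimeofday_sec();
    int i, j;
    int p[n];  /* lives on the stack, so a very large n can overflow it */
    for(i=0 ; i<n ; i++) p[i] = 0;
    p[0] = 1;
    for(i=2 ; i<=n/2 ; i++){
        for(j=2 ; i*j<=n ; j++){
            if(p[i*j-1] == 0)
                p[i*j-1] = 1;
        }
    }
    end = gettimeofday_sec();
    /* Convert the C double back into a Python float. */
    return Py_BuildValue("d", end - start);
}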
The `self` argument points to the module object for a module-level function; for a method, it points to the object instance.
The `args` argument is a pointer to the Python tuple object that contains the arguments. Each element of the tuple corresponds to one argument in the call's argument list. This handling is necessary because the arguments come from Python. `PyArg_ParseTuple()` is used because the given arguments need to be converted to C types. In its second argument, the format string, `"i"` stands for an int, `"d"` for a double, and `"s"` for a C string (`const char *`). Finally, `Py_BuildValue()` converts the result back into a type that Python can accept.
In this way, to turn a C function into a Python API, you need to be aware of the Python -> C -> Python flow.
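As a rough mental model of the argument conversion step, here is a toy version written in pure Python. It is only a sketch: `parse_tuple` and `FORMAT_UNITS` are names I made up, only three format units are covered, and the real `PyArg_ParseTuple()` does far more than this.

```python
# Hypothetical table: a few format units and the Python type each one expects.
FORMAT_UNITS = {"i": int, "d": float, "s": str}


def parse_tuple(args, fmt):
    """Toy stand-in for PyArg_ParseTuple: validate args against format units."""
    if len(args) != len(fmt):
        raise TypeError("expected %d argument(s), got %d" % (len(fmt), len(args)))
    values = []
    for value, unit in zip(args, fmt):
        if unit == "d" and isinstance(value, int):
            value = float(value)  # the "d" unit also accepts integers
        if not isinstance(value, FORMAT_UNITS[unit]):
            raise TypeError("argument for %r has the wrong type" % unit)
        values.append(value)
    return values


print(parse_tuple((1000000,), "i"))  # prints [1000000]
print(parse_tuple((3,), "d"))        # prints [3.0]
```

Passing a string where `"i"` is expected, for example `parse_tuple(("x",), "i")`, raises `TypeError`, which is roughly what happens when `PyArg_ParseTuple()` fails and the C function returns `NULL`.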
Register the `prim()` function created above in the method table so that it can be called as a Python method.
static PyMethodDef methods[] = {
    {"prim", prim, METH_VARARGS},
    {NULL, NULL}
};
The three entries are, in order, the method name, a pointer to the C implementation, and flag bits indicating how the call is made.
`METH_VARARGS` is the calling convention typically used for functions of type `PyCFunction`, where the arguments are passed as a tuple.
If you also want to accept keyword arguments, specify `METH_VARARGS | METH_KEYWORDS`; the function then has the type `PyCFunctionWithKeywords` and takes a third argument holding the keywords.
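Seen from the Python side, the two calling conventions resemble `*args` and `**kwargs`. The sketch below is only an analogy with function names of my own choosing; it shows what kind of objects the C function receives.

```python
def varargs_style(*args):
    """Positional arguments arrive packed in one tuple, as with METH_VARARGS."""
    return args


def keywords_style(*args, **kwargs):
    """Positional arguments arrive as a tuple and keyword arguments as a dict."""
    return args, kwargs


print(varargs_style(1000000))            # prints (1000000,)
print(keywords_style(10, verbose=True))  # prints ((10,), {'verbose': True})
```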
docstring
A docstring is normally created with the `PyDoc_STRVAR()` macro.
It can be attached as the module docstring when the module is registered, as described later.
PyDoc_STRVAR(api_doc, "Python3 API sample.\n");
Register the module name and related information so that the module can be imported from Python. Specify the module name, the docstring, the size of the module's state memory, and the method table.
static struct PyModuleDef cmodule = {
    PyModuleDef_HEAD_INIT,
    "c",        /* name of module */
    api_doc,    /* module documentation, may be NULL */
    -1,         /* size of per-interpreter state of the module,
                   or -1 if the module keeps state in global variables. */
    methods
};
This time, the module name is `c`.
Specifying `-1` as the third value after `PyModuleDef_HEAD_INIT` means that the module keeps its state in global variables and does not support sub-interpreters. I don't know the details myself, so if you want to know more, please refer to PEP 3121.
The `PyInit_c()` function is called when the `c` module is imported.
The `PyModule_Create()` function creates the module defined above and returns it from the initialization function. (Something like `__init__.py`?)
Note that the name of the `PyInit_<name>()` function depends on the module name.
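The naming rule can be written down as a tiny helper. This is just an illustration: `init_function_name` is a hypothetical name, and the sketch only covers plain ASCII module names.

```python
def init_function_name(module_name):
    """Return the C init function name expected for an ASCII module name."""
    # Only the last component of a dotted name is used
    short_name = module_name.rpartition(".")[2]
    # Non-ASCII names follow a different rule, so refuse them here
    short_name.encode("ascii")
    return "PyInit_" + short_name


print(init_function_name("c"))  # prints PyInit_c
```

So if the module were renamed to, say, `prim`, the initialization function in the C source would have to be renamed to `PyInit_prim()` as well.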
prim.c
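#include <Python.h>
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#define MAX 1000000

/* Current wall-clock time in seconds, with microsecond resolution. */
double gettimeofday_sec()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + (double)tv.tv_usec*1e-6;
}

static PyObject *
prim(PyObject *self, PyObject *args)
{
    int n;
    /* Convert the Python argument tuple into a C int. */
    if(!PyArg_ParseTuple(args, "i", &n))
        return NULL;
    /* Reject n < 1: the array below needs at least one element. */
    if(n < 1){
        PyErr_SetString(PyExc_ValueError, "n must be a positive integer");
        return NULL;
    }
    double start, end;
    start = gettimeofday_sec();
    int i, j;
    int p[n];  /* lives on the stack, so a very large n can overflow it */
    for(i=0 ; i<n ; i++) p[i] = 0;
    p[0] = 1;
    for(i=2 ; i<=n/2 ; i++){
        for(j=2 ; i*j<=n ; j++){
            if(p[i*j-1] == 0)
                p[i*j-1] = 1;
        }
    }
    end = gettimeofday_sec();
    /* Convert the C double back into a Python float. */
    return Py_BuildValue("d", end - start);
}

/* Method table: name, C implementation, calling convention. */
static PyMethodDef methods[] = {
    {"prim", prim, METH_VARARGS},
    {NULL, NULL}
};

PyDoc_STRVAR(api_doc, "Python3 API sample.\n");

static struct PyModuleDef cmodule = {
    PyModuleDef_HEAD_INIT,
    "c",        /* name of module */
    api_doc,    /* module documentation, may be NULL */
    -1,         /* size of per-interpreter state of the module,
                   or -1 if the module keeps state in global variables. */
    methods
};

/* Called when the module is imported; PyMODINIT_FUNC supplies the return type. */
PyMODINIT_FUNC
PyInit_c(void)
{
    return PyModule_Create(&cmodule);
}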
__init__.py
You could build it as is and import it, but with packaging in mind, register it in `api_sample/__init__.py`.
api_sample/__init__.py
import c
While we're at it, I'll write the same code in Python and compare the speeds.
Write code equivalent to `prim.c` in `api_sample/py/__init__.py` and save it.
Please note that the file name is the same as the one above, but it sits at a different level of the directory tree.
api_sample/py/__init__.py
import time

MaxNum = 1000000


def prim(n):
    start = time.time()
    # prime_box[k] == 1 means that k is not prime
    prime_box = [0 for i in range(n)]
    prime_box[0], prime_box[1] = 1, 1
    for i in range(2, n):
        j = 1
        # Mark every multiple of i below n
        while i * (j + 1) < n:
            prime_box[i * (j + 1)] = 1
            j += 1
    end = time.time()
    return end - start


if __name__ == '__main__':
    print(prim(MaxNum))
Finally, it's time to build. With `setuptools`, it is enough to describe the build in `setup.py`. These are convenient times.
So create `setup.py`.
setup.py
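from setuptools import setup, Extension

# Build prim.c as an extension module named "c"
c = Extension('c', sources=['prim.c'])
# List the api_sample.py subpackage too, otherwise it is left out of the install
setup(name="api_sample", version="0.0.0",
      description="Python3 API Sample",
      packages=['api_sample', 'api_sample.py'], ext_modules=[c])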
The key point seems to be that specifying the location of the C source code with `setuptools.Extension()` makes it build as a Python extension.
Passing the extension defined above to the `ext_modules` keyword argument of `setuptools.setup()` makes it callable as a Python module.
python setup.py build
Running the above builds the required modules and saves them in the `build` directory.
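The compiled extension gets a file name that depends on the interpreter and platform. As a small sketch, the candidate names for a module can be listed with the standard `importlib.machinery.EXTENSION_SUFFIXES`; the helper name `extension_filenames` is my own.

```python
import importlib.machinery


def extension_filenames(module_name):
    """List the file names this interpreter would accept for an extension module."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


# The exact names depend on the Python version and platform
for name in extension_filenames("c"):
    print(name)
```

Since the suffix encodes the Python version and platform, a file built for one interpreter will generally not be picked up by another.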
Let's install and use it immediately.
python setup.py install
pip freeze
If you can install it without any problem, the output will be as follows.
api-sample==0.0.0
Let's start with the Python code.
python -c "from api_sample import py; print(py.prim(1000000))"
In my environment, it took about this long:
5.6855926513671875
Then the C code.
python -c "from api_sample import c; print(c.prim(1000000))"
0.0776968002319336
It's much faster!
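To put a number on "much faster", the ratio of the two timings above can be computed directly. The figures are the ones measured in my environment, so the ratio will differ on other machines.

```python
python_seconds = 5.6855926513671875  # pure-Python result shown above
c_seconds = 0.0776968002319336       # C extension result shown above

speedup = python_seconds / c_seconds
print("The C version is about %.0f times faster" % speedup)
# prints: The C version is about 73 times faster
```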
Create a wheel.
python setup.py bdist_wheel
Running the above creates a wheel file in the `dist` directory that matches the execution environment.
In my environment, a file called `api_sample-0.0.0-cp35-cp35m-linux_x86_64.whl` was created.
This is the main reason why I recommend virtualenv: by switching virtual environments and rebuilding the wheel, you get a package for each environment, which makes distribution very convenient.
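The wheel file name itself records which environment it was built for. The following sketch splits a name into the fields defined by the wheel format (PEP 427); `parse_wheel_filename` is a hypothetical helper, not part of any library.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # An optional build tag may sit between the version and the python tag
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("api_sample-0.0.0-cp35-cp35m-linux_x86_64.whl")
print(info["python"], info["abi"], info["platform"])
# prints: cp35 cp35m linux_x86_64
```

Here `cp35` means CPython 3.5 and `linux_x86_64` is the platform, which is why a wheel built in one virtual environment only fits the matching Python.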
The Windows environment is often the bottleneck for packages that need to be built.
Normally Visual Studio is required, but you can also build by installing Visual C++ Build Tools.
Running `python setup.py bdist_wheel` with this installed builds a wheel for Windows, which can then be distributed to users who do not have a build environment.
You can also create wheels for both 32-bit and 64-bit by installing 32-bit and 64-bit Python and building in a virtualenv for each.
If you register the wheels for each platform on PyPI, users can easily install the package with `pip install`.