This is the Day 21 article of the Python Advent Calendar. I'm very sorry for being late m(_ _;)m
Today, I would like to introduce "Python in Practice (PiP)", a book I am currently reading.
Python in Practice (PiP) is a book written for Pythonistas who want to improve their Python programming skills. It was also selected for the 2014 "Jolt Awards: The Best Books". Reference: Which is the best IT book of the past year? "Jolt Awards: The Best Books" 2014 edition announced
This book is aimed at Python programmers who want to broaden and deepen their Python knowledge so that they can improve the quality, reliability, speed, maintainability, and usability of their Python programs. Quote: p.1, l.1
The book deals with the following four themes.
--Design patterns for elegant coding
--Improved processing speed using parallel processing and Cython
--High-level networking
--Graphics
Today, I would like to introduce the chapter "5. Extending Python" that focuses on improving processing speed. * Fold the contents to the end mm
Extending Python
The Extending Python chapter summarizes some tips for improving Python processing performance.
--Use PyPy
-PyPy has a built-in JIT (just-in-time) compiler, so for long-running programs the execution time is much shorter than with CPython.
However, note that for short-running programs the execution time may be longer because of the compilation overhead.
--Use C or C++ for time-critical processing
--By writing C or C++ code in a form that can be called from a Python program, you can benefit from the raw speed of C and C++.
The most direct way to use C or C++ code from Python is the Python C interface.
If you want to use an existing C or C++ library, it is common to use a tool such as SWIG or SIP (www.riverbankcomputing.com/software/sip), which provide wrappers for using C or C++ from Python. If you are using C++, you can also use boost::python.
See also CFFI (C Foreign Function Interface for Python) for a more recent option in this area.
--Compile Python code into C code using Cython
-Cython is a Python-based language extended with static data types. Cython source code is translated into C / C++ code and compiled as a Python extension module. Cython is a very useful tool if speed matters, because it can compile most Python code, and the compiled code usually runs much faster.
--Please refer to the official page for details (omitted here)
--Use ctypes to access C libraries
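To check the PyPy point above against your own workload, one option is to time the same CPU-bound function under both interpreters. The following is only a minimal sketch: `count_primes` is a hypothetical stand-in for real work, not something from the book, and the workload size is arbitrary.

```python
import platform
import timeit


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        for d in range(2, int(n ** 0.5) + 1):
            if n % d == 0:
                break
        else:
            # No divisor found, so n is prime.
            count += 1
    return count


if __name__ == "__main__":
    # Run this same file once under CPython and once under PyPy,
    # then compare the reported times.
    seconds = timeit.timeit(lambda: count_primes(20000), number=3)
    print(platform.python_implementation(), round(seconds, 3), "seconds")
```

With a very small `limit` the JIT warm-up can dominate, which is the short-program caveat mentioned in the list.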
From here, we will focus on accessing C libraries using ctypes and look at its usage in detail.
Accessing C Libraries with ctypes
ctypes, one of Python's standard library modules, allows access to stand-alone shared libraries written in C or C++ (.so files on Linux, .dylib on OS X, and .DLL on Windows).
Let's actually see how to use the ctypes module.
Here, as an example, we use the hyphen library, which inserts hyphens at the points where a given word may be broken. (This library itself is used by OpenOffice.org and LibreOffice.)
e.g. input: extraordinary, output: ex-traor-di-nary
Specifically, we use the following functions from the hyphen library. (Detailed explanation of each function is omitted.)
hyphen.h
//Create a HyphenDict pointer from a dictionary file for hyphen processing
HyphenDict *hnj_hyphen_load(const char *filename);
//For memory release
void hnj_hyphen_free(HyphenDict *hdict);
//Hyphenate word according to HyphenDict pointer
int hnj_hyphen_hyphenate2(HyphenDict *hdict, const char *word, int word_size, char *hyphens, char *hyphenated_word, char ***rep, int **pos, int **cut);
Let's use this library in Python right away!
First, locate the hyphen shared library. It is named libhyphen.so on Linux, hyphen.dylib on OS X, and hyphen.dll on Windows, and it must be on the library search path.
Hyphenate1.py
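import ctypes
import ctypes.util  # find_library() lives in the ctypes.util submodule

class Error(Exception):
    pass

# Locate the shared library by its base name.
_libraryName = ctypes.util.find_library("hyphen")
if _libraryName is None:
    raise Error("cannot find hyphenation library")
# Load the library so that its functions can be looked up as attributes.
_LibHyphen = ctypes.CDLL(_libraryName)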
This needs little explanation: the ctypes.util.find_library() function locates the shared library, and the ctypes.CDLL() function loads it.
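If you do not have the hyphen library installed, the same two calls can be tried against the C standard library. This is only a sketch and assumes a POSIX system (Linux or OS X); the choice of `abs` is mine and is not from the book.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. find_library("c") can return None on some
# setups; CDLL(None) then exposes the symbols already loaded into the
# running process, which include the C library.
libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_name) if libc_name else ctypes.CDLL(None)

# Functions are looked up by name as attributes of the loaded library.
# Without any declarations, Python ints are passed as C ints and the
# result is treated as a C int, which happens to match abs().
result = libc.abs(-5)
print(result)  # 5
```

Relying on the defaults only works for simple int functions like this one; anything involving pointers needs the declarations described next.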
After loading the library, create Python wrappers for the functions in the library. The general method is to assign the functions in the library to Python variables.
After assigning a function to a variable, you need to specify the type of the argument and the type of return.
e.g. Example of hnj_hyphen_load
Hyphenate1.py
_load = _LibHyphen.hnj_hyphen_load
_load.argtypes = [ctypes.c_char_p]
_load.restype = ctypes.c_void_p
e.g. Example of hnj_hyphen_hyphenate2
Hyphenate1.py
_int_p = ctypes.POINTER(ctypes.c_int)
_char_p_p = ctypes.POINTER(ctypes.c_char_p)
_hyphenate = _LibHyphen.hnj_hyphen_hyphenate2
_hyphenate.argtypes = [
    ctypes.c_void_p,  # HyphenDict *hdict
    ctypes.c_char_p,  # const char *word
    ctypes.c_int,  # int word_size
    ctypes.c_char_p,  # char *hyphens
    ctypes.c_char_p,  # char *hyphenated_word
    _char_p_p,  # char ***rep
    _int_p,  # int **pos
    _int_p,  # int **cut
]
_hyphenate.restype = ctypes.c_int
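One practical benefit of declaring argtypes is that ctypes checks the arguments before the C function is called. The sketch below illustrates this with `strlen` from the C standard library instead of the hyphen library; it assumes a POSIX system, and the variable names are only for illustration.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system (see the fallback to CDLL(None)).
libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_name) if libc_name else ctypes.CDLL(None)

strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]  # const char *s
strlen.restype = ctypes.c_size_t     # size_t

# bytes objects are accepted where a char * is declared.
length = strlen("extraordinary".encode("utf-8"))

try:
    # A str is not a char *, so ctypes refuses it before calling into C.
    strlen("extraordinary")
    rejected = False
except ctypes.ArgumentError:
    rejected = True

print(length, rejected)  # 13 True
```

Without the argtypes declaration, a mistake like this would be passed straight through to C and could return garbage or crash the interpreter instead of raising a Python exception.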
Let's use the wrapped library functions to create a Python function that hyphenates a word.
Hyphenate1.py
def hyphenate(word, filename, hyphen='-'):
    originalWord = word
    hdict = _get_hdict(filename)
    # The library works on UTF-8 encoded bytes, not on str.
    word = word.encode("utf-8")
    word_size = ctypes.c_int(len(word))
    # Output buffers; hyphen.h documents hyphens as word_size + 5 bytes.
    hyphens = ctypes.create_string_buffer(len(word) + 5)
    hyphenated_word = ctypes.create_string_buffer(len(word) * 2)
    rep = _char_p_p(ctypes.c_char_p(None))
    pos = _int_p(ctypes.c_int(0))
    cut = _int_p(ctypes.c_int(0))
    # A nonzero return value signals failure.
    if _hyphenate(hdict, word, word_size, hyphens, hyphenated_word, rep, pos, cut):
        raise Error("hyphenation failed for '{}'".format(originalWord))
    return hyphenated_word.value.decode("utf-8").replace("=", hyphen)
That's all there is to it. ctypes.create_string_buffer is a function that creates a mutable C char array, either initialized from a bytes object or zero-filled with a given size in bytes.
The word is encoded first because the hyphenation function expects UTF-8 encoded bytes rather than a Python string.
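The two ways of calling ctypes.create_string_buffer, and the effect of UTF-8 encoding on the byte count, can be seen in a small sketch. The sample word "café" is my own choice and is not from the book.

```python
import ctypes

# "é" takes two bytes in UTF-8, so the 4-character word becomes 5 bytes.
word = "café".encode("utf-8")

# Initialized from bytes: a copy of the data plus a terminating NUL byte.
from_bytes = ctypes.create_string_buffer(word)

# Initialized from a size: a zero-filled buffer of exactly that many bytes.
by_size = ctypes.create_string_buffer(len(word) * 2)

print(len(word))                  # 5
print(ctypes.sizeof(from_bytes))  # 6
print(ctypes.sizeof(by_size))     # 10

# The buffer is mutable, which is what a C function writing through a
# char * parameter needs. Here Python writes into it instead.
by_size.value = word
print(by_size.value.decode("utf-8"))  # café
```

This is why the lengths in the wrapper are computed after encoding: the C side counts bytes, not characters.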
The _get_hdict() function can be written as follows.
It simply loads the dictionary file and caches the resulting pointer per filename.
Hyphenate1.py
# Cache of loaded dictionaries, keyed by filename.
_hdictForFilename = {}
def _get_hdict(filename):
    if filename not in _hdictForFilename:
        hdict = _load(ctypes.create_string_buffer(filename.encode("utf-8")))
        if hdict is None:
            raise Error("failed to load '{}'".format(filename))
        _hdictForFilename[filename] = hdict
    hdict = _hdictForFilename.get(filename)
    if hdict is None:
        raise Error("failed to load '{}'".format(filename))
    return hdict
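The `hdict is None` check works because ctypes converts a NULL pointer to None when restype is ctypes.c_void_p. This can be confirmed without the hyphen library by using `getenv` from the C standard library; the sketch assumes a POSIX system, and the environment variable names are made up for the demonstration.

```python
import ctypes
import ctypes.util
import os

# Assumption: a POSIX system (see the fallback to CDLL(None)).
libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(libc_name) if libc_name else ctypes.CDLL(None)

getenv = libc.getenv
getenv.argtypes = [ctypes.c_char_p]  # const char *name
getenv.restype = ctypes.c_void_p     # treat the char * result as an opaque pointer

# Hypothetical variable names, used only for this demonstration.
os.environ["PIP_SKETCH_SET"] = "1"
os.environ.pop("PIP_SKETCH_UNSET", None)

present = getenv(b"PIP_SKETCH_SET")    # non-NULL pointer: a Python int address
missing = getenv(b"PIP_SKETCH_UNSET")  # NULL pointer: converted to None

print(present is not None, missing is None)  # True True
```

A C function that signals failure by returning NULL can therefore be checked with a plain `is None` test, as _get_hdict() does for hnj_hyphen_load.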
You are now ready to call the C library from Python. If you call the function, you should get output like the following.
>>> hyphenate('extraordinary', '/path/to/dictfile')
u'ex-traor-dinary'
As you can see, a C library can be used from Python with little effort, so it is worth considering handing the unavoidably heavy parts of your processing over to a C library.
This time, I picked the C extension part of PiP and introduced it. PiP is written in very simple English, so I can recommend it even to those who are not good at English. In particular, the first chapters on design patterns cover language-independent fundamentals, so I think they contain a lot that will be helpful to people who use other languages as well.
We are planning to have a reading session for this book at PyLadies Tokyo at the beginning of the year, so if you are interested, please contact us (promotion).