I've been worried about "Python-ness" for the past year, and when I realized it, I was reading the implementation of range
.
Let's use xrange
instead of range
! It was a super derailment from the story of the Python 2 era.
Repository: https://hg.python.org/cpython Mirror: https://github.com/python/cpython (I'm not sure about Mercurial, so I'm git all below)
https://github.com/python/cpython/tree/master/Objects There are various interesting things in the line. All of them are long and unreadable.
Source files for various builtin objects
It seems that there was a story that range
and xrange
are different in Python2. It seems that Python 3 doesn't have to worry about it.
It seems that there is something called 2to3, and there is also a program that properly converts xrange
to range
.
https://github.com/python/cpython/blob/master/Lib/lib2to3/fixes/fix_xrange.py
This time https://github.com/python/cpython/blob/28b093620c3ae9be02449ea0cecd8b4d649c80d2/Objects/rangeobject.c I Read. There are 1246 lines, but the difficulty level is probably low.
3〜4: include
The header file is https://github.com/python/cpython/tree/master/Include It is in.
What to read in rangeobject.c
typedef struct PyMemberDef
Two of.
The array of PyMemberDef
seems to define the names, types and offsets of the members of the C structure.
rangeobject
6〜19: struct rangeobject
The structure rangeobject
have. It looks like a range.
PyObject_HEAD
and PyObject
are described in Common Object Structures for all object types. Is an extension of PyObject
.
The value PY_SSIZE_T_MAX appears in the comment. Py_ssize_t
is a signed integer unlike ssize_t
. ssize_t
is mentioned in PEP 353 --Using ssize_t as the index type | Python.org.
21〜39: validate_step(PyObject *step)
Auxiliary function for validation steps. Step 0 is not suitable as a range, so it seems to be checking it. If no step is specified, it is regarded as 1.
41〜42: compute_range_length(PyObject *start, PyObject *stop, PyObject *step)
→ Line 151
44〜64: make_range_object(PyTypeObject *type, PyObject *start, PyObject *stop, PyObject *step)
Create a rangeobject
from the arguments. It is the following range_new that really creates rangeobject
, and make_range_object is called if there is no problem after checking various things with range_new.
Since rangeobject
is an object, it is an extension of PyObject
. So, after creating PyObject
, give the value of range (start etc.).
What is PyObject_New (type, typeobj)
? (Type *) _PyObject_New (typeobj)
[indicates](https://github.com/python/cpython/blob/28b093620c3ae9be02449ea0cecd8b4d649c80d2/Include/objimpl.h#L134 -L135), which creates PyObject
by allocating memory. For more information, follow object.c. Abbreviation.
66〜129: range_new(PyTypeObject *type, PyObject *args, PyObject *kw)
Make various checks properly and make rangeobject
. If an inappropriate value is passed, give up creating it, allocate the allocated memory, and return NULL
.
There was a comment like this.
/* XXX(nnorwitz): should we error check if the user passes any empty ranges?
range(-10)
range(0, -5)
range(0, 5, -1)
*/
131〜139: PyDoc
Documents for range (stop) and range (start, stop [, step]) are written using a macro called PyDoc_STRVAR. PyDoc_STRVAR
#define PyDoc_VAR(name) static char name[]
To arrive at. It's a macro for inline documents.
141〜149: range_dealloc(rangeobject *r)
Release of members of struct rangeobject
→ Release of structure.
151〜227: compute_range_length(PyObject *start, PyObject *stop, PyObject *step)
Calculates and returns the number of items in range (lo, hi, step).
The algorithm seems to be the same as get_len_of_range (lo, hi, step) starting from line 865, and the detailed story is written there. The difference is that the argument is PyObject
and that PyObject
is regarded as PyLong
and the story goes on.
229〜233: range_length(rangeobject *r)
Returns the member length as Py_ssize_t
.
The PyLong_AsSsize_t used here seems to return Py_ssize_t
from an int type object.
235〜248: compute_item(rangeobject *r, PyObject *i)
Calculate the value from start, i and step, and get the next value. The comment also says return r-> start + (i * r-> step)
. As it is.
250〜307: compute_range_item(rangeobject *r, PyObject *arg)
Those who call the above compute_item.
Before that, i of compute_item (rangeobject * r, PyObject * i) is issued from arg (actually it seems to be PyLong
) and the range is checked.
309〜319: range_item(rangeobject *r, Py_ssize_t i)
The side that calls the above compute_range_item. This takes care of function arguments and references.
321〜358: compute_slice(rangeobject *r, PyObject *_slice)
Make a slice of rangeobject
. It also handles failures and releases references.
360〜409: range_contains_long(rangeobject *r, PyObject *ob)
In the next range_contains, range_index, range_count
PyLong_CheckExact(ob) || PyBool_Check(ob)
Is executed with the check mark. PyLong_CheckExact and PyBool_Check /boolobject.h#L12) is also a type-checking macro with the same meaning.
As for the contents, is ob simply within the range indicated by the range object (= does it exist)?
411〜419: range_contains(rangeobject *r, PyObject *ob)
(PyLong_CheckExact(ob) || PyBool_Check(ob))
Then the above range_contains_Use long, otherwise_PySequence_IterSearchMake a judgment with.
421〜465: range_equals(rangeobject *r0, rangeobject *r1)
A function that compares two rangeobject
s. A comment outlines where to look in order to make a decision.
/*
if r0 is r1:
return True
if len(r0) != len(r1):
return False
if not len(r0):
return True
if r0.start != r1.start:
return False
if len(r0) == 1:
return True
return r0.step == r1.step
*/
467〜495: range_richcompare(PyObject *self, PyObject *other, int op)
The argument is PyObject
, but there is a check to see if other is PyRange_Type
.
op points to the comparison operator (https://github.com/python/cpython/blob/28b093620c3ae9be02449ea0cecd8b4d649c80d2/Include/object.h#L927), and this function targets Py_NE
and Py_EQ
only.
Eventually call range_equals.
497〜550: range_hash(rangeobject *r)
Hash function for rangeobject
. Return it with the type Py_hash_t. The values used to get the hash are len (), start and step. The processing is changed according to the value of len (r).
/* Hash function for range objects. Rough C equivalent of
if not len(r):
return hash((len(r), None, None))
if len(r) == 1:
return hash((len(r), r.start, None))
return hash((len(r), r.start, r.step))
*/
552〜570: range_count(rangeobject *r, PyObject *ob)
PyLong_CheckExact(ob) || PyBool_Check(ob)
At the time of range_contains_long, otherwise_PySequence_Count with IterSearch.
572〜602: range_index(rangeobject *r, PyObject *ob)
PyLong_CheckExact(ob) || PyBool_Check(ob)
At the time of range_contains_long, otherwise_PySequence_Search with IterSearch.
In case of range_contains_long, after checking that it exists
idx = PyNumber_FloorDivide(tmp, r->step);
The location is calculated and put out.
604〜613: range_as_sequence
Register various functions according to PySequenceMethods. here
PySequenceMethods |
range_as_sequence |
---|---|
lenfunc sq_length |
range_length → line 229 |
ssizeargfunc sq_item |
range_item → line 309 |
objobjproc sq_contains |
range_contains → line 411 |
Only are registered.
615〜633: range_repr(rangeobject *r)
Output the output string in PyUnicode. When step is 1, step is omitted.
635〜641: range_reduce(rangeobject *r, PyObject *args)
It seems to be an auxiliary function of pickle. Py_BuildValue (\ _Py_BuildValue_SizeT Give the information of struct rangeobject
to Python / modsupport.c # L441-L450)).
643〜662: range_subscript(rangeobject* self, PyObject* item)
item is If you go through PyIndex_Check (essentially int), you can use compute_range_item. If you go through PySlice_Check (essentially slice), you can use compute_slice. Returns the element. Otherwise it is an error.
665〜669: range_as_mapping
This time, it is registered according to PyMappingMethods. 2 out of 3 functions are implemented.
PyMappingMethods |
range_as_mapping |
---|---|
lenfunc mp_length |
range_length → line 229 |
binaryfunc mp_subscript |
range_subscript → line 643 |
671: range_iter(PyObject *seq)
→ Line 1076
672: range_reverse
→ Line 1134
674〜682: PyDoc
Documents for reverse, count and index.
684〜690: range_methods
PyMethodDef links functions on Python with functions implemented in C, argument information, and documentation. thing.
Here, the following functions are linked.
Built-in | Implementation in C |
---|---|
__reversed__ | range_reverse →1134 |
__reduce__ | range_reduce →635 |
count | range_count →552 |
index | range_index →572 |
692〜697: range_members
PyMemberDef represents the name, type and offset of the members of the C structure. Here, it is an array of start, stop, and step information.
699〜738: PyRange_Type
A type definition called PyRange_Type
. There are various setting items, but in the case of range, the following information is registered There is.
PyTypeObject |
PyRange_Type |
---|---|
const char *tp_name |
"range" |
Py_ssize_t tp_basicsize |
sizeof(rangeobject) |
destructor tp_dealloc |
(destructor)range_dealloc →141 |
reprfunc tp_repr |
(reprfunc)range_repr →615 |
PySequenceMethods *tp_as_sequence |
&range_as_sequence →604 |
PyMappingMethods *tp_as_mapping |
&range_as_mapping →665 |
hashfunc tp_hash |
(hashfunc)range_hash →497 |
getattrofunc tp_getattro |
PyObject_GenericGetAttr |
unsigned long tp_flags |
Py_TPFLAGS_DEFAULT |
const char *tp_doc |
range_doc |
richcmpfunc tp_richcompare |
range_richcompare →467 |
getiterfunc tp_iter |
range_iter →1076 |
struct PyMethodDef *tp_methods |
range_methods →684 |
struct PyMemberDef *tp_members |
range_members →692 |
newfunc tp_new |
range_new →66 |
By the way, PyVarObject_HEAD_INIT seems to be related to defining the initial segment of PyObject
.
rangeiterobject
740〜753: struct rangeiterobject
The definition is similar to rangeobject
, but slightly different.
755〜764: rangeiter_next(rangeiterobject *r)
Get the next value by adding index * step to start. It is being cast to unsigned and returned to avoid overflow.
766〜770: rangeiter_len(rangeiterobject *r)
Calculate the length with len --index.
772〜773: PyDoc
length_hint_doc. It seems to be a private method that gives an estimate of the length of the list.
775〜802: rangeiter_reduce(rangeiterobject *r)
Function for pickle. It takes more time than range object because there is conversion to PyLong
.
804〜817: rangeiter_setstate(rangeiterobject *r, PyObject *state)
Set the value in index.
819〜820: PyDoc
doc for reduce and setstate. Both seem to be pickle related.
822〜830: rangeiter_methods
Built-in | Implementation in C |
---|---|
__length_hint__ | rangeiter_len →766 |
__reduce__ | rangeiter_reduce →775 |
__setstate__ | rangeiter_setstate → 804 |
832〜863: PyRangeIter_Type
PyTypeObject |
PyRangeIter_Type |
---|---|
const char *tp_name |
"range_iterator" |
Py_ssize_t tp_basicsize |
sizeof(rangeiterobject) |
destructor tp_dealloc |
(destructor)PyObject_Del |
getattrofunc tp_getattro |
PyObject_GenericGetAttr |
unsigned long tp_flags |
Py_TPFLAGS_DEFAULT |
getiterfunc tp_iter |
PyObject_SelfIter |
iternextfunc tp_iternext |
(iternextfunc)rangeiter_next →755 |
struct PyMethodDef *tp_methods |
rangeiter_methods →822 |
I wonder why there are PyObject_Del and PyObject_DEL ...
865〜891: get_len_of_range(long lo, long hi, long step)
Returns the number of items in range (lo, hi, step).
/* -------------------------------------------------------------
If step > 0 and lo >= hi, or step < 0 and lo <= hi, the range is empty.
Else for step > 0, if n values are in the range, the last one is
lo + (n-1)*step, which must be <= hi-1. Rearranging,
n <= (hi - lo - 1)/step + 1, so taking the floor of the RHS gives
the proper value. Since lo < hi in this case, hi-lo-1 >= 0, so
the RHS is non-negative and so truncation is the same as the
floor. Letting M be the largest positive long, the worst case
for the RHS numerator is hi=M, lo=-M-1, and then
hi-lo-1 = M-(-M-1)-1 = 2*M. Therefore unsigned long has enough
precision to compute the RHS exactly. The analysis for step < 0
is similar.
---------------------------------------------------------------*/
The comment description is longer than the code. The point is just dividing the section by the number of steps.
893〜915: fast_range_iter(long start, long stop, long step)
Make a rangeiter. If it is longer than C long type, NULL is returned as an error.
longrangeiterobject
917〜923: struct longrangeiterobject
This is PyObject
.
925〜929: longrangeiter_len(longrangeiterobject *r, PyObject *no_args)
Use PyNumber_Subtract to subtract the index from len.
931〜958: longrangeiter_reduce(longrangeiterobject *r)
For pickle. Calculate stop and calculate the value by creating a range object with make_range_object.
960〜987: longrangeiter_setstate(longrangeiterobject *r, PyObject *state)
Put state in index.
989〜997: longrangeiter_methods
Built-in | Implementation in C |
---|---|
__length_hint__ | longrangeiter_len →925 |
__reduce__ | longrangeiter_reduce →931 |
__setstate__ | longrangeiter_setstate → 960 |
999〜1007: longrangeiter_dealloc(longrangeiterobject *r)
References are deleted in the order of struct member → longrangeiterobject
.
1009〜1041: longrangeiter_next(longrangeiterobject *r)
After all, it is calculated by adding step * index to start.
1043〜1074: PyLongRangeIter_Type
PyTypeObject |
PyRangeIter_Type |
---|---|
const char *tp_name |
"longrange_iterator" |
Py_ssize_t tp_basicsize |
sizeof(longrangeiterobject) |
destructor tp_dealloc |
(destructor)longrangeiter_dealloc →999 |
getattrofunc tp_getattro |
PyObject_GenericGetAttr |
unsigned long tp_flags |
Py_TPFLAGS_DEFAULT |
getiterfunc tp_iter |
PyObject_SelfIter |
iternextfunc tp_iternext |
(iternextfunc)longrangeiter_next →1009 |
struct PyMethodDef *tp_methods |
longrangeiter_methods →989 |
1076〜1132: range_iter(PyObject *seq)
There is a pattern to make a rangeiter from rangeobject
seq and a pattern to make a longrangeiter. The boundary is whether start, stop, and range can be represented by C long type.
1134〜1246: range_reverse(PyObject *seq)
This is also checked to see if it fits in a long. Since it is reverse, step also needs to check -step. The rest is generated by replacing start and stop well.
It was a long time, but I feel like I somehow understood the atmosphere of CPython. I read a decent C language for the first time and learned it. (I just did Shoboi C as a procedural language at university)
So, after all, what is it like Python? The feeling of Python is difficult.
Recommended Posts