Let's summarize Python's iterators and generators.
(Update 2018.12.25: completely rewritten using Python 3 syntax)
--Iterator: an interface for retrieving elements one at a time (https://docs.python.org/2/library/stdtypes.html#iterator-types)
--Generator: a kind of iterator that computes each element at the moment it is requested. In Python, the term usually refers to an implementation that uses the yield statement.
Every built-in Python collection (list, tuple, set, dict, etc.) can be iterated over, but the collection must be fully populated before iteration starts. In cases like the following, you may want to implement an iterator or generator yourself:
--The iteration repeats indefinitely
--Computing or fetching all elements in advance is impractical in terms of computational cost, processing time, or memory usage
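As a rough illustration of the first case, here is a minimal sketch of an iterator that never ends. The class name `EvenNumbers` is made up for this example, and `itertools.islice` from the standard library is used only to cut the infinite sequence off after five values. The `__iter__()` and `__next__()` methods it defines are explained in the part that follows.

```python
from itertools import islice


class EvenNumbers:
    """Hypothetical iterator that produces 0, 2, 4, ... without ever stopping."""

    def __init__(self):
        self._current = 0

    def __iter__(self):
        return self

    def __next__(self):
        value = self._current
        self._current += 2
        return value  # never raises StopIteration, so the sequence is infinite


# islice() takes only the first five values, so this terminates.
first_five = list(islice(EvenNumbers(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```

A list could not hold this sequence, because it would have to be filled completely before the loop starts.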
When you use an object in a context that expects an iterator, such as `for ... in`, the object's `__iter__()` method is called first and must return an iterator. The `__next__()` method of that returned object is then called repeatedly until it raises a `StopIteration` exception.
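To make this flow concrete, the sketch below writes out by hand roughly what a `for` loop does. It uses the built-in functions `iter()` and `next()`, which call `__iter__()` and `__next__()` respectively; the variable names are only for illustration.

```python
numbers = [10, 20, 30]

# Roughly what "for num in numbers:" does behind the scenes.
iterator = iter(numbers)          # calls numbers.__iter__()
collected = []
while True:
    try:
        num = next(iterator)      # calls iterator.__next__()
    except StopIteration:
        break                     # the for loop ends silently at this point
    collected.append(num)

print(collected)  # [10, 20, 30]
```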
In practice this behaves no differently from an ordinary list, but as an implementation example, the following iterator returns the numbers given at instantiation in order.
sample1.py
class MyIterator:
    def __init__(self, *numbers):
        self._numbers = numbers
        self._i = 0

    def __iter__(self):
        # This object implements __next__() itself, so return self as is
        return self

    def __next__(self):  # In Python 2, define next(self) instead
        if self._i == len(self._numbers):
            raise StopIteration()
        value = self._numbers[self._i]
        self._i += 1
        return value


my_iterator = MyIterator(10, 20, 30)
for num in my_iterator:
    print('hello %d' % num)
The output is:
hello 10
hello 20
hello 30
In this example `__iter__()` returns self, but when the iteration logic is likely to become complicated, you can instead implement a separate iterator class and have `__iter__()` create and return an instance of it.
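A minimal sketch of that alternative might look like the following. The class names `NumberCollection` and `NumberCollectionIterator` are invented for this example; the point is only that the container holds the data while a separate object tracks the position.

```python
class NumberCollection:
    """Hypothetical iterable: holds the data, creates a new iterator per loop."""

    def __init__(self, *numbers):
        self._numbers = numbers

    def __iter__(self):
        # Return a fresh iterator object each time instead of self.
        return NumberCollectionIterator(self._numbers)


class NumberCollectionIterator:
    """Hypothetical iterator: holds only the position for a single pass."""

    def __init__(self, numbers):
        self._numbers = numbers
        self._i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._i == len(self._numbers):
            raise StopIteration
        value = self._numbers[self._i]
        self._i += 1
        return value


collection = NumberCollection(10, 20, 30)
first_pass = list(collection)
second_pass = list(collection)  # works again: __iter__ created a new iterator
print(first_pass, second_pass)  # [10, 20, 30] [10, 20, 30]
```

One consequence of this design is that the same collection can be looped over more than once, since each loop gets its own position counter.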
Using the built-in function `iter()`, you can see that built-in types such as list follow the same protocol.
>>> hoge = [1, 2, 3]
>>> hoge_iter = iter(hoge)
>>> hoge_iter.__next__()
1
>>> hoge_iter.__next__()
2
>>> hoge_iter.__next__()
3
>>> hoge_iter.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
--The `__iter__()` method is called when an iterator is requested from an object
--The `__next__()` method returns the next value each time it is called
--The `__next__()` method raises a `StopIteration` exception when there are no more values to return
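In everyday code, `__next__()` is usually not called directly; the built-in function `next()` does the same thing and can also take a default value to return when the iterator is exhausted. A small sketch, with variable names chosen only for illustration:

```python
letters = iter(['a', 'b'])

first = next(letters)         # same as letters.__next__()
second = next(letters)
third = next(letters, None)   # exhausted: returns the default instead of raising

print(first, second, third)  # a b None
```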
`yield` can be hard to grasp until you get used to it, but the mechanism is simple. A generator implemented with `yield` does not require defining a class. Let's define the following generator function.
my_generator.py
def my_generator():
    yield 1
    yield 2
    yield 3
This generator produces the three values `1`, `2`, and `3` in order. Note that **a return statement inside a generator function does not pass a value to the loop; it ends the iteration**.
Generators are often used in cases like the following, where computational cost is a concern:
--Not all of the values to iterate over can be computed in advance
--You want to compute each element only when it is needed, to save computational cost
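As a sketch of both cases, consider a generator over the Fibonacci sequence. The function name `fibonacci` and the cutoff of 50 are arbitrary choices for this example. The sequence has no end, so it cannot be built in advance, and each value is computed only when the loop asks for it.

```python
def fibonacci():
    """Hypothetical infinite generator: each value is computed only on request."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


small_values = []
for value in fibonacci():
    if value > 50:
        break  # stop asking; nothing beyond this point is ever computed
    small_values.append(value)

print(small_values)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```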
Calling a generator function returns a generator object, which is an iterator.
gen = my_generator()
gen.__next__() # 1
gen.__next__() # 2
gen.__next__() # 3
gen.__next__() # StopIteration
`yield` hands control back to the code that called `__next__()`. Let's trace the flow with `print()` calls, as shown below.
generator_sample.py
def my_generator():
    print('before yield')
    yield 1
    print('yielded 1')
    yield 2
    print('yielded 2')
    yield 3
    print('yielded 3, finished')

def main():
    gen = my_generator()
    print('start')
    v1 = gen.__next__()
    print('called __next__(), v1=%s' % v1)
    v2 = gen.__next__()
    print('called __next__(), v2=%s' % v2)
    v3 = gen.__next__()
    print('called __next__(), v3=%s' % v3)
    v4 = gen.__next__() # should be exception

main()
The execution result is as follows.
start
before yield
called __next__(), v1=1
yielded 1
called __next__(), v2=2
yielded 2
called __next__(), v3=3
yielded 3, finished
Traceback (most recent call last):
  File "./generator_sample.py", line 21, in <module>
    main()
  File "./generator_sample.py", line 19, in main
    v4 = gen.__next__() # should be exception
StopIteration
--Use yield to implement a generator
--One value is produced for each yield that executes
--A return statement in a generator function ends the iteration instead of producing a value for the loop
--Calling a generator function returns an iterator object
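As a small check of how `return` behaves inside a generator function in Python 3, the sketch below uses a made-up function `read_until_negative`. A `return` ends the iteration, and the returned value is not delivered to the loop; it is attached to the `StopIteration` exception as its `value` attribute.

```python
def read_until_negative(values):
    """Hypothetical generator: yields values until a negative one appears."""
    for value in values:
        if value < 0:
            return 'stopped early'  # ends the iteration; not yielded to the loop
        yield value
    return 'reached the end'


produced = list(read_until_negative([1, 2, -1, 3]))
print(produced)  # [1, 2]

# The return value travels on the StopIteration exception.
gen = read_until_negative([5])
first = next(gen)
try:
    next(gen)
    return_value = None
except StopIteration as exc:
    return_value = exc.value

print(first, return_value)  # 5 reached the end
```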
Iterables work smoothly with built-in features such as list / tuple / set and list comprehensions.
For example, passing the simple generator above to the `list()` function converts it to a list object with the value `[1, 2, 3]`. The same is true for tuple and set.
def my_generator():
    yield 1
    yield 2
    yield 3


def my_generator2():
    yield 10
    yield 10
    yield 20


print(list(my_generator()))  # => [1, 2, 3]
print([v * 2 for v in my_generator()])  # => [2, 4, 6]
print(set(my_generator2()))  # => {10, 20} (duplicates are removed)
Of course, this applies not only to generators implemented with `yield` but also to iterators implemented with `__next__()` and `return`; they work with the built-in features in the same way.
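For instance, a class-based iterator can be passed to the same built-ins. The `Countdown` class below is a hypothetical example written only to show this; each call creates a new instance because a single instance can be consumed only once.

```python
class Countdown:
    """Hypothetical iterator counting down from start to 1."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


as_list = list(Countdown(3))
as_tuple = tuple(Countdown(3))
doubled = [v * 2 for v in Countdown(3)]
total = sum(Countdown(3))
print(as_list, as_tuple, doubled, total)  # [3, 2, 1] (3, 2, 1) [6, 4, 2] 6
```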
itertools
Python has a standard library module called itertools that makes it easy to perform various operations by combining iterator objects, so I'll introduce it here. It is mainly used with built-in data such as list / tuple, so it is handy even if you never implement an iterator yourself.
For example, it is easy to [list all combinations](http://qiita.com/tomotaka_ito/items/5a545423eac654a5b6f5) of `[1, 2, 3]` and `['a', 'b']`.
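Assuming "all combinations" here means every pairing of one element from each list (the Cartesian product), a minimal sketch with `itertools.product` could look like this. The linked article may use a different approach.

```python
from itertools import product

numbers = [1, 2, 3]
letters = ['a', 'b']

# product() returns an iterator; list() consumes it to show all pairs.
pairs = list(product(numbers, letters))
print(pairs)
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]
```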
more_itertools
There is also a third-party library on PyPI called more_itertools. It contains many useful functions that are not included in itertools, such as `chunked`, which groups elements into chunks of N, and `ilen`, which counts the elements by consuming an iterator.
You need to install it to use it.
$ pip install more-itertools
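To give a feel for what these two helpers do without requiring the install, here is a rough standard-library approximation. The functions `chunked_sketch` and `ilen_sketch` are hypothetical stand-ins written for this example; they are not the library's implementation, and the real functions may differ in details and options.

```python
from itertools import islice


def chunked_sketch(iterable, n):
    """Hypothetical stand-in for chunked: yield lists of up to n items."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))
        if not chunk:
            return  # no items left, so end the generator
        yield chunk


def ilen_sketch(iterable):
    """Hypothetical stand-in for ilen: count items by consuming the iterable."""
    return sum(1 for _ in iterable)


chunks = list(chunked_sketch(range(7), 3))
count = ilen_sketch(x for x in range(10) if x % 2 == 0)
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
print(count)   # 5
```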
There is also a Qiita article that introduces and explains itertools / more_itertools, which I think you will find helpful.
Once a generator has been consumed by a `for` loop, no elements appear in a second or later `for` loop over it.
If you want to be able to iterate over the output of a generator function as many times as you like without side effects, the technique described in "I want to iterate a Python generator many times" may be useful.
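The sketch below first shows the exhaustion, then one common workaround: wrapping the generator function in an object whose `__iter__()` calls it again for every loop. The `Reiterable` class is a hypothetical helper written for this example and is not necessarily the technique the referenced article describes.

```python
def my_generator():
    yield 1
    yield 2
    yield 3


gen = my_generator()
first_loop = list(gen)
second_loop = list(gen)  # the generator object is already exhausted
print(first_loop, second_loop)  # [1, 2, 3] []


class Reiterable:
    """Hypothetical wrapper: calls the generator function anew on every loop."""

    def __init__(self, generator_function):
        self._generator_function = generator_function

    def __iter__(self):
        return self._generator_function()


numbers = Reiterable(my_generator)
first_again = list(numbers)
second_again = list(numbers)
print(first_again, second_again)  # [1, 2, 3] [1, 2, 3]
```

This only helps when calling the generator function again is safe, that is, when it has no side effects and produces the same values each time.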
--The iterator interface in Python is `__next__()`; a `StopIteration` exception signals that there are no more elements
--`__iter__()` is called when an object is used in a context that expects an iterator
--Generators are a kind of iterator and are implemented using yield
--A generator function returns an iterator object when called
--Implementing an iterator can be useful when the iteration is infinite or when not everything can be computed in advance
--Iterators are even more convenient when combined with list / tuple / set / list comprehensions, etc.
--`itertools` is also very convenient