Python iterators and generators

Let's summarize Python's iterators and generators.

(Update 2018.12.25: Fully rewritten for Python 3 syntax)

- Iterator: An interface for retrieving elements one at a time (https://docs.python.org/2/library/stdtypes.html#iterator-types)
- Generator: A kind of iterator that computes each element at the moment it is requested. In Python, the term usually refers to an implementation that uses the yield statement.

Every built-in Python collection (list, tuple, set, dict, etc.) can be iterated over, but iterating over a built-in collection requires populating it with all of its elements beforehand. In cases like the following, you may want to implement an iterator or generator yourself:

- The iteration repeats infinitely
- Computing or fetching all the elements in advance is impractical in terms of computation cost, processing time, memory usage, and so on

Implementation of iterators by class

When you put an object in a context that expects an iterator, such as a for ... in loop, the object's __iter__() method is called first and must return an iterator. The __next__() method of that returned object is then called repeatedly until it raises a StopIteration exception.

The following example behaves no differently from an ordinary list, but it shows an implementation that returns the numbers given at instantiation one by one, in order.

sample1.py


class MyIterator(object):
	def __init__(self, *numbers):
		self._numbers = numbers
		self._i = 0
	def __iter__(self):
		# __next__() is implemented on self, so return self as-is
		return self
	def __next__(self):  # In Python 2, define this as next(self) instead
		if self._i == len(self._numbers):
			raise StopIteration()
		value = self._numbers[self._i]
		self._i += 1
		return value
        
my_iterator = MyIterator(10, 20, 30)
for num in my_iterator:
	print('hello %d' % num)

The output is:

hello 10
hello 20
hello 30

In this example, __iter__() returns self, but when the iteration logic is likely to become complicated, you can also implement a separate class dedicated to iteration and return an instance of it from __iter__().
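As a minimal sketch of that separate-class approach, the example below splits the work in two. NumberCollection and NumberIterator are hypothetical names made up for illustration. Because each call to __iter__() builds a new iterator, the same collection can be looped over more than once.

```python
class NumberIterator:
    """Hypothetical helper: holds the iteration state for one pass."""

    def __init__(self, numbers):
        self._numbers = numbers
        self._i = 0

    def __iter__(self):
        # Iterators conventionally return themselves from __iter__().
        return self

    def __next__(self):
        if self._i == len(self._numbers):
            raise StopIteration
        value = self._numbers[self._i]
        self._i += 1
        return value


class NumberCollection:
    """Hypothetical iterable: creates a fresh iterator on every request."""

    def __init__(self, *numbers):
        self._numbers = numbers

    def __iter__(self):
        return NumberIterator(self._numbers)


collection = NumberCollection(10, 20, 30)
first_pass = list(collection)
second_pass = list(collection)  # works again: the state lives in the iterator
print(first_pass, second_pass)
```

Keeping the position counter out of the collection itself is the design choice here: two loops over the same collection do not interfere with each other.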

Using the built-in function iter(), you can see that built-in types such as list are implemented according to the same rule.

>>> hoge = [1, 2, 3]
>>> hoge_iter = iter(hoge)
>>> hoge_iter.__next__()
1
>>> hoge_iter.__next__()
2
>>> hoge_iter.__next__()
3
>>> hoge_iter.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Iterator summary

- The __iter__() method is called when an iterator is requested for an object.
- The __next__() method returns a new value each time it is called.
- The __next__() method raises a StopIteration exception when it is called and there are no more values to return.
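To make these three rules concrete, here is a rough sketch of what a for loop does behind the scenes, written with the built-ins iter() and next(). The function name manual_for is hypothetical, and the real for statement is implemented by the interpreter rather than by code like this.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next() directly."""
    collected = []
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break                  # no more values: leave the loop quietly
        collected.append(item)
    return collected


print(manual_for([1, 2, 3]))
```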

Implementing a generator using yield

yield can be hard to understand until you get used to it, but how it works is simple. There is no need to define a class when implementing a generator with yield. Let's define the following generator function.

my_generator.py



def my_generator():
	yield 1
	yield 2
	yield 3

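This is a generator that produces the three values 1, 2, and 3 in order. Note that a return statement inside a generator function does not pass a value to the loop the way yield does: it ends the generator by raising StopIteration (since Python 3.3, any return value is attached to that exception).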

Generators are often used in the following cases where computational costs are an issue.

- Not all of the values to iterate over can be computed in advance
- Each element should be computed only when it is needed, to save computation cost
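As an illustration of the first case, the sketch below defines an endless sequence. The fibonacci function is a hypothetical example, not something from this article; itertools.islice is used to take only the first ten values, because the generator never finishes by itself.

```python
from itertools import islice


def fibonacci():
    """Hypothetical infinite generator: yields Fibonacci numbers forever."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Each number is computed only when the consumer asks for it, so no list of "all" values ever has to exist.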

Calling a generator function returns a generator object, which is an iterator.

gen = my_generator()
gen.__next__()  # 1
gen.__next__()  # 2
gen.__next__()  # 3
gen.__next__()  # StopIteration
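In everyday code, the built-in next() function is normally used instead of calling __next__() directly. The short sketch below shows the same steps written that way. It also shows that next() accepts an optional default, which is returned instead of raising StopIteration.

```python
def my_generator():
    yield 1
    yield 2
    yield 3


gen = my_generator()
values = [next(gen), next(gen), next(gen)]
print(values)

# With a second argument, next() returns that default instead of raising StopIteration.
after_end = next(gen, None)
print(after_end)
```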

yield hands control back to the code that called __next__(). Let's trace the flow of execution with print calls, as shown below.

generator_sample.py



def my_generator():
	print('before yield')
	yield 1
	print('yielded 1')
	yield 2
	print('yielded 2')
	yield 3
	print('yielded 3, finished')

def main():
	gen = my_generator()
	print('start')
	v1 = gen.__next__()
	print('called __next__(), v1=%s' % v1)
	v2 = gen.__next__()
	print('called __next__(), v2=%s' % v2)
	v3 = gen.__next__()
	print('called __next__(), v3=%s' % v3)
	v4 = gen.__next__()  # should be exception

main()

The execution result is as follows.

start
before yield
called __next__(), v1=1
yielded 1
called __next__(), v2=2
yielded 2
called __next__(), v3=3
yielded 3, finished
Traceback (most recent call last):
  File "./generator_sample.py", line 21, in <module>
    main()
  File "./generator_sample.py", line 19, in main
    v4 = gen.__next__()  # should be exception
StopIteration

Generator summary

- Use yield to implement a generator
- A value is produced each time a yield is executed
- A return statement in a generator function ends the iteration instead of producing a value
- Calling a generator function returns an iterator object

list / tuple / set / list comprehension and iterator

Iterators work easily with built-in features such as list / tuple / set / list comprehensions.

For example, passing the simple generator above to the list() function converts it into a list object with the values [1, 2, 3]. The same is true for tuple and set.


def my_generator():
	yield 1
	yield 2
	yield 3

def my_generator2():
	yield 10
	yield 10
	yield 20

print(list(my_generator()))  # => [1, 2, 3]
print([v * 2 for v in my_generator()])  # => [2, 4, 6]
print(set(my_generator2()))  # => {10, 20}

Of course, this applies not only to generators implemented with yield: iterators implemented with __next__() and return work with these built-in features as well.
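As a small check of that claim, here is a sketch with a class-based iterator passed to list(), a list comprehension, and sum(). The Countdown class is a hypothetical example written for this purpose.

```python
class Countdown:
    """Hypothetical class-based iterator that counts down from start to 1."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


as_list = list(Countdown(3))
doubled = [v * 2 for v in Countdown(3)]
total = sum(Countdown(4))
print(as_list, doubled, total)
```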

itertools

Python has a library called itertools that lets you easily perform various operations by combining iterator objects, so I'll mention it here. It is mostly used with built-in data such as list / tuple rather than with iterators you implement yourself, so it is handy even if you never write an iterator.

For example, it is easy to list all combinations of [1, 2, 3] and ['a', 'b'] (http://qiita.com/tomotaka_ito/items/5a545423eac654a5b6f5).
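One way to do this is with itertools.product, sketched below. It assumes that "all combinations" means every pairing of one element from each list, that is, the Cartesian product; the linked article may take a different approach.

```python
from itertools import product

numbers = [1, 2, 3]
letters = ['a', 'b']

# product() yields one tuple per pairing, varying the rightmost input fastest.
pairs = list(product(numbers, letters))
print(pairs)
# [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]
```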

more_itertools

There is also a third-party library on PyPI called more_itertools, which is not part of the standard library. It contains many useful functions that are not included in itertools, such as chunked, which groups elements into chunks of N, and ilen, which counts the elements by running through an iterator.

You need to install it to use it.

$ pip install more-itertools
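To give a feel for what chunked and ilen do without installing anything, here is a rough approximation using only the standard library. The names chunked_sketch and ilen_sketch are hypothetical stand-ins; this is not how more_itertools implements them, and the real functions may differ in details.

```python
from itertools import islice


def chunked_sketch(iterable, n):
    """Hypothetical stand-in for chunked: yield lists of up to n items."""
    iterator = iter(iterable)
    chunk = list(islice(iterator, n))
    while chunk:
        yield chunk
        chunk = list(islice(iterator, n))


def ilen_sketch(iterable):
    """Hypothetical stand-in for ilen: count items by consuming the iterable."""
    return sum(1 for _ in iterable)


print(list(chunked_sketch(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
print(ilen_sketch(x for x in range(10) if x % 2 == 0))  # 5
```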

There is also a Qiita article that introduces and explains itertools / more_itertools, which you may find helpful.

Use the generator over and over again

Once a generator has been consumed by a for loop, it yields no elements in the second and subsequent for loops.
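This behavior is easy to confirm with a short sketch that iterates over the same generator object twice:

```python
def my_generator():
    yield 1
    yield 2
    yield 3


gen = my_generator()
first_pass = [v for v in gen]
second_pass = [v for v in gen]  # the same generator object is already exhausted
print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
```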

If you want to be able to iterate over a generator as many times as you like without side effects, you may find the technique described in "I want to iterate a Python generator many times" useful.
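One common way to get this behavior, which is not necessarily the technique from that article, is to wrap the generator function in a small class whose __iter__() calls the function again each time. The names Reiterable and squares below are hypothetical and exist only for this sketch.

```python
class Reiterable:
    """Hypothetical wrapper: calls the generator function anew for each iteration."""

    def __init__(self, generator_function, *args, **kwargs):
        self._generator_function = generator_function
        self._args = args
        self._kwargs = kwargs

    def __iter__(self):
        # A fresh generator object is created every time iteration starts.
        return self._generator_function(*self._args, **self._kwargs)


def squares(limit):
    for n in range(limit):
        yield n * n


numbers = Reiterable(squares, 4)
first_pass = list(numbers)
second_pass = list(numbers)
print(first_pass, second_pass)
```

Note that this re-runs the generator function on every pass, so any expensive work inside it is repeated as well.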

Summary

- The iterator interface in Python is __next__(), and a StopIteration exception signals that there are no more elements
- __iter__() is called when an object is evaluated in a context that expects an iterator
- Generators are a kind of iterator and are implemented using yield
- A generator function returns an iterator object when called
- Implementing an iterator can be convenient when the iteration repeats infinitely or when it is not possible to compute everything in advance
- Iterators are even more convenient when combined with list / tuple / set / list comprehensions, etc.
- itertools is also very convenient
