(I tried to sign up for the Advent Calendar late, but it was already full, so I'm posting this as a regular article.)
There were parts of the language specification around Python iterators and generators that I only thought I understood, and quite a few features had been added without my noticing, so I've summarized them here.
Unless otherwise specified, in this article `it` is a variable that refers to an iterator, `Klass` is a user-defined class, and `x` is a variable that refers to an arbitrary object.
The way to retrieve the next element from an iterator changed between Python 2 and 3. In Python 2 it is `it.next()`, and in Python 3 it is `next(it)` (the built-in `next()` is also available from Python 2.6 onward). If you want to implement an iterator-like class yourself, implement `Klass.next` in Python 2 and `Klass.__next__` in Python 3. (This article uses the Python 3 form from here on.)
An object that implements an `__iter__` method that returns an iterator is called an iterable. For an iterable object, you can create an iterator with `iter(x)`. An iterable can also be specified after `in` in a `for` statement, or on the right-hand side of the `in` operator (the `in` operator tries `__contains__` if it exists, and falls back to `__iter__` if it doesn't). Examples of iterables include list, tuple, dict, str and file objects. An iterator itself is also iterable.
In addition, an object of a class that implements the `__getitem__` method is also iterable. The iterator `iter(x)` created from an object of a class that does not implement `__iter__` but does implement `__getitem__` returns `x[0]`, `x[1]`, ... each time `next` is called, and raises a `StopIteration` exception when `IndexError` is raised.
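As a minimal sketch of the `__getitem__` fallback described above (the class name `Squares` is made up for illustration), the following class defines neither `__iter__` nor `__contains__`, yet it works with `iter()` and with the `in` operator:

```python
class Squares:
    """Hypothetical class: defines only __getitem__, not __iter__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # Raising IndexError is what ends the iteration.
        if index >= self.n:
            raise IndexError(index)
        return index * index


x = Squares(4)
it = iter(x)        # works even though Squares has no __iter__
first = next(it)    # x[0]
rest = list(it)     # x[1], x[2], x[3]; the IndexError from x[4] ends the iteration
found = 9 in x      # no __contains__ or __iter__, so `in` also goes through __getitem__
```

Here `first` is `0`, `rest` is `[1, 4, 9]`, and `found` is `True`.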
If you keep calling `next(it)`, a `StopIteration` exception is eventually raised when there are no more elements.
When implementing `Klass.__next__`, raise a `StopIteration` exception if there is nothing more to return.
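To make this concrete, here is a small sketch of a hand-written iterator. The class `Countdown` is a hypothetical example, not something from the standard library; its `__next__` raises `StopIteration` once it runs out of values.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            # Nothing more to return.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
values = [next(it), next(it), next(it)]

try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
```

After this runs, `values` is `[3, 2, 1]` and `exhausted` is `True`. A `for` loop catches the `StopIteration` for you, so `list(Countdown(2))` simply gives `[2, 1]`.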
A "generator" is a function that looks like a regular function but contains a `yield` statement, and that returns an iterator when called. The generator function itself is not an iterator. A "generator expression" is an expression that returns an iterator; it looks like a list comprehension, but is enclosed in parentheses instead of square brackets.
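The difference between the two forms can be sketched as follows. The function name `squares` is an arbitrary choice for this example.

```python
def squares(n):
    # A generator function: calling it returns an iterator
    # without running the body yet.
    for i in range(n):
        yield i * i


gen = squares(3)
genexp = (i * i for i in range(3))  # generator expression: note the parentheses

# The function itself is not an iterator, but what it returns is.
function_is_iterator = hasattr(squares, "__next__")
result_is_iterator = iter(gen) is gen

from_function = list(gen)
from_expression = list(genexp)
```

Both `from_function` and `from_expression` end up as `[0, 1, 4]`.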
iter(it) is it
When `it` is an iterator, `iter(it)` should return `it` itself. That is, if you implement an iterator, you should define `Klass.__iter__` as something like `def __iter__(self): return self`. In a `for` statement, `for x in it:` and `for x in iter(it):` are expected to be equivalent. The following is an example of what happens when `it` and `iter(it)` are different.
it and iter(it)
import sys
print(sys.version)  # ==> 3.4.1 (default, May 23 2014, 17:48:28) [GCC]
# iter(it) returns it
class Klass:
    def __init__(self):
        self.x = iter('abc')
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.x)
it = Klass()
for x in it:
    print(x)  # ==> 'a', 'b', 'c'
# iter(it) does not return it
class Klass2(Klass):
    def __iter__(self):
        return iter('XYZ')
it = Klass2()
for x in it:
    print(x)  # ==> 'X', 'Y', 'Z'
print(next(it))  # ==> 'a'
When `iter` is called with two arguments, it still returns an iterator, but the behavior is very different. With two arguments, the first argument must be a callable object (a function or another object with a `__call__` method), not an iterable. The returned iterator calls the callable with no arguments each time `next` is called on it, and raises a `StopIteration` exception if the result is equal to the sentinel. Written as a generator in pseudo code, it behaves like this.
2-argument iter
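def iter_(callable, sentinel):
    while True:
        a = callable()
        if a == sentinel:
            # Returning from a generator raises StopIteration for the caller.
            # An explicit `raise StopIteration` here becomes RuntimeError since Python 3.7 (PEP 479).
            return
        yield a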
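Python's official documentation states that this is useful, for example, for reading a file line by line until `readline()` returns an empty string, that is, until the end of the file (a blank line would be returned as `'\n'`, not `''`).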
Quoted from official documentation
with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)
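The same idea works with any sentinel value. The sketch below uses `io.StringIO` as an in-memory stand-in for a file (the contents and the `"END\n"` marker are made up for illustration) and stops at a marker line rather than at the end of the file.

```python
import io

# In-memory stand-in for a file; the contents are made up for illustration.
fp = io.StringIO("alpha\nbeta\nEND\ngamma\n")

# Call fp.readline() repeatedly until it returns exactly "END\n".
lines = [line.rstrip("\n") for line in iter(fp.readline, "END\n")]

# The sentinel line is consumed but not yielded; reading continues after it.
remainder = fp.read()
```

`lines` is `['alpha', 'beta']`, and `remainder` is `'gamma\n'` because the marker line itself has already been read.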
In a generator, if you write
v = (yield x)
you can receive a value in `v` when the generator resumes.
If the generator is resumed by an ordinary `next`, `v` is `None`.
If you call the `send` method instead of `next`, as in `gen.send(a)`, the generator resumes and `v` is set to `a`. `send` then returns the next yielded value, just as `next` does, and raises a `StopIteration` exception if the generator finishes without yielding anything.
The official documentation gives an example of a counter whose value can be changed from outside.
Quoted from official documentation
def counter(maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1
By the way, you can't call `send` right away: you have to call `next` at least once before you can use `send` with a value (`TypeError: can't send non-None value to a just-started generator`). As you can see from where the sent value ends up, this is probably because a generator that has never been advanced with `next` is not yet paused at a `yield`, so there is nowhere for the value to go.
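The behavior described above can be checked with a short session. The sketch repeats the `counter` generator so that it runs on its own; the variable names are arbitrary.

```python
def counter(maximum):
    # Same counter as in the quoted example.
    i = 0
    while i < maximum:
        val = (yield i)
        if val is not None:
            i = val
        else:
            i += 1


gen = counter(10)

try:
    gen.send(5)       # not started yet: nowhere for the value to go
    error = None
except TypeError as exc:
    error = str(exc)

first = next(gen)     # runs to the first yield
jumped = gen.send(7)  # `val` becomes 7, so i = 7 and 7 is yielded
after = next(gen)     # `val` is None, so i += 1
```

Here `error` holds the `TypeError` message, and `first`, `jumped`, `after` are `0`, `7`, `8`. The failed `send` does not break the generator; it can still be started with `next` afterwards.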
generator.throw(type[, value[, traceback]])
Raises an exception at the point where the generator was paused.
If the generator then yields a value, `throw` returns it; if the generator finishes without yielding anything, a `StopIteration` exception is raised. If the generator does not handle the thrown exception, it propagates to the caller as is. (Honestly, I can't think of any effective use for this.)
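For what it's worth, here is a sketch of both outcomes. The generator `worker` is a hypothetical example that handles `ValueError` by resetting its state and lets every other exception through.

```python
def worker():
    # Hypothetical generator that recovers from ValueError and keeps going.
    n = 0
    while True:
        try:
            yield n
            n += 1
        except ValueError:
            n = 0  # reset instead of dying


gen = worker()
a = next(gen)
b = next(gen)
c = gen.throw(ValueError)  # handled inside; the next yielded value is returned
d = next(gen)

try:
    gen.throw(KeyError)    # not handled inside, so it propagates to the caller
    propagated = False
except KeyError:
    propagated = True
```

The values of `a`, `b`, `c`, `d` are `0`, `1`, `0`, `1`, and `propagated` is `True`.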
`generator.close()` raises a `GeneratorExit` exception at the point where the generator was paused. If a `GeneratorExit` or `StopIteration` exception comes out of the generator, `generator.close()` simply ends there. If the generator yields a value instead, a `RuntimeError` is raised. If the generator was already closed, nothing happens. Written as pseudo code, it looks something like this:
generator.close()-like processing
def generator_close(gen):
    try:
        gen.throw(GeneratorExit)
    except (GeneratorExit, StopIteration):
        return
    # The generator yielded a value instead of exiting.
    raise RuntimeError("generator ignored GeneratorExit")
I can't think of a use for this either. There is no guarantee that it will be called, so you can't write code that relies on it being called. Also, I couldn't find any documentation that states this explicitly, but it seems that a `GeneratorExit` exception is raised in the generator when you `break` out of a `for` statement. (In CPython this appears to happen because the generator object is finalized, which calls `close()`, once the loop drops its last reference to it.)
GeneratorExit exception occurs when breaking with for statement
def gen():
    for i in range(10):
        try:
            yield i
        except GeneratorExit:
            print("Generator closed.")
            raise
for i in gen():
    break  # ==> "Generator closed." is printed.
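If cleanup must not depend on when the generator happens to be finalized, one option is to keep a reference to it and call `close()` yourself. The following is only a sketch under that assumption; `reader` and `log` are made-up names, and the "resource" is just a list of strings.

```python
log = []


def reader():
    # Hypothetical generator holding a "resource" (here just entries in a list).
    log.append("open")
    try:
        for i in range(10):
            yield i
    finally:
        # Runs when GeneratorExit is raised at the paused yield.
        log.append("cleanup")


gen = reader()
first = next(gen)
gen.close()  # explicit close: does not depend on garbage collection
gen.close()  # closing an already-closed generator does nothing

try:
    next(gen)
    finished = False
except StopIteration:
    finished = True
```

Afterwards `log` is `['open', 'cleanup']`, with the cleanup recorded only once, and `finished` is `True`.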
By writing `yield from expr` (where `expr` is an expression that returns an iterable) in a generator, you can yield the elements of `expr` one after another.
Ignoring `send`, the following two pieces of code are equivalent.
Delegation to sub-generator
def gen1():
    yield from range(10)
    yield from range(20)
def gen2():
    for i in range(10):
        yield i
    for i in range(20):
        yield i
If `send` is used, the sent value is passed on to the sub-generator.
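A small sketch of that delegation, using two hypothetical generators named `echo` and `outer`: the outer generator never touches the sent values, yet they arrive in the inner one.

```python
def echo():
    # Hypothetical sub-generator: yields back whatever it was last sent.
    value = None
    while True:
        value = yield value


def outer():
    # outer() has no `v = (yield ...)` of its own; sends go straight to echo().
    yield from echo()


gen = outer()
start = next(gen)       # runs echo() to its first yield
first = gen.send("a")   # "a" is received inside echo() and yielded back
second = gen.send("b")
```

`start` is `None`, and `first` and `second` are `'a'` and `'b'`.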
Iterators are used all the time in Python, but my understanding of them had often stalled at things I only thought I knew about the specification, or at outdated knowledge. So I reviewed the language specification against the official documentation. There were quite a few things I didn't know, but to be honest, I can't think of a good use for most of them. If you know of other parts of the specification like these, or if you use these features in some particular way, please write about it in the comments.