The Python for statement is a so-called iteration statement. It is often used to take each element of a list in order from the beginning and process it.
The syntax is as follows.
for x in [1, 2, 3, 4, 5]:
    ...  # process x here
If you put a list after in, the loop body can process each element in turn.
Besides lists, range(1, 6) and (1, 2, 3, 4, 5) are also commonly used.
Whether you use [1, 2, 3, 4, 5], range(1, 6), or (1, 2, 3, 4, 5) in the syntax example above, x is assigned 1, 2, 3, 4, 5 in order.
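The following small sketch loops over all three objects with the same for statement and records what x receives each time. The names `sources`, `collected`, and `results` are only for illustration.

```python
# Loop over a list, a range object, and a tuple with the same for statement.
sources = [[1, 2, 3, 4, 5], range(1, 6), (1, 2, 3, 4, 5)]

results = {}
for source in sources:
    collected = []
    for x in source:
        collected.append(x)  # x receives 1, 2, 3, 4, 5 in order
    results[type(source).__name__] = collected

for name, values in results.items():
    print(name, values)
# list [1, 2, 3, 4, 5]
# range [1, 2, 3, 4, 5]
# tuple [1, 2, 3, 4, 5]
```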
It's easy to picture how [1, 2, 3, 4, 5] and (1, 2, 3, 4, 5) are processed.
With range(1, 6), however, 2 through 5 are assigned to x even though they are never written out.
Why?
"Because range is a function, and if you pass 1 and 6 as arguments, [1, 2, 3, 4, 5] is created?"
I see why you would think so. But what range creates is actually not a list. It's not a tuple either.
It's a range object.
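This is easy to confirm in the interpreter. A minimal check, where the variable name `r` is just for illustration:

```python
r = range(1, 6)

print(type(r).__name__)      # range
print(isinstance(r, list))   # False
print(isinstance(r, tuple))  # False

# The individual values only appear once something iterates over the range object.
print(list(r))               # [1, 2, 3, 4, 5]
```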
Hmm ... okay ... so?
... Don't you wonder why it bothers to create a range object? Don't you wonder why it isn't a list? No?
A list would be more intuitive for beginners, so this is a trap that comes from believing that "users don't have to worry about what's inside the black box." Let's update our intuition here.
The reason for creating a range object, and the generalized idea behind it, **iteration**, makes programming more fun once you master it, so please do.
So far, we have put three different kinds of object after in in the for statement:
list, tuple, and range objects.
All three can be looped over with a for statement.
But what about the number 123456?
for i in 123456:
    ...  # never reached: TypeError is raised first
Unfortunately, this raises an error. What makes the difference?
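To see the error concretely, it can be caught and printed. This is only a small sketch; `error_message` is an illustrative name.

```python
try:
    for i in 123456:
        pass
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # 'int' object is not iterable
```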
"Is whatever parses the Python code checking whether the object can be looped over with a for statement?"
That's not wrong, but it doesn't fully explain things.
So where in the object does it look to decide that the for statement can loop over it?
In fact, there is something like a **proof** that marks an object as one that can be looped over with a for statement, and when you put an object after in, the for statement checks for that proof. Objects that carry it are called **iterable** here.
for-statement man: "You have the proof! OK!"
list-chan: "Thank you!"
tuple-chan: "Thank you!"
range-chan: "Thank you!"
Then what is the proof?
When a for statement runs, it internally applies the **iter function** (described later) to the **iterable** given after in, converting it to an **iterator** (described later). It then applies the next function to that iterator **as many times as possible** (discussed below). That is what happens inside the for statement. (There is one more internal pattern, but I will omit it.)
??????
If that didn't make sense yet, make a note of it and read on.
An iterator is an object that returns itself when the iter function is applied to it, and returns some value when the next function is applied to it.
The iter function internally calls the object's __iter__ method.
The next function internally calls the object's __next__ method.
class MyIter:
    def __iter__(self):
        return self

    def __next__(self):
        return 1


my_iter = MyIter()
# iter(my_iter) == my_iter.__iter__() == my_iter
# next(my_iter) == my_iter.__next__()
This alone is already an iterator. (Strictly speaking, the range object is not an iterator itself: it has an __iter__ method but no __next__ method, and applying the iter function to it returns a separate range_iterator, which has both.)
In other words, an object that has both an __iter__ method and a __next__ method is called an iterator.
Put differently, being an iterator requires having an __iter__ method and a __next__ method.
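One way to check this requirement is `collections.abc.Iterator` from the standard library, whose isinstance check looks for both methods. The sketch below repeats `MyIter` so it runs on its own; `OnlyNext` is a hypothetical class added for contrast.

```python
from collections.abc import Iterator


class MyIter:
    def __iter__(self):
        return self

    def __next__(self):
        return 1


class OnlyNext:
    """Hypothetical class with __next__ but no __iter__."""

    def __next__(self):
        return 1


print(isinstance(MyIter(), Iterator))             # True: has both methods
print(isinstance(OnlyNext(), Iterator))           # False: __iter__ is missing
print(isinstance(range(1, 6), Iterator))          # False: range has no __next__
print(isinstance(iter(range(1, 6)), Iterator))    # True: range_iterator has both
```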
This idea of "requiring a certain method" is called an interface in Java or a protocol in Swift. If an object does not meet the requirement, you get an error. Please remember that this kind of **constraint**, imposed on users at the language specification level, matters in many situations.
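As a small illustration of "you get an error", here is a sketch of a class that breaks the requirement: its `__iter__` returns an object without `__next__`. The class name `NotReallyAnIterator` and the flag `failed` are hypothetical.

```python
class NotReallyAnIterator:
    """Hypothetical class: __iter__ returns self, but self has no __next__."""

    def __iter__(self):
        return self


failed = False
try:
    for value in NotReallyAnIterator():
        pass
except TypeError as exc:
    # Python rejects the object because what __iter__ returned is not an iterator.
    failed = True
    print("TypeError:", exc)
```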
The iter function converts the value passed to it into an **iterator**. Internally, it calls the object's __iter__ method. If the object is already an iterator, as explained above, it returns itself.
Objects other than iterators can also have an __iter__ method.
list is not an iterator because it has no __next__ method, but it does have an __iter__ method, and applying the iter function to it returns an iterator called **list_iterator**.
An object whose __iter__ method returns an iterator like this is called an **iterable**. In other words, an object that can be converted to an iterator with the iter function is an iterable.
So yes, list is iterable.
An iterator is itself iterable too, because the iter function returns the iterator itself.
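These relationships can be checked with `collections.abc.Iterable` and `collections.abc.Iterator`. A minimal sketch; the variable names are illustrative, and note that this isinstance check only looks for `__iter__`, so it does not cover the other internal pattern of the for statement that this article omits.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
numbers_iter = iter(numbers)

print(isinstance(numbers, Iterable))       # True: list has __iter__
print(isinstance(numbers, Iterator))       # False: list has no __next__
print(type(numbers_iter).__name__)         # list_iterator
print(isinstance(numbers_iter, Iterator))  # True
print(iter(numbers_iter) is numbers_iter)  # True: an iterator returns itself
print(isinstance(123456, Iterable))        # False: int has no __iter__
```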
Let me explain what happens inside the for statement once more.
The for statement applies the iter function to the iterable given after in, converting it to an iterator, and then applies the **next function** to that iterator as many times as possible.
"As many times as possible" means that without an end the loop would run forever, so most iterators implement an end (though infinite iterators exist too).
Having an end means that each call to the next function moves the object's state closer to that end.
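As an example of an iterator with no end, the standard library's `itertools.count` keeps producing numbers forever. The sketch below uses `itertools.islice` to take only the first few values, since looping over it directly would never finish.

```python
from itertools import count, islice

naturals = count(1)                      # 1, 2, 3, ... with no end
first_five = list(islice(naturals, 5))   # stop after five values
print(first_five)                        # [1, 2, 3, 4, 5]
```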
Here is a concrete example.
list_iter = iter([1,2,3,4,5])
a = next(list_iter)
# a => 1
b = next(list_iter)
# b => 2
c = next(list_iter)
# c => 3
d = next(list_iter)
# d => 4
e = next(list_iter)
# e => 5
list is iterable, and applying the iter function to it returns an iterator called list_iterator.
Each time the next function is applied to list_iterator, it hands out the next waiting element.
(You don't need to worry about the actual implementation here, but for now you can picture list_iterator as holding an index internally: applying the next function returns the element at the current index and then adds 1 to the index.)
The last element, 5, has been assigned to e. So what happens if we apply the next function once more?
f = next(list_iter)
# => StopIteration
An exception called **StopIteration** was raised. If you don't handle the exception, the program ends here.
"As many times as possible" means until this StopIteration is raised.
In other words, when StopIteration is raised inside a for statement, the for statement treats it as the end and exits the loop.
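Putting the pieces together, the sketch below shows an index-based iterator like the one imagined above, plus a function that does by hand roughly what a for statement does. Both `SimpleListIterator` and `manual_for` are hypothetical names for illustration; this is not CPython's actual implementation.

```python
class SimpleListIterator:
    """Hypothetical stand-in for list_iterator, built around an index."""

    def __init__(self, items):
        self._items = items
        self._index = 0  # position of the next element to hand out

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._items):
            raise StopIteration  # nothing left to hand out
        value = self._items[self._index]
        self._index += 1
        return value


def manual_for(iterable, body):
    """Roughly what `for x in iterable: body(x)` does internally."""
    iterator = iter(iterable)      # iterable -> iterator
    while True:
        try:
            x = next(iterator)     # apply next as many times as possible
        except StopIteration:
            break                  # StopIteration means the loop is over
        body(x)


seen = []
manual_for(SimpleListIterator([1, 2, 3, 4, 5]), seen.append)
print(seen)  # [1, 2, 3, 4, 5]

seen_from_range = []
manual_for(range(1, 6), seen_from_range.append)
print(seen_from_range)  # [1, 2, 3, 4, 5]
```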
Iterators are useful in many ways, and there is no single reason for using them.
There are various angles: semantics, lazy evaluation, memory efficiency, constraints, generalization, and so on.
Explaining everything would be very hard, so I will focus on the benefit that is easiest to see, memory efficiency. I will also touch on semantics, which I personally like, and on the benefits of generalization.
The range object makes a good example for explaining memory efficiency.
This is the answer to the first question, "Why bother creating a range object?"
Suppose the range function did create a list.
And suppose you want to work with the range from 1 to 100,000,000.
Then range(1, 100000001) would have to hold the list [1, 2, ..., 100000000].
This puts a lot of pressure on memory. If an int takes 64 bits, this list would take 6,400,000,000 bits (= 800,000,000 bytes ≈ 800 MB). (That's not exact, but it's definitely going to be ridiculously big.)
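The difference can be measured roughly with `sys.getsizeof`. The sketch below uses a smaller n so that it runs quickly, and the exact byte counts are CPython implementation details that vary by version and platform. Note that `sys.getsizeof` on a list counts only the list's own array of references, not the int objects it points to, so the real cost of the list is even higher.

```python
import sys

n = 1_000_000  # smaller than 100,000,000 so the sketch runs quickly

as_list = list(range(1, n + 1))
as_range = range(1, n + 1)

list_bytes = sys.getsizeof(as_list)    # the list's reference array only
range_bytes = sys.getsizeof(as_range)  # a few fixed fields, independent of n

print(f"list : {list_bytes:,} bytes (plus the int objects it refers to)")
print(f"range: {range_bytes:,} bytes")
```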
The way to solve this memory pressure problem is to have range produce its values one at a time, the way an iterator does, instead of building a list.
Specifically, it holds a "start value", a "current value" (initially the start value), and an "end value". Each call to the next function returns the current value and then adds 1 to it, and that alone is enough to repeat. If it raises **StopIteration** when the current value reaches the end value, the for statement ends. (In effect, this is the same as a C-style for statement.)
I think an implementation would look something like this. (It is not equivalent to range.)
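class Range:
    def __init__(self, start, end):
        self.i = start    # current value, initially the start value
        self.end = end    # end value (exclusive)

    def __iter__(self):
        return self

    def __next__(self):
        # >= rather than == so that start > end stops at once instead of looping forever
        if self.i >= self.end:
            raise StopIteration
        x = self.i
        self.i += 1
        return x


for i in Range(1, 100):
    print(i)
# 1
# 2
# ...
# 99
# Ends here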
The memory this iterator holds is about 192 bits, counting the internal values self.i, self.end, x, and so on. (This is a very rough figure that glosses over a lot, but it's definitely smaller.)
Compared to the list, the difference is like the distance from the earth to the moon versus the width of a tennis court.
Implementing an iterator often saves memory compared to creating a new list.
This is the memory efficiency advantage of iterators.
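One way the Range class above differs from the built-in range is that it is its own iterator, so it is used up after a single loop, whereas a range object can be looped over repeatedly. Below is a sketch of a variant that stores only the start and end values and hands out a fresh iterator for every loop. The name `ReusableRange` is hypothetical, and the `__iter__` method is written as a generator function for brevity.

```python
class ReusableRange:
    """Hypothetical iterable (not an iterator): every loop gets a fresh iterator."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        # A generator function: each call returns a new iterator object.
        i = self.start
        while i < self.end:
            yield i
            i += 1


r = ReusableRange(1, 4)
first_pass = list(r)
second_pass = list(r)
print(first_pass)   # [1, 2, 3]
print(second_pass)  # [1, 2, 3]  (a second pass still works)

b = range(1, 4)
print(list(b), list(b))  # [1, 2, 3] [1, 2, 3]
```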
I personally think it is very important to be able to see all sorts of things as repeatable.
For example, I want the following to be iterable:
These should come out one after another in a for statement. I want them to. That's how I feel.
Even for classes I write myself, there are many situations where the code would be much cleaner if the object could be looped over with a for statement.
If a function can take any iterable as an argument, it becomes more versatile.
For example, it should be possible to convert any iterable to a list. Defining a separate version for each type, such as list(items: dict), list(items: tuple), and so on, is tedious. So it would be nice to be able to specify "any type that carries the iterable proof" instead.
x = list(iterable)
In fact, list is implemented this way.
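For instance, the same `list(...)` call accepts a dict, a tuple, a range object, or a string, because each of them is iterable. The variable names below are only for illustration.

```python
from_dict = list({"a": 1, "b": 2})  # iterating a dict yields its keys
from_tuple = list((1, 2, 3))
from_range = list(range(1, 4))
from_str = list("abc")              # iterating a str yields its characters

print(from_dict)   # ['a', 'b']
print(from_tuple)  # [1, 2, 3]
print(from_range)  # [1, 2, 3]
print(from_str)    # ['a', 'b', 'c']
```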
As a more elegant example, consider heapsort:
sorted_list = list(heap)
Wouldn't it be beautiful if it could be expressed like this?
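Here is one way this could be made to work, as a sketch only. The `Heap` class is hypothetical and wraps the standard library's `heapq` module; iterating over it pops the smallest remaining item each time, so the heap is emptied in the process.

```python
import heapq


class Heap:
    """Hypothetical min-heap wrapper; iterating pops items in ascending order."""

    def __init__(self, items):
        self._items = list(items)
        heapq.heapify(self._items)

    def __iter__(self):
        return self

    def __next__(self):
        if not self._items:
            raise StopIteration  # heap is empty: iteration is over
        return heapq.heappop(self._items)


heap = Heap([5, 1, 4, 2, 3])
sorted_list = list(heap)
print(sorted_list)  # [1, 2, 3, 4, 5]
```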
In fact, many Python functions accept an iterable: max, min, map, the join method of str, the reduce function in the functools module, and many more.
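A quick sketch passing the same range object, which is iterable but not a list, to each of these. The variable names are illustrative.

```python
from functools import reduce

numbers = range(1, 6)  # any iterable would do here

largest = max(numbers)
smallest = min(numbers)
as_strings = list(map(str, numbers))
joined = "-".join(map(str, numbers))
total = reduce(lambda a, b: a + b, numbers)

print(largest)     # 5
print(smallest)    # 1
print(as_strings)  # ['1', '2', '3', '4', '5']
print(joined)      # 1-2-3-4-5
print(total)       # 15
```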
If you bring the abstract idea of "repeatable" to your work, you may find yourself asking, "Isn't this business logic a kind of iterable?" (Citation needed)
Turn-based games such as Othello can also be viewed as iterables. So an implementation (expression) like the following is possible.
othello = Othello()  # Othello is a hypothetical class, not defined here
for turn in othello:
    point = input()
    turn.put(point)
# If the board fills up or the game ends midway, raising StopIteration exits the for statement.
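To make the idea concrete, here is a heavily simplified sketch. `SimpleBoardGame` and `Turn` are hypothetical classes with none of Othello's actual rules (no flipping, a one-dimensional board), and scripted moves replace `input()` so the example runs without interaction. The only point is that raising StopIteration from `__next__` ends the for statement.

```python
class Turn:
    """Hypothetical turn object: lets the current player place one stone."""

    def __init__(self, game, player):
        self._game = game
        self.player = player

    def put(self, point):
        self._game.board[point] = self.player


class SimpleBoardGame:
    """Hypothetical stand-in for Othello: the game ends when the board is full."""

    def __init__(self, size=4):
        self.board = [None] * size
        self._turn_count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if all(cell is not None for cell in self.board):
            raise StopIteration  # board is full: the for statement ends
        player = "black" if self._turn_count % 2 == 0 else "white"
        self._turn_count += 1
        return Turn(self, player)


game = SimpleBoardGame(size=4)
moves = iter([0, 3, 1, 2])  # scripted moves instead of input()
for turn in game:
    point = next(moves)
    turn.put(point)

print(game.board)  # ['black', 'black', 'white', 'white']
```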
The concept of an iterator (sometimes translated as **repeater**) is common to many programming languages, and is well known as "an object that can be given to a **for each statement**". If you don't know about iterators, you tend to think that the for each statement is syntax reserved for a limited set of things such as ArrayList. In fact, in most such languages, if you implement the proof (interface) that marks a class as usable with for each, your own class can be looped over with a for each statement too.