Python iterable objects, iterators, and generators

Posted by Scriptor on Thu, 03 Feb 2022 12:09:46 +0100

1, Introduction

1. About iterators and generators

Most programmers treat the concepts of iterator and generator in Python as nearly interchangeable, and even the official Python documentation sometimes uses the two terms as synonyms. There are, however, some differences between them.

The built-in function iter() obtains an iterator from an iterable object, and the built-in function next() retrieves the next value from an iterator.
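
As a minimal sketch of these two built-ins (the list `letters` is sample data made up for illustration), the following shows next() pulling values one at a time and what happens once the iterator is exhausted:

```python
# Obtain an iterator from an iterable with iter(), then pull values with next().
letters = ['a', 'b', 'c']  # hypothetical sample data
it = iter(letters)

first = next(it)
second = next(it)
third = next(it)

# Once the iterator is exhausted, next() raises StopIteration...
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

# ...unless a default value is passed as the second argument.
fallback = next(it, None)

print(first, second, third, exhausted, fallback)
```

The two-argument form next(it, default) is convenient when an empty or exhausted iterator is an expected case rather than an error.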

First, conceptually:
In the classic iterator design pattern, an iterator traverses a collection and yields its elements. The iterator is not expected to modify the values it reads from the data source; it simply yields them as they are. A generator, by contrast, may produce values without traversing any collection.
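
To illustrate the last point, here is a small sketch of a generator that computes its values on the fly instead of reading them from an existing collection. The function name `squares` is made up for this example, and itertools.islice is used only to cut the endless stream short:

```python
import itertools


def squares():
    """Yield 0, 1, 4, 9, ... computed on demand; no collection is traversed."""
    n = 0
    while True:
        yield n * n
        n += 1


# Take only the first five values from the (otherwise endless) generator.
first_five = list(itertools.islice(squares(), 5))
print(first_five)
```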

Second, in terms of interface:
Python's iterator protocol defines two methods, __iter__ and __next__. Generator objects implement both, so from this point of view every generator is an iterator.
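
This can be checked directly. The sketch below assumes a throwaway generator function `gen` and uses the abstract base classes from collections.abc; it also shows that the reverse does not hold, since a list iterator is an iterator but not a generator:

```python
from collections.abc import Generator, Iterator


def gen():
    yield 1


g = gen()

# A generator object has both protocol methods and is recognized as an Iterator.
has_both = hasattr(g, '__iter__') and hasattr(g, '__next__')
is_iterator = isinstance(g, Iterator)

# As with any iterator, iter() on a generator returns the generator itself.
returns_self = iter(g) is g

# The converse is not true: a list iterator is an iterator, not a generator.
list_iter = iter([1, 2])
list_iter_is_iterator = isinstance(list_iter, Iterator)
list_iter_is_generator = isinstance(list_iter, Generator)

print(has_both, is_iterator, returns_self,
      list_iter_is_iterator, list_iter_is_generator)
```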

Third, in terms of implementation:
A generator can be written in two ways: 1. A function whose body contains the yield keyword is called a generator function. 2. A generator expression.

2. About iterable objects

Why can sequences be iterated?

Before introducing iterable objects, let's first look at why sequences in Python can be iterated. Whenever the interpreter needs to iterate over an object, it calls the built-in iter() function, which does the following:

(1) Checks whether the object implements the __iter__ method. If it does, iter() calls it to obtain an iterator.

(2) If __iter__ is not implemented but __getitem__ is, Python creates an iterator that tries to fetch the elements in order, starting from index 0.

(3) If both attempts fail, Python raises a TypeError exception.
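
The three cases above can be sketched with three tiny classes. The class names `WithIter`, `WithGetitem`, and `Neither` are hypothetical and exist only for this illustration:

```python
class WithIter:
    """Case 1: __iter__ is implemented, so iter() simply calls it."""

    def __iter__(self):
        return iter([1, 2, 3])


class WithGetitem:
    """Case 2: only __getitem__ exists; Python indexes from 0 until IndexError."""

    def __getitem__(self, index):
        return [10, 20, 30][index]  # IndexError at index 3 ends the iteration


class Neither:
    """Case 3: neither method exists, so iter() raises TypeError."""


case1 = list(iter(WithIter()))
case2 = list(iter(WithGetitem()))

try:
    iter(Neither())
    case3 = 'no error'
except TypeError:
    case3 = 'TypeError'

print(case1, case2, case3)
```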

What is an iterable object?

Following from the above, an object in Python is iterable if it defines an __iter__ method that returns an iterator, or if it defines a __getitem__ method that accepts integer indexes starting from 0. Simply put, an iterable object is any object from which iter() can obtain an iterator.
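
One practical consequence is worth a sketch. The abstract base class collections.abc.Iterable only looks for __iter__, so it does not recognize objects that are iterable solely through __getitem__; calling iter() and catching TypeError is the more reliable check. The class `Words` and the helper `is_iterable` below are hypothetical names used for illustration:

```python
from collections.abc import Iterable


class Words:
    """Iterable only through __getitem__ (no __iter__ defined)."""

    def __init__(self, words):
        self.words = words

    def __getitem__(self, index):
        return self.words[index]


def is_iterable(obj):
    """Return True if iter() can obtain an iterator from obj."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


w = Words(['a', 'b'])

seen_by_abc = isinstance(w, Iterable)  # the ABC check misses __getitem__-only objects
seen_by_iter = is_iterable(w)          # iter() still succeeds

print(seen_by_abc, seen_by_iter, is_iterable(42))
```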

2, Iterable objects

An iterable object is one from which the built-in iter() function can obtain an iterator. If an object implements an __iter__ method that returns an iterator, the object is iterable. Sequences are always iterable, and so is any object that implements a __getitem__ method accepting zero-based integer indexes.

Based on the above definition of iterable objects, we run the following code experiment:

import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string, so \w is not treated as a string escape

class Sentence(object):

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)  # re.findall returns a list of all matching words as strings

    # Only the __getitem__ method is implemented
    def __getitem__(self, index):
        return self.words[index]
    
    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)
    
s = Sentence('My favorite language is python !')
print(s)
# Iterate over s and print each word; if this succeeds, the Sentence class is iterable
for word in s: print(word)

Output:

Sentence('My favorite ...e is python !')
My
favorite
language
is
python

In the above experiment we did not implement the __iter__ method, only the __getitem__ method, yet our custom Sentence class can still be iterated. This confirms the definition of iterable objects given earlier.

Let's verify that the built-in iter() function can obtain an iterator from the object directly:

s1 = Sentence('Hi Python')
it = iter(s1)  # Gets the iterator object of the sequence
print(next(it))
print(next(it))

Output:

Hi
Python

The result shows that we can indeed obtain an iterator from the sequence object.

3, Iterator

A standard iterator needs to implement the following two methods:

(1) __next__ returns the next available element. If there are no more elements, it raises a StopIteration exception.

(2) __iter__ returns self, so that an iterator can be used wherever an iterable is expected, such as in a for loop.

Now let's rewrite the code from the previous section according to the iterator protocol.
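
To see how these two methods are used, here is a rough sketch of what a for loop does behind the scenes: it calls iter() once, then calls next() repeatedly until StopIteration is raised. The helper name `manual_for` is made up for this example:

```python
def manual_for(iterable):
    """Collect all items of an iterable the way a for loop would consume them."""
    collected = []
    it = iter(iterable)          # calls iterable.__iter__() (or the __getitem__ fallback)
    while True:
        try:
            item = next(it)      # calls it.__next__()
        except StopIteration:    # raised when no elements are left
            break
        collected.append(item)
    return collected


print(manual_for('abc'))
```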

import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string, so \w is not treated as a string escape

class Sentence(object):  # The iterable object implements __iter__

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return "Sentence(%s)" % reprlib.repr(self.text)
    
    def __iter__(self):
        return SentenceIterator(self.words)
    
class SentenceIterator(object):  # The iterator implements __next__ and __iter__

    def __init__(self, words):
        self.words = words
        self.index = 0
    
    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration
        self.index += 1
        return word
    
    def __iter__(self):
        return self

s1 = Sentence('My favorite language is Python!')
print(s1)
print('================================')

for word in s1: print(word)

it = iter(s1)
print('================================')
print(next(it))
print(next(it))

Output:

Sentence('My favorite ...ge is Python!')
================================
My
favorite
language
is
Python
================================
My
favorite

Mistakes often occur when building iterables and iterators because the two are confused. Remember that an iterable has an __iter__ method that instantiates a new iterator each time it is called, whereas an iterator implements a __next__ method that returns individual elements, plus an __iter__ method that returns the iterator itself.

Summary:
An iterable object must not be its own iterator. That is, an iterable must implement the __iter__ method, but it must not implement the __next__ method.
An iterator, on the other hand, should always be iterable: its __iter__ method should return the iterator itself.
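
The difference can be observed with a plain list, as in the following sketch (the variable names are arbitrary). Each call to iter() on the iterable yields a fresh, independent iterator, while an iterator returns itself and cannot be restarted once exhausted:

```python
numbers = [1, 2, 3]

# Each call to iter() on an iterable creates a new, independent iterator.
it_a = iter(numbers)
it_b = iter(numbers)

first_a = next(it_a)   # it_a advances to its second element...
a_next = next(it_a)
b_next = next(it_b)    # ...while it_b still starts from the beginning

# An iterator's __iter__ returns the iterator itself.
same = iter(it_a) is it_a

# Iterators are single-use: once exhausted they stay empty.
rest = list(it_a)
again = list(it_a)

# The iterable itself can be traversed again at any time.
fresh = list(numbers)

print(first_a, a_next, b_next, same, rest, again, fresh)
```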

4, Generator and generator function

1. Generator function

First, what is a generator function? Any Python function whose body uses the yield keyword to "generate" values, rather than only return to hand back a single result, is a generator function. Calling a generator function returns a generator object; in other words, a generator function is a generator factory.

Define a simple generator function:

def gen_123():
    for i in [1,2,3]:
        yield i

it = gen_123()
print(next(it))
print(next(it))
print(next(it))

Output:

1
2
3
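
A detail that the example above does not show is that the body of a generator function does not run when the function is called; it runs in steps, each next() call resuming execution until the following yield. The sketch below makes this visible by recording progress in a list; the names `gen_ab` and `log` are made up for illustration:

```python
log = []  # records which parts of the generator body have executed


def gen_ab():
    log.append('start')
    yield 'A'
    log.append('continue')
    yield 'B'
    log.append('end')


g = gen_ab()
log_after_call = list(log)    # calling the function runs none of the body

first = next(g)               # runs up to the first yield
log_after_first = list(log)

second = next(g)              # resumes after the first yield, stops at the second
rest = list(g)                # runs the remainder; the body ends, nothing more is yielded

print(log_after_call, first, log_after_first, second, rest, log)
```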

In the code below, to achieve the same functionality in a more Pythonic style, a generator function replaces the SentenceIterator class.

import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string, so \w is not treated as a string escape

class Sentence(object):  # The iterable object implements __iter__

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return "Sentence(%s)" % reprlib.repr(self.text)
    
    def __iter__(self):  # A generator function replaces the SentenceIterator class
        for word in self.words:
            yield word

s2 = Sentence('My favorite language is Pythooooon!')
print(s2)
print('================================')

for word in s2: print(word)
print('================================')

it = iter(s2)
print(next(it))
print(next(it))

Output:

Sentence('My favorite ...s Pythooooon!')
================================
My
favorite
language
is
Pythooooon
================================
My
favorite

Note that no iterator class is defined here; a generator function takes its place, which is more Pythonic.

2. Generator expression

A generator expression uses the same syntax as a list comprehension (and is closely related to dict and set comprehensions), except that the enclosing [] is replaced by ().

it = (x for x in [1,2,3,4])
print(type(it))
print(next(it))
print(next(it))
print(next(it))
print(next(it))

Output:

<class 'generator'>
1
2
3
4
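
As a further sketch of how a generator expression differs from a list comprehension (variable names are arbitrary): the list is built in full immediately, whereas the generator expression produces its values lazily and can be consumed only once.

```python
# A list comprehension builds the whole list up front.
squares_list = [x * x for x in range(5)]

# A generator expression produces values lazily, one at a time.
squares_gen = (x * x for x in range(5))
total = sum(squares_gen)      # consumes the generator

# Like any iterator, the generator is now exhausted.
leftover = list(squares_gen)

# When a generator expression is the sole argument of a call,
# the extra pair of parentheses can be omitted.
total_inline = sum(x * x for x in range(5))

print(squares_list, total, leftover, total_inline)
```

Because no intermediate list is created, a generator expression is usually the lighter choice when the values are only needed once, for example as input to sum(), max(), or a for loop.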

5, References

https://www.liaoxuefeng.com/wiki/1016959663602400/1017318207388128
https://www.runoob.com/python3/python3-iterator-generator.html
https://py.eastlakeside.cn/book/DataStructures/generators.html
Fluent Python, by Luciano Ramalho (book)
