Fundamentals of Python

Posted by rejoice on Mon, 10 Jan 2022 09:29:21 +0100

1. Iteration

1.1 concept of iteration

Traversing values with a for loop is called iteration. For example, using a for loop to traverse a list and obtain each value:

for value in [2, 3, 4]:
    print(value)

1.2 iterable objects

Standard concept: an object created from a class that defines the __iter__ method is an iterable object.

Simple rule of thumb: objects whose values can be traversed with a for loop are iterable objects, such as lists, tuples, dictionaries, sets, ranges, and strings.

1.3 judging whether an object is iterable

# Iterable lives in collections.abc; the old collections alias was removed in Python 3.10
from collections.abc import Iterable

result = isinstance((3, 5), Iterable)
print("Is the tuple an iterable object:", result)

result = isinstance([3, 5], Iterable)
print("Is the list an iterable object:", result)

result = isinstance({"name": "Zhang San"}, Iterable)
print("Is the dictionary an iterable object:", result)

result = isinstance("hello", Iterable)
print("Is the string an iterable object:", result)

result = isinstance({3, 5}, Iterable)
print("Is the set an iterable object:", result)

result = isinstance(range(5), Iterable)
print("Is the range an iterable object:", result)

result = isinstance(5, Iterable)
print("Is the integer an iterable object:", result)

# isinstance also checks ordinary types
result = isinstance(5, int)
print("Is the integer an object of type int:", result)

class Student(object):
    pass

stu = Student()
result = isinstance(stu, Iterable)
print("Is stu an iterable object:", result)

result = isinstance(stu, Student)
print("Is stu an object of type Student:", result)

1.4 user-defined iterable objects

Implement the __iter__ method in the class.

Code for a custom iterable type:

from collections.abc import Iterable
class MyList(object):

    def __init__(self):
        self.my_list = list()

    
    def append_item(self, item):
        self.my_list.append(item)

    def __iter__(self):
        
        pass

my_list = MyList()
my_list.append_item(1)
my_list.append_item(2)
result = isinstance(my_list, Iterable)

print(result)

for value in my_list:
    print(value)

Execution results:

True
Traceback (most recent call last):
  File "/Users/hbin/Desktop/untitled/aa.py", line 24, in <module>
    for value in my_list:
TypeError: iter() returned non-iterator of type 'NoneType'

The execution results show that traversing an iterable object requires an iterator, which hands out the data one value at a time.

Summary

An object created from a class that provides the __iter__ method is an iterable object, and it needs an iterator to carry out the actual iteration.

2. Iterator

2.1 custom iterator object

Custom iterator object: an object created from a class that defines both the __iter__ and __next__ methods is an iterator object.

from collections.abc import Iterable, Iterator

class MyList(object):

    def __init__(self):
        self.my_list = list()
    
    def append_item(self, item):
        self.my_list.append(item)

    def __iter__(self):     
        my_iterator = MyIterator(self.my_list)
        return my_iterator

class MyIterator(object):

    def __init__(self, my_list):
        self.my_list = my_list  
        self.current_index = 0
        result = isinstance(self, Iterator)
        print("MyIterator Is the object created an iterator:", result)

    def __iter__(self):
        return self

    def __next__(self):
        if self.current_index < len(self.my_list):
            self.current_index += 1
            return self.my_list[self.current_index - 1]
        else:           
            raise StopIteration

my_list = MyList()
my_list.append_item(1)
my_list.append_item(2)
result = isinstance(my_list, Iterable)
print(result)

for value in my_list:
    print(value)

Execution results:

True
MyIterator Is the object created an iterator: True
1
2
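
One practical consequence of splitting the work between an iterable and its iterator is that the iterable can be traversed repeatedly, while each iterator is used up after a single pass. The sketch below illustrates this with a built-in list standing in for MyList; the variable names are chosen only for this illustration.

```python
numbers = [1, 2]   # an iterable: every for loop asks it for a fresh iterator

first_pass = [value for value in numbers]
second_pass = [value for value in numbers]   # works again, new iterator each time

iterator = iter(numbers)          # one single iterator over the same data
consumed_once = list(iterator)    # walks the iterator to the end
consumed_twice = list(iterator)   # already exhausted, so nothing is left

print(first_pass, second_pass)
print(consumed_once, consumed_twice)
```

This is why MyList.__iter__ builds a new MyIterator on every call: each traversal starts again from index 0.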

2.2 iter() function and next() function

iter() function: gets the iterator of an iterable object by calling the __iter__ method on that iterable object.

next() function: gets the next value from an iterator by calling the __next__ method on that iterator object.

class MyList(object):

    def __init__(self):
        self.my_list = list()
    
    def append_item(self, item):
        self.my_list.append(item)

    def __iter__(self):      
        my_iterator = MyIterator(self.my_list)
        return my_iterator

class MyIterator(object):

    def __init__(self, my_list):
        self.my_list = my_list
        self.current_index = 0

    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current_index < len(self.my_list):
            self.current_index += 1
            return self.my_list[self.current_index - 1]
        else:           
            raise StopIteration

my_list = MyList()
my_list.append_item(1)
my_list.append_item(2)
my_iterator = iter(my_list)
print(my_iterator)

while True:
    try:
        value = next(my_iterator)
        print(value)
    except StopIteration as e:
        break

2.3 essence of the for loop

Traversing an iterable object

A "for item in iterable" loop first obtains the iterator of the iterable object through the iter() function, then repeatedly calls next() on that iterator to get the next value and assign it to item. When a StopIteration exception is raised, the loop ends.

Traversing an iterator

A "for item in iterator" loop repeatedly calls next() on the iterator to get the next value and assign it to item. When a StopIteration exception is raised, the loop ends.
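
The two cases above can be sketched as one helper that imitates what a for loop does. The function name manual_for is an assumption made for this illustration, and the append call stands in for the loop body.

```python
def manual_for(iterable):
    """Collect the values of `iterable` in the order a for loop would visit them."""
    collected = []
    iterator = iter(iterable)        # step 1: ask for an iterator
    while True:
        try:
            item = next(iterator)    # step 2: fetch the next value
        except StopIteration:
            break                    # step 3: StopIteration ends the loop
        collected.append(item)       # stands in for the loop body
    return collected

values = manual_for([2, 3, 4])
print(values)

# An iterator is itself iterable: iter() hands back the very same object,
# which is why a for loop accepts an iterator directly.
it = iter([2, 3, 4])
same_object = iter(it) is it
print(same_object)
```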

2.4 application scenarios of iterators

The core job of an iterator is to return the next data value each time next() is called. If each value is computed by the program according to some rule rather than read from an existing data set, the iterator no longer depends on an existing data set. In other words, there is no need to cache all the data up front and read it back in turn, which can save a lot of memory.

For example, take the famous Fibonacci sequence from mathematics. The first number in the sequence is 0 and the second is 1; each subsequent number is the sum of the two numbers before it:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

Suppose we want to use a for...in loop to traverse the first n numbers of the Fibonacci sequence. We can implement the sequence as an iterator, so that each iteration computes the next number.

class Fibonacci(object):

    def __init__(self, num):   
        self.num = num   
        self.a = 0
        self.b = 1   
        self.current_index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current_index < self.num:
            result = self.a
            self.a, self.b = self.b, self.a + self.b
            self.current_index += 1
            return result
        else:
            raise StopIteration

fib = Fibonacci(5)

for value in fib:
    print(value)

Execution results:

0
1
1
2
3

Summary

An iterator records the current position in the data so that it can produce the value at the next position.
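
Because each value is computed on demand, an iterator does not even need an upper limit. The following is a minimal sketch of that idea, assuming a hypothetical EndlessFibonacci class (not part of the original code) and using itertools.islice from the standard library to take only the first few values.

```python
from itertools import islice

class EndlessFibonacci:
    """Hypothetical variant of the Fibonacci iterator with no upper limit."""

    def __init__(self):
        self.a = 0
        self.b = 1

    def __iter__(self):
        return self

    def __next__(self):
        result = self.a
        self.a, self.b = self.b, self.a + self.b
        return result   # never raises StopIteration

# islice stops after 10 values, so the endless iterator is safe to use here
first_ten = list(islice(EndlessFibonacci(), 10))
print(first_ten)
```

Only two integers are stored at any moment, however far the sequence is advanced. A plain for loop over this iterator would never end, so the caller has to decide where to stop.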

3. Generator

3.1 concept of generator

A generator is a special kind of iterator. There is no need to write the __iter__() and __next__() methods yourself as in the class above, which makes it more convenient to use. Values can still be obtained with the next() function and the for loop.

3.2 creating a generator, method 1

The first method is very simple: change the [] of a list comprehension to ().

my_list = [i * 2 for i in range(5)]
print(my_list)

my_generator = (i * 2 for i in range(5))
print(my_generator)

for value in my_generator:
    print(value)

Execution results:

[0, 2, 4, 6, 8]
<generator object <genexpr> at 0x101367048>
0
2
4
6
8
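
One practical difference between the two forms is memory. As a rough sketch, sys.getsizeof can compare the list object with the generator object; the element count of 100000 is an arbitrary choice for this illustration, and getsizeof reports only the container itself, not the integers it refers to.

```python
import sys

count = 100_000
as_list = [i * 2 for i in range(count)]        # all values exist at once
as_generator = (i * 2 for i in range(count))   # values are produced on demand

list_size = sys.getsizeof(as_list)
generator_size = sys.getsizeof(as_generator)
print(list_size, generator_size)

# The generator still yields every value when it is consumed
total = sum(as_generator)
print(total == sum(as_list))
```

The exact byte counts depend on the Python version and platform, but the generator stays small no matter how large count becomes.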

3.3 creating a generator, method 2

If a function defined with def contains the yield keyword, it is a generator function, and calling it returns a generator.

def fibonacci(num):
    a = 0
    b = 1
    
    current_index = 0
    print("--11---")
    while current_index < num:
        result = a
        a, b = b, a + b
        current_index += 1
        print("--22---")
        
        yield result
        print("--33---")

fib = fibonacci(5)
value = next(fib)
print(value)
value = next(fib)
print(value)

value = next(fib)
print(value)

In the generator implementation, the logic that the iterator implemented in its __next__ method moves into a function, and the return that handed back each iteration's value is replaced by yield. The newly defined function is then no longer an ordinary function but a generator.

Simply put: any def that contains the yield keyword is called a generator.
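
To make the pause-and-resume behaviour visible, the sketch below records events in a list instead of printing markers. The names traced_numbers and events are assumptions made for this illustration.

```python
events = []   # records the order in which things happen

def traced_numbers(limit):
    events.append("body started")
    for number in range(limit):
        events.append(f"about to yield {number}")
        yield number                             # execution pauses here
        events.append(f"resumed after {number}")
    events.append("body finished")

gen = traced_numbers(2)
events.append("generator created")   # nothing in the body has run yet

first = next(gen)
events.append(f"caller received {first}")
second = next(gen)
events.append(f"caller received {second}")

for event in events:
    print(event)
```

Calling the function only creates the generator. The body starts on the first next() call, and each later next() call continues from the line after the yield that paused it.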

3.4 using the return keyword in a generator

def fibonacci(num):
    a = 0
    b = 1
    
    current_index = 0
    print("--11---")
    while current_index < num:
        result = a
        a, b = b, a + b
        current_index += 1
        print("--22---")
        
        yield result
        print("--33---")
        return "Hee hee"

fib = fibonacci(5)
value = next(fib)
print(value)

try:
    value = next(fib)
    print(value)
except StopIteration as e:
    
    print(e.value)

Tips:

Using the return keyword in a generator is syntactically fine, but when execution reaches the return statement the iteration ends and a StopIteration exception is raised. The returned value is available as the value attribute of that exception.
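
A small sketch of this behaviour, assuming a hypothetical generator named one_then_done: a for loop or list() ends quietly and discards the returned value, while calling next() by hand exposes it.

```python
def one_then_done():
    yield 1
    return "finished"   # ends the iteration; the value travels on StopIteration

# list() (like a for loop) handles StopIteration itself, so the return value is not seen
seen_by_loop = list(one_then_done())

# Calling next() by hand exposes it through the exception's value attribute
returned = None
gen = one_then_done()
next(gen)
try:
    next(gen)
except StopIteration as stop:
    returned = stop.value

print(seen_by_loop, returned)
```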

3.5 comparison between yield and return

A function that uses the yield keyword is no longer an ordinary function but a generator.

When execution reaches yield, the generator pauses and hands back the result. The next time the generator is started, it continues from the paused position.

Each start of the generator returns one value, so multiple starts return multiple values. In that sense, yield can return multiple values.

return can only return a value once. When execution reaches the return statement, the iteration stops and a StopIteration exception is raised.

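3.6 using the send method to start the generator and pass a value

The send method resumes the generator and passes a value into it at the same time. That value becomes the result of the yield expression where the generator was paused.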

def gen():
    i = 0
    while i < 5:
        # temp receives the value passed to send(), or None when next() is used
        temp = yield i
        print(temp)
        i += 1

Execution results:

In [43]: f = gen()

In [44]: next(f)
Out[44]: 0

In [45]: f.send('haha')
haha
Out[45]: 1

In [46]: next(f)
None
Out[46]: 2

In [47]: f.send('haha')
haha
Out[47]: 3

In [48]:

Note: if send is used for the very first start of a generator, the argument must be None. In general, the first start uses the next() function.
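
The note above can be checked with a short sketch. The generator running_total is a hypothetical example that adds up every value sent into it; sending a non-None value before the first yield has been reached raises TypeError.

```python
def running_total():
    """Hypothetical generator that adds up every value sent into it."""
    total = 0
    while True:
        received = yield total   # pause here; send() supplies `received`
        if received is not None:
            total += received

acc = running_total()

first_send_error = None
try:
    acc.send(10)                 # the generator has not reached a yield yet
except TypeError as error:
    first_send_error = str(error)

start = next(acc)                # same effect as acc.send(None): run to the first yield
after_ten = acc.send(10)
after_five = acc.send(5)

print(first_send_error)
print(start, after_ten, after_five)
```

The failed send does not damage the generator, which can still be started normally afterwards.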

Summary

  • There are two ways to create a generator; the yield keyword is the one generally used
  • When execution reaches yield, the generator pauses and returns the result; when the generator is started again, it continues from the paused position

4. Coroutine

4.1 concept of coroutine

A coroutine, also known as a micro-thread or fiber, is a user-level thread. It achieves multitasking without creating new threads: multiple tasks run within a single thread and are executed alternately in a certain order. As a rule of thumb, a def that contains the yield keyword can be used as a simple coroutine.

Coroutines are also a way to implement multitasking.

Implementing coroutines with yield

A simple coroutine implementation:

import time

def work1():
    while True:
        print("----work1---")
        yield
        time.sleep(0.5)

def work2():
    while True:
        print("----work2---")
        yield
        time.sleep(0.5)

def main():
    w1 = work1()
    w2 = work2()
    while True:
        next(w1)
        next(w2)

if __name__ == "__main__":
    main()

Execution results:

----work1---
----work2---
----work1---
----work2---
----work1---
----work2---
----work1---
----work2---
----work1---
----work2---
----work1---
----work2---
...

Summary

Tasks run as coroutines are executed alternately in a certain order.
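
The main() function above hard-codes two endless tasks. As a hedged sketch of how the same alternation could handle any number of finite tasks, the following uses a queue from the standard library; the names task and run_round_robin are assumptions made for this illustration, and a log list replaces print and sleep so the order is easy to inspect.

```python
from collections import deque

def task(name, steps, log):
    """Hypothetical task: does `steps` pieces of work, yielding after each one."""
    for step in range(steps):
        log.append(f"{name}-{step}")
        yield                       # hand control back to the scheduler

def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)           # run the task up to its next yield
        except StopIteration:
            continue                # task finished; do not reschedule it
        queue.append(current)       # not finished; go to the back of the line

log = []
run_round_robin([task("work1", 3, log), task("work2", 2, log)])
print(log)
```

Each task decides for itself when to give up control, which is what distinguishes this cooperative style from threads that the operating system may interrupt at any point.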

5. greenlet

5.1 introduction to greenlet

To make coroutine-based multitasking easier, the third-party greenlet module wraps the low-level details and makes switching between tasks simpler.

Install the greenlet module using the following command:

pip3 install greenlet

Multitasking using coroutines

import time
import greenlet

# Task 1
def work1():
    for i in range(5):
        print("work1...")
        time.sleep(0.2)
        # Switch to the second coroutine to run its task
        g2.switch()

# Task 2
def work2():
    for i in range(5):
        print("work2...")
        time.sleep(0.2)
        # Switch back to the first coroutine to run its task
        g1.switch()

if __name__ == '__main__':
    # Create the coroutines and specify the task each one runs
    g1 = greenlet.greenlet(work1)
    g2 = greenlet.greenlet(work2)

    # Switch to the first coroutine to start running its task
    g1.switch()

Execution results

work1...
work2...
work1...
work2...
work1...
work2...
work1...
work2...
work1...
work2...

6. gevent

6.1 introduction to gevent

greenlet implements coroutines, but switching has to be done manually. gevent is a third-party library that is more powerful than greenlet and can switch tasks automatically.

gevent wraps greenlet internally. The principle is that when a greenlet encounters an IO operation (input and output, such as network access or file operations), gevent automatically switches to another greenlet, and switches back to continue execution at an appropriate time once the IO operation has completed.

Because IO operations are very time-consuming, a program often sits in a waiting state. With gevent switching coroutines for us automatically, there is always a greenlet running instead of everything waiting for IO.

Installation

pip3 install gevent

6.2 use of gevent

import gevent

def work(n):
    for i in range(n):
        
        print(gevent.getcurrent(), i)

g1 = gevent.spawn(work, 5)
g2 = gevent.spawn(work, 5)
g3 = gevent.spawn(work, 5)
g1.join()
g2.join()
g3.join()

Execution results

<Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 0
<Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 1
<Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 2
<Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 3
<Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 4
<Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 0
<Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 1
<Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 2
<Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 3
<Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 4
<Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 0
<Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 1
<Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 2
<Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 3
<Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 4

As you can see, the three greenlets run one after another rather than alternately, because work() never reaches a point where gevent can switch.

6.3 gevent switching execution

import gevent

def work(n):
    for i in range(n):
        
        print(gevent.getcurrent(), i)
        
        gevent.sleep(1)

g1 = gevent.spawn(work, 5)
g2 = gevent.spawn(work, 5)
g3 = gevent.spawn(work, 5)
g1.join()
g2.join()
g3.join()

Execution results

<Greenlet at 0x7fa70ffa1c30: work(5)> 0
<Greenlet at 0x7fa70ffa1870: work(5)> 0
<Greenlet at 0x7fa70ffa1eb0: work(5)> 0
<Greenlet at 0x7fa70ffa1c30: work(5)> 1
<Greenlet at 0x7fa70ffa1870: work(5)> 1
<Greenlet at 0x7fa70ffa1eb0: work(5)> 1
<Greenlet at 0x7fa70ffa1c30: work(5)> 2
<Greenlet at 0x7fa70ffa1870: work(5)> 2
<Greenlet at 0x7fa70ffa1eb0: work(5)> 2
<Greenlet at 0x7fa70ffa1c30: work(5)> 3
<Greenlet at 0x7fa70ffa1870: work(5)> 3
<Greenlet at 0x7fa70ffa1eb0: work(5)> 3
<Greenlet at 0x7fa70ffa1c30: work(5)> 4
<Greenlet at 0x7fa70ffa1870: work(5)> 4
<Greenlet at 0x7fa70ffa1eb0: work(5)> 4

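6.4 monkey-patching the program

monkey.patch_all() replaces blocking standard-library functions such as time.sleep with gevent-aware versions, so ordinary blocking calls let gevent switch to another greenlet without calling gevent.sleep explicitly.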

import gevent
import time
from gevent import monkey

monkey.patch_all()

def work1(num):
    for i in range(num):
        print("work1....")
        time.sleep(0.2)

def work2(num):
    for i in range(num):
        print("work2....")
        time.sleep(0.2)
        
if __name__ == '__main__':
    
    g1 = gevent.spawn(work1, 3)
    g2 = gevent.spawn(work2, 3) 
    g1.join()
    g2.join()

Execution results

work1....
work2....
work1....
work2....
work1....
work2....

6.5 notes

If the main program is an endless loop that itself contains time-consuming operations, there is no need to call join, because the program keeps running and never exits, which gives the coroutines the chance to run.

Sample code

import gevent
import time
from gevent import monkey

monkey.patch_all()

def work1(num):
    for i in range(num):
        print("work1....")
        time.sleep(0.2)
        
def work2(num):
    for i in range(num):
        print("work2....")
        time.sleep(0.2)
        
if __name__ == '__main__':
    
    g1 = gevent.spawn(work1, 3)
    g2 = gevent.spawn(work2, 3)

    while True:
        print("Execute in main thread")
        time.sleep(0.5)

Execution results:

Execute in main thread
work1....
work2....
work1....
work2....
work1....
work2....
Execute in main thread
Execute in main thread
Execute in main thread
...

When many coroutines are used, calling join() on each one to block the main thread until it finishes makes the code redundant. The gevent.joinall() method takes a list of coroutines and waits for all of them.

Example code

import time
import gevent

def work1():
    for i in range(5):
        print("work1 Work{}".format(i))
        gevent.sleep(1)

def work2():
    for i in range(5):
        print("work2 Work{}".format(i))
        gevent.sleep(1)

if __name__ == '__main__':
    w1 = gevent.spawn(work1)
    w2 = gevent.spawn(work2)
    gevent.joinall([w1, w2])  

7. Process, thread and coroutine comparison

7.1 relationship among processes, threads and coroutines

  • A process has at least one thread, and can contain multiple threads
  • A thread can contain multiple coroutines

7.2 comparison of processes, threads and coroutines

  1. The process is the unit of resource allocation
  2. The thread is the unit of operating system scheduling
  3. Process switching requires the most resources and is the least efficient
  4. Thread switching requires moderate resources and has moderate efficiency (leaving the GIL aside)
  5. Coroutine switching requires few resources and is highly efficient
  6. Multiple processes and multiple threads may run in parallel, depending on the number of CPU cores, but coroutines live in a single thread, so they are concurrent rather than parallel
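
The last point can be checked with the standard library alone. The sketch below is an illustration under stated assumptions: report_thread and coroutine_task are hypothetical helpers, and generator-based coroutines stand in for greenlet or gevent coroutines. It records the id of the thread in which each piece of work runs.

```python
import threading

def report_thread(results):
    """Record the id of the thread this function runs in."""
    results.append(threading.get_ident())

def coroutine_task(results):
    """Generator-based task that records its thread id at every step."""
    while True:
        results.append(threading.get_ident())
        yield

main_id = threading.get_ident()

# Two generator-based coroutines, driven alternately from the main thread
coroutine_ids = []
c1 = coroutine_task(coroutine_ids)
c2 = coroutine_task(coroutine_ids)
for _ in range(2):
    next(c1)
    next(c2)

# Two real threads
thread_ids = []
threads = [threading.Thread(target=report_thread, args=(thread_ids,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(set(coroutine_ids) == {main_id})   # every coroutine step ran in the main thread
print(main_id in thread_ids)             # the worker threads did not
```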

Summary

  1. Processes, threads and coroutines can all accomplish multitasking; choose among them according to actual development needs

  2. Because threads and coroutines require few resources, they are the most likely to be used

  3. Coroutines require the fewest resources of the three

Topics: Python