1. Iteration
1.1 Concept of iteration
Iteration is the process of using a for loop to traverse values one at a time. For example, the following for loop traverses a list and obtains each value in turn:
    for value in [2, 3, 4]:
        print(value)
1.2 Iterable objects
Formal definition: if a class defines the __iter__ method, the objects created from that class are iterable objects.
Rule of thumb: any object that a for loop can traverse is an iterable object, such as a list, tuple, dictionary, set, range, or string.
1.3 Checking whether an object is iterable
    # Iterable lives in collections.abc (importing it from collections fails on Python 3.10+)
    from collections.abc import Iterable

    result = isinstance((3, 5), Iterable)
    print("Is the tuple an iterable object:", result)

    result = isinstance([3, 5], Iterable)
    print("Is the list an iterable object:", result)

    result = isinstance({"name": "Zhang San"}, Iterable)
    print("Is the dictionary an iterable object:", result)

    result = isinstance("hello", Iterable)
    print("Is the string an iterable object:", result)

    result = isinstance({3, 5}, Iterable)
    print("Is the set an iterable object:", result)

    result = isinstance(range(5), Iterable)
    print("Is the range an iterable object:", result)

    result = isinstance(5, Iterable)
    print("Is the integer an iterable object:", result)

    result = isinstance(5, int)
    print("Is the integer an object of type int:", result)


    class Student(object):
        pass


    stu = Student()

    result = isinstance(stu, Iterable)
    print("Is stu an iterable object:", result)

    result = isinstance(stu, Student)
    print("Is stu an object of type Student:", result)
1.4 User-defined iterable objects
To make the objects of a class iterable, implement the __iter__ method in the class.
Code for a custom iterable type:
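    from collections.abc import Iterable


    class MyList(object):

        def __init__(self):
            self.my_list = list()

        # Add an element to the internal list
        def append_item(self, item):
            self.my_list.append(item)

        # Defining __iter__ makes the object count as iterable,
        # but this version returns None instead of an iterator
        def __iter__(self):
            pass


    my_list = MyList()
    my_list.append_item(1)
    my_list.append_item(2)
    result = isinstance(my_list, Iterable)
    print(result)

    # Fails with TypeError because __iter__ did not return an iterator
    for value in my_list:
        print(value)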
Execution results:
    True
    Traceback (most recent call last):
      File "/Users/hbin/Desktop/untitled/aa.py", line 24, in <module>
        for value in my_list:
    TypeError: iter() returned non-iterator of type 'NoneType'
The result shows that isinstance() reports True as soon as __iter__ is defined, but the traversal still fails: the for loop needs __iter__ to return an iterator, which then supplies the data one value at a time.
Summary
An object created from a class that provides the __iter__ method is an iterable object, and it needs an iterator to actually carry out the iteration.
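The difference between an iterable and its iterator can be seen with a built-in list. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2]

# A list is iterable, but it is not itself an iterator.
print(isinstance(numbers, Iterable))   # True
print(isinstance(numbers, Iterator))   # False

# iter() asks the list for a separate iterator object.
numbers_iterator = iter(numbers)
print(isinstance(numbers_iterator, Iterator))  # True

# The iterator hands out one value per next() call.
first = next(numbers_iterator)
second = next(numbers_iterator)
print(first, second)  # 1 2
```

The list holds the data, while the iterator only keeps track of how far the traversal has progressed.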
2. Iterator
2.1 Custom iterator object
Custom iterator object: an object created from a class that defines both the __iter__ and __next__ methods is an iterator object.
    from collections.abc import Iterable
    from collections.abc import Iterator


    # Custom iterable class
    class MyList(object):

        def __init__(self):
            self.my_list = list()

        def append_item(self, item):
            self.my_list.append(item)

        # Return a new iterator over the stored data
        def __iter__(self):
            my_iterator = MyIterator(self.my_list)
            return my_iterator


    # Custom iterator class
    class MyIterator(object):

        def __init__(self, my_list):
            self.my_list = my_list
            # Position of the next value to return
            self.current_index = 0
            result = isinstance(self, Iterator)
            print("Is the object created by MyIterator an iterator:", result)

        def __iter__(self):
            return self

        # Return the next value, or signal the end of the data
        def __next__(self):
            if self.current_index < len(self.my_list):
                self.current_index += 1
                return self.my_list[self.current_index - 1]
            else:
                raise StopIteration


    my_list = MyList()
    my_list.append_item(1)
    my_list.append_item(2)
    result = isinstance(my_list, Iterable)
    print(result)

    for value in my_list:
        print(value)
Execution results:
    True
    Is the object created by MyIterator an iterator: True
    1
    2
2.2 iter() function and next() function
iter() function: obtains the iterator of an iterable object by calling the __iter__ method of that iterable object.
next() function: obtains the next value from an iterator by calling the __next__ method of that iterator object.
    class MyList(object):

        def __init__(self):
            self.my_list = list()

        def append_item(self, item):
            self.my_list.append(item)

        def __iter__(self):
            my_iterator = MyIterator(self.my_list)
            return my_iterator


    class MyIterator(object):

        def __init__(self, my_list):
            self.my_list = my_list
            self.current_index = 0

        def __iter__(self):
            return self

        def __next__(self):
            if self.current_index < len(self.my_list):
                self.current_index += 1
                return self.my_list[self.current_index - 1]
            else:
                raise StopIteration


    my_list = MyList()
    my_list.append_item(1)
    my_list.append_item(2)

    my_iterator = iter(my_list)
    print(my_iterator)

    while True:
        try:
            value = next(my_iterator)
            print(value)
        except StopIteration as e:
            break
2.3 Essence of the for loop
When the object being traversed is an iterable:
A loop of the form "for item in iterable" first obtains the iterator of the iterable through the iter() function, then repeatedly calls next() on that iterator to obtain the next value and assign it to item. When a StopIteration exception is raised, the loop ends.
When the object being traversed is an iterator:
A loop of the form "for item in iterator" repeatedly calls next() on the iterator to obtain the next value and assign it to item. The loop still calls iter() first, but the __iter__ method of an iterator returns the iterator itself. When a StopIteration exception is raised, the loop ends.
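The steps described above can be written out by hand. The sketch below uses a hypothetical helper named manual_for to imitate what a for loop does; it is an illustration of the protocol, not how the interpreter is actually implemented.

```python
def manual_for(iterable, action):
    """Emulate `for item in iterable: action(item)` step by step."""
    iterator = iter(iterable)          # step 1: get an iterator
    while True:
        try:
            item = next(iterator)      # step 2: ask for the next value
        except StopIteration:
            break                      # step 3: end of data ends the loop
        action(item)


collected = []
manual_for("abc", collected.append)
print(collected)  # ['a', 'b', 'c']

# Passing an iterator works too, because iter(iterator) returns the iterator itself.
letters = iter("xy")
print(iter(letters) is letters)  # True
```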
2.4 Application scenarios of iterators
The core function of an iterator is to return the next data value each time the next() function is called. If each returned value is not read from an existing data set but is calculated by the program according to some rule, the iterator no longer depends on an existing data set. In other words, there is no need to store all the data to be iterated in advance for later reading, which can save a large amount of memory.
For example, consider the famous Fibonacci sequence in mathematics. The first number in the sequence is 0 and the second is 1. Each subsequent number is obtained by adding the two numbers before it:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
Now suppose we want to use a for...in loop to iterate over the first n numbers of the Fibonacci sequence. We can implement the sequence as an iterator, in which each iteration generates the next number by calculation.
    class Fibonacci(object):

        def __init__(self, num):
            self.num = num
            self.a = 0
            self.b = 1
            self.current_index = 0

        def __iter__(self):
            return self

        def __next__(self):
            if self.current_index < self.num:
                result = self.a
                self.a, self.b = self.b, self.a + self.b
                self.current_index += 1
                return result
            else:
                raise StopIteration


    fib = Fibonacci(5)

    for value in fib:
        print(value)
Execution results:
    0
    1
    1
    2
    3
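The memory saving mentioned in this section can be made visible with a rough comparison. The sketch below uses a hypothetical Squares iterator and sys.getsizeof, which measures only the container object itself, so the numbers are an indication rather than an exact memory profile.

```python
import sys


class Squares:
    """Hypothetical iterator that computes the squares 0, 1, 4, ... on demand."""

    def __init__(self, count):
        self.count = count
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.count:
            raise StopIteration
        value = self.current * self.current
        self.current += 1
        return value


count = 100_000
stored = [n * n for n in range(count)]   # every value held in memory at once
lazy = Squares(count)                    # only the counter state is held

# getsizeof reports the container itself, not the integers it references,
# yet the list is already far larger than the iterator object.
print(sys.getsizeof(stored) > sys.getsizeof(lazy))  # True
print(sum(lazy) == sum(stored))                     # True
```

Both approaches produce the same values; the iterator simply computes each one at the moment it is requested.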
Summary
An iterator records the current position in the data so that the next call can return the value at the next position.
3. Generator
3.1 Concept of a generator
A generator is a special kind of iterator. Unlike the classes above, it does not require you to write the __iter__() and __next__() methods yourself, which makes it more convenient to use. Values can still be obtained from it with the next() function and the for loop.
3.2 Creating a generator: method 1
The first method is simple: change the [] of a list comprehension to (). The result is called a generator expression.
    my_list = [i * 2 for i in range(5)]
    print(my_list)

    my_generator = (i * 2 for i in range(5))
    print(my_generator)

    for value in my_generator:
        print(value)
Execution results:
    [0, 2, 4, 6, 8]
    <generator object <genexpr> at 0x101367048>
    0
    2
    4
    6
    8
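Because a generator created this way is an iterator, it can be passed directly to any function that consumes values, and it can be traversed only once. A small sketch with illustrative names:

```python
squares = (n * n for n in range(4))

# Feed the generator straight into a consumer; no intermediate list is built.
total = sum(squares)
print(total)  # 14

# The generator is now exhausted, so a second pass yields nothing.
leftover = list(squares)
print(leftover)  # []
```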
3.3 Creating a generator: method 2
If a function defined with def contains the yield keyword, it is a generator function, and calling it returns a generator.
    def fibonacci(num):
        a = 0
        b = 1
        current_index = 0
        print("--11---")
        while current_index < num:
            result = a
            a, b = b, a + b
            current_index += 1
            print("--22---")
            yield result
            print("--33---")


    fib = fibonacci(5)
    value = next(fib)
    print(value)

    value = next(fib)
    print(value)

    value = next(fib)
    print(value)
In the generator implementation, the basic logic that the iterator implemented in its __next__ method is moved into a function, and the return that handed back each iteration's value is replaced with yield. The newly defined function is then no longer an ordinary function, but a generator function.
Simply put: any def function that contains the yield keyword is a generator.
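The point that such a function is "no longer an ordinary function" can be checked directly. The following sketch uses a hypothetical countdown generator and the standard inspect module; it also shows that the function body does not run until the first next() call.

```python
import inspect
from collections.abc import Iterator


def countdown(start):
    """Hypothetical generator function: yields start, start - 1, ..., 1."""
    print("body started")
    while start > 0:
        yield start
        start -= 1


print(inspect.isgeneratorfunction(countdown))  # True

gen = countdown(3)                 # nothing is printed yet: the body has not run
print(isinstance(gen, Iterator))   # True

first = next(gen)                  # prints "body started", then pauses at yield
rest = list(gen)                   # resumes after the yield and drains the rest
print(first, rest)  # 3 [2, 1]
```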
3.4 Using the return keyword in a generator
    def fibonacci(num):
        a = 0
        b = 1
        current_index = 0
        print("--11---")
        while current_index < num:
            result = a
            a, b = b, a + b
            current_index += 1
            print("--22---")
            yield result
            print("--33---")
            # Reached when the generator is resumed: ends the iteration
            return "Hee hee"


    fib = fibonacci(5)
    value = next(fib)
    print(value)

    try:
        value = next(fib)
        print(value)
    except StopIteration as e:
        # The value passed to return is stored on the exception
        print(e.value)
Tips:
Using the return keyword in a generator is syntactically valid, but when execution reaches the return statement, the iteration stops and a StopIteration exception is raised. The returned value is available as the value attribute of that exception.
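There are two common ways to read the value that a generator returns. The sketch below uses hypothetical names; the second way relies on the standard "yield from" syntax, which this tutorial does not otherwise cover, so treat it as a side note.

```python
def numbers_then_report():
    """Hypothetical generator: yields two values, then returns a summary."""
    yield 1
    yield 2
    return "done"


# Way 1: catch StopIteration manually and read its value attribute.
gen = numbers_then_report()
next(gen)
next(gen)
try:
    next(gen)
except StopIteration as stop:
    manual_result = stop.value
print(manual_result)  # done


# Way 2: let `yield from` capture the return value in a wrapping generator.
def wrapper(log):
    result = yield from numbers_then_report()
    log.append(result)


log = []
yielded = list(wrapper(log))
print(yielded, log)  # [1, 2] ['done']
```

Note that a plain for loop over the generator would see only the yielded values 1 and 2; the returned value is discarded along with the StopIteration exception.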
3.5 Comparison between yield and return
A function that uses the yield keyword is no longer an ordinary function but a generator function: calling it returns a generator.
When execution reaches yield, the generator is suspended and the result is returned to the caller. The next time the generator is started, it continues from the position where it was suspended.
Each start of the generator returns one value, so starting it multiple times returns multiple values. In other words, yield can return values multiple times.
return can return a value only once. When execution reaches the return statement, the iteration stops and a StopIteration exception is raised.
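3.6 Using the send method to start the generator and pass a value
The send() method starts the generator and passes a value into it at the same time. The value becomes the result of the yield expression at which the generator was suspended.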
    def gen():
        i = 0
        while i < 5:
            temp = yield i
            print(temp)
            i += 1
Execution results:
    In [43]: f = gen()

    In [44]: next(f)
    Out[44]: 0

    In [45]: f.send('haha')
    haha
    Out[45]: 1

    In [46]: next(f)
    None
    Out[46]: 2

    In [47]: f.send('haha')
    haha
    Out[47]: 3
Note: if send() is used for the very first start of a generator, the argument must be None, because the generator is not yet suspended at a yield that could receive a value. Normally, the first start uses the next() function.
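A typical use of send() is a generator that keeps state between calls and reports a result for each value it receives. The running_average generator below is a hypothetical example, not part of the original tutorial code; it also demonstrates the error described in the note above.

```python
def running_average():
    """Hypothetical generator: receives numbers via send() and yields
    the average of everything received so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # pauses here; send(x) makes x the result
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)                  # first start: run to the first yield

a1 = averager.send(10)
a2 = averager.send(20)
a3 = averager.send(60)
print(a1, a2, a3)  # 10.0 15.0 30.0

# Sending a non-None value to a generator that has not started is an error.
fresh = running_average()
try:
    fresh.send(10)
    first_send_failed = False
except TypeError:
    first_send_failed = True
print(first_send_failed)  # True
```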
Summary
- There are two ways to create a generator. The yield keyword is the one generally used.
- The defining feature of yield is that when execution reaches it, the generator pauses and returns the result. When the generator is started again, execution continues from the position where it was suspended.
4. Coroutine
4.1 Concept of a coroutine
A coroutine, also known as a micro-thread or fiber, is a user-level thread. It accomplishes multitasking without creating additional threads: multiple tasks run within a single thread and are executed alternately in a certain order. As a rule of thumb, a def function that contains the yield keyword can serve as a simple coroutine.
Coroutines are also a way to implement multitasking.
Implementing coroutines with yield
A simple coroutine implementation:
    import time


    def work1():
        while True:
            print("----work1---")
            yield
            time.sleep(0.5)


    def work2():
        while True:
            print("----work2---")
            yield
            time.sleep(0.5)


    def main():
        w1 = work1()
        w2 = work2()
        while True:
            next(w1)
            next(w2)


    if __name__ == "__main__":
        main()
Execution results:
    ----work1---
    ----work2---
    ----work1---
    ----work2---
    ----work1---
    ----work2---
    ----work1---
    ----work2---
    ----work1---
    ----work2---
    ----work1---
    ----work2---
    ...(output continues)...
Summary
Coroutines execute their tasks alternately in a certain order.
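The alternating order can be generalized to tasks of different lengths with a small scheduler. The sketch below is one possible design, assuming a simple round-robin policy; the names task and run_round_robin are hypothetical.

```python
from collections import deque


def task(name, steps, log):
    """Hypothetical task: records one entry per step, then yields control."""
    for step in range(steps):
        log.append(f"{name}-{step}")
        yield                      # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator tasks alternately until every one has finished."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task up to its next yield
        except StopIteration:
            continue               # finished: do not re-queue
        queue.append(current)      # not finished: back of the line


log = []
run_round_robin([task("work1", 2, log), task("work2", 3, log)])
print(log)
# ['work1-0', 'work2-0', 'work1-1', 'work2-1', 'work2-2']
```

Unlike the endless loop above, this scheduler stops once all tasks are exhausted, because a finished generator raises StopIteration and is dropped from the queue.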
5. greenlet
5.1 Introduction to greenlet
To make it easier to use coroutines for multitasking, the third-party greenlet module for Python wraps them and simplifies switching between tasks.
Install the greenlet module using the following command:
    pip3 install greenlet
Multitasking with greenlet coroutines:
    import time
    import greenlet


    # Task 1
    def work1():
        for i in range(5):
            print("work1...")
            time.sleep(0.2)
            # Switch to coroutine 2 to execute its task
            g2.switch()


    # Task 2
    def work2():
        for i in range(5):
            print("work2...")
            time.sleep(0.2)
            # Switch to coroutine 1 to execute its task
            g1.switch()


    if __name__ == '__main__':
        # Create the coroutines and specify their tasks
        g1 = greenlet.greenlet(work1)
        g2 = greenlet.greenlet(work2)

        # Switch to coroutine 1 to execute its task
        g1.switch()
Execution results:
    work1...
    work2...
    work1...
    work2...
    work1...
    work2...
    work1...
    work2...
    work1...
    work2...
6. gevent
6.1 Introduction to gevent
greenlet implements coroutines, but switching between them has to be done manually. gevent is a third-party library that is more powerful than greenlet and can switch tasks automatically.
gevent wraps greenlet internally. Its principle is that when a greenlet encounters an I/O operation (input/output, such as network access or file operations), gevent automatically switches to another greenlet, and switches back at an appropriate time after the I/O operation completes.
Because I/O operations are very time-consuming, a program often sits in a waiting state. With gevent switching coroutines automatically, there is always a greenlet running instead of waiting for I/O.
Installation:
    pip3 install gevent
6.2 Using gevent
    import gevent


    def work(n):
        for i in range(n):
            print(gevent.getcurrent(), i)


    g1 = gevent.spawn(work, 5)
    g2 = gevent.spawn(work, 5)
    g3 = gevent.spawn(work, 5)
    g1.join()
    g2.join()
    g3.join()
Execution results:
    <Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 0
    <Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 1
    <Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 2
    <Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 3
    <Greenlet "Greenlet-0" at 0x26d8c970488: work(5)> 4
    <Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 0
    <Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 1
    <Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 2
    <Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 3
    <Greenlet "Greenlet-1" at 0x26d8c970598: work(5)> 4
    <Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 0
    <Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 1
    <Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 2
    <Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 3
    <Greenlet "Greenlet-2" at 0x26d8c9706a8: work(5)> 4
As you can see, the three greenlets run one after another rather than alternately, because nothing in work() gives gevent an opportunity to switch.
6.3 Switching execution in gevent
    import gevent


    def work(n):
        for i in range(n):
            print(gevent.getcurrent(), i)
            # Simulate a time-consuming operation; while this greenlet
            # waits, gevent switches to another one
            gevent.sleep(1)


    g1 = gevent.spawn(work, 5)
    g2 = gevent.spawn(work, 5)
    g3 = gevent.spawn(work, 5)
    g1.join()
    g2.join()
    g3.join()
Execution results:
    <Greenlet at 0x7fa70ffa1c30: work(5)> 0
    <Greenlet at 0x7fa70ffa1870: work(5)> 0
    <Greenlet at 0x7fa70ffa1eb0: work(5)> 0
    <Greenlet at 0x7fa70ffa1c30: work(5)> 1
    <Greenlet at 0x7fa70ffa1870: work(5)> 1
    <Greenlet at 0x7fa70ffa1eb0: work(5)> 1
    <Greenlet at 0x7fa70ffa1c30: work(5)> 2
    <Greenlet at 0x7fa70ffa1870: work(5)> 2
    <Greenlet at 0x7fa70ffa1eb0: work(5)> 2
    <Greenlet at 0x7fa70ffa1c30: work(5)> 3
    <Greenlet at 0x7fa70ffa1870: work(5)> 3
    <Greenlet at 0x7fa70ffa1eb0: work(5)> 3
    <Greenlet at 0x7fa70ffa1c30: work(5)> 4
    <Greenlet at 0x7fa70ffa1870: work(5)> 4
    <Greenlet at 0x7fa70ffa1eb0: work(5)> 4
6.4 Patching the program
    import gevent
    import time
    from gevent import monkey

    # Patch blocking standard-library calls (such as time.sleep) so that
    # gevent can switch greenlets while they wait
    monkey.patch_all()


    def work1(num):
        for i in range(num):
            print("work1....")
            time.sleep(0.2)


    def work2(num):
        for i in range(num):
            print("work2....")
            time.sleep(0.2)


    if __name__ == '__main__':
        g1 = gevent.spawn(work1, 3)
        g2 = gevent.spawn(work2, 3)

        g1.join()
        g2.join()
Execution results:
    work1....
    work2....
    work1....
    work2....
    work1....
    work2....
6.5 Notes
If the main program is itself an infinite loop that contains a time-consuming operation, there is no need to call the join method: the program keeps running and never exits, and each time the main loop waits, gevent can switch to the spawned coroutines.
Sample code:
    import gevent
    import time
    from gevent import monkey

    monkey.patch_all()


    def work1(num):
        for i in range(num):
            print("work1....")
            time.sleep(0.2)


    def work2(num):
        for i in range(num):
            print("work2....")
            time.sleep(0.2)


    if __name__ == '__main__':
        g1 = gevent.spawn(work1, 3)
        g2 = gevent.spawn(work2, 3)

        while True:
            print("Execute in main thread")
            time.sleep(0.5)
Execution results:
    Execute in main thread
    work1....
    work2....
    work1....
    work2....
    work1....
    work2....
    Execute in main thread
    Execute in main thread
    Execute in main thread
    ...(output continues)...
When many coroutines are in use, calling join() on each one to block the main thread until it finishes makes the code redundant. Instead, pass a list of the coroutines to gevent.joinall(), which waits for all of them.
Example code:
    import time
    import gevent


    def work1():
        for i in range(5):
            print("work1 Work{}".format(i))
            gevent.sleep(1)


    def work2():
        for i in range(5):
            print("work2 Work{}".format(i))
            gevent.sleep(1)


    if __name__ == '__main__':
        w1 = gevent.spawn(work1)
        w2 = gevent.spawn(work2)
        gevent.joinall([w1, w2])
7. Comparison of processes, threads, and coroutines
7.1 Relationship among processes, threads, and coroutines
- A process has at least one thread, and a process can contain multiple threads
- A thread can contain multiple coroutines
7.2 Comparison of processes, threads, and coroutines
- A process is the unit of resource allocation
- A thread is the unit of operating system scheduling
- Process switching requires the most resources and is the least efficient
- Thread switching requires a moderate amount of resources and has moderate efficiency (setting the GIL aside)
- Coroutine switching requires few resources and is highly efficient
- Multiple processes and multiple threads may run in parallel, depending on the number of CPU cores, whereas coroutines run within a single thread, so they are concurrent rather than parallel
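The last point, that coroutines share one thread while threads do not, can be observed with the standard threading module. The sketch below uses generator-based coroutines as in section 4; the function names are hypothetical.

```python
import threading


def coroutine_task(seen):
    """Generator-based task: records the thread it runs in, then yields."""
    while True:
        seen.append(threading.get_ident())
        yield


def thread_task(seen):
    """Thread target: records the thread it runs in."""
    seen.append(threading.get_ident())


main_id = threading.get_ident()

# Two generator tasks, switched by hand: both run in the main thread.
coroutine_ids = []
c1, c2 = coroutine_task(coroutine_ids), coroutine_task(coroutine_ids)
for _ in range(2):
    next(c1)
    next(c2)
print(set(coroutine_ids) == {main_id})  # True

# Two real threads: each records an identifier other than the main thread's.
thread_ids = []
threads = [threading.Thread(target=thread_task, args=(thread_ids,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(tid != main_id for tid in thread_ids))  # True
```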
Summary
1. Processes, threads, and coroutines can all accomplish multitasking, and you can choose among them according to your actual development needs
2. Because threads and coroutines require relatively few resources, they are the most commonly used
3. Coroutines require the fewest resources of the three