Introduction to pathos, a Python parallel computing module

Posted by amsgwp on Fri, 04 Mar 2022 06:35:19 +0100

Contents

pathos module

1. pathos's own multiprocessing pools (pathos.multiprocessing.ProcessPool, pathos.multiprocessing.ProcessingPool, pathos.pools.ProcessPool)

2. Pool mapped from the multiprocess module (pathos.multiprocessing.Pool)

3. Pool mapped from the pp module, method 1 (pathos.pools.ParallelPool, pathos.pp.ParallelPool, pathos.pp.ParallelPythonPool, pathos.parallel.ParallelPythonPool, pathos.parallel.ParallelPool)

4. Pool mapped from the pp module, method 2 (pathos.pp.pp module)

5. Pool mapped from Python's built-in map function (pathos.serial.SerialPool, pathos.pools.SerialPool)

Examples (pathos module)

(1)pathos.multiprocessing.ProcessPool(), pipe method

(2)pathos.multiprocessing.ProcessPool(), apipe method

(3)pathos.multiprocessing.ProcessPool(), map method

(4)pathos.multiprocessing.ProcessPool(), imap method

(5)pathos.multiprocessing.ProcessPool(), uimap method

(6)pathos.multiprocessing.ProcessPool(), amap method

(7)pathos.pp.ParallelPool(), pipe method

(8)pathos.pp.ParallelPool(), apipe method

(9)pathos.pp.ParallelPool(), map method

(10)pathos.pp.ParallelPool(), amap method

(11)pathos.pp.ParallelPool(), imap method

(12)pathos.pp.ParallelPool(), uimap method

pathos module

pathos is a fairly comprehensive module that supports both multiprocessing and multithreading, mainly through process pools and thread pools.

pathos has its own set of process pool methods and also integrates the process pools of the multiprocess and pp modules.

1. pathos's own multiprocessing pools (pathos.multiprocessing.ProcessPool, pathos.multiprocessing.ProcessingPool, pathos.pools.ProcessPool)

(1) Establish process pool

pathos.multiprocessing.ProcessPool(*args, **kwds) # establishes the process pool of pathos (pathos.multiprocessing.ProcessPool instance).

pathos.multiprocessing.ProcessingPool(*args, **kwds) # ditto.

pathos.pools.ProcessPool(*args, **kwds) # ditto.

nodes: number of workers. If nodes is not specified, the number of processors (i.e. ncpus) is detected automatically.
ncpus: number of worker processors.
servers: list of worker servers.
scheduler: the associated scheduler.
workdir: $WORKDIR for scratch calculations/files.
scatter: if True, scatter-gather is used instead of the default worker pool.
source: if False, temporary files are used as little as possible.
timeout: how long to wait for a return value from the scheduler.

There are also several common methods for process pools:

XXX.close() # closes the process pool XXX. Once the pool is closed, no new tasks can be submitted to it; call join() afterwards to wait for the running child processes to finish.

XXX.join() # waits for the child processes in the pool XXX to finish. It must be called after close().

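import pathos

def f(a, b=1):  # b has a default value
    return a * b

a_seq, b_seq = [1, 2, 3], [4, 5, 6]  # example inputs

pool = pathos.multiprocessing.ProcessPool()  # pathos's own pool: map accepts several sequences
result = pool.map(f, a_seq, b_seq)
pool.close()  # no more tasks can be submitted
pool.join()   # wait for the workers to finish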

(2) Create child process

(a) A single task (child process) can be submitted through the pipe methods:

XXX.pipe(f, *args, **kwds) # submits one task in blocking (non-parallel) mode and waits until its result is returned. XXX is a process pool instance.

XXX.apipe(f, *args, **kwds) # submits one task to the queue asynchronously (in parallel) and returns an ApplyResult instance. Its get() method returns the task's result but blocks until that result is ready, so it should be called only after all tasks have been submitted. XXX is a process pool instance.

f(*args, **kwds) is the work carried out by the child process.
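
The difference between the blocking and the asynchronous submission style can be tried out without pathos. The sketch below uses the standard library's multiprocessing.pool.ThreadPool as a stand-in, on the assumption that its apply() and apply_async() are the closest analogues of pipe() and apipe(); it is not pathos code. The task slow_square and its delay are made up for the illustration.

```python
import time
from multiprocessing.pool import ThreadPool

def slow_square(x, delay=0.2):
    """Hypothetical task: sleep for a while, then return x squared."""
    time.sleep(delay)
    return x * x

pool = ThreadPool(processes=4)

# Blocking style (analogue of pipe): each call returns only when its task
# is done, so the three tasks run one after another.
t0 = time.time()
blocking_results = [pool.apply(slow_square, (x,)) for x in (1, 2, 3)]
blocking_time = time.time() - t0

# Asynchronous style (analogue of apipe): submit every task first, then
# collect. get() blocks, so it is called only after all submissions.
t0 = time.time()
handles = [pool.apply_async(slow_square, (x,)) for x in (1, 2, 3)]
async_results = [h.get() for h in handles]
async_time = time.time() - t0

pool.close()
pool.join()

print(blocking_results, async_results)
print('blocking: %.2fs, async: %.2fs' % (blocking_time, async_time))
```

Both styles return the same values; only the elapsed time differs, because the asynchronous submissions overlap.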

(b) If the child processes return values that need to be collected and processed together, the map methods are recommended (the function may take several parameters):

XXX.map(f, *args, **kwds) # runs a batch of tasks with a blocking, ordered map and returns a list of the results. f(iterable1[i], iterable2[i], ...) is the work carried out by each child process. XXX is a process pool instance.

XXX.amap(f, *args, **kwds) # the asynchronous (parallel) version of XXX.map(). It returns a MapResult instance whose get() method returns the list of results. XXX is a process pool instance.

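import pathos

def f(a, b):  # the map methods accept functions with several parameters
    return a + b

a_seq, b_seq = (1, 2, 3), (4, 5, 6)  # example inputs

pool = pathos.multiprocessing.ProcessPool()  # pathos's own pool, which provides amap
result = pool.amap(f, a_seq, b_seq).get()  # get() blocks until all results are ready
pool.close()
pool.join()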
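
For comparison, the standard library pool's map() takes a single iterable, so a function with several parameters is usually fed through starmap() over zipped sequences. The sketch below shows that equivalent with multiprocessing.pool.ThreadPool, a stdlib stand-in chosen so the snippet runs without pathos; add and the input sequences are illustrative names.

```python
from multiprocessing.pool import ThreadPool

def add(a, b):
    """Illustrative two-argument task."""
    return a + b

a_seq = [1, 2, 3]
b_seq = [10, 20, 30]

pool = ThreadPool(processes=2)

# Blocking version: pairs a_seq[i] with b_seq[i] and calls add on each pair.
sync_result = pool.starmap(add, zip(a_seq, b_seq))

# Asynchronous version: returns a handle at once; get() waits for the list.
async_handle = pool.starmap_async(add, zip(a_seq, b_seq))
async_result = async_handle.get()

pool.close()
pool.join()

print(sync_result)
print(async_result)
```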

(c) If memory is limited, the imap iterators can be used instead:

XXX.imap(f, *args, **kwds) # the non-blocking, ordered iterator version of XXX.map(). It returns an iterator. XXX is a process pool instance.

XXX.uimap(f, *args, **kwds) # the unordered version of XXX.imap(): results are returned in the order in which the tasks finish, not the order in which they were submitted. It returns an iterator. XXX is a process pool instance.

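import pathos

def f(a, b):
    return a + b

a_seq, b_seq = [1, 2, 3], [4, 5, 6]  # example inputs

pool = pathos.multiprocessing.ProcessPool()  # pathos's own pool, which provides uimap
result = pool.uimap(f, a_seq, b_seq)  # an iterator, not a list
pool.close()
pool.join()

for item in result:  # results come in completion order
    print(item)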
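
The ordering difference between the two iterators is easier to see when the tasks take visibly different times. The sketch below again uses the standard library's multiprocessing.pool.ThreadPool as a stand-in (an assumption for illustration, not pathos itself), with imap() and imap_unordered() playing the roles of imap() and uimap(). The task wait_then_return and the delays are made up.

```python
import time
from multiprocessing.pool import ThreadPool

def wait_then_return(delay):
    """Illustrative task: sleep for `delay` seconds and return the delay."""
    time.sleep(delay)
    return delay

delays = [0.6, 0.1, 0.3]  # the first task is the slowest

pool = ThreadPool(processes=3)  # one worker per task, so all start together

# Ordered iterator: results come back in submission order.
ordered = list(pool.imap(wait_then_return, delays))

# Unordered iterator: results come back as the tasks finish.
unordered = list(pool.imap_unordered(wait_then_return, delays))

pool.close()
pool.join()

print(ordered)
print(unordered)
```

The ordered iterator has to hold back the fast results until the slow first task is done, while the unordered one hands them over immediately.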

2. Pool mapped from the multiprocess module (pathos.multiprocessing.Pool)

(1) Establish process pool

pathos.multiprocessing.Pool(processes=None, initializer=None, initargs=(), maxtasksperchild=None, context=None) # establish a multiprocess process pool.

processes: the number of worker processes to use. If processes is None, the number returned by os.cpu_count() is used.
initializer: if initializer is not None, every worker process calls initializer(*initargs) when it starts.
maxtasksperchild: the number of tasks a worker process can complete before it exits and is replaced by a fresh worker process, which frees unused resources. The default is None, which means the worker processes live as long as the pool.
context: specifies the context used to start the worker processes. A pool is usually created with multiprocess.Pool() or with the Pool() method of a context object; in both cases context is set appropriately.
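
As a rough illustration of processes, initializer and initargs, the sketch below uses the standard library's multiprocessing.pool.ThreadPool, which accepts these three arguments with the same meaning but runs threads instead of processes (maxtasksperchild and context are not shown). It is a stand-in so the snippet can run without pathos; init_worker, square and the 'ready' tag are hypothetical names.

```python
import threading
from multiprocessing.pool import ThreadPool

init_calls = []               # one entry per worker that was initialised
init_lock = threading.Lock()

def init_worker(tag):
    """Called once in every worker when it starts: initializer(*initargs)."""
    with init_lock:
        init_calls.append(tag)

def square(x):
    return x * x

pool = ThreadPool(processes=3, initializer=init_worker, initargs=('ready',))
squares = pool.map(square, range(6))
pool.close()
pool.join()

print(squares)
print(init_calls)
```

The initializer runs once per worker, not once per task, which makes it a natural place for per-worker setup.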

(2) Create child process

The methods for creating child processes with this pool are exactly the same as those of multiprocess.Pool() (that is, multiprocessing.Pool()).

3. Pool mapped from the pp module, method 1 (pathos.pools.ParallelPool, pathos.pp.ParallelPool, pathos.pp.ParallelPythonPool, pathos.parallel.ParallelPythonPool, pathos.parallel.ParallelPool)

(1) Establish process pool

pathos.pp.ParallelPool(*args, **kwds) # creates a process pool that maps onto the pp module and returns a pathos.parallel.ParallelPool instance. Note that the way this pool is created is completely different from that of the pp module itself.

pathos.pp.ParallelPythonPool(*args, **kwds) # equivalent to pathos.pp.ParallelPool().

pathos.pools.ParallelPool(*args, **kwds) # equivalent to pathos.pp.ParallelPool().

pathos.parallel.ParallelPool(*args, **kwds) # equivalent to pathos.pp.ParallelPool().

pathos.parallel.ParallelPythonPool(*args, **kwds) # equivalent to pathos.pp.ParallelPool().

nodes: number of workers. If nodes is not specified, the number of processors (i.e. ncpus) is detected automatically.
ncpus: number of worker processors.
servers: list of worker servers.
scheduler: the associated scheduler.
workdir: $WORKDIR for scratch calculations/files.
scatter: if True, scatter-gather is used instead of the default worker pool.
source: if False, temporary files are used as little as possible.
timeout: how long to wait for a return value from the scheduler.

(2) Create child process

The methods for creating child processes with this pool are identical to those of pathos.multiprocessing.ProcessPool() (and completely different from those of the pp module).

Note that a pipe object created by multiprocessing.Pipe() or multiprocess.Pipe() cannot be passed to the child processes (probably because of a pickle error). In a ParallelPool, however, print calls in a child process go directly to standard output, so there is no need to send messages to the main process through a pipe. The formatting of text printed by child processes is often irregular, though, so it is best to pass it back through the return value and print it in the main process.

The amap method is a special case. With amap, if the child process contains a print statement, the returned result is wrong: it is a tuple containing only the return value of the last child process instead of the complete list of all return values. The reason is unclear. With amap, therefore, anything a child process needs to output has to be passed back through the return value and printed in the main process.

4. Pool mapped from the pp module, method 2 (pathos.pp.pp module)

This is essentially the pp module itself.

5. Pool mapped from Python's built-in map function (pathos.serial.SerialPool, pathos.pools.SerialPool)

These methods are actually serial (non-parallel) and are not covered in detail here.

A pool created with SerialPool supports only the pipe, map and imap methods (all blocking); apipe, amap and uimap are not available.
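
Since this pool is described as mapping onto Python's built-in map, the behaviour to expect can be sketched with the built-in alone. The snippet below does not import pathos; it only shows what a serial map over several sequences and its iterator form compute. The function scale and the inputs are illustrative.

```python
def scale(x, factor):
    """Illustrative two-argument task."""
    return x * factor

values = [1, 2, 3]
factors = [10, 100, 1000]

# Serial, blocking map over several sequences: scale(values[i], factors[i]).
mapped = list(map(scale, values, factors))

# Iterator form: nothing is computed until the iterator is consumed.
lazy = map(scale, values, factors)
first = next(lazy)
rest = list(lazy)

print(mapped)
print(first, rest)
```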

Examples (pathos module)

(1)pathos.multiprocessing.ProcessPool(), pipe method

import pathos
import multiprocess
import time

def f(x, conn, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    res = []
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool()
    p_conn, c_conn = multiprocess.Pipe()
    t0 = time.time()
    for i in var:
        res.append(p.pipe(f, i, c_conn, t0))

    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the child processes run one after another.

 output:
factorial of 4: start@1.11s
factorial of 4: finish@2.61s, res = 24
factorial of 8: start@2.61s
factorial of 8: finish@6.12s, res = 40320
factorial of 12: start@6.12s
factorial of 12: finish@11.62s, res = 479001600
factorial of 20: start@11.63s
factorial of 20: finish@21.13s, res = 2432902008176640000
factorial of 16: start@21.15s
factorial of 16: finish@28.65s, res = 20922789888000
factorial of (4, 8, 12, 20, 16)@28.73s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(2)pathos.multiprocessing.ProcessPool(), apipe method

import pathos
import multiprocess
import time

def f(x, conn, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    res = []
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool()
    p_conn, c_conn = multiprocess.Pipe()
    t0 = time.time()
    for i in var:
        res.append(p.apipe(f, i, c_conn, t0))
    for i in range(len(res)):
        res[i] = res[i].get()

    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

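Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts.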

output:
factorial of 4: start@1.10s
factorial of 8: start@1.18s
factorial of 12: start@1.19s
factorial of 20: start@1.25s
factorial of 4: finish@2.60s, res = 24
factorial of 16: start@2.61s
factorial of 8: finish@4.69s, res = 40320
factorial of 12: finish@6.69s, res = 479001600
factorial of 16: finish@10.11s, res = 20922789888000
factorial of 20: finish@10.75s, res = 2432902008176640000
factorial of (4, 8, 12, 20, 16)@10.85s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(3)pathos.multiprocessing.ProcessPool(), map method

Note that this example passes a connection to the child processes as an argument. A connection created by multiprocessing.Pipe() causes a pickle error here, so the connection is created with multiprocess.Pipe() instead.

import pathos
import multiprocess
import time

def f(x, conn, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool()
    p_conn, c_conn = multiprocess.Pipe()
    t0 = time.time()
    conn_s = [c_conn] * len(var)
    t0_s = [t0] * len(var)
    res = p.map(f, var, conn_s, t0_s)

    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts.

output:
factorial of 4: start@1.15s
factorial of 8: start@1.15s
factorial of 12: start@1.19s
factorial of 20: start@1.26s
factorial of 4: finish@2.65s, res = 24
factorial of 16: start@2.65s
factorial of 8: finish@4.66s, res = 40320
factorial of 12: finish@6.70s, res = 479001600
factorial of 16: finish@10.15s, res = 20922789888000
factorial of 20: finish@10.76s, res = 2432902008176640000
factorial of (4, 8, 12, 20, 16)@10.91s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(4)pathos.multiprocessing.ProcessPool(), imap method

import pathos
import multiprocess
import time

def f(x, conn, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool()
    p_conn, c_conn = multiprocess.Pipe()
    t0 = time.time()
    conn_s = [c_conn] * len(var)
    t0_s = [t0] * len(var)
    res = list(p.imap(f, var, conn_s, t0_s))

    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts.

output:
factorial of 4: start@1.27s
factorial of 8: start@1.29s
factorial of 12: start@1.30s
factorial of 20: start@1.38s
factorial of 4: finish@2.77s, res = 24
factorial of 16: start@2.77s
factorial of 8: finish@4.79s, res = 40320
factorial of 12: finish@6.81s, res = 479001600
factorial of 16: finish@10.27s, res = 20922789888000
factorial of 20: finish@10.89s, res = 2432902008176640000
factorial of (4, 8, 12, 20, 16)@11.01s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(5)pathos.multiprocessing.ProcessPool(), uimap method

import pathos
import multiprocess
import time

def f(x, conn, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool()
    p_conn, c_conn = multiprocess.Pipe()
    t0 = time.time()
    conn_s = [c_conn] * len(var)
    t0_s = [t0] * len(var)
    res = list(p.uimap(f, var, conn_s, t0_s))

    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts. In addition, the return value of the fifth task (16) comes back before that of the fourth task (20), because it finishes earlier.

output:
factorial of 4: start@1.03s
factorial of 8: start@1.08s
factorial of 12: start@1.10s
factorial of 20: start@1.15s
factorial of 4: finish@2.53s, res = 24
factorial of 16: start@2.53s
factorial of 8: finish@4.58s, res = 40320
factorial of 12: finish@6.60s, res = 479001600
factorial of 16: finish@10.03s, res = 20922789888000
factorial of 20: finish@10.66s, res = 2432902008176640000
factorial of (4, 8, 12, 20, 16)@10.78s: [24, 40320, 479001600, 20922789888000, 2432902008176640000]

(6)pathos.multiprocessing.ProcessPool(), amap method

import pathos
import multiprocess
import time

def f(x, conn, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool()
    p_conn, c_conn = multiprocess.Pipe()
    t0 = time.time()
    conn_s = [c_conn] * len(var)
    t0_s = [t0] * len(var)
    res = p.amap(f, var, conn_s, t0_s).get()

    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts.

output:
factorial of 4: start@1.04s
factorial of 8: start@1.07s
factorial of 12: start@1.12s
factorial of 20: start@1.13s
factorial of 4: finish@2.54s, res = 24
factorial of 16: start@2.54s
factorial of 8: finish@4.58s, res = 40320
factorial of 12: finish@6.62s, res = 479001600
factorial of 16: finish@10.04s, res = 20922789888000
factorial of 20: finish@10.64s, res = 2432902008176640000
factorial of (4, 8, 12, 20, 16)@10.76s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(7)pathos.pp.ParallelPool(), pipe method

Note that a pipe object created by multiprocessing.Pipe() or multiprocess.Pipe() cannot be passed to the child processes (probably because of a pickle error). In a pathos.pp.ParallelPool() pool, however, print calls in a child process go directly to standard output, so there is no need to send messages to the main process through a pipe.

import pathos
import time

def f(x, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    print('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    print('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    res = []
    var = (4, 8, 12, 20, 16)
    p = pathos.pp.ParallelPool()
    t0 = time.time()
    for i in var:
        res.append(p.pipe(f, i, t0))
        
    print('output:')
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the child processes run one after another.

factorial of 4: start@0.12s
factorial of 4: finish@1.62s, res = 24
factorial of 8: start@1.80s
factorial of 8: finish@5.30s, res = 40320
factorial of 12: start@5.46s
factorial of 12: finish@10.96s, res = 479001600
factorial of 20: start@11.16s
factorial of 20: finish@20.66s, res = 2432902008176640000
factorial of 16: start@20.94s
factorial of 16: finish@28.44s, res = 20922789888000
output:
factorial of (4, 8, 12, 20, 16)@28.67s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(8)pathos.pp.ParallelPool(), apipe method

import pathos
import time

def f(x, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    print('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    print('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    res = []
    var = (4, 8, 12, 20, 16)
    p = pathos.pp.ParallelPool()
    t0 = time.time()
    for i in var:
        res.append(p.apipe(f, i, t0))
        
    print('output:')
    for i in range(len(res)):
        res[i] = res[i].get()
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts.

output:
factorial of 4: start@0.20s
factorial of 4: finish@1.70s, res = 24
 factorial of 8: start@0.21s
factorial of 8: finish@3.71s, res = 40320
 factorial of 12: start@0.13s
factorial of 12: finish@5.63s, res = 479001600
 factorial of 20: start@0.18s
factorial of 20: finish@9.68s, res = 2432902008176640000
factorial of 16: start@1.70s
factorial of 16: finish@9.20s, res = 20922789888000
 factorial of (4, 8, 12, 20, 16)@9.72s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(9)pathos.pp.ParallelPool(), map method

import pathos
import time

def f(x, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    print('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    print('factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans))
    return ans

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.pp.ParallelPool()
    t0 = time.time()
    res= p.map(f, var, [t0] * 5)
        
    print('output:')
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the child processes run one after another.

factorial of 4: start@0.14s
factorial of 4: finish@1.64s, res = 24
factorial of 8: start@1.74s
factorial of 8: finish@5.24s, res = 40320
factorial of 12: start@5.35s
factorial of 12: finish@10.85s, res = 479001600
factorial of 20: start@11.01s
factorial of 20: finish@20.51s, res = 2432902008176640000
factorial of 16: start@20.66s
factorial of 16: finish@28.16s, res = 20922789888000
 output:
factorial of (4, 8, 12, 20, 16)@28.51s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(10)pathos.pp.ParallelPool(), amap method

Note: with amap, if the child process contains a print statement, the returned result is a tuple containing only the return value of the last child process instead of the complete list of all return values. The reason is unclear. With amap, therefore, anything a child process needs to output has to be passed back through the return value and printed in the main process.

import pathos
import time

def f(x, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    msg1 = 'factorial of %d: start@%.2fs' % (x0, t)
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    msg2 = 'factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans)
    return (ans, msg1, msg2)

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.pp.ParallelPool()
    t0 = time.time()
    ret = p.amap(f, var, [t0] * 5).get()
    res = [item[0] for item in ret]
        
    print('output:')
    for item in ret:
        print(item[1])
        print(item[2])
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts.

output:
factorial of 4: start@0.16s
factorial of 4: finish@1.66s, res = 24
factorial of 8: start@0.18s
factorial of 8: finish@3.68s, res = 40320
factorial of 12: start@0.19s
factorial of 12: finish@5.69s, res = 479001600
factorial of 20: start@0.14s
factorial of 20: finish@9.64s, res = 2432902008176640000
factorial of 16: start@1.66s
factorial of 16: finish@9.16s, res = 20922789888000
factorial of (4, 8, 12, 20, 16)@9.72s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(11)pathos.pp.ParallelPool(), imap method

import pathos
import time

def f(x, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    msg1 = 'factorial of %d: start@%.2fs' % (x0, t)
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    msg2 = 'factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans)
    return (ans, msg1, msg2)

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.pp.ParallelPool()
    t0 = time.time()
    ret = list(p.imap(f, var, [t0] * 5))
    res = [item[0] for item in ret]
        
    print('output:')
    for item in ret:
        print(item[1])
        print(item[2])
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the child processes run one after another.

output:
factorial of 4: start@0.17s
factorial of 4: finish@1.67s, res = 24
factorial of 8: start@1.67s
factorial of 8: finish@5.17s, res = 40320
factorial of 12: start@5.17s
factorial of 12: finish@10.67s, res = 479001600
factorial of 20: start@10.67s
factorial of 20: finish@20.17s, res = 2432902008176640000
factorial of 16: start@20.17s
factorial of 16: finish@27.67s, res = 20922789888000
factorial of (4, 8, 12, 20, 16)@28.41s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]

(12)pathos.pp.ParallelPool(), uimap method

import pathos
import time

def f(x, t0):
    ans = 1
    x0 = x
    t = time.time() - t0
    msg1 = 'factorial of %d: start@%.2fs' % (x0, t)
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    msg2 = 'factorial of %d: finish@%.2fs, res = %d' %(x0, t, ans)
    return (ans, msg1, msg2)

def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.pp.ParallelPool()
    t0 = time.time()
    ret = list(p.uimap(f, var, [t0] * 5))
    res = [item[0] for item in ret]
        
    print('output:')
    for item in ret:
        print(item[1])
        print(item[2])
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))

if __name__ == '__main__':
    main()

Result: the first four child processes start almost simultaneously; as soon as one of them finishes, the fifth starts. In addition, the return value of the fifth task (16) comes back before that of the fourth task (20), because it finishes earlier.

output:
factorial of 4: start@0.26s
factorial of 4: finish@1.76s, res = 24
factorial of 8: start@0.29s
factorial of 8: finish@3.79s, res = 40320
factorial of 12: start@0.25s
factorial of 12: finish@5.75s, res = 479001600
factorial of 16: start@1.77s
factorial of 16: finish@9.28s, res = 20922789888000
factorial of 20: start@0.31s
factorial of 20: finish@9.81s, res = 2432902008176640000
factorial of (4, 8, 12, 20, 16)@10.24s: [24, 40320, 479001600, 20922789888000, 2432902008176640000]

Topics: Python Back-end