Sharing a simple, easy-to-use Python parallel module: the pp module

Posted by Stille on Sun, 16 Jan 2022 07:44:17 +0100

At present, most personal computers are multi-core, but when you run a Python program you will find that only one core (CPU) is actually running the code while the other cores sit idle.
The purpose of parallel computing is to put all cores to work to speed up code execution. In Python, because of the global interpreter lock (GIL), if you use Python's standard threads for parallel computing, you may find that the code runs no faster, or even slower, than on a single core!
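
To see the effect of the GIL for yourself, here is a minimal sketch using only the standard library. It assumes a standard (GIL-enabled) CPython build; the function names (count_down, run_serial, run_threaded) and the loop size are made up for illustration. It times the same CPU-bound loop run back to back and then in four threads. On a typical machine the threaded run is not expected to be noticeably faster.

```python
import threading
import time


def count_down(n):
    """CPU-bound busy loop: pure Python bytecode, so it needs the GIL to run."""
    while n > 0:
        n -= 1


def run_serial(n, workers):
    """Run count_down `workers` times, one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(workers):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, workers):
    """Run count_down in `workers` threads at once; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n, workers = 2000000, 4  # illustrative sizes, adjust for your machine
    print("serial:   %.3fs" % run_serial(n, workers))
    print("threaded: %.3fs" % run_threaded(n, workers))
```

The exact numbers depend on your machine and Python version, so compare the two timings rather than reading anything into the absolute values.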

Some parallel modules work around this limitation by running the work in separate Python processes, each with its own interpreter and its own GIL, which makes Python efficient for parallel computing on multi-core computers. The PP (Parallel Python) module is one of them.

The pp module is a lightweight parallel module for Python that can noticeably speed up a program and is very convenient to use.

The following test code calculates the sum of all prime numbers in each of several ranges starting at 0, that is, 0 to 100000, 0 to 100100, ..., 0 to 100700:

import pp,time,math
def isprime(n):
    """Returns True if n is prime and False otherwise"""
    if not isinstance(n, int):
        raise TypeError("argument passed to is_prime is not of 'int' type")
    if n < 2:
        return False
    if n == 2:
        return True
    max = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= max:
        if n % i == 0:
            return False
        i += 1
    return True

def sum_primes(n):
    """Calculates sum of all primes below given integer n"""
    return sum([x for x in range(2, n) if isprime(x)])

#Serial Code
print("{beg}Serial Program{beg}".format(beg='-'*16))
startTime = time.time()

inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700)
results = [ (input,sum_primes(input)) for input in inputs ]

for input, result in results:
    print("Sum of primes below %s is %s" % (input, result))

print("Time-consuming:%.3fs"%( time.time()-startTime ) )

#Parallel Code
print("{beg}Parallel Programs{beg}".format(beg='-'*16))
startTime = time.time()

job_server = pp.Server()
inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700)
jobs = [(input, job_server.submit(sum_primes, (input, ), (isprime, ),
        ("math", )) ) for input in inputs]

for input, job in jobs:
    print("Sum of primes below %s is %s" % (input, job()))

print("Time-consuming:%.3fs"%( time.time()-startTime ) )
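
Before comparing timings, it can be worth confirming that the two functions return correct values on small inputs. The following is a minimal sketch: it re-declares isprime and sum_primes (slightly condensed) so that it runs on its own, and cross-checks them against sieve_sum, a hypothetical helper added here that uses the sieve of Eratosthenes.

```python
import math


def isprime(n):
    """Trial division, same logic as the listing above (type check omitted)."""
    if n < 2:
        return False
    if n == 2:
        return True
    limit = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= limit:
        if n % i == 0:
            return False
        i += 1
    return True


def sum_primes(n):
    """Sum of all primes below n."""
    return sum(x for x in range(2, n) if isprime(x))


def sieve_sum(n):
    """Independent check: sum of primes below n via the sieve of Eratosthenes."""
    if n < 3:
        return 0
    flags = [True] * n
    flags[0] = flags[1] = False
    for p in range(2, int(math.sqrt(n)) + 1):
        if flags[p]:
            for multiple in range(p * p, n, p):
                flags[multiple] = False
    return sum(i for i, is_p in enumerate(flags) if is_p)


if __name__ == "__main__":
    print(sum_primes(10))   # 2 + 3 + 5 + 7 = 17
    print(all(sum_primes(n) == sieve_sum(n) for n in range(200)))
```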

Screenshot of the run:

Since my computer is only a pseudo quad-core (really a dual-core; Intel Hyper-Threading makes it look like four cores), the program runs only about twice as fast, but that is still good ~
In theory, the speedup from parallel computing is bounded by the number of physical cores in your computer (the speedup you actually get will be lower).
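
As a rough illustration of that bound, here is a small sketch that turns two timings into a speedup and a per-core efficiency. The helper names (speedup, efficiency) and the timings in the example are hypothetical placeholders, not measurements from this post. Note that os.cpu_count() reports logical cores, so a hyperthreaded dual-core machine reports 4.

```python
import os


def speedup(serial_seconds, parallel_seconds):
    """How many times faster the parallel run was."""
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, workers):
    """Speedup divided by worker count; 1.0 would be a perfect result."""
    return speedup(serial_seconds, parallel_seconds) / workers


if __name__ == "__main__":
    logical_cores = os.cpu_count() or 1  # logical cores, hyperthreads included
    serial_s, parallel_s = 8.0, 4.0      # hypothetical timings in seconds
    print("logical cores:", logical_cores)
    print("speedup: %.2fx" % speedup(serial_s, parallel_s))
    print("efficiency on 4 workers: %.0f%%"
          % (100 * efficiency(serial_s, parallel_s, 4)))
```

With these placeholder timings the speedup is 2x, which is 50% efficiency on 4 logical cores, the kind of figure you would expect when only 2 of them are physical.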

You can see that the parallel code is only a few lines longer than the serial code, yet it runs about twice as fast. So, are you tempted to give this recommendation a try?

Now for a proper introduction to the module.

One: Installation

1. Download the version of the pp module that matches your setup from the official website: http://www.parallelpython.com/content/view/18/32/

My system is Windows and my python version is 3.4.4, so I chose the following version

2. Open a command line in the unzipped directory and enter python setup.py install; the installation then runs automatically (that's it)

Two: Usage

1. Import Modules

import pp

2. Start the server

job_server = pp.Server()
# ncpus = 4                      # number of cores to use (optional)
# job_server = pp.Server(ncpus)  # create the server with that many workers
# By default, all cores are used
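
If you want to control the number of workers yourself, the sketch below shows one way to do it. choose_ncpus is a hypothetical helper, not part of pp; it clamps the requested count to the number of logical cores reported by os.cpu_count(). The pp part only runs if pp is installed, and otherwise the script just prints what it would have done.

```python
import os


def choose_ncpus(requested=None):
    """Clamp a requested worker count to the range [1, logical cores]."""
    available = os.cpu_count() or 1
    if requested is None:
        return available
    return max(1, min(requested, available))


if __name__ == "__main__":
    wanted = choose_ncpus(2)
    try:
        import pp  # third-party module, may not be installed
    except ImportError:
        print("pp is not installed; would start %d workers" % wanted)
    else:
        job_server = pp.Server(ncpus=wanted)
        print("pp started with %d workers" % job_server.get_ncpus())
        job_server.destroy()  # shut the worker processes down again
```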

3. Submit Tasks

f1 = job_server.submit(func1, args1, depfuncs1, modules1)
# func1: the function to execute in parallel
# args1: the arguments of func1, passed in as a tuple
# depfuncs1: functions called by func1, passed in as a tuple
# modules1: names of the modules func1 needs to import, passed in as a tuple

4. Get results

r1 = f1()

The following explains the use of the pp module based on previous test examples

import math
def isprime(n):
    if not isinstance(n, int):
        raise TypeError("argument passed to is_prime is not of 'int' type")
    if n < 2:
        return False
    if n == 2:
        return True
    max = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= max:
        if n % i == 0:
            return False
        i += 1
    return True

def sum_primes(n):
    return sum([x for x in range(2, n) if isprime(x)])

Main program:

import pp
job_server = pp.Server()
inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700)
jobs = [(input, job_server.submit(sum_primes, (input, ), (isprime, ),
        ("math", )) ) for input in inputs]
for input, job in jobs:
    print("Sum of primes below %s is %s" % (input, job()))

Line 1: Import the pp module.
Line 2: Start the pp server; leaving the parameters empty means all cores are used for the calculation.
Line 3: Define the tasks: compute the sum of all prime numbers in each of the 8 ranges 0 to 100000, 0 to 100100, ..., 0 to 100700.
Line 4: A list comprehension that takes each value from inputs and submits sum_primes with that value through job_server.submit.

Parameters of job_server.submit:

  • sum_primes is the function executed in parallel; note that you cannot write sum_primes() here
  • (input,) holds the argument of sum_primes, which is 100000, 100100, ..., 100700 in turn. Note that it is passed in as a tuple, and the comma is what makes it a tuple, so the comma inside the parentheses cannot be left out.
  • (isprime,) holds the functions that sum_primes needs in order to run, passed in as a tuple
  • ("math",) holds the names of the modules needed by sum_primes or isprime, passed in as a tuple
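
The point about the comma is plain Python behaviour and easy to check without pp. A minimal sketch (the variable names are only for illustration):

```python
# Parentheses alone do not make a tuple; the trailing comma does.
not_a_tuple = ("math")   # just the string "math"
one_tuple = ("math",)    # a tuple with one element

print(type(not_a_tuple).__name__)  # str
print(type(one_tuple).__name__)    # tuple

# Iterating over the string yields single characters, not a module name.
print(list(not_a_tuple))  # ['m', 'a', 't', 'h']
print(list(one_tuple))    # ['math']
```

So if the comma is left out, the receiving code is handed a plain string where it expects a tuple of names.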

The last two lines: after line 4 runs, each element of the jobs list is a 2-tuple of the form (input, func). These two lines call each func in a loop and print the result. (The result of a parallel job is obtained by calling the returned function.)
That completes a parallel program built with the pp module (parallel computing really is that simple).

The amount of code varies depending on the requirements of the task, but the basic process is the four steps above.
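
If you find yourself repeating those four steps, they can be wrapped in one function. The following is only a sketch: run_jobs and square are hypothetical names, not part of pp, and the serial fallback when pp is missing is an addition for convenience. It assumes each job takes a single argument and that the script is run as a regular .py file.

```python
def square(x):
    """Tiny stand-in for a real CPU-heavy function."""
    return x * x


def run_jobs(func, inputs, depfuncs=(), modules=()):
    """Apply func to every input and return the results in input order."""
    try:
        import pp                                  # step 1: import the module
    except ImportError:
        return [func(x) for x in inputs]           # no pp: plain serial loop
    job_server = pp.Server()                       # step 2: start the server
    try:
        jobs = [job_server.submit(func, (x,), depfuncs, modules)
                for x in inputs]                   # step 3: submit the tasks
        return [job() for job in jobs]             # step 4: get the results
    finally:
        job_server.destroy()                       # stop the worker processes


if __name__ == "__main__":
    print(run_jobs(square, [1, 2, 3]))
```

For the prime example from this post, the call would look like run_jobs(sum_primes, inputs, (isprime,), ("math",)).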
