Top of Python performance

Posted by moonie on Wed, 08 Dec 2021 07:20:38 +0100

Core value

1. How Python calls C/C++.
2. How large the performance gap between the two is in a compute-intensive scenario.

Background

I came across this paragraph while reading about websockets:

4. Performance: memory usage is optimized and configurable. A C extension accelerates expensive operations. It's pre-compiled for Linux, macOS and Windows and packaged in the wheel format for each system and Python version.

In other words, websockets uses an extension written in C for its expensive computations, and the author hopes to improve the program's performance this way. So the question is: how much faster is C than Python on the same problem?

Design test scenarios

In theory, to be as accurate and comprehensive as possible, I should design test cases for several different scenarios. But then I don't know when I would finish, so let's save that for the next post.

So this time I'll keep it simple. First, I want to know roughly how large the difference is in one compute-intensive scenario. Second, I want to show in detail how to call C/C++ from Python.

In the end, I chose computing the nth term of the Fibonacci sequence as the scenario.

Python implementation

Let's first use Python to compute the nth term of the Fibonacci sequence and measure how long it takes.

#!/usr/bin/env python3
# -*- coding: utf8 -*-

import time

def fib(n):
    """
    Return the nth term of the Fibonacci sequence.

    Parameters
    ----------
    n: int
        Position in the Fibonacci sequence (fib(0) == fib(1) == 1).
    """
    if n <= 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    # Time the computation of the 39th term of the Fibonacci sequence
    start_at = time.time()
    fib(39)
    end_at = time.time()

    print(f"total-time = {end_at - start_at}")

It takes about 18.8 seconds on my Mac:

python3 fib.py
total-time = 18.83948802947998
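
To get a feel for why this run takes so long, here is a minimal sketch that counts how many times the naive recursion calls itself. The helper names `fib_counting_calls` and `fib_iterative` are mine, not part of the original script; both follow the same convention as above (fib(0) == fib(1) == 1).

```python
def fib_counting_calls(n):
    """Return (value, calls) for the naive recursion used in fib.py."""
    if n <= 1:
        return 1, 1
    a, calls_a = fib_counting_calls(n - 1)
    b, calls_b = fib_counting_calls(n - 2)
    # One call for this frame plus everything below it.
    return a + b, calls_a + calls_b + 1


def fib_iterative(n):
    """Same sequence computed in a linear loop (used only to extrapolate)."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    for n in (10, 20, 25):
        value, calls = fib_counting_calls(n)
        # For this recursion the call count equals 2 * value - 1.
        print(f"n={n:2d} value={value} calls={calls}")
    # Extrapolate to n = 39 without running the slow recursion.
    print(f"n=39 calls={2 * fib_iterative(39) - 1}")
```

By this count, fib(39) makes roughly 205 million function calls, so the benchmark mostly measures function-call overhead.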

C++ implementation

Judging by the execution time, Python does not do well computing the 39th term with the recursive algorithm. Let's look at the execution time of C++.

#include<iostream>
#include<ctime>
using namespace std;

int fib(int n)
{
    if(n <= 1)
    {
        return 1;
    }
    else 
    {
        return fib(n - 1) + fib(n - 2);
    }
}

int main() 
{
    clock_t start_at = clock();
    int number = fib(39);
    clock_t end_at = clock();
    cout<<"total-time =  "<<double(end_at - start_at)/CLOCKS_PER_SEC<<endl;

    return 0;
}

It takes about 0.34 seconds on my Mac:

g++ -o fib-cpp main.cpp && ./fib-cpp
total-time =  0.345918

Python vs C++

For the Fibonacci scenario, the running times of the two languages are as follows.

Test item                         Language  Time (s)
Calculate the Fibonacci sequence  C++       0.34
Calculate the Fibonacci sequence  Python    18.83

Python is about 54 times slower than C++ here (18.839 / 0.346 ≈ 54, using the unrounded timings from the two runs above).

Calling C++ from Python

The Python interpreter cannot call C++ source files directly, but once we compile them into a shared library (.so files on Linux, .dll files on Windows), Python can use it as a module.

When writing the code, the library has to follow the conventions defined by Python's C API; otherwise the interpreter still won't recognize it.
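
One part of "recognizing" a library is simply its file name. As a small sketch, the standard library can list the suffixes the running interpreter accepts for extension modules. The module name `fibcpp` is the one used later in this post, and the exact list depends on your platform and Python version.

```python
import importlib.machinery

# File-name suffixes this interpreter tries when importing an extension
# module; the list differs between platforms and Python versions.
suffixes = importlib.machinery.EXTENSION_SUFFIXES

for suffix in suffixes:
    # Candidate file names that "import fibcpp" would look for.
    print(f"fibcpp{suffix}")
```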

Step 1: implement the function

int fib(int n)
{
    if(n <= 1)
    {
        return 1;
    }
    else 
    {
        return fib(n - 1) + fib(n - 2);
    }
}

Step 2: wrap the function with Python data types

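static PyObject *fib_wraper(PyObject *self, PyObject *args)
{
    int n = 0, result = 0;
    // Convert the Python argument to a C int. On failure an exception is
    // already set, so return NULL to propagate it.
    if (!PyArg_ParseTuple(args, "i", &n))
    {
        return NULL;
    }
    result = fib(n);
    // Convert the C int result back into a Python int.
    return Py_BuildValue("i", result);
}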
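
One caveat with this wrapper: both fib and the "i" format work with a C int, so large results cannot be represented. The sketch below assumes a 32-bit int (typical on mainstream platforms, but an assumption) and finds the largest n whose result still fits. The helper names are mine.

```python
INT_MAX = 2**31 - 1  # assumption: C int is 32 bits wide


def fib(n):
    """Iterative version of the same sequence (fib(0) == fib(1) == 1)."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def largest_safe_n(limit=INT_MAX):
    """Largest n for which fib(n) does not exceed limit."""
    n = 0
    while fib(n + 1) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    n = largest_safe_n()
    print(f"largest n that fits in a 32-bit int: {n} (fib = {fib(n)})")
```

Under that assumption the 39th term used in this post is well within range, but anything beyond n = 45 would overflow in the C++ version.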

Step 3: define the module's method table

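static PyMethodDef methods[] = {
    // {Python-visible name, C function, calling convention, docstring}
    {"fib", fib_wraper, METH_VARARGS, "fib generator ."},
    {NULL, NULL, 0, NULL}  // sentinel marking the end of the table
};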

Step 4: define the module

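static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT,
    "fibcpp",           // module name, must match the PyInit_<name> function
    "a simple module",  // module docstring
    -1,                 // state size; -1 means the module keeps state in globals
    methods             // method table from step 3
};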

Step 5: define the module's initialization function

PyMODINIT_FUNC PyInit_fibcpp(void)
{
    return PyModule_Create(&module);
}

Complete C++ code (source file name main.cpp)

// Python.h should be included before any standard headers.
#include <Python.h>
#include <iostream>
#include <ctime>

using namespace std;

int fib(int n)
{
    if(n <= 1)
    {
        return 1;
    }
    else 
    {
        return fib(n - 1) + fib(n - 2);
    }
}

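static PyObject *fib_wraper(PyObject *self, PyObject *args)
{
    int n = 0, result = 0;
    // Convert the Python argument to a C int. On failure an exception is
    // already set, so return NULL to propagate it.
    if (!PyArg_ParseTuple(args, "i", &n))
    {
        return NULL;
    }
    result = fib(n);
    // Convert the C int result back into a Python int.
    return Py_BuildValue("i", result);
}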

static PyMethodDef methods[] = {
    {"fib",fib_wraper,METH_VARARGS,"fib generator ."},
    {0,0,0,0}
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT,
    "fibcpp",
    "a simple module",
    -1,
    methods
};

PyMODINIT_FUNC PyInit_fibcpp(void)
{
    return PyModule_Create(&module);
}

Compiling the C++ code into a library file

I did this on Linux; it should be similar on Windows. The compile command is as follows.

g++ -fPIC -I /usr/local/python/include/python3.8/ -o fibcpp.so -shared main.cpp
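
The include path in that command is specific to my installation. As a hedged sketch, you can ask the interpreter itself where its headers live and assemble the same command from that. The function name `build_command` is mine, and the flags are simply the ones from the command above.

```python
import shlex
import sysconfig


def build_command(source="main.cpp", output="fibcpp.so"):
    """Assemble the g++ command line for the running interpreter."""
    # Directory that contains Python.h for this interpreter.
    include_dir = sysconfig.get_paths()["include"]
    return ["g++", "-fPIC", "-shared", "-I", include_dir, "-o", output, source]


if __name__ == "__main__":
    # Print the command so it can be copied into a shell.
    print(shlex.join(build_command()))
```

Building against the headers of the same interpreter that later imports the module avoids version mismatches between the two.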

Testing the efficiency of Python calling C++

#!/usr/bin/env python3
# -*- coding: utf8 -*-
import time
from fibcpp import fib

if __name__ == "__main__":
    # Time the computation of the 39th term of the Fibonacci sequence
    start_at = time.time()
    fib(39)
    end_at = time.time()
    print(f"total-time = {end_at - start_at}")

The result is as follows.

python3 fib-embbed.py 
total-time = 0.6759123802185059
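
A single run can be noisy. The sketch below is one way to make the measurement a bit more robust: it repeats the call, keeps the fastest time, and checks the extension's result against a pure Python version. The names `py_fib` and `best_of` are mine, and the script falls back to pure Python when fibcpp has not been built.

```python
import time


def py_fib(n):
    """Pure Python reference with the same convention (fib(0) == fib(1) == 1)."""
    return 1 if n <= 1 else py_fib(n - 1) + py_fib(n - 2)


try:
    from fibcpp import fib  # the extension module built above
except ImportError:
    fib = py_fib  # fallback so the sketch still runs without the build


def best_of(func, n, repeats=3):
    """Return (result, fastest wall-clock time in seconds) over several runs."""
    timings = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(n)
        timings.append(time.perf_counter() - start)
    return result, min(timings)


if __name__ == "__main__":
    n = 25  # small enough for the fallback; raise to 39 to match this post
    result, seconds = best_of(fib, n)
    assert result == py_fib(n)  # both implementations must agree
    print(f"fib({n}) = {result}, best of 3 = {seconds:.6f} s")
```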

Summary

The running times for the same problem and the same algorithm under the three implementations are shown in the table below.

Test item                         Language         Time (s)
Calculate the Fibonacci sequence  C++              0.34
Calculate the Fibonacci sequence  Python-Call-C++  0.67
Calculate the Fibonacci sequence  Python           18.83
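
To put the three numbers side by side, here is a small sketch that computes each time relative to the standalone C++ run. The timings are copied from the program output quoted earlier in this post; nothing is re-measured.

```python
# Raw timings in seconds, copied from the output shown in this post.
timings = {
    "C++": 0.345918,
    "Python-Call-C++": 0.6759123802185059,
    "Python": 18.83948802947998,
}

baseline = timings["C++"]
# How many times longer each variant takes than standalone C++.
slowdown = {name: seconds / baseline for name, seconds in timings.items()}

for name, factor in slowdown.items():
    print(f"{name:<16} {timings[name]:>6.2f} s  {factor:>5.1f}x the C++ time")
```

On these numbers, calling the C++ extension from Python takes about twice as long as the standalone C++ program and is roughly 28 times faster than pure Python. Keep in mind that the C++ program measures CPU time with clock() while the Python scripts measure wall-clock time, so the comparison is approximate.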