Reading the Python source code: operations on Python lists

Posted by weyes1 on Fri, 11 Feb 2022 18:08:36 +0100

Introduction:

In the last article, I took a brief look at how a Python list is represented. Today, I'll take a brief look at some of the operations on a list. These are among the methods declared in the listobject.h header file. PyList_New and PyList_SetItem were covered in the previous article, and I won't say more about PyList_Size, which simply returns the length of the list. Let's go through the remaining methods.

1. Reading list elements

PyObject *
PyList_GetItem(PyObject *op, Py_ssize_t i)
{
    if (!PyList_Check(op)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    if (i < 0 || i >= Py_SIZE(op)) {
        if (indexerr == NULL) {
            indexerr = PyUnicode_FromString(
                "list index out of range");
            if (indexerr == NULL)
                return NULL;
        }
        PyErr_SetObject(PyExc_IndexError, indexerr);
        return NULL;
    }
    return ((PyListObject *)op) -> ob_item[i];
}

We can skip the validation code in this function and look directly at its last line. From the diagram in the previous article we know that ob_item points to an array of pointers, and in C an array element can be located in constant time by its subscript. Note that this C function treats a negative index as out of range.
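
The same behaviour can be observed from the Python side with a short sketch. It is only an illustration: Python-level indexing reaches the list through a different C entry point than PyList_GetItem (and, unlike it, accepts negative indices), and the variable names below are made up for the example.

```python
# Illustrative names; not part of the CPython source.
marker = object()
items = ["a", marker, "c"]

# Indexing hands back the stored pointer itself, not a copy of the object.
same_object = items[1] is marker

# An out-of-range index raises IndexError with the same message as in the C code.
try:
    items[3]
except IndexError as exc:
    message = str(exc)
```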

2. Insert elements into the list

The insert method in Python can insert an element at any position in the list. Let's see how it does this, starting with the source code:

static int
ins1(PyListObject *self, Py_ssize_t where, PyObject *v)
{
    Py_ssize_t i, n = Py_SIZE(self);  // Get the current number of elements in the list
    PyObject **items;
    if (v == NULL) {
        PyErr_BadInternalCall();
        return -1;
    }
    if (n == PY_SSIZE_T_MAX) {
        PyErr_SetString(PyExc_OverflowError,
            "cannot add more objects to list");
        return -1;
    }

    if (list_resize(self, n+1) < 0)
        return -1;

    if (where < 0) {
        where += n;
        if (where < 0)
            where = 0;
    }
    if (where > n)
        where = n;
    // The insertion position has now been clamped to the range [0, n].
    items = self->ob_item;
    // Shift every element at or after the insertion position back by one slot, so the time complexity here is O(n).
    for (i = n; --i >= where; )
        items[i+1] = items[i];
    Py_INCREF(v);
    items[where] = v;  // Store the new element's pointer at the insertion position
    return 0;
}

(Note that list_resize is called before the elements are shifted. When a list grows, it requests somewhat more space than it currently needs, so that memory does not have to be requested again every time an element is added. Only when that spare space runs out is more requested.)
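
The clamping and shifting logic of ins1 can be mirrored in a few lines of Python. This is a sketch for study, not how CPython does it: ins1_model is a hypothetical name, and a plain append of None stands in for list_resize(self, n+1).

```python
def ins1_model(items, where, v):
    """Hypothetical Python model of ins1; mutates items in place."""
    n = len(items)
    items.append(None)          # stand-in for list_resize(self, n+1)

    # Clamp the insertion position into the range [0, n].
    if where < 0:
        where += n
        if where < 0:
            where = 0
    if where > n:
        where = n

    # Shift elements from the end down to `where` back by one slot: O(n).
    i = n - 1
    while i >= where:
        items[i + 1] = items[i]
        i -= 1
    items[where] = v


demo = [1, 2, 3]
ins1_model(demo, -1, "x")       # a negative position counts from the end
```

Comparing the model with the built-in list.insert over a range of positions, including far out-of-range ones, is a quick way to check that the clamping rules have been read correctly.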

3. Add elements to the list

Compared with insert, append is simpler: it adds an element directly to the end of the list.

static int
app1(PyListObject *self, PyObject *v)
{
    Py_ssize_t n = PyList_GET_SIZE(self);  // Get the list length and assign it to n

    assert (v != NULL);
    if (n == PY_SSIZE_T_MAX) {
        PyErr_SetString(PyExc_OverflowError,
            "cannot add more objects to list");
        return -1;
    }

    if (list_resize(self, n+1) < 0)
        return -1;

    Py_INCREF(v);
    PyList_SET_ITEM(self, n, v);  // Store the pointer v at the last position, index n; no elements are moved, so this step is O(1)
    return 0;
}

We can see that append, which moves no elements, has a lower time complexity than insert, which may have to shift up to n elements.
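
The spare space requested by list_resize can be glimpsed from Python. The sketch below assumes CPython, where sys.getsizeof on a list reflects the allocated slots rather than only the used ones. It counts how often the reported size changes during 1000 appends; the exact count depends on the interpreter version, so no figure is given here.

```python
import sys

items = []
last_size = sys.getsizeof(items)
reallocations = 0               # number of times the reported size changed

for value in range(1000):
    items.append(value)
    size = sys.getsizeof(items)
    if size != last_size:       # the allocation grew on this append
        reallocations += 1
        last_size = size

# Far fewer size changes than appends means most appends reused spare slots.
print(reallocations, "size changes for", len(items), "appends")
```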

4. List slicing

List slicing in Python is useful in many scenarios. Let's explore how it is implemented in the source code:

/*
Worked example: a = [1,2,3,4,5,6], b = a[2:4]; find b.
Analysis: ilow=2, ihigh=4; substitute these into the code below.
*/
static PyObject *
list_slice(PyListObject *a, Py_ssize_t ilow, Py_ssize_t ihigh)
{
    PyListObject *np;  // Pointer to the new list that will hold the slice
    PyObject **src, **dest;
    Py_ssize_t i, len;
    // The following two branches clamp out-of-range bounds
    if (ilow < 0)
        ilow = 0;
    else if (ilow > Py_SIZE(a))
        ilow = Py_SIZE(a);
    if (ihigh < ilow)
        ihigh = ilow;
    else if (ihigh > Py_SIZE(a))
        ihigh = Py_SIZE(a);
    len = ihigh - ilow;  // In the example, len = 4 - 2 = 2
    np = (PyListObject *) PyList_New(len); // Create a new list of length len
    if (np == NULL)
        return NULL;

    src = a->ob_item + ilow;  // src now points at the slot of a->ob_item that holds the pointer to element 3; it serves as the start address for the copy below
    dest = np->ob_item;
    for (i = 0; i < len; i++) {
        PyObject *v = src[i];
        Py_INCREF(v);
        dest[i] = v;
    }
    // After the loop ends, the pointers to elements 3 and 4 are stored in np->ob_item
    return (PyObject *)np;
}

Through the analysis of the above example, we can see that the slice actually creates a new list.
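
As a rough check of this reading, the bounds handling and the pointer copy can be modelled in Python. list_slice_model is a hypothetical name, and the model covers only non-negative bounds, since at the Python level negative slice indices are adjusted before a function like list_slice sees them.

```python
def list_slice_model(a, ilow, ihigh):
    """Hypothetical Python model of list_slice for non-negative bounds."""
    size = len(a)
    # Clamp both bounds, as the two branch structures in the C code do.
    if ilow < 0:
        ilow = 0
    elif ilow > size:
        ilow = size
    if ihigh < ilow:
        ihigh = ilow
    elif ihigh > size:
        ihigh = size

    length = ihigh - ilow
    new_list = [None] * length          # stand-in for PyList_New(len)
    for i in range(length):
        new_list[i] = a[ilow + i]       # copies the reference, not the object
    return new_list


a = [1, 2, 3, 4, 5, 6]
b = list_slice_model(a, 2, 4)           # same result as a[2:4]
```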

At this point I got a little dizzy, so as a reminder to myself, let me work through an example. Take the slice b = a[2:4] from above and then assign a new value to an element of a, say a[2].

The ob_item of the original list and that of the slice each point to an array of pointers, and both arrays hold pointers to the same elements. So why does a change while b does not?

In fact, my thinking had gone down a one-way track. I stopped and thought about how PyList_SetItem assigns an element, and suddenly realized that assigning to a list element does not modify the object the old pointer refers to. The new value exists first, and then the slot in a's pointer array is changed to point at it. The pointer array of b is a separate array, so its slot still points at the old object. Sketching the two pointer arrays as a diagram makes this easy to see.

Now we can clearly see why the original list's elements change while the slice's elements remain unchanged.
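
A short Python sketch makes the distinction concrete. The variable names are illustrative. The second half shows the related case that this reasoning predicts: if an element is a mutable object and is modified in place rather than reassigned, both lists see the change, because both pointer arrays still point at that same object.

```python
# Reassigning a slot in the original list does not touch the slice.
a = [1, 2, 3, 4, 5, 6]
b = a[2:4]
a[2] = 99            # a's slot now points at a different object
# b's own pointer array still points at the old objects 3 and 4.

# Mutating a shared object in place is visible through both lists.
outer = [[0], [1], [2]]
part = outer[0:2]
outer[0].append(42)  # the inner list object itself is modified
shared = part[0] is outer[0]
```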

5. Reversing a list

Finally, let's look at the source code for reversing a list. It is very simple: the last element is swapped with the first, the second-to-last with the second, and so on, one pair at a time.

static void
reverse_slice(PyObject **lo, PyObject **hi)
{
    assert(lo && hi);

    --hi;
    while (lo < hi) {
        PyObject *t = *lo;
        *lo = *hi;
        *hi = t;
        ++lo;
        --hi;
    }
}

int
PyList_Reverse(PyObject *v)
{
    PyListObject *self = (PyListObject *)v;

    if (v == NULL || !PyList_Check(v)) {
        PyErr_BadInternalCall();
        return -1;
    }
    if (Py_SIZE(self) > 1)
        reverse_slice(self->ob_item, self->ob_item + Py_SIZE(self));
    return 0;
}
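
The two-pointer swap in reverse_slice translates almost line for line into Python if indices are used in place of C pointers. reverse_slice_model is a hypothetical name for this sketch. As in the C code, hi starts one past the last element and is decremented before the first swap.

```python
def reverse_slice_model(items, lo, hi):
    """Hypothetical model of reverse_slice: reverses items[lo:hi] in place."""
    hi -= 1                                     # step back onto the last element
    while lo < hi:
        items[lo], items[hi] = items[hi], items[lo]
        lo += 1
        hi -= 1


values = [1, 2, 3, 4, 5]
reverse_slice_model(values, 0, len(values))     # mirrors PyList_Reverse
```

With an odd number of elements the middle one is never touched, because the loop stops as soon as lo and hi meet.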

That's all for today.
