The basics of data structures from the perspective of Python

Posted by nonexistentera on Thu, 20 Jan 2022 01:18:32 +0100

Array

Concept

An array is a finite, ordered collection of variables of the same type. Its elements are stored sequentially in memory, which makes it a natural physical implementation of a logical sequence table.

In Python, the role of the array is mainly played by lists and tuples, both of which are essentially wrappers around an underlying array.

Basic operations

#Initialization list
my_list = [3,1,2,5,4,9,7,2]
#Read element
print(my_list[2])
2
#Update element
my_list[3] = 10
print(my_list[3])
10
#Insert element
#Insert at the tail
my_list.append(6)
#Middle insert element
my_list.insert(5,11)
print(my_list)
[3, 1, 2, 10, 4, 11, 9, 7, 2, 6]

To understand what insertion involves at the array level, we can implement it ourselves:

class MyArray:
    def __init__(self,capacity):
        self.array = [None] * capacity
        self.size = 0
    def insert(self,index,element):
        #Determine whether the access subscript is out of range
        if index < 0 or index > self.size:
            raise Exception("Out of range of actual elements of array!")
        #Loop from right to left, moving the elements at index and beyond one position to the right
        for i in range(self.size - 1,index - 1,-1):
            self.array[i + 1] = self.array[i]
        #Place the new element in the vacated position
        self.array[index] = element
        self.size += 1
    def output(self):
        for i in range(self.size):
            print(self.array[i])

The output results are as follows:

array = MyArray(4)
array.insert(0,10)
array.insert(0,11)
array.insert(0,15)
array.output()
15
11
10

However, with this implementation, once the number of elements reaches the capacity of the array, a further insertion writes past the end of the underlying list and fails with an IndexError. To handle this, the array has to be expanded when it is full.

class MyArray:
    def __init__(self,capacity):
        self.array = [None] * capacity
        self.size = 0
    def insert_v2(self,index,element):
        #Determine whether the access subscript is out of range
        if index < 0 or index > self.size:
            raise Exception("Out of range of actual elements of array!")
        #If the number of elements has reached the array capacity, expand the array
        if self.size >= len(self.array):
            self.resize()
        #Loop from right to left, moving the elements at index and beyond one position to the right
        for i in range(self.size - 1,index - 1,-1):
            self.array[i + 1] = self.array[i]
        #Place the new element in the vacated position
        self.array[index] = element
        self.size += 1
    def resize(self):
        array_new = [None] * len(self.array) * 2
        #Copy from old array to new array
        for i in range(self.size):
            array_new[i] = self.array[i]
        self.array = array_new
    def output(self):
        for i in range(self.size):
            print(self.array[i])
array = MyArray(4)
array.insert_v2(0,10)
array.insert_v2(0,11)
array.insert_v2(0,12)
array.insert_v2(0,14)
array.insert_v2(0,15)
array.insert_v2(0,16)
array.output()
16
15
14
12
11
10

Deletion can be implemented in a similar way, by adding the following method to MyArray:

    def remove(self,index):
        #Determine whether the access subscript is out of range
        if index < 0 or index >= self.size:
            raise Exception("Array actual element range exceeded!")
        #From left to right, move each element after index one position to the left
        for i in range(index,self.size - 1):
            self.array[i] = self.array[i + 1]
        #Clear the slot that is no longer in use
        self.array[self.size - 1] = None
        self.size -= 1

The time complexity of this operation is O(n). However, if the order of the elements does not need to be preserved, we can copy the last element into the position of the element to be deleted and then delete the last element. No other elements have to move, so the time complexity becomes O(1).
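The constant-time deletion described above can be sketched as follows. This is only a minimal illustration on a plain Python list; `swap_remove` is a hypothetical helper name and is not part of the MyArray class. Note that the order of the remaining elements is not preserved.

```python
def swap_remove(items, index):
    """Remove items[index] in O(1) by overwriting it with the last element.

    Hypothetical helper for illustration; element order is NOT preserved.
    """
    if index < 0 or index >= len(items):
        raise IndexError("index out of range")
    removed = items[index]
    # Move the last element into the hole, then drop the tail slot in O(1).
    items[index] = items[-1]
    items.pop()
    return removed


data = [3, 1, 2, 5, 4]
removed = swap_remove(data, 1)
print(removed, data)  # 1 [3, 4, 2, 5]
```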

Arrays are suitable for scenarios with many read operations and few write operations.

Linked list

Concept

A linked list is a data structure that is physically non-contiguous and non-sequential in memory. It is composed of a number of nodes.

  • Singly linked list: each node consists of a variable data that stores the data and a pointer next that points to the next node
  • Doubly linked list: each node also has a prev pointer to the preceding node

The nodes of a linked list are scattered across memory rather than stored contiguously.
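To make the difference between the two node types concrete, here is a minimal sketch of a doubly linked node. The names `DNode` and `link` are assumptions made for this illustration only.

```python
class DNode:
    """Node of a doubly linked list (hypothetical name for illustration)."""

    def __init__(self, data):
        self.data = data
        self.prev = None  # pointer to the preceding node
        self.next = None  # pointer to the following node


def link(a, b):
    """Connect node a to node b in both directions."""
    a.next = b
    b.prev = a


first, second, third = DNode(1), DNode(2), DNode(3)
link(first, second)
link(second, third)

# Walk forward from the head, then backward from the tail.
forward = []
p = first
while p is not None:
    forward.append(p.data)
    p = p.next

backward = []
p = third
while p is not None:
    backward.append(p.data)
    p = p.prev

print(forward)   # [1, 2, 3]
print(backward)  # [3, 2, 1]
```

The extra prev pointer costs one more reference per node, but it lets the list be traversed from the tail as well as from the head.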

Basic operations

class Node:
    def __init__(self,data):
        self.data = data
        self.next = None

class LinkedList:
    def __init__(self):
        self.size = 0
        self.head = None
        self.last = None
    def get(self,index):
        if index < 0 or index >= self.size:
            raise Exception("Out of range of linked list nodes!")
        p = self.head
        for i in range(index):
            p = p.next
        return p
    def insert(self,data,index):
        if index < 0 or index > self.size:
            raise Exception("Out of range of linked list nodes!")
        node = Node(data)
        if self.size == 0:
            #Empty linked list
            self.head = node
            self.last = node
        elif index == 0:
            #Insert head
            node.next = self.head
            self.head = node
        elif self.size == index:
            #Insert tail
            self.last.next = node
            self.last = node
        else:
            #Insert middle
            prev_node = self.get(index - 1)
            node.next = prev_node.next
            prev_node.next = node
        self.size += 1
    def remove(self,index):
        if index < 0 or index >= self.size:
            raise Exception("Out of range of linked list nodes!")
        #Keep a reference to the deleted node so it can be returned
        if index == 0:
            #Delete head node
            removed_node = self.head
            self.head = self.head.next
            if self.head is None:
                #The list is now empty
                self.last = None
        elif index == self.size - 1:
            #Delete tail node
            prev_node = self.get(index - 1)
            removed_node = prev_node.next
            prev_node.next = None
            self.last = prev_node
        else:
            #Delete intermediate node
            prev_node = self.get(index - 1)
            next_node = prev_node.next.next
            removed_node = prev_node.next
            prev_node.next = next_node
        #Executed for every branch, not only for intermediate nodes
        self.size -= 1
        return removed_node
    def output(self):
        p = self.head
        while p is not None:
            print(p.data)
            p = p.next
linkedList = LinkedList()
linkedList.insert(3,0)
linkedList.insert(4,0)
linkedList.insert(5,0)
linkedList.insert(9,2)
linkedList.insert(5,3)
linkedList.insert(6,1)
linkedList.output()
5
6
4
9
5
3
linkedList.remove(0)
linkedList.output()
6
4
9
5
3
  • Performance comparison between array and linked list

                lookup    update    insert    delete
  Array         O(1)      O(1)      O(n)      O(n)
  Linked list   O(n)      O(1)      O(1)      O(1)

For the linked list, the O(1) figures for update, insert and delete count only the operation itself. Locating the node first still takes O(n).

If you need to insert and delete elements frequently, linked lists are more suitable.

Stack and queue

Arrays and linked lists are physical (storage) structures. Logical structures are built on top of these physical structures.

Logical structures are divided into linear and nonlinear structures:

Linear structures: sequence table, stack, queue, etc.

Nonlinear structures: tree, graph, etc.

Stack

A stack is a linear data structure whose elements follow the first in, last out (FILO) rule. The first element pushed sits at the bottom of the stack, and the last element pushed sits at the top.

The basic operations of a stack are push and pop.

Python's list already implements the functionality of a stack well: append is equivalent to push, and the list's pop method is equivalent to pop.
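A short sketch of a list used as a stack, relying only on the built-in `append` and `pop` methods. The variable names are chosen for this example.

```python
stack = []

# Push three elements; the last one pushed ends up on top.
stack.append(1)
stack.append(2)
stack.append(3)

top = stack[-1]       # peek at the top without removing it
popped = stack.pop()  # pop removes and returns the top element

print(top, popped, stack)  # 3 3 [1, 2]
```

Both `append` and `pop` work at the end of the list, so neither needs to shift other elements.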

Queue

A queue is a linear data structure whose elements follow the first in, first out (FIFO) rule. The exit end of the queue is the head, and the entry end is the tail.

When a queue is implemented with an array, it is convenient to define the tail position as the slot just after the last element in the queue.

The basic operations of a queue are enqueue and dequeue.

A queue implemented with an array can keep its capacity constant by being treated as a circular queue. The queue is full when (tail index + 1) % array length == head index. Since the slot that the tail pointer points to is always left empty, the maximum number of elements is one less than the length of the array.

Python provides several queue tools, such as collections.deque and queue.Queue.
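As a brief illustration of these two standard-library tools, the sketch below uses each as a FIFO queue. The values and the `maxsize=2` bound are arbitrary choices for this example.

```python
from collections import deque
from queue import Queue

# collections.deque as a FIFO queue: enqueue at the right, dequeue at the left.
dq = deque()
dq.append(3)
dq.append(5)
dq.append(6)
first_out = dq.popleft()
print(first_out, list(dq))  # 3 [5, 6]

# queue.Queue is a thread-safe FIFO queue with an optional capacity limit.
q = Queue(maxsize=2)
q.put("a")
q.put("b")
is_full = q.full()
item = q.get()
print(is_full, item)  # True a
```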

We try to implement a queue ourselves:

class MyQueue:
    def __init__(self,capacity):
        self.list = [None] * capacity
        self.front = 0
        self.rear = 0

    def enqueue(self,element):
        if (self.rear + 1) % len(self.list) == self.front:
            raise Exception("The queue is full!")
        self.list[self.rear] = element
        self.rear = (self.rear + 1) % len(self.list)

    def dequeue(self):
        if self.rear == self.front:
            raise Exception("The queue is empty!")
        dequeue_element = self.list[self.front]
        self.front = (self.front + 1) % len(self.list)
        return dequeue_element

    def output(self):
        i = self.front
        while i != self.rear:
            print(self.list[i])
            i = (i + 1) % len(self.list)
myqueue = MyQueue(6)
myqueue.enqueue(3)
myqueue.enqueue(5)
myqueue.enqueue(6)
myqueue.dequeue()
myqueue.dequeue()
myqueue.output()
6

Application of stack and queue

A stack can be used to implement recursive logic, as well as breadcrumb-style navigation (letting users trace back to the previous page while browsing).
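One way to see the link between stacks and recursion is to replace a recursive traversal with an explicit stack. The sketch below sums a nested list this way; `nested_sum` is a hypothetical function written for this illustration.

```python
def nested_sum(values):
    """Sum all numbers in an arbitrarily nested list without recursion."""
    total = 0
    stack = [values]  # items still waiting to be processed
    while stack:
        current = stack.pop()
        if isinstance(current, list):
            # Instead of a recursive call, push the children onto the stack.
            stack.extend(current)
        else:
            total += current
    return total


result = nested_sum([1, [2, 3], [4, [5]]])
print(result)  # 15
```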

Queues can be used as the waiting queue for threads competing for a fair lock in multithreaded programs, and web crawlers can store the URLs to be crawled in a queue.
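The crawler use case can be sketched with a queue of pending URLs. No network access is involved here: `LINKS` is a made-up in-memory link graph standing in for real pages, and `crawl_order` is a hypothetical function name.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/b": ["/c", "/"],
    "/c": [],
}


def crawl_order(start):
    """Return the pages in the order a queue-based crawler would visit them."""
    seen = {start}
    pending = deque([start])  # URLs waiting to be visited, first in first out
    order = []
    while pending:
        url = pending.popleft()
        order.append(url)
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return order


visited = crawl_order("/")
print(visited)  # ['/', '/a', '/b', '/c']
```

Because the queue is first in, first out, pages closer to the start page are visited before pages further away.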

A double-ended queue (deque) combines the advantages of the stack and the queue: elements can be enqueued and dequeued at the head of the queue as well as at the tail.
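Python's `collections.deque` behaves this way. A minimal sketch with arbitrary values:

```python
from collections import deque

d = deque([2, 3])
d.appendleft(1)  # enqueue at the head -> 1, 2, 3
d.append(4)      # enqueue at the tail -> 1, 2, 3, 4

head = d.popleft()  # dequeue from the head
tail = d.pop()      # dequeue from the tail

print(head, tail, list(d))  # 1 4 [2, 3]
```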

A priority queue follows the rule that whoever has the highest priority leaves first. It is typically implemented on top of a binary heap.
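Python's standard `heapq` module provides a binary heap on top of a plain list, which is enough for a small priority queue sketch. The task names are invented for this example, and it assumes that a smaller number means a higher priority, since `heapq` is a min-heap.

```python
import heapq

tasks = []  # the heap is stored in an ordinary list

# Each entry is (priority, name); a smaller number means higher priority here.
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "refactor"))

# Entries leave in priority order, regardless of insertion order.
order = [heapq.heappop(tasks)[1] for _ in range(3)]
print(order)  # ['fix outage', 'write report', 'refactor']
```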

Hash table

A hash table, also called a hash map, is a data structure that provides a mapping between keys and values.

In Python, the collection that corresponds to the hash table is the dictionary (dict). A hash function converts a key into an array subscript.
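A simplified sketch of the key-to-subscript conversion is shown below. It takes the built-in `hash` modulo the array length, which is an assumption for illustration and not how dict computes its slots internally. `index_for` is a hypothetical helper name.

```python
def index_for(key, capacity):
    """Map a key to a subscript in an array of the given capacity."""
    return hash(key) % capacity


# In CPython, small non-negative integers hash to themselves.
i1 = index_for(42, 8)
i2 = index_for(10, 8)
print(i1, i2)  # 2 2

# String hashes are randomized per process, so only the range is predictable.
i3 = index_for("apple", 8)
print(0 <= i3 < 8)  # True
```

Keys 42 and 10 land on the same subscript, which is exactly the kind of hash collision a hash table has to resolve.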

Read / write operations of a hash table

A write operation inserts a new key-value pair (Entry) into the hash table.

When a hash collision occurs during a write, it can be resolved by the open addressing method or the linked list method.

  • Open addressing method: if the computed array index is already occupied, probe onward for the next free position.
  • Linked list method: each element of the hash table array is the head node of a linked list, so a colliding entry only needs to be inserted into the corresponding linked list.
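The open addressing method can be sketched with the simplest probing strategy, linear probing. This is a toy illustration under stated assumptions: `ProbingTable` is a hypothetical class, it has a fixed capacity, and it supports neither deletion nor resizing. It is not how Python's dict is implemented.

```python
class ProbingTable:
    """Toy open-addressing hash table with linear probing (illustrative only)."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity  # each slot holds (key, value) or None

    def _probe(self, key):
        """Return the slot holding key, or the first free slot on its probe path.

        Returns None if the table is full and the key is absent.
        """
        capacity = len(self.slots)
        start = hash(key) % capacity
        for step in range(capacity):
            i = (start + step) % capacity
            if self.slots[i] is None or self.slots[i][0] == key:
                return i
        return None

    def put(self, key, value):
        i = self._probe(key)
        if i is None:
            raise RuntimeError("table is full")
        self.slots[i] = (key, value)

    def get(self, key):
        i = self._probe(key)
        if i is None or self.slots[i] is None:
            raise KeyError(key)
        return self.slots[i][1]


table = ProbingTable(capacity=8)
table.put(2, "a")
table.put(10, "b")  # 10 % 8 == 2 collides with key 2, so it lands in slot 3
print(table.get(2), table.get(10))     # a b
print(table.slots[2], table.slots[3])  # (2, 'a') (10, 'b')
```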

The read operation is to find the corresponding value in the hash table through the given key.
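Reading and writing under the linked list method can be sketched as follows. For brevity each bucket is a Python list of (key, value) pairs rather than a hand-built linked list, which is a simplification made for this example. `ChainedTable` is a hypothetical class name.

```python
class ChainedTable:
    """Toy hash table that resolves collisions by chaining (illustrative only)."""

    def __init__(self, capacity=8):
        # One bucket per array slot; colliding entries share a bucket.
        self.buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        # Walk the bucket until the matching key is found.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)


chained = ChainedTable(capacity=8)
chained.put(2, "a")
chained.put(10, "b")  # same bucket as key 2, because 10 % 8 == 2
print(chained.get(2), chained.get(10))  # a b
print(chained.buckets[2])               # [(2, 'a'), (10, 'b')]
```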

Different languages choose different strategies: HashMap in Java uses the linked list method, while dict in Python uses the open addressing method.

Topics: Python Algorithm data structure