Data structure basics from the perspective of Python
Array
Concept
An array is a finite, ordered collection of variables of the same type. Because its elements are stored sequentially in memory, an array can implement a logical sequence table.
Python mainly uses lists and tuples, both of which are essentially wrappers around arrays.
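As a small illustration of the two types mentioned above, the sketch below shows the practical difference: a list can be modified in place, while a tuple cannot. The variable names are only illustrative.

```python
# A list is mutable: elements can be replaced in place.
numbers = [3, 1, 2]
numbers[0] = 7

# A tuple is immutable: item assignment raises TypeError.
point = (3, 1, 2)
try:
    point[0] = 7
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

print(numbers)           # [7, 1, 2]
print(point[1])          # 1
print(tuple_is_mutable)  # False
```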
Basic operations

    # Initialize a list
    my_list = [3, 1, 2, 5, 4, 9, 7, 2]

    # Read an element
    print(my_list[2])    # 2

    # Update an element
    my_list[3] = 10
    print(my_list[3])    # 10

    # Insert an element at the tail
    my_list.append(6)

    # Insert an element in the middle
    my_list.insert(5, 11)
    print(my_list)       # [3, 1, 2, 10, 4, 11, 9, 7, 2, 6]

To understand what insertion does to the underlying array, we can implement it ourselves:

    class MyArray:
        def __init__(self, capacity):
            self.array = [None] * capacity
            self.size = 0

        def insert(self, index, element):
            # Check whether the index is out of range
            if index < 0 or index > self.size:
                raise IndexError("Index is outside the range of stored elements!")
            # Loop from right to left, moving every element at or after
            # index one position to the right
            for i in range(self.size - 1, index - 1, -1):
                self.array[i + 1] = self.array[i]
            # Place the new element in the vacated position
            self.array[index] = element
            self.size += 1

        def output(self):
            for i in range(self.size):
                print(self.array[i])

The output is as follows:

    array = MyArray(4)
    array.insert(0, 10)
    array.insert(0, 11)
    array.insert(0, 15)
    array.output()
    # 15
    # 11
    # 10

However, with this implementation, once the number of elements reaches the capacity of the array, the next insertion fails with an out-of-range error. The array therefore needs to be expanded when it is full:

    class MyArray:
        def __init__(self, capacity):
            self.array = [None] * capacity
            self.size = 0

        def insert_v2(self, index, element):
            # Check whether the index is out of range
            if index < 0 or index > self.size:
                raise IndexError("Index is outside the range of stored elements!")
            # If the number of elements has reached the capacity, expand the array
            if self.size >= len(self.array):
                self.resize()
            # Loop from right to left, moving every element at or after
            # index one position to the right
            for i in range(self.size - 1, index - 1, -1):
                self.array[i + 1] = self.array[i]
            # Place the new element in the vacated position
            self.array[index] = element
            self.size += 1

        def resize(self):
            # Allocate a new array with twice the capacity
            array_new = [None] * len(self.array) * 2
            # Copy from the old array to the new array
            for i in range(self.size):
                array_new[i] = self.array[i]
            self.array = array_new

        def output(self):
            for i in range(self.size):
                print(self.array[i])

    array = MyArray(4)
    array.insert_v2(0, 10)
    array.insert_v2(0, 11)
    array.insert_v2(0, 12)
    array.insert_v2(0, 14)
    array.insert_v2(0, 15)
    array.insert_v2(0, 16)
    array.output()
    # 16
    # 15
    # 14
    # 12
    # 11
    # 10

Deletion can be implemented with the following method of MyArray:

    def remove(self, index):
        # Check whether the index is out of range
        if index < 0 or index >= self.size:
            raise IndexError("Index is outside the range of stored elements!")
        # From left to right, move each element after index one position to the left
        for i in range(index, self.size - 1):
            self.array[i] = self.array[i + 1]
        self.size -= 1

The time complexity of this operation is O(n). If the order of the elements does not need to be preserved, we can instead copy the last element into the position being deleted and then drop the last element; no other elements have to move, and the time complexity becomes O(1).
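The constant-time deletion described above can be sketched on a plain Python list. The helper name `remove_unordered` is hypothetical, and the trick applies only when element order is not significant.

```python
def remove_unordered(items, index):
    """Remove items[index] in O(1) by overwriting it with the last element.

    Hypothetical helper for illustration; it does NOT preserve element order.
    """
    if index < 0 or index >= len(items):
        raise IndexError("index out of range")
    removed = items[index]
    items[index] = items[-1]  # move the last element into the gap
    items.pop()               # dropping the last slot is O(1)
    return removed


values = [3, 1, 2, 5, 4]
removed = remove_unordered(values, 1)
print(removed)  # 1
print(values)   # [3, 4, 2, 5]
```

Note how 4 jumped ahead of 2 and 5: that reordering is the price of avoiding the shift.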
Arrays are suitable for scenarios with many read operations and few write operations.
Linked list
Concept
A linked list is a data structure whose storage is physically non-contiguous and non-sequential. It is composed of a number of nodes.
- Singly linked list: each node consists of a data variable that stores the value and a next pointer to the following node
- Doubly linked list: each node additionally has a prev pointer to the preceding node
The nodes of a linked list are scattered across memory and are connected only through their pointers.
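The doubly linked variant in the list above can be sketched as follows. The class name `DoublyNode` and the helper `link` are illustrative assumptions, not part of any library; the point is only that the prev pointer allows traversal in both directions.

```python
class DoublyNode:
    """Node of a doubly linked list (illustrative sketch)."""

    def __init__(self, data):
        self.data = data
        self.prev = None  # pointer to the preceding node
        self.next = None  # pointer to the following node


def link(left, right):
    """Connect two nodes so that right follows left."""
    left.next = right
    right.prev = left


a, b, c = DoublyNode(1), DoublyNode(2), DoublyNode(3)
link(a, b)
link(b, c)

# Walk forward from the head using next
forward = []
node = a
while node is not None:
    forward.append(node.data)
    node = node.next

# Walk backward from the tail using prev
backward = []
node = c
while node is not None:
    backward.append(node.data)
    node = node.prev

print(forward)   # [1, 2, 3]
print(backward)  # [3, 2, 1]
```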
Basic operations

    class Node:
        def __init__(self, data):
            self.data = data
            self.next = None


    class LinkedList:
        def __init__(self):
            self.size = 0
            self.head = None
            self.last = None

        def get(self, index):
            if index < 0 or index >= self.size:
                raise IndexError("Out of range of linked list nodes!")
            p = self.head
            for i in range(index):
                p = p.next
            return p

        def insert(self, data, index):
            if index < 0 or index > self.size:
                raise IndexError("Out of range of linked list nodes!")
            node = Node(data)
            if self.size == 0:
                # Empty linked list
                self.head = node
                self.last = node
            elif index == 0:
                # Insert at the head
                node.next = self.head
                self.head = node
            elif self.size == index:
                # Insert at the tail
                self.last.next = node
                self.last = node
            else:
                # Insert in the middle
                prev_node = self.get(index - 1)
                node.next = prev_node.next
                prev_node.next = node
            self.size += 1

        def remove(self, index):
            if index < 0 or index >= self.size:
                raise IndexError("Out of range of linked list nodes!")
            # Keep a reference to the deleted node so it can be returned
            if index == 0:
                # Delete the head node
                removed_node = self.head
                self.head = self.head.next
                if self.head is None:
                    # The list is now empty, so there is no tail either
                    self.last = None
            elif index == self.size - 1:
                # Delete the tail node
                prev_node = self.get(index - 1)
                removed_node = prev_node.next
                prev_node.next = None
                self.last = prev_node
            else:
                # Delete an intermediate node
                prev_node = self.get(index - 1)
                next_node = prev_node.next.next
                removed_node = prev_node.next
                prev_node.next = next_node
            self.size -= 1
            return removed_node

        def output(self):
            p = self.head
            while p is not None:
                print(p.data)
                p = p.next

    linkedList = LinkedList()
    linkedList.insert(3, 0)
    linkedList.insert(4, 0)
    linkedList.insert(5, 0)
    linkedList.insert(9, 2)
    linkedList.insert(5, 3)
    linkedList.insert(6, 1)
    linkedList.output()
    # 5
    # 6
    # 4
    # 9
    # 5
    # 3
    linkedList.remove(0)
    linkedList.output()
    # 6
    # 4
    # 9
    # 5
    # 3

- Performance comparison between array and linked list
| | Lookup | Update | Insert | Delete |
|---|---|---|---|---|
| Array | O(1) | O(1) | O(n) | O(n) |
| Linked list | O(n) | O(1) | O(1) | O(1) |
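If elements need to be inserted and deleted frequently, a linked list is more suitable. Note that the O(1) figures for linked-list update, insertion, and deletion count only the pointer manipulation itself; reaching the target node by index first still costs O(n).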
Stack and queue
Arrays and linked lists are physical (storage) structures; logical structures are built on top of physical structures.
Logical structures are divided into linear structures and nonlinear structures:
Linear structures: sequence tables, stacks, queues, etc.
Nonlinear structures: trees, graphs, etc.
Stack
A stack is a linear data structure whose elements follow first in, last out (FILO) order. The first element to enter sits at the bottom of the stack, and the last element to enter sits at the top.
The basic operations of a stack are push and pop.
The Python list already provides the functionality of a stack: append pushes an element onto the stack, and pop removes the element at the top.
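A minimal sketch of a list used as a stack, relying only on the built-in `append` and `pop` methods:

```python
stack = []

# Push three elements; 1 ends up at the bottom, 3 at the top
stack.append(1)
stack.append(2)
stack.append(3)

# Pop returns the most recently pushed element first (FILO)
top = stack.pop()
print(top)        # 3
print(stack)      # [1, 2]
print(stack[-1])  # 2, a peek at the new top without removing it
```

Both operations work at the end of the list, so no other elements have to be shifted.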
Queue
A queue is a linear data structure whose elements follow first in, first out (FIFO) order. The exit end is the head (front) of the queue, and the entry end is the tail (rear).
When a queue is implemented with an array, it is convenient to define the rear as the position immediately after the last element.
The basic operations of a queue are enqueue and dequeue.
An array-based queue can keep its capacity constant by treating the array as circular. The queue is full when (rear index + 1) % array length == front index. Because the slot pointed to by the rear index is always left empty, the maximum capacity of the queue is one less than the length of the array.
Python provides a variety of queue tools, such as collections.deque and queue.Queue.
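For reference, a brief sketch of the two standard-library tools named above used as FIFO queues. The `maxsize=3` value is an arbitrary choice for the example.

```python
from collections import deque
from queue import Queue

# collections.deque: append at the rear, popleft from the front
dq = deque()
dq.append(3)
dq.append(5)
dq.append(6)
first = dq.popleft()
print(first)     # 3
print(list(dq))  # [5, 6]

# queue.Queue: a thread-safe FIFO queue with an optional maximum size
q = Queue(maxsize=3)
q.put(3)
q.put(5)
q_first = q.get()
print(q_first)    # 3
print(q.qsize())  # 1
```

`queue.Queue` is intended for passing items between threads, whereas `deque` is the lighter choice inside a single thread.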
Let us try to implement a circular queue ourselves:

    class MyQueue:
        def __init__(self, capacity):
            self.list = [None] * capacity
            self.front = 0
            self.rear = 0

        def enqueue(self, element):
            # The queue is full when the slot after rear is the front
            if (self.rear + 1) % len(self.list) == self.front:
                raise Exception("The queue is full!")
            self.list[self.rear] = element
            self.rear = (self.rear + 1) % len(self.list)

        def dequeue(self):
            # The queue is empty when rear and front coincide
            if self.rear == self.front:
                raise Exception("The queue is empty!")
            dequeue_element = self.list[self.front]
            self.front = (self.front + 1) % len(self.list)
            return dequeue_element

        def output(self):
            i = self.front
            while i != self.rear:
                print(self.list[i])
                i = (i + 1) % len(self.list)

    myqueue = MyQueue(6)
    myqueue.enqueue(3)
    myqueue.enqueue(5)
    myqueue.enqueue(6)
    myqueue.dequeue()
    myqueue.dequeue()
    myqueue.output()
    # 6

Application of stack and queue
A stack can be used to implement recursive logic (function calls are themselves tracked on a call stack), as well as breadcrumb navigation, where users trace back to previous pages while browsing.
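The back-navigation use case can be sketched with a list acting as the stack. The function names `visit` and `go_back` and the page names are hypothetical, chosen only for this illustration.

```python
history = []  # stack of visited pages; the top is the current page


def visit(page):
    """Open a page by pushing it onto the history stack."""
    history.append(page)


def go_back():
    """Leave the current page and return the page that is now on top."""
    if len(history) < 2:
        raise IndexError("no previous page")
    history.pop()  # pop the current page
    return history[-1]


visit("Home")
visit("Products")
visit("Laptops")
previous = go_back()
print(previous)  # Products
print(history)   # ['Home', 'Products']
```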
Queues can be used as the waiting queue for threads competing for a fair lock in multithreaded programs, and web crawlers can store the URLs to be crawled in a queue.
A double-ended queue (deque) combines the advantages of the stack and the queue: elements can be enqueued and dequeued at the head, and also at the tail.
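A short sketch of a double-ended queue using `collections.deque` from the standard library, which supports insertion and removal at both ends:

```python
from collections import deque

d = deque([2, 3])
d.appendleft(1)  # enqueue at the head -> 1, 2, 3
d.append(4)      # enqueue at the tail -> 1, 2, 3, 4

head = d.popleft()  # dequeue from the head
tail = d.pop()      # dequeue from the tail

print(head)     # 1
print(tail)     # 4
print(list(d))  # [2, 3]
```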
A priority queue dequeues whichever element has the highest priority first. Unlike the structures above, it is not a simple linear structure; it is implemented on top of a binary heap.
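As a sketch of this behavior, the standard-library `heapq` module maintains a binary min-heap inside a list. The example assumes that a smaller number means a higher priority; the task names are made up.

```python
import heapq

tasks = []  # heapq keeps this list arranged as a binary min-heap

# Entries are (priority, name) tuples; tuples compare by priority first
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "refill coffee"))

# Elements come out in priority order, not in insertion order
order = [heapq.heappop(tasks)[1] for _ in range(3)]
print(order)  # ['fix outage', 'write report', 'refill coffee']
```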
Hash table
A hash table (also called a hash map) is a data structure that provides a mapping between keys and values.
In Python, the collection that corresponds to the hash table is the dictionary (dict). A hash function converts a key into an array subscript.
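One simple way to turn a key into an array subscript is to take the key's hash modulo the array length. The sketch below assumes that scheme and uses a hypothetical helper `index_for`; the built-in dict does its own, more elaborate version of this internally.

```python
def index_for(key, capacity):
    """Hypothetical helper: map a hashable key to a slot in range(capacity)."""
    return hash(key) % capacity


capacity = 8
for key in ("apple", 42, (1, 2)):
    slot = index_for(key, capacity)
    assert 0 <= slot < capacity
    # Slots for strings vary between runs because str hashes are randomized
    print(key, "->", slot)

# The built-in dict exposes only the key-to-value mapping
prices = {"apple": 3}
prices["pear"] = 5
print(prices["apple"])  # 3
```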
Read and write operations of a hash table
The write operation inserts a new key-value pair (Entry) into the hash table.
When a hash collision occurs during writing, it can be resolved with the open addressing method or the linked list method.
- Open addressing method: if the computed array index is already occupied, probe the following positions until a free one is found.
- Linked list method: each element of the hash table array is also the head node of a linked list, so a colliding entry only needs to be inserted into the corresponding linked list.
The read operation is to find the corresponding value in the hash table through the given key.
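The write and read operations under the linked list method can be sketched as follows. This is a simplified illustration under stated assumptions: the class name `ChainedHashTable` is hypothetical, each chain is a Python list of (key, value) pairs standing in for a linked list, and the table never resizes.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (illustrative sketch)."""

    def __init__(self, capacity=8):
        # One chain per slot; a Python list stands in for the linked list
        self.buckets = [[] for _ in range(capacity)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))       # new key: add to the chain

    def get(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = ChainedHashTable(capacity=2)
table.put(1, "a")
table.put(3, "b")  # 1 and 3 both map to slot 1, so they share a chain
table.put(1, "c")  # overwrites the value stored for key 1
print(table.get(1))  # c
print(table.get(3))  # b
```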
Different languages make different choices: HashMap in Java uses the linked list method, while dict in Python uses the open addressing method.
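For comparison, open addressing can be sketched with plain linear probing. This is a teaching sketch only: the class name `LinearProbingTable` is hypothetical, deletion and resizing are omitted, and the probe sequence is deliberately simpler than the one Python's dict actually uses.

```python
class LinearProbingTable:
    """Open-addressing hash table with linear probing (illustrative sketch)."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity  # each slot is None or a (key, value) pair

    def _probe(self, key):
        """Yield slot indices starting at the key's home slot, wrapping around."""
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def put(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)  # free slot or same key: store here
                return
        raise RuntimeError("table is full")

    def get(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                break  # an empty slot ends the probe: the key is absent
            if self.slots[i][0] == key:
                return self.slots[i][1]
        raise KeyError(key)


table = LinearProbingTable(capacity=4)
table.put(1, "a")
table.put(5, "b")  # 5 % 4 == 1 collides with key 1, so it lands in slot 2
print(table.get(5))  # b
print(table.slots)   # [None, (1, 'a'), (5, 'b'), None]
```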