Python basic summary: 1.5. Basic data structures

Posted by rabidvibes on Thu, 27 Jan 2022 14:43:43 +0100

1. Preface

The earlier parts of this summary covered the basics of lists, tuples and strings. This part summarizes the basic data structures commonly used in Python: lists, tuples, sets and dictionaries. (The Java summary follows a similar pattern: arrays and strings first, then other data structures such as Set and Map.) It also briefly mentions the concepts of sequence and set. These concepts only need to be understood in outline; to be honest, their classification is somewhat chaotic, so for now it is enough to be familiar with the four structures above plus strings.

2. List

The list data type supports many methods. All of the methods of list objects are as follows:

list.append(x)
Add an element at the end of the list, which is equivalent to a[len(a):] = [x].

list.extend(iterable)
Extends the list by appending all the elements of the iterable. Equivalent to a[len(a):] = iterable.

list.insert(i, x)
Inserts an element at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).

list.remove(x)
Removes the first element in the list whose value equals x. Raises a ValueError if there is no such element.

list.pop([i])
Removes the element at the given position in the list and returns it. If no index is specified, a.pop() removes and returns the last element in the list. (The square brackets around i in the method signature denote that the parameter is optional, not that you should type square brackets at that position. You will see this notation frequently in the Python Library Reference.)

list.clear()
Removes all elements from the list. Equivalent to del a[:].

list.index(x[, start[, end]])
Returns the zero-based index of the first element in the list whose value equals x. Raises a ValueError if there is no such element.

The optional arguments start and end are interpreted as in slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.

list.count(x)
Returns the number of occurrences of element x in the list.

list.sort(*, key=None, reverse=False)
Sorts the elements of the list in place (the arguments can be used to customize the sort; see sorted() for their explanation).

list.reverse()
Reverses the elements of the list in place.

list.copy()
Returns a shallow copy of the list. Equivalent to a[:].
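The equivalences stated in the method descriptions can be checked directly. The following is a minimal sketch using hypothetical sample lists (a, nested, letters); it also illustrates what "shallow" means for copy() and how index() reports positions when a start argument is given.

```python
# Hypothetical sample list, used only for illustration.
a = [1, 2, 3]

# append(x) behaves like a[len(a):] = [x]
a[len(a):] = [4]
# extend(iterable) behaves like a[len(a):] = iterable
a[len(a):] = (5, 6)
print(a)                      # [1, 2, 3, 4, 5, 6]

# copy() is shallow: the outer list is new, nested objects are shared
nested = [[1, 2], [3, 4]]
clone = nested.copy()
clone.append([5, 6])          # does not affect nested
clone[0].append(99)           # affects nested[0] as well
print(nested)                 # [[1, 2, 99], [3, 4]]

# index() with a start argument still reports a position in the whole list
letters = ['a', 'b', 'c', 'b', 'a']
print(letters.index('b', 2))  # 3
```

If independent copies of the nested lists are needed, copy.deepcopy() from the standard library is one option.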

List method example:

>>> fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana']
>>> fruits.count('apple')
2
>>> fruits.count('tangerine')
0
>>> fruits.index('banana')
3
>>> fruits.index('banana', 4)  # Find next banana starting at position 4
6
>>> fruits.reverse()
>>> fruits
['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange']
>>> fruits.append('grape')
>>> fruits
['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange', 'grape']
>>> fruits.sort()
>>> fruits
['apple', 'apple', 'banana', 'banana', 'grape', 'kiwi', 'orange', 'pear']
>>> fruits.pop()
'pear'

You may have noticed that methods such as insert, remove and sort only modify the list and print no return value: they return the default None. This is a design principle for all mutable data structures in Python.
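A common consequence of this rule is that assigning the result of an in-place method gives None. The sketch below uses hypothetical variable names and contrasts list.sort() with the built-in sorted(), which returns a new list.

```python
numbers = [3, 1, 2]

result = numbers.sort()      # sorts in place and returns None
print(result)                # None
print(numbers)               # [1, 2, 3]

# sorted() leaves its argument alone and returns a new list
original = [3, 1, 2]
ordered = sorted(original)
print(original, ordered)     # [3, 1, 2] [1, 2, 3]
```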

Also, not all data can be sorted or compared. For example, [None, 'hello', 10] cannot be sorted, because integers cannot be compared with strings and None cannot be compared with other types. Some types have no ordering relation defined at all. For example, 3+4j < 5+7j is not a valid comparison.
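Both failures surface as a TypeError, as the following sketch shows. The key=str workaround at the end is an assumption added for illustration, not something the text prescribes; it orders elements by their string form.

```python
mixed = [None, 'hello', 10]

error = None
try:
    sorted(mixed)                 # needs comparisons such as None < 'hello'
except TypeError as exc:
    error = exc
    print('cannot sort:', exc)

complex_error = None
try:
    (3 + 4j) < (5 + 7j)           # complex numbers define no ordering
except TypeError as exc:
    complex_error = exc

# One possible workaround (an assumption, not from the text above):
# compare the string form of each element instead of the element itself.
by_text = sorted(mixed, key=str)
print(by_text)                    # [10, None, 'hello']
```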

2.1 Using a list as a stack

The list methods make it very easy to use a list as a stack, where the last element added is the first element retrieved ("last in, first out"). To add an element to the top of the stack, use append(). To take an element from the top of the stack, use pop() without an explicit index. For example:

>>> stack = [3, 4, 5]
>>> stack.append(6)
>>> stack.append(7)
>>> stack
[3, 4, 5, 6, 7]
>>> stack.pop()
7
>>> stack
[3, 4, 5, 6]
>>> stack.pop()
6
>>> stack.pop()
5
>>> stack
[3, 4]

2.2 Using a list as a queue

A list can also be used as a queue, where the first element added is the first element retrieved ("first in, first out"). However, lists are inefficient for this purpose: appending to and popping from the end of a list is fast, but inserting or removing an element at the beginning is slow, because all the other elements have to be shifted by one position.

To implement a queue, it is best to use collections.deque, which is designed for fast appends and pops at both ends. For example:

>>> from collections import deque
>>> queue = deque(["Eric", "John", "Michael"])
>>> queue.append("Terry")           # Terry arrives
>>> queue.append("Graham")          # Graham arrives
>>> queue.popleft()                 # The first to arrive now leaves
'Eric'
>>> queue.popleft()                 # The second to arrive now leaves
'John'
>>> queue                           # Remaining queue in order of arrival
deque(['Michael', 'Terry', 'Graham'])
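For comparison, here is a small sketch (with hypothetical variable names) that takes the first element from a plain list and from a deque. The results are the same; the difference is only in how much work each container does internally.

```python
from collections import deque

# Hypothetical sample data, for illustration only.
as_list = ['Eric', 'John', 'Michael']
as_deque = deque(as_list)

# Same first-in, first-out result from both containers...
first_from_list = as_list.pop(0)          # shifts every remaining element left
first_from_deque = as_deque.popleft()     # no shifting needed

print(first_from_list, first_from_deque)  # Eric Eric

# ...and a deque can also grow at the front cheaply
as_deque.appendleft('Terry')
print(as_deque)                           # deque(['Terry', 'John', 'Michael'])
```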

2.3 List comprehensions

A list comprehension provides a concise way to create a list. Common uses are to build a new list in which each element is the result of an operation applied to each member of another sequence or iterable, or to build a subsequence of the elements that satisfy a certain condition.

For example, create a list of square values:

>>> squares = []
>>> for x in range(10):
...     squares.append(x**2)
...
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Note that this code creates (or overwrites) a variable named x, which still exists after the loop ends. The square list can be computed without this side effect as follows:

squares = list(map(lambda x: x**2, range(10)))

Or, equivalently:

squares = [x**2 for x in range(10)]

This form is more concise and readable.
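The side effect mentioned above can be observed directly. In this minimal sketch (assuming Python 3, with hypothetical names x and n), the loop variable survives the for loop, while the variable used inside the square brackets does not leak into the surrounding scope.

```python
for x in range(3):
    pass
print(x)          # 2: the loop variable is still defined after the loop

squares = [n**2 for n in range(10)]
try:
    n             # the variable from the bracketed form is not visible here
except NameError:
    leaked = False
else:
    leaked = True
print(leaked)     # False
```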

A list comprehension consists of square brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result is a new list obtained by evaluating the expression in the context of the for and if clauses that follow it. For example, the following list comprehension combines the elements of two lists when they are not equal:

>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

Equivalent to:

>>> combs = []
>>> for x in [1,2,3]:
...     for y in [3,1,4]:
...         if x != y:
...             combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

Note that for and if appear in the same order in both pieces of code.

When the expression is a tuple (such as (x, y) in the above example), parentheses must be added:

>>> vec = [-4, -2, 0, 2, 4]
>>> # create a new list with the values doubled
>>> [x*2 for x in vec]
[-8, -4, 0, 4, 8]
>>> # filter the list to exclude negative numbers
>>> [x for x in vec if x >= 0]
[0, 2, 4]
>>> # apply a function to all the elements
>>> [abs(x) for x in vec]
[4, 2, 0, 2, 4]
>>> # call a method on each element
>>> freshfruit = ['  banana', '  loganberry ', 'passion fruit  ']
>>> [weapon.strip() for weapon in freshfruit]
['banana', 'loganberry', 'passion fruit']
>>> # create a list of 2-tuples like (number, square)
>>> [(x, x**2) for x in range(6)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
>>> # the tuple must be parenthesized, otherwise an error is raised
>>> [x, x**2 for x in range(6)]
  File "<stdin>", line 1, in <module>
    [x, x**2 for x in range(6)]
               ^
SyntaxError: invalid syntax
>>> # flatten a list using a listcomp with two 'for'
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]

List comprehensions can contain complex expressions and nested function calls:

>>> from math import pi
>>> [str(round(pi, i)) for i in range(1, 6)]
['3.1', '3.14', '3.142', '3.1416', '3.14159']

2.4 Nested list comprehensions

The initial expression in a list comprehension can be any expression, including another list comprehension.

The following 3x4 matrix consists of three lists with a length of 4:

>>> matrix = [
...     [1, 2, 3, 4],
...     [5, 6, 7, 8],
...     [9, 10, 11, 12],
... ]

The following list comprehension transposes rows and columns:

>>> [[row[i] for row in matrix] for i in range(4)]
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

As shown in the previous section, the inner list comprehension is evaluated in the context of the for that follows it, so this example is equivalent to:

>>> transposed = []
>>> for i in range(4):
...     transposed.append([row[i] for row in matrix])
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

which, in turn, is the same as:

>>> transposed = []
>>> for i in range(4):
...     # the following 3 lines implement the nested listcomp
...     transposed_row = []
...     for row in matrix:
...         transposed_row.append(row[i])
...     transposed.append(transposed_row)
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

In practice, built-in functions are preferable to complex flow statements. The zip() function does a better job in this case:

>>> list(zip(*matrix))
[(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)]

For a detailed description of the asterisk in this line, see Unpacking Argument Lists.
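As a brief sketch of what the asterisk does here: zip(*matrix) passes each row of the matrix as a separate argument, so it is the same as writing the rows out by hand. Note that zip() yields tuples; the last step, which converts them back to lists, is an optional addition for illustration.

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# zip(*matrix) passes each row as a separate positional argument
unpacked = list(zip(*matrix))
explicit = list(zip(matrix[0], matrix[1], matrix[2]))
print(unpacked == explicit)   # True

# zip() yields tuples; convert them if a list of lists is needed
transposed = [list(column) for column in zip(*matrix)]
print(transposed)             # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
```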

3. del statement

The del statement removes an element from a list by index rather than by value. Unlike the pop() method, it does not return a value. The del statement can also remove a slice from a list or empty the entire list (which was done earlier by assigning an empty list to the slice). For example:

>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.25, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.25, 1234.5]
>>> del a[:]
>>> a
[]

del can also be used to delete an entire variable:

>>> del a

After that, referencing the name a raises an error (at least until another value is assigned to it). Other uses of del are introduced later.
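The error in question is a NameError. The sketch below, which uses a hypothetical second name alias, also shows that del removes the name rather than the object: the list stays reachable through any other name bound to it.

```python
a = [1, 2, 3]
alias = a          # a second name bound to the same list

del a              # removes the name a, not the list object
try:
    a
except NameError as exc:
    message = str(exc)
    print(message)

print(alias)       # [1, 2, 3]
```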

4. Tuples and sequences

We saw that lists and strings have many common properties, such as indexing and slicing operations. They are two examples of sequence data types (see Sequence Types - list, tuple, range). Since Python is an evolving language, other sequence types may be added. There is also another standard sequence type: the tuple.

A tuple consists of several values separated by commas, for example:

>>> t = 12345, 54321, 'hello!'
>>> t[0]
12345
>>> t
(12345, 54321, 'hello!')
>>> # Tuples may be nested:
... u = t, (1, 2, 3, 4, 5)
>>> u
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
>>> # Tuples are immutable:
... t[0] = 88888
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> # but they can contain mutable objects:
... v = ([1, 2, 3], [3, 2, 1])
>>> v
([1, 2, 3], [3, 2, 1])

As you can see, tuples are always enclosed in parentheses on output, so that nested tuples are interpreted correctly. On input, the parentheses are optional, although they are often necessary anyway (when the tuple is part of a larger expression). It is not possible to assign to the individual elements of a tuple. However, it is possible to create tuples that contain mutable objects, such as lists.

Although tuples may look similar to lists, they are usually used in different situations and for different purposes. Tuples are immutable; they usually contain a heterogeneous sequence of elements that are accessed by unpacking (explained later in this section) or by index (or even by attribute in the case of namedtuples). Lists are mutable; their elements are usually of the same type and are accessed by iterating over the list.

A special problem is the construction of tuples containing 0 or 1 elements: the syntax has some extra quirks to accommodate these cases. An empty tuple is created by an empty pair of parentheses. A tuple with one element is constructed by following the value with a comma (enclosing a single value in parentheses is not enough). Ugly, but effective. For example:
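The access styles mentioned above can be sketched with collections.namedtuple. Point is a hypothetical record type chosen for illustration. The last lines show that a tuple stays immutable even though a list stored inside it can still change.

```python
from collections import namedtuple

# Point is a hypothetical record type, for illustration only.
Point = namedtuple('Point', ['x', 'y'])
p = Point(3, 4)

print(p[0], p.x)     # 3 3  (access by index and by attribute)
px, py = p           # unpacking works as with any tuple
print(px + py)       # 7

# A tuple is immutable, but a mutable object inside it can still change
v = ([1, 2, 3], [3, 2, 1])
v[0].append(4)
print(v)             # ([1, 2, 3, 4], [3, 2, 1])
```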

>>> empty = ()
>>> singleton = 'hello',    # <-- note trailing comma
>>> len(empty)
0
>>> len(singleton)
1
>>> singleton
('hello',)

The statement t = 12345, 54321, 'hello!' is an example of tuple packing: the values 12345, 54321 and 'hello!' are packed together into a tuple. The reverse operation is also possible:

>>> x, y, z = t

This is called, appropriately enough, sequence unpacking, and it works for any sequence on the right-hand side. Sequence unpacking requires that the number of variables on the left of the equals sign matches the number of elements in the sequence. Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.
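A short sketch of these rules, with hypothetical variable names: unpacking works on a tuple, a string and a list, a swap is packing followed by unpacking, and a mismatch in the number of names raises a ValueError.

```python
t = 12345, 54321, 'hello!'      # tuple packing
x, y, z = t                     # sequence unpacking
print(x, y, z)                  # 12345 54321 hello!

a, b, c = 'abc'                 # the right-hand side can be a string
first, second = [10, 20]        # ...or a list
first, second = second, first   # swap: packing followed by unpacking
print(first, second)            # 20 10

mismatch = None
try:
    p, q = t                    # three values, only two names
except ValueError as exc:
    mismatch = exc
    print('unpacking failed:', exc)
```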

5. Sets

Python also includes a set data type. A set is an unordered collection with no duplicate elements. Its basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations such as union, intersection, difference and symmetric difference.

Curly braces or the set() function can be used to create sets. Note: to create an empty set you have to use set(), not {}, because the latter creates an empty dictionary, a data structure discussed in the next section.

Here are some simple examples:

>>> basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
>>> print(basket)                      # show that duplicates have been removed
{'orange', 'banana', 'pear', 'apple'}
>>> 'orange' in basket                 # fast membership testing
True
>>> 'crabgrass' in basket
False

>>> # Demonstrate set operations on unique letters from two words
...
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a                                  # unique letters in a
{'a', 'r', 'b', 'c', 'd'}
>>> a - b                              # letters in a but not in b
{'r', 'd', 'b'}
>>> a | b                              # letters in a or b or both
{'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
>>> a & b                              # letters in both a and b
{'a', 'c'}
>>> a ^ b                              # letters in a or b but not both
{'r', 'd', 'b', 'm', 'z', 'l'}
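Each operator used above has a named method counterpart on set objects. The following sketch checks the equivalences and also shows the difference between {} and set(). Because sets are unordered, sorted() is used wherever a predictable display order matters.

```python
a = set('abracadabra')
b = set('alacazam')

print(a - b == a.difference(b))             # True
print(a | b == a.union(b))                  # True
print(a & b == a.intersection(b))           # True
print(a ^ b == a.symmetric_difference(b))   # True

# {} is an empty dict; only set() gives an empty set
print(type({}).__name__, type(set()).__name__)   # dict set

# Sets are unordered, so sort when a stable display order is needed
print(sorted(a & b))                        # ['a', 'c']
```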

Similar to list comprehensions, set comprehensions are also supported:

>>> a = {x for x in 'abracadabra' if x not in 'abc'}
>>> a
{'r', 'd'}

6. Dictionary

Another very useful data type built into Python is the dictionary (see Mapping Types - dict). Dictionaries are sometimes found in other languages as "associative memories" or "associative arrays". Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be of any immutable type; strings and numbers can always be keys. A tuple can be used as a key if it contains only strings, numbers, or tuples; if a tuple contains any mutable object, either directly or indirectly, it cannot be used as a key. A list cannot be used as a key, because it can be modified in place by index assignment, slice assignment, or methods such as append() and extend().

The best way to think of a dictionary is as a set of key: value pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key: value pairs within the braces adds initial key: value pairs to the dictionary; this is also the way dictionaries are written on output.

The main operations on a dictionary are storing a value with some key and retrieving the value for a given key. It is also possible to delete a key: value pair with del. If you store a value using a key that is already in use, the old value associated with that key is forgotten. Retrieving a value with a nonexistent key raises an error (a KeyError).

Calling list(d) on a dictionary returns a list of all the keys in the dictionary, in insertion order (use sorted(d) if you want them sorted instead). To check whether a single key is in the dictionary, use the in keyword.
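The rules about which objects can serve as keys can be tried out as follows. This is a minimal sketch with a hypothetical dictionary named grid: tuples of immutable values are accepted, while a list, or a tuple that contains a list, is rejected with a TypeError.

```python
grid = {}
grid[(0, 1)] = 'a tuple of numbers works'
grid[('x', (1, 2))] = 'nested tuples work too'

list_error = None
try:
    grid[[0, 1]] = 'a list as key'
except TypeError as exc:
    list_error = exc
    print('rejected:', exc)

nested_error = None
try:
    grid[(0, [1, 2])] = 'a tuple containing a list'
except TypeError as exc:
    nested_error = exc
    print('rejected:', exc)

print(len(grid))   # 2
```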

Here is a small example using a dictionary:

>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'jack': 4098, 'sape': 4139, 'guido': 4127}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'jack': 4098, 'guido': 4127, 'irv': 4127}
>>> list(tel)
['jack', 'guido', 'irv']
>>> sorted(tel)
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> 'jack' not in tel
False
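Two behaviours described earlier do not appear in the session above: overwriting an existing key and looking up a missing key. The sketch below covers both. The use of dict.get() to avoid the exception is an addition for illustration, not something the text above prescribes.

```python
tel = {'jack': 4098, 'guido': 4127}

tel['jack'] = 4100            # storing with an existing key replaces the old value
print(tel['jack'])            # 4100

try:
    tel['sape']               # the key does not exist
except KeyError as exc:
    missing = exc.args[0]
    print('no such key:', missing)   # no such key: sape

# get() returns a default instead of raising
print(tel.get('sape'))        # None
print(tel.get('sape', 0))     # 0
print('sape' in tel)          # False
```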

The dict() constructor builds dictionaries directly from sequences of key-value pairs:

>>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
{'sape': 4139, 'guido': 4127, 'jack': 4098}

In addition, dict comprehensions can be used to create dictionaries from arbitrary key and value expressions:

>>> {x: x**2 for x in (2, 4, 6)}
{2: 4, 4: 16, 6: 36}

When the keys are simple strings, it is sometimes easier to specify pairs using keyword arguments:

>>> dict(sape=4139, guido=4127, jack=4098)
{'sape': 4139, 'guido': 4127, 'jack': 4098}

7. Comparing sequences and other types

Sequence objects can typically be compared with other objects of the same sequence type. The comparison uses lexicographical ordering: first the first two elements are compared, and if they differ this determines the outcome of the comparison. If they are equal, the next two elements are compared, and so on, until either sequence is exhausted. If two elements to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all elements of two sequences compare equal, the sequences are considered equal. If one sequence is an initial subsequence of the other, the shorter sequence is the smaller (lesser) one. Lexicographical ordering for strings uses the Unicode code point number to order individual characters. Some examples of comparisons between sequences of the same type:

(1, 2, 3)              < (1, 2, 4)
[1, 2, 3]              < [1, 2, 4]
'ABC' < 'C' < 'Pascal' < 'Python'
(1, 2, 3, 4)           < (1, 2, 4)
(1, 2)                 < (1, 2, -1)
(1, 2, 3)             == (1.0, 2.0, 3.0)
(1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)

Note that comparing objects of different types with < or > is legal provided that the objects have appropriate comparison methods. For example, mixed numeric types are compared according to their numeric value, so 0 equals 0.0, and so on. Otherwise, rather than providing an arbitrary ordering, the interpreter raises a TypeError exception.
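These rules can be checked with a few lines. In this sketch, the first group repeats comparisons of the kind listed above, and the second group shows ordering comparisons between unrelated types failing with a TypeError. The variable names are hypothetical.

```python
# Same-type sequences: lexicographical comparison
print((1, 2, 3) < (1, 2, 4))               # True
print((1, 2) < (1, 2, -1))                 # True: initial subsequence is smaller
print('ABC' < 'C' < 'Pascal' < 'Python')   # True

# Mixed numeric types compare by value
print(0 == 0.0, 1 < 2.5)                   # True True

# Unrelated types have no ordering: < raises TypeError
sequence_error = None
try:
    (1, 2) < [1, 2]
except TypeError as exc:
    sequence_error = exc

text_error = None
try:
    'abc' < 5
except TypeError as exc:
    text_error = exc

print(type(sequence_error).__name__, type(text_error).__name__)  # TypeError TypeError
```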
