1, Python data model
1. Special methods
Special methods let our own objects respond to Python's built-in syntax. In the following example, len() and [] are used to get the length and an indexed item, but because the class defines __len__ and __getitem__ itself, the values returned are whatever we chose.
    class Testlen():
        # This is a class that tests special methods
        # def __init__(self):

        def __len__(self):
            return 99

        def __getitem__(self, pos):
            return '233'


    if __name__ == '__main__':
        test_1 = Testlen()
        length = len(test_1)
        item = test_1[0]
        print(length, item)

    out:
    99 233
So what is the point of doing so?
The point is that user-defined objects can be used through the same interface as built-in ones: len(obj) works for both. For built-in types such as list, str, and bytearray, CPython takes a shortcut: len() does not call a __len__ method but directly returns the ob_size field of PyVarObject, the C struct that represents variable-length built-in objects in memory. Reading this value directly is much faster than calling a method.
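To make the payoff more concrete, here is a minimal sketch. The Playlist class and its contents are hypothetical and not part of the original notes. It suggests that implementing just __len__ and __getitem__ is enough for an object to work with len(), indexing, slicing, iteration, and the in operator:

```python
class Playlist:
    """Hypothetical container used only for illustration."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Called by len(playlist)
        return len(self._songs)

    def __getitem__(self, pos):
        # Called by playlist[pos]; delegating to the list also enables slicing
        return self._songs[pos]


playlist = Playlist(['intro', 'verse', 'chorus'])

length = len(playlist)            # uses __len__
first = playlist[0]               # uses __getitem__
last_two = playlist[1:]           # slice objects are passed to __getitem__ too
titles = [s for s in playlist]    # iteration falls back to __getitem__
has_verse = 'verse' in playlist   # membership test falls back to iteration

print(length, first, last_two, titles, has_verse)
```

Delegating to an internal list keeps the sketch short; a real class might validate pos or return a new Playlist for slices.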
2, Array of sequences
1. Readability of list comprehensions
Basic:
    symbols = 'abcdefg'
    codes = []
    for symbol in symbols:
        codes.append(ord(symbol))
    print(codes)

    out:
    [97, 98, 99, 100, 101, 102, 103]
Advanced:
    symbols = 'abcdefg'
    codes = [ord(symbol) for symbol in symbols]
    print(codes)

    out:
    [97, 98, 99, 100, 101, 102, 103]
2. Cartesian product
A list comprehension can also use two for clauses to build a Cartesian product, and a generator expression can do the same to feed other sequence types:
Basic:
    symbols = 'abc'
    nums = '123'
    codes = [(num, symbol) for num in nums for symbol in symbols]
    print(codes)

    out:
    [('1', 'a'), ('1', 'b'), ('1', 'c'), ('2', 'a'), ('2', 'b'), ('2', 'c'), ('3', 'a'), ('3', 'b'), ('3', 'c')]
Advanced:
    symbols = 'abc'
    nums = '123'
    for mul in ((num, symbol) for num in nums for symbol in symbols):
        print(mul)

    out:
    ('1', 'a')
    ('1', 'b')
    ('1', 'c')
    ('2', 'a')
    ('2', 'b')
    ('2', 'c')
    ('3', 'a')
    ('3', 'b')
    ('3', 'c')
With a generator expression, a list of all 9 combinations is never built in memory, because the generator expression produces one combination at a time as the for loop runs. Using a generator expression to initialize sequences other than lists likewise avoids the cost of an intermediate list.
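As a small sketch of initializing non-list sequences this way (the choice of tuple and array.array as targets is an illustrative assumption, not something the notes prescribe):

```python
import array

symbols = 'abcdefg'

# Build a tuple directly from a generator expression (no intermediate list)
codes_tuple = tuple(ord(symbol) for symbol in symbols)

# Build an array of unsigned ints; the generator expression needs its own
# parentheses here because array.array takes two arguments
codes_array = array.array('I', (ord(symbol) for symbol in symbols))

print(codes_tuple)
print(codes_array)
```

When the generator expression is the only argument of a call, as with tuple(), the extra pair of parentheses can be omitted.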
3. Tuples are not just immutable lists
Besides serving as immutable lists, tuples can be used as records without field names. A for loop can extract the items of each tuple into separate variables, which is called unpacking, and unpacking is what makes tuples so convenient as records. Tuple unpacking works with any iterable object. The only hard requirement is that the iterable yields exactly one item per variable on the receiving side, unless a starred variable is used to absorb the excess.
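The following sketch illustrates these points with made-up (name, age) records; the data and variable names are hypothetical:

```python
# Made-up records: each tuple is (name, age)
records = [('Alice', 30), ('Bob', 25)]

lines = []
for name, age in records:      # each tuple is unpacked into two variables
    lines.append(f'{name} is {age}')

# Unpacking works for any iterable with the right number of items
x, y, z = 'abc'

# A mismatch in the number of items raises ValueError
try:
    p, q = (1, 2, 3)
except ValueError as exc:
    error_message = str(exc)

print(lines)
print(x, y, z)
print(error_message)
```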
Basic:
    a = 1
    b = 2
    temp = a
    a = b
    b = temp
    print(a, b)

    out:
    2 1
Advanced:
    a = 1
    b = 2
    a, b = b, a
    print(a, b)

    out:
    2 1
This swap, more elegant than the temporary-variable idiom used in many other languages, relies on tuple unpacking.
Basic:
    num = (20, 8)
    output = divmod(num[0], num[1])
    print(output)

    out:
    (2, 4)
Advanced:
    num = (20, 8)
    output = divmod(*num)
    print(output)

    out:
    (2, 4)
Here the * operator unpacks an iterable into separate arguments of the function call. In a function definition, *args collects an arbitrary number of positional arguments, and in an assignment a starred target collects the excess items.
For example:
    a, *b, c, d = range(7)
    print(a, b, c, d)

    out:
    0 [1, 2, 3, 4] 5 6
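The other use of the star mentioned above, *args in a function definition, might look like this minimal sketch (the average function is a hypothetical helper introduced only for illustration):

```python
def average(*args):
    # args is a tuple holding however many positional arguments were passed
    return sum(args) / len(args)


values = [2, 4, 6, 8]

result_direct = average(2, 4)        # two arguments collected into args
result_unpacked = average(*values)   # * unpacks the list into four arguments

print(result_direct, result_unpacked)
```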
collections.namedtuple is a factory function that builds a tuple subclass with a class name and field names, which is very helpful when debugging. An instance of a class built with namedtuple uses the same amount of memory as a plain tuple, because the field names are stored in the class. It is also smaller than an ordinary object instance, because Python does not create a per-instance __dict__ to store its attributes. Named tuples offer more features than are described here.
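A brief sketch of a named tuple in use; the Point class and its fields x and y are hypothetical names chosen for illustration:

```python
from collections import namedtuple

# Hypothetical record type: class name 'Point', fields 'x' and 'y'
Point = namedtuple('Point', ['x', 'y'])

p = Point(3, 4)
print(p)              # the repr shows the class name and field names
print(p.x, p[1])      # access by field name or by position

x, y = p              # unpacks like a plain tuple

print(Point._fields)      # field names are stored on the class
print(Point.__slots__)    # empty __slots__: no per-instance __dict__ is created
```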
Tuples support all list methods except those that add or remove items, with one exception: tuples have no __reversed__ method. The built-in reversed() still works on tuples, because it falls back to the sequence protocol.
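A quick way to check this on CPython, as a minimal sketch:

```python
t = (1, 2, 3)

list_has_reversed = hasattr(list, '__reversed__')
tuple_has_reversed = hasattr(tuple, '__reversed__')
tuple_has_append = hasattr(tuple, 'append')

# reversed() still works on a tuple through __len__ and __getitem__
reversed_t = tuple(reversed(t))

print(list_has_reversed, tuple_has_reversed, tuple_has_append, reversed_t)
```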
4. Slice
In Python, sequence types such as list, tuple, and str support slicing.
In slices and ranges, excluding the last item is the Python convention. It fits the zero-based indexing used in Python, C, and many other languages, and it has these benefits:
- When only the stop position is given, it is easy to see how many items the slice or range contains: range(3) and my_list[:3] both produce 3 items.
- When both the start and stop positions are given, the length is simply stop - start.
- It is easy to split a sequence into two non-overlapping parts at any index x: just write my_list[:x] and my_list[x:].
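The three benefits listed above can be checked with a short sketch (the list contents are arbitrary example values):

```python
my_list = [10, 20, 30, 40, 50, 60]

# Only the stop position: the slice and the range both have 3 items
first_three = my_list[:3]
count_in_range = len(range(3))

# Start and stop: the length is stop - start = 5 - 2 = 3
middle = my_list[2:5]

# Split at any index x into two non-overlapping parts
x = 4
head, tail = my_list[:x], my_list[x:]

print(first_three, count_in_range)
print(middle)
print(head, tail)
```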
The correct way to write an ellipsis is three ASCII periods (...), not the single horizontal ellipsis character at Unicode code point U+2026 (…). To the Python parser, ... is a token, but it is actually an alias of the Ellipsis object, the single instance of the ellipsis class. It can be passed as a function argument or used as part of a slice specification, as in f(a, ..., z) or a[i:...].
In NumPy, ... is used as a shortcut when slicing multidimensional arrays. If x is a four-dimensional array, then x[i, ...] is shorthand for x[i, :, :, :].
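To see what an object actually receives when ... appears in an index, here is a pure-Python sketch that avoids a NumPy dependency. The Grid class is hypothetical and simply returns whatever index it is given:

```python
class Grid:
    """Hypothetical class that just reports the index it receives."""

    def __getitem__(self, index):
        return index


g = Grid()

same_object = ... is Ellipsis        # ... is an alias of the Ellipsis object
type_name = type(...).__name__       # name of the class of that object

tuple_index = g[1, ...]              # arrives as the tuple (1, Ellipsis)
slice_index = g[1:...]               # arrives as slice(1, Ellipsis, None)

print(same_object, type_name, tuple_index, slice_index)
```

A library such as NumPy interprets the Ellipsis inside that tuple as "all remaining dimensions".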
5. Using + and * with sequences
Incorrect:
    # Initialize a flat list (fine when the items are immutable)
    length = 3
    array_1 = [] * length
    array_2 = [0] * length
    print(array_1)
    print(array_2)

    # Initialize a matrix the wrong way: every row is the same list object
    mat_1 = [[]] * length
    mat_2 = [[0]] * length
    mat_3 = [[0] * length] * length
    print(mat_1)
    print(mat_2)
    print(mat_3)

    out:
    []
    [0, 0, 0]
    [[], [], []]
    [[0], [0], [0]]
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
If the items of sequence a in the expression a * n are references to mutable objects, take special care, because the result may be surprising. For example, initializing a list of lists with my_list = [[]] * 3 produces a list whose three items are three references to the same inner list.
As a result, a change made through one item shows up in all of them, so a single row cannot be modified independently.
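The aliasing problem only becomes visible once an item is modified, which this minimal sketch demonstrates:

```python
length = 3
mat = [[0] * length] * length   # three references to the same inner list

mat[0][0] = 1                   # intended to change only the first row
print(mat)                      # every row shows the change

# All rows are the very same object
rows_are_same_object = mat[0] is mat[1] is mat[2]
print(rows_are_same_object)
```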
The * operator can still help build such lists quickly when the outer list is created by a list comprehension, so that each row is a new object:
Correct:
    # Initialize a matrix, incomplete: the rows are distinct but empty
    mat_4 = [[] * length for n in range(length)]
    # Initialize the matrix in the correct way
    mat_5 = [[0] * length for n in range(length)]
    print(mat_4)
    print(mat_5)

    out:
    [[], [], []]
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]]