Introduction to NumPy
NumPy (Numerical Python) is an open-source scientific computing library for Python, used to process arrays of arbitrary dimensions quickly.
NumPy supports common array and matrix operations. For the same numerical task, using NumPy is usually much simpler than writing plain Python.
NumPy handles multidimensional arrays through the ndarray object, a fast and flexible container for large data sets.
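As a small illustration of the claim that NumPy is simpler than plain Python for the same numerical task, the sketch below doubles every value in a list both ways. The variable names and values are made up for this example.

```python
import numpy as np

# Plain Python: the element-wise loop has to be written out explicitly.
prices = [1.5, 2.0, 3.25, 4.0]
doubled_list = [p * 2 for p in prices]

# NumPy: the operation is written once and applied to the whole array.
prices_arr = np.array(prices)
doubled_arr = prices_arr * 2

print(doubled_list)          # [3.0, 4.0, 6.5, 8.0]
print(doubled_arr.tolist())  # [3.0, 4.0, 6.5, 8.0]
```

Both versions give the same values; the NumPy version simply needs no loop.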
Introduction to ndarray
NumPy provides an N-dimensional array type ndarray, which describes a collection of "items" of the same type.
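To see what "items of the same type" means in practice, here is a minimal sketch with illustrative values. When the input mixes types, NumPy picks one common element type for the whole array.

```python
import numpy as np

# Mixing integers and a float: every item is stored as a float.
mixed_numbers = np.array([1, 2, 3.5])
print(mixed_numbers.dtype)     # float64

# Mixing numbers and text: every item is stored as a Unicode string.
mixed_text = np.array([1, "two", 3])
print(mixed_text.dtype.kind)   # U
```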
Storing data in an ndarray:
Create an array from a nested list:

    import numpy as np

    # Create an ndarray from a nested list
    score = np.array([[80, 89, 86, 67, 79],
                      [78, 97, 89, 67, 81],
                      [90, 94, 78, 67, 74],
                      [91, 91, 90, 67, 69],
                      [76, 87, 75, 67, 86],
                      [70, 79, 84, 67, 84],
                      [94, 92, 93, 67, 64],
                      [86, 85, 83, 67, 80]])
    score
Performance comparison between ndarray and native Python lists
Here we see the benefit of ndarray by running a short piece of code:

    import random
    import time
    import numpy as np

    # Build a Python list of random floats
    # (100 million elements; lower this if memory is limited)
    a = []
    for i in range(100000000):
        a.append(random.random())

    # Time Python's built-in sum on the list
    t1 = time.time()
    sum1 = sum(a)
    t2 = time.time()

    # Time NumPy's sum on the same data
    b = np.array(a)
    t4 = time.time()
    sum3 = np.sum(b)
    t5 = time.time()

    # t2 - t1 is the time taken by Python's built-in sum;
    # t5 - t4 is the time taken by NumPy's sum
    print(t2 - t1, t5 - t4)

Running this shows that summing the ndarray is much faster than summing the list.
- Machine learning involves a very large number of numerical operations on data. Without a fast way to perform them, Python would struggle to achieve good results in this field.
NumPy is designed specifically for operations on ndarrays, so the storage efficiency and input/output performance of its arrays are much better than those of nested Python lists. The larger the array, the more pronounced NumPy's advantage.
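For a lighter-weight check of the same comparison, the sketch below uses the standard-library `timeit` module on a smaller array. The size `n` and the repeat count are assumptions chosen so it runs quickly, and the actual timings will vary by machine.

```python
import random
import timeit
import numpy as np

n = 1_000_000  # assumed size for a quick run
values = [random.random() for _ in range(n)]
arr = np.array(values)

# Average over several repetitions to smooth out timing noise.
repeats = 10
list_time = timeit.timeit(lambda: sum(values), number=repeats) / repeats
numpy_time = timeit.timeit(lambda: np.sum(arr), number=repeats) / repeats
print(f"list sum:  {list_time:.6f} s")
print(f"numpy sum: {numpy_time:.6f} s")

# Both approaches compute the same total, up to floating-point rounding.
print(np.isclose(sum(values), np.sum(arr)))  # True

# The array stores raw 8-byte floats in one contiguous buffer.
print(arr.nbytes)  # 8000000
```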
N-dimensional array - ndarray
Properties of ndarray
Array properties reflect the information inherent in the array itself.
Attribute | Description |
---|---|
ndarray.shape | Tuple of array dimensions |
ndarray.ndim | Number of array dimensions |
ndarray.size | Number of elements in the array |
ndarray.itemsize | Size of one array element, in bytes |
ndarray.dtype | Type of the array elements |
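The attributes in the table can be read directly from any array. A minimal sketch, using a small made-up array and an explicit `dtype` so that `itemsize` does not depend on the platform default:

```python
import numpy as np

# dtype is set explicitly so the results are the same on every platform.
scores = np.array([[80, 89, 86], [78, 97, 89]], dtype=np.int64)

print(scores.shape)     # (2, 3)
print(scores.ndim)      # 2
print(scores.size)      # 6
print(scores.itemsize)  # 8
print(scores.dtype)     # int64
```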
Shape of ndarray (ndarray.shape)
    import numpy as np

    # Create arrays of different shapes
    a = np.array([[1, 2, 3], [4, 5, 6]])
    b = np.array([1, 2, 3, 4])
    c = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

    a.shape  # (2, 3): two-dimensional array
    b.shape  # (4,): one-dimensional array
    c.shape  # (2, 2, 3): three-dimensional array
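The shape ties the other attributes together: `ndim` is the length of the shape tuple, and `size` is the product of its entries. The following sketch checks this for the three arrays above (`math.prod` assumes Python 3.8 or later).

```python
import math
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([1, 2, 3, 4])
c = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

for arr in (a, b, c):
    # ndim counts the axes; size multiplies the length of every axis.
    same_ndim = arr.ndim == len(arr.shape)
    same_size = arr.size == math.prod(arr.shape)
    print(arr.shape, same_ndim, same_size)
```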
Type of ndarray (ndarray.dtype)
- Note: if no dtype is specified, integers default to int64 on most platforms and floating-point numbers default to float64
Name | Description | Abbreviation |
---|---|---|
np.bool_ | Boolean (True or False), stored in one byte | '?' |
np.int8 | Integer, one byte, -128 to 127 | 'i1' |
np.int16 | Integer, -32768 to 32767 | 'i2' |
np.int32 | Integer, -2^31 to 2^31 - 1 | 'i4' |
np.int64 | Integer, -2^63 to 2^63 - 1 | 'i8' |
np.uint8 | Unsigned integer, 0 to 255 | 'u1' |
np.uint16 | Unsigned integer, 0 to 65535 | 'u2' |
np.uint32 | Unsigned integer, 0 to 2^32 - 1 | 'u4' |
np.uint64 | Unsigned integer, 0 to 2^64 - 1 | 'u8' |
np.float16 | Half-precision floating-point number: 16 bits (1 sign bit, 5 exponent bits, 10 mantissa bits) | 'f2' |
np.float32 | Single-precision floating-point number: 32 bits (1 sign bit, 8 exponent bits, 23 mantissa bits) | 'f4' |
np.float64 | Double-precision floating-point number: 64 bits (1 sign bit, 11 exponent bits, 52 mantissa bits) | 'f8' |
np.complex64 | Complex number whose real and imaginary parts are each a 32-bit floating-point number | 'c8' |
np.complex128 | Complex number whose real and imaginary parts are each a 64-bit floating-point number | 'c16' |
np.object_ | Python object | 'O' |
np.string_ | Byte string (removed in NumPy 2.0; use np.bytes_) | 'S' |
np.unicode_ | Unicode string (removed in NumPy 2.0; use np.str_) | 'U' |
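The string abbreviations can be passed anywhere a dtype is expected. The sketch below compares a few of them with the corresponding NumPy types and uses `np.iinfo` to look up an integer range rather than memorising it.

```python
import numpy as np

# A string code and the matching NumPy type describe the same dtype.
print(np.dtype("i4") == np.int32)        # True
print(np.dtype("u1") == np.uint8)        # True
print(np.dtype("f8") == np.float64)      # True
print(np.dtype("c16") == np.complex128)  # True

# The number in the code is the element size in bytes.
print(np.dtype("f8").itemsize)  # 8

# np.iinfo reports the value range of an integer type.
info = np.iinfo(np.int32)
print(info.min, info.max)  # -2147483648 2147483647
```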
Example code
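    import numpy as np

    # Specify the element type explicitly
    a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
    a.dtype  # dtype('float32')

    # np.bytes_ is the byte-string type (np.string_ was removed in NumPy 2.0)
    arr = np.array(['python', 'tensorflow', 'scikit-learn', 'numpy'], dtype=np.bytes_)
    arr.dtype  # dtype('S12')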
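For contrast, here is a minimal sketch of what happens when `dtype` is left out, which is the case the note on defaults describes. The default integer type depends on the platform and NumPy version, so only its kind is checked here.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1.0, 2.0, 3.0])

print(ints.dtype)       # platform default integer, commonly int64
print(ints.dtype.kind)  # i (signed integer)
print(floats.dtype)     # float64
```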