Introduction to the Python NumPy module: basic operations, data processing, and array creation

Posted by T_Hayden on Sun, 10 Oct 2021 11:16:53 +0200

Introduction to Numpy

NumPy (Numerical Python) is an open-source Python library for scientific computing, used to process arrays of arbitrary dimension quickly.
NumPy supports the common array and matrix operations. For the same numerical task, NumPy code is usually much shorter and simpler than the equivalent plain Python.
NumPy handles multidimensional arrays with the ndarray object, a fast and flexible container for large datasets.
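To make the "simpler than plain Python" point concrete, here is a minimal sketch that adds two sequences element by element, first with a list comprehension and then with NumPy. The data and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical sample data, for illustration only
prices = [10.0, 20.0, 30.0]
taxes = [1.0, 2.0, 3.0]

# Plain Python: an explicit loop (here a list comprehension) is needed
totals_list = [p + t for p, t in zip(prices, taxes)]

# NumPy: the + operator works element by element on whole arrays
totals_array = np.array(prices) + np.array(taxes)

print(totals_list)            # [11.0, 22.0, 33.0]
print(totals_array.tolist())  # [11.0, 22.0, 33.0]
```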

Introduction to ndarray

NumPy provides an N-dimensional array type, ndarray, which describes a collection of "items" that all have the same type.
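One visible consequence of the "same type" rule: when the input mixes integers and a float, NumPy stores every item with one common type. A small sketch (the variable name is arbitrary):

```python
import numpy as np

# Mixing integers and a float: NumPy picks one common type for every item
mixed = np.array([1, 2, 3.5])

print(mixed.tolist())  # [1.0, 2.0, 3.5]
print(mixed.dtype)     # float64
```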

Storing data in an ndarray:

Creating an array is straightforward:

import numpy as np

# Create an ndarray from a nested list (8 rows, 5 columns)
score = np.array([[80, 89, 86, 67, 79],
                  [78, 97, 89, 67, 81],
                  [90, 94, 78, 67, 74],
                  [91, 91, 90, 67, 69],
                  [76, 87, 75, 67, 86],
                  [70, 79, 84, 67, 84],
                  [94, 92, 93, 67, 64],
                  [86, 85, 83, 67, 80]])
score

Comparing the performance of ndarray and native Python lists

A quick timing run shows the benefit of ndarray:

import random
import time
import numpy as np

# Build a large list of random floats (slow, and needs several GB of RAM at this size)
a = []
for i in range(100000000):
    a.append(random.random())

# Time Python's built-in sum on the list
t1 = time.time()
sum1 = sum(a)
t2 = time.time()

# Time NumPy's sum on an ndarray holding the same values
b = np.array(a)
t3 = time.time()
sum2 = np.sum(b)
t4 = time.time()

# t2 - t1 is the time taken by Python's built-in sum,
# t4 - t3 is the time taken by np.sum
print(t2 - t1, t4 - t3)


The output shows that summing the ndarray is much faster than summing the list.

  • Machine learning involves a very large number of numerical operations. Without a fast solution, Python might not perform well in this field.

    NumPy is designed specifically for array storage and computation, so its storage efficiency and input/output performance are much better than those of nested Python lists. The larger the array, the more pronounced NumPy's advantage.
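The storage side of this claim can be checked roughly as follows. This is only a sketch: sys.getsizeof reports CPython-specific sizes, and the list estimate simply adds the list's own pointer table to the size of each float object.

```python
import sys
import numpy as np

n = 1000
values = [float(i) for i in range(n)]        # Python list of float objects
array = np.array(values, dtype=np.float64)   # one contiguous block of 8-byte floats

# Rough list footprint: the list's pointer table plus every float object
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
array_bytes = array.nbytes                   # data buffer only: n * 8 bytes

print(array_bytes)               # 8000
print(list_bytes > array_bytes)  # True
```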

N-dimensional array - ndarray

Properties of ndarray

Array properties reflect the information inherent in the array itself.

Attribute name    Attribute description
ndarray.shape     Tuple of array dimensions
ndarray.ndim      Number of array dimensions
ndarray.size      Number of elements in the array
ndarray.itemsize  Size of one array element, in bytes
ndarray.dtype     Data type of the array elements
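A short sketch that reads each of these attributes from one small array. The dtype is set explicitly here so that the printed values do not depend on the platform; the array itself is just sample data.

```python
import numpy as np

# dtype is given explicitly so the results are the same on every platform
grid = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(grid.shape)     # (2, 3)
print(grid.ndim)      # 2
print(grid.size)      # 6
print(grid.itemsize)  # 8
print(grid.dtype)     # int64
```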

Shape of ndarray (ndarray.shape)

# Shape of ndarray
# Create arrays of different shapes
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([1, 2, 3, 4])
c = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

a.shape  # (2, 3), two-dimensional array
b.shape  # (4,), one-dimensional array
c.shape  # (2, 2, 3), three-dimensional array
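The shape ties the other attributes together: the length of the shape tuple is ndim, and the product of its entries is size. A sketch that checks this on arrays like the ones above (math.prod assumes Python 3.8 or later):

```python
import math
import numpy as np

arrays = [
    np.array([1, 2, 3, 4]),                  # shape (4,)
    np.array([[1, 2, 3], [4, 5, 6]]),        # shape (2, 3)
    np.array([[[1, 2, 3], [4, 5, 6]],
              [[1, 2, 3], [4, 5, 6]]]),      # shape (2, 2, 3)
]

for arr in arrays:
    # The length of the shape tuple is ndim; the product of its entries is size
    assert len(arr.shape) == arr.ndim
    assert math.prod(arr.shape) == arr.size
    print(arr.shape, arr.ndim, arr.size)
# (4,) 1 4
# (2, 3) 2 6
# (2, 2, 3) 3 12
```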

Type of ndarray (ndarray.dtype)

  • Note: if no dtype is specified, integers default to int64 on most platforms (int32 on Windows before NumPy 2.0) and decimals default to float64
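The defaults are easy to check on your own machine. In this sketch the integer result is deliberately left without a fixed expected value, because it can differ between platforms and NumPy versions.

```python
import numpy as np

ints = np.array([1, 2, 3])     # no dtype given, integers only
floats = np.array([1.0, 2.0])  # no dtype given, contains decimals

print(ints.dtype)    # platform dependent, int64 on most systems
print(floats.dtype)  # float64
```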
Name           Description                                                         Abbreviation
np.bool_       Boolean (True or False), stored in one byte                         '?'
np.int8        Integer, one byte, -128 to 127                                      'i1'
np.int16       Integer, -32768 to 32767                                            'i2'
np.int32       Integer, -2**31 to 2**31 - 1                                        'i4'
np.int64       Integer, -2**63 to 2**63 - 1                                        'i8'
np.uint8       Unsigned integer, 0 to 255                                          'u1'
np.uint16      Unsigned integer, 0 to 65535                                        'u2'
np.uint32      Unsigned integer, 0 to 2**32 - 1                                    'u4'
np.uint64      Unsigned integer, 0 to 2**64 - 1                                    'u8'
np.float16     Half-precision float: 16 bits (1 sign, 5 exponent, 10 mantissa)     'f2'
np.float32     Single-precision float: 32 bits (1 sign, 8 exponent, 23 mantissa)   'f4'
np.float64     Double-precision float: 64 bits (1 sign, 11 exponent, 52 mantissa)  'f8'
np.complex64   Complex number; real and imaginary parts are 32-bit floats          'c8'
np.complex128  Complex number; real and imaginary parts are 64-bit floats          'c16'
np.object_     Python object                                                       'O'
np.string_     Byte string (np.bytes_ in NumPy 2.0 and later)                      'S'
np.unicode_    Unicode string (np.str_ in NumPy 2.0 and later)                     'U'
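The ranges and bit layouts in the table do not have to be memorized; NumPy can report them through np.iinfo and np.finfo. A minimal sketch covering a few of the types:

```python
import numpy as np

# Integer ranges reported by NumPy itself
for int_type in (np.int8, np.int32, np.uint16):
    info = np.iinfo(int_type)
    print(int_type.__name__, info.min, info.max)
# int8 -128 127
# int32 -2147483648 2147483647
# uint16 0 65535

# Bit layout of the floating-point types: total, exponent, mantissa bits
for float_type in (np.float16, np.float32, np.float64):
    info = np.finfo(float_type)
    print(float_type.__name__, info.bits, info.nexp, info.nmant)
# float16 16 5 10
# float32 32 8 23
# float64 64 11 52
```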

Example code

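# Type of ndarray: specify the element type when creating the array
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
a.dtype  # dtype('float32')

# np.bytes_ is the byte-string type; its older alias np.string_ was removed in NumPy 2.0
arr = np.array(['python', 'tensorflow', 'scikit-learn', 'numpy'], dtype=np.bytes_)
arr.dtype  # dtype('S12')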
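The dtype argument also accepts the abbreviation strings, which some people find quicker to type. The sketch below assumes nothing beyond NumPy itself; the variable names are arbitrary.

```python
import numpy as np

# The same dtype requested by class and by abbreviation string
by_class = np.array([1, 2, 3], dtype=np.float32)
by_code = np.array([1, 2, 3], dtype='f4')
print(by_class.dtype == by_code.dtype)  # True

# With 'S', NumPy picks a byte-string width that fits the longest item
words = np.array(['numpy', 'scikit-learn'], dtype='S')
print(words.itemsize)                  # 12
print(words.dtype == np.dtype('S12'))  # True
```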
