Python data analysis - NumPy numerical computing - 1 - ndarray creation and indexing

Posted by exasp on Fri, 19 Nov 2021 02:26:27 +0100

1. Create

(1) ndarray data type

bool

int_ (integer whose precision is determined by the platform), int8, int16, int32, int64 (signed integers)

uint8, uint16, uint32, uint64 (unsigned integers)

float16, float32, float64 (floating-point numbers; the Python built-in float corresponds to float64)

complex64 (complex number whose real and imaginary parts are each a 32-bit floating-point number), complex128 (real and imaginary parts are each a 64-bit floating-point number; the Python built-in complex corresponds to complex128)

All elements of a given ndarray must have the same type.
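
As a small illustration of the single-type rule, the sketch below passes mixed Python values to np.array and inspects the result. The variable name mixed is only illustrative.

```python
import numpy as np

# An integer, a float and a bool go in; NumPy promotes them to one common dtype.
mixed = np.array([1, 2.5, True])

print(mixed.dtype)     # float64
print(mixed.tolist())  # [1.0, 2.5, 1.0]
```

The integer and the boolean are both stored as floats, because float64 is the narrowest of the three input types that can represent all of the values.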

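Scalar values can be converted between these types by calling the type as a function (the aliases np.bool and np.int were removed in NumPy 1.24, so np.bool_ and np.int_ are used here):

import numpy as np
np.float64(42)#Integer to 64-bit float
np.int8(42.0)#Float to 8-bit integer
np.bool_(42)#Any nonzero value becomes True
np.int_(True)#True becomes 1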
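
The calls above convert single values. For a whole array, the usual tool is the astype method, which returns a new array of the requested type. A minimal sketch, with illustrative variable names:

```python
import numpy as np

floats = np.array([1.7, -2.3, 0.0])

# Float to integer conversion truncates toward zero; it does not round.
ints = floats.astype(np.int32)
# Float to boolean conversion maps every nonzero value to True.
bools = floats.astype(np.bool_)

print(ints.tolist())   # [1, -2, 0]
print(bools.tolist())  # [True, True, False]
print(floats.dtype)    # float64 (the original array is unchanged)
```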

(2) ndarray creation

1) array function

np.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0,
      like=None)

object: the data to convert, such as an array, list, or tuple.

dtype: the element type of the created array. If it is omitted, NumPy infers a type that can represent every element of object.

Create one-dimensional and two-dimensional arrays:

np.array([1,2,3,4])
>array([1, 2, 3, 4])
np.array([[1,2,3,4],[5,6,7,8]])
>array([[1, 2, 3, 4],
       [5, 6, 7, 8]])
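
The dtype and ndmin parameters from the signature above can be sketched as follows. The variable names are illustrative only.

```python
import numpy as np

# Force a floating-point array even though the inputs are integers.
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)   # float64

# ndmin=2 pads the shape on the left so the result has at least 2 dimensions.
as_row = np.array([1, 2, 3], ndmin=2)
print(as_row.shape)     # (1, 3)
```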

Common properties of ndarray are as follows:

arr=np.array([[1,2,3,4],[5,6,7,8]])
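arr.ndim#Number of dimensions
>2
arr.shape#Size of each dimension (rows, columns)
>(2, 4)
arr.size#Total number of elements
>8
arr.dtype#Element type (the default integer type is platform-dependent; it is often int64)
>dtype('int32')
arr.itemsize#Size of each element in bytes (int32 takes 32 / 8 = 4 bytes)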
>4
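
These properties are related to each other: the total memory used by the elements is size times itemsize, which NumPy also exposes as nbytes. The sketch below fixes dtype to int32 explicitly so the numbers do not depend on the platform.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.int32)

# 8 elements of 4 bytes each occupy 32 bytes in total.
print(arr.size * arr.itemsize)  # 32
print(arr.nbytes)               # 32
```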

2) Other create functions

Create an arithmetic sequence using arange(start value, end value, step size); the end value is excluded:

np.arange(0,1,0.1)
>array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

Create an arithmetic sequence using linspace(start value, end value, number of elements); the end value is included by default:

np.linspace(0,1,10)
>array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
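
The difference between the two functions comes down to whether the end value is included. A short sketch, assuming the same 0 to 1 range as above, that makes linspace reproduce the arange result and reads back the step it chose:

```python
import numpy as np

a = np.arange(0, 1, 0.1)

# endpoint=False drops the end value, so the step becomes 0.1 as in arange.
b = np.linspace(0, 1, 10, endpoint=False)
print(np.allclose(a, b))  # True

# retstep=True also returns the spacing; with the end value included it is 1/9.
c, step = np.linspace(0, 1, 10, retstep=True)
print(step)               # about 0.111
```

With a non-integer step, linspace is often the safer choice because the number of elements is stated explicitly instead of being derived from floating-point division.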

Create a geometric sequence using logspace(start exponent, end exponent, number of elements); the values run from 10**start to 10**end:

np.logspace(1,100,5)
>array([1.00000000e+001, 5.62341325e+025, 3.16227766e+050, 1.77827941e+075,
       1.00000000e+100])
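
Because the arguments of logspace are exponents, a smaller range is easier to read. The sketch below checks that logspace matches 10 raised to a linspace, that the ratio between neighbours is constant, and that np.geomspace gives the same sequence when handed the actual end values instead of exponents.

```python
import numpy as np

g = np.logspace(0, 3, 4)             # 10**0, 10**1, 10**2, 10**3
h = 10.0 ** np.linspace(0, 3, 4)     # the same values built by hand
print(np.allclose(g, h))             # True

ratios = g[1:] / g[:-1]              # neighbouring ratio is 10 everywhere
print(np.allclose(ratios, 10.0))     # True

# geomspace takes the start and end values themselves, not their exponents.
print(np.allclose(g, np.geomspace(1, 1000, 4)))  # True
```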

Use zeros((number of rows, number of columns)) to create an all-zero matrix:

np.zeros((2,3))
>array([[0., 0., 0.],
       [0., 0., 0.]])

Create an identity matrix using eye(number of rows):

np.eye(3)
>array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Create a diagonal matrix using diag(diagonal elements):

np.diag([1,2,3,4])
>array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])

Use ones((number of rows, number of columns)) to create a matrix of all ones:

np.ones((2,3))
>array([[1., 1., 1.],
       [1., 1., 1.]])
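
Two details of these constructors are worth a quick sketch: they produce floats unless a dtype is given, and diag works in both directions (a one-dimensional input builds a matrix, a two-dimensional input extracts its diagonal). Variable names are illustrative.

```python
import numpy as np

# zeros, ones and eye default to float64; pass dtype to change that.
z = np.zeros((2, 3), dtype=int)
print(z.dtype.kind)        # i (an integer type)

# 1-D input: build a diagonal matrix. 2-D input: read the diagonal back.
m = np.diag([1, 2, 3, 4])
d = np.diag(m)
print(m.shape)             # (4, 4)
print(d.tolist())          # [1, 2, 3, 4]
```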

(3) Random number

The functions for generating random numbers are all in the np.random module:

seed: sets the random number seed

permutation: returns a random permutation of a sequence

shuffle: randomly reorders a sequence in place

random: generates random floating-point numbers in [0, 1)

rand: generates an ndarray of the given shape filled with uniform random numbers in [0, 1)

randint: generates random integers from a lower bound (inclusive) to an upper bound (exclusive)

randn: generates random numbers from the standard normal distribution

binomial: generates random numbers from a binomial distribution

normal: generates random numbers from a normal distribution

beta: generates random numbers from a beta distribution

chisquare: generates random numbers from a chi-square distribution

gamma: generates random numbers from a gamma distribution

uniform: generates uniformly distributed random numbers in [low, high), which is [0, 1) by default

np.random.random(10)
>array([0.49905372, 0.85914534, 0.01105556, 0.78294205, 0.56929015,
       0.63081794, 0.29417168, 0.80998379, 0.38684665, 0.49672103])
np.random.rand(2,3)
>array([[0.71282353, 0.87641901, 0.85941578],
       [0.74710244, 0.34053078, 0.13364268]])
np.random.randn(2,3)
>array([[ 0.60000428,  0.94897067,  1.20422687],
       [-0.65710487, -0.0812635 , -0.45994949]])
np.random.randint(low=1,high=10,size=[2,3])
>array([[9, 4, 9],
       [9, 8, 9]])
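
The outputs above change on every run. Setting a seed makes them repeatable, as sketched below. The second half uses np.random.default_rng, the newer Generator interface available since NumPy 1.17; it is not part of the original text and is shown only as an alternative.

```python
import numpy as np

# The same seed gives the same sequence of random numbers.
np.random.seed(42)
first = np.random.rand(3)
np.random.seed(42)
second = np.random.rand(3)
print(np.array_equal(first, second))  # True

# Generator interface: integers() excludes the upper bound, like randint.
rng = np.random.default_rng(42)
draws = rng.integers(low=1, high=10, size=(2, 3))
print(draws.shape)                    # (2, 3)
```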

2. Indexing and slicing

(1) One-dimensional index

arr=np.random.random(10)
arr
>array([0.74412323, 0.5186568 , 0.85832988, 0.29784057, 0.83864654,
       0.84512263, 0.77279106, 0.20471927, 0.95401965, 0.67786501])
arr[5]
>0.8451226296662746
arr[3:5]#Elements at indices 3 and 4 (the fourth and fifth); the end index is excluded
>array([0.29784057, 0.83864654])
arr[:5]
>array([0.74412323, 0.5186568 , 0.85832988, 0.29784057, 0.83864654])
arr[:-1]#Everything except the last element
>array([0.74412323, 0.5186568 , 0.85832988, 0.29784057, 0.83864654,
       0.84512263, 0.77279106, 0.20471927, 0.95401965])
arr[2:4]=100#Modify element value
arr
>array([  0.74412323,   0.5186568 , 100.        , 100.        ,
         0.83864654,   0.84512263,   0.77279106,   0.20471927,
         0.95401965,   0.67786501])
arr[1:-1:2]#Every second element from index 1 up to, but excluding, the last element
>array([  0.5186568 , 100.        ,   0.84512263,   0.20471927])
arr[5:1:-1]#Step backward from index 5 down to index 2
>array([  0.84512263,   0.83864654, 100.        , 100.        ])
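
One consequence of the slice assignment shown above is worth spelling out: a slice is a view onto the original data, not a copy, so writing through the slice changes the original array. A minimal sketch with illustrative names:

```python
import numpy as np

arr = np.arange(10)

view = arr[2:5]          # a view: shares memory with arr
view[:] = -1             # writes through to arr
print(arr.tolist())      # [0, 1, -1, -1, -1, 5, 6, 7, 8, 9]

safe = arr[5:8].copy()   # an independent copy
safe[:] = 99             # arr is not affected
print(arr[5:8].tolist()) # [5, 6, 7]
```

Calling copy() is the usual way to keep a slice that can be modified without touching the source array.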

(2) Multidimensional index

In a multidimensional index, each dimension gets its own index, and the indexes are separated by commas.

arr=np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
arr
>array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
arr[0,2]#Row 1, column 3.
>3
arr[1,1:4]#Row 2, columns 2 to 4
>array([6, 7, 8])
arr[0:3,1:3]#Rows 1 to 3, columns 2 to 3
>array([[ 2,  3],
       [ 6,  7],
       [10, 11]])
arr[:,0:3]#All rows, columns 1 to 3
>array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11]])
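
A point that the examples above imply but do not state: an integer index removes its dimension, while a slice keeps it. The sketch below uses the same 3 by 4 array to show the resulting shapes.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row = arr[1]             # integer index: one row, 1-D result
col = arr[:, 1]          # integer index on columns: also 1-D
col_2d = arr[:, 1:2]     # slice on columns: stays 2-D

print(row.tolist(), row.shape)  # [5, 6, 7, 8] (4,)
print(col.tolist(), col.shape)  # [2, 6, 10] (3,)
print(col_2d.shape)             # (3, 1)
print(arr[-1, ::2].tolist())    # [9, 11]
```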

(3) Fancy index

arr=np.array([np.arange(i*4,i*4+4) for i in np.arange(6)])
arr
>array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
arr[[1,5,4,2],[0,3,1,2]]#Returns the elements at coordinates (1,0), (5,3), (4,1), (2,2)
>array([ 4, 23, 17, 10])
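
Unlike a slice, a fancy index returns a copy of the selected data. The sketch below rebuilds the same 6 by 4 array with arange and reshape (a shorter construction than the list comprehension, used here only for brevity), selects whole rows in a custom order, and shows that editing the result leaves the original untouched.

```python
import numpy as np

arr = np.arange(24).reshape(6, 4)

# A single index list selects whole rows, in the order given.
picked = arr[[1, 5, 4, 2]]
print(picked.shape)        # (4, 4)
print(picked[0].tolist())  # [4, 5, 6, 7]

# The result is a copy, so this assignment does not reach arr.
picked[0, 0] = -1
print(arr[1, 0])           # 4
```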

Use the np.ix_ function to convert two one-dimensional integer ndarrays into an indexer that selects a rectangular region:

np.ix_ effectively takes the Cartesian product of the two one-dimensional arrays and returns the elements at every resulting coordinate pair.

arr[np.ix_([1,5,4,2],[0,3,1,2])]
>array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [16, 19, 17, 18],
       [ 8, 11,  9, 10]])
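
The Cartesian-product behaviour can be reproduced by hand with broadcasting, which may help explain what np.ix_ does. In the sketch below the row indices are turned into a column vector, so pairing them with the column indices covers every combination. The names rows, cols, block and manual are illustrative.

```python
import numpy as np

arr = np.arange(24).reshape(6, 4)
rows = np.array([1, 5, 4, 2])
cols = np.array([0, 3, 1, 2])

block = arr[np.ix_(rows, cols)]

# Same selection by hand: a (4, 1) row index broadcast against a (4,) column index.
manual = arr[rows[:, np.newaxis], cols]

print(np.array_equal(block, manual))  # True
print(block.shape)                    # (4, 4)
print(block[1, 1] == arr[5, 3])       # True
```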
