Basic Python data analysis tutorial: NumPy Learning Guide (2nd Edition) note 15: Chapter 7 special function 1 - sorting

Posted by system_critical on Fri, 14 Jan 2022 13:01:43 +0100

Chapter VII Special Functions

This chapter mainly introduces NumPy functions that are useful in financial calculations and signal processing.

7.1 sorting

NumPy provides a variety of sorting functions, as shown below:

  • The sort function returns a sorted copy of the array;
  • The lexsort function sorts lexicographically using a sequence of keys;
  • The argsort function returns the indices that would sort the input array;
  • The sort method of the ndarray class sorts the array in place;
  • The msort function sorts along the first axis (equivalent to np.sort(a, axis=0); msort is deprecated in recent NumPy releases and was removed in NumPy 2.0);
  • The sort_complex function sorts complex numbers by real part first and then by imaginary part.
    Of the functions listed above, sort and argsort are also available as methods of NumPy arrays.
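As a minimal sketch of how these calls differ, the following compares sort, argsort, and the in-place sort method on a small array. The array values are made up for illustration and do not come from the book.

```python
import numpy as np

# Illustrative data; the values are arbitrary.
a = np.array([3, 1, 2])

sorted_copy = np.sort(a)   # new sorted array; a itself is unchanged
order = np.argsort(a)      # indices that would sort a
print(sorted_copy)         # [1 2 3]
print(order)               # [1 2 0]
print(a[order])            # indexing with the order gives the sorted values

b = a.copy()
b.sort()                   # the ndarray method sorts in place and returns None
print(b)                   # [1 2 3]

# Sorting along the first axis (what the list above describes for msort)
m = np.array([[3, 0], [1, 2]])
first_axis_sorted = np.sort(m, axis=0)
print(first_axis_sorted)
```

Using np.sort(m, axis=0) for the first-axis case should work regardless of whether msort is available in the installed NumPy version.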

7.2 hands on practice: sorting lexicographically

The lexsort function in NumPy returns the indices that sort the input in lexicographic order. We pass lexsort an array or tuple of keys to sort by; the last key in the sequence is the primary sort key. The steps are as follows.

  • (1) Recall the AAPL stock price data we used in Chapter 3. We will now reuse that data for a completely different purpose, loading the closing prices and the dates. Dates always need special handling, so we define a dedicated conversion function for them.

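    import numpy as np
    import datetime
    
    def datestr2num(s):
        # loadtxt hands the converter bytes or str, depending on NumPy version and encoding
        if isinstance(s, bytes):
            s = s.decode('ascii')
        return datetime.datetime.strptime(s, "%d-%m-%Y").toordinal()
    
    # Column 1 holds the date, column 6 the closing price
    dates, closes = np.loadtxt('AAPL.csv', delimiter=',', usecols=(1, 6), converters={1: datestr2num}, unpack=True)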
    
  • (2) Sort with the lexsort function. The data is already sorted by date, but we now make the closing price the primary sort key (lexsort treats the last key as the primary one, so the dates only break ties):

    indices = np.lexsort((dates, closes))
    
    print( "Indices", indices)
    print( ["%s %s" % (datetime.date.fromordinal(int(dates[i])),  closes[i]) for i in indices])
    

    The output is as follows:

    Indices [ 0 16  1 17 18  4  3  2  5 28 19 21 15  6 29 22 27 20  9  7 25 26 10  8
     14 11 23 12 24 13]
    ['2011-01-28 336.1', '2011-02-22 338.61', '2011-01-31 339.32', '2011-02-23 342.62', '2011-02-24 342.88', '2011-02-03 343.44', '2011-02-02 344.32', '2011-02-01 345.03', '2011-02-04 346.5', '2011-03-10 346.67', '2011-02-25 348.16', '2011-03-01 349.31', '2011-02-18 350.56', '2011-02-07 351.88', '2011-03-11 351.99', '2011-03-02 352.12', '2011-03-09 352.47', '2011-02-28 353.21', '2011-02-10 354.54', '2011-02-08 355.2', '2011-03-07 355.36', '2011-03-08 355.76', '2011-02-11 356.85', '2011-02-09 358.16', '2011-02-17 358.3', '2011-02-14 359.18', '2011-03-03 359.56', '2011-02-15 359.9', '2011-03-04 360.0', '2011-02-16 363.13']
    

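The complete code for this example is as follows:

import numpy as np
import datetime

def datestr2num(s):
    # loadtxt hands the converter bytes or str, depending on NumPy version and encoding
    if isinstance(s, bytes):
        s = s.decode('ascii')
    return datetime.datetime.strptime(s, "%d-%m-%Y").toordinal()

# Column 1 holds the date, column 6 the closing price
dates, closes = np.loadtxt('AAPL.csv', delimiter=',', usecols=(1, 6), converters={1: datestr2num}, unpack=True)
# The last key (closes) is the primary sort key; dates break ties
indices = np.lexsort((dates, closes))

print("Indices", indices)
print(["%s %s" % (datetime.date.fromordinal(int(dates[i])), closes[i]) for i in indices])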
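If AAPL.csv is not at hand, the key ordering used by lexsort can still be tried on a tiny synthetic data set. The sketch below is only an illustration: the arrays days and prices are hypothetical stand-ins for the dates and closing prices, with duplicate prices added on purpose so that the tie-breaking key matters.

```python
import numpy as np

# Hypothetical stand-in data (not from AAPL.csv): day numbers and closing prices.
days = np.array([1, 2, 3, 4, 5])
prices = np.array([10.0, 9.5, 10.0, 9.0, 9.5])

# The LAST key is the primary sort key; earlier keys only break ties.
indices = np.lexsort((days, prices))
print(indices)           # [3 1 4 0 2]
print(prices[indices])   # prices in ascending order

# Negating the secondary key reverses the tie-break: latest day first.
newest_first = np.lexsort((-days, prices))
print(newest_first)      # [3 4 1 2 0]
```

Equal prices (9.5 and 10.0 each appear twice) come out ordered by day in the first call and by reversed day in the second, which is the behaviour the example above relies on.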

7.3 complex numbers

A complex number has a real part and an imaginary part. As mentioned in the previous chapter, NumPy has dedicated complex types that represent a complex number with two floating-point numbers. Arrays of complex numbers can be sorted with NumPy's sort_complex function, which orders by real part first and then by imaginary part.
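The "real part first, then imaginary part" rule is easiest to see when some real parts are equal. The following is a small sketch with hand-picked values (an assumption for illustration, not data from the book):

```python
import numpy as np

# Illustrative values: two pairs share the same real part.
z = np.array([2 + 1j, 1 + 5j, 2 - 3j, 1 + 0j])

sorted_z = np.sort_complex(z)
print(sorted_z)   # 1+0j, 1+5j, 2-3j, 2+1j: ties on the real part are ordered by imaginary part

# sort_complex returns a complex array even when the input is real.
from_real = np.sort_complex(np.array([3, 1, 2]))
print(from_real.dtype)
```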

7.4 hands on practice: sorting complex numbers

We will create a complex array and sort it. The steps are as follows.

  • (1) Generate 5 random numbers as real parts and 5 random numbers as imaginary parts. Set the random number seed to 42:

    import numpy as np
    
    np.random.seed(42)
    complex_numbers = np.random.random(5) + 1j * np.random.random(5)
    print("Complex numbers\n", complex_numbers)
    

    Output is:

    Complex numbers
     [0.37454012+0.15599452j 0.95071431+0.05808361j 0.73199394+0.86617615j
     0.59865848+0.60111501j 0.15601864+0.70807258j]
    
  • (2) Call the sort_complex function to sort the complex numbers generated above:

    print("Sorted\n", np.sort_complex(complex_numbers))
    

    The sorted results are as follows:

    Sorted
     [0.15601864+0.70807258j 0.37454012+0.15599452j 0.59865848+0.60111501j
     0.73199394+0.86617615j 0.95071431+0.05808361j]
    

The complete code for this example is as follows:

import numpy as np

np.random.seed(42)
complex_numbers = np.random.random(5) + 1j * np.random.random(5)
print("Complex numbers\n", complex_numbers)
print("Sorted\n", np.sort_complex(complex_numbers))

7.5 search

There are several functions in NumPy that can search in the array, as shown below.

  • The argmax function returns the subscript corresponding to the largest value in the array.

    >>> a = np.array([2, 4, 8])
    >>> np.argmax(a)
    2
    
  • The nanargmax function provides the same functionality, but ignores the NaN value.

    >>> b = np.array([np.nan, 2, 4])
    >>> np.nanargmax(b)
    2
    
  • The argmin and nanargmin functions work the same way, but return the index of the minimum value.

  • The argwhere function returns the indices of the elements that satisfy a condition (that is, where the condition array is non-zero), with one row of indices per matching element.

    >>> a = np.array([2, 4, 8])
    >>> np.argwhere(a <= 4)
    array([[0],
           [1]])
    
  • The searchsorted function finds the index at which a given value should be inserted to keep a sorted array in order. It uses binary search, so each lookup takes O(log n) time.

  • The extract function returns the array elements that meet the specified conditions.
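The minimum-value variants and argwhere on a two-dimensional array are not shown in the list above, so here is a brief sketch of both. The arrays are made-up examples.

```python
import numpy as np

# Illustrative values with one NaN.
a = np.array([4.0, np.nan, 1.0, 8.0])

lowest = np.nanargmin(a)    # index of the smallest value, ignoring NaN
highest = np.nanargmax(a)   # index of the largest value, ignoring NaN
print(lowest, highest)      # 2 3

# For a 2-D array, argwhere returns one (row, column) pair per match.
grid = np.array([[0, 5], [7, 0]])
positions = np.argwhere(grid > 0)
print(positions)
```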

7.6 hands on practice: using the searchsorted function

The searchsorted function returns, for a given value, the index in a sorted array at which the value can be inserted while keeping the array sorted. The following example makes this clearer. Complete the steps below.

  • (1) We need a sorted array. Use the arange function to create an array in ascending order:

    import numpy as np
    
    a = np.arange(5)
    
  • (2) Now let's call the searchsorted function:

    indices = np.searchsorted(a, [-2, 7])
    print("Indices", indices)
    

    These are the insertion indices that keep the array sorted:

    Indices [0 5]
    
  • (3) Use the insert function to build a complete array:

    print("The full array", np.insert(a, indices, [-2, 7]))
    

    The results are as follows:

    The full array [-2  0  1  2  3  4  7]
    

The complete code for this example is as follows:

import numpy as np

a = np.arange(5)
indices = np.searchsorted(a, [-2, 7])
print("Indices", indices)

print("The full array", np.insert(a, indices, [-2, 7]))
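When the value being searched for already occurs in the array, the side parameter of searchsorted decides whether the returned index falls before or after the existing entries. A minimal sketch, using an illustrative array with repeated values:

```python
import numpy as np

# The input must already be sorted in ascending order.
a = np.array([1, 2, 2, 2, 5])

left = np.searchsorted(a, 2)                  # default side='left': before the run of 2s
right = np.searchsorted(a, 2, side='right')   # after the run of 2s
print(left, right)     # 1 4
print(right - left)    # 3, the number of occurrences of 2
```

Either index keeps the array sorted after insertion; the difference between the two gives a cheap way to count occurrences in sorted data.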

7.7 array element extraction

NumPy's extract function extracts elements from an array according to a condition. It is similar to the where function we encountered in Chapter 3. The nonzero function returns the indices of the non-zero elements of an array.
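To make the relationship between extract, boolean indexing, and where concrete, the sketch below selects the negative values of a small array in three ways. The data is invented for illustration; for a one-dimensional array the three results should be identical.

```python
import numpy as np

# Illustrative values.
a = np.array([5, -3, 8, 0, -1])
condition = a < 0

by_extract = np.extract(condition, a)   # elements where the condition holds
by_mask = a[condition]                  # boolean indexing
by_where = a[np.where(condition)]       # where with one argument returns indices

print(by_extract)   # [-3 -1]
print(by_mask)      # [-3 -1]
print(by_where)     # [-3 -1]
```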

7.8 hands on practice: extracting elements from arrays

We want to extract even elements from an array. The steps are as follows.

  • (1) Create an array using the arange function:

    import numpy as np
    
    a = np.arange(7)
    
  • (2) Generate condition variables for selecting even elements:

    condition = (a % 2) == 0
    
  • (3) Use the extract function to extract elements from the array based on the generated conditions:

    print("Even numbers", np.extract(condition, a))
    

    The even elements in the output array are as follows:

    Even numbers [0 2 4 6]
    
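  • (4) Use the nonzero function to locate the non-zero elements of the array. Note that it returns their indices, as a tuple with one array per dimension; for this particular array the indices happen to equal the values: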

    print("Non zero", np.nonzero(a))
    

    The output results are as follows:

    Non zero (array([1, 2, 3, 4, 5, 6]),)
    

The complete code for this example is as follows:

import numpy as np

a = np.arange(7)
condition = (a % 2) == 0
print("Even numbers", np.extract(condition, a))
print("Non zero", np.nonzero(a))
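Because the example array is arange(7), the indices returned by nonzero coincide with the values, which can hide the fact that they are indices. The following sketch uses made-up arrays where the two differ, including a two-dimensional case:

```python
import numpy as np

# 1-D case: indices and values differ.
v = np.array([0, 7, 0, 9])
idx = np.nonzero(v)
print(idx)       # a tuple holding one index array: positions 1 and 3
print(v[idx])    # [7 9], the values at those positions

# 2-D case: one index array per dimension.
m = np.array([[0, 3], [4, 0]])
rows, cols = np.nonzero(m)
print(rows, cols)      # [0 1] [1 0]
print(m[rows, cols])   # [3 4]
```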
