NumPy: understanding broadcasting

Posted by Syto on Thu, 10 Feb 2022 17:16:54 +0100

Brief introduction

Broadcasting describes how NumPy handles arithmetic between arrays of different shapes. When a smaller array is combined with a larger one, the smaller array is broadcast across the larger one so that their shapes become compatible.

This article explains how broadcasting works in NumPy, using concrete examples.

Basic broadcasting

Normally, an element-wise operation on two arrays requires them to have exactly the same shape, so that every element of one array has a counterpart in the other. For example:

>>> import numpy as np
>>> a = np.array([1.0, 2.0, 3.0])
>>> b = np.array([2.0, 2.0, 2.0])
>>> a * b
array([ 2.,  4.,  6.])

With NumPy's broadcasting, however, the shapes do not have to match exactly.

For example, we can multiply an array by a scalar:

>>> a = np.array([1.0, 2.0, 3.0])
>>> b = 2.0
>>> a * b
array([ 2.,  4.,  6.])

The result is the same as in the previous example: NumPy conceptually stretches the scalar b into an array with the same shape as a.

NumPy is smart enough to use the original scalar value without actually making copies of it, so broadcasting saves memory and keeps the computation as efficient as possible.

The code in the second example is more efficient than the code in the first example because the broadcast moves less memory during multiplication (b is a scalar rather than an array).
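The "no copy" behavior can be observed with np.broadcast_to, which returns a read-only view rather than a filled array. The snippet below is only a minimal sketch (the variable names are illustrative): the broadcast view of the scalar has a stride of 0, meaning that all of its elements refer to the same value in memory.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

# Explicitly broadcast the scalar 2.0 to the shape of a.
# The result is a read-only view, not a filled copy.
b_view = np.broadcast_to(2.0, a.shape)

print(b_view.shape)    # (3,)
print(b_view.strides)  # (0,): all three elements share one memory location

# Multiplying by the scalar, by the view, or by a full array gives the same values.
result_scalar = a * 2.0
result_view = a * b_view
result_full = a * np.array([2.0, 2.0, 2.0])
print(np.array_equal(result_scalar, result_full))  # True
```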

Broadcasting rules

When operating on two arrays, NumPy compares their shapes dimension by dimension, starting from the trailing (rightmost) dimension and working left. Two dimensions are compatible when either of the following holds:

  1. They are equal
  2. One of them is 1

If neither condition is met, an exception is raised: ValueError: operands could not be broadcast together.
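The rule can be written out by hand as a short function. The sketch below is illustrative only: broadcast_shape is a hypothetical helper, not part of NumPy, and it assumes shapes are given as tuples of non-negative integers. Its answer is compared with the shape NumPy itself computes through np.broadcast.

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: return the broadcast shape, or raise ValueError."""
    result = []
    # Walk both shapes from the trailing dimension; missing dimensions count as 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(reversed(result))


# Compare the hand-written rule with what NumPy computes for the same shapes.
shape = broadcast_shape((4, 1), (5,))
numpy_shape = np.broadcast(np.zeros((4, 1)), np.zeros((5,))).shape
print(shape, numpy_shape)  # (4, 5) (4, 5)
```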

The two arrays do not need to have the same number of dimensions. Missing leading dimensions are treated as if they had size 1.

For example, a 256x256x3 array representing colors can be multiplied by a one-dimensional array of three elements:

Image  (3d array): 256 x 256 x 3
Scale  (1d array):             3
Result (3d array): 256 x 256 x 3
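As a rough illustration of the image case, the sketch below uses an array of ones as a stand-in for real pixel data, and the three scale factors are arbitrary values chosen for the example.

```python
import numpy as np

# Stand-in for an RGB image: every pixel is (1.0, 1.0, 1.0).
image = np.ones((256, 256, 3))
# One scale factor per color channel (values chosen for illustration).
scale = np.array([0.5, 1.0, 2.0])

# scale has shape (3,), which matches the trailing dimension of image,
# so it is applied to the three channels of every pixel.
scaled = image * scale

print(scaled.shape)  # (256, 256, 3)
```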

When one of the two compared dimensions is 1, the other one is used. In other words, a dimension of size 1 is stretched to match the corresponding dimension of the other array:

A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5

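In the above example, the 1 in A's second dimension is stretched to 7, the 1 in B's second dimension is stretched to 6, and the 1 in A's fourth dimension is stretched to 5. B is also treated as if it had a leading dimension of size 1, which is stretched to 8.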
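The resulting shape can be checked directly. This is a small sketch that uses zero-filled arrays, since only the shapes matter here:

```python
import numpy as np

A = np.zeros((8, 1, 6, 1))
B = np.zeros((7, 1, 5))

# B is treated as shape (1, 7, 1, 5) before the size-1 dimensions are stretched.
result = A + B
print(result.shape)  # (8, 7, 6, 5)
```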

Here are some more examples:

A      (2d array):  5 x 4
B      (1d array):      1
Result (2d array):  5 x 4

A      (2d array):  5 x 4
B      (1d array):      4
Result (2d array):  5 x 4

A      (3d array):  15 x 3 x 5
B      (3d array):  15 x 1 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 1
Result (3d array):  15 x 3 x 5

The following are examples of mismatches:

A      (1d array):  3
B      (1d array):  4 # trailing dimensions do not match

A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3 # second from last dimensions mismatched
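
Both mismatches can be reproduced in code. The following sketch builds zero-filled arrays of the shapes listed above and catches the ValueError that NumPy raises when they are added:

```python
import numpy as np

# The two incompatible shape pairs listed above.
mismatched = [((3,), (4,)), ((2, 1), (8, 4, 3))]

errors = []
for shape_a, shape_b in mismatched:
    try:
        np.zeros(shape_a) + np.zeros(shape_b)
    except ValueError as exc:
        # NumPy reports that the operands could not be broadcast together.
        errors.append(str(exc))
        print(shape_a, shape_b, "->", exc)
```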

Here is an example in actual code:

>>> x = np.arange(4)
>>> xx = x.reshape(4,1)
>>> y = np.ones(5)
>>> z = np.ones((3,4))

>>> x.shape
(4,)

>>> y.shape
(5,)

>>> x + y
ValueError: operands could not be broadcast together with shapes (4,) (5,)

>>> xx.shape
(4, 1)

>>> y.shape
(5,)

>>> (xx + y).shape
(4, 5)

>>> xx + y
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.],
       [ 4.,  4.,  4.,  4.,  4.]])

>>> x.shape
(4,)

>>> z.shape
(3, 4)

>>> (x + z).shape
(3, 4)

>>> x + z
array([[ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.]])

Broadcasting also provides a convenient way to take the outer product (or any other outer operation) of two arrays. The following example shows an outer addition of two 1-dimensional arrays:

>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[  1.,   2.,   3.],
       [ 11.,  12.,  13.],
       [ 21.,  22.,  23.],
       [ 31.,  32.,  33.]])
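
As a cross-check, the same pattern can be compared with NumPy's dedicated outer functions. This is a sketch for illustration; np.add.outer and np.outer are used here only to confirm that the broadcast expressions give the same values.

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

# a[:, np.newaxis] has shape (4, 1); combined with b of shape (3,) it yields (4, 3).
outer_sum = a[:, np.newaxis] + b
outer_product = a[:, np.newaxis] * b

print(np.array_equal(outer_sum, np.add.outer(a, b)))  # True
print(np.array_equal(outer_product, np.outer(a, b)))  # True
```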

Here a[:, np.newaxis] inserts a new axis, converting the 1-dimensional array of shape (4,) into a 2-dimensional array of shape (4, 1):

>>> a[:, np.newaxis]
array([[ 0.],
       [10.],
       [20.],
       [30.]])
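
There are several equivalent ways to build the same column-shaped array. The sketch below lists a few of them; the variable names are illustrative.

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])

col_newaxis = a[:, np.newaxis]
col_none = a[:, None]                   # np.newaxis is an alias for None
col_reshape = a.reshape(-1, 1)          # -1 lets NumPy infer the length
col_expand = np.expand_dims(a, axis=1)  # insert a new axis at position 1

print(col_newaxis.shape)  # (4, 1)
print(np.array_equal(col_newaxis, col_reshape))  # True
```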

This article has been included in http://www.flydean.com/07-python-numpy-broadcasting/

