[Tensor data operations] Reading notes for Dive into Deep Learning

Posted by gaugeboson on Tue, 25 Jan 2022 17:25:37 +0100

Tensor data operations

1. Tensor creation

# Import some common libraries
import torch
from IPython import display
from matplotlib import pyplot as plt
import numpy as np
import random
import torch.nn as nn
import torch.optim as optim # torch.optim module provides many common optimization algorithms, such as SGD, Adam and RMSProp.
from torch.nn import init
import torch.utils.data as Data
x=torch.zeros(2, 3)# tensor([[0., 0., 0.], [0., 0., 0.]])
y=torch.rand(2, 3)# tensor([[0.4825, 0.5103, 0.4058], [0.0201, 0.4958, 0.5905]])
z=torch.tensor([5.5, 2])# tensor([5.5000, 2.0000])
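
A few other creation functions come up often as well; the lines below are a small addition to these notes for quick reference.

a = torch.arange(0, 6)# tensor([0, 1, 2, 3, 4, 5])
b = torch.full((2, 3), 7.0)# tensor([[7., 7., 7.], [7., 7., 7.]])
c = torch.ones(2, 3, dtype=torch.long)# the dtype can be specified explicitly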

A Tensor can also be created from an existing Tensor. By default, these methods reuse some properties of the input Tensor, such as its data type and device, unless you specify new ones.

x = torch.eye(5, 3)
y = x.new_ones(5, 3, dtype=torch.float64)# By default the returned tensor has the same torch.dtype and torch.device as x
z = torch.randn_like(x, dtype=torch.float) # Specify a new data type
# print(f"{x}\n{y}\n{z}")
print(x)# tensor([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [0., 0., 0.], [0., 0., 0.]])
print(y)# tensor([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.]], dtype=torch.float64)
print(z)# tensor([[ 0.4024, -0.2069, -1.3955], [ 0.2060, -0.0344, -0.7559], [-0.3694,  1.6153, -0.0104], [-0.1180,  0.3266, -0.4363], [ 1.1623,  0.0751,  0.9289]])
print(x.size())# torch.Size([5, 3])
print(x.shape)# torch.Size([5, 3])

2. Indexing

The result of indexing shares memory with the original data: if one is modified, the other is modified as well.

y = x[0, :]
y += 1
print(y)# tensor([2., 1., 1.])
# The source tensor has also been changed
print(x[0, :]) # tensor([2., 1., 1.])
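
When the source should stay untouched, index and then clone(); a small sketch added to these notes:

y2 = x[0, :].clone()# clone() copies the data instead of sharing it
y2 += 1
print(x[0, :]) # still tensor([2., 1., 1.]), x is unchanged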

3. Changing shape

Note that the new Tensor returned by view() may have a different shape from the source Tensor, but they share the same data, so changing one changes the other. Although the Tensor returned by view() shares data with the source Tensor, it is still a new Tensor (a Tensor has other attributes besides its data), and their ids (memory addresses) differ.

y = x.view(15)
z = x.view(-1, 5)  # -1 refers to a dimension that can be derived from the values of other dimensions
print(x.size(), y.size(), z.size())# torch.Size([5, 3]) torch.Size([15]) torch.Size([3, 5])
x += 1
print(x)# tensor([[4., 3., 3.], [2., 3., 2.], [2., 2., 3.], [2., 2., 2.], [2., 2., 2.]])
print(y) # tensor([4., 3., 3., 2., 3., 2., 2., 2., 3., 2., 2., 2., 2., 2., 2.])
x_cp = x.clone().view(15)# clone() first makes a real copy, then view() reshapes it, so x_cp no longer shares data with x
x -= 1
print(x)# tensor([[3., 2., 2.], [1., 2., 1.],[1., 1., 2.], [1., 1., 1.], [1., 1., 1.]])
print(x_cp)# tensor([3., 2., 2., 1., 2., 1., 1., 1., 2., 1., 1., 1., 1., 1., 1.])
x = torch.randn(1)# item() can convert a scalar Tensor into a Python number
print(x)# tensor([-0.6769])
print(x.item())# -0.6768880486488342
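
To confirm that view() returns a new Tensor object over the same underlying storage, id() and data_ptr() can be compared; this check is an addition to the notes:

a = torch.ones(5, 3)
b = a.view(15)
print(id(a) == id(b))# False: different Tensor objects
print(a.data_ptr() == b.data_ptr())# True: same underlying memory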

4. Broadcasting mechanism

# When an elementwise operation is applied to two tensors of different shapes, broadcasting may be triggered: elements are copied appropriately so that the two tensors end up with the same shape, and the operation is then carried out elementwise.
x = torch.arange(1, 3).view(1, 2)
print(x)# tensor([[1, 2]])
y = torch.arange(1, 4).view(3, 1)
print(y)# tensor([[1], [2], [3]])
print(x + y)# tensor([[2, 3], [3, 4], [4, 5]])
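
Shapes are aligned from the trailing dimension, and a dimension of size 1 is stretched to match; shapes that cannot be aligned this way raise an error. A small added illustration:

a = torch.ones(3, 1)
b = torch.ones(1, 2)
print((a + b).shape)# torch.Size([3, 2])
# torch.ones(3, 2) + torch.ones(4, 2) would raise a RuntimeError instead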

5. Memory overhead of operations

Indexing operations do not allocate new memory, but an operation such as y = x + y allocates new memory and then points y at it. To demonstrate this, we can use Python's built-in id() function: if two instances have the same id, they refer to the same memory address; otherwise they do not.

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
id_before = id(y)
y = y + x
print(id(y) == id_before) # False 
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
id_before = id(y)
y[:] = y + x # Use indexing to write the result into y's original memory
print(id(y) == id_before) # True
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
id_before = id(y)
torch.add(x, y, out=y) # y += x or y.add_(x) would also write the result into y's original memory
print(id(y) == id_before) # True
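
In-place methods (those ending in an underscore, such as add_()) reuse the existing storage as well; a quick check with data_ptr(), added to these notes:

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
ptr_before = y.data_ptr()# address of y's underlying storage
y.add_(x) # in-place version of y = y + x
print(y.data_ptr() == ptr_before) # True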

6. Converting between Tensor and NumPy

[Tensor to NumPy]: the NumPy array produced by numpy() shares memory with the source Tensor (which is why the conversion is fast). When one of them is changed, the other changes as well.

a = torch.ones(5)
b = a.numpy()
print(a, b)# tensor([1., 1., 1., 1., 1.]) [1. 1. 1. 1. 1.]
a += 1
print(a, b)# tensor([2., 2., 2., 2., 2.]) [2. 2. 2. 2. 2.]
b += 1
print(a, b)# tensor([3., 3., 3., 3., 3.]) [3. 3. 3. 3. 3.]
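
One caveat worth adding to these notes: numpy() only works for CPU tensors, so a tensor on the GPU has to be moved back with cpu() first. A small sketch, guarded so it only runs when a GPU is available:

if torch.cuda.is_available():
    g = torch.ones(3, device='cuda')# a CUDA tensor cannot call numpy() directly
    print(g.cpu().numpy())# [1. 1. 1.]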

[NumPy to Tensor]: use from_numpy() or torch.tensor(). The Tensor produced by from_numpy() shares memory with the NumPy array (which is why the conversion is fast), so changing one changes the other. torch.tensor(), by contrast, always copies the data, and the returned Tensor no longer shares memory with the original array.

import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
print(a, b)# [1. 1. 1. 1. 1.] tensor([1., 1., 1., 1., 1.], dtype=torch.float64)

a += 1
print(a, b)# [2. 2. 2. 2. 2.] tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

b += 1
print(a, b)# [3. 3. 3. 3. 3.] tensor([3., 3., 3., 3., 3.], dtype=torch.float64)

c = torch.tensor(a)
a += 1
print(a, c)# [4. 4. 4. 4. 4.] tensor([3., 3., 3., 3., 3.], dtype=torch.float64)
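
As another aside not in the original notes, torch.as_tensor() behaves like from_numpy() here: it avoids a copy and shares memory with the NumPy array when the dtype and device already match.

d = torch.as_tensor(a)# shares memory with a, unlike torch.tensor(a) above
a += 1
print(a, d)# [5. 5. 5. 5. 5.] tensor([5., 5., 5., 5., 5.], dtype=torch.float64)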

7. Gradient

If a Tensor's requires_grad attribute is set to True, all operations on it will be tracked, so that gradients can be propagated back through the chain rule.

After the computation is finished, calling backward() computes all the gradients; the gradient of this Tensor is accumulated into its grad attribute.

The grad_fn attribute records whether the Tensor was produced by an operation: if so, grad_fn is an object describing that operation, otherwise it is None.

x = torch.ones(2, 2, requires_grad=True)
print(x)# tensor([[1., 1.], [1., 1.]], requires_grad=True)
print(x.grad_fn)# None

For example:

y = x + 2
print(y)# tensor([[3., 3.], [3., 3.]], grad_fn=<AddBackward0>)

# y was created by an addition, so it has a grad_fn of type <AddBackward0>
print(y.grad_fn)# <AddBackward0 object at 0x000002ADB0796048>

z = y * y * 3
out = z.mean()
print(z, out)# tensor([[27., 27.], [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)

# Tensors created directly, like x, are called leaf nodes; their grad_fn is None
print(x.is_leaf, y.is_leaf) # True False
a = torch.randn(2, 2) # requires_grad defaults to False when not specified
a = ((a * 3) / (a - 1))
print(a.requires_grad) # False

a.requires_grad_(True)# requires_grad_() changes the requires_grad attribute in place
print(a.requires_grad) # True

b = (a * a).sum()
print(b.grad_fn)# <SumBackward0 object at 0x0000016BBB329208>
x = torch.ones(2, 2, requires_grad=True)
y = x + 2
z = y * y * 3
out = z.mean()
out.backward() # Equivalent to out.backward(torch.tensor(1.))
print(x.grad)# tensor([[4.5000, 4.5000], [4.5000, 4.5000]])
# Backpropagate again; note that gradients accumulate
out2 = x.sum()
out2.backward()
print(x.grad)# tensor([[5.5000, 5.5000], [5.5000, 5.5000]])

out3 = x.sum()
x.grad.data.zero_()# The gradient needs to be cleared before back propagation
out3.backward()
print(x.grad)# tensor([[1., 1.], [1., 1.]])
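# (Added sketch, not from the original notes) In a real training loop, the same
# clearing step is usually done through the optimizer, using the nn / optim imports above:
net = nn.Linear(2, 1) # a toy model, purely for illustration
optimizer = optim.SGD(net.parameters(), lr=0.01)
inp, target = torch.randn(4, 2), torch.randn(4, 1)
loss = nn.MSELoss()(net(inp), target)
optimizer.zero_grad() # clear accumulated gradients, like x.grad.data.zero_() above
loss.backward()
optimizer.step()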
x = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)
y = 2 * x
z = y.view(2, 2)
print(z)# tensor([[2., 4.], [6., 8.]], grad_fn=<ViewBackward0>)
# A tensor cannot be differentiated with respect to a tensor directly; only a scalar can be differentiated with respect to a tensor, and the resulting gradient has the same shape as the independent variable
v = torch.tensor([[1.0, 0.1], [0.01, 0.001]], dtype=torch.float)
z.backward(v)# z is not a scalar, so backward() must be given a weight tensor v of the same shape as z; the weighted sum (z * v).sum() is the scalar that actually gets differentiated
print(x.grad)# tensor([2.0000, 0.2000, 0.0200, 0.0020])
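
To see why the result is [2.0000, 0.2000, 0.0200, 0.0020]: since z is just 2 * x reshaped, the gradient of (z * v).sum() with respect to x is 2 * v flattened. A quick added check, not part of the original notes:

print(torch.allclose(x.grad, 2 * v.view(-1)))# True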

Interrupting gradient tracking

x = torch.tensor(1.0, requires_grad=True)
y1 = x ** 2 
with torch.no_grad():
    y2 = x ** 3
y3 = y1 + y2

print(x.requires_grad)# True
print(y1, y1.requires_grad) # tensor(1., grad_fn=<PowBackward0>) True
print(y2, y2.requires_grad) # tensor(1.) False
print(y3, y3.requires_grad) # tensor(2., grad_fn=<AddBackward0>) True
y3.backward()
print(x.grad)# tensor(2.)
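
Because y2 was computed inside torch.no_grad(), it contributes nothing to the gradient, so x.grad is just dy1/dx = 2x = 2. As an aside not covered in the original notes, detach() is another way to exclude a tensor from gradient tracking; a minimal sketch:

x = torch.tensor(1.0, requires_grad=True)
y = (x ** 3).detach()# detach() returns a tensor cut off from the computation graph
print(y.requires_grad)# False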

Topics: Python, PyTorch, Deep Learning