Detailed explanation of neural network calculation using GPU

Posted by virken on Fri, 28 Jan 2022 04:58:12 +0100

An earlier post, "PyTorch learning notes (6): a simple LeNet network model using GPU", also touched on how to run computation on the GPU. It did not go into detail, but it covered enough to get started.

Summary (if you already know these points, you can skip the detailed explanation below):

  1. By default, PyTorch computes on the CPU.
  2. Variables or models on the CPU cannot be used in a computation together with variables or models on the GPU; that is, the model and its variables must be on the same device.
  3. .cuda() moves a variable or model to the GPU; .cpu() moves it back to the CPU.
  4. You can also define device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') and then call .to(device) to move a variable or model to that device.
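The four points above can be sketched in one short script. This is a minimal illustration, not part of the original notes: the tensor shape and the nn.Linear layer are arbitrary choices, and the script falls back to the CPU when no GPU is present.

```python
import torch
from torch import nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.ones(2, 3)     # tensors are created on the CPU by default
net = nn.Linear(3, 1)    # model parameters also start on the CPU

x = x.to(device)         # Tensor.to returns a tensor on the target device
net = net.to(device)     # Module.to moves the parameters and returns the module

y = net(x)               # works because x and net are on the same device
y_cpu = y.cpu()          # copy the result back to the CPU if needed

print(y.device, y_cpu.device)
```

Because everything goes through the single device variable, the same script runs unchanged on machines with and without a GPU.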

GPU calculation

For complex neural networks and large-scale data, computing on the CPU may not be efficient enough. This section describes how to compute on a single NVIDIA GPU. First, make sure the GPU version of PyTorch is installed. Once everything is set up, you can view the graphics card information with the nvidia-smi command.

!nvidia-smi  # Valid for Linux/macOS users

Output:

Sun Mar 17 14:59:57 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0 Off |                  N/A |
| 20%   36C    P5    N/A /  75W |   1223MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1235      G   /usr/lib/xorg/Xorg                           434MiB |
|    0      2095      G   compiz                                       163MiB |
|    0      2660      G   /opt/teamviewer/tv_bin/TeamViewer              5MiB |
|    0      4166      G   /proc/self/exe                               416MiB |
|    0     13274      C   /home/tss/anaconda3/bin/python               191MiB |
+-----------------------------------------------------------------------------+

The output above comes from the original author's machine. I could not get this output on Windows, but the same information can be viewed in Task Manager.

Computing devices

PyTorch lets you specify the device used for storage and computation, such as the CPU with main memory or the GPU with video memory. By default, PyTorch creates data in main memory and computes on the CPU.

Use torch.cuda.is_available() to check whether a GPU is available:

import torch
from torch import nn

torch.cuda.is_available() # Output True

View the number of GPUs:

torch.cuda.device_count() # Output 1

View the current GPU index number, which starts from 0:

torch.cuda.current_device() # Output 0

View the GPU name according to the index number:

torch.cuda.get_device_name(0) 
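The individual queries above can be combined into one small helper. This is only a sketch: list_gpus is a hypothetical name introduced here, and it returns an empty list on a machine without a usable GPU instead of raising an error.

```python
import torch

def list_gpus():
    """Return (index, name) pairs for every visible CUDA device."""
    if not torch.cuda.is_available():
        return []  # CPU-only machine or CPU-only PyTorch build
    return [(i, torch.cuda.get_device_name(i))
            for i in range(torch.cuda.device_count())]

gpus = list_gpus()
for index, name in gpus:
    print(f"cuda:{index} -> {name}")
if not gpus:
    print("No CUDA device found; PyTorch will use the CPU.")
```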

Tensor GPU calculation

By default, a Tensor is stored in main memory. That is why, until now, printing a Tensor never showed any GPU-related label.

x = torch.tensor([1, 2, 3])
print(x)
"""
tensor([1, 2, 3])
"""

Use .cuda() to convert (copy) a Tensor on the CPU to the GPU. If there are multiple GPUs, .cuda(i) refers to the i-th GPU and its video memory (i starts from 0). By default, .cuda(0) is equivalent to .cuda().

x = x.cuda(0)
print(x)
"""
tensor([1, 2, 3], device='cuda:0')
"""

We can check which device a Tensor is on through its device attribute.

print(x.device)
"""
cuda:0
"""

We can specify the device directly at the time of creation.

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.tensor([1, 2, 3], device=device)
# or
x = torch.tensor([1, 2, 3]).to(device)
print(x)
"""
tensor([1, 2, 3], device='cuda:0')
"""

The result of a computation on GPU data is also stored on the GPU.

y = x**2
print(y)
"""
tensor([1, 4, 9], device='cuda:0')
"""

Note that data stored in different locations cannot be used together in a computation directly. That is, data on the CPU cannot be combined directly with data on the GPU, and data on different GPUs cannot be combined directly either.

z = y + x.cpu()

This raises an error (the exact wording depends on the PyTorch version):

RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #3 'other'
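One way to avoid this error is to move one operand to the other operand's device before computing. The sketch below is an illustration added here, not part of the original notes; it reuses the x and y values from the example and runs on the CPU when no GPU is available.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.tensor([1, 2, 3], device=device)
y = x ** 2                   # lives on the same device as x
x_cpu = x.cpu()              # a CPU copy, as in the failing line above

# Move the CPU operand to y's device before adding.
z = y + x_cpu.to(y.device)
print(z.tolist())            # [2, 6, 12]
```

Using y.device rather than a hard-coded 'cuda:0' keeps the code correct when the data lives on a different GPU or on the CPU.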

GPU calculation of model

As with a Tensor, a PyTorch model can also be moved to the GPU with .cuda(). We can check the device attribute of the model's parameters to see which device the model is stored on.

net = nn.Linear(3, 1)
print(list(net.parameters())[0].device)
"""
cpu
"""

As we can see, the model is on the CPU. Move it to the GPU:

net.cuda()
print(list(net.parameters())[0].device)
"""
cuda:0
"""

Similarly, the Tensor passed to the model must be on the same device as the model; otherwise an error is raised.

x = torch.rand(2,3).cuda()
print(net(x))
"""
tensor([[-0.5800],
        [-0.2995]], device='cuda:0', grad_fn=<ThAddmmBackward>)
"""
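To avoid a device mismatch without tracking devices by hand, the input can be moved to wherever the model's parameters live. The helper below is a hypothetical sketch (predict is a name introduced here, not a PyTorch API), and it assumes all parameters of the model are on a single device.

```python
import torch
from torch import nn

def predict(net, x):
    """Run net on x after moving x to the device of net's parameters."""
    model_device = next(net.parameters()).device
    with torch.no_grad():              # inference only, no gradient tracking
        return net(x.to(model_device))

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = nn.Linear(3, 1).to(device)

x = torch.rand(2, 3)                   # created on the CPU
out = predict(net, x)                  # x is moved to the model's device first
print(out.shape, out.device)
```

The output stays on the model's device; call out.cpu() if the result is needed on the CPU.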

REF

[1] 4.6 GPU calculation - dive into DL pytorch

[2] 5.6. GPU - hands on deep learning 2.0.0-beta0 documentation
