Professional CUDA C Programming (Grossman), Chapter 9: Multi-GPU Programming

Scaling an application across the GPUs within a single compute node, or across multiple nodes, can accelerate it considerably. CUDA provides many facilities for multi-GPU programming, including: managing multiple devices from one or more processes, using Unified Virtual Addressing to access the memory of other devices directly, GPUDirect, and overlapping ...
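As a rough illustration of the device-management part of that list, here is a minimal sketch (not taken from the chapter; the `scale` kernel and the per-device buffers are assumptions made for this example) of issuing independent, overlapping work to every visible GPU:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel used only to demonstrate per-device launches (hypothetical example).
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    int ngpus = 0;
    cudaGetDeviceCount(&ngpus);                     // how many devices this process can see
    printf("found %d CUDA devices\n", ngpus);

    const int n = 1 << 20;
    float **d_buf = new float*[ngpus];
    cudaStream_t *streams = new cudaStream_t[ngpus];

    // One buffer and one stream per device; launches are asynchronous,
    // so the kernels on different GPUs run concurrently.
    for (int dev = 0; dev < ngpus; ++dev) {
        cudaSetDevice(dev);                         // switch the current device
        cudaMalloc(&d_buf[dev], n * sizeof(float));
        cudaStreamCreate(&streams[dev]);
        scale<<<(n + 255) / 256, 256, 0, streams[dev]>>>(d_buf[dev], 2.0f, n);
    }

    // Wait for every device to finish, then clean up.
    for (int dev = 0; dev < ngpus; ++dev) {
        cudaSetDevice(dev);
        cudaStreamSynchronize(streams[dev]);
        cudaStreamDestroy(streams[dev]);
        cudaFree(d_buf[dev]);
    }
    delete[] d_buf;
    delete[] streams;
    return 0;
}
```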

Posted by bookbuyer2000 on Fri, 04 Mar 2022 06:39:28 +0100

Deep learning environment configuration on Ubuntu 20.04 (subsystem) (PyTorch + GPU)

Deep learning environment configuration on Ubuntu 20.04 (subsystem) (PyTorch + GPU). Preface (pitfall avoidance): last time I installed Ubuntu in a virtual machine and installed the Nvidia driver. Entering the command ubuntu-drivers devices showed that there were no recommended drivers, as shown below. Finally, sudo ap ...

Posted by geaser_geek on Tue, 01 Mar 2022 16:03:51 +0100

Ubuntu configuration in a non-virtual-machine environment

In a previous post in the blogger's column, Ubuntu was installed in a virtual machine. The blogger has since set up a dual-boot system, so the Ubuntu environment needs to be reconfigured. Here is a brief record of the process: 1. Prepare the operating system image ubuntu-20.04.2.0-desktop-amd64 ISO and a tool for creating a bootable USB flash drive. After starting the ...

Posted by zuperxtreme on Sun, 20 Feb 2022 19:31:21 +0100

[CS344-1] GPU Programming Model

For example, how can we dig a hole from the United States to China faster? One option is to dig faster with a single shovel, going from one scoop every 2 seconds to one every second; but there is an upper limit, since digging too fast breaks the shovel (raising a processor's clock frequency increases its energy consumption, and there is an upper limit on the energy consumptio ...

Posted by pmeasham on Sun, 20 Feb 2022 15:57:44 +0100

Ubuntu 16.04: installing CUDA 10.0 / cuDNN 7.6.5 / OpenPose and the problems encountered (graphics card: GTX 1660 SUPER)

1. Install CUDA 10.0 and cuDNN 7.6.5. Please follow the blog post below for the installation tutorial: Ubuntu 16.04 installation of CUDA and cuDNN for a GTX 1660 Ti. Note: check the adaptation between the latest graphics card driver version and your computer's graphics card. 2. CUDA uninstall steps (there is no need to uninstall the graphics card driver): sudo apt-get rem ...

Posted by Optimo on Thu, 10 Feb 2022 02:02:10 +0100

OpenCV CUDA acceleration official tutorial 2: Using a cv::cuda::GpuMat with thrust

Original address. Goal: Thrust is a very powerful library of CUDA-accelerated algorithms. However, Thrust is designed for vectors rather than pitched matrices. The following tutorial discusses how to wrap a cv::cuda::GpuMat into a Thrust iterator that can be used with Thrust algorithms. This tutorial will show you how to: wrap a GpuMat in a t ...
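A minimal sketch of the idea, assuming a continuous, single-row CV_32FC1 GpuMat so that the pitch can be ignored (the actual tutorial builds a step-aware custom iterator for pitched matrices); the helper name sortGpuMatRow is hypothetical:

```cuda
#include <opencv2/core/cuda.hpp>
#include <thrust/device_ptr.h>
#include <thrust/sort.h>

// Sort the elements of a single-row, continuous, single-channel float GpuMat
// with a Thrust algorithm. A single-row GpuMat is stored contiguously, so its
// device pointer can be wrapped directly in a thrust::device_ptr; the full
// tutorial handles padded (pitched) matrices with a strided iterator instead.
void sortGpuMatRow(cv::cuda::GpuMat &mat) {
    CV_Assert(mat.type() == CV_32FC1 && mat.rows == 1 && mat.isContinuous());

    thrust::device_ptr<float> begin(mat.ptr<float>(0));
    thrust::device_ptr<float> end = begin + mat.cols;

    thrust::sort(begin, end);   // runs on the GPU, in place
}
```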

Posted by JonnySnip3r on Wed, 02 Feb 2022 01:50:00 +0100

Process of installing Anaconda and PyTorch on Linux, and a share of the errors encountered

Note: the bugs in this article were all encountered during the installation process. After the causes were found, the commands were optimized, so if you type the commands as given you should not run into the bugs below. It is not easy for a new blogger to sort all this out; if your problem is solved, please leave a like~~~~~~ I. Installing ...

Posted by j0n on Tue, 25 Jan 2022 02:12:33 +0100

A complete deep learning environment configuration under Win10: Anaconda3 + CUDA 10.2 + cuDNN + PyTorch 1.6 + PyCharm (detailed tutorial, personally tested and working)

Random talk about autonomous vehicles_02. I. ReadMe: this article aims to build a usable, non-polluting deep learning environment for readers whose computers have a GPU and who need deep learning. The article is based on various methods from experts on the Internet, synthesized by the author. If there are simil ...

Posted by kindoman on Sat, 15 Jan 2022 00:11:19 +0100

CUDA C programming: using shared memory as a program-managed cache to reduce global memory accesses

One of the main reasons for using shared memory is to cache data on chip and thereby reduce the number of global memory accesses in a kernel. Next, the parallel reduction kernel will be revisited, with shared memory used as a program-managed cache to reduce global memory accesses. Reduct ...
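In the spirit of that excerpt, here is a minimal sketch of a block-level reduction that stages its tile in shared memory; the block size DIM and the kernel name reduceSmem are assumptions made for this example:

```cuda
#include <cuda_runtime.h>

#define DIM 128   // assumed block size; shared-memory size must be known at compile time here

// Each block copies its tile of g_idata into shared memory once, then the whole
// reduction runs out of the on-chip cache instead of repeatedly reading global memory.
__global__ void reduceSmem(const int *g_idata, int *g_odata, unsigned int n) {
    __shared__ int smem[DIM];

    unsigned int tid = threadIdx.x;
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // One global load per thread, then everything happens in shared memory.
    smem[tid] = (idx < n) ? g_idata[idx] : 0;
    __syncthreads();

    // Tree-style in-place reduction within the block.
    for (unsigned int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) smem[tid] += smem[tid + stride];
        __syncthreads();
    }

    // Thread 0 writes the block's partial sum back to global memory.
    if (tid == 0) g_odata[blockIdx.x] = smem[0];
}
```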

Posted by closer on Tue, 04 Jan 2022 05:04:38 +0100

[CUDA basic exercise] Top-view projection of a laser point cloud into a depth image

OK, let's continue with another CUDA exercise. Among the many methods for laser point cloud object detection, one class of methods is based on a depth image projected from the top view of the point cloud, i.e. from a BEV perspective. As far as the overhead projection step itself is concerned, it can be completed with very simple logic in either Python or C++ ...
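A minimal sketch of that projection step in CUDA, under assumed grid parameters (the post's actual resolution, ranges, and data layout are not shown here); heights are encoded as integer millimetres because atomicMax has no float overload:

```cuda
#include <cuda_runtime.h>

// Hypothetical BEV grid parameters; values would come from the dataset at hand.
struct BevParams {
    float x_min, y_min;   // lower bounds of the grid in metres
    float res;            // cell size in metres
    int   width, height;  // image size in cells
};

// One thread per point: compute the point's grid cell and keep the maximum
// height per cell. depth_mm is assumed to be pre-initialised to INT_MIN so
// that empty cells are distinguishable.
__global__ void projectTopView(const float3 *points, int n_points,
                               BevParams p, int *depth_mm) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_points) return;

    float3 pt = points[i];
    int col = (int)((pt.x - p.x_min) / p.res);
    int row = (int)((pt.y - p.y_min) / p.res);
    if (col < 0 || col >= p.width || row < 0 || row >= p.height) return;

    int z_mm = (int)(pt.z * 1000.0f);               // metres -> millimetres
    atomicMax(&depth_mm[row * p.width + col], z_mm); // keep highest point per cell
}
```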

Posted by dr.wong on Fri, 31 Dec 2021 08:21:41 +0100