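To develop with CUDA, you first need a machine with an NVIDIA GPU (obviously). A laptop, desktop, or server will all do. This repository uses Linux as its base environment. Some tutorials for setting up a Windows environment are provided below (I don't own a Windows machine, I'm broke).
On a Linux system with a GPU, the nvidia-smi command is usually already available, since it ships with the NVIDIA driver. It shows the driver version, the GPU model, and other information. Here is the output on my development machine: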
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:3F:00.0 Off |                    0 |
| N/A   34C    P0    57W / 300W |  16128MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:40:00.0 Off |                    0 |
| N/A   33C    P0    53W / 300W |    764MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000000:41:00.0 Off |                    0 |
| N/A   36C    P0    54W / 300W |   9666MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000000:42:00.0 Off |                    0 |
| N/A   37C    P0    56W / 300W |   3280MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla V100-SXM2...  On   | 00000000:62:00.0 Off |                    0 |
| N/A   31C    P0    40W / 300W |      3MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla V100-SXM2...  On   | 00000000:63:00.0 Off |                    0 |
| N/A   31C    P0    39W / 300W |      3MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla V100-SXM2...  On   | 00000000:64:00.0 Off |                    0 |
| N/A   34C    P0    41W / 300W |      3MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla V100-SXM2...  On   | 00000000:65:00.0 Off |                    0 |
| N/A   34C    P0    41W / 300W |      3MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
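Part of this information can also be read from inside a program through the CUDA runtime API. The following is a minimal sketch, not part of this repository: the file name device_query.cu and the helper bytes_to_mib are made up for illustration, and compiling it requires nvcc from the CUDA Toolkit covered later in this guide. The total memory it prints should be close to, though not necessarily identical to, the figure shown by nvidia-smi.

```cuda
// device_query.cu (hypothetical file name)
// Build and run, assuming nvcc is on PATH:
//   nvcc device_query.cu -o device_query && ./device_query
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helper: convert bytes to whole MiB, the unit nvidia-smi uses.
static size_t bytes_to_mib(size_t bytes) { return bytes / (1024 * 1024); }

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA device(s)\n", count);

    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            printf("GPU %d: query failed: %s\n", i, cudaGetErrorString(err));
            continue;
        }
        // Name, total global memory, and compute capability of device i.
        printf("GPU %d: %s, %zu MiB, compute capability %d.%d\n",
               i, prop.name, bytes_to_mib(prop.totalGlobalMem),
               prop.major, prop.minor);
    }
    return 0;
}
```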
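If your machine does not have a GPU driver installed yet, go to the NVIDIA website and download and install the driver that matches your GPU model: https://www.nvidia.cn/geforce/drivers/
The CUDA Toolkit is required for developing CUDA programs, much as you need GCC installed to write C++. Once the Toolkit is installed, running nvcc -V on the command line prints its version information, for example: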
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Jan_28_19:32:09_PST_2021
Cuda compilation tools, release 11.2, V11.2.142
Build cuda_11.2.r11.2/compiler.29558016_0
If it is still unclear what the CUDA Toolkit is, here is the description from the NVIDIA website:
The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to deploy your application.
To install it, click Download Now on the NVIDIA website, then download and run the installer. Afterwards, run nvcc -V to confirm that the installation succeeded.
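Beyond nvcc -V, the runtime API can report both the Toolkit (runtime) version and the highest CUDA version the installed driver supports, which is handy for spotting a mismatch between the two. The sketch below is only an illustration: the file name check_version.cu and the helpers version_major and version_minor are assumptions of mine. It relies on CUDA encoding versions as 1000 * major + 10 * minor, so 11.2 is reported as 11020.

```cuda
// check_version.cu (hypothetical file name)
// Build and run, assuming nvcc is on PATH:
//   nvcc check_version.cu -o check_version && ./check_version
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative helpers: decode CUDA's 1000 * major + 10 * minor encoding.
static int version_major(int v) { return v / 1000; }
static int version_minor(int v) { return (v % 1000) / 10; }

int main() {
    int runtime_version = 0;
    int driver_version = 0;

    cudaError_t err = cudaRuntimeGetVersion(&runtime_version);
    if (err != cudaSuccess) {
        printf("cudaRuntimeGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // driver_version is set to 0 when no driver is installed.
    err = cudaDriverGetVersion(&driver_version);
    if (err != cudaSuccess) {
        printf("cudaDriverGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("Runtime (Toolkit) version: %d.%d\n",
           version_major(runtime_version), version_minor(runtime_version));
    printf("Driver supports CUDA up to: %d.%d\n",
           version_major(driver_version), version_minor(driver_version));

    if (driver_version < runtime_version)
        printf("Warning: the driver is older than the Toolkit; consider updating it.\n");
    return 0;
}
```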
Create a new file named hello_world.cu (see this directory):
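#include <stdio.h>

// Kernel: runs on the GPU; each thread prints its index within the block.
__global__ void cuda_say_hello() {
    printf("Hello world, CUDA! %d\n", threadIdx.x);
}

int main() {
    printf("Hello world, CPU\n");
    // Launch the kernel with 1 block of 1 thread.
    cuda_say_hello<<<1, 1>>>();
    // Wait for the kernel to finish so its output is flushed and errors surface.
    cudaError_t cudaerr = cudaDeviceSynchronize();
    if (cudaerr != cudaSuccess)
        printf("kernel launch failed with error \"%s\".\n", cudaGetErrorString(cudaerr));
    return 0;
}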
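As a possible next step, the same program can be launched with more than one thread and with slightly stricter error checking. This is a sketch under my own assumptions rather than code from this repository: the file name hello_threads.cu, the CUDA_CHECK macro, and the global_index helper are all made up for illustration. It checks cudaGetLastError() right after the launch, which reports errors such as an invalid launch configuration, and then cudaDeviceSynchronize(), which reports errors raised while the kernel runs. The order in which the GPU lines are printed is not guaranteed.

```cuda
// hello_threads.cu (hypothetical file name)
// Build and run, assuming nvcc is on PATH:
//   nvcc hello_threads.cu -o hello_threads && ./hello_threads
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative macro: abort with file and line if a CUDA call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative helper: flatten (block, thread) into one global index.
__host__ __device__ inline int global_index(int block, int block_dim, int thread) {
    return block * block_dim + thread;
}

__global__ void cuda_say_hello() {
    printf("Hello world, CUDA! global thread %d\n",
           global_index(blockIdx.x, blockDim.x, threadIdx.x));
}

int main() {
    printf("Hello world, CPU\n");
    // 2 blocks of 4 threads each: 8 GPU threads in total.
    cuda_say_hello<<<2, 4>>>();
    CUDA_CHECK(cudaGetLastError());       // errors from the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());  // errors during execution; flushes device printf
    return 0;
}
```

The original hello_world.cu can be built the same way, with nvcc hello_world.cu -o hello_world.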