# GPUs

Check the NVIDIA driver, the CUDA version, and the GPU devices with `nvidia-smi`.

In [1]:
!nvidia-smi

Wed Jul 3 22:10:58 2019 
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| 0 Tesla V100-SXM2... Off | 00000000:00:1B.0 Off | 0 |
| N/A 70C P0 228W / 300W | 7684MiB / 16130MiB | 78% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... Off | 00000000:00:1C.0 Off | 0 |
| N/A 44C P0 38W / 300W | 11MiB / 16130MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-SXM2... Off | 00000000:00:1D.0 Off | 0 |
| N/A 43C P0 59W / 300W | 978MiB / 16130MiB | 14% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla V100-SXM2... Off | 0
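
The same information can be read programmatically. The following is a minimal sketch, assuming `nvidia-smi` is on the `PATH`; it uses the tool's `--query-gpu` and `--format=csv,noheader,nounits` options. The helper names `parse_gpu_csv` and `query_gpus` are hypothetical, and the sample rows reuse the (truncated) device names and memory figures from the table above.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi; their order fixes the CSV column order.
FIELDS = ("index", "name", "memory.used", "memory.total")


def _to_int(value):
    """Convert a numeric string to int; return None for values like [N/A]."""
    return int(value) if value.isdigit() else None


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(FIELDS):
            continue  # skip blank or malformed lines
        row = dict(zip(FIELDS, values))
        row["index"] = _to_int(row["index"])
        row["memory.used"] = _to_int(row["memory.used"])    # MiB
        row["memory.total"] = _to_int(row["memory.total"])  # MiB
        gpus.append(row)
    return gpus


def query_gpus():
    """Run nvidia-smi and return parsed rows, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False)
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


# Sample rows in the CSV layout, using figures from the table above.
sample = "0, Tesla V100-SXM2..., 7684, 16130\n1, Tesla V100-SXM2..., 11, 16130"
for gpu in parse_gpu_csv(sample):
    print(gpu["index"], gpu["name"], gpu["memory.used"], "/", gpu["memory.total"], "MiB")
```

On a machine with a driver installed, `query_gpus()` would return one dict per device; without `nvidia-smi` it returns an empty list.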

Number of available GPUs

In [2]:
from mxnet import np, npx
from mxnet.gluon import nn
npx.set_np()

npx.num_gpus()

2
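
The count above is lower than the number of devices listed by `nvidia-smi`. The notebook does not say why. One common cause, offered here only as an assumption, is the `CUDA_VISIBLE_DEVICES` environment variable, which limits the devices a CUDA process can see. A minimal sketch for inspecting it (the helper name `visible_device_ids` is hypothetical):

```python
import os


def visible_device_ids(env=None):
    """Return the entries of CUDA_VISIBLE_DEVICES.

    None  -> variable unset, so no restriction is applied
    []    -> variable set but empty, so no device is visible
    [...] -> the listed device indices or UUIDs, as strings
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [item.strip() for item in raw.split(",") if item.strip()]


print(visible_device_ids())
```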

Computation devices

In [3]:
print(npx.cpu(), npx.gpu(), npx.gpu(1))

def try_gpu(i=0):
    """Return gpu(i) if it exists, otherwise cpu()."""
    return npx.gpu(i) if npx.num_gpus() >= i + 1 else npx.cpu()

def try_all_gpus():
    """Return all available GPUs, or [cpu()] if there is no GPU."""
    ctxes = [npx.gpu(i) for i in range(npx.num_gpus())]
    return ctxes if ctxes else [npx.cpu()]

try_gpu(), try_gpu(3), try_all_gpus()

cpu(0) gpu(0) gpu(1)


(gpu(0), cpu(0), [gpu(0), gpu(1)])

Create an ndarray on the first GPU

In [4]:
x = np.ones((2, 3), ctx=try_gpu())
print(x.context)
x

gpu(0)


array([[1., 1., 1.],
 [1., 1., 1.]], ctx=gpu(0))

Create an ndarray on the second GPU

In [5]:
y = np.random.uniform(size=(2, 3), ctx=try_gpu(1))
y

array([[0.59119 , 0.313164 , 0.76352036],
 [0.9731786 , 0.35454726, 0.11677533]], ctx=gpu(1))

Copying between devices

In [6]:
z = x.copyto(try_gpu(1))
print(x)
print(z)

[[1. 1. 1.]
 [1. 1. 1.]] @gpu(0)
[[1. 1. 1.]
 [1. 1. 1.]] @gpu(1)


All inputs of an operator must be on the same device; the computation then runs on that device.

In [7]:
y + z

array([[1.59119 , 1.313164 , 1.7635204],
 [1.9731786, 1.3545473, 1.1167753]], ctx=gpu(1))
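
The same-device rule suggests checking devices before combining arrays. The following is a minimal sketch of such a check and is not part of MXNet: `to_common_device` is a hypothetical helper that relies only on the `context` attribute and the `copyto` method used earlier in this notebook. `FakeArray` is a stand-in class so that the sketch runs without MXNet or a GPU.

```python
def to_common_device(*arrays):
    """Return the arrays with all of them on the first array's device.

    Arrays already on that device are returned unchanged; the others are
    copied there with `copyto`.
    """
    if not arrays:
        return ()
    target = arrays[0].context
    return tuple(a if a.context == target else a.copyto(target) for a in arrays)


class FakeArray:
    """Stand-in for an ndarray that tracks only a device label (hypothetical)."""

    def __init__(self, context):
        self.context = context

    def copyto(self, context):
        # A real array would also copy its data to the target device.
        return FakeArray(context)


y = FakeArray("gpu(1)")
x = FakeArray("gpu(0)")
y2, x2 = to_common_device(y, x)
print(y2.context, x2.context)  # both arrays now report gpu(1)
```

With real arrays, the assumption is that the returned values could then be added directly, with the result computed on the first array's device.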

Initialize parameters on the first GPU.

In [8]:
net = nn.Sequential()
net.add(nn.Dense(1))
net.initialize(ctx=try_gpu())

When the input is an ndarray on the GPU, Gluon will calculate the result on the same GPU.

In [9]:
net(x)

array([[0.04995865],
 [0.04995865]], ctx=gpu(0))

Let us confirm that the model parameters are stored on the same GPU.

In [10]:
net[0].weight.data()

array([[0.0068339 , 0.01299825, 0.0301265 ]], ctx=gpu(0))