GPU use on NeSI

This page provides generic information about how to access NeSI's GPU cards.

For application-specific settings (e.g. OpenMP, TensorFlow on GPU, ...), please see the dedicated pages listed at the end of this page.


Details about GPU cards available for each system and usage limits are available in the Mahuika Slurm Partitions and Māui_Ancil (CS500) Slurm Partitions support pages.

Details about pricing in terms of compute units can be found in the What is an allocation? page.

Request GPU resources using Slurm

To request a GPU for your Slurm job, add the following option at the beginning of your submission script:

#SBATCH --gpus-per-node=1

You can specify the type of GPU you need, depending on which ones your allocation gives you access to. If you wish to use the A100 GPUs, please contact our support team to learn how to get access.

For example, to access a P100 card, use the following option:

#SBATCH --gpus-per-node=P100:1

On Mahuika, you can request 2 GPUs per node using:

#SBATCH --gpus-per-node=P100:2

Similarly, to request A100 GPU devices, use:

  • To use one A100:

    #SBATCH --gpus-per-node=A100:1
  • To use two A100s:

    #SBATCH --gpus-per-node=A100:2

If not specified, the default GPU type is P100.

You can also use the --gpus-per-node option in Slurm interactive sessions, with the srun and salloc commands. For example:

srun --job-name "InteractiveGPU" --gpus-per-node 1 --cpus-per-task 8 --mem 2GB --time 00:30:00 --pty bash

will request and then start a bash session with access to a GPU, for a duration of 30 minutes.


When you use the --gpus-per-node option, Slurm automatically sets the CUDA_VISIBLE_DEVICES environment variable inside your job environment to list the indices of the allocated GPU cards on each node.
$ srun --job-name "GPUTest" --gpus-per-node=P100:2 --time 00:05:00 --pty bash
srun: job 20015016 queued and waiting for resources
srun: job 20015016 has been allocated resources


On Māui Ancillary Nodes, you also need to request the nesi_gpu partition to have access to the GPU.

#SBATCH --partition=nesi_gpu

Load CUDA and cuDNN modules

To use an Nvidia GPU card with your application, you need to load the driver and the CUDA toolkit via the environment modules mechanism:

module load CUDA/11.0.2

You can list the available versions using:

module spider CUDA

Please contact us if you need a version not available on the platform.


On Māui Ancillary Nodes, use module avail CUDA to list available versions.

The CUDA module also provides access to additional command line tools:

      • nvidia-smi to directly monitor GPU resource utilisation,
      • nvcc to compile CUDA programs,
      • cuda-gdb to debug CUDA applications.
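As a quick sanity check, the sketch below reports which of the tools listed above are on your PATH. The helper name check_cuda_tools is hypothetical (it is not provided by NeSI or the CUDA module), and the sketch assumes a bash shell.

```shell
# Hypothetical helper: report which of the given commands are on the PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_cuda_tools() {
    local tool missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the three tools mentioned above; print a hint if any is missing.
check_cuda_tools nvidia-smi nvcc cuda-gdb \
    || echo "Some tools are missing: load a CUDA module first, e.g. module load CUDA/11.0.2"
```

Running this before and after loading the CUDA module should show whether the module has made the tools available in your session.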

In addition, the cuDNN library (NVIDIA CUDA® Deep Neural Network library) is accessible via its dedicated module:

module load cuDNN/

which will automatically load the related CUDA version. Available versions can be listed using:

module spider cuDNN

Example Slurm script

The following Slurm script illustrates a minimal example to request a GPU card, load the CUDA toolkit and query some information about the GPU:

#!/bin/bash -e
#SBATCH --job-name=GPUJob # job name (shows up in the queue)
#SBATCH --time=00-00:10:00 # Walltime (DD-HH:MM:SS)
#SBATCH --gpus-per-node=1 # GPU resources required per node
#SBATCH --cpus-per-task=2 # number of CPUs per task (1 by default)
#SBATCH --mem=512MB # amount of memory per node (1 by default)

# load CUDA module
module purge
module load CUDA/11.0.2

# display information about the available GPUs
nvidia-smi

# check the value of the CUDA_VISIBLE_DEVICES variable
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"

Save this in a file and submit it using the sbatch command.


The content of the job output file would look like:

$ cat slurm-20016124.out

The following modules were not unloaded:
   (Use "module --force purge" to unload all):

  1) slurm   2) NeSI
Wed May 12 12:08:27 2021
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  On   | 00000000:05:00.0 Off |                    0 |
| N/A   29C    P0    23W / 250W |      0MiB / 12198MiB |      0%      Default |
|                               |                      |                  N/A |

| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |


CUDA_VISIBLE_DEVICES=0 indicates that this job was allocated to CUDA GPU index 0 on this node. It is not a count of allocated GPUs.
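Since the variable holds a comma-separated list of device indices rather than a count, the number of allocated GPUs can be derived by counting its entries. The following is a minimal sketch, assuming a bash shell; the function name count_visible_gpus is hypothetical and not part of Slurm or CUDA.

```shell
# Hypothetical helper: count the entries in a comma-separated
# CUDA_VISIBLE_DEVICES-style list (e.g. "0,1" holds two device indices).
count_visible_gpus() {
    local devices="$1"
    if [ -z "$devices" ]; then
        # empty or unset variable: no GPU was allocated
        echo 0
        return
    fi
    local IFS=','
    # split the list on commas into the positional parameters
    set -- $devices
    echo "$#"
}

# Inside a job, report how many GPUs were allocated on this node.
echo "Allocated GPUs: $(count_visible_gpus "${CUDA_VISIBLE_DEVICES:-}")"
```

With this helper, a value of 0 in the variable counts as one GPU (the device with index 0), while a value such as 0,1 counts as two.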

NVIDIA Nsight Systems and Compute profilers

Nsight Systems is a system-wide analysis tool, particularly good for profiling CPU-GPU interactions. It is provided on Mahuika via the Nsight-Systems module:

$ module load Nsight-Systems/2020.5.1
Load `PyQt/5.12.1-gimkl-2020a-Python-3.8.2` module prior to running `nsys-ui`
$ nsys --version
NVIDIA Nsight Systems version 2020.5.1.85-5ee086b

This module gives you access to the nsys command line tool or the nsys-ui graphical interface.

Nsight Compute is a profiler for CUDA kernels. It is accessible on Mahuika using the Nsight-Compute module:

$ module load Nsight-Compute/2020.3.0
Load `PyQt/5.12.1-gimkl-2020a-Python-3.8.2` module prior to running `nsys-ui`
$ ncu --version
NVIDIA (R) Nsight Compute Command Line Profiler
Copyright (c) 2018-2020 NVIDIA Corporation
Version 2020.3.0.0 (build 29307467) (public-release)

Then you can use the ncu command line tool or the ncu-ui graphical interface.
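As an illustration of the command line tools, the sketch below wraps a profiler call so that it fails with a hint when the corresponding module has not been loaded. The wrapper name profile_with, the application ./my_cuda_app and the report names are all hypothetical placeholders; the sketch assumes a bash shell.

```shell
# Hypothetical wrapper: run a profiler only if its command is on the PATH,
# otherwise print which module to load and return a non-zero status.
#   $1 = tool name, $2 = module to suggest, remaining arguments = tool arguments
profile_with() {
    local tool="$1" module="$2"
    shift 2
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found: try 'module load $module' first" >&2
        return 1
    fi
    "$tool" "$@"
}
```

Inside a GPU job, it could be used along these lines, writing each report to the given output name:

profile_with nsys Nsight-Systems/2020.5.1 profile -o my_report ./my_cuda_app
profile_with ncu Nsight-Compute/2020.3.0 -o my_profile ./my_cuda_app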


The nsys-ui and ncu-ui tools require access to a display server, either via X11 or a Virtual Desktop. You also need to load the PyQt module beforehand:
module load PyQt/5.12.1-gimkl-2020a-Python-3.8.2
module load Nsight-Systems/2020.5.1
nsys-ui  # this will work only if you have a graphical session

Application and toolbox specific support pages

The following pages provide additional information for supported applications:

And programming toolkits:
