Parallel Execution

Many scientific software applications are written to take advantage of multiple CPUs in some way. But often this must be specifically requested by the user at the time they run the program, rather than happening automatically.

There are three main types of parallel execution available: multi-threading, MPI, and job arrays.


Whenever Slurm mentions CPUs it is referring to logical CPUs (2 logical CPUs = 1 physical core).
  • --cpus-per-task=4 will give you 4 logical cores.
  • --mem-per-cpu=1500 will give 1500MB per logical core.
  • If --hint=nomultithread is used then --cpus-per-task will now refer to physical cores, but --mem-per-cpu=1500 will stay the same.
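As a rough sketch of the arithmetic in the first two bullets, assuming two logical CPUs per physical core as stated above (the variable names are illustrative only and are not read from Slurm):

```shell
#!/bin/bash
# Illustrative values matching the bullets above (hard-coded, not taken from Slurm)
cpus_per_task=4      # logical CPUs requested with --cpus-per-task
mem_per_cpu=1500     # MB per logical CPU requested with --mem-per-cpu

physical_cores=$(( cpus_per_task / 2 ))        # 2 logical CPUs = 1 physical core
total_mem=$(( cpus_per_task * mem_per_cpu ))   # total MB available to the task

echo "physical cores: ${physical_cores}"
echo "total memory:   ${total_mem} MB"
```

With these values the task gets 2 physical cores and 6000MB in total.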

See our article on hyperthreading for more information.

Multi-threading (OpenMP)

Multi-threading is a method of parallelisation whereby a master thread forks a specified number of slave threads and divides a task among them.

Fig. 1: In a serial operation, tasks complete one after another.

Fig. 2: Multi-threading involves dividing the process into multiple 'threads' which can be run across multiple cores.

Multi-threading is limited in that it requires shared memory, so all CPU cores used must be on the same node. However, because all the CPUs share the same memory environment, data only needs to be loaded into memory once, meaning that memory requirements will usually not increase in proportion to the number of CPUs.

Example script:

#!/bin/bash -e
#SBATCH --job-name=MultithreadingTest # job name (shows up in the queue)
#SBATCH --time=00:01:00 # Walltime (HH:MM:SS)
#SBATCH --mem=6000 # memory in MB (Should be 3000*physical CPUs on a standard partition)
#SBATCH --cpus-per-task=4 # 2 Physical cores per task.

taskset -c -p $$ #Prints  process ID and which CPUs it can use (on a node)

The expected output is similar to:

pid 13538's current affinity list: 7,9,43,45
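Many OpenMP programs take their thread count from the OMP_NUM_THREADS environment variable. A minimal sketch, assuming your program honours that variable, is to set it from SLURM_CPUS_PER_TASK, which Slurm sets inside a job when --cpus-per-task is given. The fallback to 1 is an assumption added here so that the snippet also runs outside of a job.

```shell
# Match the OpenMP thread count to the CPUs Slurm allocated to this task.
# SLURM_CPUS_PER_TASK is only set when --cpus-per-task was requested,
# so default to a single thread otherwise (assumption for this sketch).
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "Using ${OMP_NUM_THREADS} OpenMP thread(s)"
```

Placing this line before the command that starts your program keeps the thread count in step with the resources requested in the header.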


MPI

MPI stands for Message Passing Interface, and is a communication protocol used by parallel computers.

Similar in many ways to multi-threading, MPI does not have the limitation of requiring shared memory and thus can be used across multiple nodes, but has higher computational overheads.

Because MPI tasks will not necessarily be running in the same memory environment, each task needs its own copy of the working data in memory. This means that the memory usage of an MPI job will increase at least proportionally with the number of tasks you are using.

For MPI jobs you need to set --ntasks to a value larger than 1, or if you want all nodes to run the same number of tasks, set --ntasks-per-node and --nodes instead.

The Slurm command srun sets up the MPI runtime environment needed to run a parallel program, launching it on multiple CPUs, which can be on different nodes.

For MPI jobs, --mem-per-cpu should be used instead of --mem (remember, CPU refers to logical CPU cores).
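To illustrate how --mem-per-cpu scales with the size of the job, here is a small sketch of the arithmetic. The values are illustrative and hard-coded, and the variable names are not Slurm's own:

```shell
#!/bin/bash
# Illustrative request: 2 MPI tasks, 4 logical CPUs each, 1500MB per logical CPU
ntasks=2
cpus_per_task=4
mem_per_cpu=1500

total_cpus=$(( ntasks * cpus_per_task ))    # logical CPUs across the whole job
total_mem=$(( total_cpus * mem_per_cpu ))   # MB across the whole job

echo "logical CPUs: ${total_cpus}"
echo "total memory: ${total_mem} MB"
```

Doubling --ntasks doubles the total memory requested, which matches the way MPI memory usage grows with the number of tasks.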


srun should be used in place of any other MPI launcher such as aprun or mpirun.

#!/bin/bash -e
#SBATCH --job-name=MPIJob    	# job name (shows up in the queue)
#SBATCH --time=00:01:00     	# Walltime (HH:MM:SS)
#SBATCH --mem-per-cpu=1500    	# memory/cpu in MB (half the actual required memory)
#SBATCH --cpus-per-task=4 	# 2 Physical cores per task.
#SBATCH --ntasks=2 		# number of tasks (e.g. MPI)

srun pwd 			# Prints working directory

The expected output is the working directory, printed once for each of the two tasks.



Running srun with --cpus-per-task in conjunction with --ntasks=1 will cause your job to run twice. Instead, if you wish to run a single task, remove both --ntasks=1 and srun entirely.

Job Arrays

Job arrays are best used for tasks that are completely independent, such as parameter sweeps, permutation analysis or simulation, that could be executed in any order and don't have to run at the same time. This kind of work is often described as embarrassingly parallel. 

A job array will submit the same script repeatedly over a designated index, which is set using the directive #SBATCH --array.

For example, the following code:

#!/bin/bash -e
#SBATCH --job-name=ArrayJob 	        # job name (shows up in the queue)
#SBATCH --time=00:01:00     		# Walltime (HH:MM:SS)
#SBATCH --mem=3000			# Memory
#SBATCH --array=1-2         		# Array jobs

pwd
echo "This is result ${SLURM_ARRAY_TASK_ID}"

will submit two jobs, ArrayJob_1 and ArrayJob_2, which will return the results "This is result 1" and "This is result 2" respectively.


Use of the environment variable ${SLURM_ARRAY_TASK_ID} is the recommended method of variation between the jobs. For example:

  • As a direct input to a function.
    matlab -nodisplay -r "myFunction(${SLURM_ARRAY_TASK_ID})"
  • As an index to an array.
    inArray=(1 2 4 8 16 32 64 128)
    input=${inArray[$SLURM_ARRAY_TASK_ID]}   #Bash arrays are zero-indexed
  • For selecting input files.
  • As a seed for a pseudo-random number.
    RANDOM=${SLURM_ARRAY_TASK_ID}    #Seed the generator with the task ID
    d20=$(( RANDOM % 20 + 1 ))       #Random number between 1-20
    Using a seed is important, otherwise multiple jobs may receive the same pseudo-random numbers.
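Another common pattern, not described above, is to use the task ID to pick one line from a parameter file. The following is a minimal sketch: the file name params.txt and its contents are hypothetical, and the default task ID of 2 is an assumption added only so that the snippet runs outside of a job array.

```shell
#!/bin/bash
# Hypothetical parameter file with one set of inputs per line
workdir=$(mktemp -d)
printf '%s\n' "alpha 0.1" "beta 0.2" "gamma 0.3" > "${workdir}/params.txt"

# Outside a job array SLURM_ARRAY_TASK_ID is unset, so default to 2 for this demo
task_id=${SLURM_ARRAY_TASK_ID:-2}

# Print only line number task_id of the parameter file
params=$(sed -n "${task_id}p" "${workdir}/params.txt")
echo "Task ${task_id} will run with: ${params}"

# Clean up the demo directory
rm -r "${workdir}"
```

With this approach the range given to --array should match the number of lines in the parameter file.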

Environment variables will not work in the Slurm header. In place of ${SLURM_ARRAY_TASK_ID}, you can use the token %a. This can be useful for sorting your output files e.g.

#SBATCH --output=outputs/run_%a/slurm_output.out
#SBATCH --error=outputs/run_%a/slurm_error.err
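To our understanding, Slurm does not create missing directories for these paths, so the directories named in the header need to exist before the job starts. A minimal sketch, assuming an array of 1-2 and the outputs/run_%a layout used above, to be run from the submission directory before calling sbatch:

```shell
# Create one output directory per array index (here 1 and 2) before submitting
for i in 1 2; do
    mkdir -p "outputs/run_${i}"
done
ls outputs
```

Adjust the list of indices to match the range given to --array.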

Avoiding Conflicts

As all the array jobs could theoretically run at the same time, it is important that all file references are unique and independent.

If your program makes use of a working directory make sure you set it e.g.

mkdir -p .tmp/run_${SLURM_ARRAY_TASK_ID}       #Create new directory (and .tmp if missing)
export TMPDIR=.tmp/run_${SLURM_ARRAY_TASK_ID}  #Set TMPDIR to point there

If you have no control over the name/path of an output used by a program, this can be resolved in a similar manner.

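mkdir run_${SLURM_ARRAY_TASK_ID}                             #Create new directory
cd run_${SLURM_ARRAY_TASK_ID}                                #CD to new directory
# (run your program here; it is assumed to write output.log to the current directory)
mv output.log ../outputs/output_${SLURM_ARRAY_TASK_ID}.log   #Move and rename output
cd ..                                                        #Leave the directory before removing it
rm -r run_${SLURM_ARRAY_TASK_ID}                             #Clear directory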
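A variation on the same idea, offered as a sketch rather than a recommended method, uses mktemp to create a uniquely named directory and trap to remove it when the script exits. The default task ID of 1 and the use of /tmp as a fallback location are assumptions made so that the snippet runs outside of a job array.

```shell
#!/bin/bash -e
# Default the index so the sketch also runs outside a job array (assumption)
task_id=${SLURM_ARRAY_TASK_ID:-1}

# Create a uniquely named scratch directory for this array task
scratch=$(mktemp -d "${TMPDIR:-/tmp}/run_${task_id}_XXXXXX")

# Remove the scratch directory when the script exits, even after an error
trap 'rm -rf "${scratch}"' EXIT

# Point programs that honour TMPDIR at the new directory
export TMPDIR="${scratch}"
echo "Task ${task_id} is using ${TMPDIR}"
```

Because mktemp adds a random suffix, two tasks cannot end up sharing a directory even if their indices were reused.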

The Slurm documentation on job arrays can be found here.

