MATLAB


Example scripts

Script Example

#!/bin/bash -e
#SBATCH --job-name      MATLAB_job   # Name to appear in squeue
#SBATCH --time          01:00:00     # Max walltime
#SBATCH --mem           1500         # Max memory (MB)

module load MATLAB/2018b
# Run the MATLAB script MATLAB_job.m
matlab -nodisplay < MATLAB_job.m

Function Example

#!/bin/bash -e
#SBATCH --job-name      MATLAB_job   # Name to appear in squeue
#SBATCH --time          06:00:00     # Max walltime
#SBATCH --mem           6000         # Max memory (MB)
#SBATCH --cpus-per-task 4            # 4 logical CPUs (2 physical cores)
#SBATCH --output        %x.log       # Location of output log

module load MATLAB/2018b

#Job run
matlab -nodisplay -r "addpath(genpath('../parentDirectory'));myFunction(5,20)"

Tip

Prefixing a command with ! allows you to run shell commands from within MATLAB, e.g. !squeue -u $USER will print your currently queued Slurm jobs.

Parallelism

MATLAB does not support MPI, so #SBATCH --ntasks should always be 1. However, given the necessary resources, some MATLAB functions can make use of multiple threads (--cpus-per-task) or GPUs (--gres gpu).

Implicit parallelism

Implicit parallelism requires no changes to your code. By default MATLAB uses multi-threading for a wide range of operations. Scalability varies, but generally you will not be able to make use of more than 4-8 CPUs this way.
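For example, large dense linear algebra operations are multi-threaded automatically. A minimal sketch (the matrix sizes are arbitrary) showing how to cap the implicit thread count to match your Slurm allocation:

```matlab
% Limit implicit multi-threading to the CPUs allocated by Slurm.
nthreads = str2double(getenv('SLURM_CPUS_PER_TASK'));
maxNumCompThreads(nthreads);

% Operations such as large matrix multiplication will now use up to
% 'nthreads' threads, with no further changes to the code.
A = rand(4000);
B = rand(4000);
C = A * B;  % implicitly multi-threaded
```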

Explicit parallelism

Explicit parallelism is when you write your code specifically to make use of multiple CPUs. This can be done with MATLAB's parpool-based language constructs: MATLAB assigns each thread a 'worker' that can be given sections of code to execute.

MATLAB creates temporary files under your home directory (in ~/.matlab/local_cluster_jobs) for communication with worker processes. To prevent simultaneous parallel MATLAB jobs from interfering with each other, tell each job to use its own job-specific local directory:

pc = parcluster('local');
pc.JobStorageLocation = getenv('TMPDIR');  % job-specific temporary directory
parpool(pc, str2num(getenv('SLURM_CPUS_PER_TASK')));

Note

Parpool will throw a warning when started due to a difference in how the time zone is specified. To fix this, add the following line to your Slurm script: export TZ="Pacific/Auckland"

The main ways to make use of parpool are:

parfor: executes each iteration of a loop on a different worker, e.g.

parfor i=1:100
    % Your operation here.
end

parfor operates similarly to a Slurm job array and must be embarrassingly parallel: all variables must either be local (defined and used within one iteration of the loop) or static (not changing during execution of the loop).
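A minimal sketch of these categories (the variable names are illustrative; MATLAB's documentation also distinguishes 'sliced' output variables, which each iteration writes independently):

```matlab
n = 100;
results = zeros(1, n);  % sliced output: each iteration writes its own element
scale = 2.5;            % static (broadcast): read-only inside the loop

parfor i = 1:n
    temp = rand();               % local: defined and used within one iteration
    results(i) = scale * temp;   % independent of every other iteration
end
```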

More info here.

parfeval:

parfeval is used to assign a particular function to a worker, allowing it to run asynchronously, e.g.

my_coroutine = parfeval(@my_async_function, 2, in1, in2);
% Do something that doesn't require the outputs of 'my_async_function'.
[out1, out2] = fetchOutputs(my_coroutine); % Pauses if 'my_coroutine' has not finished executing.

function [out1, out2] = my_async_function(in1, in2)
    % Your operation here.
end

fetchOutputs is used to retrieve the values.

More info here.

Note

When killed (cancelled, timed out, etc.), job steps using parpool may show the state OUT_OF_MEMORY. This is a quirk of how the steps are ended and not necessarily cause to raise the total memory requested.


Determining which of these categories each of your variables falls under is a good place to start when attempting to parallelise your code.

Tip

If your code is parallel at a high level, it is preferable to use Slurm job arrays, as there is less computational overhead and multiple smaller jobs will queue faster.

Using GPUs

As with standard parallelism, some MATLAB functions will work implicitly on GPUs while others require setup. More info on using GPUs with MATLAB here.
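A minimal sketch of implicit GPU use via gpuArray (requires a GPU node and the Parallel Computing Toolbox; the matrix size is arbitrary):

```matlab
% Move data to the GPU; subsequent operations on it run there implicitly.
A = gpuArray(rand(4000));
B = A * A;            % executed on the GPU
result = gather(B);   % copy the result back to host memory
```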

MATLAB uses NVIDIA CUDA drivers, so make sure to include module load CUDA in your script before launching MATLAB.

GPU Example

#!/bin/bash -e
#SBATCH --job-name      MATLAB_GPU   # Name to appear in squeue
#SBATCH --time          06:00:00     # Max walltime
#SBATCH --mem           50G          # 50G per GPU
#SBATCH --cpus-per-task 4            # 4 CPUs per GPU
#SBATCH --output        %x.log       # Location of output log
#SBATCH --gres          gpu:1        # Number of GPUs to use (max 2)
#SBATCH --partition     gpu          # Must be run on the GPU partition

module load MATLAB/2018b
module load CUDA # Drivers for using GPU

#Job run
matlab -nodisplay -r "gpuDeviceCount()"

Note

One GPU hour costs the same as 56 CPU hours. GPUs are a powerful resource and should only be used if you expect a significant speedup.
