If you are unsure about using our job scheduler Slurm, more details can be found here.
Slurm Commands
A complete list of Slurm commands can be found here, or by entering man slurm into a terminal
sbatch | sbatch submit.sl |
Submits the Slurm script submit.sl |
squeue | squeue |
Displays entire queue. |
squeue --me |
Displays your queued jobs. | |
squeue -p long |
Displays queued jobs on the long partition. | |
sacct | sacct |
Displays all the jobs run by you that day. |
sacct -S 2019-01-01 |
Displays all the jobs run by you since the 1st Jan 2019 | |
sacct -j 123456789 |
Displays job 123456789 | |
scancel | scancel 123456789 |
Cancels job 123456789 |
scancel --me |
Cancels all your jobs. | |
sshare | sshare -U |
Shows the Fair Share scores for all projects of which you are a member. |
sinfo | sinfo |
Shows the current state of our Slurm partitions. |
sbatch options
A complete list of sbatch options can be found here, or by running man sbatch
Options can be provided on the command line or in the batch file as an #SBATCH
directive. The option name and value can be separated using an '=' sign e.g. #SBATCH --account=nesi99999
or a space e.g. #SBATCH --account nesi99999
. But not both!
General options
--job-name | #SBATCH --job-name=MyJob |
The name that will appear when using squeue or sacct |
--account | #SBATCH --account=nesi99999 |
The account your core hours will be 'charged' to. |
--time | #SBATCH --time=DD-HH:MM:SS |
Job max walltime |
--mem | #SBATCH --mem=512MB |
Memory required per node. |
--partition | #SBATCH --partition=long |
Specified job partition. |
--output | #SBATCH --output=%j_output.out |
Path and name of standard output file. |
--mail-user | #SBATCH --mail-user=bob123@gmail.com |
Address to send mail notifications. |
--mail-type | #SBATCH --mail-type=ALL |
Will send a mail notification at BEGIN END FAIL |
#SBATCH --mail-type=TIME_LIMIT_80 |
Will send message at 80% walltime | |
--no-requeue | #SBATCH --no-requeue |
Will stop job being requeued in the case of node failure. |
Parallel options
--nodes | #SBATCH --nodes=2 |
Will request tasks be run across 2 nodes. |
--ntasks | #SBATCH --ntasks=2 |
Will start 2 MPI tasks. |
--ntasks-per-node | #SBATCH --ntasks-per-node=1 |
Will start 1 task per requested node |
--cpus-per-task | #SBATCH --cpus-per-task=10 |
Will request 10 logical CPUs per task. See Hyperthreading. |
--mem-per-cpu | #SBATCH --mem-per-cpu=512MB |
Memory Per logical CPU.
|
--array | #SBATCH --array=1-5 |
Will submit job 5 times each with a different $SLURM_ARRAY_TASK_ID (1,2,3,4,5) |
#SBATCH --array=0-20:5 |
Will submit job 5 times each with a different $SLURM_ARRAY_TASK_ID (0,5,10,15,20) |
|
#SBATCH --array=1-100%10 |
Will submit 1 though to 100 jobs but no more than 10 at once. |
Other
--qos | #SBATCH --qos=debug |
Adding this line gives your job a very high priority. Limited to one job at a time, max 15 minutes. |
--profile | #SBATCH --profile=ALL |
Allows generation of a .h5 file containing job profile information. |
--dependency | #SBATCH --dependency=afterok:123456789 |
Will only start after the job 123456789 has completed. |
--hint | #SBATCH --hint=nomultithread |
Disables hyperthreading, be aware that this will significantly change how your job is defined. |
Tip
Many options have a short and long form e.g.
#SBATCH --job-name=MyJob
&#SBATCH -J=MyJob
.echo "Completed task ${SLURM_ARRAY_TASK_ID} / ${SLURM_ARRAY_TASK_COUNT} successfully"
Tokens
These are predefined variables that can be used in sbatch directives such as the log file name.
%x |
Job name |
%u |
User name. |
%j |
Job ID |
%a |
Job array Index |
Environment variables
Common examples.
$SLURM_JOB_ID |
Useful for naming output files that won't clash. |
$SLURM_JOB_NAME |
Name of the job. |
$SLURM_ARRAY_TASK_ID |
The current index of your array job. |
$SLURM_CPUS_PER_TASK |
Useful as an input for multi-threaded functions. |
$SLURM_NTASKS |
Useful as an input for MPI functions. |
$SLURM_SUBMIT_DIR |
Directory where sbatch was called. |
Tip
In order to decrease the chance of a variable being misinterpreted you should use the syntax
${NAME_OF_VARIABLE}
and define in strings if possible. e.g.echo "Completed task ${SLURM_ARRAY_TASK_ID} / ${SLURM_ARRAY_TASK_COUNT} successfully"