Hyperthreading

Hyperthreading is enabled on the NeSI machines, so each physical CPU core presents two logical cores. This increases the efficiency of some multithreaded jobs, but because Slurm counts in logical cores, some aspects of running non-hyperthreaded jobs can be confusing, even when hyperthreading is turned off within the job with --hint=nomultithread.

  • Non-hyperthreaded jobs that request memory with --mem-per-cpu should halve their memory requests, since --mem-per-cpu is interpreted as memory per logical core, not per thread or task. For non-MPI jobs it may be clearer to specify --mem (i.e. memory per node) instead.
  • Non-MPI jobs that specify --cpus-per-task and use srun should also set --ntasks=1, otherwise the program will be run twice in parallel, halving the efficiency of the job. Both points are illustrated in the sketch after this list.
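
For example, a non-MPI multithreaded job wanting 4 physical cores might be written as in the following minimal sketch; the program name my_prog and the resource figures are placeholders, not recommendations:

    #!/bin/bash
    #SBATCH --job-name=example      # hypothetical job name
    #SBATCH --ntasks=1              # one task, so srun launches the program exactly once
    #SBATCH --cpus-per-task=8       # 8 logical cores, i.e. 4 physical cores
    #SBATCH --mem=8G                # memory per node, avoiding per-logical-core arithmetic

    srun my_prog                    # my_prog stands in for your executable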


The precise rules about when hyperthreading applies are:

  • Jobs never share physical cores, and so are always allocated an even number of logical cores.
  • Tasks do not share physical cores by default on Mahuika, but that can be overridden with --hint=multithread.
  • Tasks do share physical cores by default on Māui, but that can be overridden with --hint=nomultithread.
  • Threads do share physical cores by default, but that can be overridden with --hint=nomultithread.
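
For instance, if tasks on Mahuika would benefit from sharing physical cores, the default can be overridden explicitly. This is a minimal sketch, with my_prog as a placeholder:

    #!/bin/bash
    #SBATCH --ntasks=4              # four single-threaded tasks
    #SBATCH --hint=multithread      # let tasks share physical cores (overrides the Mahuika default)

    srun my_prog                    # placeholder executable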

Hyperthreading with Serial Jobs

A serial job, consisting of one task and one thread, cannot take advantage of hyperthreading, yet it still occupies a full physical core. In other words, a serial job will reserve (and be charged for) two logical cores, but will only be able to use one. This is the case whether or not you run with --hint=nomultithread. To get the best use out of our hardware, we suggest you consider whether your work could be done with multithreaded software.
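
For reference, a minimal serial job script might look like the sketch below (my_serial_prog and the memory figure are placeholders); even with these settings, the job is charged for two logical cores:

    #!/bin/bash
    #SBATCH --ntasks=1              # one task...
    #SBATCH --cpus-per-task=1       # ...with one thread; still reserves a full physical core
    #SBATCH --mem=1G                # memory per node

    srun my_serial_prog             # placeholder for your serial executable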

Hyperthreading with Multithreaded Jobs

A multithreaded job, also known as a Shared-Memory Processing (SMP) job or an OpenMP job, can take advantage of hyperthreading, and will often get better performance by doing so. To use hyperthreading in your OpenMP job, you will need to do the following:

  • In your Slurm directives, set --cpus-per-task to an even number.
  • Include the following statement near the top of your job script, just below the Slurm directives: export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
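
Put together, an OpenMP job script might look like this minimal sketch (the program name and resource figures are placeholders):

    #!/bin/bash
    #SBATCH --ntasks=1                 # a single multithreaded task
    #SBATCH --cpus-per-task=8          # an even number: 8 logical cores = 4 physical cores
    #SBATCH --mem=8G                   # memory per node

    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}   # one OpenMP thread per logical core

    srun my_omp_prog                   # placeholder for your OpenMP executable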

We do not recommend setting the number of CPUs per task to an odd number unless your scientific software gains a significant benefit from doing so. If you do request an odd number of CPUs per task, you can expect the number of logical cores used by the job to be rounded up to the nearest even number: for example, if you set --cpus-per-task=5, the job will use six logical cores, that is, three physical cores.

Hyperthreading with MPI Jobs and Array Jobs

Both MPI jobs and array jobs have a complex relationship with multithreading. An MPI job consists of multiple tasks, which communicate with each other by means of MPI but are otherwise independent executions of the same computer program. An array job, by contrast, is a way of batching up individual jobs for easier submission and management. Therefore:

  • If the program supports MPI but neither OpenMP nor some other form of multithreading, or if you set --cpus-per-task to 1, then the job will use one physical core (that is, two logical cores) per MPI task. For example, if you specify --ntasks=8 and --cpus-per-task=1, the job will use 8 physical cores, that is, 16 logical cores. The same applies if your job is an array job. For this reason, if your software supports multithreading, we recommend that you use it even if you are also using MPI or array jobs.
  • If the program supports both MPI and multithreading (e.g. OpenMP), you can take advantage of hyperthreading by setting --cpus-per-task to an even number and putting export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK} near the top of your job script, as for stand-alone multithreaded jobs; a sketch follows this list.
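
As an illustration, a hybrid MPI/OpenMP job combining both settings might be laid out as follows (my_hybrid_prog and the resource figures are placeholders):

    #!/bin/bash
    #SBATCH --ntasks=8                 # 8 MPI tasks
    #SBATCH --cpus-per-task=4          # an even number of logical cores (2 physical cores) per task
    #SBATCH --mem-per-cpu=1G           # interpreted per logical core

    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}   # OpenMP threads per MPI task

    srun my_hybrid_prog                # placeholder for your MPI+OpenMP executable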
