Māui Slurm Partitions

NeSI Access to Maui (XC50) and Maui_Ancil (CS500)

As previously noted, NeSI users have access to a subset of the Maui and/or Maui_Ancil compute resources. To avoid confusion, partitions on these systems that may be used for NeSI workloads carry the prefix "nesi_".
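As a quick check (a minimal sketch, assuming you are logged in to the relevant Māui login node), the NeSI partitions and their current state can be listed with sinfo:

    # List a specific NeSI partition on the current cluster
    sinfo --partition=nesi_research

    # Or show all partitions and filter for the "nesi_" prefix
    sinfo | grep nesi_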

Important Considerations

Hyperthreading Enabled

Hyperthreading is enabled on Māui and its ancillary nodes (maui_ancil). By default, Slurm therefore schedules hyperthreads (logical cores, or "CPUs" in Slurm nomenclature), of which there are 80 per node. For some applications this improves performance, but where it does not, or even degrades performance, users can request that Slurm allocate physical cores instead. To turn hyperthreading off:

  • Use the srun option or sbatch directive --hint=nomultithread

Even though hyperthreading is enabled, the resources will generally be allocated to jobs at the level of a physical core. Two different jobs will not share a physical core. For example, a job requesting resources for three tasks on a hyperthreading node will be allocated two full physical cores.
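Below is a minimal sketch of a batch script that requests physical cores rather than hyperthreads; the job name, program and resource numbers are placeholders, not site-mandated values.

    #!/bin/bash
    #SBATCH --job-name=nomt_example   # placeholder job name
    #SBATCH --partition=nesi_research
    #SBATCH --time=01:00:00
    #SBATCH --nodes=1
    #SBATCH --ntasks=40               # one task per physical core of a 40-core node
    #SBATCH --hint=nomultithread      # allocate physical cores, not hyperthreads

    srun ./my_program                 # placeholder executable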

Charging on Maui

By default nodes are not shared on Māui, and the minimum unit of charging is one node-hour, where 1 node-hour is 40 core-hours, or 80 Slurm CPU-hours.
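For example, a job that occupies 3 nodes for 2 hours is charged 6 node-hours, i.e. 240 core-hours (480 Slurm CPU-hours).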

Maui (XC50) Slurm Partitions

nesi_research

  • Maximum wallclock time: 24 hours
  • Maximum nodes available: 264
  • Maximum nodes per job: 66
  • Maximum running jobs per user: 4
  • Maximum queued jobs per user: 10
  • Description: Standard partition for all small and long jobs. Base priorities are implemented via QoS, and 104 of the nodes have large memory (192 GB). The maximum job size is 168 node-hours.
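As an illustrative sketch (the account is omitted and the program name is a placeholder), a multi-node job on nesi_research might be submitted with a script along these lines:

    #!/bin/bash
    #SBATCH --partition=nesi_research
    #SBATCH --time=04:00:00        # within the 24-hour wallclock limit
    #SBATCH --nodes=4              # 4 nodes x 4 h = 16 node-hours, well under the 168 node-hour limit
    #SBATCH --ntasks-per-node=40   # one MPI rank per physical core
    #SBATCH --hint=nomultithread

    srun ./my_mpi_program          # placeholder executable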

Quality of Service: nesi_debug

Orthogonal to the partition, each job has a "QoS", with the default QoS for a job being determined by the allocation class of its project. Specifying --qos=nesi_debug overrides that and gives the job very high priority, but it is subject to strict limits: 20 minutes per job, only 1 job at a time per user, and at most 8 nodes per job.
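For example (a sketch; the script name test_job.sl is a placeholder), a short test run can be given the debug QoS directly on the command line:

    # Submit a short, high-priority test job (limits: 20 minutes, 8 nodes, 1 running job per user)
    sbatch --qos=nesi_debug --time=00:10:00 --nodes=2 test_job.sl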

Māui Electrical Groups

The XC nodes in Māui are connected using a Dragonfly network. Each blade holds 4 nodes, which share a network interconnect (NIC). A chassis is built from 16 blades, whose NICs are connected all-to-all. Six chassis make up one electrical group (two cabinets), in which every NIC is connected to the NICs of all other chassis via copper cables. The electrical groups are in turn connected all-to-all using optical cables.

Māui consists of 3 cabinets: the first two contain 336 compute nodes, while Cabinet 2 (an electrical group of only one cabinet) holds 128 compute nodes, as only two of its three chassis are populated. In certain circumstances, applications may experience a slowdown if they are placed across both electrical groups.

Users can prevent this by adding the Slurm directive #SBATCH --switches=1 to their batch script, which defines the maximum number of switches desired for the job allocation. We strongly advise that you also set a maximum waiting time for the requested switch count, e.g. #SBATCH --switches=1@01:00:00 will make the scheduler wait at most one hour before ignoring the switches request.

Caution: if Slurm finds an allocation containing more switches than the count specified, the job remains pending until it either finds an allocation with the desired switch count or the time limit expires. To determine the default maximum wait time, run scontrol show config | grep max_switch_wait.
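Putting this together, a hedged sketch of the relevant directives (the node count and waiting time are illustrative only) is:

    #SBATCH --nodes=32
    #SBATCH --switches=1@01:00:00   # prefer a single electrical group, but wait at most 1 hour for it

    # Show the site-wide cap on how long a switches request will be honoured
    scontrol show config | grep max_switch_wait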

Maui_Ancil (CS500) Slurm Partitions

nesi_prepost

  • Maximum wallclock time: 3 hours
  • Maximum cores available: 80
  • Maximum cores per job: 2
  • Maximum running jobs per user: 4
  • Maximum queued jobs per user: 100
  • Description: Used for pre- and post-processing tasks.

nesi_gpu

  • Maximum wallclock time: 72 hours
  • Maximum cores/GPGPUs available: 40 cores, 5 GPGPUs
  • Maximum cores/GPGPUs per job: 8 cores, 1 GPGPU
  • Maximum running jobs per user: 1
  • Maximum queued jobs per user: 5
  • Description: Used for GPGPU jobs on the Māui ancillary nodes, and for visualisation. Between 7 AM and 8 PM on weekdays the number of GPGPUs available is reduced to 4, with the remaining one reserved for remote visualisation. Base priorities are implemented via QoS.

nesi_igpu

  • Maximum wallclock time: 2 hours
  • Maximum cores/GPGPUs available: 8 cores, 1 GPGPU
  • Maximum cores/GPGPUs per job: 8 cores, 1 GPGPU
  • Maximum running jobs per user: 1
  • Maximum queued jobs per user: 5
  • Description: For interactive GPGPU access.
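As a hedged sketch (the GRES name gpu:1 is an assumption about the site configuration, and the program name is a placeholder), a GPGPU job on the ancillary nodes might be requested as follows:

    #!/bin/bash
    #SBATCH --partition=nesi_gpu
    #SBATCH --time=12:00:00       # within the 72-hour limit
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=8     # up to the 8-core per-job limit
    #SBATCH --gres=gpu:1          # assumed GRES name for one GPGPU; check site documentation

    srun ./my_gpu_program         # placeholder executable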

Sview

You may use sview to view the jobs in all Slurm clusters in the HPCF (i.e. mahuika, maui and maui_ancil).
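For a text-based alternative (a sketch that assumes Slurm's multi-cluster support is configured for the clusters listed above), squeue can report your jobs across the HPCF clusters:

    # Graphical overview of jobs (requires X forwarding, e.g. ssh -Y)
    sview &

    # Your own jobs across the HPCF Slurm clusters
    squeue --clusters=mahuika,maui,maui_ancil --user=$USER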

 

 
