Job prioritisation

Each queued job has a priority score. A job starts when sufficient resources are available and those resources are not already reserved for jobs with a higher priority.

To see the priorities of your currently pending jobs you can use the command sprio -u $USER.

Priority scores are determined by a number of factors:

1) Quality of Service

You can request the "debug" Quality of Service by adding the option --qos=debug to your sbatch command line.
This adds 5000 to the job priority, raising it above all non-debug jobs, but it is limited to one small job per user at a time: no more than 15 minutes and no more than 2 nodes.
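
As an illustration, the same option can be set inside a batch script instead of on the command line. This is only a sketch: the job name and the echo placeholder are hypothetical, and the time and node values are simply chosen to sit within the limits described above.

```shell
#!/bin/bash -e
#SBATCH --job-name=debug_test   # hypothetical job name
#SBATCH --qos=debug             # request the "debug" Quality of Service
#SBATCH --time=00:10:00         # within the 15 minute limit
#SBATCH --nodes=1               # within the 2 node limit

# Placeholder workload; replace with your own short test command.
echo "Debug job running on ${HOSTNAME}"
```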

2) Fair Share

Job priority decreases whenever the project uses more core-hours than expected, across all partitions. Under this Fair Share policy, projects that have recently consumed many CPU core-hours relative to their expected rate of use (whether by running many jobs or by running large jobs) have a lower priority, while projects with little recent activity relative to their expected rate of use see their waiting jobs start sooner. Fair Share contributes up to 1000 points to the job priority. To see the recent usage and current fair-share score of a project, use the command nn_corehour_usage.

3) Job Age

Job priority rises slowly as a pending job waits: 1 point per hour, for up to 3 weeks.
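
To make the arithmetic concrete, 3 weeks is 504 hours, so the age contribution would top out at 504 points. The sketch below assumes the cap is applied exactly at that point; the function name age_points is hypothetical and is not a NeSI or Slurm command.

```shell
# Age points for a pending job: 1 point per hour, capped at 3 weeks.
# age_points is a hypothetical helper for illustration only.
age_points() {
    local hours_pending=$1
    local cap=$((3 * 7 * 24))   # 504 hours in 3 weeks
    if [ "$hours_pending" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$hours_pending"
    fi
}

age_points 48     # prints 48
age_points 1000   # prints 504
```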

4) Job Size

This slightly favours jobs that request more CPUs, to counter the inherently longer wait for a larger number of CPUs to become available.

5) Project Allocation Class

This depends on which "allocation class" entitles your project to use NeSI.

Project class           Class priority score
Proposal Development    10
Postgraduate            20
Collaborator            30
Merit                   40
Commercial              40
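
Taken together, the figures above suggest why a debug job outranks non-debug jobs. The following sketch assumes the components simply add up, and it leaves out Job Size because no figure is given for it; treat it as a rough illustration rather than the scheduler's actual calculation. All variable names are hypothetical.

```shell
# Upper bound on the documented components for a non-debug job.
# Job Size is omitted because the text gives no figure for it.
max_fair_share=1000            # Fair Share contributes up to 1000 points
max_age=$((3 * 7 * 24))        # 1 point per hour for 3 weeks = 504
max_class=40                   # Merit or Commercial
debug_bonus=5000               # added by --qos=debug

max_non_debug=$((max_fair_share + max_age + max_class))
echo "Highest documented non-debug total: ${max_non_debug}"   # 1544
echo "Debug bonus alone: ${debug_bonus}"                       # 5000
```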

 

Backfill

Backfill is a scheduling strategy that allows small, short jobs to run immediately if by doing so they will not delay the expected start time of any higher-priority jobs. Since the expected start time of pending jobs depends upon the expected completion time of running jobs it is important that you set reasonably accurate job time limits if backfill is to work well.
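
As a sketch of what a backfill-friendly request might look like, the batch script below asks for few resources and a short, realistic time limit. The job name, the resource values and the placeholder workload are all hypothetical; choose values that reflect how long your own job actually runs.

```shell
#!/bin/bash -e
#SBATCH --job-name=short_job    # hypothetical job name
#SBATCH --time=00:30:00         # realistic limit, not a generous guess
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2

# Placeholder workload; replace with your own command.
echo "Job running on ${HOSTNAME}"
```

Once the job is pending, squeue -u $USER --start reports the scheduler's current estimate of its start time, where one is available.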

Although jobs small enough to be backfilled also receive a low Job Size score, in our experience the ability to be backfilled is, on the whole, more useful for getting work done on the HPCs.

More information about backfill can be found here.

Limits

Cluster and partition-specific limits can sometimes prevent jobs from starting regardless of their priority score.  For details see the pages on Mahuika or Māui.
