How can I profile a SLURM job?

Job resource usage can be determined on job completion by checking the following sacct columns:

  • MaxRSS - Peak memory usage.
  • TotalCPU - Total CPU time used. Check that Elapsed x AllocCPUS ≈ TotalCPU.
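As a rough illustration of the TotalCPU check, the sketch below converts Slurm time strings to seconds and reports TotalCPU as a percentage of Elapsed x AllocCPUS. The helper names to_seconds and cpu_efficiency are hypothetical and not part of Slurm. The input values are assumed to come from a query such as sacct -j <jobid> --format=JobID,Elapsed,AllocCPUS,TotalCPU,MaxRSS.

```shell
# Hypothetical helper: convert a Slurm time string to whole seconds.
# Handles [DD-]HH:MM:SS (Elapsed) and MM:SS.mmm (short TotalCPU values).
to_seconds() {
  echo "$1" | awk '{
    days = 0; t = $0
    if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
    n = split(t, p, ":")
    if (n == 3) {
      s = p[1] * 3600 + p[2] * 60 + p[3]
    } else if (n == 2) {
      s = p[1] * 60 + p[2]
    } else {
      s = p[1]
    }
    printf "%d\n", days * 86400 + s
  }'
}

# Hypothetical helper: print TotalCPU as a percentage of Elapsed x AllocCPUS.
# Arguments: <Elapsed> <AllocCPUS> <TotalCPU>, as reported by sacct.
cpu_efficiency() {
  awk -v e="$(to_seconds "$1")" -v n="$2" -v t="$(to_seconds "$3")" 'BEGIN {
    if (e * n == 0) { print "n/a" } else { printf "%.0f\n", 100 * t / (e * n) }
  }'
}

# Example: a job that ran for 1 hour on 4 CPUs and used 2 hours of CPU time.
cpu_efficiency 01:00:00 4 02:00:00
```

A result well below 100 suggests the job requested more CPUs than it was able to use.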

However, if you want to record resource usage over the run time of your job,
add the line #SBATCH --profile task to your SLURM header.
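A minimal sketch of a batch script with profiling enabled might look like the following. The job name, resource requests, and my_program are placeholders chosen for illustration; only the --profile task line comes from the text above.

```shell
#!/bin/bash -e
# Placeholder job name and resource requests; adjust these for your job.
#SBATCH --job-name=profile-demo
#SBATCH --time=00:10:00
#SBATCH --mem=1G
#SBATCH --cpus-per-task=2
# Record per-task resource usage over the run time of the job.
#SBATCH --profile task

# my_program is a hypothetical executable standing in for your workload.
srun ./my_program
```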

On completion of your job, either contact us for help analysing the data, or collate the data into an HDF5 file using the command sh5util -j <jobid>.

A file named job_<JobID>.h5 will be created.
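The collation step can be sketched as a small shell helper. The function names profile_file and collate_profile are hypothetical, and the h5ls call assumes the standard HDF5 command-line tools are installed on your system; if they are not, the listing is skipped.

```shell
# Hypothetical helper: name of the file sh5util creates for a given job ID.
profile_file() {
  echo "job_$1.h5"
}

# Hypothetical helper: collate the profile data for a job, then list the
# groups and datasets in the resulting file if h5ls is available.
collate_profile() {
  jobid=$1
  sh5util -j "$jobid" || return 1
  if command -v h5ls >/dev/null 2>&1; then
    h5ls -r "$(profile_file "$jobid")"
  fi
}
```

For example, collate_profile 12345 would run sh5util -j 12345 and then list the contents of job_12345.h5.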

Scripts for plotting HDF5 data.  

Labels: slurm profiling