Pan Bulletin: 1 August 2017

Contact

If you have any questions about the recent outage or the content of this bulletin, please contact support@nesi.org.nz.

Next Scheduled Outage

The next planned outage is scheduled for Thursday 7 September 2017. However, this date is subject to change pending staff training and implementation schedules for the new platforms.

Important Changes

MATLAB Licences

Because MATLAB licences are not provided directly by NeSI, the rules for their use differ depending on your institution. If you don’t have access to a MATLAB licence, you could consider using an open-source alternative.

Massey University

Your MATLAB jobs should always request a "massey_matlab" virtual licence token from Slurm, so that Slurm does not start the job until enough licences are available:

#SBATCH --licenses=massey_matlab:1

The University of Auckland

Your MATLAB jobs should now always request the appropriate licence token from Slurm ("sci_matlab" for Faculty of Science personnel, "eng_matlab" for Faculty of Engineering personnel), so that Slurm does not start the job until enough licences are available:

  • For Faculty of Engineering people: #SBATCH --licenses=eng_matlab:1
  • For Faculty of Science people: #SBATCH --licenses=sci_matlab:1

The University of Otago and Victoria University of Wellington

You don't need to request MATLAB licence tokens, as you have virtually unlimited MATLAB licences.

I/O Licence Tokens

If your jobs are particularly I/O heavy and you run many of them at the same time, your work can slow down the shared GPFS filesystem. To prevent too many such I/O-heavy jobs from running at once, we have introduced another virtual licence, named "io". There are 1,000 of these "io" licence tokens, each corresponding to approximately 1 MB/s of capacity. A typical I/O-bound job might use around 10 MB/s, so it should include the following directive in its job submission script:

#SBATCH --licenses=io:10
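If you have a rough estimate of your job's I/O rate, the token count follows directly from the "one token per 1 MB/s" rule above. The sketch below is only an illustration: io_licence_directive is a hypothetical helper (not something installed on Pan) that rounds an estimated rate in MB/s up to a whole number of tokens and prints the matching directive.

```shell
# Hypothetical helper, not part of Slurm or the Pan environment.
# $1 = estimated I/O rate of one job in MB/s (may be fractional).
io_licence_directive() {
    awk -v rate="$1" 'BEGIN {
        tokens = int(rate)
        if (tokens < rate) tokens++   # round up to a whole token
        printf "#SBATCH --licenses=io:%d\n", tokens
    }'
}

# Example: a job expected to read and write about 10 MB/s.
io_licence_directive 10
```

The printed line can then be pasted into the job submission script alongside your other #SBATCH directives.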

This mechanism is similar to (and replaces) our previous --gres=io mechanism, but it is more flexible regarding which nodes such jobs can run on.

Project Allocations

NeSI's Access Policy governs how researchers may use NeSI (i.e., how we approve access and allocate computing resources). We have now provided all projects on NeSI facilities, including those from NeSI's investing institutions, with allocations of CPU core hours. Once you have used up your allocation, you will no longer be able to submit jobs to the queue until we allocate you more CPU core hours.

If you have run out of CPU core hours:

  • If you are still working on the same research programme, you can request more core hours by writing to us at support@nesi.org.nz. Please include your project code in your message.
  • If you would like to use our facilities to work on a new research programme, you can apply for a new project.

Scheduling of Postgraduate Projects

NeSI's Access Policy states that Merit projects have the highest priority level in our scheduling system. Our usage forecast now shows that we will be operating under resource contention from time to time, so we have had to adjust our scheduling algorithm to make sure this policy is adhered to.

We have dropped the priority level of Postgraduate project allocations so that it is now lower than Merit project allocations. As a result, those researchers currently accessing NeSI via the Postgraduate allocation class may experience longer queue times.

If you notice that your research has been adversely affected by this change, please send us a message at support@nesi.org.nz. Depending on your institution, you may be eligible for an allocation from a class with a higher scheduling priority.

sacct

We have set the environment variable SACCT_FORMAT so that the sacct command gives more useful information. If you prefer the old default format, you can use the following command:

unset SACCT_FORMAT
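Alternatively, you can choose your own columns instead of falling back to Slurm's built-in default. The following is a minimal sketch: the field list is only an example (all five are standard sacct field names), and site_sacct_format is a hypothetical variable name used here to remember the cluster's setting for the rest of the session.

```shell
# Remember the site-provided format so it can be restored later in this session.
site_sacct_format="${SACCT_FORMAT:-}"

# Example column selection; adjust the comma-separated field list to taste.
export SACCT_FORMAT="JobID,JobName,Elapsed,State,MaxRSS"

# To go back to the site-provided format:
#   export SACCT_FORMAT="$site_sacct_format"
```

Both this and unset SACCT_FORMAT affect only the current shell session. To make the change persistent you would most likely need to add the line to a shell startup file such as ~/.bashrc, assuming that file is read after the site-wide setting is applied.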

The bash shell

We have improved command line tab completion in bash, especially for Slurm commands.

Job Constraints

Some jobs suffer long wait times in the queue because the job submission scripts request more resources than necessary. Recently we have noticed this happening with:

Memory

Many jobs use only a small proportion of the memory they request. If you are specifying more than 4 GB per core, please check that you really need to.

To check the amount of memory that was actually used by a completed job, you can run:

  • sacct -j JOBID (where JOBID is replaced by the numeric job ID)
  • sacct -u USERNAME -S YYYY-MM-DD (where USERNAME and YYYY-MM-DD are replaced by your username and the start of the period you want to search within)

We have configured the cluster to report the memory used (called "MaxRSS" by Slurm) in the right-most column of the sacct output by default.
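A job usually has several steps, each with its own MaxRSS value, so it can help to pick out the largest one. The sketch below assumes sacct output in "JobID|MaxRSS" form with size suffixes such as K, M or G; peak_rss_mb is a hypothetical helper (not a Slurm command) that reads such lines and prints the largest value in whole megabytes.

```shell
# Hypothetical helper: read "JobID|MaxRSS" lines on stdin and print the
# largest MaxRSS, converted to whole megabytes.
peak_rss_mb() {
    awk -F'|' '
        # Convert a size such as 2048K, 512M or 3G to megabytes.
        function mb(v,    n, u) {
            n = v + 0                      # numeric part of the value
            u = substr(v, length(v), 1)    # trailing unit letter, if any
            if (u == "K") return n / 1024
            if (u == "M") return n
            if (u == "G") return n * 1024
            if (u == "T") return n * 1024 * 1024
            return n / (1024 * 1024)       # no suffix: assume bytes
        }
        $2 != "" { m = mb($2); if (m > max) max = m }   # skip steps with no MaxRSS
        END { printf "%d\n", max }'
}

# Example with made-up sample lines in place of real sacct output.
printf '123|\n123.batch|2048K\n123.0|3G\n' | peak_rss_mb
```

On the cluster, input in this form could be produced with sacct -j JOBID --format=JobID,MaxRSS --parsable2 --noheader and piped into the function. Comparing the result with the memory the job requested shows how much headroom there was.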

Node Type

We sometimes see a large number of jobs specifying #SBATCH --constraint=sb (or equivalently #SBATCH -C sb). Anything that can run on a Sandy Bridge node can also run on an Ivy Bridge node, so unless you are doing benchmarking it is usually better to specify #SBATCH --constraint=avx (or #SBATCH -C avx) so that your job can run on either kind of node.
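If you have several existing job scripts that pin jobs to Sandy Bridge nodes, a small filter can rewrite the directive for you. This is only a sketch: relax_node_constraint is a hypothetical helper that reads a job script on standard input and changes a constraint of exactly sb to avx, in either the long or the short form, leaving every other line untouched.

```shell
# Hypothetical helper: rewrite "#SBATCH --constraint=sb" or "#SBATCH -C sb"
# to use avx instead. Reads a job script on stdin, writes the result to stdout.
relax_node_constraint() {
    sed -E 's/^(#SBATCH[[:space:]]+(--constraint=|-C[[:space:]]+))sb[[:space:]]*$/\1avx/'
}

# Example: both spellings of the directive are rewritten.
printf '#SBATCH --constraint=sb\n#SBATCH -C sb\n' | relax_node_constraint
```

Because the function writes to standard output, you can review the result before replacing the original file, for example with relax_node_constraint < job.sl > job_avx.sl (both file names are placeholders).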

New Applications

We have installed the following applications on Pan for general use:

  • ESMF - The Earth System Modelling Framework
  • HISAT2 - Aligning next-generation sequencing reads
  • XBeach - A two-dimensional model for morphological changes of beaches during storms
  • Metaxa2 - Taxonomic classification of rRNA
  • RASPA2 - Simulation of molecules in gases, fluids, zeolites, aluminosilicates, metal-organic frameworks, carbon nanotubes and external fields
  • DIAMOND - A sequence aligner for protein and translated DNA searches
  • paprica - PAthway PRediction by phylogenetIC plAcement
  • LUMPY - DNA structural variant discovery
  • Phaistos - all-atom Monte Carlo simulations of proteins
  • gdc-client - Data Transfer Tool for the Genomic Data Commons

Application Upgrades

We have upgraded many existing applications since the last bulletin, including R, MATLAB and ANSYS. You can check the currently available versions of any application by using the following command:

module spider <application name>

ANSYS users are advised that some MPI options have changed in ANSYS 18. See the ANSYS article for details.

Introductory Workshops

A one-hour introductory workshop on using the Pan cluster is offered by the Centre for eResearch at the University of Auckland on Wednesdays at 3 pm. Booking is required. Please contact the Centre at eresearch@nesi.org.nz for registration enquiries.

If you are based at another institution and are interested in participating in a Hands-on-Introduction to NeSI workshop, please contact us at training@nesi.org.nz.
