ANSYS

Availability

Licences

License Types

The three main ANSYS licenses are:

  • ANSYS Teaching License (ansys_t)

    This is the default license type. It can be used on up to 16 CPUs, on models with fewer than 512k nodes.

  • ANSYS Research license (ansys_r)

    No node restrictions. Can be used on up to 16 CPUs; for every additional CPU over 16 you must request an additional 'ansys_hpc' license.

  • ANSYS HPC License (ansys_hpc)
    One of these is required for each CPU over 16 when using a research license.
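
For example (a sketch; the token names match those above, and the exact license server suffix may differ on your system), a 36-CPU job on a research license needs 36 - 16 = 20 ansys_hpc tokens:

#SBATCH --ntasks   36
#SBATCH --licenses ansys_r:1,ansys_hpc:20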

License Order

Whether to use a teaching or research license must be set manually. If your model exceeds the teaching license's node limit and you do not switch to the research license before submitting, the job will fail.

The license order can be changed in Workbench under Tools > License Preferences (provided you have X11 forwarding set up), or by running either of the following commands (the ANSYS module must be loaded first using module load ANSYS).

prefer_research_license
prefer_teaching_license

Note

License preferences are individually tracked by each version of ANSYS. Make sure you set preferences using the same version as in your script.
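
For example, to set the preference for the version used in the scripts below:

module load ANSYS/19.2
prefer_research_license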

Journal files

Some ANSYS applications take a 'journal' text file as input. It is often useful to create this journal file in your SLURM script (tidiness, submitting jobs programmatically, etc). This can be done by using cat to make a file from a 'heredoc'.

Below is an example of this from a Fluent script.

#!/bin/bash -e

#SBATCH --job-name      Fluent_Array
#SBATCH --time          01:00:00          # Wall time
#SBATCH --mem           3G                # Memory per node
#SBATCH --licenses      ansys_hpc:1       # One license token per CPU, less 16
#SBATCH --array         1-100
#SBATCH --hint          nomultithread     # No hyperthreading

module load ANSYS/19.2

JOURNAL_FILE=fluent_${SLURM_JOB_ID}.in
cat <<EOF > ${JOURNAL_FILE}
rcd testCase${SLURM_ARRAY_TASK_ID}.cas
/solve/dual-time-iterate 10
/file/write-case-data testOut${SLURM_ARRAY_TASK_ID}.cas
exit
EOF
# Use one of the -v options 2d, 2ddp, 3d, or 3ddp
fluent -v3ddp -g -i ${JOURNAL_FILE}
rm ${JOURNAL_FILE}

JOURNAL_FILE is a variable holding the name of a file; on the next line, cat creates that file and writes a block of text into it. The block of text written is everything between an arbitrary delimiter string (in this case EOF) and its next occurrence.

In this case (assuming it is the first task of the array and the job ID is 1234567), the file fluent_1234567.in will be created:

rcd testCase1.cas
/solve/dual-time-iterate 10
/file/write-case-data testOut1.cas
exit

It is then passed as input (fluent -v3ddp -g -i fluent_1234567.in) and finally deleted (rm fluent_1234567.in).

Combined with variable substitution, this allows variables to be used in what might otherwise be a fixed input.
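
As a side note on the mechanism (generic bash behaviour, with illustrative file names): variables inside the heredoc are expanded by the shell unless the delimiter is quoted, so quoting it produces a literal template instead.

# Variables are substituted, as in the example above
cat <<EOF > expanded.in
rcd testCase${SLURM_ARRAY_TASK_ID}.cas
EOF

# Quoting the delimiter disables substitution; the file will contain the literal text ${SLURM_ARRAY_TASK_ID}
cat <<'EOF' > literal.in
rcd testCase${SLURM_ARRAY_TASK_ID}.cas
EOF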

Fluent

fluent -help for a list of commands.

One of the following flags must be used:

2d      2D solver, single precision.
3d      3D solver, single precision.
2ddp    2D solver, double precision.
3ddp    3D solver, double precision.
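
For example, a 3D double-precision batch run might look like this (the journal file name is illustrative):

fluent 3ddp -g -i my_journal.in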

 

Serial Example


Single process with a single thread (2 threads if hyperthreading enabled).

Usually submitted as part of an array, as in the case of parameter sweeps.

#!/bin/bash -e

#SBATCH --job-name      Fluent-Serial
#SBATCH --licenses      ansys_r@uoa_foe:1 # One research license
#SBATCH --time          00:05:00          # Walltime
#SBATCH --cpus-per-task 1                 # Double if hyperthreading enabled
#SBATCH --mem           3000              # Total memory
#SBATCH --hint          nomultithread     # Hyperthreading disabled

module load ANSYS/19.2

JOURNAL_FILE=/share/test/ansys/fluent/wing.in

fluent 3ddp -g -i ${JOURNAL_FILE}

Distributed Memory Example


Multiple processes each with a single thread.

Not limited to one node.
The model will be segmented into -t pieces, which should be equal to --ntasks.

Each task could run on a different node, leading to increased communication overhead. Jobs can be limited to a single node by adding --nodes=1, however this will increase your time in the queue as contiguous CPUs are harder to schedule.

#!/bin/bash -e

#SBATCH --job-name          Fluent-Dis
#SBATCH --time              00:05:00          # Walltime
#SBATCH --licenses          ansys_r@uoa_foe:1,ansys_hpc@uoa_foe:20
## One research license, (ntasks-16) hpc licenses
#SBATCH --nodes             1                 # Limit to n nodes (optional)
#SBATCH --ntasks            36                # Number of processes
#SBATCH --cpus-per-task     1                 # Double if hyperthreading enabled
#SBATCH --mem-per-cpu       1500              # Standard for large partition
#SBATCH --hint              nomultithread     # Hyperthreading disabled

module load ANSYS/19.2

JOURNAL_FILE=/share/test/ansys/fluent/wing.in

fluent 3ddp -g -t ${SLURM_NTASKS} -i ${JOURNAL_FILE}


CFX

cfx5solve -help for a list of commands.

Serial Example


Single process with a single thread (2 threads if hyperthreading enabled).

Usually submitted as part of an array, as in the case of parameter sweeps.

#!/bin/bash -e

#SBATCH --job-name      CFX-serial
#SBATCH --licenses      ansys_r@uoa_foe:1 # One research license
#SBATCH --time          00:05:00          # Walltime
#SBATCH --cpus-per-task 1                 # Double if hyperthreading enabled
#SBATCH --mem           3000              # Total memory
#SBATCH --hint          nomultithread     # Hyperthreading disabled

module load ANSYS/19.2

input=/share/test/ansys/cfx/pump.def

cfx5solve -batch -def "$input"

Distributed Memory Example


Multiple processes each with a single thread.

Not limited to one node.
The model will be segmented into -part pieces, which should be equal to --ntasks.

Each task could run on a different node, leading to increased communication overhead. Jobs can be limited to a single node by adding --nodes=1, however this will increase your time in the queue as contiguous CPUs are harder to schedule.

#!/bin/bash -e

#SBATCH --job-name          ANSYS-Dis
#SBATCH --time              00:05:00          # Walltime
#SBATCH --licenses          ansys_r@uoa_foe:1,ansys_hpc@uoa_foe:20
## One research license, (ntasks-16) hpc licenses
#SBATCH --nodes             1                 # Limit to n nodes (optional)
#SBATCH --ntasks            36                # Number of processes
#SBATCH --cpus-per-task     1                 # Double if hyperthreading enabled
#SBATCH --mem-per-cpu       1500              # Standard for large partition
#SBATCH --hint              nomultithread     # Hyperthreading disabled

module load ANSYS/19.2

input=/share/test/ansys/cfx/pump.def

cfx5solve -batch -def "$input" -part $SLURM_NTASKS

CFX-Post

Even when running headless (without a GUI), CFX-Post requires a connection to a graphical display. In some cases it may be suitable to run CFX-Post on the login node using your X11 display, but for larger batch compute jobs you will need to make use of a dummy X server.

This is as simple as prepending your command with the X Virtual Frame Buffer command.

xvfb-run cfx5post input.cse
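
Below is a minimal sketch of a batch CFX-Post job using the dummy X server; the session file name and resource values are illustrative.

#!/bin/bash -e

#SBATCH --job-name      CFX-Post
#SBATCH --time          00:30:00          # Walltime
#SBATCH --cpus-per-task 1
#SBATCH --mem           4G
#SBATCH --hint          nomultithread

module load ANSYS/19.2

# Run the CFX-Post session file against the virtual framebuffer
xvfb-run cfx5post input.cse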

Mechanical APDL

Serial Example


Single process with a single thread (2 threads if hyperthreading enabled).

Usually submitted as part of an array, as in the case of parameter sweeps.

#!/bin/bash -e

#SBATCH --job-name      ANSYS-serial
#SBATCH --licenses      ansys_r@uoa_foe:1
#SBATCH --time          00:05:00          # Walltime
#SBATCH --cpus-per-task 1                 # Double if hyperthreading enabled
#SBATCH --mem           3000              # Total memory
#SBATCH --hint          nomultithread     # Hyperthreading disabled

module load ANSYS/19.2

input=/share/test/ansys/mechanical/structural.dat

mapdl -b -i "$input"

Shared Memory Example


Single process multiple threads.

All threads must be on the same node, limiting scalability.
Number of threads is set by -np and should be equal to --cpus-per-task.


Not recommended if using more than 8 cores (16 CPUs if hyperthreading enabled).

#!/bin/bash -e

#SBATCH --job-name      ANSYS-Shared
#SBATCH --licenses      ansys_r@uoa_foe:1
#SBATCH --time          00:05:00          # Walltime
#SBATCH --ntasks        1                 # Any more than 1 will not be utilised
#SBATCH --cpus-per-task 8                 # Double if hyperthreading enabled
#SBATCH --mem           12G               # 8 threads at 1.5 GB per thread
#SBATCH --hint          nomultithread     # Hyperthreading disabled

module load ANSYS/19.2

input=/share/test/ansys/mechanical/structural.dat

mapdl -b -np ${SLURM_CPUS_PER_TASK} -i "$input"

Distributed Memory Example


Multiple processes each with a single thread.

Not limited to one node.
The model will be segmented into -np pieces, which should be equal to --ntasks.

Each task could run on a different node, leading to increased communication overhead. Jobs can be limited to a single node by adding --nodes=1, however this will increase your time in the queue as contiguous CPUs are harder to schedule.

Distributed Memory Parallel is currently not supported on Māui.

#!/bin/bash -e

#SBATCH --job-name          ANSYS-Dis
#SBATCH --licenses          ansys_r@uoa_foe:1,ansys_hpc@uoa_foe:4
#SBATCH --time              00:05:00          # Walltime
#SBATCH --nodes             1                 # Limit to n nodes
#SBATCH --ntasks            16                # Number of processes
#SBATCH --cpus-per-task     1                 # Double if hyperthreading enabled
#SBATCH --mem-per-cpu       1500              # Standard for large partition
#SBATCH --hint              nomultithread     # Hyperthreading disabled

module load ANSYS/19.2

input=/share/test/ansys/mechanical/structural.dat

mapdl -b -dis -np ${SLURM_NTASKS} -i "$input"

Not all MAPDL solvers work using distributed memory. The following solvers and features are supported:

Sparse
PCG
ICCG
JCG
QMR
Block Lanczos eigensolver
PCG Lanczos eigensolver
Supernode eigensolver
Subspace eigensolver
Unsymmetric eigensolver
Damped eigensolver
QRDAMP eigensolver
Element formulation
Results calculation
Pre/Postprocessing


LS-DYNA

Fluid-Structure Example

#!/bin/bash -e
#SBATCH --job-name      LS-DYNA
#SBATCH --account       nesi99999         # Project Account
#SBATCH --time          01:00:00          # Walltime
#SBATCH --ntasks        16                # Number of CPUs to use
#SBATCH --mem-per-cpu   1500              # Memory per cpu
#SBATCH --hint          nomultithread     # No hyperthreading

module load ANSYS/18.1
input=3cars_shell2_150ms.k
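# LS-DYNA interprets the memory= value in words; assuming 8-byte (double-precision) words, dividing MB per CPU by 8 gives megawords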
lsdyna -dis -np $SLURM_NTASKS i="$input" memory=$(($SLURM_MEM_PER_CPU/8))M

FENSAP-ICE

FENSAP-ICE is a fully integrated ice-accretion and aerodynamics simulator.

Currently FENSAP-ICE is only available on Mahuika and in ANSYS 19.2.

The following FENSAP-ICE solvers are compatible with MPI:

  • FENSAP
  • DROP3D
  • ICE3D
  • C3D
  • OptiGrid

Case setup 

With GUI

If you have set up X11 forwarding, you may launch the FENSAP-ICE GUI with the command fensapiceGUI from within your FENSAP project directory.

1. Launch the run and select the desired number of (physical) CPUs.

2. Open the 'configure' panel.

FENSAP_GUI1.png

3. Under 'Additional mpirun parameters' add your inline SLURM options. You should include at least:

--job-name my_job
--mem-per-cpu memory
--time time
--licenses required licences
--hint nomultithread 
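
For example (all values here are illustrative placeholders):

--job-name=my_fensap_run --mem-per-cpu=1500 --time=02:00:00 --licenses=ansys_r@uoa_foe:1 --hint=nomultithread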

Note: All these parameters will be applied to each individual step.

4. Start the job. You can track progress under the 'log' tab.

FENSAP_GUI2.png

You may close your session and the job will continue to run on the compute nodes. You will be able to view the running job at any time by opening the GUI within the project folder.

Note

Submitting your job through the GUI has disadvantages and may not be suitable in all cases.

  • Closing the session or losing connection will prevent the next stage of the job from starting (the currently executing step will continue to run). To avoid this, it is a good idea to launch the GUI inside a tmux/screen session and then send the process to the background.
  • Each individual step will be launched with the same parameters given in the GUI.
  • By default, 'restart' is set to disabled. If you wish to continue a job from a given step/shot, you must select it in the dropdown menu.

Using fensap2slurm

Set up your model as you would normally, except rather than starting the run, just click 'save'. You do not need to set the number of CPUs or the MPI configuration.
Then, in your terminal, run fensap2slurm path/to/project, or run fensap2slurm from inside the run directory.

This will generate a template file for each stage of the job; edit these as you would a normal SLURM script and set the resource requirements.

For your first shot, it is a good idea to set CONTINUE=FALSE for the last stage of the shot; that way you can set more accurate resource requirements for the remainder.

The workflow can then be started by running .solvercmd, e.g. bash .solvercmd. Progress can be tracked through the GUI as usual.
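
A minimal sketch of the whole sequence (the project path and run directory names are hypothetical):

fensap2slurm /path/to/project   # generate a SLURM template for each stage
# ...edit the generated templates to set resource requirements...
cd /path/to/project/run001      # hypothetical run directory containing .solvercmd
bash .solvercmd                 # start the workflow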

Best Practices

GPU acceleration support

GPUs can be slow for smaller jobs because it takes time to transfer data from the main memory to the GPU memory. We therefore suggest that you only use them for larger jobs, unless benchmarking reveals otherwise.

Interactive use

It is best to use journal files etc. to automate ANSYS so that you can submit batch jobs. However, when interactivity is really needed alongside more CPU power and/or memory than is reasonable to take from a login node (for example, postprocessing a large output file), an alternative which may work is to run the GUI frontend on a login node while the MPI tasks it launches run on a compute node. This requires using salloc instead of sbatch, for example:

salloc -A nesi99999 -t 30 -n 16 -C avx --mem-per-cpu=2G bash -c 'module load ANSYS; fluent -v3ddp -t$SLURM_NTASKS' 

As with any job, you may have to wait a while before the resource is granted and you can begin, so you might want to use the --mail-type=BEGIN and --mail-user= options.

Hyperthreading

Utilising hyperthreading (i.e. removing the "--hint=nomultithread" sbatch directive and doubling the number of tasks) will give a small speedup on most jobs with fewer than 8 cores, but also doubles the number of ansys_hpc license tokens required.
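
As a sketch (directive values are illustrative, based on the shared-memory example above):

# Hyperthreading disabled, as in the examples on this page
#SBATCH --cpus-per-task 8
#SBATCH --hint          nomultithread

# Hyperthreading enabled: drop the hint and double the logical CPUs (license tokens double accordingly)
#SBATCH --cpus-per-task 16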
