Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 222222

IT Services : High Performance Computing

SLURM


Introduction

Work on Hydra is run primarily through the SLURM resource management system, which is a batch job submission system. This means that you create a shell script which runs your work, add hints telling SLURM how to run it, and then submit the script to the batch job system. SLURM then runs it on one or more of the Hydra nodes on your behalf when resources become available.
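As an illustration of the "shell script with hints" described above, here is a minimal sketch of a batch script. The `#SBATCH` lines are the hints; the job name, time limit and output file name are assumptions chosen for the example, not Hydra defaults.

```shell
#!/bin/bash
# Hypothetical job name, shown in the queue listing.
#SBATCH --job-name=example
# Request one whole node.
#SBATCH --nodes=1
# Wall-clock limit, chosen only for illustration.
#SBATCH --time=00:10:00
# %j expands to the job ID in the output file name.
#SBATCH --output=example-%j.out

# SLURM sets these variables inside a running job; the defaults let the
# script also run outside SLURM for a quick check.
job_id="${SLURM_JOB_ID:-none}"
node_count="${SLURM_JOB_NUM_NODES:-1}"

echo "Job ${job_id} running on ${node_count} node(s), host $(uname -n)"
```

If the script were saved as `example.sh` (a hypothetical file name), it would typically be submitted with `sbatch example.sh`, and `squeue -u $USER` would show its place in the queue.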

The primary unit of allocation is one 'node', i.e. one computer of the system, which has a number of CPUs. This works best with parallel code (either your own MPI or OpenMP code, or packages which are parallel), but high-throughput computing (HTC) jobs or small jobs needing less than one node can be accommodated (see the links below).
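For OpenMP code running within a single node, a job script might look roughly like the sketch below. The CPU count of 4 and the program name are assumptions for illustration only; they are not Hydra's actual cores per node or a real executable.

```shell
#!/bin/bash
# One task on one node, with several CPUs for OpenMP threads.
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Hypothetical CPU count; not Hydra's actual cores per node.
#SBATCH --cpus-per-task=4

# SLURM exports SLURM_CPUS_PER_TASK when --cpus-per-task is set;
# fall back to 1 thread when the script is run outside a job.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Using ${OMP_NUM_THREADS} OpenMP thread(s)"

# ./my_openmp_program   # hypothetical executable, left commented out
```

Tying the thread count to the allocation, rather than hard-coding it, keeps the program from starting more threads than the CPUs SLURM has granted.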

SLURM is powerful and has many options, so please take a look at the job submission concepts below and the overview above, and feel free to ask the team questions. See also the appendix, which covers the job queue lengths and related settings supported on Hydra.