SLURM Partitions

Partitions are job queues. Each partition has its own set of rules/policies and its own set of compute nodes on which its jobs run.

The partitions defined on the cluster, with their access rights and resource definitions, can be listed with the sinfo command:

> sinfo -o "%10D %20F %P"

With these format options the output is more readable than the default: for each partition it shows the total number of nodes (%D), the number of nodes by state in the format "Allocated/Idle/Other/Total" (%F), and the partition name (%P).
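The "Allocated/Idle/Other/Total" field is easy to split in a script. The following is a minimal sketch, assuming the output was produced with sinfo -h -o "%P %F" (no header, partition name first); the function name idle_by_partition is hypothetical and the node counts in the sample lines are invented for illustration.

```shell
# Read lines of the form "<partition> <alloc>/<idle>/<other>/<total>" on stdin
# and print "<partition> <idle>" for each of them.
idle_by_partition() {
    awk '{ split($2, s, "/"); print $1, s[2] }'
}

# Sample input in the format of: sinfo -h -o "%P %F"
# (counts are made up; sinfo marks the default partition with "*")
printf '%s\n' 'dev 1/2/0/3' 'prod* 10/4/1/15' | idle_by_partition
```

On the cluster the same function could be fed directly from the scheduler, for example with sinfo -h -o "%P %F" | idle_by_partition.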

Partition Table

The following table lists the main features and limits of each partition.

Note: "cpu" refers to a logical CPU (1 hardware thread, HT).



| Partition | Job QOS | # cores / # GPUs per job | max walltime | max running jobs per user / max n. of cores/nodes/GPUs per user | priority | notes |
|---|---|---|---|---|---|---|
| dev | normal | max = 8 CPUs | 04:00:00 | 4 GPU; max mem = 40GB | | |
| students-dev | normal | max = 8 CPUs | 02:00:00 | 4 GPU per account; max mem = 20GB | | |
| prod | normal | | 24:00:00 | max 25 jobs per user; 14 GPU | | |
| | special | | > 24:00:00 | max 25 jobs per user; 128 CPUs/600 GB/0 GPUs | 10 | reserved for non-interruptible jobs. Request to |
| | special-dbg | | 04:00:00 | max 25 jobs per user; 32 CPUs/128 GB/0 GPUs | 40 | reserved for debugging jobs with qos special. Request to |
| | lowprio | | 24:00:00 | max 25 jobs per user; 14 GPU | 5 | active projects/users with exhausted budget. Request to |
| students-prod | normal | | 24:00:00 | 4 GPU per account | 1 | runs on a subset of 12 GPUs |
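
As an illustration of how these limits translate into a job request, the sketch below prints a batch script for the dev partition that stays within the limits listed for it (QOS normal, 8 CPUs per job, 04:00:00 walltime). The helper name make_dev_job, the job name dev-test and the srun hostname payload are hypothetical placeholders; the cluster may also require options not covered on this page (for example an account or a GPU request), so treat it as a starting point rather than a reference script.

```shell
# Print a batch script for the "dev" partition.
# The limits are taken from the table: QOS normal, max 8 CPUs, 04:00:00 walltime.
# Job name and payload are placeholders chosen for this example.
make_dev_job() {
    cat <<'EOF'
#!/bin/bash
#SBATCH --job-name=dev-test
#SBATCH --partition=dev
#SBATCH --qos=normal
#SBATCH --cpus-per-task=8
#SBATCH --time=04:00:00

srun hostname
EOF
}

# Show the generated script; on the cluster it could be saved and submitted
# with: make_dev_job > dev_job.sh && sbatch dev_job.sh
make_dev_job
```

For the other QOS values the same pattern applies with --qos changed (for example --qos=special-dbg), keeping the requested time and resources within the corresponding row of the table.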