SLURM Partitions
Latest revision as of 21:38, 5 February 2023
Partitions are work queues, each with its own set of rules/policies and a set of compute nodes on which jobs run.
A list of partitions defined on the cluster, with access rights and resource definitions, can be displayed with the command <code>sinfo</code>:
 > sinfo -o "%10D %20F %P"
This format string produces a more readable output showing, for each partition, the total number of nodes and the number of nodes by state in the format "Allocated/Idle/Other/Total".
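The "Allocated/Idle/Other/Total" field can also be split programmatically. A minimal sketch with <code>awk</code> is shown below; note that the sample output is hypothetical (hard-coded here, not taken from the live cluster), so only the field layout matches what <code>sinfo -o "%10D %20F %P"</code> prints:

```shell
#!/bin/sh
# Hypothetical sample of `sinfo -o "%10D %20F %P"` output.
# Column 2 has the form "Allocated/Idle/Other/Total".
sample='NODES      NODES(A/I/O/T)       PARTITION
4          2/1/0/4              dev
10         7/2/1/10             prod*'

# Print, for each partition, how many of its nodes are idle.
result=$(printf '%s\n' "$sample" | awk 'NR > 1 {
    split($2, s, "/")      # s[1]=allocated s[2]=idle s[3]=other s[4]=total
    sub(/\*$/, "", $3)     # a trailing "*" marks the default partition
    printf "%s: %s idle of %s\n", $3, s[2], s[4]
}')
printf '%s\n' "$result"
```

Running this prints one line per partition, e.g. <code>dev: 1 idle of 4</code>.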
=== Partition Table ===
The following table lists the main features and limits imposed on each partition.
Note: '''cpu''' refers to a logical CPU (1 hyper-thread).
{| class="wikitable"
! SLURM partition !! Job QOS !! # cores / # GPU per job !! max walltime !! max running jobs per user /<br />max n. of cores/nodes/GPUs per user !! priority !! notes
|-
| dev || normal || max = 8 CPUs || 04:00:00 || 4 GPU<br />max mem = 40GB || 40 ||
|-
| students-dev || normal || max = 8 CPUs || 02:00:00 || 4 GPU per account<br />max mem = 20GB || 20 ||
|-
| rowspan="4" | prod || normal || || 24:00:00 || max 25 jobs per user<br />14 GPU || 10 ||
|-
| special || || > 24:00:00 || max 25 jobs per user<br />128 CPUs / 600 GB / 0 GPUs || 10 || reserved for non-interruptible jobs. Request to aimagelab-srv-support@unimore.it
|-
| special-dbg || || 4:00:00 || max 25 jobs per user<br />32 CPUs / 128 GB / 0 GPUs || 40 || reserved for debugging jobs with QOS special. Request to aimagelab-srv-support@unimore.it
|-
| lowprio || || 24:00:00 || max 25 jobs per user<br />14 GPU || 5 || active projects/users with exhausted budget. Request to aimagelab-srv-support@unimore.it
|-
| students-prod || normal || || 24:00:00 || 4 GPU per account || 1 || runs on a subset of 12 GPUs
|}
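To put the table to use, a job targets a partition and QOS via <code>#SBATCH</code> directives. The sketch below is illustrative only: the job name, resource numbers, and executable are hypothetical, and only the partition/QOS names and the walltime limit come from the table above.

```shell
#!/bin/bash
#SBATCH --job-name=example      # hypothetical job name
#SBATCH --partition=prod        # partition from the table above
#SBATCH --qos=normal            # QOS from the table above
#SBATCH --gres=gpu:1            # request 1 GPU (within the 14-GPU per-user cap)
#SBATCH --time=24:00:00         # must not exceed the partition's max walltime

srun ./my_experiment            # hypothetical executable
```

The script is submitted with <code>sbatch script.sh</code>; SLURM rejects it if the requested limits exceed those of the chosen partition/QOS.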