GM4 Cluster Overview
This section describes how the GM4 cluster is organized.
Partitions
The GM4 cluster consists of 29 4-way GPU nodes and 4 CPU-only nodes, for a total of 33 nodes. The nodes are accessible through two partitions, gm4 and gm4-pmext. The gm4-pmext partition includes all 33 nodes and is accessible to PI groups affiliated with the Pritzker School of Molecular Engineering (PME). The gm4 partition contains a 23-node subset of the 33 nodes and is accessible to non-PME GM4 participants.
Partition Name | Users | Nodes | Node List |
---|---|---|---|
gm4-pmext | PME users | 29 GPU nodes and 4 CPU-only nodes | midway2-[0631-0663] |
gm4 | non-PME GM4 users | 19 GPU nodes and 4 CPU-only nodes | midway2-[0641-0663] |
Non-PME users submit jobs to the gm4 partition, whereas PME users submit jobs to the gm4-pmext partition.
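The node counts in the partition table can be checked by expanding the node list expressions. The following is a minimal Python sketch, assuming the lists use only the simple `prefix[start-end]` form shown above; `expand_nodes` and `partition_for` are hypothetical helper names, not part of any GM4 tooling. On a Slurm system, `scontrol show hostnames` performs the same expansion.

```python
import re

# Node lists copied from the partition table above.
PARTITION_NODES = {
    "gm4-pmext": "midway2-[0631-0663]",
    "gm4": "midway2-[0641-0663]",
}


def expand_nodes(expr: str) -> list[str]:
    """Expand a simple 'prefix[start-end]' range into individual hostnames."""
    match = re.fullmatch(r"(.+)\[(\d+)-(\d+)\]", expr)
    if match is None:
        raise ValueError(f"unsupported node expression: {expr!r}")
    prefix, start, end = match.groups()
    width = len(start)  # preserve zero padding, e.g. 0631
    return [f"{prefix}{n:0{width}d}" for n in range(int(start), int(end) + 1)]


def partition_for(is_pme: bool) -> str:
    """Hypothetical helper: choose the partition from PME affiliation."""
    return "gm4-pmext" if is_pme else "gm4"


pmext_nodes = expand_nodes(PARTITION_NODES["gm4-pmext"])
gm4_nodes = expand_nodes(PARTITION_NODES["gm4"])

print(len(pmext_nodes), len(gm4_nodes))     # 33 23
print(set(gm4_nodes) <= set(pmext_nodes))   # True: gm4 is a subset of gm4-pmext
```

The subset check reflects the description above: every node in gm4 is also reachable through gm4-pmext.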
Slurm Quality of Service (QOS)
Fair-share use of the GM4 resources is managed through the Slurm scheduler's Quality of Service (QOS) settings. The QOS defines the type of resource, GPU or CPU-only, that a job can request and whether the job is a production or debug run. Three QOS are available, and their settings are the same for the gm4 and gm4-pmext partitions. The resource settings for each QOS are defined as follows:
QOS Name: gm4 (default if no QOS is specified)
Max Wall Time | QOS Priority | Max Running Jobs (per user) | Max Jobs Submit (per user) | Max CPUs (per user) | Max GPUs (per user) | Max Jobs Submit (per account) | Max CPUs (per account) | Max GPUs (per account) |
---|---|---|---|---|---|---|---|---|
1-12:00:00 | 10000 | 28 | 28 | 320 | 32 | 48 | 320 | 32 |
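Slurm writes time limits in a days-hours:minutes:seconds form, so a Max Wall Time of 1-12:00:00 means one day plus twelve hours. The following Python sketch converts such a value to hours; `parse_slurm_time` is a hypothetical helper that handles only the `D-HH:MM:SS` form and none of Slurm's other accepted time formats.

```python
from datetime import timedelta


def parse_slurm_time(value: str) -> timedelta:
    """Parse a 'D-HH:MM:SS' limit; other Slurm time forms are not handled."""
    days, _, clock = value.partition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(days=int(days), hours=hours, minutes=minutes, seconds=seconds)


limit = parse_slurm_time("1-12:00:00")
print(limit.total_seconds() / 3600)  # 36.0 hours
```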
QOS Name: gm4-cpu
Max Wall Time | QOS Priority | Max Running Jobs (per user) | Max Jobs Submit (per user) | Max CPUs (per user) | Max GPUs (per user) | Max Jobs Submit (per account) | Max CPUs (per account) | Max GPUs (per account) |
---|---|---|---|---|---|---|---|---|
1-12:00:00 | 10000 | 28 | 28 | 320 | N/A | 56 | 320 | N/A |
QOS Name: gm4-debug
Max Wall Time | QOS Priority | Max Running Jobs (per user) | Max Jobs Submit (per user) | Max CPUs (per user) | Max GPUs (per user) | Max Jobs Submit (per account) | Max CPUs (per account) | Max GPUs (per account) |
---|---|---|---|---|---|---|---|---|
0-30:00:00 | 100000 | 1 | 1 | 40 | 4 | 2 | 80 | 8 |
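As an illustration of how the partition and QOS fit together in a job submission, the Python sketch below checks a single request against the per-user CPU and GPU limits from the tables above and renders the matching `#SBATCH` directives. It rests on several assumptions: `check_request` and `sbatch_header` are hypothetical helpers, "N/A" is read as "GPU requests are not allowed", and the check looks at one job in isolation, ignoring other running jobs and the per-account limits. The scheduler remains the authority on what is accepted.

```python
# Per-user limits copied from the QOS tables above; None marks "N/A".
QOS_LIMITS = {
    "gm4": {"max_cpus": 320, "max_gpus": 32},
    "gm4-cpu": {"max_cpus": 320, "max_gpus": None},
    "gm4-debug": {"max_cpus": 40, "max_gpus": 4},
}


def check_request(qos: str, cpus: int, gpus: int = 0) -> list[str]:
    """Return a list of problems; an empty list means the request fits."""
    limits = QOS_LIMITS[qos]
    problems = []
    if cpus > limits["max_cpus"]:
        problems.append(f"{cpus} CPUs exceeds the {qos} limit of {limits['max_cpus']}")
    if limits["max_gpus"] is None:
        # Assumption: "N/A" means the QOS is CPU-only.
        if gpus > 0:
            problems.append(f"{qos} does not allow GPU requests")
    elif gpus > limits["max_gpus"]:
        problems.append(f"{gpus} GPUs exceeds the {qos} limit of {limits['max_gpus']}")
    return problems


def sbatch_header(partition: str, qos: str, cpus: int, gpus: int, time: str) -> str:
    """Render #SBATCH directives for a single-node job (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --qos={qos}",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --time={time}",
    ]
    if gpus:
        lines.append(f"#SBATCH --gres=gpu:{gpus}")
    return "\n".join(lines)


# A non-PME user requesting a short single-GPU debug run.
print(check_request("gm4-debug", cpus=8, gpus=1))  # []
print(sbatch_header("gm4", "gm4-debug", cpus=8, gpus=1, time="00:10:00"))
```

The same directives can also be passed on the command line, for example `sbatch --partition=gm4 --qos=gm4-debug job.sh`, where `job.sh` is a placeholder script name.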