Running Jobs on the GM4 Cluster

This section of the documentation describes how to run jobs on the GM4 cluster GPU compute nodes. In the examples below, PME users should replace the gm4 partition name with gm4-pmext.

GPU jobs

To utilize GPUs, you must include the generic resource scheduling (gres) option in your resource request; otherwise the job will be rejected. If no QOS is specified, it defaults to gm4, but it is recommended that a QOS be specified explicitly. If no wall time is specified in the resource request, it defaults to the maximum wall time limit of 36 hours. An example sbatch resource request script for a multi-threaded (10 threads), single-GPU job on the gm4 partition is shown below.

#!/bin/bash
#SBATCH --time=1-12:00:00
#SBATCH --partition=gm4
#SBATCH --nodes=1          # Number of nodes
#SBATCH --ntasks=1         # Number of tasks
#SBATCH --cpus-per-task=10 # Number of threads per task
#SBATCH --gres=gpu:1       # Number of GPUs
#SBATCH --qos=gm4          # GPU QOS 

#
# SET NUMBER OF THREADS 
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

#
# GPU DEVICE IDS
echo GPU ID: $CUDA_VISIBLE_DEVICES

# LOAD MODULES (e.g. python) 
module load Anaconda3/2018.12

# EXECUTE JOB
python myjob.py

#
# EOF
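Inside a running job, Slurm sets environment variables such as SLURM_CPUS_PER_TASK and CUDA_VISIBLE_DEVICES automatically. The sketch below simulates those variables with illustrative values to show what the thread and GPU setup lines in the script above actually do; in a real job you should not export the SLURM_* or CUDA_* variables yourself.

```shell
#!/bin/bash
# Illustrative values: in a real job, Slurm sets these automatically
# from --cpus-per-task=10 and --gres=gpu:1
export SLURM_CPUS_PER_TASK=10
export CUDA_VISIBLE_DEVICES=0

# The same export the sbatch script above performs
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

echo "GPU ID: $CUDA_VISIBLE_DEVICES"    # GPU ID: 0
echo "Threads: $OMP_NUM_THREADS"        # Threads: 10
```

Setting OMP_NUM_THREADS from SLURM_CPUS_PER_TASK keeps the thread count in sync with the resource request, so changing --cpus-per-task requires no edit to the job body.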

CPU-only Jobs

For CPU-only jobs, the gm4-cpu QOS should be specified when requesting resources from the job scheduler. If no wall time is specified in the resource request, it defaults to the maximum wall time limit of 36 hours. An example sbatch resource request script for a 2-node, 80-core, CPU-only MPI parallel job on the gm4 partition is shown below.

#!/bin/bash
#SBATCH --time=1-12:00:00 
#SBATCH --partition=gm4
#SBATCH --nodes=2              # Number of nodes
#SBATCH --ntasks-per-node=40   # Number of tasks per node
#SBATCH --cpus-per-task=1      # Number of threads per task
#SBATCH --qos=gm4-cpu          # CPU-only QOS 

#
# SET NUM TASKS 
NTASKS=$(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES))

#
# SET NUMBER OF THREADS 
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# LOAD MODULES (e.g. intelmpi)
module load intelmpi/2018.2.199+intel-18.0

# EXECUTE JOB
mpirun -np $NTASKS myjob.x

#
# EOF
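The NTASKS arithmetic in the script multiplies tasks per node by the node count to get the total number of MPI ranks passed to mpirun. With the request above (40 tasks per node on 2 nodes) it works out as follows; the variable values here are illustrative stand-ins for what Slurm exports inside the job:

```shell
#!/bin/bash
# Illustrative values: Slurm sets these from --ntasks-per-node=40 --nodes=2
SLURM_NTASKS_PER_NODE=40
SLURM_JOB_NUM_NODES=2

# Total number of MPI ranks, as computed in the sbatch script above
NTASKS=$(($SLURM_NTASKS_PER_NODE * $SLURM_JOB_NUM_NODES))
echo $NTASKS    # 80
```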

Debug Jobs (both GPU and CPU-only)

For debugging purposes, there is a debug QOS (gm4-debug) that can be used with either GPU or CPU-only jobs. The maximum wall time for debug jobs is 30 minutes. You must specify a wall time of 30 minutes or less, otherwise the job will be rejected. An example 1-node, 2-core, 2-GPU debug job sbatch submission script is shown below.

#!/bin/bash
#SBATCH --time=00:30:00 
#SBATCH --partition=gm4
#SBATCH --nodes=1              # Number of nodes
#SBATCH --ntasks-per-node=2    # Number of tasks per node
#SBATCH --cpus-per-task=1      # Number of threads per task
#SBATCH --gres=gpu:2           # Number of GPUs
#SBATCH --qos=gm4-debug        # Debug QOS 

#
# SET NUMBER OF THREADS 
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# LOAD MODULES (e.g. Anaconda Python)
module load Anaconda3/2018.12

# EXECUTE JOB
python myjob.py

#
# EOF
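Since jobs requesting more than 30 minutes are rejected on gm4-debug, it can help to sanity-check the wall time before submitting. The snippet below is a hypothetical pre-submission check (not a cluster utility) that parses an HH:MM:SS wall time string and compares it against the 30-minute limit:

```shell
#!/bin/bash
# Hypothetical pre-submission check (illustrative only): parse an
# HH:MM:SS wall time and compare against the 30-minute debug limit.
requested="00:30:00"

hours=${requested%%:*}          # field before the first colon
rest=${requested#*:}
mins=${rest%%:*}                # field between the colons
total_minutes=$((10#$hours * 60 + 10#$mins))

if [ "$total_minutes" -le 30 ]; then
  echo "OK for gm4-debug"       # prints for 00:30:00
else
  echo "exceeds the 30-minute gm4-debug limit"
fi
```

The 10# prefix forces base-10 arithmetic so zero-padded fields such as 08 are not misread as octal.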