DOGWOOD SLURM EXAMPLES
This guide will provide you with enough basic information to run straightforward jobs on Dogwood. Please use the table of contents to access more detailed information, or email research@unc.edu with questions.
Table of Contents
The Submission Script
Inline Submission
Example: Submitting mvapich2_2.3rc1 without mpirun
Interactive Debugging Example
Other SLURM Information
These are just examples to give you an idea of how to submit jobs on Dogwood for some commonly used applications. You’ll need to specify SBATCH options as appropriate for your job and application.
To connect to <onyen>@dogwood.unc.edu, see: Getting Logged on
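For example, from a terminal on your own machine (a minimal sketch; replace <onyen> with your own ONYEN):
ssh <onyen>@dogwood.unc.edu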
Notable Directories
Your home directory is: /nas/longleaf/home/<onyen>
Your scratch space is: /21dayscratch/scr/o/n/<onyen>
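For example, to work out of your scratch space before submitting a job (a hypothetical workflow; substitute the letters and <onyen> in the path as shown above):
cd /21dayscratch/scr/o/n/<onyen>
mkdir -p first_job && cd first_job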
Dogwood uses SLURM to schedule and submit jobs. Below are the most common methods to submit jobs to Dogwood. Always submit your compute jobs via SLURM. Never run compute jobs from the $ prompt on the login node (the node where you are logged in).
The Submission Script
Create a bash script using your favorite editor. If you don’t have a favorite editor, use nano (for now).
nano example.sh
The script contains job submission options followed by application commands. Please enter the following into your script:
#!/bin/bash
#SBATCH --job-name=first_slurm_job
#SBATCH -N 2
#SBATCH -p 528_queue
#SBATCH --ntasks-per-node=44
#SBATCH --time=5:00:00 # 5 hours; general format is days-hh:mm:ss
mpirun my_parallel_MPI_job
Save your file and exit nano. Submit your job using the sbatch command:
sbatch example.sh
You have created a script, example.sh, that asks for 2 nodes each running 44 tasks, for up to 5 hours. It will name the job “first_slurm_job” and run the MPI executable my_parallel_MPI_job using the mpirun command.
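When the job is accepted, sbatch prints the assigned job ID, for example (the number shown here is made up):
Submitted batch job 1234567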
Inline Submission
If you would like to submit your job at the command line without creating a script, please try the following:
$ sbatch -t 120 -p 528_queue -N 2 --ntasks-per-node=32 -o hello.out.%j --wrap="mpirun my_parallel_MPI_job"
This requests 32 tasks running on each of two nodes. We left out the --job-name option and used the shorthand -t option for time, specified in minutes. We also specified our own output file; the SLURM job ID will be substituted for the “%j” in the output file name.
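For example, if SLURM assigns job ID 1234567 (a made-up number), the output will be written to a file named hello.out.1234567 in the submission directory.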
Note: OpenMPI throws an error on this command (probably due to a bug), but you can work around it by adding -oversubscribe after mpirun. It will then do the right thing, which you can verify by also adding --report-bindings.
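Putting that together, the inline submission with the workaround looks like the following (a sketch based on the command above):
$ sbatch -t 120 -p 528_queue -N 2 --ntasks-per-node=32 -o hello.out.%j --wrap="mpirun -oversubscribe --report-bindings my_parallel_MPI_job"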
If you see the following error message, then you used the sbatch command without the --wrap option or a valid shell script. Use the script submission method or the inline submission method as shown above.
/var/spool/slurmd/job1535767/slurm_script: line 445: /var/spool/slurmd/bin/util/arch.sh: No such file or directory
Example: Submitting mvapich2_2.3rc1 without mpirun
The build of mvapich2_2.3rc1 does not include an mpirun command but you can still run these MPI jobs with one small modification. You can use srun and specify the number of processes as follows.
#!/bin/bash
#SBATCH --job-name=first_slurm_job
#SBATCH -N 2
#SBATCH -p 528_queue
#SBATCH --ntasks-per-node=44
#SBATCH --time=5:00:00 # 5 hours; general format is days-hh:mm:ss
srun -n $SLURM_NPROCS my_parallel_MPI_job
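With the options above, SLURM sets $SLURM_NPROCS to 88 (2 nodes x 44 tasks per node), so the srun line is equivalent to:
srun -n 88 my_parallel_MPI_job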
Interactive Debugging Example
$ module add matlab
$ srun -n 1 -p debug_queue --mem=5g --x11=first matlab
This requests 1 task with 5 GB of memory on the debug_queue partition, with X11 forwarding so the MATLAB window displays on your local machine.
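If you want an interactive shell on a compute node instead of the MATLAB GUI, a similar srun line works (a sketch using the same partition and memory options; --pty gives you a terminal on the node):
$ srun -n 1 -p debug_queue --mem=5g --pty /bin/bash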
Other SLURM Information
Partition (Queue) Information
sinfo
squeue
squeue -u <onyen>
squeue -u <onyen> -l
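For example, to restrict the listing to a single partition (here the 528_queue used in the examples above):
sinfo -p 528_queue
squeue -p 528_queue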
Dogwood Partitions and User Limits
Job details
scontrol show jobid <job_id_number>
Cancel Job
scancel <job_id_number>
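To cancel all of your own queued and running jobs at once:
scancel -u <onyen>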
Details of Completed Job
Note that -j below has a single hyphen ‘-’, and --format has two hyphens ‘--’.
sacct -j <jobid> --format=JobID,JobName,Partition,ReqMem,MaxRSS,NTasks,AllocCPUS,Elapsed,State
scontrol show job <jobid>
See man sacct and man scontrol for details.
Last Update 11/21/2024 1:36:46 AM