GPUs in Sycamore

Status as of 2/2/2026

There are two 4xH100 GPU nodes available in Sycamore at this time: eight H100s in total, yielding an aggregate of 640 GB of GPU memory and roughly 135k CUDA cores.
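For reference, those aggregates follow directly from the per-node specs listed at the bottom of this page, together with Nvidia's published figure of 16,896 CUDA cores per H100:

2 nodes x 4 GPUs = 8 H100s
8 x 80 GB = 640 GB of GPU memory
8 x 16,896 = 135,168 (roughly 135k) CUDA cores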

Within each node, NVLink enables aggregating up to four H100s for a given job. However, the best use of this equipment is for multi-node GPU runs, and for debugging multi-node GPU runs in preparation for running elsewhere, such as NC Share or national resources (DOE, NSF, etc.).

Sycamore has an InfiniBand high-speed, low-latency network between nodes to provide multi-node GPU capability.

Reach out to our helpdesk if you can take advantage of this multi-node GPU infrastructure; we want to hear from you and help get your code running!

Accessing these H100 GPUs

You must have an account on Sycamore to access these GPUs. In addition, special QOS access is required for the h100_mn (H100 multi-node) partition.
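If you are not sure whether your account already has that access, standard SLURM commands can show the account and QOS associations for your user and the state of the partition. This is only a sketch: the partition name h100_mn comes from this page, while the account and QOS names in the output depend on how your Sycamore access was set up.

# list the account/QOS associations for your user
sacctmgr show assoc where user=$USER format=User,Account,Partition,QOS

# check that the multi-node H100 partition is visible to you
sinfo -p h100_mn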

Submitting jobs

Like all jobs on Sycamore, job submission to the H100 GPUs is handled with SLURM: you construct a SLURM job submission script that requests the appropriate compute resources (e.g., CPUs, RAM, GPUs) in the dedicated partition, then submit it with sbatch. Note the full node specs below: there are a lot of CPU cores and a lot of system RAM available. If you are taking all of the GPUs on a node, do not be shy about requesting plenty of CPU and memory as well (a fuller example that requests them explicitly is sketched later in this section), but leave some for the underlying OS.

As an example of submitting a multi-GPU (in this case, four-GPU) job, create a SLURM script called example.sl using a text editor and enter the following into it:

#!/bin/bash

#SBATCH -n 4
#SBATCH --gpus=4
#SBATCH --partition=h100_mn

# put module commands here
# module purge
# module add etc.

my_gpu_job

Then submit your job using the sbatch command:

sbatch example.sl

Here example.sl is our SLURM job submission script, used to run the executable my_gpu_job (which needs to be on your PATH) on Sycamore; my_gpu_job is your GPU code that uses four GPUs.

So we have created a script, example.sl, that runs four tasks (-n 4) and allocates four GPUs (--gpus=4) in the h100_mn partition (--partition=h100_mn). Note that you can optionally add any module commands after the SBATCH directives.
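Building on that, here is a sketch of a fuller submission script that also requests CPUs, system memory, and a walltime limit, and, since this partition is intended for multi-node work, asks for two nodes with four GPUs each. The flags are standard SLURM options, but the job name, the resource numbers, and my_gpu_job are placeholders to adapt to your own code; whether your application can actually use multiple nodes depends on how it is written (MPI, NCCL, etc.), and if your access requires a specific QOS, add a --qos line as directed by the helpdesk.

#!/bin/bash

#SBATCH --job-name=example_multinode
#SBATCH --partition=h100_mn
#SBATCH --nodes=2                 # two of the 4xH100 nodes
#SBATCH --gpus-per-node=4         # all four GPUs on each node
#SBATCH --ntasks-per-node=4       # one task per GPU
#SBATCH --cpus-per-task=32        # 128 of the 256 cores per node
#SBATCH --mem=1000G               # per-node system RAM, leaving headroom for the OS
#SBATCH --time=02:00:00           # walltime limit

# put module commands here
# module purge
# module add etc.

# srun launches the tasks (one per GPU) across both nodes
srun my_gpu_job

As before, submit it with sbatch and check the resulting job with the monitoring commands below.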

More Sycamore SLURM examples

Monitoring

Monitoring how well your allocated resources match what your jobs actually use is of great benefit to everyone, but especially to your own workloads. With high demand for GPUs, it is essential to monitor batch jobs and verify that they are making reasonable, expected use of the allocated hardware resources.
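As a concrete sketch, the standard SLURM tools can tell you how a job is doing. When you submit, sbatch prints "Submitted batch job <jobid>"; the commands below take that job ID (shown as a placeholder). squeue and sacct are part of SLURM itself, seff is a common add-on that may or may not be installed on Sycamore, and nvidia-smi shows live GPU utilization if you can get a shell on the compute node running your job.

# jobs you currently have queued or running
squeue -u $USER

# elapsed time, allocated resources, and peak memory for a given job
sacct -j <jobid> --format=JobID,JobName,Partition,AllocTRES%50,Elapsed,MaxRSS,State

# CPU and memory efficiency summary for a completed job (if seff is installed)
seff <jobid>

# live GPU utilization, run on the compute node hosting your job
nvidia-smi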

Thank You Lenovo

These nodes are available courtesy of Lenovo via a Free Trial program. They are also available for purchase as patron nodes by faculty, staff, departments, centers, institutes, etc. via the patron program. If interested, reach out to us for a consult.

Per node specifications:

  • 4x Nvidia H100 GPUs (80 GB each), with NVLink
  • 2x AMD “Bergamo” 128-core sockets (256 physical CPU cores)
  • 1.5TB System RAM
  • NDR CX-7 with 2x 800Gbps ports
  • 25 GbE Ethernet ports
  • Liquid Cooling

 

Last Update 2/15/2026 5:32:38 PM