GETTING STARTED ON LONGLEAF

Table of Contents

Introduction

System Information

Getting an Account

Logging In

Main Directory Spaces

Applications Environment

Job Submission

Longleaf Partitions

Additional Help

Introduction

Longleaf is a high-throughput computing cluster available to researchers, students, faculty, and staff across the University. With over 300 nodes, dedicated GPU partitions, and highly performant storage, it is optimized for memory- and I/O-intensive, loosely coupled workloads, with an emphasis on aggregate job throughput over individual job performance. New hardware is added to Longleaf on an annual basis. Longleaf is particularly suited for serial workloads consisting of many jobs, each requiring a single compute host.

Longleaf users can also access the cluster through Open OnDemand, a web-based portal offering job submission, file system browsing, and many preconfigured applications.

High-Level System Information

  • Operating System:
    • Red Hat Enterprise Linux Server release 8
  • Resource management:
    • Job submissions are handled by the Slurm batch processing scheduler

The Longleaf cluster is composed of the following nodes:


Node Count   CPU Count   Memory (GB)   GPU Gres
13           384         1500
4            256         991
5            256         927
3            256         734
3            256         732
6            256         500
8            128         991           gpu:nvidia_a100-pcie-40gb:3
1            128         754
5            128         2970
27           72          732
1            64          991           gpu:nvidia_l40s:8
7            56          488           gpu:nvidia_l40:4
40           56          230
99           48          355
19           48          230
3            40          487           gpu:tesla_v100-sxm2-16gb:8
16           40          235           gpu:tesla_v100-sxm2-16gb:4
24           24          235
7            24          112
3            8           41            gpu:nvidia_geforce_gtx_1080:8

Open OnDemand Portal

Open OnDemand is a web portal that provides a terminal, a file browser, and graphical interfaces for certain applications on the Longleaf cluster. See here for further information.


GPU Hardware and Partitions

Please refer to our page detailing use of GPUs and best practices.



Getting an Account

Follow the steps listed on the Request a Cluster Account page and select Longleaf Cluster under subscription type. You will receive an email notification once your account has been created.

Logging In

Linux:

Linux users can use ssh from within their Terminal application to connect to Longleaf.

If you wish to enable X11 forwarding, use the "-X" ssh option. Be sure to use your UNC ONYEN and password for the login:

ssh -X <onyen>@longleaf.unc.edu

Windows:

Windows users should download MobaXterm (Home Edition). Then use the Session icon to create a Longleaf SSH session using longleaf.unc.edu for “Remote host” and your ONYEN for the “username” (Port should be left at 22).

Mac:

Mac users can use ssh from within their Terminal application to connect to Longleaf. Be sure to use your UNC ONYEN and password for the login:

ssh -X <onyen>@longleaf.unc.edu

To enable X11 forwarding, Mac users will need to download, install, and run XQuartz on their local machine in addition to using the "-X" ssh option. Furthermore, in many instances, for X11 forwarding to work properly, Mac users need to use the Terminal application that comes with XQuartz instead of the default Mac terminal application.

A successful login takes you to "login node" resources that have been set aside for user access. The login node is where you will edit your code, execute basic UNIX commands, and submit your jobs to the SLURM job scheduler.

DO NOT RUN YOUR CODE OR RESEARCH APPLICATIONS DIRECTLY ON THE LOGIN NODE. THESE MUST BE SUBMITTED TO SLURM!

In order to connect to Longleaf from an off-campus location, a connection to the campus network through a VPN client is required.

Main Directory Spaces

1. NAS home directory space

Quota (per user): 50 GB soft limit, 75 GB hard limit.
Backed up: Yes, via snapshots.

Your home directory will be in /nas/longleaf/home/<onyen>.

If your home directory ever becomes full (i.e., reaches the hard quota limit), it can affect your overall ability to use the cluster, so you will want to monitor and manage your home directory usage.
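For example, standard commands such as du can be used from the login node to see what is consuming space (a general sketch; dedicated quota-reporting tools may also be available):

# Report the total size of your home directory (may take a moment)
du -sh ~

# Show the largest top-level items in your home directory
du -sh ~/* | sort -h | tail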

We recommend not running any heavy I/O workloads out of your home directory. These types of workloads should be run on the /users, /work, or /proj file systems discussed below.

2. /users storage

Quota (per user): 10 TB.
Backed up: No.

Your users directory will be in /users/<o>/<n>/<onyen>. It can be thought of as a capacity expansion to your home directory. This high-capacity storage is provided by the same hardware as /proj.

The following rules of thumb should be kept in mind regarding your /users storage:

  • OK to compute against; however, as I/O increases, consider copying or moving data to your /work storage for processing (see the example after this list).
  • OK for holding inactive data sets, like a near-line archive.
  • If a meaningful amount of cold data accrues, it can be packaged and MOVED to a cloud archive, providing more working space for your warm data.
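As a minimal sketch of staging data onto /work for active processing and cleaning up afterwards (the directory name my_dataset is an illustrative placeholder; the path pattern follows the /users and /work layouts described in this section):

# Stage an input data set from /users onto /work for active processing
rsync -a /users/<o>/<n>/<onyen>/my_dataset/ /work/users/<o>/<n>/<onyen>/my_dataset/

# When processing is done, sync results back and free up the /work copy
rsync -a /work/users/<o>/<n>/<onyen>/my_dataset/ /users/<o>/<n>/<onyen>/my_dataset/
rm -rf /work/users/<o>/<n>/<onyen>/my_dataset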

Note that /users is not intended to be used for team-oriented shared storage (the /proj file system serves this purpose); it is intended to be your personal storage location.

3. /work storage

Quota (per user): 10 TB.
Backed up: No.

Your /work directory will be in /work/users/<o>/<n>/<onyen>.

/work is built for high-throughput and data-intensive computing and is intended for data that is actively being computed on, accessed, and processed. Please note that /work is NOT intended to be a personal storage location. For inactive data, please move it to /users or /proj; contact us if you have no place to move inactive data off of /work or insufficient quota to do so.

File systems such as /work in an academic research setting typically employ a file deletion policy, auto-deleting files of a certain age. At this time, there are no time limits for files on /work. We rely upon the community to maintain standards of appropriate and reasonable use of the various storage tiers.

4. /proj storage

Quota: Varies by need.
Backed up: Not by default.

/proj space is available to PIs (only) upon request and is intended to be used as team-oriented shared storage. The amount of /proj space initially given to a PI varies according to their needs. There is no file deletion policy for /proj space, but users should take care in managing their use of /proj space to stay under assigned quotas.

For further information and to make a request for /proj space please email research@unc.edu.

Applications Environment

The application environment on Longleaf is presented as modules using lmod. Please refer to the Help document on modules for information and examples of using module commands.

Modules are essentially software installations for general use on the cluster. Therefore, you will primarily use module commands to add and remove applications from your Longleaf environment as needed for running jobs. It’s recommended that you keep your module environment as sparse as possible.

Applications used by many groups across campus have already been installed and made available on Longleaf. To see the full list of applications currently available run module avail.
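For example, a typical session lists what is available, loads what is needed, and confirms what is loaded (the application name below is a hypothetical placeholder; use a name that appears in your module avail output):

module avail                 # list all applications available on Longleaf
module avail matlab          # search for modules whose names match a string (placeholder name)
module load matlab           # add an application to your environment (placeholder name)
module list                  # show the modules currently loaded
module rm matlab             # remove a module you no longer need
module save                  # optionally save the loaded set as your default collection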

Users are able to install their own applications and create their own modules.

Job Submission

Job submission is handled using SLURM. SLURM is an open-source, fault-tolerant, and highly scalable cluster management and job scheduling system used on Linux compute clusters around the world. As a cluster workload manager, SLURM has three main functions:

  • allocating access to compute resources for users for some duration of time so they can run their code
  • providing a framework for starting, executing, and monitoring work on the allocated resources
  • managing contention for resources by maintaining a queue of pending jobs

On Longleaf, in order to run your code (i.e., a "job"), you'll need to submit a job request to SLURM. As part of this request, you'll use the appropriate SLURM flags to request resources for your job: the number of CPUs, the amount of memory, a runtime limit, and so on, matching what your code needs to run to completion. Once submitted, your job goes into a queue where it waits to be dispatched by SLURM to run somewhere on the Longleaf cluster. SLURM uses a fair-share algorithm to schedule user jobs on Longleaf.
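As a minimal sketch of a batch job (the script name, module, resource values, and command below are illustrative placeholders; the SLURM guide referenced below covers the Longleaf-specific details), you describe the job in a script and hand it to SLURM with sbatch:

#!/bin/bash
#SBATCH --job-name=my_job          # illustrative job name
#SBATCH --partition=general        # general is the default CPU-only partition
#SBATCH --ntasks=1                 # a serial job uses a single task
#SBATCH --cpus-per-task=1          # CPUs for that task
#SBATCH --mem=4g                   # memory request; adjust to what your code needs
#SBATCH --time=02:00:00            # runtime limit (hh:mm:ss)

# Load the application your code needs, then run it
# (the module name and command below are placeholders)
module load r
Rscript my_analysis.R

Saving the lines above as, for example, my_job.sl and running sbatch my_job.sl submits the job; squeue -u <onyen> then shows its status in the queue.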

We first recommend looking at our SLURM guide for an overview of SLURM. After gaining a basic understanding of SLURM, you can refer to this doc for specific examples of how to submit SLURM jobs on Longleaf using the appropriate syntax.

Longleaf Partitions

A SLURM partition is a collection of nodes. When you submit your job, it runs in a SLURM partition. You can optionally specify in your job submission which partition your job should use; if you don't specify one, your job runs in the default SLURM partition.

The partitions on Longleaf can be thought of as falling into two types: partitions that have GPUs and can run GPU jobs (i.e., GPU partitions), and partitions that do not have GPUs (i.e., CPU-only partitions) and therefore cannot run GPU jobs. Unless otherwise indicated, the maximum job runtime limit for a partition is 11 days.

If your job does not use a GPU you should submit your job to a CPU-only partition. Longleaf has the following CPU-only partitions available to users for job submission:

1. general

  • Used for standard CPU-only jobs that do not have special requirements.
  • Suitable for most jobs.
  • general is the default partition.

2. bigmem

  • Used only for jobs with extreme memory requirements.
  • Email research@unc.edu to request access to this partition.

3. datamover

  • Used for jobs that primarily involve copying large amounts of data between Longleaf file systems, or for jobs that require extensive access to the web (e.g., web scraping, downloading large numbers of files from the web, etc.).

4. interact

  • Used for jobs that require an interactive session for running a GUI, debugging code, profiling code, etc. (see the example after this list).
  • Maximum job runtime limit is 8 hours.

5. snp

  • Used for parallel jobs that are not large enough to warrant multi-node processing on Dogwood, yet require a large enough share of a single node's cores/memory that it is worthwhile to schedule a full node.
  • Email research@unc.edu to request access to this partition.
  • Maximum job runtime limit is 6 days.
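As a hedged sketch of requesting an interactive session on the interact partition (the CPU, memory, and time values are illustrative; srun's --pty option attaches your terminal to the job):

# Request a 2-hour interactive shell on one node of the interact partition
# (adjust the CPU, memory, and time values to your needs)
srun -p interact -N 1 -n 1 --mem=4g -t 2:00:00 --pty /bin/bash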

Longleaf has the following GPU partitions available to users for job submission:

1. gpu

  • Used to access GTX 1080 GPUs.

2. volta-gpu

  • Used to access V100 GPUs.

3. a100-gpu

  • Used to access A100 GPUs.
  • Maximum job runtime limit is 6 days.

4. l40-gpu

  • Used to access L40/L40S GPUs.
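A GPU job must also request GPU resources with SLURM's --gres flag in addition to naming a GPU partition. The following is a minimal sketch only: the partition, GPU count, resource values, module name, and script name are illustrative placeholders, and some GPU partitions may require an access request or an additional QOS setting (email research@unc.edu if you are unsure):

#!/bin/bash
#SBATCH --job-name=my_gpu_job       # illustrative job name
#SBATCH --partition=volta-gpu       # pick the partition that matches the GPU type you need
#SBATCH --gres=gpu:1                # request one GPU on the assigned node
#SBATCH --ntasks=1
#SBATCH --mem=16g                   # illustrative memory request
#SBATCH --time=12:00:00             # runtime limit (hh:mm:ss)

# Load your application's module(s) and run it on the allocated GPU
# (the module name and command below are placeholders)
module load cuda
python train_model.py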

Additional Help

For further assistance email research@unc.edu with any questions or open a support ticket.

 

Last Update 2/27/2025 8:54:49 PM