A variety of storage systems with varying characteristics make up the data platform available to the ITS-RC community. As datasets increase in size, complexity, and/or sensitivity, more attention is required to ensure the best match of resources to the tasks at hand.

/proj

  • High-capacity storage
  • OK to compute against; as IO increases, however, /work becomes the better option
  • OK to hold inactive data sets like a near-line archive
  • If you have 30 TB or more of truly inactive data, we can assist in moving it to cloud-based cold archive, achieving greater data durability and lower storage costs
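To gauge whether a directory of inactive data approaches the 30 TB figure mentioned above, its total size can be estimated first. The following is a minimal sketch, not an RC-provided tool: the function names are hypothetical, and it assumes decimal terabytes (10^12 bytes), since the guidance does not say whether TB or TiB is meant.

```python
import os
import stat

TB = 10**12  # assumption: decimal terabytes; the guidance does not specify TB vs TiB
ARCHIVE_THRESHOLD_BYTES = 30 * TB  # the 30 TB figure quoted for cold archive


def tree_size_bytes(root):
    """Return the total size in bytes of regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so link targets are not counted
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total


def worth_archiving(root, threshold=ARCHIVE_THRESHOLD_BYTES):
    """True if the tree under root is at least `threshold` bytes."""
    return tree_size_bytes(root) >= threshold
```

Walking a very large tree generates metadata IO of its own, so a scan like this is best run sparingly and on a specific subdirectory rather than an entire project space.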

/users
Your primary storage directory is: /users/{o}/{n}/{onyen}
This storage is provided by the same hardware as /proj

  • High-capacity storage
  • OK to compute against; as IO increases, however, consider a temporary copy or move to /work for processing
  • OK to hold inactive data sets like a near-line archive
  • If a meaningful amount of cold data accrues, it can be packaged and MOVED to cloud archive, providing more working space for your warm data
  • /users is not intended for team-oriented, shared storage; that is the role of /proj. /users is your personal storage location; think of it as a capacity expansion to your home directory. Note that /work is NOT intended to be a personal storage location; /work is for data actively being processed, especially for workloads with high IO requirements.
  • 10 TB quota
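The directory pattern /users/{o}/{n}/{onyen} can be built programmatically, which is handy in job scripts. The sketch below assumes that {o} and {n} stand for the first and second characters of your onyen; the source pattern suggests this but does not state it, so confirm against your actual path. The function names and the example onyen "jdoe" are hypothetical. The same layout is documented for /work under /work/users.

```python
from pathlib import PurePosixPath


def personal_dir(onyen, root="/users"):
    """Build <root>/<first char>/<second char>/<onyen>.

    Assumption: {o} and {n} in the documented pattern are the first and
    second characters of the onyen. Verify against your real directory.
    """
    if len(onyen) < 2:
        raise ValueError("onyen must have at least two characters")
    return PurePosixPath(root) / onyen[0] / onyen[1] / onyen


def work_dir(onyen):
    """Same layout, rooted at /work/users as documented for /work."""
    return personal_dir(onyen, root="/work/users")


if __name__ == "__main__":
    # "jdoe" is a made-up onyen used only for illustration
    print(personal_dir("jdoe"))
    print(work_dir("jdoe"))
```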

/work
Your /work directory is: /work/users/{o}/{n}/{onyen}

  • High-performance SSD storage
  • Intended for active, "hot" data that is being read or written
  • /work is NOT intended for holding inactive data (e.g., data not used within 90 days); move such data to /users or /proj
  • Please MOVE inactive /work data to your /users directory. Older data stored in /work is subject to being moved to a user's personal storage by RC staff when deemed appropriate
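One way to spot candidates for moving out of /work is to list files that have been neither read nor modified within the 90-day window given as an example above. This is a rough sketch under stated assumptions, not the method RC staff use: the function name is hypothetical, and access times may be unreliable if the file system does not record them on every read.

```python
import os
import stat
import time

INACTIVE_DAYS = 90  # the example window quoted in the /work guidance
SECONDS_PER_DAY = 86400


def inactive_files(root, days=INACTIVE_DAYS, now=None):
    """Yield (path, idle_days) for regular files idle longer than `days`."""
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue
            # A file counts as "used" at its most recent read or write.
            last_used = max(st.st_atime, st.st_mtime)
            if last_used < cutoff:
                yield path, (now - last_used) / SECONDS_PER_DAY
```

The function only reports candidates; it does not move or delete anything, so the resulting list can be reviewed before data is relocated to /users or /proj.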

Cloud-based cold data archive

  • The most durable, least expensive storage option available, provided the data is unlikely to be retrieved frequently or in large volumes

/ms

  • Currently read-only, the /ms file system hardware is end-of-life and being decommissioned
  • High-capacity tape archival system
  • Data in /ms is being migrated to other tiers/systems
  • All data in /ms must be re-homed

The following two systems are built on old hardware that is no longer under maintenance, so data stored on them is at non-trivial risk of loss compared with the storage systems listed above. This hardware is the prior generation of /proj: high capacity, decent performance.

/overflow

Useful for:

  • Data that can be re-acquired
  • Intermediate results of analysis and workflows
  • Staging area to re-organize data on its way somewhere else, such as cloud-based cold archive
  • etc.
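When staging data on its way to an archive, bundling a directory of many small files into a single compressed file is a common preparatory step. The sketch below uses Python's standard tarfile module; the function name is hypothetical, and the actual transfer to cloud cold archive is not shown because the source does not describe that process.

```python
import os
import tarfile


def package_directory(src_dir, archive_path):
    """Write src_dir into a gzip-compressed tar file; return the member count.

    archive_path should live outside src_dir, otherwise the archive
    would try to include itself while it is being written.
    """
    src_dir = os.path.abspath(src_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory name, not absolute
        tar.add(src_dir, arcname=os.path.basename(src_dir))
    # Re-open the finished archive as a basic readability check.
    with tarfile.open(archive_path, "r:gz") as tar:
        return len(tar.getmembers())
```

The source directory is left untouched; removing it only after the archive has been verified and safely transferred keeps the "move" from turning into a loss.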

/datacommons

  • Read-only
  • Suggestions for data sets to add are welcome
  • Disposable data; if systems fail, data can be re-acquired

Examples of data sets in /datacommons:

  • 1,000 Genomes
  • Berkeley DeepDrive, bdd100k
  • SCAMPS Dataset, Camera Measurement of Physiology
  • Indoor Scene Recognition
  • Stanford Dogs

 

Last Update 1/30/2026 2:21:14 AM