A variety of storage systems with varying characteristics make up the data platform available to the ITS-RC community. As datasets grow in size, complexity, or sensitivity, more care is needed to match storage resources to the task at hand.
/proj
- High capacity storage
- OK to compute against; however, as I/O increases, /work becomes the better option
- OK to hold inactive data sets like a near-line archive
- If you have 30 TB or more of truly inactive data, we can assist in moving it to Cloud based cold archive, achieving greater data durability and lower storage costs
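To get a rough sense of whether a directory holds enough inactive data to be worth discussing for cold archive, something like the following sketch may help. It sums the sizes of regular files that have been neither accessed nor modified for a given number of days. The function name `inactive_bytes`, the idle-days parameter, and the use of decimal terabytes are illustrative assumptions, not an ITS-RC tool or policy; access times may also be unreliable on file systems that do not record them.

```python
import os
import stat
import time

TB = 10**12  # assumption: decimal terabytes; the source does not define "TB"


def inactive_bytes(root, min_idle_days, now=None):
    """Sum sizes of regular files under root that were neither accessed
    nor modified within the last min_idle_days days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, and other non-regular files
            # Count the file only if both timestamps are older than the cutoff.
            if max(st.st_atime, st.st_mtime) < cutoff:
                total += st.st_size
    return total


# Hypothetical usage (the path and the 365-day window are placeholders):
#   idle = inactive_bytes("/proj/mylab", 365)
#   print(f"{idle / TB:.2f} TB idle for a year or more")
```

Walking a large tree this way can itself generate noticeable metadata load, so it is probably best run on one project directory at a time.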
/work
- High performance storage
- New as of 2022
- Intended for active data only, aka hot or warm data
- more info
/pine
/pine will be placed into READ ONLY mode on May 23, 2023. All running jobs that attempt to write to /pine beginning on this date will fail. Because files can no longer be modified or "touched" once the file system is read-only, all data will eventually be deleted from /pine as the scheduled deletion script removes files that reach the maximum age.
- High performance storage
- Near end-of-life hardware
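Since jobs that write to a read-only file system fail, a workflow can check its output directory before starting any expensive work. The sketch below is one way to do that, under the assumption that creating a temporary file is an acceptable probe. `is_writable` is a hypothetical helper, not part of any ITS-RC tooling.

```python
import tempfile


def is_writable(directory):
    """Return True if a temporary file can be created in directory.

    Creating a file is a more direct probe than inspecting permission
    bits, because it also fails on a read-only mount or a missing path.
    """
    try:
        # The temporary file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        return False


# Hypothetical usage (the path is a placeholder):
#   if not is_writable("/work/users/o/n/onyen/results"):
#       raise SystemExit("output directory is not writable; aborting")
```

Failing early this way keeps a job from consuming its allocation only to lose its results at the final write.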
/ms
- High capacity
- Read-only at this time
- more info
Cloud based cold data archive
- The most durable, least expensive storage option available, provided the data is unlikely to be retrieved frequently or in large volumes
The following two systems are built on old hardware that is no longer under maintenance, and are thus at non-trivial risk of data loss compared to the storage systems listed above. This is the prior generation of /proj hardware: high capacity, decent performance.
/overflow
Useful for:
- Data that can be re-acquired
- Intermediate results of analysis and workflows
- Staging area to re-organize data on its way somewhere else, such as Cloud based cold archive
- etc.
/datacommons
- Read only
- Suggestions for data sets to add are welcome
- Disposable data; if systems fail, data can be re-acquired
Examples of data sets in /datacommons:
- 1,000 Genomes
- Berkeley DeepDrive, bdd100k
- SCAMPS Dataset, Camera Measurement of Physiology
- Indoor Scene Recognition
- Stanford Dogs
Last Update 6/4/2023 9:45:42 AM