Storage¶
The HPC clusters provide several shared storage systems used for different purposes during the lifecycle of computational workflows. These filesystems differ in performance characteristics, persistence, and intended usage.
Typical HPC workflows involve several stages of data handling:
- Input preparation – source code, scripts, and input data are stored in the user's home directory or project storage.
- Job execution – simulations and analysis jobs generate large temporary datasets. These files should be written to high-performance scratch storage to avoid overloading persistent filesystems (a minimal staging sketch follows this list).
- Result preservation – important outputs should be transferred to project storage or external archival systems once the computation completes.
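The sketch below illustrates this three-stage staging pattern in Python. The `SCRATCH` environment variable, the `/projects/myproject` path, and the `./solver` executable are assumptions for illustration, not actual cluster defaults; substitute the mountpoints of your cluster (see the Devana and Perun available mountpoints sections).

```python
import os
import shutil
import subprocess
from pathlib import Path

# Hypothetical locations -- replace with your cluster's real mountpoints.
HOME_INPUT = Path.home() / "myjob" / "input"              # persistent home storage
PROJECT_RESULTS = Path("/projects/myproject/results")     # assumed project storage
SCRATCH = Path(os.environ.get("SCRATCH", "/scratch")) / f"job-{os.getpid()}"

def main() -> None:
    # 1. Input preparation: stage inputs from home storage onto scratch.
    SCRATCH.mkdir(parents=True, exist_ok=True)
    shutil.copytree(HOME_INPUT, SCRATCH / "input")

    # 2. Job execution: run the (hypothetical) solver with its working
    #    directory on scratch, so all heavy I/O hits the fast filesystem.
    subprocess.run(["./solver", "--input", "input", "--output", "output"],
                   cwd=SCRATCH, check=True)

    # 3. Result preservation: copy only the outputs worth keeping back to
    #    persistent project storage, then free the scratch space.
    PROJECT_RESULTS.mkdir(parents=True, exist_ok=True)
    shutil.copytree(SCRATCH / "output", PROJECT_RESULTS / "output",
                    dirs_exist_ok=True)
    shutil.rmtree(SCRATCH)

if __name__ == "__main__":
    main()
```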
Using the correct storage location is essential for maintaining good I/O performance and ensuring fair access to shared resources.
Most storage systems on the cluster are shared parallel filesystems. They are optimized for high-throughput workloads involving large files and concurrent access from many compute nodes. Workflows that generate very large numbers of small files or perform frequent metadata operations may experience reduced performance.
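One common mitigation for small-file workloads is to bundle many small files into a single archive before they are moved or stored long-term, so the parallel filesystem handles one large file instead of thousands of tiny ones. The sketch below uses only Python's standard `tarfile` module; the directory names are hypothetical.

```python
import tarfile
from pathlib import Path

# Hypothetical directory tree containing many small files,
# e.g. per-timestep simulation outputs.
SMALL_FILES_DIR = Path("results/frames")
ARCHIVE = Path("results/frames.tar.gz")

# Pack the whole tree into one compressed archive: the shared filesystem
# then sees a single large file, avoiding the metadata bottleneck
# described above.
with tarfile.open(ARCHIVE, "w:gz") as tar:
    tar.add(SMALL_FILES_DIR, arcname=SMALL_FILES_DIR.name)

# Later, unpack on demand (e.g. into node-local storage) for analysis.
with tarfile.open(ARCHIVE, "r:gz") as tar:
    tar.extractall(path="local_workdir")
```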
This section explains how storage works on the HPC platform and how to manage data efficiently. Topics include:
- controlling file access permissions
- understanding storage quotas
- recommended practices for organizing datasets
- transferring data between systems
- handling workloads that generate large numbers of files
Cluster-specific information is documented in the Devana and Perun available mountpoints sections.