File Systems and Data Management
AImageLab-HPC provides two persistent storage areas with different characteristics and intended uses, plus node-local temporary storage on compute nodes. No area is backed up: users are solely responsible for protecting their important data.
Important: there are no environment variables such as `$HOME` or `$WORK` on AImageLab-HPC. Always use absolute paths: `/homes/<username>` and `/work/<project>`.
Overview
| Path | Filesystem | Quota | Backup | Deleted on expiry |
|---|---|---|---|---|
| `/homes/<username>` | NFS | 100 GB per user (default) | No | 6 months after username expiry |
| `/work/<project>` | BeeGFS | Per project (set at provisioning) | No | Yes, immediately |
| `/tmp` (node-local) | tmpfs | Varies by node | No | At job end |
/homes - Personal Area
/homes/<username> is your personal home directory, hosted on NFS. It is intended for:
- Personal configuration files (`.bashrc`, `.ssh`, etc.)
- Small scripts and source code
- Software installations that produce many small files (e.g. Python virtual environments)
Do not use /homes for large datasets or production I/O. NFS is not suited for the high-throughput parallel access patterns typical of HPC workloads. Large reads and writes from compute nodes should always go through /work.
The default quota is 100 GB. If you need more, contact the HPC Helpdesk.
/homes/<username> is retained for 6 months after your username expires, then permanently deleted.
/work - Project Area
/work/<project> is a shared project workspace hosted on BeeGFS, a high-performance parallel distributed filesystem. It is the primary area for all production data, model checkpoints, datasets, and job outputs.
Key properties:
- Shared: all members of a project have read/write access to `/work/<project>`.
- Per-project quota: set at provisioning time and visible with `squota` (see below).
- No backup: data loss is permanent.
- Deleted on expiry: when a project expires, its `/work/<project>` directory is deleted immediately, with no grace period. Export any data you wish to keep before the project end date.
The PI is the owner of the root /work/<project> directory. Collaborators are advised to create personal subdirectories:
```shell
mkdir /work/<project>/<username>
```
By default, files you create are readable and writable only by you. To share files with project collaborators:
```shell
chmod 770 /work/<project>/<username>/my_shared_dir
```
Since /work/<project> is not accessible to users outside the project, opening permissions to group level is safe within the project.
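A hedged sketch of this setup, runnable anywhere: the project path below is a mktemp stand-in for `/work/<project>`, and `750` on the personal directory is an assumption that lets group members traverse into shared subdirectories without being able to write elsewhere.

```shell
# PROJECT stands in for /work/<project>; mktemp keeps the sketch self-contained.
PROJECT=$(mktemp -d)
ME="$PROJECT/${USER:-me}"

mkdir -p "$ME/my_shared_dir"
chmod 750 "$ME"                  # group may traverse/list, but not write
chmod 770 "$ME/my_shared_dir"    # group may read and write in the shared dir
```

Note that the parent directory needs at least group execute permission (`g+x`), or collaborators cannot reach the shared subdirectory even if it is itself `770`.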
/tmp - Job-local Temporary Storage
Most compute nodes are equipped with fast local storage (tmpfs). When a job starts, /tmp on the compute node is available as a private temporary area that is automatically cleared when the job ends.
Request tmpfs storage in your job script:
```shell
#SBATCH --gres=gpu:1,tmpfs:50G
```
A typical pattern for I/O-intensive workloads is to copy input data to /tmp at the start of the job, write outputs there, then copy results back to /work before the job ends:
```shell
# Stage input data onto fast node-local storage
cp /work/<project>/dataset.tar /tmp/
tar -xf /tmp/dataset.tar -C /tmp/

# Read and write locally during training
python train.py --data /tmp/dataset --output /tmp/checkpoints

# Copy only the final results back to shared storage
cp -r /tmp/checkpoints /work/<project>/results/
```
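If the job fails before the final `cp`, any checkpoints left on `/tmp` are lost when the node cleans up. One defensive variant is a shell `trap` that copies results back on any exit. This is a sketch, not cluster-specific behaviour; `WORK` and `SCRATCH` are mktemp stand-ins for `/work/<project>` and `/tmp` so it runs anywhere, and the subshell plays the role of the job body.

```shell
WORK=$(mktemp -d)      # stand-in for /work/<project>
SCRATCH=$(mktemp -d)   # stand-in for node-local /tmp
mkdir -p "$WORK/results"

(
  # In a real job script, this body would be the job itself.
  # The EXIT trap fires on normal completion, errors, and most signals.
  trap 'cp -r "$SCRATCH/checkpoints" "$WORK/results/"' EXIT

  mkdir -p "$SCRATCH/checkpoints"
  echo "epoch 1 weights" > "$SCRATCH/checkpoints/ckpt-1.txt"  # stand-in for training
  # Even if training crashed here, the trap would still copy checkpoints back.
)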
Note: `/tmp` is local to each compute node. For multi-node jobs, each node has its own independent `/tmp`; it is not shared across nodes.
The available tmpfs capacity varies by node. Use `sinfo -o "%n %G" -p all_usr_prod` to inspect the gres configuration per node.
Checking Storage Quota - squota
The squota command shows your current storage usage and quota for all areas you have access to:
```shell
squota
```
Example output:
```
Filesystem   User/Project             Usage (chunks)  Quota (chunks)  % (chunks)  Usage (GB)  Quota (GB)  % (GB)
------------ ------------------------ --------------- --------------- ----------- ----------- ----------- -------
/work        ai4a2026                               0                                    0.00      100.00    0.00
/work        baraldi_doxee_ix_studio         11045886                                 2239.66     3072.00   72.91
/work        baraldi_doxee_pwd                2043548                                  246.20     1024.00   24.04
```
Column descriptions:
| Column | Description |
|---|---|
| Filesystem | Storage area |
| User/Project | Username (for /homes) or project name (for /work) |
| Usage (chunks) | Number of BeeGFS storage chunks currently used |
| Quota (chunks) | Chunk quota, if set (usually not enforced - GB quota applies) |
| Usage (GB) | Actual storage used in gigabytes |
| Quota (GB) | Maximum allowed storage in gigabytes |
| % (GB) | Percentage of GB quota consumed |
Tip: use `squota` rather than `du -sh` to check disk usage on `/work`. `du` traverses the filesystem and generates heavy metadata load on BeeGFS; `squota` reads quota counters directly and returns instantly.
BeeGFS Best Practices
BeeGFS is a distributed parallel filesystem optimised for high-throughput sequential I/O on large files. Understanding its characteristics helps avoid performance pitfalls.
What BeeGFS is good at
- Large sequential reads and writes (datasets, model checkpoints, video files)
- High-bandwidth parallel access from many compute nodes simultaneously
- Storing a moderate number of large files
What to avoid
Many small files. BeeGFS metadata performance degrades significantly when directories contain millions of small files (e.g. unzipped ImageNet, raw frames, pip-installed packages). Each file open/stat/close requires a metadata server round-trip. Symptoms include very slow ls, find, and job startup times.
Mitigations:
- Store datasets as archives (tar, zip) or in container formats (HDF5, Zarr, WebDataset, LMDB) and read them streaming.
- Install Python environments in /homes, not /work - NFS handles small files better for this workload.
- If you must have many small files in /work, organise them into subdirectories of ≤ 10,000 files each.
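The archive mitigation can be sketched with plain `tar`: pack the small files once, then list or stream the archive with a single metadata operation instead of one per file. The dataset directory below is a temporary stand-in for a real dataset under `/work`.

```shell
# Create a stand-in dataset of small files
DATA=$(mktemp -d)
for i in 1 2 3; do echo "sample $i" > "$DATA/img_$i.txt"; done

# Pack them into one archive (one file on BeeGFS instead of many)
( cd "$DATA" && tar -cf dataset.tar img_*.txt )

# Inspect the archive members without extracting anything
tar -tf "$DATA/dataset.tar"
```

For training workloads, container formats such as WebDataset or LMDB apply the same idea while also supporting random access and parallel readers.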
Recursive metadata operations. Commands like `ls -lR`, `find /work/<project>`, and `du -sh /work/<project>` walk the entire directory tree and can generate enormous metadata load, slowing the filesystem for all users. Use `squota` for quota checks, and scope `find` with `-maxdepth` to limit traversal depth.
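For example, a depth-scoped `find` visits only the top level of a tree and never stats the deep subtrees. The directory here is a temporary stand-in for `/work/<project>`.

```shell
# Build a stand-in tree: one file at the top, one buried deep
ROOT=$(mktemp -d)
mkdir -p "$ROOT/a/deep/nested"
touch "$ROOT/top.txt" "$ROOT/a/deep/nested/buried.txt"

# Only the first level is traversed; buried.txt is never visited
find "$ROOT" -maxdepth 1 -name '*.txt'
```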
Frequent small random writes. Appending tiny amounts of data in a tight loop (e.g. writing one line to a log file per iteration) is inefficient on BeeGFS. Buffer output in memory and flush in larger chunks, or write logs to node-local `/tmp` and copy them to `/work` at job end.
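The log-staging mitigation looks like this in outline; `LOCAL` and `WORK` are mktemp stand-ins for node-local `/tmp` and `/work/<project>` so the sketch runs anywhere.

```shell
LOCAL=$(mktemp -d)   # stand-in for node-local /tmp
WORK=$(mktemp -d)    # stand-in for /work/<project>

# Many tiny appends hit fast local storage, not BeeGFS
for step in 1 2 3 4 5; do
  echo "step $step loss=0.$step" >> "$LOCAL/train.log"
done

# One bulk write to shared storage at the end of the job
cp "$LOCAL/train.log" "$WORK/train.log"
```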
Concurrent writes to the same file from multiple processes. Avoid having multiple parallel workers append to the same file simultaneously. Use a single writer process, or have each worker write to its own file and merge afterwards.
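A minimal sketch of the one-file-per-worker pattern: each worker writes only its own part file, so no two processes ever append to the same file, and the parts are merged once all workers finish. The output directory is a temporary stand-in; real workers would be job processes writing under `/tmp` or `/work`.

```shell
OUT=$(mktemp -d)   # stand-in for a shared output directory

# Four "workers" run in parallel, each owning a distinct part file
for rank in 0 1 2 3; do
  ( echo "rank $rank result" > "$OUT/part_$rank.txt" ) &
done
wait   # all workers done before merging

# Single-writer merge step
cat "$OUT"/part_*.txt > "$OUT/merged.txt"
```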
Practical tips
- Open files once per job, perform all I/O, then close - avoid repeatedly opening and closing the same file.
- For distributed training checkpoints, write one file per process rather than having all ranks write to a single shared file.
- Use `/tmp` for intermediate files that are read and written repeatedly during a job; copy only final outputs to `/work`.
Data Transfer
Login node transfers (small files)
For small transfers that complete quickly, use scp, sftp, or rsync directly via the login nodes:
```shell
# Upload from local machine to cluster
scp /local/path/to/file <username>@ailb-login-02.ing.unimore.it:/work/<project>/

# Download from cluster to local machine
scp <username>@ailb-login-02.ing.unimore.it:/work/<project>/results.tar.gz /local/path/

# Sync a local directory to the cluster
rsync -avP /local/dataset/ <username>@ailb-login-02.ing.unimore.it:/work/<project>/dataset/
```
Login nodes enforce a CPU time limit. For large transfers, use the dedicated data mover instead.
Data mover (large transfers)
For large or long-running transfers, use the dedicated data mover node ailb-data.ing.unimore.it. It has no CPU time limit and is optimised for sustained transfer throughput.
The data mover supports scp, sftp, and rsync. You cannot open an interactive shell on it - it only accepts file transfer commands.
```shell
# Upload a large dataset
rsync -avP /local/large_dataset/ <username>@ailb-data.ing.unimore.it:/work/<project>/large_dataset/

# Download results
rsync -avP <username>@ailb-data.ing.unimore.it:/work/<project>/results/ /local/results/

# Using scp
scp -r <username>@ailb-data.ing.unimore.it:/work/<project>/outputs/ /local/outputs/
```
Interactive SFTP
sftp provides an interactive session for exploring and transferring files:
```shell
sftp <username>@ailb-login-02.ing.unimore.it
```
Useful commands inside an sftp session:
| Command | Description |
|---|---|
| `ls` / `lls` | List remote / local directory |
| `cd` / `lcd` | Change remote / local directory |
| `pwd` / `lpwd` | Print remote / local working directory |
| `get <file>` | Download file from remote |
| `put <file>` | Upload file to remote |
| `exit` | Close session |
Windows users
scp and rsync are available via Git Bash or Windows Subsystem for Linux (WSL). GUI alternatives include FileZilla and WinSCP.