Maintenance changelog, 2024-07-26
Dear users,
This is to inform you that aimagelab-srv is now back online and available to regular users. As this maintenance slot involved a set of breaking changes, please read the maintenance changelog below carefully before you start using the system again.
Login/frontend nodes
- We are introducing a new enhanced login/frontend node, named ailb-login-03, with 8 NVIDIA P100 GPUs and 6.5 TB of local scratch disk space. This node also supports newer CUDA versions.
- Regular users can no longer access ailb-login-01 (which acts as the main controller of the cluster). This improves cluster stability and resilience.
- Older frontend DNS names (aimagelab-srv-login, aimagelab-srv-00) are now disabled.
A summary of the login/frontend nodes is reported below.
Hostname | Description |
---|---|
ailb-login-01 | Not accessible to users. |
ailb-login-02 | Basic node. Supports CUDA up to 11.7. |
ailb-login-03 | Enhanced node. Supports newer CUDA versions, 6.5 TB local scratch. |
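For scripts that need to choose a login node automatically, the summary above can be encoded as a small lookup. This is a minimal sketch, not an official tool: the function names and the version-tuple comparison are assumptions made for illustration, and only the hostnames and the CUDA 11.7 limit come from this changelog.

```python
# Hypothetical helper: choose a login node from the required CUDA version.
# Hostnames and the 11.7 limit come from the summary table; the function
# names and the tuple-based comparison are illustrative assumptions.

BASIC_NODE = "ailb-login-02"      # supports CUDA up to 11.7
ENHANCED_NODE = "ailb-login-03"   # supports newer CUDA versions
BASIC_MAX_CUDA = (11, 7)


def parse_version(text):
    """Turn a 'major.minor' string such as '12.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def pick_login_node(cuda_version):
    """Return the login node for a CUDA version given as 'major.minor'."""
    # Compare only major and minor, so '11.7.1' is still treated as 11.7.
    if parse_version(cuda_version)[:2] <= BASIC_MAX_CUDA:
        return BASIC_NODE
    return ENHANCED_NODE


if __name__ == "__main__":
    for version in ("11.3", "11.7", "12.1"):
        print(version, "->", pick_login_node(version))
```

Sending every job that fits within CUDA 11.7 to the basic node is a policy choice of this sketch, not a rule stated in the changelog.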
Data mover node
- Our data mover node, ailb-data, is now available again. Remember to use this node for heavy data transfers that do not fit within the 10-minute CPU limit of the login/frontend nodes.
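As an illustration of routing a heavy transfer through the data mover, the sketch below assembles an rsync command line that targets ailb-data. It assumes that rsync is installed on your machine and that the node is reachable over SSH under its bare hostname. The username `alice` and both paths are hypothetical placeholders. The script only prints the command; it does not run it.

```python
import shlex

DATA_MOVER = "ailb-data"  # hostname taken from this changelog


def build_rsync_command(source, user, dest_path, host=DATA_MOVER):
    """Return the rsync argument list for copying `source` to the data mover."""
    return [
        "rsync",
        "-a",          # archive mode: recurse and preserve permissions and times
        "--partial",   # keep partially transferred files so a retry can resume
        source,
        f"{user}@{host}:{dest_path}",
    ]


if __name__ == "__main__":
    # 'alice' and both paths are placeholders; replace them with your own.
    cmd = build_rsync_command("./dataset/", "alice", "/path/to/destination/")
    print(shlex.join(cmd))
```

Building the command as a list keeps arguments with spaces intact if you later pass it to `subprocess.run`.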
Changes to /homes quota management
- Quotas on /homes are now soft quotas with a 15-minute grace period. This should ease the installation of heavy Python packages.
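To make the effect of a soft quota concrete, here is a toy model of the conventional behaviour: usage may exceed the limit temporarily, and writes are refused only once the grace period has run out. This is a sketch under the assumption that /homes follows these usual soft-quota semantics. The actual enforcement on the cluster may differ, and the limit values in the example are made up.

```python
GRACE_PERIOD_SECONDS = 15 * 60  # 15-minute grace period from this changelog


def write_allowed(usage_bytes, soft_limit_bytes, seconds_over_limit):
    """Return True if a write would still be accepted under this toy model.

    seconds_over_limit is how long usage has been continuously above the
    soft limit (0 if usage is currently within the limit).
    """
    if usage_bytes <= soft_limit_bytes:
        return True
    # Over the soft limit: tolerated only while the grace period lasts.
    return seconds_over_limit < GRACE_PERIOD_SECONDS


if __name__ == "__main__":
    limit = 100  # hypothetical soft limit, in arbitrary units
    print(write_allowed(50, limit, 0))         # within the limit
    print(write_allowed(150, limit, 60))       # over the limit, 1 minute in
    print(write_allowed(150, limit, 16 * 60))  # over the limit, grace expired
```

Under this model, a package installation that briefly overshoots the limit can still finish, provided that usage drops back below the limit before the grace period ends.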
Software updates
- Updated kernel version to 5.15.0-101-generic
- Updated NVIDIA drivers to 535.183.01 (except ailb-login-02)
- Updated BeeGFS client versions to 7.4.4
Published: July 28, 2024