RWTH High Performance Computing (HPC)

Full system maintenance.

Maintenance
Tue, 18.11.2025 08:00 - Tue, 18.11.2025 19:00

Due to various necessary maintenance works, the entire CLAIX HPC System will be unavailable.

The initial phase of the maintenance should last until 12:00, after which the filesystems and login nodes should be available again.
Jobs will not run until the full maintenance works are completed.
We aim to also upgrade the Slurm scheduler during the downtime.

Please note the following:
- User access to the HPC system through login nodes, HPC JupyterHub or any other connection will not be possible during the initial part of the maintenance.
- No Slurm jobs will be able to run during the entire maintenance.
- Before the maintenance, Slurm will only start jobs that are guaranteed to finish before the maintenance begins; any jobs still running at that point might be terminated.
- Nodes might therefore remain empty in the lead-up to the maintenance, as Slurm clears the nodes of user jobs.
- Waiting times before and after the maintenance might be longer than usual, because nodes are emptied beforehand and the queue of waiting jobs grows afterwards.
- Files in your personal or project directories will not be available during the initial part of the maintenance.
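
Because Slurm only starts jobs that are guaranteed to finish before the maintenance begins, a job's requested time limit bounds whether it can still start. The following Python sketch estimates the longest time limit that would still fit before 18.11.2025 08:00. It assumes that Slurm compares the requested time limit with the time remaining until the maintenance start; the function name `max_time_limit` and the 10-minute safety margin are illustrative choices, not part of the CLAIX configuration.

```python
from datetime import datetime, timedelta

# Maintenance start taken from the announcement (local time).
MAINTENANCE_START = datetime(2025, 11, 18, 8, 0)


def max_time_limit(now, maintenance_start=MAINTENANCE_START,
                   margin=timedelta(minutes=10)):
    """Return the longest time limit (D-HH:MM:SS) that ends before the
    maintenance starts, or None if no time is left.

    Hypothetical helper: the margin is an assumed safety buffer.
    """
    remaining = maintenance_start - now - margin
    if remaining <= timedelta(0):
        return None
    total = int(remaining.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    # Example: submitting on the evening before the maintenance.
    print(max_time_limit(datetime(2025, 11, 17, 20, 0)))
```

The returned string uses the `days-hours:minutes:seconds` form accepted by the `--time` option of `sbatch`. A job requesting more than this is likely to stay pending until after the maintenance.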

10.11.2025 14:35
Updates

The first part of the maintenance will take longer than expected. At the moment, we cannot estimate when it will be finished.

18.11.2025 11:56

The network maintenance is still ongoing.

18.11.2025 16:01

Unfortunately, the maintenance work was delayed due to circumstances beyond our control. This will delay the HPC system's availability by a few hours.

18.11.2025 17:15

Issues during the network maintenance and a brief failure of the storage backend of an infrastructure server required for the maintenance delayed the maintenance tasks. Because of the delay, some tasks had to be postponed. The cluster is operational again.

18.11.2025 19:08