RWTH High Performance Computing (HPC)

You can find more information about the service in our documentation portal.


Recently expired reports

Slurm crash over the weekend

Partial Outage
Sunday 02/22/2026 12:00 PM - Monday 02/23/2026 03:00 AM

Slurm unfortunately crashed over the weekend.
Jobs and Slurm commands between 2026-02-22 12:00:00 and 2026-02-23 03:00:00 may have been affected.

23.02.2026 09:19

Connection to the cluster fails

Partial Outage
Saturday 02/14/2026 04:25 AM - Tuesday 02/17/2026 10:15 AM

Due to the limited availability of the GPFS, connections to the cluster could not be established.

The problem has been resolved and the cluster is available again.

17.02.2026 11:03

Maintenance Announcement for Munge Security Update

Maintenance
Tuesday 02/17/2026 12:00 PM - Tuesday 02/17/2026 01:00 PM

We would like to inform you that a security issue (CVE) has been discovered in the Munge software that could expose the Munge key. This key is critical for user authentication in Slurm jobs. To address this vulnerability, we have rolled out a new version of Munge.
Although we assess the likelihood of the key having been compromised as very low, we will conduct maintenance on the cluster to replace the key.
We kindly ask all users to review their jobs. If you notice any unknown jobs in your list, please cancel them and inform us immediately.
For users with extremely sensitive data in their directories, we offer to temporarily disable job submission for their account upon request. The account will be re-enabled after the maintenance is completed.
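
One possible way to review your jobs from a login node is sketched below. It assumes the standard Slurm client tools squeue and scancel are available in your shell. The helper names list_my_jobs and cancel_job are hypothetical and not part of Slurm or of our cluster setup.

```shell
# Sketch only: assumes the standard Slurm client tools (squeue, scancel) are on PATH.
# The function names below are hypothetical helpers, not Slurm commands.

# List your pending and running jobs: job ID, partition, name, state, submit time.
list_my_jobs() {
    squeue --user="$USER" --format="%.18i %.12P %.30j %.10T %.20V"
}

# Cancel a single job by the job ID that squeue reported.
cancel_job() {
    if [ -z "$1" ]; then
        echo "usage: cancel_job <jobid>" >&2
        return 1
    fi
    scancel "$1"
}
```

After pasting these definitions into your shell, run list_my_jobs to see your jobs, then cancel_job followed by the job ID of any job you do not recognize.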

11.02.2026 11:53

GPU Partition Under Maintenance

Partial Maintenance
Tuesday 02/17/2026 01:00 PM - Wednesday 02/25/2026 12:58 PM

After Tuesday’s maintenance, some GPU nodes were found to be running the wrong driver version and are currently being updated. As a result, the GPU partition is available, but with limited capacity.

19.02.2026 13:07