
Compute Cluster - System Maintenance

Tuesday 23.04.2024 07:00 - Tuesday 23.04.2024 17:00

The whole cluster needs to be updated with a new kernel so that user namespaces can be re-enabled; see https://maintenance.itc.rwth-aachen.de/ticket/status/messages/14/show_ticket/8929 for details. At the same time, the InfiniBand stack will be updated for better performance and stability. During this maintenance, the dialog systems and the batch system will not be available. The dialog systems are expected to be reopened in the early morning. We do not expect the maintenance to last the whole day and expect the cluster to reopen earlier.

Wed 10.04.2024 11:22

Updates

Due to technical problems, we will have to postpone the maintenance to 23.04.2024 07:00.

Tue 16.04.2024 16:22

Compute Cluster - Migration from lustre18 to lustre22

Tuesday 23.04.2024 07:00 - unknown

Over the past weeks, we have been migrating all HPCWORK data to a new filesystem. During this maintenance, we will perform the final migration step. HPCWORK will not be available during this maintenance.

Wed 10.04.2024 11:26

Updates

Due to technical problems, we will have to postpone the maintenance (and the final lustre migration step) to 23.04.2024 07:00.

Tue 16.04.2024 16:23

Compute Cluster - Performance Problems on HPCWORK

Monday 08.04.2024 11:00 - Wednesday 24.04.2024 17:00

We are currently observing recurring performance degradations on HPCWORK directories, which may be partly worsened by the ongoing migration process leading up to the filesystem migration on April 17th. The problems cannot be traced back to a single cause but are being actively investigated.

Fri 12.04.2024 11:35

Updates

Due to technical problems, we will have to postpone the maintenance (and the final lustre migration step) to 23.04.2024 07:00.

Tue 16.04.2024 16:21

Compute Cluster - Deactivation of User Namespaces

Wednesday 27.03.2024 08:15 - unknown

Due to an open security issue, we are required to temporarily disable the feature known as user namespaces on the cluster. This feature is mainly used by containerization software such as Apptainer, and disabling it affects how these containers are set up internally. The changes are effective immediately. Most users should not experience any interruptions. If you experience any problems, please contact us as usual via servicedesk@itc.rwth-aachen.de with a precise description of the features you are using and how you start your containers. We will reactivate user namespaces as soon as we can install the necessary fixes for the aforementioned vulnerability.

Wed 27.03.2024 08:14

Updates

A kernel update addressing the issue has been released upstream and will soon be available to the compute cluster. Once the update is installed, user namespaces can be enabled again.

Thu 04.04.2024 11:11

Compute Cluster - Top500 - Benchmark

Thursday 11.04.2024 17:00 - Friday 12.04.2024 09:10

During the stated time, Claix-2023 will not be available due to a benchmark run for the Top500 list [1]. Batch jobs that cannot finish before the start of this downtime, or that are scheduled during this time period, will be kept in the queue and started after the cluster resumes operation. [1] https://www.top500.org

Thu 11.04.2024 17:09

Updates

The nodes are available again.

Fri 12.04.2024 09:27

Compute Cluster - Longer waiting times in the ML partition

Wednesday 03.04.2024 16:00 - Thursday 11.04.2024 13:11

There are currently longer waiting times in the ML partition as the final steps of the acceptance process are still being carried out.

Thu 04.04.2024 10:09

Updates

The waiting times should be shorter now.

Thu 11.04.2024 13:11

Compute Cluster - RegApp Service Update

Wednesday 03.04.2024 14:00 - Wednesday 03.04.2024 14:30

The RegApp will be updated on 2024-04-03. During the update window, the service may be unavailable for short periods. Active sessions should not be affected.

Wed 27.03.2024 13:59