I am part of a university with an HPC cluster, which has just slowed to a near standstill for no clear reason. The login nodes and the compute nodes both seem to be affected. I can connect and do basic things (ls), but anything more just seems to hang. My internet connection is fine. There is no scheduled maintenance.
Is this cluster under attack?
Is this a problem that needs to be solved urgently (as in “call people in out of hours”) to prevent some kind of damage?