Minor SLURM Update - RESOLVED 1/9/2025
RESOLVED: 1/9/2025 - All nodes are back online. Users affected by job failures will be contacted and refunded for any CPU or GPU hours used.
UPDATE: 1/9/2025 - During the SLURM update, a compatibility issue has taken many nodes offline and caused about 400 jobs to fail during a change of job step. Running jobs do not appear to be affected as of now. We are working to resolve this issue and bring nodes back online. If you have any questions about this update or you are experiencing issues, please contact us.
On Thursday, January 9th, we will be deploying a minor update to the SLURM scheduling software. This update includes bug fixes to improve system stability. Running and queued jobs should not be affected. No interruptions are expected to client command functionality (e.g. squeue, sbatch, sacct). If you have any questions about this update or you experience issues following this update, please contact us.