We'll post information about ICER's system downtimes, updates, new features, and other news for the ICER user community here.

Current Maintenance Activities

May 4, 2026

IPF has identified a leak in our primary cooling system and has received the parts to repair it this week. To complete this work quickly and reduce the total downtime, amd24 will need to be taken offline one day earlier than planned, at 7 AM this Wednesday, May 6, and will remain offline through our regularly scheduled maintenance on May 7. Jobs that would overlap this window will not start; jobs running on amd24 on May 6 may or may not be canceled, depending on the duration of the repair.

Apr 20, 2026

We are currently performing maintenance on the scratch filesystem with assistance from our storage vendor. Work is ongoing and no downtime is required — scratch remains mounted and available on all cluster nodes throughout.

While this work is underway, users may notice intermittent slowdowns when reading from or writing to scratch. This can show up as jobs appearing to stall briefly, longer-than-usual file operations from the command line, or pauses when listing directory contents. These delays are expected and should resolve on their own as the maintenance progresses.

What you can do:

  • No action is required. Running jobs will continue to run.
  • If a job is unusually sensitive to I/O latency, you may want to hold off on submitting it until we post an all-clear.
  • Avoid large bulk operations against scratch (mass rm -rf, tar of huge directories, rsync of full trees) during this window if you can defer them.

We will post a follow-up once the maintenance is complete. If you encounter anything beyond general slowness — jobs failing with I/O errors, files that won’t open, or scratch appearing unmounted on a node — please open a ticket so we can take a look.

Thanks for your patience.

Apr 1, 2026

Over the next several weeks, ICER will be performing minor operating system updates across all nodes in the HPCC. During this time, users may notice longer queue times and some nodes unavailable in Slurm while they are being updated. Once the updates are complete, nodes will be returned to service and jobs will continue to schedule and run. Because these are minor updates, the software and module system are unaffected. No problems have been identified in initial testing, but please open a ticket if you experience any issues.

Current Announcements

Apr 9, 2026

The HPCC will be unavailable on Thursday, May 7, starting at 5 AM for regularly scheduled maintenance. No jobs will run during this time.

Jobs that cannot finish before the maintenance begins will not start until after maintenance is complete. For example, if you submit a four-day job three days before the maintenance outage, your job will be postponed and will not begin to run until after maintenance is completed.
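
As a rough illustration of the rule above, the following Python sketch checks whether a job with a given requested walltime would fit before the May 7, 5 AM maintenance start. It is a simplified model, not the scheduler's actual logic: the function names are hypothetical, times are assumed to be local cluster time, and treating a job that ends exactly at the start of maintenance as fitting is an assumption.

```python
from datetime import datetime, timedelta

# Maintenance start taken from the announcement: Thursday, May 7, 2026, 5 AM.
# Assumption: naive datetimes in local cluster time.
MAINTENANCE_START = datetime(2026, 5, 7, 5, 0)


def can_start_before_maintenance(start_time, walltime, window_start=MAINTENANCE_START):
    """Return True if a job starting at start_time with the requested
    walltime would end at or before window_start."""
    return start_time + walltime <= window_start


def longest_walltime(start_time, window_start=MAINTENANCE_START):
    """Longest walltime that still fits before the window (zero if none)."""
    return max(window_start - start_time, timedelta(0))


# The example from the announcement: a four-day job submitted three days before.
three_days_before = MAINTENANCE_START - timedelta(days=3)
print(can_start_before_maintenance(three_days_before, timedelta(days=4)))  # False
print(can_start_before_maintenance(three_days_before, timedelta(days=2)))  # True
print(longest_walltime(three_days_before))  # 3 days, 0:00:00
```

The check assumes the job could start immediately; time spent waiting in the queue shortens the walltime that fits.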

If you have any questions, please contact us.

Feb 25, 2026

What is happening?

On Monday, May 4, 2026, the current version of the Miniforge3 module, 24.3.0-0 (which provides access to the conda command), will be removed. All users should switch to the new version, 25.11.0-1.
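
One way to find job scripts that still reference the old version is to search them for the version string. The Python sketch below is illustrative only: the function name is hypothetical, the file patterns are assumptions about how job scripts are named, and it only flags lines that mention both "Miniforge3" and "24.3.0-0".

```python
from pathlib import Path

OLD_VERSION = "24.3.0-0"    # version being removed, per the announcement
MODULE_NAME = "Miniforge3"


def find_old_references(root, patterns=("*.sh", "*.sb", "*.slurm")):
    """Return (path, line_number, line) tuples for lines under root that
    mention the Miniforge3 module together with the old version string.

    The default patterns are assumed script extensions; adjust as needed.
    """
    hits = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            try:
                text = path.read_text(errors="replace")
            except OSError:
                continue  # unreadable file or a directory; skip it
            for number, line in enumerate(text.splitlines(), start=1):
                if MODULE_NAME.lower() in line.lower() and OLD_VERSION in line:
                    hits.append((path, number, line.strip()))
    return hits
```

Calling `find_old_references` on a directory of job scripts returns the matching lines, which can then be updated by hand to 25.11.0-1. Scripts that load the module without naming a version will not be reported.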

