Home filesystem issues - update
At approximately 6:15 PM on 5/13/2024, users began reporting issues accessing their home directories on HPCC. We are aware of the issue and are working with our vendors to address it.
Update: 11:55 PM, 5/13. We are currently on a call with the vendor team to diagnose the problem.
Update: 2:30 AM, 5/14. Shortly after the cutover to new hardware, a bug took the home file system offline. The vendor is examining diagnostic data to determine possible solutions.
Update: 9:00 AM, 5/14. Home directories are still unavailable. Vendor continues to work on a solution. No ETA yet.
Update: 3 PM 5/14. The vendor has identified problems that may have caused this outage. The system is recovering the file system, but recovery requires a scan of all the data on the system. Our current ETA for recovery is mid-afternoon, 5/15.
Update: 9 PM 5/14. The scan is about 22% complete.
Update: 10:45 AM 5/15. The file system scan has completed, and the file system has been successfully mounted on the storage servers. HPCC staff are working on restoring access to gateway and compute nodes.
Update: 5:15 PM 5/15. The file system is online without data loss. Most nodes now have the file system properly mounted and are ready for jobs. We will continue to monitor and fix problems as we detect them.