High Performance Computing (HPC) Cluster Maintenance Policy

Summary

To ensure the ongoing performance, security, and stability of our High Performance Computing (HPC) resources, we have established regular maintenance windows each semester. This planned downtime allows our team to apply necessary upgrades, address hardware and software issues, and implement performance enhancements without unexpected disruptions.

Body

Scheduled Maintenance Windows

HPC cluster maintenance will occur for three consecutive days following the final grade submission deadline each semester (Fall, Winter, Summer). The maintenance window is timed to align with the academic calendar and minimize the impact on users' workflows.

(Note: Maintenance may occasionally be required outside of these windows.)
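
For users who want to block out the window on their own calendars, the scheduling rule above can be sketched as a small date calculation. This is an illustrative sketch only, not an official tool: the function name and the example deadline are hypothetical, and it assumes the window begins on the calendar day immediately after the grade submission deadline, which the policy does not state explicitly. Always confirm the actual dates against the announced schedule.

```python
from datetime import date, timedelta

WINDOW_DAYS = 3  # "three consecutive days" per the policy


def maintenance_window(grade_deadline: date, days: int = WINDOW_DAYS) -> list[date]:
    """Return the calendar dates covered by the maintenance window.

    Assumption (not stated in the policy): the window starts on the
    calendar day immediately after the grade submission deadline.
    """
    return [grade_deadline + timedelta(days=offset) for offset in range(1, days + 1)]


if __name__ == "__main__":
    # Hypothetical deadline, for illustration only.
    example_deadline = date(2030, 5, 14)
    for day in maintenance_window(example_deadline):
        print(day.isoformat())
```

Jobs that would still be running on any of the returned dates should be scheduled to finish earlier or be resubmitted after the window closes.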

Scope of Maintenance Activities

During the maintenance window, the following actions may take place:

  • System upgrades (software, firmware, and hardware)
  • Security patches and vulnerability remediation
  • Performance optimization and configuration adjustments
  • Backup and data integrity verification
  • Routine hardware inspections and replacement of parts, if necessary

Impact on Users

The HPC cluster will be unavailable to all users during the maintenance period. We advise users to plan their research activities around the maintenance window to avoid disruptions.

Thank you for your understanding and cooperation as we work to maintain a robust and reliable HPC environment.

Details

Article ID: 21907
Created: Tue 10/29/24 3:40 PM
Modified: Mon 11/25/24 9:25 AM