This morning at around 3:30 AM Eastern we received an alert that the hosted Exchange 2010 network in Australia (MATILDA) had gone offline. After logging in via KVM we saw that the server was repeatedly rebooting after faulting to a blue screen; however, no diagnostic information was displayed. We soon determined that the operating system needed to be repaired, and the repair completed around 5:20 AM Eastern. Once we were able to boot into Windows successfully, we ran a database integrity check to ensure that no data had been corrupted or lost. The check completed around 7:45 AM Eastern, and service was restored.
Once service was restored we reviewed the server logs, which contained no entries regarding the server fault; however, a memory dump had been created. Unfortunately the memory dump wasn’t of much help, and the issue appeared to be hardware related. We temporarily switched service for MATILDA over to the backup node while we tested the hardware components. We determined that the issue was caused by a power supply that randomly dropped output voltage under high load. After replacing the power supply we concluded that the server had faulted while running a backup. With regard to the OS repair, we can only deduce that a system file was corrupted by the fault and needed to be replaced.