The work described below is scheduled to begin at 9:00 PM Eastern on May 29th, 2012
- LOUIE – LOUIEMBOX1 & LOUIEMBOX2 – Update Network Driver
Will cause a brief interruption of 15-30 seconds while the driver updates
- LOUIE – Update Exchange to Service Pack 2
Will not cause an interruption to clients
- ROCKERDUCK – Reseeding databases between RDMBOX1 and RDMBOX2 for failover
Will not cause an interruption, but OWA users may see slight delays when accessing content (including public folders), since the replication will use the MAPI NICs instead of the replication NIC. This will run only at night so as not to flood the network during the day.
- ROCKERDUCK – Redistributing disk layout on RDMBOX3
RDMBOX3 is one of the additional failover clusters and does not actively hold any mailbox databases
Today (5/14/12) there was an outage with ROCKERDUCK between 11:50 AM and 12:20 PM (Eastern) that affected client access to mailboxes. In short, the outage was caused by our Los Angeles site experiencing Active Directory communication issues with Dallas, which eventually took LA offline. Normally this event wouldn’t cause any issue, as Dallas holds the majority vote in the DAG quorum and would be able to maintain quorum on its own. Unfortunately, the tie-breaking voter (MBOX3) was offline for maintenance (it does not actively host DBs; it’s the internal failover server for Dallas), which left the vote at 4/5 online before the communication issue between sites began. Once LA experienced connectivity issues, the vote dropped to 2/5 online, short of the 3/5 majority needed for quorum, and the cluster was forced offline.

Our first alerts from monitoring started to come in at 11:52 AM, and we were able to respond to the issue by 11:55 AM and post an advisory. Around 12:10 PM Eastern we were able to reestablish quorum in Dallas and bring the cluster online. Around 12:15 PM MBOX1 mounted all active databases, and by 12:20 PM MBOX2 mounted all databases.

Since this event was a communication issue between sites and network interruptions were detected, the passive site was marked as Blocked (unable to automatically mount databases for failover).
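For readers who want to check the vote counts, here is a minimal sketch of the majority arithmetic in Python. It assumes five voters, three in Dallas and two in LA, which is what the 4/5 and 2/5 figures above imply. The LA voter names (`LA-A`, `LA-B`) and the helper functions are hypothetical and only for illustration; this is not how Exchange or the cluster service computes quorum internally.

```python
# Majority-quorum arithmetic for the five-voter scenario described above.
# Assumption: 3 voters in Dallas and 2 in LA. The LA names are hypothetical;
# the post does not name them.
VOTERS = {
    "MBOX1": "Dallas",
    "MBOX2": "Dallas",
    "MBOX3": "Dallas",  # tie-breaking voter, offline for maintenance during the incident
    "LA-A": "LA",
    "LA-B": "LA",
}


def has_quorum(online_votes, total_votes):
    """Quorum holds only while a strict majority of voters is online."""
    return online_votes > total_votes // 2


def online_count(offline_nodes=(), offline_sites=()):
    """Count voters that are neither individually offline nor in an offline site."""
    return sum(
        1
        for node, site in VOTERS.items()
        if node not in offline_nodes and site not in offline_sites
    )


total = len(VOTERS)

# LA drops while all Dallas voters are up: Dallas alone still has a majority.
normal_la_loss = online_count(offline_sites={"LA"})

# MBOX3 is down for maintenance, both sites still connected.
maintenance = online_count(offline_nodes={"MBOX3"})

# MBOX3 is down for maintenance and LA drops: majority is lost.
incident = online_count(offline_nodes={"MBOX3"}, offline_sites={"LA"})

for label, votes in [
    ("LA loss only", normal_la_loss),
    ("MBOX3 maintenance only", maintenance),
    ("MBOX3 maintenance + LA loss", incident),
]:
    print(f"{label}: {votes}/{total} online, quorum={has_quorum(votes, total)}")
```

The sketch shows why the maintenance window mattered: either event alone leaves at least 3 of 5 voters online, but the two together leave only 2.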