12:28 PM (EST) on 6-22: Our engineers discovered an issue on a couple of individual nodes within the ExchangeDefender network that may have caused some temporary delays to both inbound and outbound messages. Our ExchangeDefender engineers are working diligently to resolve this issue. From all of us here at OWN Web Now, we would like to offer all of our partners our sincerest apologies for this unforeseen issue. Please rest assured the issue will be resolved.
12:40 PM (EST): This issue has been resolved and all spooled mail has been delivered. It was caused by a delay in response between two of our core systems within ExchangeDefender and should not recur.
June 22, 2011: Delivery Delays
June 9, 2011: HUEY Maintenance Continued
Per our previous NOC posting, we’ve been redesigning our maintenance plan for rebalancing the user distribution on HUEY. The original plan to defrag the database was abandoned because the timeframe for completion was not acceptable. Tonight, starting at 9 PM Eastern, we will be taking the HUEY database offline for about 5 minutes as we clear out the memory cache in preparation for tonight’s mailbox moves. Throughout the night we will be moving users between two new databases to even out the load. Mailboxes that are actively moving will be inaccessible to users during the move, as Exchange 2007 does not feature online moves. Upon completion, users will be able to access their mailbox on the new database. Move times will depend on the mailbox size and item count.
Update 6:45 PM Eastern: After our previous update our metric test completed, and we’ve noticed that there are write lock delays on the OS drive for HUEY. We’ve made an adjustment to the outline above. Prior to starting the mailbox moves we will be performing a full database backup at the NTFS file level. Unfortunately this means we will have to take the database offline, as we are capturing a raw file backup instead of a VSS backup. After the backup is completed we will run a surface error scan on the OS drive for HUEY to check for any corruption. We anticipate this entire process will take up to 4 hours to complete. We will update this post as progress is made, starting at 9 PM when work begins.
Update 8:30 PM Eastern: The backup job of the OS is taking a bit longer than expected. We are pushing back dismounting the database to 9:30 PM. We will update this blog after 9:00 PM if we anticipate the backup taking longer.
Update 8:50 PM Eastern: We’ve received requests from a few West Coast customers asking us to postpone maintenance until 10. In the interest of disturbing service as little as possible, we will be postponing maintenance until 10 PM Eastern.
Update 10:45 PM Eastern: The backup is estimated to complete in the next hour. We will then begin the surface error test on the OS drive. This is estimated to be the longest part of the process and will require a disruption of service as we take the server offline. We estimate the entire process to take 4 hours, as described earlier. We will update this post once the work begins.
Update 11:30 PM Eastern: We are beginning to dismount the mailbox databases and stop Exchange services.
Friday 6/10/11
Update 2:05 AM Eastern: The surface area test has revealed issues on the OS drive. We are running a repair on the drive and monitoring the progress.
Update 4:17 AM Eastern: We’ve replaced a bad drive in the OS drive array on HUEY and we are proceeding to perform integrity checks before turning on any services.
Update 5:47 AM Eastern: The integrity check failed and we will be restoring from the backup image taken prior to maintenance. We will continue to update this blog as progress is made. To clarify, the issue is specifically with the operating system and not the database integrity.
Update 9:44 AM Eastern: The restoration process is proceeding as planned. This is a courtesy update to assure partners that work is continuing.
Update 12:20 PM Eastern: The restoration surface test is underway and we are looking to confirm data consistency on the OS drive.
Update 1:57 PM Eastern: In order to achieve resolution as quickly as possible, we are beginning to restore the backup image concurrently on a spare server to rule out any potential issues that may be affecting the physical host.
Update 7:00 PM Eastern: The integrity check has processed half of the files on the OS drive and overall progress is about 25 percent complete.
Update 9:00 PM Eastern: This is a courtesy update, as the process above is still continuing successfully without any halts. We understand this is an urgent issue and we appreciate your patience with this process.
Saturday 6/11/11
Update 1:30 AM: The integrity check has processed about 90% of the files on the drive and overall progress is near 75% complete.
Update 2:35 AM: The integrity check has completed and we’ve successfully booted Windows into safe mode. We are now proceeding to boot normally and resume services on HUEY.
Update 3:15 AM: Service on HUEY has been restored and all queued mail is being delivered to user mailboxes.
June 8, 2011: HUEY Maintenance tonight
Tonight starting at 9:00 PM Eastern we will be taking the mailbox databases on HUEY offline to perform an offline defragmentation. We anticipate this will take up to 4 hours, during which mailbox access will be offline until the database is remounted. Clients are able to use LiveArchive during the maintenance window to continue working with live mail.
Update 8:35 PM Eastern: We will begin work in 30 minutes, starting with dismounting the database and copying it to a temporary storage drive, and then starting the offline defrag. After the defrag completes, we will mount the database from the temporary location and stress test its integrity. Once we’ve assured integrity we will copy the database back to the active RAID controller. We estimate that each step may take up to 2 hours to complete, but we will update this post along the way.
Update 9:10 PM Eastern: We are pushing back maintenance by one hour as we rearrange the temporary storage iSCSI server to increase overall speed, which we expect will lower the overall time.
Update 10:09 PM Eastern: We will begin the process outlined above in 5 minutes.
Update 11:00 PM Eastern: Based on current progress, we do not feel that we have enough time allocated for this process, even with our earlier changes. We’ve remounted the current mailbox database. We are formulating a new plan for a solution that will run in parallel.
HUEY Memory replacement
We’ve received alerts from our monitoring software about faults in the memory in HUEY. We’ve dismounted the database to avoid any corruption while we test the memory and replace it if needed. We currently do not have an ETA for service restoration, but we will update this blog as information is obtained.
Update 2:50 AM Eastern: Service was restored at 2:30 AM and all services have been confirmed online.
June 2, 2011: DEWEY CAS Issues
We’re going to perform an emergency reboot on DEWEY. While we’re unable to replicate the reported issues connecting to profiles on DEWEY, we have received enough reports that we need to ensure all of our clients can access their profiles. We’ll be doing this in 10 minutes. We should be back online within 15 minutes of that time.