Some customers report that they are still not receiving backup reports. We have been unable to find a pattern or common link among these clients and have enlisted the help of Ahsay. We will update this ticket when we receive a solution from Ahsay.
Update 5:37 PM Eastern: We’ve finished patching backup73 and service is back online. SMTP reports should now be functional.
Update 5:25 PM Eastern: We will be shutting down the OBS service on backup73 to install a patch that resolves the SMTP issue.
Our US OBS server, backup73, has been experiencing problems authenticating users and sending out successful backup reports. To address the issue, we will be rebooting backup73 today. Expect 15-20 minutes of scheduled downtime while the server comes back online. We will update this posting when the reboot has been initiated.
Update 10:48 PM (GMT -4): Backup73 is back online. All services are now operational.
Update 10:29 PM (GMT -4): The reboot will be initiated momentarily.
Update: The reboot has been delayed to tonight. We will update this post when the reboot has been initiated.
We will be conducting out-of-band maintenance on backup73.ownwebnow.com to introduce new monitoring systems and improve performance. The maintenance will start at 8 PM EST and complete by 10 PM EST (about 2 hours).
We hope this does not cause undue inconvenience.
We are currently tracking issues that have been reported by multiple users:
- Email reports for ExchangeDefender SPAM quarantines are not being delivered to the users who have been configured to receive them. So far we have narrowed it down to the 00:00 EST reporting interval for daily reports. We will know more about this around midnight.
- Offsite Backup reports are not reaching some clients. We are working with Ahsay to isolate the issue and will likely be applying a hotfix later in the day. This is not a widespread issue either, but we are taking it seriously since it has been reported multiple times.
We will update as we get more information.
We are currently performing maintenance on the USA offsite backup infrastructure to add more capacity to the logging partitions. The service will be restored today and emergency restore access is still available.
This issue does not affect Offsite Backups Europe.
We must have angered the Internet gods because this Monday has been nothing short of tremendously disappointing. Pictured below is my staff working on the issues:
On to the specifics:
- ExchangeDefender reports did not run last night and will likely remain offline until close of business today.
- We have had two switch crashes on our load balancers in front of our shared mail1 and www1 hosting services.
- Our offsite backup upgrade does not seem to be validating the certificate requests, so https:// requests are failing. http:// still works fine, and data is encrypted on the client side, so the transport mechanism isn’t as relevant. However, if you’ve set https://, your backups are failing, so we are treating this as a very serious issue.
Somehow, the roof is still above us and we have power. For now.
My teams are working through all the outstanding issues, and we will have service restored to 100% across the entire product portfolio by the end of business today.
Update: As of 5 PM EST the ExchangeDefender reporting is back online and all the network issues have been resolved. The Offsite Backup service is still available via http://, but we are still working with Ahsay to get the certificate issue resolved. I will update further on this as soon as I have more information.
Update: As of 11 PM EST all offsite backup grids now respond with valid SSL certificates on the SSL port.
Looks like the ugly Monday is finally behind us.
Vlad Mazek, CEO
We are conducting maintenance on our Offsite Backups architecture. 08:00–14:00 EST is our slowest time of the day and we’ll have the systems back online in time for the nightly backups.
We are currently performing some investigative maintenance on backup73.ownwebnow.com to ensure data integrity and troubleshoot some failed logins by our clients. Please stand by; we will update with the resolution time momentarily.
Update: Service restored.
At roughly 5 AM EST (GMT -5) our primary backup proxy server in Dallas, TX went down for basic hardware maintenance. Upon restart, the primary RAID array controller lost its boot configuration and the system hung after all the drives were initialized. As you may imagine, our backup infrastructure is huge and a restart can take 30-40 minutes to spin up all the drives over all controllers. It takes a while to determine the issue when all hardware reports correctly, but the case was escalated and addressed right away.
We are sorry about the inconvenience.
Note: Our storage infrastructure does not follow the same maintenance interval as the remainder of our network. While almost all of our services have the least usage during early Saturday morning hours (EST), offsite backups tend to have the heaviest usage during those hours. Large backup sets are usually scheduled to start Friday afternoon after most 9-5 workers leave, and they generally run through the weekend. Likewise, we do global network snapshots over the weekend right before major maintenance tasks on the network. For this reason, our maintenance window for offsite backups is pushed up one day.
We have noticed that a number of offsite backup reports generated since midnight today contain broken formatting. We are currently researching what may have caused these template problems and will update this ticket when we have it resolved.
Update 5:17 PM EST (GMT -5): We believe the issue has been resolved but are still watching it. We will be powering down the grid tomorrow and applying another patch to improve performance. We will update later.
We are currently investigating a critical failure on our offsite backup grid. Several nightly maintenance scripts that control retention area expiration and quota calculations are causing the backup process to hang and eventually stop taking requests. This causes backup processes on client systems to fail.
Please be advised that customers whose backup schedules run between 1 AM and 2 AM EST (06:00 to 07:00 GMT) may have had their backups fail. We are still investigating the issue and will update this ticket with a further advisory when the problem has been isolated.
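For anyone who wants to check whether one of their scheduled backups fell inside the affected window stated above, here is a minimal Python sketch. It assumes a fixed EST offset of GMT -5 (no daylight saving), treats both ends of the window as inclusive, and uses a hypothetical backup start time and function name (`in_affected_window`) purely for illustration.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: EST is a fixed GMT-5 offset, as stated in the notice.
EST = timezone(timedelta(hours=-5), "EST")

# Affected window from the notice: 06:00 to 07:00 GMT (inclusive here).
WINDOW_START = time(6, 0)
WINDOW_END = time(7, 0)


def in_affected_window(backup_start: datetime) -> bool:
    """Return True if a timezone-aware start time falls in the affected window."""
    gmt = backup_start.astimezone(timezone.utc)
    return WINDOW_START <= gmt.time() <= WINDOW_END


# Hypothetical schedule: a backup that starts at 1:30 AM EST.
example = datetime(2010, 1, 5, 1, 30, tzinfo=EST)
print(in_affected_window(example))
```

Any timezone-aware start time can be passed in; the function converts it to GMT before comparing, so schedules configured in other time zones are handled the same way.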
Update: Tuesday 8 AM EST (13:00 GMT): We have been able to isolate and fix the issues with Offsite Backups with the help of our vendor. The server no longer hangs around 1 AM, thanks in part to a hardware upgrade. As a precaution, we will be applying the same memory upgrade to all our systems, which will cause brief outages over the next few days. We will post on this site before each such maintenance cycle. Beyond these announced outages, you should not expect to see any instability with the service.