Early last week we identified an issue with ExchangeDefender whitelists. At first we believed the issue was simply that certain nodes were not receiving full replication, but after that turned out not to be the case we focused on the internal processes that generate node whitelists. The problem turned out to be a bug in the replication of whitelist content between users and the master whitelist.
We have been working on it through our maintenance interval and have added a few more hours today for additional testing to make sure it is working properly. Because of the bug we have rewritten how the whitelist data is replicated, so the monitoring scripts that watch for replication failures had to be adjusted as well.
We are currently addressing a hardware issue on the Linux hosting cluster that is affecting web, FTP and SQL services. We will update the ticket when significant progress has been made. At this time we do not have an ETA, as the file systems are being checked for errors.
Update: 6 PM EST: Over 90% of the service has been restored. We are now moving on to the more complex, database-driven systems.
Update: Saturday 8 PM EST: All sites, including the complex SQL-driven ones, are now online. The service has been upgraded to new versions of SQL and PHP and to a more robust, flexible storage array.
At roughly 5:10 PM EST we experienced a routing issue that resulted in massive packet loss on our corporate subnet. Access was available again almost immediately, and it took approximately 4 minutes to restore services to 100%.
This issue would not have affected client services, as our corporate subnet hosts only OWN services (Shockey Monkey, support portals, monitoring systems).
While most people likely never noticed, this is now an open issue because the BGP connection should have failed over instantly. The incident was covered in real time on our Twitter feed.
Earlier today we encountered several tempfail mail delays caused by a piece of software relying on the DSBL real-time blacklist, which is no longer actively maintained. We have removed DSBL from our web hosting mail servers, which has resolved the delays, and the servers are currently processing the mail backlog.
Note: The issue applied only to mail1.ownwebnow.com. It did not affect any other services, such as Exchange hosting or ExchangeDefender. The issue has been resolved and there should be no delays on the network at this time.