Update: We are currently investigating performance problems from 5 AM – 8:20 AM on our outbound network. One of the load balancers handling the outbound mail failed, creating a performance issue on the other server and backing up customers’ SMTP queues due to load throttling. No mail was lost.
This issue was resolved at 8:00 AM EST but is being monitored further as the systems catch up and sync up.
Earlier this morning you may have received the following error while relaying through ExchangeDefender:
< outbound2.exchangedefender.com #5.7.1 SMTP; 550 5.7.1 … Access denied to 18.104.22.168 by new.spam.dnsbl.sorbs.net DNSBL (http://dnsbl.sorbs.net/)>
This issue has been corrected and mail is flowing through correctly. If you received this error, please resend the message, as it was permanently rejected by the recipient’s mail server.
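For anyone wondering how a DNSBL rejection like the one above comes about: the receiving server reverses the octets of the sending IPv4 address, appends the blacklist zone, and looks that name up in DNS. Below is a minimal sketch of how the query name is built. The function name and the sample address (taken from the documentation range 192.0.2.0/24) are illustrative assumptions, and no lookup is actually performed.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a mail server would look up for an IPv4 DNSBL check."""
    # IPv4Address raises ValueError on malformed input, so bad data fails early.
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# The zone is the one named in the error text; the address is a documentation
# placeholder, not one of our servers.
print(dnsbl_query_name("192.0.2.10", "new.spam.dnsbl.sorbs.net"))
```

If the name resolves, the address is listed and the recipient's server may refuse the message; if the name does not exist, the address is not listed.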
AT&T is having RBL issues again and we are working with them to resolve the problem. For the time being, you may receive this error when emailing the AT&T network:
<<< 521-22.214.171.124 blocked by sbc:blacklist.mailrelay.att.net.
<<< 521 DNSRBL: Blocked for abuse. See http://att.net/blocks
554 5.0.0 Service unavailable
We have put in place a workaround and are working with AT&T to resolve the issue. You should not continue to see this problem. However, the issue is still open.
We will be conducting a major maintenance window this Sunday, November 8th, 2008.
We will be deploying a series of hotfixes provided by Microsoft for a slew of bugs Own Web Now Corp has reported over the past six months. We have also received a lot of guidance on optimizing our setup and will, with Microsoft’s help, proceed to make major adjustments to the platform.
Unfortunately, this means that some users may experience issues during the early AM hours on Sunday. Although our systems are clustered, some changes require database moves and service restarts which have to be done in sync and will unfortunately lead to service interruptions.
Our goal, as always, is to keep these service interruptions to a minimum and limit them to maintenance hours. However, since these issues will be sporadic throughout the night, we wanted to note them here.
After the initial test on our own Exchange 2007 network we will be applying the same fixes and optimizations to our dedicated server clients running Exchange 2007.
ExchangeDefender will not be impacted. However, your mail may experience a slight delay if you are on an Exchange 2007 mailbox store which is being cycled and ExchangeDefender is not able to deliver the message immediately. In this case we recommend that all our mission critical 24/7 operations fail over to LiveArchive, which will be available.
Over the past 12 months we have had a 99.999% uptime on our Exchange 2007 network and 100% uptime on our ExchangeDefender network. Those numbers are impressive but only possible thanks to preventive maintenance and optimizations as noted above. We apologize in advance for any inconvenience you experience during the maintenance cycle.
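To put the 99.999% figure in perspective, here is a quick back-of-the-envelope calculation of how much downtime that allows over a year. It assumes a 365-day year and is only arithmetic on the percentage quoted above, not a measurement taken from our monitoring.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent: float,
                             period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Return how many minutes of downtime fit within a given uptime percentage."""
    return period_minutes * (1 - uptime_percent / 100)


# Roughly five and a quarter minutes per year for "five nines".
print(round(allowed_downtime_minutes(99.999), 2))
```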
We have been made aware of an issue with email reports showing all 0’s for SPAM stats. The issue has been resolved as of 11:30 AM EST.
Please note that we do not recommend using email reports and encourage everyone to migrate to the new methods of accessing SPAM: realtime web portal, desktop agent or Outlook 2007 agent.
At approximately 4 AM EST we noticed a failure in updates from one of our AV vendors. That failure produced higher than expected virus matches, which ended up queuing a larger than normal amount of messages. We have resolved the issue with the update and are currently re-processing all the mail that was quarantined over the past few hours.
Please stand by, we will deliver all mail.
Update: 7:29 AM EST: Nearly all the mail that was affected by the faulty AV update has been processed and has been dispatched to delivery queues. As of the previous update, all new mail has been delivered in realtime. It is important to note that we are only processing the backlog for the messages that did get trapped by the faulty AV update.
Update: 9:15 AM EST: 99% of the messages have been flushed out. By the time you read this posting all the mail should have been delivered. No mail has been dropped during this period. If you experience further issues with delays, please follow our deployment guide and support documentation. We find most delays are related to on-premise issues: improper firewall configuration, connection rate limiting (by far the most common) and other SPAM/malware scanning that does not properly whitelist ExchangeDefender systems.
As you may have noticed over the past few weeks, the SPAM levels have increased slightly. Unfortunately, even a slight increase in the SPAM levels as a percentage can result in getting a piece or two an hour as opposed to a piece or two a day. Yesterday we finally isolated the issue that was causing this thanks to a few of our partners and the new ExchangeDefender Outlook 2007 addin. We are still working on automating the distribution and monitoring of the new processes that will keep this from coming up again.
ExchangeDefender has multiple grids around the world. All grids use a central RBL distribution database that is centrally managed and monitored. Every grid has its own DNS caching servers that hold both the RBL data and our clients’ IP address information for delivery, routing and SPAM definitions. Since the latest update to our core distribution, DNS server performance has been flaky and the servers would at times simply stop returning results. Because our RBL code looks for matches in the RBL zone, a server’s lack of response (or lack of a correct response) meant that messages that were certainly SPAM were allowed to go through the less restrictive SPAM scanning. That unfortunately accounts for a 1-2% difference in the SPAM load and, in some cases, latency on nodes that are about to go into shutdown/maintenance mode and are flushing out their queues. Because ExchangeDefender delivery queues run off the same DNS infrastructure (a technical limitation), the problem is compounded: resolutions do not come from the primary (on-node) or secondary (on-grid) DNS server but from a tertiary (central OWN NOC) one.
What we have done so far is implement a system that performs a local resolver check and restarts the DNS service if it is not returning proper data.
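To illustrate the idea, here is a minimal sketch of such a check. It is not our production code: the probe name, the expected answer and the restart hook are all hypothetical, and the resolver is passed in as a plain function so the example runs without touching the network.

```python
from typing import Callable, Optional


def check_resolver(resolve: Callable[[str], Optional[str]],
                   probe_name: str,
                   expected_answer: str,
                   restart_dns: Callable[[], None]) -> bool:
    """Return True if the local resolver answers the probe correctly.

    On a missing or wrong answer, call restart_dns() and return False.
    """
    try:
        answer = resolve(probe_name)
    except Exception:
        # A timeout or lookup failure counts as "no answer".
        answer = None
    healthy = answer == expected_answer
    if not healthy:
        restart_dns()
    return healthy


# Demo with stand-ins: a resolver that has stopped returning results.
restarts = []
ok = check_resolver(lambda name: None,
                    "probe.rbl.example",   # hypothetical test record
                    "127.0.0.2",           # hypothetical expected answer
                    lambda: restarts.append("restart"))
print(ok, restarts)
```

In a real deployment the `resolve` argument would wrap an actual lookup against the on-node DNS server, and `restart_dns` would call whatever service manager the node uses.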
What we are currently working on is a monitoring system that centrally reports resolver latency (one of the things we currently do not measure) when lookups have to skip to the secondary or tertiary systems.
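As a rough sketch of what that measurement could look like, the function below tries each resolver tier in order, times every attempt, and records which tier finally answered. The tier labels follow the primary/secondary/tertiary description above. The function name, the result layout and the fake resolvers in the demo are assumptions made purely for illustration.

```python
import time
from typing import Callable, Optional, Sequence, Tuple

Resolver = Callable[[str], Optional[str]]


def resolve_with_fallback(name: str,
                          tiers: Sequence[Tuple[str, Resolver]]) -> dict:
    """Try each (label, resolver) tier in order and time every attempt."""
    attempts = []
    for label, resolver in tiers:
        start = time.monotonic()
        try:
            answer = resolver(name)
        except Exception:
            answer = None  # treat failures as "no answer" and move on
        elapsed_ms = (time.monotonic() - start) * 1000
        attempts.append({"tier": label,
                         "elapsed_ms": elapsed_ms,
                         "answered": answer is not None})
        if answer is not None:
            return {"answer": answer, "served_by": label, "attempts": attempts}
    return {"answer": None, "served_by": None, "attempts": attempts}


# Demo: the on-node and on-grid resolvers return nothing, the central one answers.
result = resolve_with_fallback("probe.rbl.example", [
    ("primary", lambda n: None),
    ("secondary", lambda n: None),
    ("tertiary", lambda n: "127.0.0.2"),
])
print(result["served_by"], len(result["attempts"]))
```

Reporting the `attempts` list to a central collector would show both how often lookups fall through to a later tier and how much time each hop adds.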
We expect to have all the issues handled by the end of the weekend. From statistical breakdowns we know that the issue has not been widespread (only certain users would even have noticed the difference) and only about a dozen people have complained so far. Unfortunately for us, the people likely to notice are the people that get the most mail and the ones that likely love our product the most. We’ll get this one taken care of for you folks, thanks for your patience.
We have received several reports of issues with BT. You may receive this error when sending messages to btinternet.com recipients.
The e-mail system was unable to deliver the message, but did not report a specific reason. Check the address and try again. If it still fails, contact your system administrator.
< outbound2.exchangedefender.com #5.0.0 SMTP; 554 <firstname.lastname@example.org>: Relay access denied>
We have notified BT by e-mail and phone regarding the issue; the problem is on their end. Since this is a configuration issue on the BT network, we have no ETA, no resolution time and no idea of what may be going wrong.
We are currently tracking issues that have been reported by multiple users:
- Email reports for ExchangeDefender SPAM quarantines are not being delivered to the users that have been configured to receive them. So far we have narrowed it down to the 00:00 EST time reporting interval for daily reports. We will know more about this around midnight.
- Offsite Backup reports are not reaching some clients. We are working with AhSay to isolate the issue and will likely be applying a hotfix later in the day. This is not a widespread issue either but we are taking it seriously since it has been reported multiple times.
We will update as we get more information.
We will be extending our maintenance window for the report services this weekend in order to implement the new ExchangeDefender 4.0 functionality. While the reporting should not be impacted during this time, our support teams will have limited visibility to the backend and might not be able to effectively troubleshoot the issues. We are sorry for any inconvenience this might cause your clients but we’re confident you will be pleased with the results.
Maintenance: Sunday, 1 AM EST – 6 AM EST.
We have restored the service to the small portion of users that were affected by its outage this morning. The issue had to do with a hotfix provided by Microsoft. Hope everyone in the midwest and Ohio weathers through the storms that have knocked much of the power out in that region.
We must have angered the Internet gods because this Monday has been nothing short of tremendously disappointing. Pictured below is my staff working on the issues:
On to the specifics:
- ExchangeDefender reports did not run last night and will likely remain offline until close of business today.
- We have had two switch crashes on our load balancers in front of our shared mail1 and www1 hosting services.
- Our offsite backup upgrade does not seem to be validating the certificate requests, so https:// requests are failing. http:// still works fine, and data is encrypted on the client side so the transport mechanism isn’t as relevant, but if you’ve set https:// your backups are failing, so we are treating this as a very serious issue.
Somehow, the roof is still above us and we have power. For now.
My teams are working through all the outstanding issues and will have service restored to 100% across the entire product portfolio by the end of business today.
Update: As of 5 PM EST the ExchangeDefender reporting is back online, all the network issues have been resolved. The Offsite Backup service is still available via http:// but we are still working with AhSay to get the certificate issue resolved. Will update further on this as soon as I have more information.
Update: As of 11 PM EST all offsite backup grids now respond with the valid SSL certificates on the SSL port.
Looks like the ugly Monday is finally behind us.
Vlad Mazek, CEO
We have received several reports this morning about our IP address blocks being on Verizon’s RBL. The following errors were given to some of our customers on ExchangeDefender:
< outbound1.exchangedefender.com #5.5.0 SMTP; 571 Email from 126.96.36.199 is currently blocked by Verizon Online’s anti-spam system. The email sender or Email Service Provider may visit http://www.verizon.net/whitelist and request removal of the block.>
In our calls and discussions with Verizon we have received a confirmation that we are not and have not been on their RBL. At this point the mail is routing correctly so we are just chalking this up to there being a temporary glitch with Verizon’s RBL systems.
In the two days since the reports service was restored, we’ve discovered a few bugs in the system that prevented proper delivery and branding of the SPAM reports. Even though they were generated properly, the reports got routed through the ExchangeDefender inbound network instead of directly to the servers. As a result, some reports may unfortunately have been trapped in the junk mail themselves.
This issue was corrected at 11 PM EST (4 AM GMT).
We are taking advantage of an extended holiday weekend in the United States to perform network upgrades and maintenance as well as a software rollout on our email reporting grid for ExchangeDefender. We have rolled ExchangeDefender 4.0 upgrades into this system and are taking an extra day to put it through its paces and make sure it’s 100% solid.
For our customers abroad that will be affected by the email reports please keep in mind that this legacy system is just one of the ways to access junk. The recommended and preferred way of accessing SPAM quarantines for ExchangeDefender is the web portal at https://admin.exchangedefender.com and we also offer the SPAM Monitor desktop software with hourly alerts. We anticipate regular daily reports to resume on Tuesday.
Earlier today we completed the rollout of 450 new servers to the ExchangeDefender family all over our American network. The introduction and initial sync of the new nodes did allow some junk through and introduced a slight delay (maximum reported 1 hour, from one system that nearly immediately went into maintenance mode), but as of roughly 11:30 AM EST all is good.
An additional 600 nodes are planned in our global expansion leading up to the ExchangeDefender 4.0 launch. We are also looking at additional data centers on both coasts, at the moment scheduled to go live this fall.
Update: 2:24 PM EST: We are happy to report that all the nodes have now converged in the scanning network and the SPAM filtering is back at its usual levels (and to be tightened up even further later tonight). You may have seen an increase in SPAM over the past few hours while the nodes were joining the network and accepting new programming but you should be seeing far less SPAM going forward.
We have several reports from our UK and Ireland customers of a rise in the amount of junk mail passed through ExchangeDefender this morning. Aside from a strain of CNN-forged SPAM, we are not seeing any issues in ExchangeDefender, nor do our stats show anything out of the ordinary at the moment. We are investigating the situation.
The CNN SPAM is already in the filters and should no longer be getting through. For anything else that may slip through, please forward the message with SMTP headers to email@example.com and we will gladly investigate it.
Update: We had a rule update that unfortunately offset all the other CNN rules and let that junk through. The team is now filtering it through both the pattern search and a hyperlink drop on the domains used to get traffic. We are seeing a few other SPAM strains getting more popular today as well (Wall Street Subscription scam, fake MSN alert to download Internet Explorer 7). All of these are now effectively being filtered by ExchangeDefender, which undergoes thousands of updates a day. Because the CNN rules have been changing a lot over the past few days, and in light of the six complaints we got this morning, we felt it was important to update in more detail than usual.
Update 2: We are seeing things under more and more control as we continue to filter out the strains of the three major junk items. As a matter of policy we do not publish our filtering technology or keywords or scores but we are currently tracking the variants of CNN, WSJ, Internet Explorer 7 and a few smaller ones.
Over the weekend we tested and perfected a new method for managing dangerous content embedded in archives. During the deployment of the new software some archives (.zip, .arj) were improperly classified as dangerous and removed. That issue has been solved as of Sunday evening.
As a point of reference, ExchangeDefender does not allow executable attachments (.exe, .bat, .com or .pif) in either standalone or archived mode. That means even if you zip the file up it will be picked up by a scanner. If you zip a zip file, the system will refuse to process it. This has been our long-standing policy of not allowing dangerous content through the network, because virus scanners sometimes do not react quickly enough to the rise in malware and our responsibility is to protect our customers. If you really need a dangerous attachment, for the safety of the less IT savvy people in your organization, please try to find alternate means such as a web sharing tool or a freemail account.
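For the curious, the policy above can be approximated in a few lines. This is a simplified sketch and not the ExchangeDefender scanner: it only understands .zip (the Python standard library cannot read .arj), it judges files by extension alone, and rejecting unreadable archives is our assumption for the example rather than documented behavior. All function names are illustrative.

```python
import io
import zipfile
from pathlib import PurePosixPath

BLOCKED_EXTENSIONS = {".exe", ".bat", ".com", ".pif"}  # from the policy above
ARCHIVE_EXTENSIONS = {".zip"}  # .arj omitted: no standard-library reader


def classify_attachment(filename: str, data: bytes) -> str:
    """Return 'accept' or a 'reject: ...' reason for a single attachment."""
    ext = PurePosixPath(filename.lower()).suffix
    if ext in BLOCKED_EXTENSIONS:
        return "reject: executable attachment"
    if ext in ARCHIVE_EXTENSIONS:
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                for member in archive.namelist():
                    member_ext = PurePosixPath(member.lower()).suffix
                    if member_ext in ARCHIVE_EXTENSIONS:
                        return "reject: nested archive"
                    if member_ext in BLOCKED_EXTENSIONS:
                        return "reject: executable inside archive"
        except zipfile.BadZipFile:
            # Assumption for this sketch: an archive we cannot open is refused.
            return "reject: unreadable archive"
    return "accept"


def make_zip(members: dict) -> bytes:
    """Helper for the demo: build an in-memory zip from {name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


print(classify_attachment("setup.zip", make_zip({"setup.exe": b"MZ"})))
print(classify_attachment("notes.zip", make_zip({"notes.txt": b"hello"})))
```

The first call is rejected because of the executable inside the archive, while the second archive contains only a text file and is accepted.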
We have also addressed this need in ExchangeDefender 4.x which is scheduled for August 19th.
We are currently addressing a processing delay in the ExchangeDefender antivirus scanning engine. One of our virus engine vendors distributed a faulty update, which has caused a backlog of messages that have been quarantined for further inspection.
The ExchangeDefender system is designed to apply stronger scrutiny and more intensive checks against any attachments or messages that have produced any sort of an error in any of our antivirus scans. The reason we run multiple scanner engines is because not all engines are as thorough or as rapidly updated as the threats emerge and change in the wild. Once an issue is encountered we scan with more options to find out if the message is indeed dangerous or if there is something wrong with a portion of it (attachment, envelope).
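The escalation rule described above can be sketched roughly as follows. The verdict names, the routing labels and the choice to block on any positive match are assumptions made for illustration; this is not our actual scanning pipeline.

```python
from enum import Enum
from typing import Callable, Iterable


class Verdict(Enum):
    CLEAN = "clean"
    INFECTED = "infected"
    ERROR = "error"


def triage(message: bytes,
           engines: Iterable[Callable[[bytes], Verdict]]) -> str:
    """Run every engine and decide where the message goes next."""
    verdicts = []
    for engine in engines:
        try:
            verdicts.append(engine(message))
        except Exception:
            # An engine that crashes is treated the same as one reporting an error.
            verdicts.append(Verdict.ERROR)
    if Verdict.INFECTED in verdicts:
        return "block"  # assumption: any positive match stops the message
    if Verdict.ERROR in verdicts:
        return "quarantine_for_deep_scan"  # the slower, more thorough path
    return "deliver"


# Demo: one healthy engine and one that errors on everything it sees.
healthy_engine = lambda msg: Verdict.CLEAN
faulty_engine = lambda msg: Verdict.ERROR
print(triage(b"hello", [healthy_engine, faulty_engine]))
```

With a rule like this, a single engine that returns an error for every message pushes all mail onto the expensive deep-scan path, even when every other engine reports it clean.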
In this case the corrupt message was passed on to ExchangeDefender, which quarantined messages for further scanning, a far more expensive and processor-intensive step. We responded immediately and removed the engine. However, even slight issues can cause huge problems when you process as much mail as we do, and this one has introduced a slight delay in the processing of messages. The issue started at roughly 3:10 and was resolved by 3:40. At the time of this message we see around 60% of our nodes processing messages within our ordinary SLA (seconds) and we expect the rest of the network to catch up shortly.
If you experience any delays, even extensive ones, they are due to the above problem, which will be completely under control within 30 minutes.
Earlier today we had to flush the queues on the ExchangeDefender outbound server due to the large number of corrupt queue files sent by one of our customers’ malfunctioning servers. If your messages were not delivered during the window between 5 AM – 7 AM Central (GMT -6), please resend them.
The problem has been solved temporarily, but we will be holding an urgent maintenance window this Wednesday, 5/14, to address the core of the problem.
P.S. A significant number of servers were backlogged during this process. That mail has been processed without issue.