We are currently working with RoadRunner (formerly Time Warner, AOL), a service provider in the United States. They are experiencing issues with their SMTP servers and are randomly rejecting SMTP traffic. Mail is currently flowing through, but some is bouncing back from them for a reason they are still trying to narrow down. We will update when we have further information or a resolution.
This issue affects our entire global network, and some external sites we have tested.
Update: 6:34 PM EST: Even though we have not been officially updated, the problems with RoadRunner appear to have been resolved.
Earlier today we identified a major bug in the system that was used to generate statistics for the daily and intraday SPAM email reports for some users. Although the issue affected only a few thousand people, I have chosen to pull it out of the production systems to avoid further confusion and lack of email report integrity. As soon as the bug fix is tested thoroughly, we will be placing it back into production. In the meantime, you will no longer see the “Non SPAM Mail” total under statistics.
ExchangeDefender daily and intraday reports are built using SQL queries against the mail log database. Three queries are executed for each report: one to obtain the SPAM messages, one to obtain the SureSPAM messages and one to obtain the total number of rows in the table (SPAM, SureSPAM and messages let through). Each SPAM query is executed inside a check that verifies whether the user's settings store/quarantine junk mail, because if the messages are delivered and/or deleted we have nothing to report. Totals for SPAM and SureSPAM are calculated within the respective settings check blocks. For example:
if (User Quarantines SPAM)
Get SPAM Total
Get SureSPAM Total
Get Total Messages Received
Not SPAM = Total Messages – SPAM Total – SureSPAM Total
The problem with the Not SPAM count arose when the user did not store/quarantine their SPAM or SureSPAM, which meant the blocks of code that calculate those totals were never executed. The Non SPAM total would not have the correct amount of SPAM or SureSPAM subtracted from it, so it appeared to users as if they were missing messages, because they were certainly not receiving as many messages as the report indicated.
We figured we could save a few cycles by not running an extra query and total if the users did not store/quarantine SPAM or SureSPAM. Unfortunately, the equation for Not SPAM did not take that check into account: instead of subtracting the correct totals for SPAM and SureSPAM, which are still logged but never reported, we were subtracting zero, thereby inflating the Not SPAM total for certain users.
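The behavior described above can be illustrated with a minimal Python sketch. This is not the production report code: the LogCounts structure, the function names and the settings flags are hypothetical stand-ins for the SQL queries and user settings, used only to show why the skipped totals inflated the Not SPAM figure.

```python
from dataclasses import dataclass


@dataclass
class LogCounts:
    """Hypothetical per-user counts taken from the mail log."""
    total: int      # every message received, including SPAM and SureSPAM
    spam: int       # messages logged as SPAM
    sure_spam: int  # messages logged as SureSPAM


def not_spam_buggy(counts: LogCounts, quarantines_spam: bool,
                   quarantines_sure_spam: bool) -> int:
    """Totals are only fetched inside the settings checks, so they stay 0
    for users who do not quarantine, and 0 is what gets subtracted."""
    spam_total = 0
    sure_spam_total = 0
    if quarantines_spam:
        spam_total = counts.spam
    if quarantines_sure_spam:
        sure_spam_total = counts.sure_spam
    return counts.total - spam_total - sure_spam_total


def not_spam_fixed(counts: LogCounts) -> int:
    """Always subtract the logged totals, whatever the quarantine settings."""
    return counts.total - counts.spam - counts.sure_spam
```

For a user with 100 logged messages, 30 SPAM and 20 SureSPAM who quarantines neither, the first function reports 100 Not SPAM messages while the second reports 50.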
The good news is that it was simple enough to fix. I am sorry for all the frustration that has come out of this, as my support staff, my partners and my clients were all seeing different results across the network. Since I am responsible for the above, I apologize for all the problems this has caused you.
Over the past six months, as the volume of SPAM has increased nearly exponentially, we have seen more and more large mail servers fail and start rejecting mail outright. Here is an update from 1&1:
Dear Mr. Kim Lannia,
thank you for your request.
Due to a problem in our systems on Friday we rejected your emails between 4:30 pm and 6:00 pm (central european time).
We apologize for these problems, sending email is now possible again.
1&1 Internet AG
Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren
We have been working on a new infrastructure upgrade to address some of these misconfigurations, some popular, others not so much. We have directly investigated hundreds of “why didn’t my mail get there?” support tickets and in all but three (and orange.co.uk) the mail got to the recipient’s mail server without problem. As a result, we throttled down our notifications so that the users receive an alert within 3 hours and within 1 day of deliveries being delayed, deferred, rejected or dropped. That way the users can contact the person directly if the communication is urgent but the mail systems are not working as they should.
Later this week we will put into production split mail relays over multiple networks that will implement the same intelligent routing technology we use in our inbound servers. Paperwork to get the agreements with larger ISPs (AOL, Yahoo) takes a little while but we are confident it will get done this week. Some system changes will be required if you use SPF records and they will be noted here as we get closer to putting those into production.
Earlier this morning Yahoo and Yahoo UK & Ireland started experiencing problems with their RBL code. As a result, a large number of messages from our customer base to theirs have been rejected. We are still working with Yahoo to resolve the issue, and a case is still open.
Users that had their email bounced would have seen returns in their inboxes. Please ask them to resend the messages. We will update the ticket when the issue is completely resolved.
Mail is flowing without issue at the moment, but the case with Yahoo is still open. We will update this site when the issue has been resolved completely.
The ExchangeDefender 4.0 engine has been online for a few days now and as of Monday/Tuesday night (USA time) we have addressed all the outstanding issues regarding latency, non-delivery, rejections and the garden variety of performance problems. Today we had a relatively flawless day, with the highest SPAM detection rate ever and the lowest false positive rate ever, just the best thing we could have hoped for given the very smooth upgrade.
With that in mind, we want to beg you to open a support ticket if you notice anything, even something minor, wrong with the performance and reliability of ExchangeDefender over the last 24 hours at most (don’t bother going back further than that as we cannot address problems that have very likely already been fixed).
So here is to a great ExchangeDefender 4.0 engine. If you see anything unusual please open a support request and we will investigate it with the highest priority free of charge. We absolutely appreciate your help and the time you put in to help us improve the product.
We are currently in the process of bringing the next generation of reporting infrastructure to ExchangeDefender. The new grid is currently being activated and the activation will be completed within two hours. This work is meant to create a more reliable way to deliver email reports.
We will update as soon as everything is back to 100%.
Update: (8:00 GMT, March 5th, 2008): Maintenance window completed after 53 minutes, 8 seconds. All systems are back to normal and are catching up. We expect the transaction latency to get back to realtime within the next 20 minutes. Update coming shortly on resuming of daily and intraday reports.
Update: (9:30 GMT, March 5th, 2008): Currently working on restoring intraday, daily and on-demand reporting features. We expect the work to be completed by noon, EST.
Update: (15:00 GMT, 10 AM EST, March 5th, 2008): Maintenance completed. Reports (daily and intraday) will resume within a few minutes (at 10 AM central) and will be faster than before. This also opens up a whole new range of reporting options within ExchangeDefender which will launch with ExchangeDefender 4.x. We will also make another offering available immediately to address those of you who have a critical need for ExchangeDefender reports.
The following is an update on the extended network maintenance on the ExchangeDefender networks, spanning both the ExchangeDefender Exchange Hosting network and the ExchangeDefender SMTP Security networks. The work continues; full details on the changes, upgrades and enhancements will be published Monday:
Completed network maintenance tasks:
- Full implementation of split queue and RBL prioritization and resource weights completed.
- Throttle control of delivery queues completed.
- Sender watermarking completed.
- New AS engine and SPAM policy handling completed (99.1% effectiveness, 0 false positives against the new rules, lowest complaint rate at SPAM levels in history of the product, woohoo!)
- Slave server database management tools and self-healing systems online.
- Additional ExchangeDefender nodes online in Dallas and Los Angeles.
- Load balancer enhancements including traffic shaping of spamming networks through the ExchangeDefender staging systems for R&D completed.
- Enhanced logging for NDR, Reject, Malformed Headers, Data Format Errors and more.
- Network latency problems fixed.
Remaining tasks for Sunday:
- Microsoft Exchange 2007 SP1 rollout
- ExchangeDefender split queue delivery handlers with skipahead enabled
- Provisioning and testing of EU LiveArchive
On Wednesday (1 AM – 3 AM CMT, 6:00 – 9:00 GMT) we will be provisioning the new reporting server infrastructure and fully integrating it into the ExchangeDefender core network/systems. This will allow us (and you) to directly manage the behavior of each filter on a granular level.
More on this on Monday.
The issues with the daily and intraday email reports have been fixed. Our new reporting engine will go online during the massive maintenance cycle this Saturday PM / Sunday AM and we feel confident that the recent issues with email based reports will not be coming back. While the new reporting engine does have an email component to it, our major area of investment is in gadgets and desktop utilities (along with an XML based API) so we can provide access to junk mail in a convenient and effective way.
We apologize for any inconvenience the reporting problems may have caused you.
It appears that messages marked as SPAM and SureSPAM were again delivered to our customers during the early morning hours on Sunday, February 3rd. The error was fixed almost immediately, but a number of SPAM messages were delivered instead of quarantined.
We are still researching this issue and will have it patched shortly. The system corrected the problem automatically, but it should not have happened in the first place.
We are currently dealing with an exponential increase in SPAM throughout the network, mostly originating from Asia. At this point the overall utilization of the network is at 61% (daily peak) and we are looking to isolate the cause of the behavior. It appears to either be a new worm or the restart of the spambot network we noted earlier this year.
Of note, Rogers Canada has started blocking port 25 traffic on the major part of their residential network. ExchangeDefender is not supposed to be implemented on dynamic / residential services but if you have done so and are experiencing issues, please be advised that your port 25 may be blocked.
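If you suspect that your provider is blocking port 25, a quick TCP connection test run from the affected network can help confirm it. The following is a minimal sketch using only the Python standard library; the function name is hypothetical and the host you test against is your choice. A successful connection only shows that the port is reachable, not that the remote server will accept your mail.

```python
import socket


def port_is_reachable(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A refused connection, a timeout or a DNS failure all count as
    "not reachable", which is what a blocked port 25 usually looks like.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, calling port_is_reachable("outbound.exchangedefender.com") from the affected connection and again from a connection known to be unrestricted shows whether the block sits with the ISP.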
Several of our ranges have been placed on the PBL blacklist although there seems to be no evidence that our servers have relayed junk mail. As of 5:21 PM EST it appears our address spaces have been removed from the RBL but we will continue to monitor the issue. So far no abuse has been reported to our network aliases and there have been no issues on the network as far as outbound SPAM is concerned.
We will keep you updated if we get any further information.
We are currently investigating a bug in ExchangeDefender reports that is creating empty (all zero) email messages in both daily and intraday reports. The root cause has been identified as an error in replication that hangs the copying of data from master to slave servers. As a result, no new data flows in for certain customers past the problematic table.
We believe we have addressed the issue but are continuing to monitor it and will provide an update within 72 hours. This issue is flagged as urgent.
We’re officially starting the ExchangeDefender bugfix maintenance mode today and expect it to last through Sunday.
During this time you may notice slight delays in email and some features of the site will be marked as “Under Maintenance – Please check back in a few minutes” although we will try to run those events later at night.
We will update this ticket should anything unexpected happen.
Between 12:45 PM and 2:10 PM EST today some messages may have been delayed on our outbound network, outbound.exchangedefender.com. This delay was in part due to a new configuration being loaded on the outbound nodes. The new configuration enables more in-depth logging so that you can track outbound mail flow and receive mail logs.
We are tracking a replication failure issue on our email reports systems in ExchangeDefender. This problem was caused by some schema modifications that went in place on Monday.
We have identified an issue with replication of data to the servers that generate email SPAM reports. While the reports do run correctly, they do not report any SPAM because no data has been replicated to them. This is obviously incorrect and has required us to completely replicate all data from master servers, leading to nearly a day of missed reports.
All systems should be back to normal in a few hours.
In the meantime, please remember that email reports are merely a convenience. The recommended way to access quarantined SPAM is through the control panel at https://admin.exchangedefender.com and any reports sent from there (domain admin panel) will contain correct SPAM counts because they are run against the master databases.
Several minutes after midnight on October 30th, 2007 we completed all primary burnins of new ExchangeDefender inbound systems and went live. The DNS updates can take a while (at times several days) to be recognized by some name servers so the impact should be negligible. The network and analytics are running at a fraction of network capacity but with constant product feature evolution comes constant upgrade.
As mentioned earlier, the reason for the upgrade is simply to increase the brute force of the ExchangeDefender security perimeter. With the changing scope of junk mail and threatening content being sent to compromise or damage systems, we are arming up for the problems we see on the horizon.
In a few moments we will be taking inbound38 and inbound33 ExchangeDefender networks offline in order to clone them to five new grids that will be processing ExchangeDefender network. System utilization is currently at 24% and we anticipate removal of inbound33 and inbound38 to increase the load to about 30% worldwide. We anticipate the three new ExchangeDefender grids to bring the peak network loads down several percentage points immediately and more as the burnin proceeds and routing takes them into account.
We are not deploying any new features nor will there be a significant topology or configuration change to our network. As a matter of fact, you will likely not even notice it. This action is merely to address the growth in the service popularity and further expansion.
From 10:00 AM to 12:30 PM EST we upgraded the outbound.exchangedefender.com DDoS network gear to protect from the recent surge in worm distribution. During the upgrade we went one node at a time which degraded the performance a little and introduced negligible (few minutes) unexpected latency on the outbound mail flow.
This was a routine hardware task. The process is complete and systems are back to normal.
We are commencing the regularly scheduled network maintenance on the ExchangeDefender admin network at 9PM EST on Saturday, October 20, 2007. We expect the maintenance cycle to complete by 5 AM on Sunday, October 21, 2007.
We will update this ticket when the maintenance has been completed.
12:44 AM EST – Hardware upgrades completed, proceeding with stress testing and load simulations. Expect the system to respond slowly as we simulate a DDoS against our administrative infrastructure. JCD, PHE
2:22 AM EST – Completed ahead of schedule, all maintenance work completed. System notes under ticket 3emj1ba for further review. All systems normal, no latency or abnormalities during test. JCD
Maintenance cycle completed, JCD.
We are currently tracking an increase in SPAM messages that contain mp3 attachments. At the moment these are being discarded along with the regular SPAM because they are just random attachments to the SPAM messages which we can sort out pretty well. If you are concerned about this we advise you to block mp3 file access using your Exchange attachment filtering system.
The mp3 attachments are not dangerous (yet!) and appear to be of a teenager pushing pharmaceuticals and stocks.
Threat Level: Low.
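For administrators who want to see how many of these messages reach them before turning on attachment filtering, a message can be scanned for mp3 attachments with the Python standard library. This is a minimal sketch and not part of ExchangeDefender or Exchange: the function name is hypothetical, and matching on the .mp3 extension or the audio/mpeg content type is an assumption about how these attachments are labeled.

```python
from email import message_from_bytes, policy


def mp3_attachment_names(raw_message: bytes) -> list[str]:
    """Return the filenames of parts that look like mp3 attachments.

    A part is flagged if its filename ends in .mp3 or its declared
    content type is audio/mpeg (both checks are assumptions).
    """
    msg = message_from_bytes(raw_message, policy=policy.default)
    names = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        filename = part.get_filename() or ""
        if filename.lower().endswith(".mp3") or part.get_content_type() == "audio/mpeg":
            names.append(filename or "(unnamed)")
    return names
```

The function takes the raw bytes of a single message, for example one saved as an .eml file, and returns an empty list when nothing matches.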