Scaling Up and potential latency issues
As we prepare for the massive upgrades coming this weekend, we are testing systems and making intermediate changes to the network. As a result, over the next 48 hours you are likely to see some latency in DNS query results, which affects virtually all other services such as backups, ExchangeDefender, virtual servers and everything else that’s being brought online.
While you are unlikely to notice any of these changes directly, if you do see slight performance issues they are probably related to the maintenance work being done on our end.
Extended Maintenance Cycle for 7/21/2007
We are extending the maintenance cycle window for July 21, 2007. Our regular maintenance cycle is from 3AM to 7AM EST but due to the number of systems going online this week we are going to have to extend that window until noon EST. We will follow up with full details of the work being done but suffice to say it is significant.
If you are not familiar with our maintenance cycle activities, do not worry: they are routine and they happen every week. Generally they are things you would expect, such as reboots to add memory or storage, reboots for software patch installation, etc. At times there are other items such as electrical or network maintenance.
This particular weekend involves nearly all of the above plus a significant upgrade to our core infrastructure of ExchangeDefender, offsite backups and virtual servers. We’re bringing online 3 new data centers to top it off so we wanted to give ourselves more time to get everything done right.
Live at Microsoft WWPC
Dear Partners,
I (Vlad Mazek, CEO) am at the Microsoft World Wide Partner Conference this week.
I have intentionally kept Thursday completely open on my schedule and have absolutely no appointments after breakfast – so if you are interested in speaking to me, just track me down and let’s get together whenever you have room between meetings. I did this last year and found it very valuable for catching up with those of you who are very busy and just can’t find time to connect and arrange an official get-together, or whose meetings get cancelled, moved, etc. So send me an email and let’s meet up whenever there is time.
ExchangeDefender Policy Engine Bugfix
We recently started receiving complaints about certain users not having their SPAM and SURESPAM filtering policies applied correctly. For example, a user would choose to quarantine their SPAM and delete their SURESPAM, but mail would still arrive in the inbox with the subject modified as [SPAM] or [SURESPAM].
As of 10 AM EST this bug has been fixed. If you have your mail set to quarantine on either of the SPAM presets the rules will be applied correctly. If that does not happen consistently and correctly please open up a support ticket at https://support.ownwebnow.com
Note: The issue was caused by the legacy network policy server not synchronizing filtering rule tables in the correct order. It would treat its local database as the most up-to-date one and would never apply the newer policies. This issue has been fixed.
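For the curious, here is a minimal sketch of the general idea behind the fix: when two copies of a policy table disagree, the newer entry should win no matter which side it came from. This is an illustration only, not our actual policy engine; the Policy fields, the updated_at version stamp and the merge_policies function are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    user: str
    spam_action: str      # e.g. "quarantine", "deliver", "delete"
    surespam_action: str
    updated_at: int       # hypothetical version stamp; larger means newer


def merge_policies(local, remote):
    """Return one policy per user, always keeping the newest version."""
    merged = {p.user: p for p in local}
    for p in remote:
        current = merged.get(p.user)
        # Never assume the local copy is the freshest: compare versions.
        if current is None or p.updated_at > current.updated_at:
            merged[p.user] = p
    return merged
```

With a merge like this, the order in which the tables are processed no longer matters, which is exactly the property the buggy behavior was missing.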
Addressing recent increase in PDF SPAM
As you may have noticed over the last few days, there has been a huge increase in PDF SPAM. This spam generally arrives as a single message with an attached PDF containing a JPEG image of the SPAM content. This pattern easily bypasses most appliances, which lack the processing power needed to decode images, much less those encoded inside a PDF file. Not that we’re gloating, but there are only 24 hours in a day and it’s not enough to talk about how different ExchangeDefender behavior is compared to RandomSpamApplianceFromTaiwan.
At the moment, these messages also share several distinguishing characteristics:
- they are all 7bit encoded.
- they all use a single user agent associated with the Mozilla Thunderbird mail software.
- they are all blank messages with no text in the body.
- the attachment matches the filename mentioned in the subject.
- the attachment is a legitimate PDF file with no publishing information and no content other than a single JPEG.
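To show how simple these signals are to test for, here is a rough sketch using Python’s standard email package. It is not the ExchangeDefender filter, just an illustration under assumptions: the function name is made up, it checks only the user agent, the blank body and the filename-in-subject signals, and it does not inspect the transfer encoding or look inside the PDF itself.

```python
from email import message_from_bytes


def looks_like_pdf_image_spam(raw: bytes) -> bool:
    """Heuristic check for the PDF SPAM pattern described above."""
    msg = message_from_bytes(raw)
    subject = (msg.get("Subject") or "").lower()

    # Signal: the sending software identifies itself as Thunderbird.
    agent = ((msg.get("User-Agent") or "") + (msg.get("X-Mailer") or "")).lower()
    if "thunderbird" not in agent:
        return False

    body_text = ""
    pdf_names = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if filename and filename.lower().endswith(".pdf"):
            pdf_names.append(filename.lower())
        elif part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            body_text += payload.decode("utf-8", errors="replace")

    # Signal: the message body is blank.
    if body_text.strip():
        return False

    # Signal: the PDF attachment's filename is mentioned in the subject.
    return any(name in subject for name in pdf_names)
```

A real filter would score these signals rather than require all of them, since we expect the senders to start varying the pattern.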
Based on all that, it’s relatively trivial to trap these messages. However, we expect the pattern to continue and to escalate, with these messages made to seem more legitimate. While these PDFs are not dangerous in nature, they can be annoying, and your users should be warned never to open attachments from contacts they do not know or trust.
As always, thank you for your business and we’ll keep your mail clean for you.
PBX Upgrade
You may have had a difficult time reaching us on Wednesday 6/27 and Thursday 6/28.
I mistakenly redirected my DID to an unmonitored extension at the same time that the PBX was undergoing a software upgrade to Trixbox. This resulted in all incoming calls being routed to an unmonitored extension that didn’t even have a greeting assigned to it. I have been able to track down the voicemails and will be returning calls tomorrow, Friday 6/29.
All other support mediums (email, web, portal) were working during this time, so we did not become aware of the problem until one of our partners let me know. We’re really sorry about the inconvenience this has caused some 40 of our callers. We will match up the caller ID with the customer database and contact you all as soon as possible.
DNS and Time Infrastructure Overhaul
As our network grows, even the most optimized of services need to scale. While it’s unlikely that you have noticed an issue with DNS services, we have decided to both increase its capacity and reduce the scope of that service. We have also added the ability for you to sync with reliable internal time servers. Both modifications are nearing completion, but you can take advantage of them right now as they prepare us for future growth.
DNS Modifications
Going forward, our DNS servers will only answer authoritative requests from the external network (i.e., the Internet), while full answers including caching will be provided to internal servers (i.e., hosted networks, ExchangeDefender, colocation customers, infrastructure partners). More specifically, we will not provide “recursive lookups” for external users and will only answer authoritative requests from the Internet.
Background: DNS servers resolve friendly hostnames such as www.ownwebnow.com into IP addresses such as 65.99.192.50. The DNS server, in our case ns1.ownwebnow.com, is said to be authoritative for a zone (in our case ownwebnow.com) if it is the official provider of the information that matches the hostname to the IP address. When you use a domain registrar such as Network Solutions to register your domain, you enter a set of name servers (ns1.ownwebnow.com and ns2.ownwebnow.com) which will provide resolution, or be authoritative, for that domain. Clients, including remote networks, computers, servers and more, use their own DNS servers to resolve hostnames into IP addresses so computers can locate one another over the Internet. When a remote server requests a lookup from its local server, the local server checks whether it is authoritative for the domain (ownwebnow.com). If it is not authoritative, it starts the recursion process – it first looks at its root hints to find the top level domain (.com) and eventually receives an answer from the authoritative server (ns1.ownwebnow.com), which it sends back to the client. By disabling recursion on our name servers we stand to reduce the load and increase performance on our network because we will only be providing the recursive DNS service to our customers, not everyone on the Internet.
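If it helps to see the new policy as logic, here is a small sketch of the decision a name server makes for each query under these rules. It is an illustration only, not our server configuration: the 10.0.0.0/8 internal range, the zone list and the function names are all assumptions made up for the example.

```python
import ipaddress

# Hypothetical values for illustration only.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
AUTHORITATIVE_ZONES = {"ownwebnow.com"}


def is_internal(client_ip):
    """True if the client address falls inside one of our internal networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in INTERNAL_NETWORKS)


def in_authoritative_zone(hostname):
    """True if the hostname belongs to a zone we are authoritative for."""
    name = hostname.rstrip(".").lower()
    return any(name == z or name.endswith("." + z) for z in AUTHORITATIVE_ZONES)


def decide(client_ip, hostname):
    """Return the action a name server takes under the policy above."""
    if in_authoritative_zone(hostname):
        return "answer-authoritative"   # anyone may ask about our own zones
    if is_internal(client_ip):
        return "recurse-and-cache"      # full service for internal clients
    return "refuse"                     # no recursion for the Internet at large
```

For example, decide("192.0.2.10", "www.example.org") models an outside client asking us to look up somebody else’s domain, which is the case we will no longer serve.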
Time Server Modifications
As of late there have been many issues with the public pool of NTP servers that help computers and networks around the world synchronize their clocks. To make matters worse, there are many issues with virtual machines and the horrible time drift (the difference between real time and time in the virtual machine) that’s introduced with new technologies.
If you are internal to the Own Web Now network you can use time.ownwebnow.com as your time server. It should (and so far statistically it has) answer time synchronization requests 100% of the time. Our previous time.ownwebnow.com was a round-robin implementation that simply aliased time.ownwebnow.com to the various military and research organizations that had public time servers. Over time, that infrastructure has become less and less reliable, so we’re providing the time sync for you if you’re on our network. Just use time.ownwebnow.com and you’re all set.
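For those wondering how a client works out how far its clock has drifted, the standard NTP calculation uses four timestamps from a single request and reply. The sketch below shows that arithmetic only; it does not contact time.ownwebnow.com or any other server, and the function names are ours for illustration.

```python
def clock_offset(t0, t1, t2, t3):
    """Estimate how far the local clock is behind (+) or ahead (-) of the server.

    t0: client clock when the request was sent
    t1: server clock when the request arrived
    t2: server clock when the reply was sent
    t3: client clock when the reply arrived
    """
    return ((t1 - t0) + (t2 - t3)) / 2


def round_trip_delay(t0, t1, t2, t3):
    """Time spent on the network, excluding server processing time."""
    return (t3 - t0) - (t2 - t1)
```

The offset estimate assumes the network delay is roughly the same in both directions, which is one reason a nearby, reliable time server gives better results than a distant public one.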
That is all for now. We expect all time and DNS related work to be complete by July 15th, but you are welcome to use them now to improve your performance. This will be a very seamless and transparent implementation for our entire user base, but we wanted you to be aware of what we’re doing to keep up. As always, thank you for your business.
ExchangeDefender gets tougher on NDR and Backscatter
Over the past year we have seen a steady increase in NDR traffic. We’ve done something about it previously but have since gotten far more aggressive on it to the point that virtually every fake bounce will be automatically quarantined.
It’s important to understand the motivation behind the spoofing and the massive NDRs it produces. There are two ways in which spammers abuse the NDR system: one is to steal identity and the other is to diminish confidence in the SPAM filtering solution. The first is quite easy: they want to use a legitimate sender address so that the remote servers will accept the mail. To combat this you can easily enable SPF/SenderID on your domain and never worry about it. The second is a little more involved/contrived and involves systematically taking apart the ability of the “installed” SPAM filtering solution to adequately sort out mail. Most installed SPAM filtering solutions (the ones you install on your server) and appliances alike (devices on your network) build reputation models based on how often legitimate mail comes from certain addresses and IP blocks. They also build local Bayesian databases that index known SPAM and non-SPAM. As such, by flooding the server with mail from all over the place, those databases and reputation scores become increasingly less reliable – a process more commonly known as poisoning.
So what are we doing and how does it benefit you? Assuming you are using our outbound servers to relay messages, your messages will contain special tracking that will match up with what we have in our internal databases. If an NDR is received with that tracking intact, the message is allowed through. If the NDR is received without that tracking, that means the message didn’t come from you or your server and that it was spoofed, so it goes into the SPAM quarantine where you’ll likely let it die.
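The matching logic can be pictured with a small sketch. This is not how ExchangeDefender stores or formats its tracking; the OutboundTracker class, the random tracking ID and the in-memory dictionary standing in for our internal databases are all assumptions made for illustration.

```python
import secrets


class OutboundTracker:
    """Remembers which tracking IDs were stamped on relayed outbound mail."""

    def __init__(self):
        self._issued = {}  # tracking id -> sender address

    def stamp(self, sender):
        """Record an outbound message and return the tracking ID to embed in it."""
        tracking_id = secrets.token_hex(16)
        self._issued[tracking_id] = sender.lower()
        return tracking_id

    def classify_ndr(self, recipient, tracking_id):
        """Deliver a bounce only if its tracking matches mail this user really sent."""
        if tracking_id is not None and self._issued.get(tracking_id) == recipient.lower():
            return "deliver"
        return "quarantine"
```

A bounce that carries no tracking, or tracking we never issued to that address, cannot have been triggered by mail relayed through our outbound servers, so it is treated as backscatter.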
Offsite Backup Troubleshooting
As blogged here previously, our offsite backup maintenance is complete. Not only do we have the storage to sustain growth for the foreseeable future, but things are running near flawlessly.
If you are still experiencing issues please open up a trouble ticket and we will help you get to the bottom of it. One problem some customers have reported is that backups do not seem to work automatically, namely, you receive an email saying “Reminder: Scheduled backup missed > Username > Backup set name”
If you receive this error please try following these steps (click) as they are the first thing we will try to do once the ticket is opened.
If you receive any other error, please open up a trouble ticket in our support portal.
We have seen a remarkable improvement in performance for backups and restores since the maintenance interval, and will shortly be announcing some additional services related to the offsite backup service, mostly around management, support and DR crisis management.
Offsite Backup Expansion Complete
The previously mentioned offsite backup work and SAN maintenance has been completed. All systems should be back to normal and performance should be back to rock solid. We have some final stress testing scheduled for tonight, along with a few DR scenarios, so it’s likely that tonight is the last night your backups get interrupted. If you do start experiencing problems with your backups on Tuesday, please open up a trouble ticket and we will do our best to get to the bottom of the issues.
Thank you for your patience and sorry about the inconvenience. As mentioned before, we are crediting the month of service to all customers regardless of whether you were affected by the performance issues or not.