Mitigating DoS Attacks
Denial-of-Service attacks may not be front page news anymore, but their impact on Internet-connected organizations continues to take a heavy toll. In the last few weeks alone, there have been numerous large-scale attacks against a wide spectrum of interests. As a bare sampling, SCO and Kazaa were both crippled for techno-political reasons, and credit-card processing firms 2Checkout and Authorize-It were knocked off-line as part of an apparent extortion scheme, while the usual round of anti-spam, anti-spyware and anti-hacking self-help sites have taken their usual beatings in apparent revenge assaults. Those are just some of the attacks that are known to have occurred within a narrow four-week period, and it's safe to bet that there were many more which have not been made public.
The big picture is even more telling. The March 2004 Computer Crime and Security Survey, conducted by the Computer Security Institute and the FBI, says that DoS attacks made up 35% of all network attacks in 2003, and anecdotal evidence suggests that these attacks are continuing to increase in number, to the point where they are becoming routine and even casual (one carrier has reported that they are now dealing with four attacks per day on average). More dramatically, the revenue losses reported in the same survey ranged from a low of just $500 to a high of $60 million, with an average loss of $1.4 million. If that figure seems unrealistic, keep in mind that a successful attack against your network resources has the potential to shut down your organization's public and private e-commerce capabilities for a week or more, and do your own math.
If these attacks are this common and carrying such extreme price tags, then clearly organizations need to be prepared for them in the same way that they would prepare for any other potential outage or attack that has the potential to do this much damage. As with other kinds of contingency planning, this means understanding your recovery options, choosing the appropriate path, and being prepared to execute your plan quickly when the time comes.
The recovery options available will depend in large part on the kinds of attack you face. In general terms, DoS attacks succeed by overwhelming an application within a host, or by attacking a weakness of the host itself, or by filling the network with noise such that the target system cannot be accessed by legitimate and desirable users. Even within this simple taxonomy, however, there are numerous variations. Defending against these attacks usually requires a multi-pronged strategy, incorporating elements of the application- and host-specific defenses locally, while simultaneously working with your service provider to filter out whatever traffic they can.
Keep in mind that the amount of assistance that you will receive from your service provider will vary according to a number of factors, but the biggest factor is whether or not you are paying for their time. ISPs are in the business of selling connectivity, and are not in the business of ensuring that you only get the traffic you want. Many of them will be happy to help a valued customer with simple packet filters where needed, but any help beyond that may require you to treat their professional services personnel as consultants instead of support agents.
Application Resource-Consumption Attacks
Application-specific attacks are typically designed to consume all of the resources available to a particular application, such as issuing thousands of simultaneous searches against a database that is only designed to handle a few hundred queries. These attacks also have the potential to be the most insidious, since they can be indistinguishable from legitimate traffic, and can therefore be the most difficult to block. Simply put, a DoS agent that is designed to load Internet Explorer system libraries and request a specific web page through normal means is going to be nearly indistinguishable from legitimate Internet Explorer users. Similarly, attacks on DNS servers are popular because they only require a single (easily-forged) UDP message that requires processing by the target servers, making them hard to block. If there is any detectable pattern difference in these attacks, it will likely be in the timing of the requests, where human users do not usually make thousands of requests for the exact same data over and over.
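The timing difference described above can be illustrated with a small sliding-window counter. This is only a sketch under stated assumptions: the class name, the ten-second window and the repeat threshold are all hypothetical, and keying on the client address only helps when that address has not been forged (which is often not the case for single-packet UDP attacks).

```python
from collections import defaultdict, deque


class RepeatRequestDetector:
    """Flags a client that repeats the same request faster than a person plausibly would.

    The window and threshold values are illustrative assumptions, not recommendations.
    """

    def __init__(self, window_seconds=10.0, max_repeats=20):
        self.window_seconds = window_seconds
        self.max_repeats = max_repeats
        # (client, request) -> timestamps of recent identical requests
        self._seen = defaultdict(deque)

    def observe(self, client, request, now):
        """Record one request and return True if the client is over the threshold."""
        times = self._seen[(client, request)]
        times.append(now)
        # Drop timestamps that have aged out of the sliding window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.max_repeats
```

A detector like this would sit in front of the application and feed a block list or a rate limiter. It does nothing against an attack that spreads identical requests thinly across many clients.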
Companies such as Akamai offer a proactive defense against this type of attack by replicating content across localized caches, so that the load is distributed across a large number of targets, and these services have often proven to be successful defenses against certain kinds of attacks. Similarly, high-value DNS servers are often widely replicated to provide the same protections - the "F" root server is actually a globally distributed network of hosts that all share the same "anycast" IP address, allowing each server to process sub-sets of query traffic independently of the other nodes.
However, these kinds of architectures are usually limited to cacheable content, and are not always viable for defending against dynamic content or application sessions that are long-lived in nature - a large number of abandoned shopping carts or slow email transfers can kill a service with a minimum amount of effort since those services consume limited server resources by design. Although it is technically possible to build out a super-sized virtual cluster to handle these kinds of attacks, the economic cost is usually prohibitive and will likely dictate alternative defenses.
Host-based attacks are similar to application-specific attacks in that they are also designed to consume resources, but by attacking system weaknesses instead of application weaknesses. However, these attacks are usually the easiest to defend against since they frequently rely on simplistic attack vectors. For example, many DoS attacks try to fill a target's connection table with incomplete session requests, but these attacks can often be foiled by using proxy servers that aggressively discard incomplete sessions, and which only pass complete requests to the actual destination system. Similarly, attacks which rely on the server keeping track of fragmented datagrams can be avoided by simply filtering out fragmented packets at the network edge.
In those types of cases, a simple firewall can prevent most of these attacks from ever being felt, while attacks that leverage known bugs to cause localized shutdown or failure can often be preempted simply by keeping your host systems properly patched.
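The idea of aggressively discarding incomplete sessions can be sketched as a small table of half-open connections that sits in front of the real server. This is a simplified illustration, not how any particular firewall or proxy works: the class, the capacity and the three-second timeout are assumptions chosen for the example.

```python
class HalfOpenTable:
    """Tracks handshakes that have started but not finished.

    Entries are dropped after a short timeout, and the oldest entry is evicted
    when the table is full, so a flood of incomplete requests cannot pin memory.
    """

    def __init__(self, capacity=1024, timeout=3.0):
        self.capacity = capacity
        self.timeout = timeout
        self._pending = {}  # peer -> time the SYN was seen (dicts keep insertion order)

    def on_syn(self, peer, now):
        """Record the start of a handshake from peer."""
        self._expire(now)
        if peer not in self._pending and len(self._pending) >= self.capacity:
            # Table is full: evict the oldest half-open entry to make room.
            del self._pending[next(iter(self._pending))]
        self._pending.setdefault(peer, now)

    def on_ack(self, peer, now):
        """Return True if peer completed its handshake in time and may be passed through."""
        self._expire(now)
        return self._pending.pop(peer, None) is not None

    def _expire(self, now):
        stale = [p for p, seen in self._pending.items() if now - seen > self.timeout]
        for p in stale:
            del self._pending[p]

    def __len__(self):
        return len(self._pending)
```

Only peers for which on_ack returns True would ever be handed to the destination system, so the server's own connection table never sees the incomplete requests.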
This isn't meant to imply that these kinds of attacks are ineffective or overly simplistic, but rather that they are often preventable with planning and rudimentary protection mechanisms. On the other hand, they work precisely because these scenarios are not considered in advance, meaning that they often succeed at least once, and continue to succeed until the necessary work is performed.
One thing that application- and host-based attacks have in common is that they are focused on consuming resources at the target, meaning that they can succeed with relatively low volumes of traffic. This makes them relatively easy to launch with limited client-side resources, but it also makes them relatively easy to defend against. For most organizations, the real trouble starts when these attacks are used at high volume, producing more traffic than the network itself can handle. At that point, the local defenses are mostly irrelevant since no desirable traffic can make it through anyway.
Saturation attacks usually take advantage of large numbers of client systems, but can often be just as effective with a comparatively small number of high-powered systems. In the former case, it is estimated that there are anywhere between 200,000 and 16 million personal computers that are infected with some kind of controllable agent, and even the most conservative of those numbers can produce frightening results at modest per-client loads. Worse is that the current generation of multi-gigahertz PCs with multi-megabit broadband connections means that it only takes a thousand of these systems to produce multiple gigabits of traffic that will melt the most tenacious defenses. In situations where static content is being targeted, you may be able to distribute the content so that the attacking forces are also distributed, but if the attack is large enough or if it is targeting a non-cacheable service, then you will likely require some kind of assistance from your network service providers as well.
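The arithmetic behind those figures is easy to check. In the sketch below, the host counts come from the paragraph above, while the per-host rates (2 Mbps for a broadband PC, 50 kbps as a "modest" load) are assumptions picked purely for illustration.

```python
def aggregate_gbps(hosts, mbps_per_host):
    """Total attack bandwidth in gigabits per second for a number of sending hosts."""
    return hosts * mbps_per_host / 1000.0


# A thousand broadband PCs, assuming each can sustain 2 Mbps upstream.
small_fast = aggregate_gbps(1_000, 2.0)

# The most conservative infection estimate, assuming only 50 kbps per client.
large_slow = aggregate_gbps(200_000, 0.05)

print(f"1,000 hosts at 2 Mbps:    {small_fast:.1f} Gbps")
print(f"200,000 hosts at 50 kbps: {large_slow:.1f} Gbps")
```

Either way the total lands in the multi-gigabit range, which is the point being made: the attacker can reach that figure with many slow clients or with relatively few fast ones.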
Service Provider Options
The simplest option that a service provider will offer (and the option that is most often offered for free) will be to filter out all traffic that fits the broadest profile. In some cases, this may equate with a suicide option for your network: if your only public service is HTTP on TCP port 80 and that's where the attack traffic is going, then filtering out that traffic will also mean losing the good traffic. In that regard, triggering this option is indistinguishable from surrendering to the attacker, since you will be denying access to legitimate users as a result of your actions. However, if you are paying for measured-rate bandwidth and your pipe is running at full capacity for an extended period, it may be worth the cost savings to simply expedite the inevitable while you research other options. Furthermore, in those cases where the attack traffic is overwhelming neighboring customers on the same router or subnet, your service provider may require you to accept the filter whether you want it or not.
Beyond the basic filtering, your options will vary dramatically between different providers. If the attack traffic can be easily profiled, then most providers will be able to implement a simple filter in their own routers, or will ask their neighbors and upstream providers to implement the filters in their routers so that the traffic is pushed back out of the local network entirely. In those cases where the attack traffic is mostly indistinguishable from the good traffic, the provider will have to rely on some sort of heuristic-based filtering system, and you will almost certainly be expected to help defray the costs of such a system.
Some of the pattern-based filtering systems can be fairly complex, sometimes involving multiple devices and routing protocols that work together to manipulate and massage traffic according to specific conditions. For example, some large-scale carriers and providers run multi-tiered monitoring and routing agents which inspect real-time traffic patterns for harmful anomalies and then route the potentially harmful traffic to an internal subnet where it is scrubbed according to the detected heuristics, with only the clean data being forwarded back to your connection point. Products from companies such as Arbor Networks and Riverhead Networks (the latter of which was recently acquired by Cisco Systems) are designed specifically for use in these kinds of scenarios.
Arbor's Peakflow software taps into existing traffic flow data (such as Cisco NetFlow output) so that it can monitor traffic statistics for the entire network. Historical and well-known attacks (such as an ICMP flood) can be recognized immediately by their well-known characteristics, but newer and more subtle attacks can also be recognized by their increased volume in comparison to the aggregated and correlated traffic history (does the high volume of packets have an unlikely combination of flags? does it appear to have a forged source address?). Once the attack traffic is detected, the network operators can be alerted and recovery mechanisms can be implemented.
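The general idea of comparing current volume against traffic history can be shown with a toy baseline check. This is not a description of how Peakflow works internally: the function name, the three-sigma threshold and the sample numbers are all assumptions, and a real system would correlate many more dimensions than a single packet rate.

```python
from statistics import mean, pstdev


def is_anomalous(history, current, sigmas=3.0, min_spread=1.0):
    """Return True if current exceeds the historical mean by more than sigmas deviations.

    history is a list of past per-interval packet counts for one traffic class.
    min_spread keeps a perfectly flat history from flagging trivial changes.
    """
    baseline = mean(history)
    spread = max(pstdev(history), min_spread)
    return current > baseline + sigmas * spread


# Hypothetical packets-per-second samples for one destination.
recent = [100, 110, 95, 105, 90]
print(is_anomalous(recent, 120))
print(is_anomalous(recent, 5000))
```

With this sample history, a reading of 120 stays inside normal variation while a reading of 5000 stands out. Picking thresholds that do not fire on legitimate surges is the hard part in practice.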
The Riverhead Detector system provides similar kinds of monitoring and management services, while the Guard product line implements forwarding and scrubbing services that isolate and filter the affected traffic. Packets that match against the attack criteria are discarded, while traffic that appears to be legitimate is passed back into the network for forwarding to the destination. Packets that do not appear to be part of an active attack are not routed into the filtering sub-net, and can stay on the backbone.
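The three outcomes described above (stay on the backbone, discard, or forward after scrubbing) amount to a small decision function. The sketch below is only a model of that decision, not Riverhead's implementation; the Packet fields, the filter predicates and the addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    proto: str
    dst_port: int


def route(packet, protected_dsts, attack_filters):
    """Decide what happens to one packet while an attack is being scrubbed.

    protected_dsts: destinations currently diverted into the filtering subnet.
    attack_filters: predicates that return True for packets matching the attack.
    """
    if packet.dst not in protected_dsts:
        return "backbone"   # not part of an active attack: never diverted
    if any(matches(packet) for matches in attack_filters):
        return "discard"    # matches the attack criteria
    return "forward"        # looks legitimate: passed back toward the target


# Hypothetical attack profile: UDP traffic aimed at the victim's port 80.
filters = [lambda p: p.proto == "udp" and p.dst_port == 80]
victims = {"192.0.2.10"}

print(route(Packet("198.51.100.7", "192.0.2.10", "udp", 80), victims, filters))
print(route(Packet("198.51.100.8", "192.0.2.10", "tcp", 80), victims, filters))
print(route(Packet("198.51.100.9", "192.0.2.99", "udp", 80), victims, filters))
```

Diverting only the traffic bound for the victim keeps the scrubbing devices from becoming a bottleneck for everything else on the backbone.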
Anecdotal evidence suggests that these types of products are extremely useful for large networks that are witness to frequent attacks, although their price tags also typically limit their use to high-end customers. If your line of business requires uninterruptible service, however, then buying into these kinds of options may be well worth the increased cost.
Step by Step
If attacks are inevitable, then so is your response. The only question remaining is whether you'll be prepared with a tactical and strategic plan that preserves your interests, or whether you'll be forced into making hasty decisions that actually make things worse.
Step 1: Plan for attacks and build your critical infrastructure accordingly. At the very least, front-line defensive equipment like firewalls and routers should be purchased with an eye towards availability during extreme loads; if the CPU in your edge device gets saturated and stops responding to your management and control traffic, it will be of no use when it actually counts. Meanwhile, having a means to distribute critical resources to other networks that are not under attack will also be essential towards reducing your actual losses - you will almost certainly be required to sacrifice access to some systems (at least temporarily), and being able to soak up these hits without suffering a complete loss will be mandatory.
Step 2: Build a support community that you can count on to be there when you need them, and make sure that they'll actually be there. Mailing lists and professional organizations are good resources, but don't stop there. You should arrange introductory meetings with the law enforcement agencies that handle electronic crimes in your area so that those lines of communication are open in case criminal charges become necessary. Consider contracting emergency services such as security firms or back-up carriers, as appropriate to your potential exposure. Insurance may be available to off-set monetary losses or abnormal expenses associated with sustained attacks. All of these parties can be important parts of a recovery strategy, and they should all be considered.
Step 3: Running a clean network is self-rewarding, in several respects. For one thing, it's cheaper and easier to regularly apply updates and patches than it is to regularly recover from an outbreak or attack, and a lack of the former practically guarantees the latter. Furthermore, good-neighbor practices like egress route-filtering keep you from being told that your network is contributing to somebody else's DDoS nightmare, even if you do get infected. Last but not least, clean networks are easier to debug and diagnose in high-pressure scenarios, which means faster recovery when it's your time on the mat.
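The egress filtering mentioned in Step 3 comes down to one test: does an outbound packet carry a source address that actually belongs to your network? The sketch below shows that test in isolation, using documentation-reserved prefixes as stand-ins for your own address space. In practice the rule lives in router or firewall configuration rather than in a script.

```python
import ipaddress

# Stand-in prefixes (reserved for documentation); substitute your own allocations.
LOCAL_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def allowed_outbound(src_ip, local_prefixes=LOCAL_PREFIXES):
    """Return True if a packet with this source address may leave the network.

    Anything sourced from outside the local prefixes is presumed spoofed
    and should be dropped at the edge.
    """
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in local_prefixes)


print(allowed_outbound("192.0.2.17"))    # inside the local IPv4 prefix
print(allowed_outbound("198.51.100.9"))  # not ours: forged or misrouted
```

With this in place, an infected host on your network can still send attack traffic, but it cannot hide behind forged addresses, which makes it far easier for you and for the victim to find.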
Step 4: When a serious attack actually starts, your first course of action must be to adopt your default defensive posture so that you can analyze the attack and take the appropriate recovery steps with a clear mind. Call in your first-response team and begin exercising your initial recovery plan. If you have standby systems ready to absorb some of the punishment then activate them, but otherwise be prepared to sacrifice target systems so that neighboring systems aren't also killed. Your top priority during this phase must be to fall into a non-disruptive posture so that the rest of the recovery flows smoothly.
Step 5: Analyze the attack as best you can and implement the correct defense. Are there any common packet signatures that are easy to filter against? Are all of the attackers hitting a single target that can be sacrificed? Which network is the attack coming from, and can you verify it? (Remember that spoofed packets can come from anywhere, including your own network.) Once you've found a reasonable match for the attack, pass the filters to your upstream provider(s) and seek their help getting them propagated outwards. You need to get this right the first time, since nobody wants to experiment with your guesswork when it takes them away from their own diagnostic and recovery actions. If you do this right, you'll be able to filter or redirect traffic with a minimum amount of actual downtime.
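One way to look for a common packet signature is to tally a sample of captured packet headers by a few fields and see whether one combination dominates. The sketch below assumes the capture has already been parsed into dictionaries; the field names and the sample data are hypothetical.

```python
from collections import Counter


def top_signatures(packets, fields=("proto", "dst_port", "flags"), n=3):
    """Return the n most common field combinations with their share of the sample.

    packets is an iterable of dicts holding parsed header fields.
    """
    counts = Counter(tuple(p.get(f) for f in fields) for p in packets)
    total = sum(counts.values())
    return [(sig, count / total) for sig, count in counts.most_common(n)]


# Hypothetical sample: mostly bare SYNs to port 80, plus some ordinary DNS traffic.
sample = (
    [{"proto": "tcp", "dst_port": 80, "flags": "S"}] * 8
    + [{"proto": "udp", "dst_port": 53, "flags": ""}] * 2
)

for signature, share in top_signatures(sample):
    print(signature, f"{share:.0%}")
```

A combination that accounts for most of the sample is a candidate filter to hand upstream. If nothing stands out, the attack probably resembles your legitimate traffic and a simple filter will not be enough.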
Step 6: Get offensive. Now that you're not in a mandatory defensive posture, try to locate the source of the attack and do something about it. See if any of the zombies are on networks which should be managed (such as a university or corporate network), and try to locate the control points (you'll probably need to pursue multiple paths of contact here, since different organizations will respond differently). Call in law enforcement (as appropriate to your losses) and give them what they need to proceed. At the very least, you should be looking to remove the platform of attack, if not its original source.