Bringing Prioritization Services To Ethernet
Historically, the need for prioritization has not been great, as network utilization has followed a fairly predictable growth curve for the past few years. But with the increased availability of new technologies with seemingly amazing bandwidth-absorption properties, particularly in the areas of network-centric multimedia and voice-over-IP services, the growth curve is being thrown out the window. Whereas before you may have had a relatively static level of utilization, you may now find the network saturated without warning, as those videos from the company Christmas party are put online and viewed from every corner of the corporate empire.
Worse still, it is not the users of these applications who are going to complain about the availability and performance problems that they have wrought. Rather, it will be users of the mainstream business applications—e-mail, database-access and client/server applications—who will scream the loudest. Every time another user fires up his or her RealAudio player, a Notes user will feel the network slowly start to slip away, and you'll take the heat for it.
It is at times like this that congestion control and prioritization become much more than niceties. Rather than dismissing them as unnecessary luxuries that are irrelevant to your network, consider that these services offer what is perhaps the best hope for running a functional network.
Prioritization alone won't eliminate all of your bandwidth blues. If congestion is a real problem for your network, you'll need to address it with bigger network pipes, not just prioritization services. If you're losing packets now, then you'll still lose packets after implementing prioritization services. The difference is that prioritization allows you to decide which packets you're willing to lose. If you're not willing to throw away any of them, you must add bandwidth.
However, if the problem is not sustained congestion but sporadic bursts that do not yet justify the cost of additional bandwidth, your focus should definitely be on using prioritization to ensure smooth operations. In this scenario, you will want to have database queries and client/server data run across the network at higher precedence levels than less-critical traffic.
While many vendors have constructed schemes for implementing prioritization on Ethernet networks, these solutions have been restricted to vendor-specific offerings. With the advent of new bandwidth-sucking technologies like voice over IP and video on demand, the need for cross-vendor prioritization services in heavily utilized networks has become increasingly urgent.
In an effort to address this problem, the IEEE has introduced two drafts intended to provide just this service. The first of these is 802.1p, a draft extension to the 802.1D bridging standard that dictates how prioritization should occur within a MAC (Media Access Control)-layer bridge, regardless of the media in use. Meanwhile, the 802.1Q draft standard for VLANs (virtual LANs) is promising to add prioritization services to Ethernet in particular.
By using 802.1Q-compliant Ethernet frames and 802.1p-compliant switches, it is possible to implement full-featured prioritization services across your entire network, regardless of the topologies in use.
The 802.1p draft is an extension of the 802.1D standard, a well-worn specification that defines how MAC-level bridges should interoperate. The 802.1p addendum defines how prioritization should be implemented within these bridges.
This functionality is achieved through the use of a 3-bit, topology-independent "user priority." Incoming frames can be examined for a pre-existing priority value, which is then mapped to the 802.1p-specific priority value (according to a matrix provided in the 802.1p specification). The 802.1p priority value can then be assigned to an outbound frame on another medium using this same matrix, providing a standard and topology-independent priority-mapping service.
The use of 3 bits provides eight distinct priority levels ("0" through "7"), which map evenly to some topologies' native prioritization services. For example, all eight values map directly to the eight priority values used with 802.4 and 802.6. However, they do not map evenly to 802.5 or FDDI, even though both of those also use eight priorities, since the highest value ("7") is reserved on token ring and FDDI. Furthermore, 802.9 does not use numeric priorities, and 802.12 uses only two numeric values.
Most noticeably missing from the above is Ethernet, which has never had a native prioritization service, due to its legacy as a shared-media architecture. This shortcoming has been addressed with 802.1Q, a proposal for a general-purpose VLAN implementation that also provides prioritization capabilities to those topologies that do not already support them (such as Ethernet).
Implementation of 802.1Q is through an additional 4 bytes of data inserted into a frame's header. These 4 bytes contain a variety of fields, most of which are specific to VLAN data, although one field also provides a 3-bit priority flag. These 3 bits provide eight possible values, the same as those used in the 802.1p priority-mapping scheme. On Ethernet networks, the 802.1Q header fields are inserted into a frame's header immediately following the source and destination address fields and before the 802.3 "length" (or the Ethernet II "ethertype") field.
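As a rough illustration of the layout just described, the following Python sketch builds, inserts and parses the 4-byte tag. The field split it assumes (a 16-bit Tag Protocol Identifier of hexadecimal 8100, followed by the 3-bit priority, a 1-bit canonical format indicator and a 12-bit VLAN ID) is taken from the published 802.1Q tag format rather than from anything we tested, and the function names are our own.

```python
import struct

TPID_8021Q = 0x8100  # Tag Protocol Identifier value for 802.1Q frames


def build_tag(priority, vlan_id, cfi=0):
    """Return the 4-byte 802.1Q tag for a priority (0-7) and VLAN ID."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits (0-7)")
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if cfi not in (0, 1):
        raise ValueError("CFI is a single bit")
    # Tag control information: priority (3 bits), CFI (1 bit), VLAN ID (12 bits)
    tci = (priority << 13) | (cfi << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def insert_tag(frame, tag):
    """Insert the tag after the two 6-byte address fields.

    A real device would also recompute the frame check sequence;
    this sketch leaves the rest of the frame untouched.
    """
    return frame[:12] + tag + frame[12:]


def parse_tag(tag):
    """Split a 4-byte tag back into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {
        "tpid": tpid,
        "priority": tci >> 13,
        "cfi": (tci >> 12) & 1,
        "vlan_id": tci & 0x0FFF,
    }
```

For example, build_tag(5, 100) yields the bytes 81 00 A0 64: the top 3 bits of the third byte carry the priority value of 5, and the low 12 bits of the last two bytes carry the VLAN ID of 100.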
Implementation and Compatibility
The 802.1p addendum to the 802.1D standard should be finalized by the time you read this, and the 802.1Q specification is expected to be ratified this fall (we were unable to obtain exact dates from the IEEE). Vendors are preparing first-generation releases of their products that support these standards.
Since many Ethernet switch vendors already implement proprietary prioritization services, it is a relatively minor task for them to incorporate 802.1p support into their existing products. However, adding support for 802.1Q's additional header fields is proving to be a significant amount of work.
Furthermore, vendors are having to find ways to provide for backward compatibility with existing equipment. Obviously, changing the Ethernet frame is not something that can be done trivially. With the addition of 4 bytes to the frame's header, the frame is made incompatible with all of the legacy Ethernet devices that are trying to use the older frame format.
In particular, since the data is inserted before the 802.3 length field (or Ethernet II's ethertype field), any product that expects to see length or ethertype data at that location will not find it. Instead, it will see hexadecimal 8100, the new Tag Protocol Identifier field's default value for 802.1Q frames. In our packet captures, Shomiti Systems' Surveyor network analyzer was able to decode 802.1Q data, but all the other decoders we tried displayed the frames as either "unknown ethertype" or "Wellfleet" (Wellfleet used the 8100 ethertype at one time).
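The decoding problem can be sketched in a few lines of Python. This is only an illustration of the two behaviors, not the logic of any analyzer we tested; the function names are hypothetical, and the sample frame uses the IP ethertype (hexadecimal 0800) purely as an example.

```python
import struct

TPID_8021Q = 0x8100  # value a tagged frame carries where the ethertype used to be


def legacy_type_field(frame):
    """What a pre-802.1Q decoder reads: the 2 bytes after the addresses."""
    return struct.unpack("!H", frame[12:14])[0]


def aware_type_field(frame):
    """Skip the 4-byte tag, if present, before reading length/ethertype."""
    value = legacy_type_field(frame)
    if value == TPID_8021Q:
        return struct.unpack("!H", frame[16:18])[0]
    return value


# Hypothetical sample frames: zeroed addresses, IP ethertype, dummy payload
untagged = bytes(12) + b"\x08\x00" + b"data"
tagged = untagged[:12] + bytes.fromhex("8100a064") + untagged[12:]

print(hex(legacy_type_field(tagged)))  # the legacy decoder sees the tag identifier
print(hex(aware_type_field(tagged)))   # the aware decoder sees the real ethertype
```

A legacy decoder that reads hexadecimal 8100 at this offset has no way of knowing that the real length or ethertype value sits 4 bytes further along, which is why it falls back to whatever label it has on file for that number.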
The change in field placement is not the only issue; overall packet length is a problem as well. Many devices cannot deal with a frame that is longer than 1,518 bytes. A debate has raged over whether the Ethernet frame should be lengthened by 4 bytes or the payload segment should be shortened by 4 bytes to accommodate the larger header. The result is that the 802.1Q specification allows for either implementation, with vendors left to ensure interoperability.
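The arithmetic behind the debate is easy to sketch. The following assumes the usual Ethernet overhead of two 6-byte addresses, a 2-byte length or ethertype field and a 4-byte frame check sequence; those figures are standard Ethernet values rather than numbers from our testing, and the names are our own.

```python
ADDRESSES = 12         # destination + source MAC addresses
TYPE_OR_LENGTH = 2     # 802.3 length or Ethernet II ethertype
FCS = 4                # frame check sequence
TAG = 4                # 802.1Q header fields
MAX_LEGACY_FRAME = 1518


def max_payload(max_frame, tagged):
    """Largest payload that fits in a frame of max_frame bytes."""
    overhead = ADDRESSES + TYPE_OR_LENGTH + FCS + (TAG if tagged else 0)
    return max_frame - overhead


print(max_payload(MAX_LEGACY_FRAME, tagged=False))     # untagged frame
print(max_payload(MAX_LEGACY_FRAME, tagged=True))      # option 1: shorten the payload
print(max_payload(MAX_LEGACY_FRAME + TAG, tagged=True))  # option 2: lengthen the frame
```

Under these assumptions, keeping the 1,518-byte limit costs 4 bytes of payload per full-size frame, while keeping the payload intact produces 1,522-byte frames that some legacy devices will reject.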
In actuality, getting legacy equipment to interoperate with 802.1Q-aware devices may not be that big of a deal, as most vendors will be providing support for legacy equipment on a per-port basis for years to come. If you need to support an older switch or NIC, you'll simply disable 802.1Q on that specific port, and all traffic will be sent in regular form.
How We Tested
In order to test how much of a difference prioritization made on a congested network, we brought in a 3Com Corp. CoreBuilder 3500 Ethernet switch and loaded it up with a handful of systems using 3Com's 802.1Q-compliant Ethernet adapters.
The CoreBuilder 3500 is a regular Ethernet switch, with four distinct queues and support for a variety of media types. The 3500 also supports 802.1p and 802.1Q in production mode at the time of this writing, which is rather unusual (although probably not so unusual by the time you read this). The device can support 802.1Q or legacy devices on any port, and either accept packets that already have been prioritized or prioritize packets upon entry by matching specific protocols to specific priority values. For example, if you want to set Notes traffic to a higher priority than RealAudio, simply enable a filter in the switch for those protocols, then define the specific priority values you want associated with them.
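Conceptually, this kind of ingress classification amounts to a lookup table. The Python sketch below is not the 3500's filter syntax, which we do not reproduce here. The rule table, the port numbers chosen to stand in for Notes and RealAudio, and the priority values are all illustrative assumptions.

```python
DEFAULT_PRIORITY = 0

# (protocol, destination port) -> priority value; hypothetical example rules
RULES = {
    ("tcp", 1352): 6,  # assumed port for Notes traffic
    ("tcp", 7070): 1,  # assumed port for RealAudio traffic
}


def classify(protocol, dst_port, existing_priority=None, trust_tagged=True):
    """Pick a priority value for a frame arriving on a port.

    If the frame already carries a priority and the port is set to trust
    it, keep that value; otherwise match the protocol against the rules.
    """
    if trust_tagged and existing_priority is not None:
        return existing_priority
    return RULES.get((protocol, dst_port), DEFAULT_PRIORITY)
```

With these rules, classify("tcp", 1352) returns 6 and classify("tcp", 7070) returns 1, while a frame that arrives already marked with priority 5 keeps that value unless the port is told not to trust it.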
The 3Com NICs that we tested also support 802.1Q directly through the use of 3Com's DynamicAccess software. DynamicAccess implements prioritization in much the same way as the 3500 does: It analyzes the outbound network traffic and then sets the 802.1Q priority based on predefined mapping. DynamicAccess is currently available for Windows 95 and Windows NT, and supports IP and IPX on Ethernet and token ring. Future versions will support Unix and NetWare, as well as FDDI adapters.
Testing consisted of a pair of 200-MHz Dell Computer Corp. clients sending a variety of UDP (User Datagram Protocol) and TCP packets to a 266-MHz Dell PowerEdge server. The clients were set to run at 100 Mbps, while the server was set to 10 Mbps, allowing us to create a significant amount of congestion on the egress port. We then modified the prioritization characteristics for each of the clients and measured the number of packets received from each.
In general, things worked as one would expect, with both clients sending roughly the same amount of data with prioritization disabled, and one client sending approximately four times as much data as the other when highly discrepant prioritization values were assigned. However, these results depended on a wide variety of variables.
At first we saw very different results from the UDP and TCP tests that we performed. While UDP performance on the prioritized client rose quite steadily, TCP performance did not appear to improve much.
Since TCP requires a significant amount of handshaking and acknowledgments, it is very dependent on both ends of the connection being able to communicate effectively. In our first tests, we had enabled prioritization only on the client ports, which allowed the traffic to be sent very quickly. However, we had not enabled prioritization on the server port, so the switch did not return the server's ACK responses to the client's data very quickly at all, instead placing the unmarked packets into the low-priority queue. Once we enabled prioritization on both ends of the connection, the TCP performance of the client rose dramatically.
Most e-mail and database systems use TCP, so having the ability to implement prioritization services at all of the end-point systems will play an important part in your ability to use prioritization services effectively with those applications. Even those applications that rely on UDP traffic from both end points (such as voice over IP) will find this to be an important design consideration.
This same basic conundrum manifested itself in many scenarios. During another test, we placed the server on another switch that was not 802.1Q-compliant and connected the legacy switch to the 3500. In this situation, the two clients were connected to the 3500 (with different levels of prioritization), while the server was connected to a legacy switch.
Although this model worked well during the UDP testing (where the clients were simply jamming packets through the 3500 to the older switch), it failed miserably in our TCP tests, as the ACK responses were unable to make it back through to the 3500 in a timely manner when sent from the legacy switch. Once we enabled per-packet prioritization on the 3500's legacy port, TCP performance started to climb again.
This type of environment will be quite common during migration efforts, and was the initial topology suggested to us by 3Com. It is important to realize that this architecture has a very negative impact on TCP connections, where both ends of the connection have to be able to communicate in order for either side to send data quickly. If you need prioritization to work with TCP, you'll need the gear everywhere.
Another hardware-specific factor that can dictate your ability to prioritize traffic effectively is the number of queues that a switch provides and how much control you have over the bandwidth and processing time allocated to those queues. For example, the 3500 we used offered only four queues, although there are eight different levels of prioritization available with 802.1Q. Thus, we were forced to group multiple priority levels into each queue, which shot any chance of our having eight distinct levels of service. You should demand eight queues if you plan to take advantage of distinct levels of prioritization in a big way.
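The effect of folding eight priority levels into fewer queues can be shown with a short Python sketch. The even grouping used here is an assumption made for illustration; it is not necessarily how the 3500 or any other switch assigns priorities to queues.

```python
def queue_for(priority, num_queues=4):
    """Map a 3-bit priority (0-7) onto one of num_queues queues.

    Assumes a simple even split, with higher priorities landing
    in higher-numbered queues.
    """
    if not 0 <= priority <= 7:
        raise ValueError("priority must be 0-7")
    return priority * num_queues // 8


for priority in range(8):
    print(priority, "->", queue_for(priority))
```

With four queues, priorities 0 and 1 share a queue, as do 2 and 3, 4 and 5, and 6 and 7, so traffic marked 6 gets exactly the same treatment as traffic marked 7. With eight queues the mapping is one-to-one.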
Furthermore, we found that we only had a few levels of queue-processing available to us: "high," "low" and "best effort," with the additional option of flagging each queue as drop-eligible. Only by setting the high-priority queue to "high" and the low-priority queue to "low" with frame-dropping enabled were we able to get a diverse spread on our throughput tests. We would have liked to be able to devote more granular levels of priority to each of the queues, rather than simply defining a single high-priority queue and three low-priority ones.
... But Pipe Matters More
However, the most dramatic problem we encountered was with sustained testing. The longer we ran our prioritization tests, the more often we would encounter problems, with the low-priority node simply falling off the network. Since each topology has a maximum amount of time that a frame can be delayed, during periods of sustained congestion the high-priority traffic will eventually take up all the available queue space, resulting in the lower-priority traffic being dropped at ever-increasing rates.
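A toy simulation shows how this kind of starvation comes about. The Python sketch below models an egress port that sends one frame per tick and always serves the high-priority queue first. It is a deliberate simplification built on our own assumptions (separate fixed-size queues per class, strict priority, steady arrival rates) and does not model the 3500's actual buffering.

```python
from collections import deque


def simulate(ticks, high_rate, low_rate, capacity=8):
    """Run a strict-priority egress port for a number of ticks.

    high_rate and low_rate are frames arriving per tick for each class.
    Returns (sent, dropped) dictionaries keyed by class name.
    """
    queues = {"high": deque(), "low": deque()}
    sent = {"high": 0, "low": 0}
    dropped = {"high": 0, "low": 0}
    for _tick in range(ticks):
        # Arrivals: a frame is dropped when its class queue is full
        for name, rate in (("high", high_rate), ("low", low_rate)):
            for _frame in range(rate):
                if len(queues[name]) < capacity:
                    queues[name].append(1)
                else:
                    dropped[name] += 1
        # Departure: one frame per tick, high-priority queue first
        for name in ("high", "low"):
            if queues[name]:
                queues[name].popleft()
                sent[name] += 1
                break
    return sent, dropped


sent, dropped = simulate(ticks=100, high_rate=1, low_rate=1)
print(sent, dropped)
```

When the high-priority sender alone can fill the egress link, as in this run, the low-priority queue fills within a few ticks and every later low-priority frame is dropped, while none of that class is ever sent. Drop high_rate to zero and the same low-priority traffic gets through untouched.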
With UDP, this goes unnoticed from the sender's perspective, as the packets are sent without worry. But from the destination's point of view, the effect is that the traffic has simply stopped. Conversely, with TCP things get quite a bit messier. First a packet gets dropped, and thus no ACK is generated. After a while, the client's TCP stack will try to resend and will fail again. Finally, the client just gives up.
This is exactly what happens with e-mail and database clients on a saturated wire, by the way, which is why it is important to give your mission-critical applications a sufficient priority level to maintain operations in periods of high congestion.
Along these same lines, it's important to realize that you can't compensate for this by bumping up the priority of everything. Having all traffic run at the highest priority is exactly the same as having everything run at low priority: Everything fights for limited resources, and everything suffers.
This simply illustrates the need for sufficient bandwidth in general. Prioritization only allows you to pick which frames should be sent through (assuming they can be), but does not give you any additional bandwidth.
A simple rule of thumb to remember is that the importance of prioritization is proportional to the level of contention for bandwidth. If you never have any contention whatsoever, then you don't need any prioritization. But if your network is fully saturated, you absolutely need prioritization services in order to ensure that the really important data gets through.