The Trouble With Middleboxes – Part 3

In the first two parts, we discussed the resource limitations of middlebox architectures and the difficulty of maintaining symmetric traffic flow through them for proper stateful operation. In this third part, we turn to the operational challenges and costs of middleboxes, which can seriously impair their value in deployment.

As defined by RFC 3234, a middlebox is an infrastructure device that performs traffic forwarding functions other than those of a router or a switch. A large part of operating one lies in defining the conditions under which traffic forwarding decisions are made. While this may be an over-generalization, for the purpose of this discussion we shall assume that this forwarding ‘policy’ is made up of three basic components: match condition, security function, and function instance.

The ‘match condition’ determines which packets shall be processed by the middlebox, and what forwarding action shall be taken on matching packets. It is important to understand that this is an additional packet match performed by the middlebox above and beyond a forwarding (routing/switching) table lookup, and it therefore adds cost when forwarding packets. Middlebox architectures are generally designed to perform an initial forwarding table lookup to determine the basic output interface for a packet, then use this in-port/out-port information as part of the match condition processing. Unlike forwarding tables, which are optimized for a best (longest-prefix) match, middlebox policy tables are generally designed to perform first-match lookups, in a similar fashion to access control lists (ACLs). This difference can lead to unintended consequences: a router selects the most specific entry regardless of its position in the table, whereas a policy table selects whichever matching rule appears first. Care must therefore be taken in ordering match conditions, in order to avoid shadowed rules and misconfigurations that result in incorrect packet processing. Excessively long policy bases can also add significant latency when establishing new sessions through the middlebox.
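The difference between the two lookup styles can be sketched in a few lines of Python. This is only an illustrative model, not how any particular middlebox implements its policy engine: the `Rule` class, the rule names, the prefixes, and the interface names are all made up for the example, and matching is reduced to IPv4 source and destination prefixes.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical policy rule: match on source/destination prefix only."""
    name: str
    src: str      # source prefix, CIDR notation
    dst: str      # destination prefix, CIDR notation
    action: str   # e.g. "pass" or "block"


def first_match(rules, src_ip, dst_ip):
    """ACL-style lookup: return the first rule that matches, in table order."""
    s = ipaddress.ip_address(src_ip)
    d = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if s in ipaddress.ip_network(rule.src) and d in ipaddress.ip_network(rule.dst):
            return rule
    return None


def longest_prefix_match(routes, dst_ip):
    """Router-style lookup: the most specific matching prefix wins, order is irrelevant."""
    d = ipaddress.ip_address(dst_ip)
    matching = [(ipaddress.ip_network(p), hop) for p, hop in routes.items()
                if d in ipaddress.ip_network(p)]
    if not matching:
        return None
    return max(matching, key=lambda entry: entry[0].prefixlen)[1]


def shadowed_rules(rules):
    """Report rules that can never fire because an earlier rule covers them entirely."""
    shadowed = []
    for i, later in enumerate(rules):
        for earlier in rules[:i]:
            src_covered = ipaddress.ip_network(later.src).subnet_of(ipaddress.ip_network(earlier.src))
            dst_covered = ipaddress.ip_network(later.dst).subnet_of(ipaddress.ip_network(earlier.dst))
            if src_covered and dst_covered:
                shadowed.append((later.name, earlier.name))
                break
    return shadowed


# Example data (assumed for illustration only).
POLICY = [
    Rule("allow-corp", "10.0.0.0/8", "0.0.0.0/0", "pass"),
    Rule("block-lab", "10.1.2.0/24", "0.0.0.0/0", "block"),   # never reached
    Rule("default-deny", "0.0.0.0/0", "0.0.0.0/0", "block"),
]
ROUTES = {"0.0.0.0/0": "eth0", "10.0.0.0/8": "eth1", "10.1.2.0/24": "eth2"}
```

With this data, a packet from 10.1.2.5 is routed by the most specific prefix (`eth2`), but the policy lookup stops at `allow-corp`, so the more specific `block-lab` rule is shadowed and `shadowed_rules(POLICY)` reports it. A check of this kind is one way to catch ordering mistakes before a policy is pushed.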

As mentioned, the match condition policy also defines the primary outcome for matching packets. Outcomes may go beyond pass/block to include quality-of-service (QoS) functions, network address translation (NAT), encryption/encapsulation functions, and so on, each of which adds processing overhead on the middlebox. The operational and performance costs of their use must be considered. In the previous part of this series, we talked about how these functions may be required to solve the issue of maintaining traffic symmetry for proper middlebox operation.

An optimal solution to the issues associated with policy match conditions is to do away with them completely. If a middlebox had only two interfaces and forwarded traffic transparently like a bridge, and if all packets were processed through the same set of security functions, a policy base would be unnecessary (or at least reduced to a single entry that matches all packets). Theoretically, every packet passing through a given middlebox would then be processed by the same set of security functions, resulting in quantitatively stable performance. But is it practical, with any given Internet traffic mix (IMIX), for all packets to be processed by the same set of functions? For example, more than 70% of Internet traffic in the US is now streaming video from a finite set of content delivery networks (CDNs). Should all traffic from these CDNs be subject to deep packet inspection (DPI), when the likelihood is that little to none of it is malicious in nature?

A better answer exists in decoupling middleboxes into separate forwarding and security functions. With the advent of programmable switching, such as whitebox switches supporting software-defined networking (SDN), network operating systems (e.g., Open Network Linux, Cumulus, SnapRoute, OpenWRT), and future devices programmed in P4 (Programming Protocol-independent Packet Processors), it is becoming increasingly possible to use programmable switches to enforce network security decisions from analytical engines (pass, block, rate-limit, mark), as well as to redirect or copy traffic towards associated near-line inspection engines.
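As a rough illustration of this split, the sketch below models the switch side as a small steering table: known-good sources are passed directly, known-bad sources are blocked, and everything else is redirected to an inspection engine. The `Action` names, the table contents, and the `steer` function are assumptions made for the example (the prefixes are documentation ranges standing in for CDN and blocklist entries); a real deployment would express the same idea in the switch's own match-action tables.

```python
import ipaddress
from enum import Enum


class Action(Enum):
    PASS = "pass"          # forward without inspection
    BLOCK = "block"        # drop at the switch
    REDIRECT = "redirect"  # send to a near-line inspection engine


# Hypothetical steering entries, as might be installed by an analytics engine.
STEERING_TABLE = [
    (ipaddress.ip_network("203.0.113.0/24"), Action.PASS),    # stand-in for a trusted CDN
    (ipaddress.ip_network("198.51.100.0/24"), Action.BLOCK),  # stand-in for a blocklisted source
]

# Anything not explicitly classified is inspected.
DEFAULT_ACTION = Action.REDIRECT


def steer(src_ip):
    """Return the action of the most specific matching entry, or the default."""
    addr = ipaddress.ip_address(src_ip)
    best = None
    for net, action in STEERING_TABLE:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, action)
    return best[1] if best else DEFAULT_ACTION
```

Here `steer("203.0.113.10")` yields `Action.PASS`, so the bulk video traffic in the earlier example would never reach the DPI engines, while an unclassified source such as `192.0.2.1` falls through to `Action.REDIRECT`.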

In Part 1 of this series we discussed middlebox design limits, and how the best way forward was to increase the number of CPUs available for packet processing. A large number of VM-based network security functions could scale to meet this need when coupled with programmable switches capable of distributing (and load-balancing) traffic across them. In Part 2 we discussed the concerns about traffic symmetry and middleboxes, and how asymmetric traffic tends to result in middlebox failure or degraded performance. Using programmable switches, it is possible to distribute traffic towards a common set of network security functions such that, from the point of view of those functions, the traffic appears symmetrical. The policy-related issues discussed in this part are mitigated by combining programmable switching for traffic distribution with the simplification of network security functions into two-interface, transparent service function chains.
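One common way to make traffic appear symmetrical to a pool of security functions is a direction-independent flow hash: both directions of a session are normalized to the same key, so they land on the same instance. The following is a minimal software sketch of that idea, not a description of any specific switch's hashing; the function names and instance labels are hypothetical.

```python
import hashlib


def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Build a key that is identical for both directions of a flow."""
    # Sorting the two endpoints removes the notion of "source" and "destination".
    lo, hi = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{proto}|{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}"


def pick_instance(instances, src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Map a flow to one security function instance using the symmetric key."""
    key = flow_key(src_ip, src_port, dst_ip, dst_port, proto)
    digest = hashlib.sha256(key.encode()).digest()
    return instances[int.from_bytes(digest[:8], "big") % len(instances)]


# Hypothetical pool of VM-based security function instances.
INSTANCES = ["fw-vm-1", "fw-vm-2", "fw-vm-3", "fw-vm-4"]

forward = pick_instance(INSTANCES, "10.1.2.5", 51512, "203.0.113.10", 443)
reverse = pick_instance(INSTANCES, "203.0.113.10", 443, "10.1.2.5", 51512)
```

Because the key ignores direction, `forward` and `reverse` always name the same instance, even if the two directions arrive on different switch ports. A simple modulo mapping like this reshuffles flows whenever the pool size changes, so a production design would likely need consistent hashing or per-flow state on top of it.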
