The Trouble With Middleboxes – Part 1

According to RFC 3234, “A middlebox is defined as any intermediary device performing functions other than the normal, standard functions of an IP router on the datagram path between a source host and destination host.”  The overwhelming majority of network security appliances, such as firewalls (FW) and intrusion prevention systems (IPS), are middleboxes.

Unlike routers and switches, which make forwarding decisions based only on IP and MAC address information, security middleboxes use additional attributes in making forwarding decisions: policy-based filters, session state, and deep-packet inspection (DPI) engines that examine packet payload information.
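To make the contrast concrete, here is a minimal sketch of the two kinds of forwarding decision.  Everything in it is an assumption made for illustration: the `Packet` fields, the `ALLOWED_PORTS` policy, the `BLOCKED_PATTERNS` signature list, and the `StatefulMiddlebox` class are hypothetical and do not reflect how any particular product works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dst_port: int
    payload: bytes


# Hypothetical policy: destination ports on which new sessions are allowed.
ALLOWED_PORTS = {80, 443}
# Hypothetical DPI signatures: byte patterns that cause a packet to be dropped.
BLOCKED_PATTERNS = [b"malware"]


def router_forward(pkt, routes):
    """A router only asks whether it has a route to the destination."""
    return pkt.dst in routes


class StatefulMiddlebox:
    """Toy middlebox: policy filter + session state + payload inspection."""

    def __init__(self):
        self.sessions = set()  # session-state table, held in RAM

    def forward(self, pkt):
        key = (pkt.src, pkt.dst, pkt.dst_port)
        if key not in self.sessions:
            # First packet of a session: check it against the policy base.
            if pkt.dst_port not in ALLOWED_PORTS:
                return False
            self.sessions.add(key)
        # DPI: scan the payload of every packet, which costs CPU time.
        return not any(pattern in pkt.payload for pattern in BLOCKED_PATTERNS)
```

The router's decision is a single lookup, while the middlebox has to keep a table in memory and touch the payload of every packet, which is where the resource pressure discussed in this post comes from.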

As a loose analogy, consider the role of airport security in taking a flight.  Having a ticket is analogous to a policy-based filter, as all passengers require valid tickets.  Whether you are entering the secure area or leaving it may be analogous to session state (whether you are pre-cleared for light inspection, or exempt from a personal scan, can be a sub-analogy for UDP vs. TCP).  The x-ray inspection of your carry-on is analogous to the role of a DPI engine.  While this analogy breaks down quickly, it still makes the point that security middleboxes create operational challenges when integrated into networks, particularly if all traffic is required to pass through them.

Available resources become a significant challenge to middlebox operation.  Maintaining session state or performing DPI functions requires a significant amount of CPU and RAM.  Unlike merely passing packets, performing first-packet checks against a variable policy base or performing DPI functions requires handling by a CPU, even where hardware-based acceleration can be applied.  Let’s examine the case of a basic stateful FW.  Very large appliances available today can support 500K new sessions/sec, and chassis-based units can exceed 1M new sessions/sec.  For perimeter-based defenses, this may appear adequate.  But what happens in datacenter deployments, particularly those supporting mobile carriers, that aggregate large numbers of protected endpoints?  If a given stateful FW needs to support 100K endpoints, a large appliance has only 5 new sessions/sec available per endpoint.  While a chassis may double or triple this, the associated session-rate costs become prohibitive.  If you add DPI functions on top of this, the scalability limits rapidly become untenable in aggregated environments.
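The per-endpoint figure is simple division, sketched below using the numbers from the paragraph above.  The function name `sessions_per_endpoint` is just an illustrative helper.

```python
def sessions_per_endpoint(new_sessions_per_sec, endpoints):
    """Share of the platform's new-session rate available to each endpoint."""
    return new_sessions_per_sec / endpoints


LARGE_APPLIANCE_RATE = 500_000  # new sessions/sec, from the text
ENDPOINTS = 100_000             # protected endpoints, from the text

per_endpoint = sessions_per_endpoint(LARGE_APPLIANCE_RATE, ENDPOINTS)
print(per_endpoint)  # 5.0

# A chassis that doubles or triples the session rate still leaves
# only 10 or 15 new sessions/sec for each endpoint.
for multiplier in (2, 3):
    print(sessions_per_endpoint(LARGE_APPLIANCE_RATE * multiplier, ENDPOINTS))
```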

Middlebox limits such as session rate are set by maximizing the available CPU/RAM resources on the platform.  While hardware-based acceleration and CPU efficiency (such as the use of dedicated cores for specific functions) may help extend these limits, the reality is that the day a customer first purchases a middlebox, both its peak performance and its functional limits have already been established.  Customers must weigh steady-state requirements, burst/peak requirements, and growth, much as airport security in our analogy must plan staffing for an average day vs. peak holiday travel periods.
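One way to frame that sizing exercise is sketched below.  The formula (steady-state rate, multiplied by a burst factor, compounded by annual growth) is a simplifying assumption rather than a vendor sizing method, and the steady-state rate, burst factor, and growth figures are hypothetical placeholders; only the 500K platform limit comes from the discussion above.

```python
def required_session_rate(steady_rate, burst_factor, annual_growth, years):
    """Peak new-session rate the platform must handle at the end of its life.

    Assumes bursts scale with the steady-state rate and that growth
    compounds annually; both are simplifying assumptions.
    """
    return steady_rate * burst_factor * (1 + annual_growth) ** years


def fits(platform_limit, required_rate):
    """True if the limit fixed at purchase time covers the required rate."""
    return platform_limit >= required_rate


PLATFORM_LIMIT = 500_000  # new sessions/sec, fixed on the day of purchase

# Hypothetical inputs: 200K steady-state, 2x bursts, 25% growth for 3 years.
required = required_session_rate(200_000, 2.0, 0.25, 3)
print(f"required: {required:,.0f} new sessions/sec")
print("fits" if fits(PLATFORM_LIMIT, required) else "undersized")
```

With these placeholder inputs, a box that looks comfortable on day one (200K against a 500K limit) ends up undersized once bursts and growth are included.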

For datacenter architectures, the requirements are driven not just by the size of the datacenter, but by the size of its peering capacity.  The ongoing revolution in optical networking has resulted in multiple Tbps over a single physical fiber run; where 10Gbps peer connections were previously common, 100Gbps is now the norm, with 400Gbps not too far in the future.  Placing current middleboxes inline with such peering connections is too limiting a proposition.  This is one reason many network security manufacturers are moving towards virtualized middlebox solutions and re-establishing virtual perimeters around virtual assets.  One term for this is micro-segmentation.  The downside is that these products do not benefit from hardware-based acceleration, relying solely on CPUs not only to perform stateful and DPI functions, but to perform packet forwarding as well.
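To get a feel for what CPU-only forwarding is up against at these link speeds, the sketch below works out the worst-case packet rate and the time available per packet.  It assumes Ethernet links carrying minimum-size 64-byte frames, each with 20 bytes of preamble and inter-frame gap on the wire; real traffic mixes have larger packets, so this is an upper bound on packet rate, not a typical figure.

```python
# Assumption: worst case of back-to-back minimum-size Ethernet frames.
MIN_FRAME_BYTES = 64
WIRE_OVERHEAD_BYTES = 20  # 8 bytes preamble/SFD + 12 bytes inter-frame gap


def max_packets_per_sec(link_bps):
    """Upper bound on packets/sec for a link running at link_bps."""
    bits_per_packet = (MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_packet


def ns_per_packet(link_bps):
    """Time budget per packet, in nanoseconds, at the worst-case rate."""
    return 1e9 / max_packets_per_sec(link_bps)


for gbps in (10, 100, 400):
    bps = gbps * 1e9
    mpps = max_packets_per_sec(bps) / 1e6
    print(f"{gbps} Gbps: {mpps:.1f} Mpps, {ns_per_packet(bps):.2f} ns per packet")
```

At 100Gbps this works out to roughly 148.8 million packets per second, or under 7 ns per packet, before any session-state lookup or payload inspection is done.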

In this first part, we’ve explored the basics of how resource limits on security middleboxes create scaling challenges as these devices move from perimeter defenses into aggregated datacenter environments.  In subsequent parts, we will explore additional challenges in the network implementation of middleboxes, as well as development and operational (DevOps) alternatives, including the use of contextual forwarding.

