Multipath TCP and load balancers
Load balancers play a very important role in today’s Internet. Most Internet services are provided by servers that reside behind one or several layers of load balancers. Various load balancers have been proposed and implemented. They can operate at layer 3, layer 4 or layer 7. Layer 4 is very popular and we focus on such load balancers in this blog post. A layer-4 load balancer uses information from the transport layer to load balance TCP connections over different servers. There are two main types of layer-4 load balancers:
The stateful load balancers
The stateless load balancers
Schematically, a load balancer is a device or network function that processes incoming packets and forwards all packets that belong to the same connection to a specific server. A stateful load balancer maintains a table that associates the five-tuple that identifies a TCP connection with a specific server. When a packet arrives, it looks for a matching entry in the table. If a match is found, the packet is forwarded to the selected server. If there is no match, e.g. the packet is a SYN, a server is chosen and the table is updated before forwarding the packet. The table entries are removed when they expire or when the associated connection is closed. A stateless load balancer does not maintain a table. Instead, it relies on a hash function that is computed over each incoming packet. A simple approach is to compute a CRC over the source and destination addresses and ports and to associate each server with a range of CRC values.
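The two designs can be illustrated with a minimal Python sketch. The server addresses, the round-robin choice for new connections, the decision to drop non-SYN packets that have no entry, and the explicit close() call are assumptions made for illustration; entry expiry is left out.

```python
import zlib
from typing import Dict, List, Optional, Tuple

# (source IP, source port, destination IP, destination port, protocol)
FiveTuple = Tuple[str, int, str, int, str]

SERVERS: List[str] = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backends


def stateless_pick(flow: FiveTuple, servers: List[str] = SERVERS) -> str:
    """Stateless: CRC32 over addresses and ports, one CRC range per server."""
    src_ip, src_port, dst_ip, dst_port, _proto = flow
    data = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    crc = zlib.crc32(data)  # unsigned 32-bit integer in Python 3
    # Split the 32-bit CRC space into len(servers) equal ranges (rounded up).
    range_size = (2**32 + len(servers) - 1) // len(servers)
    return servers[crc // range_size]


class StatefulBalancer:
    """Stateful: a table maps each five-tuple to the server chosen on the SYN."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        self.table: Dict[FiveTuple, str] = {}
        self._next = 0  # round-robin counter (assumed selection policy)

    def forward(self, flow: FiveTuple, syn: bool = False) -> Optional[str]:
        server = self.table.get(flow)
        if server is None:
            if not syn:
                return None  # assumption: unknown non-SYN packets are dropped
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.table[flow] = server
        return server

    def close(self, flow: FiveTuple) -> None:
        """Remove the entry when the connection is closed."""
        self.table.pop(flow, None)
```

The stateless variant needs no memory per connection, but its mapping depends only on the addresses and ports of each packet, which is what makes Multipath TCP subflows difficult to handle.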
With Multipath TCP, a single connection can be composed of different subflows that each have their own five-tuple. This implies that the data corresponding to a given Multipath TCP connection can be received over several different TCP subflows, and the load balancer needs to forward all of them to the same server. Several approaches have been proposed in the literature to solve this problem.
In Datacenter Scale Load Balancing for Multipath Transport, V. Olteanu and C. Raiciu proposed two different tricks to support stateless load balancers with Multipath TCP. First, the load balancer selects the key that will be used by the server for each incoming Multipath TCP connection. As this key is used to derive the Token that identifies the Multipath TCP connection in the MP_JOIN option, this enables the load balancer to control the Token that clients will send when creating subflows. This allows the load balancer to correctly associate MP_JOINs with the server that terminates the corresponding connection. This is not sufficient for a stateless load balancer. A stateless load balancer also needs to associate each incoming packet with a specific server. If this packet belongs to an additional subflow, it carries source and destination addresses and ports, but those have no relationship with the ones of the initial subflow. They solve this problem by encoding the identification of the server inside a part of the TCP timestamp option.
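The following Python sketch shows the two ideas under stated assumptions. The token derivation (the most significant 32 bits of the SHA-1 hash of the 64-bit key) is the one defined in RFC 6824. How the key is chosen so that the token designates a server, and which timestamp bits carry the server identifier, are hypothetical choices made here for illustration; the paper's actual encoding may differ.

```python
import hashlib
import os


def mptcp_token(key: bytes) -> int:
    """RFC 6824 token: most significant 32 bits of SHA-1(key)."""
    if len(key) != 8:
        raise ValueError("MPTCP keys are 64 bits long")
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


def pick_key_for_server(server_id: int, n_servers: int) -> bytes:
    """Assumed scheme: draw random keys until token % n_servers == server_id."""
    while True:
        key = os.urandom(8)
        if mptcp_token(key) % n_servers == server_id:
            return key


SERVER_BITS = 4  # assumption: low-order timestamp bits reserved for the server id


def encode_tsval(clock: int, server_id: int) -> int:
    """Build a 32-bit TSval whose low-order bits carry the server id."""
    return ((clock << SERVER_BITS) | server_id) & 0xFFFFFFFF


def server_from_tsecr(tsecr: int) -> int:
    """Recover the server id from the timestamp echoed by the client."""
    return tsecr & ((1 << SERVER_BITS) - 1)
```

With such an encoding, an MP_JOIN can be steered from its token alone, and later packets of any subflow can be steered from the echoed timestamp, without any table in the load balancer.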
In Towards a Multipath TCP Aware Load Balancer, S. Lienardy and B. Donnet propose a mix of the stateless and stateful approaches. The packets from the first subflow are sent to a specific server by hashing their source and destination addresses and ports. The load balancer then extracts the key exchanged in the third ACK and stores the token associated with this connection. This token is placed in a map that is used to load balance the SYN packets carrying MP_JOIN. The reception of an MP_JOIN packet forces the creation of an entry in a table that is used to map the packets from the additional subflows.
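A rough Python sketch of this hybrid design follows. The class and method names are hypothetical, the CRC-based hash stands in for whatever hash the authors use, and the token is derived as in RFC 6824 from the server's key, since that is the token a client places in its MP_JOIN.

```python
import hashlib
import zlib
from typing import Dict, List, Optional, Tuple

Flow = Tuple[str, int, str, int]  # source IP, source port, destination IP, destination port


def token_of(key: bytes) -> int:
    """RFC 6824 token: most significant 32 bits of SHA-1(key)."""
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


class HybridBalancer:
    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        self.token_to_server: Dict[int, str] = {}  # learned from the third ACK
        self.subflow_table: Dict[Flow, str] = {}   # created on MP_JOIN

    def _hash_pick(self, flow: Flow) -> str:
        # Stateless part: hash of addresses and ports (CRC32 is an assumption).
        return self.servers[zlib.crc32(repr(flow).encode()) % len(self.servers)]

    def on_third_ack(self, flow: Flow, server_key: bytes) -> str:
        """Record the token of the connection carried by the first subflow."""
        server = self._hash_pick(flow)
        self.token_to_server[token_of(server_key)] = server
        return server

    def on_mp_join_syn(self, flow: Flow, token: int) -> Optional[str]:
        """Map an additional subflow to the server that owns the token."""
        server = self.token_to_server.get(token)
        if server is not None:
            self.subflow_table[flow] = server
        return server

    def forward(self, flow: Flow) -> str:
        # Additional subflows use the table, first subflows use the hash.
        return self.subflow_table.get(flow) or self._hash_pick(flow)
```

State is only kept per token and per additional subflow, so a plain TCP connection or the first subflow of a Multipath TCP connection is still handled by the hash alone.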
In Making Multipath TCP friendlier to Load Balancers and Anycast, F. Duchene and O. Bonaventure leverage a feature of the forthcoming standards-track version of Multipath TCP. In this revision, the MP_CAPABLE option has been modified compared to RFC6824. A first modification is that the client does not send its key anymore in the SYN packet. A second modification is the C bit: when set by a server in the SYN+ACK, it indicates that the server will not accept additional MPTCP subflows to the source address and port of that SYN+ACK, i.e. the address and port to which the client sent its SYN. This bit was specifically introduced to support load balancers. It works as follows. When a client creates a connection, it sends a SYN towards the load balancer with the MP_CAPABLE option but no key. The load balancer selects one server to handle the connection, e.g. based on a stateless hash. Each server has a dedicated IP address or a dedicated port number. The selected server replies to the SYN with a SYN+ACK that contains the MP_CAPABLE option with the C bit set. Once the connection is established, the server sends an ADD_ADDR option with its direct IP address to the client. The client then uses the direct address to create the subflows and those can completely bypass the load balancer. The source code of the implementation is available from https://github.com/fduchene/ICNP2017
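The exchange described above can be modelled with a small Python sketch. It is not taken from the authors' implementation: the addresses, the Server type and both function names are hypothetical, and the model only captures which destinations a client may use for additional subflows.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Server:
    name: str
    direct_addr: str  # hypothetical address that bypasses the load balancer


VIP = "203.0.113.10"  # hypothetical address served by the load balancer
SERVERS = [Server("s1", "198.51.100.1"), Server("s2", "198.51.100.2")]


def balance_syn(src_ip: str, src_port: int, dst_port: int) -> Server:
    """Load balancer: stateless hash of the initial SYN selects a server."""
    data = f"{src_ip}:{src_port}-{VIP}:{dst_port}".encode()
    return SERVERS[zlib.crc32(data) % len(SERVERS)]


def subflow_targets(c_bit: bool, initial_dst: str, add_addrs: List[str]) -> List[str]:
    """Client: destinations usable for additional subflows.

    With the C bit set, the address of the initial subflow is excluded and
    only addresses learned through ADD_ADDR remain.
    """
    targets = list(add_addrs)
    if not c_bit:
        targets.insert(0, initial_dst)
    return targets


server = balance_syn("192.0.2.1", 40000, 443)
# The server set the C bit and advertised its direct address with ADD_ADDR.
print(subflow_targets(True, VIP, [server.direct_addr]))
```

Because the additional subflows are sent to the direct address, the load balancer only ever sees the initial subflow and can remain stateless.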
The latest Multipath TCP load balancer was proposed in Stateless Datacenter Load-balancing with Beamer by V. Olteanu et al. It assigns one port to each load-balanced server and also forces the client to create the subflows towards this per-server port number. The load balancer is implemented in both software (Click elements) and hardware (P4) and evaluated in detail. The source code is available from https://github.com/Beamer-LB