The Multipath TCP foundations

When looking at a new protocol, it is always interesting to start by reading the initial motivations for its design. The initial design of Multipath TCP was heavily influenced by the Resource Pooling Principle written by Damon Wischik, Mark Handley and Marcelo Bagnulo [WHB08] and published as an editorial in SIGCOMM’s Computer Communication Review. Since the early days of computer networks, statistical multiplexing, failure resilience and load balancing have played a key role in enabling networks to carry a growing amount of traffic. However, many of the techniques that are used today were designed under the assumption that they needed to have a local impact. Many of these designs missed the opportunity to treat the pooling of all the available resources as an end-to-end problem.

Multipath TCP, by enabling endhosts to efficiently use different paths to exchange packets, was designed to solve one aspect of this problem. Content Distribution Networks and the more recent Mobile Edge Computing approaches also contribute to this overall goal of improving the sharing of all the available network resources. The initial design for Multipath TCP is briefly sketched on page 3:

../../../_images/pooling.png

Later on this page, Damon Wischik, Mark Handley and Marcelo Bagnulo provide an interesting comment about the design of Multipath TCP:

Adding multipath support to TCP is so obvious that it has been re-invented many times [Hui95], [HS02], [RA04], [DWPW07], and multihoming is built into SCTP, though no protocol that simultaneously uses multiple paths has ever been standardized let alone widely deployed. Why is there not more multipath at the transport layer? Perhaps because it has not been understood that multipath lets end systems solve network-wide resource pooling problems, and because the issues with current mechanisms are only now becoming pressing enough to fix.

This paragraph clearly suggests that one of the objectives of Multipath TCP is to put the endhosts in control of the selection and the utilisation of multiple end-to-end paths to reach a given destination. In fact, the Resource Pooling Principle could be considered as a natural extension of Saltzer, Reed and Clark’s End-to-end arguments in system design paper [SRC84] when considering resource utilisation.

The paper ends with one definition and two observations that remain valid today:

Definition. Resource pooling means making a collection of networked resources behave as though they make up a single pooled resource. The general method of resource pooling is to build mechanisms for shifting load between various parts of the network.

Observation 1. Resource pooling is often the only practical way to achieve resilience at acceptable cost.

Observation 2. Resource pooling is also a cost-effective way to achieve flexibility and high utilization.

References

[DWPW07]Yu Dong, Dingding Wang, Niki Pissinou, and Jian Wang. Multi-path load balancing in transport layer. In Next Generation Internet Networks, 3rd EuroNGI Conference on, 135–142. IEEE, 2007.
[HS02]Hung-Yun Hsieh and Raghupathy Sivakumar. Ptcp: an end-to-end transport layer protocol for striped connections. In null, 24. IEEE, 2002.
[Hui95]C Huitema. Multi-homed tcp. draft-huitema-multi-homed-01. Internet Engineering Task Force (IETF), 1995.
[RA04]Kultida Rojviboonchai and Hitoshi Aida. An evaluation of multi-path transmission control protocol (m/tcp) with robust acknowledgement schemes. IEICE transactions on communications, 87(9):2699–2707, 2004.
[SRC84]J. Saltzer, D. Reed, and D. Clark. End-to-end arguments in system design. ACM Transactions on Computer Systems (TOCS), 2(4):277–288, 1984.
[WHB08]D. Wischik, M. Handley, and M. Bagnulo. The resource pooling principle. SIGCOMM Comput. Commun. Rev., 38(5):47–52, September 2008. URL: http://doi.acm.org/10.1145/1452335.1452342, doi:10.1145/1452335.1452342.

Multipath TCP Tutorials

Many scientific articles and IETF documents have been published on Multipath TCP. A network engineer, researcher or student who wants to learn Multipath TCP will probably start from a search engine or Wikipedia. A sample result is provided below.

../../../_images/search-mptcp.png

The Multipath TCP page on Wikipedia provides some pointers, but this is probably not the simplest starting point to learn Multipath TCP. Fortunately, several tutorial articles that describe the basic principles of this TCP extension have been published.

One of the first tutorial articles is An overview of Multipath TCP, which was published in USENIX ;login: in October 2012 [BHR12]. This article provides a basic overview of some of the principles of Multipath TCP.

The second article is simply entitled Multipath TCP and appeared in Communications of the ACM in 2014 [PB14]. It provides a more detailed overview of the protocol and some of its use cases. This is probably the most complete tutorial article on Multipath TCP.

If you prefer to watch video tutorials instead of reading articles, several of them have been posted on YouTube.

A long tutorial on the Multipath TCP protocol was given by Olivier Bonaventure at IETF’87 in Berlin in August 2013.

Christoph Paasch gave a shorter Multipath TCP tutorial earlier during FOSDEM’13 in Brussels.

Earlier, Costin Raiciu and Christoph Paasch gave a one-hour Google Research talk on the design of the protocol and several use cases.

[BHR12]O. Bonaventure, M. Handley, and C. Raiciu. An Overview of Multipath TCP. Usenix ;login: magazine, October 2012.
[PB14]Christoph Paasch and Olivier Bonaventure. Multipath tcp. Commun. ACM, 57(4):51–57, April 2014. URL: http://doi.acm.org/10.1145/2578901, doi:10.1145/2578901.

The first ten years of Multipath TCP

Multipath TCP was designed within the FP7 Trilogy project that started in early 2008. The first ideas on Multipath TCP were discussed in 2008, slightly more than a decade ago. During this decade, Multipath TCP has evolved a lot. It has also generated a lot of interest within the scientific community, with several hundred articles that use, extend or reference Multipath TCP. As an illustration of the scientific impact of Multipath TCP, the figure below shows the cumulative number of citations, according to Google Scholar, for the sequence of Internet drafts that became RFC 6824.

../../../_images/citations.png

The industrial impact of Multipath TCP is also very important: Apple uses it on all iPhones, and several network operators use it to create Hybrid Access Networks that combine xDSL and LTE to provide faster Internet services in rural areas.

On all the remaining days until Christmas, a new post will appear on this blog to illustrate one particular aspect of Multipath TCP with pointers to relevant scientific papers, commercial deployments, … This series of blog posts will constitute a simple advent calendar that could be useful for network engineers and researchers who want to understand how this new protocol works and why it is becoming more and more important in today’s Internet.

../../../_images/advent.png

Multipath TCP and load balancers

Load balancers play a very important role in today’s Internet. Most Internet services are provided by servers that reside behind one or several layers of load balancers. Various load balancers have been proposed and implemented. They can operate at layer 3, layer 4 or layer 7. Layer 4 is very popular and we focus on such load balancers in this blog post. A layer-4 load balancer uses information from the transport layer to load balance TCP connections over different servers. There are two main types of layer-4 load balancers:

  • The stateful load balancers

  • The stateless load balancers

Schematically, a load balancer is a device or network function that processes incoming packets and forwards all packets that belong to the same connection to a specific server. A stateful load balancer maintains a table that associates the five-tuple that identifies a TCP connection to a specific server. When a packet arrives, it seeks a matching entry in the table. If a match is found, the packet is forwarded to the selected server. If there is no match, e.g. the packet is a SYN, a server is chosen and the table is updated before forwarding the packet. The table entries are removed when they expire or when the associated connection is closed.

A stateless load balancer does not maintain a table. Instead, it relies on a hash function that is computed over each incoming packet. A simple approach is to use a CRC over the source and destination addresses and ports and associate each server to a range of CRC values.
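The two families can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the server names are hypothetical, the stateful balancer picks servers in round-robin order (the text does not say how a server is chosen), and entry expiration is left out.

```python
import zlib
from typing import Dict, List, Tuple

# (source address, source port, destination address, destination port, protocol)
FiveTuple = Tuple[str, int, str, int, str]

SERVERS: List[str] = ["server-a", "server-b", "server-c"]  # hypothetical backends


def stateless_pick(flow: FiveTuple, servers: List[str] = SERVERS) -> str:
    """Stateless: CRC over addresses and ports, one CRC range per server."""
    src_ip, src_port, dst_ip, dst_port, _proto = flow
    data = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    crc = zlib.crc32(data)  # unsigned 32-bit integer in Python 3
    # Split the 32-bit CRC space into equal ranges (rounded up), one per server.
    range_size = (2**32 + len(servers) - 1) // len(servers)
    return servers[crc // range_size]


class StatefulBalancer:
    """Stateful: a table maps each five-tuple to the server chosen for it."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        self.table: Dict[FiveTuple, str] = {}
        self._next = 0  # round-robin cursor (an assumption of this sketch)

    def forward(self, flow: FiveTuple) -> str:
        server = self.table.get(flow)
        if server is None:
            # No match (e.g. the packet is a SYN): choose a server and
            # update the table before forwarding the packet.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.table[flow] = server
        return server

    def close(self, flow: FiveTuple) -> None:
        # Remove the entry when the connection is closed (timeouts not modelled).
        self.table.pop(flow, None)
```

Both variants send every packet of a given five-tuple to the same server. The difference is where the mapping lives: in a table for the stateful balancer, in the hash function itself for the stateless one.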

With Multipath TCP, a single connection can be composed of different subflows that each have their own five-tuple. This implies that data corresponding to a given Multipath TCP connection can be received over several different TCP subflows, which obviously need to be forwarded to the same server by the load balancer. Several approaches have been proposed in the literature to solve this problem.

In Datacenter Scale Load Balancing for Multipath Transport, V. Olteanu and C. Raiciu proposed two different tricks to support stateless load balancers with Multipath TCP. First, the load balancer selects the key that will be used by the server for each incoming Multipath TCP connection. As this key is used to derive the Token that identifies the Multipath TCP connection in the MP_JOIN option, this enables the load balancer to control the Token that clients will send when creating subflows. This allows the load balancer to correctly associate MP_JOINs with the server that terminates the corresponding connection. This is not sufficient for a stateless load balancer, which also needs to associate each incoming packet to a specific server. A packet that belongs to an additional subflow carries the source and destination addresses and ports of that subflow, but these have no relationship with those of the initial subflow. The authors solve this problem by encoding the identification of the server inside a part of the TCP timestamp option.
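The first trick can be illustrated with a small sketch. In RFC 6824, the token is the most significant 32 bits of the SHA-1 hash of the 64-bit key. The rule used below to tie a token to a server (token modulo the number of servers) and the brute-force key search are assumptions made for illustration. They are not necessarily the scheme used in the paper, and the timestamp trick is not shown.

```python
import hashlib
import os


def token_from_key(key: bytes) -> int:
    """RFC 6824 token: most significant 32 bits of SHA-1(key)."""
    if len(key) != 8:
        raise ValueError("a Multipath TCP key is 64 bits long")
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


def server_for_token(token: int, n_servers: int) -> int:
    # Hypothetical mapping rule: the token alone identifies the server.
    return token % n_servers


def pick_key_for_server(server_index: int, n_servers: int,
                        max_tries: int = 10_000) -> bytes:
    """Draw random keys until the derived token maps to the wanted server."""
    for _ in range(max_tries):
        key = os.urandom(8)
        if server_for_token(token_from_key(key), n_servers) == server_index:
            return key
    raise RuntimeError("no suitable key found")


# The balancer picks the key on behalf of server 2 out of 4 servers.
key = pick_key_for_server(2, 4)
# Later, an MP_JOIN carrying the corresponding token maps to the same server
# without any per-connection state in the balancer.
joined_server = server_for_token(token_from_key(key), 4)
```

Since the token is derived from a hash, the balancer cannot choose it directly. It can only try keys until one yields a token with the desired property, which takes about as many attempts as there are servers with this mapping rule.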

In Towards a Multipath TCP Aware Load Balancer, S. Lienardy and B. Donnet propose a mix between stateless and stateful approaches. The packets from the first subflow are sent to a specific server by hashing their source and destination addresses and ports. The load balancer then extracts the key exchanged in the third ACK to store the token associated with this connection. This token is placed in a map that is used to load balance the SYN MP_JOIN packets. The reception of an MP_JOIN packet forces the creation of an entry in a table that is used to map the packets from the additional subflows.
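The hybrid approach can be sketched as follows. The class and method names are hypothetical. The sketch assumes that the key of interest in the third ACK is the server's key, because under RFC 6824 the MP_JOIN SYN carries the token of the host that receives it. Packet parsing and entry removal are omitted.

```python
import hashlib
import zlib
from typing import Dict, List, Tuple

Flow = Tuple[str, int, str, int]  # source address/port, destination address/port


def token_from_key(key: bytes) -> int:
    """RFC 6824 token: most significant 32 bits of SHA-1(key)."""
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big")


class HybridBalancer:
    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        self.token_to_server: Dict[int, str] = {}   # filled from third ACKs
        self.subflow_to_server: Dict[Flow, str] = {}  # filled from MP_JOIN SYNs

    def _hash_pick(self, flow: Flow) -> str:
        # Stateless part: hash of the addresses and ports of the first subflow.
        return self.servers[zlib.crc32(repr(flow).encode()) % len(self.servers)]

    def on_initial_subflow_packet(self, flow: Flow) -> str:
        return self._hash_pick(flow)

    def on_third_ack(self, flow: Flow, server_key: bytes) -> str:
        # Stateful part: remember which server owns the token of this connection.
        server = self._hash_pick(flow)
        self.token_to_server[token_from_key(server_key)] = server
        return server

    def on_mp_join_syn(self, flow: Flow, token: int) -> str:
        # The token selects the server; an entry is created for the new subflow.
        server = self.token_to_server[token]
        self.subflow_to_server[flow] = server
        return server

    def on_additional_subflow_packet(self, flow: Flow) -> str:
        return self.subflow_to_server[flow]
```

In this sketch, the first subflow never needs a table entry. State is only created for the token and for each additional subflow.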

In Making Multipath TCP friendlier to Load Balancers and Anycast, F. Duchene and O. Bonaventure leverage a feature of the forthcoming standards-track version of Multipath TCP. In this revision, the MP_CAPABLE option has been modified compared to RFC6824. A first modification is that the client does not send its key anymore in the SYN packet. A second modification is the C bit. When set by a server in the SYN+ACK, it indicates that the server will not accept additional MPTCP subflows to the source address and port of the SYN+ACK, i.e. the address and port that the client used to reach it. This bit was specifically introduced to support load balancers. It works as follows. When a client creates a connection, it sends a SYN towards the load balancer with the MP_CAPABLE option but no key. The load balancer selects one server to handle the connection, e.g. based on a stateless hash. Each server has a dedicated IP address or a dedicated port number. The selected server replies to the SYN with a SYN+ACK that contains the MP_CAPABLE option with the C bit set. Once the connection is established, the server sends an ADD_ADDR option with its direct IP address to the client. The client then uses the direct address to create the subflows and those can completely bypass the load balancer. The source code of the implementation is available from https://github.com/fduchene/ICNP2017
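The exchange described above can be mimicked with a toy model. Everything in it is an assumption made for illustration: the addresses come from documentation ranges, the options are represented as plain dictionaries rather than real TCP options, and the server is selected with a CRC.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Server:
    name: str
    direct_ip: str  # hypothetical per-server address that bypasses the balancer


VIP = "192.0.2.1"  # hypothetical address announced by the load balancer
SERVERS: List[Server] = [Server("s1", "192.0.2.11"), Server("s2", "192.0.2.12")]


def balance_syn(src_ip: str, src_port: int, dst_port: int) -> Server:
    """Load balancer: stateless hash of the initial SYN selects a server."""
    h = zlib.crc32(f"{src_ip}:{src_port}-{VIP}:{dst_port}".encode())
    return SERVERS[h % len(SERVERS)]


def syn_ack_option() -> Dict[str, object]:
    # The server sets the C bit: no additional subflows towards the VIP.
    return {"option": "MP_CAPABLE", "C": True}


def add_addr_option(server: Server) -> Dict[str, object]:
    # Sent once the connection is established.
    return {"option": "ADD_ADDR", "address": server.direct_ip}


def subflow_destination(syn_ack: Dict[str, object],
                        add_addr: Dict[str, object]) -> str:
    """Client: where should an additional subflow be opened?"""
    if syn_ack["C"]:
        return str(add_addr["address"])  # bypass the load balancer
    return VIP


server = balance_syn("198.51.100.7", 40000, 443)
destination = subflow_destination(syn_ack_option(), add_addr_option(server))
```

Here `destination` is the direct address of the server that handled the initial SYN, so the additional subflows never cross the load balancer.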

The latest Multipath TCP load balancer was proposed in Stateless Datacenter Load-balancing with Beamer by V. Olteanu et al. It assigns one port to each load balanced server and also forces the client to create the subflows towards this per-server port number. The load balancer is implemented in both software (Click elements) and hardware (P4) and evaluated in detail. The source code is available from https://github.com/Beamer-LB