Multipath TCP in the datacenter

In the scientific literature, one of the first important use case for Multipath TCP was to distribute the load datacenters. Several architectures have been proposed for datacenters. They differ in how links are organised, but all offer multiple paths between the servers. Measurement studies:cite:benson2010network have shown that datacenter traffic is composed of a lot of short flows called mice that are delay-sensitive, but most of the data is transported in long flows, called elephants that consume most of the bandwidth and can compete with the mice. One of the problems in a datacenter is that congestion can happen on some of the network links while others are unused. This is illustrated in the figure below that shows two TCP connections competing for the same link.


This problem was studied by Raiciu et al. by simulations [RBP+11]. They demonstrate that these collisions between competing flows significantly impact the performance of TCP.


Different techniques have been explored in the literature to solve this problem. Many of the proposed approaches used a centralised controller with Openflow or other similar techniques to reroute flows to avoid congestion.


With Multipath TCP, a completely distributed solution is possible. It leverages the utilisation of Equal Cost Multipath (ECMP) on datacenter switches. When a router/switch has several paths having the same cost towards a given destination, it can send packets over any of these paths. To maximise load-balancing, routers install all the available paths in their forwarding tables and balance the arriving packets over all of them. To ensure that all the packets that correspond to the same layer-4 flow follow the same path and thus have roughly the same delay, routers usually select the outgoing equal cost path by computing : H(IP_{src}||IP_{dst}||Port_{src}||Port_{dst})~mod~n when n is the number of equal cost paths towards the packet’s destination and H a hash function.

A consequence of this utilisation of ECMP is that TCP connections with different source ports between two hosts will sometimes follow different paths. This motivated the design of the ndiffports path manager in the Linux kernel. This path manager opens different subflows using the same source and destination IP addresses, the same destination port but different source addresses. The benefit of this approach is that the different subflows of a Multipath TCP connection will likely follow different paths inside the datacenter. With this path manager, Multipath TCP improves the utilisation of the datacenter as illustrated by the simulation results below.


One of the limiting factors of ECMP is that flows with different source ports may still use the same paths. This problem can be fixed by using a reversible hash function [DPVDL+13].

From an operation viewpoint, the most convincing argument of [RBP+11] was that similar results were obtained with the Linux implementation of Multipath TCP [PBarreD+14] in real datacenters by using Amazon EC2 servers.


The SIGCOMM‘11 article [RBP+11] attracted a lot of interest in the scientific community and is one of the most widely cited Multipath TCP articles. However, as of 2018, no real deployment of Multipath TCP in the datacenter has been publicly documented.


[DPVDL+13]Gregory Detal, Christoph Paasch, Simon Van Der Linden, Pascal Merindol, Gildas Avoine, and Olivier Bonaventure. Revisiting flow-based load balancing: stateless path selection in data center networks. Computer Networks, 57(5):1204–1216, 2013. URL:
[PBarreD+14]C. Paasch, S. Barré, G. Detal, F. Duchene, and others. Linux kernel implementation of Multipath TCP., 2014.
[RBP+11](1, 2, 3) C. Raiciu, S. Barre, C. Pluntke, A. Greenhalgh, D. Wischik, and M. Handley. Improving Datacenter Performance and Robustness with Multipath TCP. ACM SIGCOMM Computer Communication Review (CCR), 41(4):266–277, 2011. URL: