Multipath TCP encounters its first middlebox

Like many Internet protocols, Multipath TCP was designed by a working group. Involving many network engineers should ensure that all possible problems are considered and that the final design is stronger. However, there is also a risk that this design by committee becomes a series of compromises and that more and more options are added to the protocol. The best way to avoid this risk is to implement the protocol while it is being specified, or to base the key design decisions on an existing implementation. During its first year, the MPTCP working group mainly worked by email and many ideas were exchanged.

Fortunately, two members of the working group started to write an implementation based on the first drafts. Sebastien Barre announced his first prototype implementation in the Linux kernel on November 18th, 2009. Costin Raiciu replied on November 24th, 2009 with a user space implementation that also extended the Linux TCP stack.

Sebastien Barre’s implementation was designed to test the feasibility of implementing Multipath TCP inside an operating system kernel. Costin Raiciu’s implementation was designed to evaluate the performance of the coupled congestion control scheme [WRGH11]. These two implementations were complementary. Sebastien’s implementation leveraged his earlier experience in implementing shim6 (RFC 5533) in the Linux kernel [BarreRB11]. For this reason, it only supported IPv6, while Costin’s implementation only worked over IPv4.

Sebastien continued to improve his first implementation and released version 0.2 in March 2010. When he added IPv4 support to this implementation, he sent the kernel sources to colleagues in Finland and the UK. The Multipath TCP implementation was working well inside our labs and this was an opportunity to test it over longer distance paths to see how retransmissions and other techniques reacted to longer delays.

The first tests were a disaster. This version of Multipath TCP could establish a connection to the remote server, but no data was exchanged. Sebastien looked at all the possible sources of problems and took a packet trace on both servers to manually check the packets that were exchanged. Eventually, he found that a middlebox somewhere on the Internet was changing the TCP sequence numbers of the packets without modifying the Multipath TCP options. He summarized his findings in the email below.

../../../_images/mbox-1.png

He continued to explore the problem and found that the culprit was our campus firewall. This firewall was configured to rewrite TCP sequence numbers in order to protect the TCP connections of hosts with weak TCP stacks, such as Windows 98.

../../../_images/mbox-2.png
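The failure mode can be reproduced with a toy model. The sketch below is not the prototype's code. It assumes, purely for illustration, that the multipath option carries a copy of the subflow sequence number (the `option_seq` field, like every name in this block, is hypothetical) and it models the firewall as a box that adds a random offset to the sequence number in the TCP header only.

```python
import random
from dataclasses import dataclass, replace

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide


@dataclass(frozen=True)
class Segment:
    seq: int         # sequence number in the TCP header
    option_seq: int  # ASSUMPTION: copy of seq carried inside the multipath option
    payload: bytes


def sender(isn, chunks):
    """Yield segments whose option mirrors the TCP header sequence number."""
    seq = (isn + 1) % SEQ_SPACE
    for chunk in chunks:
        yield Segment(seq=seq, option_seq=seq, payload=chunk)
        seq = (seq + len(chunk)) % SEQ_SPACE


class RandomizingFirewall:
    """Toy middlebox: shifts header sequence numbers, ignores unknown options."""

    def __init__(self, rng):
        # A non-zero offset guarantees that every header is actually modified.
        self.offset = rng.randrange(1, SEQ_SPACE)

    def forward(self, seg):
        return replace(seg, seq=(seg.seq + self.offset) % SEQ_SPACE)


def receiver_accepts(seg):
    """The multipath layer only accepts data whose option matches the header."""
    return seg.seq == seg.option_seq


segments = list(sender(isn=1000, chunks=[b"hello", b"world"]))
firewall = RandomizingFirewall(random.Random(1))

direct = [receiver_accepts(s) for s in segments]
through_firewall = [receiver_accepts(firewall.forward(s)) for s in segments]
```

In this model every segment is accepted on the direct path and rejected behind the firewall, although the payload itself is untouched. Regular TCP does not notice the rewriting because the firewall applies the same offset consistently, which matches the symptom described above: the connection is established but no data gets through.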

This was the first time that a middlebox interfered with a Multipath TCP implementation and, unfortunately, by far not the last. The protocol designers learned this lesson and looked at different ways to make Multipath TCP much more resilient to middlebox interference. We will explore these issues in more detail in another blog post.

An important lesson that we learned is that by having developed a fully functional implementation early, we could quickly detect operational problems and fix them. Sebastien’s initial implementation later became the reference Multipath TCP implementation and various developers have contributed to the current code base. The global Internet is far more complex than what students learn in textbooks…

References

[BarreRB11]Sébastien Barré, John Ronan, and Olivier Bonaventure. Implementation and evaluation of the shim6 protocol in the Linux kernel. Computer Communications, 34(14):1685–1695, 2011. URL: https://inl.info.ucl.ac.be/publications/implementation-and-evaluation-shim6-protocol-linux-kernel.
[WRGH11]D. Wischik, C. Raiciu, A. Greenhalgh, and M. Handley. Design, Implementation and Evaluation of Congestion Control for Multipath TCP. In Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI). 2011.

Multipath TCP: controlling congestion

Since the publication of Van Jacobson’s seminal paper on Congestion Avoidance and Control in 1988 [Jac88], congestion control has been one of the most active topics in transport protocol research. This was one of the key scientific challenges for the design of a multipath transport protocol.

As explained in RFC 6182, one of the goals of Multipath TCP was to solve the congestion problem while remaining fair to regular TCP traffic. Multipath TCP uses multiple parallel TCP connections that are called subflows. A naive implementation that used a standard TCP congestion control scheme on each subflow would be unfair to single-path TCP flows that compete for the same resources. For example, with two subflows crossing the same bottleneck link, it would behave like two regular TCP connections and obtain roughly twice the share of a competing single-path flow.

Several researchers had addressed problems that are similar to multipath congestion control earlier but from a mathematical viewpoint and without considering an implementation in a real protocol [KV05], [KMassoulieT07]. An interesting survey article was published later [KMassoulieT11].

The first practical multipath congestion control scheme was proposed in the paper Design, Implementation and Evaluation of Congestion Control for Multipath TCP [WRGH11]. It is an evolution of the classical TCP congestion control scheme (RFC 5681). Compared to single-flow TCP congestion control, the multipath congestion control scheme slows down the increase of the congestion window while still halving it when congestion occurs. A summary of this multipath congestion control algorithm can be found below (source [WRGH11]).

../../../_images/coupled.png

The IETF has adopted this multipath congestion control algorithm as the default one for Multipath TCP (RFC 6356). It was initially designed based on simulations with htsim and a preliminary userspace implementation of MPTCP. It was later added to the Multipath TCP implementation in the Linux kernel.
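The coupled increase rule can be sketched in a few lines. The code below follows the formulas given in RFC 6356 for the congestion avoidance phase, with windows expressed in bytes. It is a simplified illustration and not the Linux kernel code: slow start is left out, the loss reaction is reduced to halving the window, and the names `Subflow`, `lia_alpha`, `on_ack` and `on_loss` are chosen here for readability.

```python
from dataclasses import dataclass


@dataclass
class Subflow:
    cwnd: float      # congestion window, in bytes
    rtt: float       # smoothed round-trip time, in seconds
    mss: int = 1460  # maximum segment size, in bytes (assumed value)


def lia_alpha(subflows):
    """Aggressiveness factor shared by all subflows (RFC 6356, Section 3)."""
    cwnd_total = sum(s.cwnd for s in subflows)
    best_path = max(s.cwnd / s.rtt ** 2 for s in subflows)
    rate_sum = sum(s.cwnd / s.rtt for s in subflows)
    return cwnd_total * best_path / rate_sum ** 2


def on_ack(subflows, i, bytes_acked):
    """Congestion avoidance: grow subflow i when bytes_acked are acknowledged."""
    s = subflows[i]
    cwnd_total = sum(f.cwnd for f in subflows)
    coupled = lia_alpha(subflows) * bytes_acked * s.mss / cwnd_total
    uncoupled = bytes_acked * s.mss / s.cwnd  # increase of a regular TCP flow
    # Taking the minimum ensures a subflow is never more aggressive than TCP.
    s.cwnd += min(coupled, uncoupled)


def on_loss(subflows, i):
    """Loss on subflow i: halve its window, as regular TCP does (simplified)."""
    s = subflows[i]
    s.cwnd = max(s.cwnd / 2, 2 * s.mss)
```

With a single subflow, `lia_alpha` evaluates to 1 and the increase is the same as for regular TCP. With two subflows that have the same window and the same round-trip time, it evaluates to about 0.5, so each acknowledgement grows the window by a quarter of what a regular TCP flow would add, while a loss still halves the window of the affected subflow.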

Since the publication of [WRGH11], several other congestion control algorithms have been proposed and implemented. These include OLIA [KGP+12], BALIA [PWHL16] and a multipath adaptation of TCP Vegas [CXF12]. The authors of these three algorithms released implementations for the Multipath TCP implementation in the Linux kernel. OLIA [KGP+12] is often preferred by Multipath TCP users.

Each of the above articles discusses the merits of the proposed congestion control scheme and compares it with other congestion control schemes through simulations or measurements. It is likely that other multipath congestion control algorithms will be proposed by the research community. As the IETF is adopting the CUBIC congestion control algorithm for single-path TCP (RFC 8312), a multipath variant would clearly be a useful contribution. It would also be interesting to develop a multipath variant of BBR [CCG+16].

A recent survey [KL18] summarises the multipath congestion control schemes implemented in the Linux kernel in a nice table shown below.

../../../_images/mptcp-cc.png

References

[CXF12]Yu Cao, Mingwei Xu, and Xiaoming Fu. Delay-based congestion control for multipath TCP. In Network Protocols (ICNP), 2012 20th IEEE International Conference on, 1–10. IEEE, 2012.
[CCG+16]Neal Cardwell, Yuchung Cheng, C Stephen Gunn, Soheil Hassas Yeganeh, and Van Jacobson. BBR: congestion-based congestion control. Queue, 14(5):50, 2016.
[Jac88]V. Jacobson. Congestion avoidance and control. ACM SIGCOMM Computer Communication Review, 18(4):314–329, 1988.
[KV05]Frank Kelly and Thomas Voice. Stability of end-to-end algorithms for joint routing and rate control. ACM SIGCOMM Computer Communication Review, 35(2):5–12, 2005.
[KMassoulieT07]Peter Key, Laurent Massoulié, and Don Towsley. Path selection and multipath congestion control. In INFOCOM 2007. 26th IEEE International Conference on Computer Communications, 143–151. IEEE, 2007.
[KMassoulieT11]Peter Key, Laurent Massoulié, and Don Towsley. Path selection and multipath congestion control. Communications of the ACM, 54(1):109–116, 2011.
[KGP+12]R. Khalili, N. Gast, M. Popovic, U. Upadhyay, and J.-Y. Le Boudec. MPTCP is not Pareto-Optimal: Performance Issues and a Possible Solution. In Proceedings of the 8th International Conference on Emerging Networking Experiments and Technologies (CoNEXT). 2012.
[KL18]Bruno Yuji Lino Kimura and Antonio Alfredo Frederico Loureiro. MPTCP Linux kernel congestion controls. Technical Report arXiv:1812.03210, arXiv, Dec. 2018. URL: https://arxiv.org/abs/1812.03210.
[PWHL16]Qiuyu Peng, Anwar Walid, Jaehyun Hwang, and Steven H Low. Multipath TCP: analysis, design, and implementation. IEEE/ACM Transactions on Networking (ToN), 24(1):596–609, 2016.
[WRGH11]D. Wischik, C. Raiciu, A. Greenhalgh, and M. Handley. Design, Implementation and Evaluation of Congestion Control for Multipath TCP. In Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI). 2011.

Multipath TCP: the Maastricht consensus

The work on Multipath TCP started in 2008 and its designers quickly approached the IETF to create a new working group. This process started with a BOF (Birds of a Feather) session that was held in July 2009 in Stockholm. Reading the meeting minutes again, the discussion at the MPTCP BOF was both open and constructive, and the MPTCP working group was quickly approved.

Its initial charter was pretty ambitious:

Mar 2010          Established WG consensus on the Architecture
Aug 2010          Submit to IESG architectural guidelines and security threat analysis as informational RFC(s)
Mar 2011          Submit to IESG basic coupled congestion control as an experimental RFC
Mar 2011          Submit to IESG protocol specification for MPTCP extensions as an experimental RFC
Mar 2011          Submit to IESG an extended API for MPTCP as an or part of an experimental or informational RFC
Mar 2011          Submit to IESG application considerations as an informational RFC
Mar 2011          Recharter or close WG

The entire design was supposed to be finished in less than two years. Congestion control, which was considered to be the most important problem, was almost solved and the protocol design did not seem too difficult.

Shortly after the creation of the working group, Alan Ford asked an interesting question on the multipathtcp mailing list: given that endhosts will need to signal information through a TCP connection, should they use TCP options or the payload with a TLV format to encode this information?

This was a very important design decision. On one hand, encoding control information in TCP options is the standard way of extending TCP. An important benefit of using TCP options is that there is a clear separation between the control information and the user payload. However, the extended TCP header has a limited size and it is difficult to exchange a lot of control information inside TCP options. Another issue is that TCP options are not exchanged reliably since they are not acknowledged. On the other hand, placing control information inside the packet payload requires the use of a Type/Length/Value (TLV) format in the payload to distinguish between user data and control information.
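To make the payload alternative concrete, here is a minimal sketch of TLV framing over a byte stream. It is not the encoding proposed in any of the drafts: the record types, the 1-byte type and 2-byte length fields, and the function names are all assumptions made for this example. The only figure taken from TCP itself is the 40 bytes of option space, which follows from the 60-byte maximum header size minus the 20-byte fixed header.

```python
import struct

# Hypothetical record types, not taken from any specification.
TYPE_DATA = 0     # application bytes
TYPE_CONTROL = 1  # multipath signalling

HEADER = struct.Struct("!BH")  # assumed layout: 1-byte type, 2-byte length
MAX_TCP_OPTION_SPACE = 40      # 60-byte maximum TCP header - 20-byte fixed part


def encode_tlv(kind, value):
    """Prefix value with its type and length so that it can share the stream."""
    if len(value) > 0xFFFF:
        raise ValueError("value does not fit in a 2-byte length field")
    return HEADER.pack(kind, len(value)) + value


def decode_tlvs(stream):
    """Split a received byte string back into (type, value) records."""
    records = []
    offset = 0
    while offset < len(stream):
        if len(stream) - offset < HEADER.size:
            raise ValueError("truncated TLV header")
        kind, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        end = offset + length
        if end > len(stream):
            raise ValueError("truncated TLV value")
        records.append((kind, stream[offset:end]))
        offset = end
    return records


def fits_in_tcp_options(control):
    """True if control fits in a single TCP option (kind + length + data)."""
    return 2 + len(control) <= MAX_TCP_OPTION_SPACE


stream = (encode_tlv(TYPE_CONTROL, b"add-path")
          + encode_tlv(TYPE_DATA, b"GET / HTTP/1.1\r\n"))
records = decode_tlvs(stream)
```

The sketch shows both sides of the trade-off. A TLV record can carry far more control information than the 40 bytes available for all TCP options together, and it is delivered reliably like any other payload byte. In exchange, the bytes on the wire no longer start with the application's own data, so every receiver has to parse the framing before it can hand anything to the application.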

The debate over this key design question lasted for almost a year. Michael Scharf was convinced by the idea of placing control information inside the payload and proposed a detailed design in Multi-Connection TCP (MCTCP) Transport. An important advantage of using the payload to carry control information was that MCTCP could be implemented as a library that intercepts system calls without requiring modifications to the kernel TCP implementation. Michael Scharf provided additional information in a subsequent paper [SB11].

However, using the payload to carry control information has several drawbacks. First, it is difficult to implement flow control at the Multipath TCP level since it depends on the flow control of the underlying TCP connection. Second, the TLV format imposes a specific format on the packets that are exchanged by Multipath TCP hosts. This format might interfere with middleboxes such as firewalls. For example, consider a firewall that is configured to verify that all connections on port 80 only carry valid HTTP requests and responses. With MCTCP, this firewall will observe a traffic pattern that is not exactly HTTP.

The figure below (source Multi-Connection TCP (MCTCP) Transport) describes the architecture of MCTCP.

../../../_images/mctcp.png

The payload/options debate lasted for almost a year and slowed the progress of the working group. There was a risk that these two designs could evolve in parallel for a long period of time. In July 2010, the working group met in Maastricht and one of the main objectives of this meeting was to reach a consensus on this important design decision. Costin Raiciu explained the arguments in favor of using TCP options while Michael Scharf argued in favor of using TLVs in the payload. In the end, the working group agreed to focus its energy on developing a Multipath TCP protocol that uses TCP options.

References

[SB11]M. Scharf and T. Banniza. MCTCP: a multipath transport shim layer. In 2011 IEEE Global Telecommunications Conference - GLOBECOM 2011, 1–5. Dec 2011. doi:10.1109/GLOCOM.2011.6134021.

Multipath TCP: the architectural principles

When an IETF working group starts, its members first need to agree on a charter and a set of principles that will guide their work. For Multipath TCP, the key architectural principles have been documented in RFC 6182.

../../../_images/rfc6182-1.png

The first, and very important, assumption for the design of Multipath TCP was that at least one of the two communicating hosts would have two or more IP addresses. This is captured in the figure below (source RFC 6182).

../../../_images/rfc6182-2.png

Then, it is interesting to reconsider the functional goals that are listed in Section 2.1 of RFC 6182:

  • Improve Throughput
  • Improve Resilience

Today, given the wide deployment of Multipath TCP on Apple smartphones, it could be surprising that this document did not anticipate the need to support fast handovers, i.e. the ability to quickly switch a connection from Wi-Fi to cellular or the opposite. The resilience requirement takes into account the ability to retransmit data from one path to another.

After these functional goals, four important compatibility goals were listed in RFC 6182. The first one is that Multipath TCP should remain compatible with the existing socket API, although the document expected that another more advanced API would be developed later. Second, and this turned out to be a very difficult compatibility goal, Multipath TCP must preserve the ability to transfer data. This implies that if two hosts were able to exchange data over a given network path, they should still be able to exchange the same data once TCP has been replaced by Multipath TCP. In some network scenarios, preserving the ability to exchange data could rely on a fallback to regular TCP. The third goal is that Multipath TCP should not harm existing TCP flows from a congestion control viewpoint. Finally, the fourth goal is that Multipath TCP should not be less secure than regular TCP.

Section 4 of RFC 6182 provides a high-level functional decomposition of Multipath TCP. One of the key elements of this decomposition is that a Multipath TCP connection will be composed of a set of subflows, as illustrated in the figure below (source RFC 6182).

../../../_images/rfc6182-3.png

Then, Section 5 provides the key design principles that will be discussed in more detail in other blog posts. Section 7 concerns the interactions with middleboxes, another important problem that will also be discussed in subsequent blog posts.