Apple uses Multipath TCP

The initial specification for Multipath TCP was published in January 2013 as RFC 6824. Apple had participated in some of the discussions during earlier IETF meetings, but had never announced a deployment. Shortly after the publication of RFC 6824, Phil Eardley published a blank Internet draft, draft-eardley-mptcp-implementations-survey-01, that explicitly asked questions to implementers.

../../../_images/survey.png

Four implementations were disclosed during the summer of 2013:

  • the stable Linux implementation discussed on page 11
  • an ongoing implementation on FreeBSD discussed on page 19
  • an anonymous implementation discussed on page 23
  • an implementation on Citrix load balancers discussed on page 31

Five years later, it is interesting to look at the characteristics of this anonymous implementation.

  • This implementation only supports client-initiated subflows
  • It uses 4-byte DSNs by default, but can support 8-byte DSNs
  • The support for ADD_ADDR and REMOVE_ADDR was described as : It does not support sending ADD_ADDR or processing ADD_ADDR as it is considered a security risk. Also, we only have a client side implementation at the moment which always initiates the sub flows. The remote end does not send ADD_ADDR in our configuration. The client can send REMOVE_ADDR however when one of the established sub flow’s source address goes away. The client ignores incoming REMOVE_ADDR options also.
  • It does not implement the coupled congestion control defined in RFC 6356
  • It uses a private API and not the socket API proposed in RFC 6897
  • The proposed deployment is described as follows : MPTCP in mobile environments is very powerful when used in the active/backup mode. Since the network interfaces available on mobile devices have different cost characteristics as well as different bring up and power usage characteristics, it is not useful to share load across all available network interfaces - at least not currently. Providing session continuity across changing network environments is the key deployment scenario.

In September 2013, Apple launched iOS 7, which included support for Multipath TCP. Apple’s motivations for using Multipath TCP on iOS have been explained in detail in [BS16]:

Siri is the digital assistant in Apple’s iOS and macOS operating systems. Because speech recognition requires tremendous processing power, Siri streams spoken commands to Apple’s datacenter for speech recognition; the result is sent back to the smartphone. Although the duration of a user’s interaction with Siri is relatively short, Siri’s usage pattern made this data transfer a perfect client for MPTCP.

Many people use Siri while walking or driving. As they move farther away from a WiFi access point, the TCP connection used by Siri to stream its voice eventually fails, resulting in error messages.

To address this issue, Apple has been using MPTCP—and benefiting from its handover capabilities—since its iOS 7 release. When a user issues a Siri voice command, iOS establishes an MPTCP connection over WiFi and cellular. If the phone loses connectivity to the WiFi access point, traffic is handed over to the cellular interface. A WiFi connection that is still in sight of an access point can have a channel become so lossy that barely any segments can be transmitted. In this case, another retransmission timeout happens and iOS retransmits the traffic over the cellular link.

The article continues with additional information that describes how Apple has tuned Multipath TCP to this specific use case. A description of the Multipath TCP handshake used by Siri has been published in a previous blog post.

While Multipath TCP was part of iOS, it was only used by Apple’s own Siri application. Regular applications could not leverage the benefits of Multipath TCP. This changed in 2017 with the launch of iOS 11. During WWDC 2017, Christoph Paasch and his colleagues announced that any application would be able to use Multipath TCP on iOS 11.

../../../_images/christoph.png

A detailed summary of these announcements appears on the Tessares blog. iOS 11 supports two modes of operation: Handover and Interactive.

In Handover mode, the connection starts over the WiFi link and no packet is sent over the cellular interface. If the WiFi signal gets worse, a new TCP subflow is created automatically on the cellular interface. The cellular subflow is removed once the user is back on a WiFi network.

../../../_images/handover-apple.png

The Interactive mode establishes both WiFi and cellular subflows for each Multipath TCP connection, even if the WiFi network appears to be working well. The objective of this mode is to reduce latency: the Multipath TCP scheduler selects the subflow that provides the lowest latency.

../../../_images/interactive.png
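The difference between the two modes can be illustrated with a small Python sketch. It is only a toy model under stated assumptions, not Apple's implementation: the `Subflow` class, the `usable` flag and the two `pick_*` functions are hypothetical names introduced here, and the model pretends that both subflows already exist, whereas Handover mode only creates the cellular subflow when it is needed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subflow:
    name: str            # "wifi" or "cellular" (illustrative labels)
    rtt_ms: float        # current round-trip-time estimate
    usable: bool = True  # False once the link is considered too poor


def pick_handover(subflows: List[Subflow]) -> Optional[Subflow]:
    """Handover: stay on WiFi, use cellular only when WiFi is unusable."""
    usable = [s for s in subflows if s.usable]
    for s in usable:
        if s.name == "wifi":
            return s
    return usable[0] if usable else None


def pick_interactive(subflows: List[Subflow]) -> Optional[Subflow]:
    """Interactive: use whichever usable subflow has the lowest RTT."""
    usable = [s for s in subflows if s.usable]
    return min(usable, key=lambda s: s.rtt_ms, default=None)


# Hypothetical RTT values: here cellular happens to be faster than WiFi.
wifi = Subflow("wifi", rtt_ms=60.0)
cell = Subflow("cellular", rtt_ms=35.0)

handover_choice = pick_handover([wifi, cell]).name
interactive_choice = pick_interactive([wifi, cell]).name

# The user walks away from the access point.
wifi.usable = False
after_wifi_loss = pick_handover([wifi, cell]).name
```

With these made-up values, Handover keeps the traffic on WiFi until WiFi becomes unusable, while Interactive immediately prefers the lower-latency cellular subflow.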

Since the release of iOS 11, some applications have started to use Multipath TCP. One of them is the Multipath Tester, an application written by Quentin De Coninck that makes it possible to compare the performance of Multipath TCP and Multipath QUIC on iOS 11 [CB18]. You can download it from https://itunes.apple.com/us/app/multipathtester/id1351286809

../../../_images/mptester.png

References

[BS16] Olivier Bonaventure and SungHoon Seo. Multipath TCP deployments. IETF Journal, 12(2):24–27, 2016. URL: https://www.ietfjournal.org/multipath-tcp-deployments/.
[CB18] Quentin De Coninck and Olivier Bonaventure. Observing network handovers with Multipath TCP. In Proceedings of the ACM SIGCOMM 2018 Conference on Posters and Demos, SIGCOMM 2018, Budapest, Hungary, August 20-25, 2018, 54–56. 2018. URL: https://multipath-quic.org/multipathtester/2018/08/28/sigcomm-poster.html, doi:10.1145/3234200.3234214.

Can Multipath TCP cope with middleboxes?

As explained in a previous blog post, Multipath TCP had to cope with a variety of middleboxes which could interfere with this TCP extension.

Shortly after we detected the first interferences between a firewall and Multipath TCP, Honda et al. presented a detailed analysis [HNR+11] of the limits of the extensibility of TCP based on Internet measurements. To correctly understand the problems caused by middleboxes, we first need to remember that they can operate in any layer of the protocol stack as illustrated in the figure below.

../../../_images/mbox1.png

When a router forwards an IPv4 packet that contains a TCP segment, it may modify some fields of the IPv4 header but never changes any field of the TCP header. This is one of the bases of the layering principle.

../../../_images/mbox2.png

Middleboxes are different. As they potentially operate in any layer of the protocol stack, they can potentially change any field of the packet headers, in any layer. Some of them also modify packet payloads.

../../../_images/mbox3.png

The main difficulty in such a network environment is that the TCP state on the client and on the server is updated based on information carried inside packets. When the information placed in these packets changes after their transmission by one of the communicating hosts, this can create strange problems. Several of the functions of Multipath TCP were designed to cope with middlebox interference. Here are a few examples:

  • During the three-way handshake, the client sends the MP_CAPABLE option in the third ACK to cope with a middlebox that could remove it from the SYN+ACK
  • The ADD_ADDR, REMOVE_ADDR and MP_JOIN options contain an address identifier to cope with Network Address Translation
  • The DSS option uses relative sequence numbers to cope with middleboxes that randomize the initial TCP sequence number
  • The DSS option maps a block of data from the bytestream onto the TCP subflow. The length field of the DSS option makes it possible to cope with middleboxes (or fast NICs) that segment/reassemble packets
  • The DSS option contains a Checksum to cope with middleboxes that add/remove bytes in the payload
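To see why relative sequence numbers help, consider the following Python sketch. It is a simplified model rather than the wire format: the `DssMapping` class and the helper functions are hypothetical names, and the sequence numbers are invented. The point is that an offset computed from the initial sequence number (ISN) stays the same when a middlebox shifts every sequence number of the subflow by a constant.

```python
from dataclasses import dataclass

MOD32 = 2 ** 32  # TCP sequence numbers wrap at 2^32


@dataclass(frozen=True)
class DssMapping:
    data_seq: int        # position in the connection-level bytestream
    subflow_offset: int  # subflow sequence number, relative to the ISN
    length: int          # number of bytes covered by the mapping


def relative_seq(absolute_seq: int, isn: int) -> int:
    """Offset of a TCP sequence number from the initial sequence number."""
    return (absolute_seq - isn) % MOD32


def data_seq_for(mapping: DssMapping, absolute_seq: int, isn: int) -> int:
    """Translate a subflow sequence number into a data sequence number."""
    offset = (relative_seq(absolute_seq, isn) - mapping.subflow_offset) % MOD32
    if offset >= mapping.length:
        raise ValueError("sequence number outside the mapping")
    return mapping.data_seq + offset


mapping = DssMapping(data_seq=5000, subflow_offset=1, length=100)

# Sender side: ISN 1000, the byte of interest has sequence number 1011.
sender_isn, sender_seq = 1000, 1011

# A middlebox rewrites the ISN and shifts every sequence number by the
# same amount (the new ISN is chosen close to 2^32 to exercise wrap-around).
receiver_isn = MOD32 - 6
delta = (receiver_isn - sender_isn) % MOD32
receiver_seq = (sender_seq + delta) % MOD32

sender_view = data_seq_for(mapping, sender_seq, sender_isn)
receiver_view = data_seq_for(mapping, receiver_seq, receiver_isn)
```

Both ends compute the same data sequence number although they see completely different absolute sequence numbers on the subflow.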

Multipath TCP and its implementation in the Linux kernel can cope with these interferences and others. This makes Multipath TCP very robust compared to older TCP extensions. An example with a strange middlebox was published in another blog post.

A detailed analysis of the reactions of Multipath TCP against those interferences was published in [HDP+13]. In some cases, Multipath TCP reacts by closing the subflow that passes through this middlebox. In other cases, it falls back to regular TCP. A summary of this analysis may be found in the table below.

../../../_images/mbox4.png
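As an illustration of how the DSS checksum exposes a middlebox that adds bytes to the payload, here is a minimal Python sketch. It uses the 16-bit ones' complement Internet checksum computed over a pseudo-header followed by the mapped data. The pseudo-header layout (64-bit data sequence number, 32-bit relative subflow sequence number, 16-bit length, 16 zero bits) follows our reading of RFC 6824 and should be treated as an assumption, as should the function names and the sample values.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum, as used by TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:   # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def dss_checksum(data_seq: int, subflow_offset: int, payload: bytes) -> int:
    """Checksum over the assumed pseudo-header followed by the payload."""
    pseudo = struct.pack("!QIHH", data_seq, subflow_offset, len(payload), 0)
    return internet_checksum(pseudo + payload)


def mapping_intact(data_seq: int, subflow_offset: int, length: int,
                   received: bytes, checksum: int) -> bool:
    """Receiver-side check over the first `length` bytes that arrived."""
    return dss_checksum(data_seq, subflow_offset, received[:length]) == checksum


payload = b"abcdefgh"
checksum = dss_checksum(5000, 1, payload)

# Untouched segment: the checksum matches.
clean_ok = mapping_intact(5000, 1, len(payload), payload, checksum)

# A middlebox inserts one byte in front of the payload: the bytes covered
# by the mapping are now shifted and the checksum no longer matches.
tampered = b"X" + payload
tampered_ok = mapping_intact(5000, 1, len(payload), tampered, checksum)
```

When the check fails, the receiver knows that the mapping can no longer be trusted and can react as summarised in the table above.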

If you suspect that a middlebox interferes with Multipath TCP connections on a path, you can use tracebox [Gre] to detect the location of this middlebox. Examples of the utilisation of tracebox on Linux/macOS and Android appeared in earlier blog posts.

References

[Gre] Gregory Detal. tracebox. http://www.tracebox.org.
[HDP+13] Benjamin Hesmans, Fabien Duchene, Christoph Paasch, Gregory Detal, and Olivier Bonaventure. Are TCP Extensions Middlebox-proof? In Proceedings of the 2013 Workshop on Hot Topics in Middleboxes and Network Function Virtualization (HotMiddlebox). 2013. URL: https://inl.info.ucl.ac.be/publications/are-tcp-extensions-middlebox-proof.html.
[HNR+11] M. Honda, Y. Nishida, C. Raiciu, A. Greenhalgh, M. Handley, and H. Tokuda. Is it still possible to extend TCP? In Proceedings of the 2011 ACM SIGCOMM conference on Internet Measurement Conference (IMC). 2011.

Fixing problems before the submission deadline

In the academic community, paper submission deadlines are sometimes strong incentives that encourage researchers to find solutions to problems that they ignored until then. While preparing the final version of a paper [RPB+12] that describes the design and the implementation of Multipath TCP, we thought that it would be interesting to add some measurement results to confirm that the protocol worked well for the important use case of combining the Wi-Fi and cellular interfaces on smartphones. We had already performed various experiments with such wireless networks and were expecting that the results could be obtained in a few hours.

Our initial objective was to meet one of the functional goals of Multipath TCP as described in RFC 6182:

Improve Throughput: Multipath TCP MUST support the concurrent use of multiple paths. To meet the minimum performance incentives for deployment, a Multipath TCP connection over multiple paths SHOULD achieve no worse throughput than a single TCP connection over the best constituent path.

We created a small measurement setup in the lab by using two servers connected over Gigabit Ethernet with tc.

../../../_images/wifi0.png

We first verified whether TCP could use the two wireless links when used alone. This was indeed the case as shown in the figure below (source [RPB+12]).

../../../_images/wifi1.png

For this measurement, we looked at the impact of the receive window on the measured throughput. For TCP, the impact is low, except when the window is smaller than the bandwidth-delay product, but this is not a surprise. We then ran the same experiments with Multipath TCP over the two interfaces. We were expecting some impact with a small window but did not anticipate the results shown below (source [RPB+12]).

../../../_images/wifi2.png
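To make the role of the bandwidth-delay product concrete, the short Python sketch below computes it and the resulting window-limited throughput for a single TCP connection. The link rate, round-trip time and window sizes are hypothetical values chosen for round numbers; they are not the parameters of our measurements.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes needed in flight to fill the link."""
    return bandwidth_bps * rtt_ms // 8000  # bits/s * ms -> bytes


def window_limited_throughput_bps(window_bytes: int, bandwidth_bps: int,
                                  rtt_ms: int) -> int:
    """A sender can have at most one window of data in flight per RTT."""
    return min(bandwidth_bps, window_bytes * 8 * 1000 // rtt_ms)


# Hypothetical link: 8 Mbps with a 100 ms round-trip time.
link_bps, rtt_ms = 8_000_000, 100

bdp = bdp_bytes(link_bps, rtt_ms)  # 100,000 bytes
small_window = window_limited_throughput_bps(50_000, link_bps, rtt_ms)
large_window = window_limited_throughput_bps(200_000, link_bps, rtt_ms)
```

With a window of half the bandwidth-delay product, the connection reaches only half of the link rate (4 Mbps in this example), while any window above the bandwidth-delay product is enough to fill the link.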

When the maximum window is large, Multipath TCP aggregates the cellular and the Wi-Fi interfaces as expected. However, when the receive window is smaller, Multipath TCP can transfer at a rate that is lower than that of regular TCP. This result was annoying, and we were less than a week away from the submission deadline. It was difficult to submit the paper without describing this basic use case. We organised daily teleconferences to understand the problem and then try to solve it.

tcpdump helped us to understand the problem by collecting packet traces. The main issue was the difference between the delay of the cellular link and the delay of the Wi-Fi link. We frequently observed the following situation in the packet traces. The server sent many packets over the Wi-Fi interface and one over the cellular interface. The acknowledgements were coming quickly from the Wi-Fi interface, but the sender frequently had to wait for an acknowledgement over the cellular interface. During these periods, the receive window was full and the sender could not transmit packets over the Wi-Fi link although it was idle. This was the explanation for the reduced throughput with the small receive window.

Once the problem was identified, it could be solved. The solution is composed of two parts. First, when Multipath TCP detects that it is window-blocked and there is some unacknowledged data, it tries to re-inject the data over another subflow whose congestion window is open. If this data is acknowledged quickly, then the receiver will advertise a larger receive window that will enable the sender to transmit. Unfortunately, this is not sufficient as the same situation could happen again later. The second part of the solution is to penalise the slow subflow by halving its congestion window. These two elements of the solution fixed the problem over Wi-Fi and cellular.

../../../_images/wifi3.png
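The two-part reaction can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Linux kernel code: the `Subflow` class, its fields and `handle_window_blocked` are hypothetical names, congestion windows are counted in segments, and the sketch only penalises the slow subflow when a re-injection was actually possible.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Subflow:
    name: str
    rtt_ms: float
    cwnd: int                                           # in segments
    in_flight: List[int] = field(default_factory=list)  # unacked data seqs

    def has_room(self) -> bool:
        return len(self.in_flight) < self.cwnd


def handle_window_blocked(subflows: List[Subflow]) -> Optional[int]:
    """React to a full receive window; return the re-injected data seq."""
    outstanding = [s for s in subflows if s.in_flight]
    if not outstanding:
        return None
    # The subflow holding the oldest unacknowledged data blocks the window.
    slow = min(outstanding, key=lambda s: min(s.in_flight))
    oldest = min(slow.in_flight)
    # Part 1: re-inject that data on another subflow with an open
    # congestion window, preferring the one with the lowest RTT.
    candidates = [s for s in subflows if s is not slow and s.has_room()]
    if not candidates:
        return None
    fast = min(candidates, key=lambda s: s.rtt_ms)
    fast.in_flight.append(oldest)
    # Part 2: penalise the slow subflow by halving its congestion window.
    slow.cwnd = max(1, slow.cwnd // 2)
    return oldest


# Wi-Fi is idle (everything acknowledged) while one segment is still
# waiting for its acknowledgement on the slower cellular subflow.
wifi = Subflow("wifi", rtt_ms=20.0, cwnd=10)
cell = Subflow("cellular", rtt_ms=150.0, cwnd=8, in_flight=[100])

reinjected = handle_window_blocked([wifi, cell])
```

In this example the blocking segment is sent again over Wi-Fi and the cellular congestion window drops from 8 to 4 segments, so fewer segments will be scheduled on the slow path in the next rounds.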

This heuristic was later improved after a detailed experimental evaluation over a wide range of network conditions [PKB13].

References

[PKB13] Christoph Paasch, Ramin Khalili, and Olivier Bonaventure. On the benefits of applying experimental design to improve Multipath TCP. In Proceedings of the ninth ACM conference on Emerging networking experiments and technologies, 393–398. ACM, 2013. URL: https://inl.info.ucl.ac.be/publications/benefits-applying-experimental-design-improve-multipath-tcp.
[RPB+12] C. Raiciu, C. Paasch, S. Barre, A. Ford, M. Honda, F. Duchene, O. Bonaventure, and M. Handley. How Hard Can It Be? Designing and Implementing a Deployable Multipath TCP. In Proceedings of the 9th Symposium on Networked Systems Design and Implementation (NSDI). 2012. URL: https://inl.info.ucl.ac.be/publications/how-hard-can-it-be-designing-and-implementing-deployable-multipath-tcp.html.

Multipath TCP inside the beast

One of the nice points about releasing open-source software such as the Multipath TCP implementation in the Linux kernel is that there are unexpected use cases. In early 2013, we were contacted by Niels Laukens, who works for VRT, the Dutch-speaking television in Belgium. He had been following the project and identified a nice use case. Journalists increasingly use computers, not only to prepare their articles but also when they go off-site for interviews. Once the interview has been recorded, they often need to edit it locally before uploading it to the television services to broadcast it or place it on the web site.

For live videos, they often rely on dedicated satellite channels, but these are expensive and they need a large antenna. Such antennas are fine when an event is planned and they need a large coverage. However, there are many situations where they cannot send a large team to record interviews and short movies. To cover those cases, they have equipped a small “mini” that serves as a mobile studio. A single journalist can record an interview, edit it and then send it over the air. This last part is the most interesting one for us. Satellite links are expensive and there are many situations where it is difficult to use a satellite. 3G, 4G and Wi-Fi could help, but their performance differs. Asking each journalist to learn to select the best network to upload his/her work was not a feasible solution. Fortunately, Niels found the right solution with Multipath TCP. The mini is equipped with a simple Multipath TCP proxy that is attached to all the available networks. The journalist uses his/her regular laptop through the proxy to upload his/her movies via all the available connections. This is much faster and simpler than always moving the car to a location where the satellite works well.

VRT published a nice video of their mini that is internally called “The Beast”:

https://www.youtube.com/watch?v=JMRWq7aqi9o