Experimenting with MPTCP using raw sockets
Although Multipath TCP is already available on several platforms (Linux, FreeBSD, iOS11), applications like Tracebox (or Mobile Tracebox) are still a convenient choice for users eager to experiment with the new protocol without installing a full MPTCP stack. These tools (along with zmap, tcpexposure, etc.) make it possible to forge custom packets (e.g. using raw sockets) that emulate newer extensions or protocols.
For instance, Tracebox can highlight middleboxes interfering with MPTCP by sending MP_CAPABLE Syn packets with increasing TTL and collecting the ICMP Time Exceeded messages returned by intermediate routers. Even when routers don’t respond or don’t quote the full TCP header, a Syn Ack without the MP_CAPABLE option received from a well-known MPTCP server can still reveal interference. When the path is free of middleboxes, an MP_CAPABLE Syn can also be used to assess whether a server supports MPTCP.
Unfortunately, middleboxes can be very subtle: for example, they can be completely transparent to MP_CAPABLE packets but still interfere with ADD_ADDR and DSS, or they can pretend to support every option carried by a Syn.
This calls for more elaborate MPTCP tests. In this post we describe a test (included in Mobile Tracebox) that uses raw sockets to establish an MPTCP connection, exchange data and associate a second subflow.
In the first part we detail, for every step, how the options must be crafted for the MPTCP experiment to succeed. In the second part we explore further scenarios (e.g. options that are not perfectly compliant with the protocol) to see how an MPTCP stack reacts to them. This can help developers of similar applications avoid pitfalls when handling MPTCP at a low level, and it also helps to understand how the protocol works in practice. The figure summarizes the packets exchanged between A, our client running the test, and B, an MPTCP-enabled server (multipath-tcp.org).
MPTCP-compliant scenario
We report the output of Mobile Tracebox (only interesting header fields are included).
0: 192.168.42.7 [TCP Syn] TCP::SourcePort(24d2) TCP::Option_MPTCP(00811000000000000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (00810c4d5dfc94d0a464)
0: 192.168.42.7 [TCP Ack] TCP::SourcePort(24d2) TCP::Option_MPTCP(008110000000000000000c4d5dfc94d0a464)
64: *
0: 192.168.42.7 [TCP Ack 72 bytes] TCP::SourcePort(24d2) TCP::SeqNumber(01300001) TCP::Option_MPTCP(2004fb4e435d0000000100483aca) TCP::Payload ("GET / HTTP/1.1...")
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300001) TCP::Option_MPTCP(3608200106a8308f000102163efffec5c815)
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300049) TCP::Option_MPTCP(2001fb4e43a5)
0: 192.168.1.102 [TCP Syn] TCP::SourcePort (cefc) TCP::Option_MPTCP(10023a03caf210000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (100256c7a377b2e33fdaa29163c5)
All fields are in hexadecimal format: the MPTCP option subtype can easily be recognized from the first digit. A full trace of the packets exchanged during the probe is also reported.
18:48:32.485197 IP client1.9426 > mptcp.info.ucl.ac.be.http: Flags [S], seq 19922944, win 65535, options [mptcp capable csum {0x1000000000000000}], length 0
18:48:32.573554 IP mptcp.info.ucl.ac.be.http > client1.9426: Flags [S.], seq 3005334072, ack 19922945, win 28800, options [mss 1452,mptcp capable csum {0xc4d5dfc94d0a464}], length 0
18:48:32.573792 IP client1.9426 > mptcp.info.ucl.ac.be.http: Flags [.], ack 1, win 65535, options [mptcp capable csum {0x1000000000000000,0xc4d5dfc94d0a464}], length 0
18:48:35.577198 IP client1.9426 > mptcp.info.ucl.ac.be.http: Flags [.], seq 1:73, ack 1, win 65535, options [mptcp dss seq 4216210269 subseq 1 len 72 csum 0x3aca], length 72: HTTP: GET / HTTP/1.1
18:48:35.664046 IP mptcp.info.ucl.ac.be.http > client1.9426: Flags [.], ack 1, win 28800, options [mptcp add-addr id 8 mptcp.info.ucl.ac.be,mptcp dss ack 4216210269], length 0
18:48:35.664556 IP mptcp.info.ucl.ac.be.http > client1.9426: Flags [.], ack 73, win 28800, options [mptcp dss ack 4216210341], length 0
18:48:35.666894 IP mptcp.info.ucl.ac.be.http > client1.9426: Flags [P.], seq 1:503, ack 73, win 28800, options [mptcp dss ack 4216210341 seq 2447520560 subseq 1 len 502 csum 0xc36b], length 502: HTTP: HTTP/1.1 200 OK
18:48:38.670543 IP client2.52988 > mptcp.info.ucl.ac.be.http: Flags [S], seq 1793048487, win 65535, options [mptcp join id 2 token 0x3a03caf2 nonce 0x10000000], length 0
18:48:38.756268 IP mptcp.info.ucl.ac.be.http > client2.52988: Flags [S.], seq 1665111958, ack 1793048488, win 28800, options [mss 1452,mptcp join id 2 hmac 0x56c7a377b2e33fda nonce 0xa29163c5], length 0
The test uses two client addresses for the two subflows (192.168.42.7 as client1 and 192.168.1.102 as client2), but it is also possible to use the same address with different source ports.
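As a small illustration of reading these dumps, the sketch below maps the first hexadecimal digit of an Option_MPTCP value to its subtype name. It assumes, as the dumps above suggest, that the TCP option kind and length bytes are not part of the printed value; the helper name `mptcp_subtype` is ours and is not part of Tracebox.

```python
# Subtype values as listed in RFC 6824.
MPTCP_SUBTYPES = {
    0x0: "MP_CAPABLE",
    0x1: "MP_JOIN",
    0x2: "DSS",
    0x3: "ADD_ADDR",
    0x4: "REMOVE_ADDR",
    0x5: "MP_PRIO",
    0x6: "MP_FAIL",
    0x7: "MP_FASTCLOSE",
}


def mptcp_subtype(option_hex: str) -> str:
    """Name the subtype of an Option_MPTCP dump (kind/length bytes assumed absent)."""
    first_byte = bytes.fromhex(option_hex)[0]
    subtype = first_byte >> 4  # the subtype is the upper nibble of the first byte
    return MPTCP_SUBTYPES.get(subtype, f"unknown (0x{subtype:x})")


# Values copied from the Mobile Tracebox output above.
for dump in (
    "00811000000000000000",
    "2004fb4e435d0000000100483aca",
    "3608200106a8308f000102163efffec5c815",
    "10023a03caf210000000",
):
    print(dump[:4], mptcp_subtype(dump))
```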
Everything starts with a Syn carrying the MP_CAPABLE option (subtype 0x0) with flags A (Checksum required) and H (use of HMAC-SHA1 as crypto algorithm) and a 64-bit key chosen by the client (0x1000000000000000). The server replies with an MP_CAPABLE Syn Ack containing the same flags and its own key. The client takes note of the server’s key to echo it in the MP_CAPABLE Ack (and also to forge the subsequent MP_JOIN).
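The MP_CAPABLE payloads of the Syn and the Ack can be sketched with Python’s struct module. This is only an illustration of the byte layout visible in the dumps (subtype and version, flags, then one or two keys); the function and constant names are hypothetical, and the TCP option kind and length bytes are left out, as in the Tracebox output.

```python
import struct

MPTCP_VERSION = 0               # version 0, as in RFC 6824
FLAG_CHECKSUM_REQUIRED = 0x80   # "A" bit
FLAG_HMAC_SHA1 = 0x01           # "H" bit


def mp_capable(flags: int, *keys: int) -> bytes:
    """Subtype/version byte, flags byte, then one 64-bit key (Syn) or two (Ack)."""
    first = (0x0 << 4) | MPTCP_VERSION
    body = b"".join(struct.pack("!Q", key) for key in keys)
    return struct.pack("!BB", first, flags) + body


client_key = 0x1000000000000000
server_key = 0x0C4D5DFC94D0A464  # copied from the Syn Ack in the trace above

flags = FLAG_CHECKSUM_REQUIRED | FLAG_HMAC_SHA1
syn_option = mp_capable(flags, client_key)
ack_option = mp_capable(flags, client_key, server_key)

print(syn_option.hex())  # 00811000000000000000
print(ack_option.hex())  # 008110000000000000000c4d5dfc94d0a464
```

Both printed values match the Option_MPTCP fields of the client’s Syn and Ack in the output above.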
If the client attempts to send an MP_JOIN at this point, the MPTCP stack will discard the new subflow with a Rst, since no data has actually been exchanged on the first subflow. This means we have to send a packet with a real payload and a DSS option. To prevent the server from dropping our packet or simply closing the connection, the payload must be a real HTTP request.
GET / HTTP/1.1
Host: blog.multipath-tcp.org
Connection: keep-alive
We also have to assemble a compliant DSS option (subtype 0x2): we set the flags to 0x4 (a mapping with a 32-bit Data Sequence Number is present), the Subflow Sequence Number to 1 and the Data-Level Length to the length of our TCP payload (72 bytes). The Data Sequence Number is generated from the SHA-1 hash of the client’s key. Finally, a DSS checksum has to be calculated over the payload and the DSS pseudo-header. The server answers with two packets. The first carries an ADD_ADDR option (subtype 0x3) advertising the server’s IPv6 address: this is a sign that the MPTCP stack has recognized that we are speaking MPTCP. The second contains a DSS option: we can see that the data we sent is acked at both the TCP and MPTCP levels.
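The following is a minimal sketch of how such a DSS option could be assembled, based on our reading of RFC 6824 rather than on the Mobile Tracebox source. It assumes that the initial data sequence number is the least significant 64 bits of SHA-1(key), that the first data byte uses that value plus one, and that the checksum is the usual Internet checksum over a pseudo-header (64-bit DSN, subflow sequence number, length, zeroed checksum) followed by the payload. The payload is rebuilt from the request shown above with CRLF line endings, which is an assumption; we have not checked that the result reproduces the 0x3aca seen in the trace.

```python
import hashlib
import struct


def initial_dsn(key: int) -> int:
    """Least significant 64 bits of SHA-1(key)."""
    digest = hashlib.sha1(struct.pack("!Q", key)).digest()
    return int.from_bytes(digest[-8:], "big")


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, the same algorithm TCP uses."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def dss_checksum(dsn64: int, ssn: int, payload: bytes) -> int:
    """Checksum over the DSS pseudo-header (always 64-bit DSN) plus payload."""
    pseudo_header = struct.pack("!QIHH", dsn64, ssn, len(payload), 0)
    return internet_checksum(pseudo_header + payload)


def dss_option(dsn64: int, ssn: int, payload: bytes) -> bytes:
    """DSS with flags 0x04: 32-bit DSN, SSN, length, checksum, no Data ACK."""
    csum = dss_checksum(dsn64, ssn, payload)
    return struct.pack("!BBIIHH", 0x20, 0x04, dsn64 & 0xFFFFFFFF,
                       ssn, len(payload), csum)


client_key = 0x1000000000000000
payload = (b"GET / HTTP/1.1\r\n"
           b"Host: blog.multipath-tcp.org\r\n"
           b"Connection: keep-alive\r\n\r\n")

# Assumption: the first data byte is numbered IDSN + 1 (modulo 2**64).
dsn64 = (initial_dsn(client_key) + 1) & 0xFFFFFFFFFFFFFFFF
option = dss_option(dsn64, 1, payload)
print(len(payload), option.hex())
```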
Note
To avoid the DSS checksum calculation we can use a Data-Level Length greater than the actual TCP payload length: in this case the packet is accepted, but the DSS checksum is not evaluated because the server waits for the next TCP segment (the packet is acked at the TCP level but not at the MPTCP level).
After data has been exchanged on the first subflow, we can finally add a second subflow to the MPTCP connection. We send a new Syn from a different source address and port (or just a different port) with an MP_JOIN option (subtype 0x1) carrying a Token derived from the key sent by the server in its MP_CAPABLE Syn Ack (the first 32 bits of the SHA-1 hash of the server’s key) and a Random number; the Address Id is set to 2 in this test. The server answers with an MP_JOIN Syn Ack (carrying a Hash-based Message Authentication Code and a Random number), a sign that our MPTCP experiment has succeeded.
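The token derivation and the MP_JOIN Syn payload can be sketched as follows. The helper names are ours, and the HMAC helper reflects our reading of RFC 6824 (HMAC-SHA1 keyed with the two keys concatenated, over the two random numbers, truncated to the leftmost 64 bits in the Syn Ack); we have not recomputed the HMAC value shown in the trace.

```python
import hashlib
import hmac
import struct


def mptcp_token(key: int) -> int:
    """Most significant 32 bits of SHA-1(key)."""
    digest = hashlib.sha1(struct.pack("!Q", key)).digest()
    return int.from_bytes(digest[:4], "big")


def mp_join_syn(address_id: int, token: int, nonce: int, backup: bool = False) -> bytes:
    """MP_JOIN Syn payload: subtype 0x1 with B flag, address id, token, random number."""
    first = (0x1 << 4) | int(backup)
    return struct.pack("!BBII", first, address_id, token, nonce)


def truncated_hmac(key_first: int, key_second: int,
                   nonce_first: int, nonce_second: int) -> bytes:
    """Leftmost 64 bits of HMAC-SHA1(key=keys concatenated, msg=nonces concatenated)."""
    key = struct.pack("!QQ", key_first, key_second)
    msg = struct.pack("!II", nonce_first, nonce_second)
    return hmac.new(key, msg, hashlib.sha1).digest()[:8]


# Token and random number copied from the MP_JOIN Syn in the output above.
print(mp_join_syn(2, 0x3A03CAF2, 0x10000000).hex())  # 10023a03caf210000000

# For the server's Syn Ack, the server's key and random number come first.
server_key, client_key = 0x0C4D5DFC94D0A464, 0x1000000000000000
expected = truncated_hmac(server_key, client_key, 0xA29163C5, 0x10000000)
print(hex(mptcp_token(server_key)), expected.hex())
```

A client that wants to authenticate the server could compare the last value with the HMAC field of the MP_JOIN Syn Ack before completing the handshake.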
Other scenarios
Another advantage of raw sockets is that we can send packets that are not perfectly compliant with the protocol, to observe how an MPTCP stack reacts to possible malfunctions or tricky middlebox interference.
Invalid MP_CAPABLE Key
In this scenario the client echoes a wrong server key in the MP_CAPABLE Ack: this inconsistency is ignored and communication proceeds normally at both the TCP and MPTCP levels on the first subflow. The MP_JOIN also still succeeds, as long as the token is calculated from the correct server key.
0: 192.168.42.7 [TCP Syn] TCP::SourcePort(fac4) TCP::Option_MPTCP(00811000000000000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (0081422b61826574250e)
0: 192.168.42.7 [TCP Ack] TCP::SourcePort(fac4) TCP::Option_MPTCP(008110000000000000002000000000000000)
64: *
0: 192.168.42.7 [TCP Ack 72 bytes] TCP::SourcePort(fac4) TCP::SeqNumber(01300001) TCP::Option_MPTCP(2004fb4e435d0000000100483aca) TCP::Payload ("GET / HTTP/1.1...")
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300001) TCP::Option_MPTCP(3608200106a8308f000102163efffec5c815)
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300049) TCP::Option_MPTCP(2001fb4e43a5)
0: 192.168.1.102 [TCP Syn] TCP::SourcePort (cc2e) TCP::Option_MPTCP(1002531ed1b010000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (1002ea5ea69620cd23fc2c3feb6a)
No DSS Checksum, despite being requested by the counterpart
In the next scenario the client sends a DSS option without a checksum, although the server has requested the DSS checksum in its MP_CAPABLE Syn Ack: the server replies with a Rst terminating the subflow, but the subsequent MP_JOIN still succeeds.
0: 192.168.42.7 [TCP Syn] TCP::SourcePort(c49a) TCP::Option_MPTCP(00011000000000000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (008106ef03ac4b958a2f)
0: 192.168.42.7 [TCP Ack] TCP::SourcePort(c49a) TCP::Option_MPTCP(0001100000000000000006ef03ac4b958a2f)
64: *
0: 192.168.42.7 [TCP Ack 72 bytes] TCP::SourcePort(c49a) TCP::SeqNumber(01300001) TCP::Option_MPTCP(2004fb4e435d000000010048) TCP::Payload ("GET / HTTP/1.1...")
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300001) TCP::Option_MPTCP(3608200106a8308f000102163efffec5c815)
64: 130.104.230.45 [TCP Rst Ack] TCP::AckNumber(01300001) TCP::Option_MPTCP(2001fb4e435d)
0: 192.168.1.102 [TCP Syn] TCP::SourcePort (6bf0) TCP::Option_MPTCP(100252ad76cd10000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (10028ce03d14fa336cdae211d46d)
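The difference between this probe and the compliant one is visible in the length of the DSS dump: the two checksum bytes are simply missing. The sketch below parses a DSS value with flags 0x04 and reports the checksum as absent when the dump is too short; the type and function names are hypothetical.

```python
import struct
from typing import NamedTuple, Optional


class DssMapping(NamedTuple):
    dsn: int
    ssn: int
    length: int
    checksum: Optional[int]


def parse_dss_mapping(option_hex: str) -> DssMapping:
    """Parse a DSS dump with flags 0x04 (32-bit DSN mapping, no Data ACK)."""
    raw = bytes.fromhex(option_hex)
    if raw[0] >> 4 != 0x2 or raw[1] != 0x04:
        raise ValueError("not a DSS option with flags 0x04")
    dsn, ssn, length = struct.unpack("!IIH", raw[2:12])
    # The checksum occupies the last two bytes, when it is present at all.
    checksum = struct.unpack("!H", raw[12:14])[0] if len(raw) >= 14 else None
    return DssMapping(dsn, ssn, length, checksum)


compliant = parse_dss_mapping("2004fb4e435d0000000100483aca")
no_checksum = parse_dss_mapping("2004fb4e435d000000010048")
print(compliant)
print(no_checksum)
```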
Bad DSS Checksum (MP_FAIL)
In another scenario a wrong DSS Checksum is sent. In this case the server correctly acknowledges the data at the TCP level, but sends an MP_FAIL option (subtype 0x6), causing a fall back to a single subflow. As a consequence, the subsequent MP_JOIN Syn is rejected.
0: 192.168.42.7 [TCP Syn] TCP::SourcePort(f1ba) TCP::Option_MPTCP(00811000000000000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (0081b144931ac84d865a)
0: 192.168.42.7 [TCP Ack] TCP::SourcePort(f1ba) TCP::Option_MPTCP(00811000000000000000b144931ac84d865a)
64: *
0: 192.168.42.7 [TCP Ack 72 bytes] TCP::SourcePort(f1ba) TCP::SeqNumber(01300001) TCP::Option_MPTCP(2004fb4e435d0000000100480100) TCP::Payload ("GET / HTTP/1.1...")
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300001) TCP::Option_MPTCP(3608200106a8308f000102163efffec5c815)
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300049) TCP::Option_MPTCP(60008710f99bfb4e435d) TCP::Option_MPTCP(2001fb4e435d)
0: 192.168.1.102 [TCP Syn] TCP::SourcePort (ebbb) TCP::Option_MPTCP(10023d41ba9910000000)
64: 130.104.230.45 [TCP Rst Ack] -TCP::Option_MPTCP
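The MP_FAIL value returned by the server can be decoded with a few lines of Python. In this sketch (the function name is ours) the dump is read as two bytes of subtype and reserved bits followed by a 64-bit data sequence number, which is how RFC 6824 lays out the option once the kind and length bytes are removed.

```python
import struct


def parse_mp_fail(option_hex: str) -> int:
    """Return the 64-bit data sequence number carried by an MP_FAIL dump."""
    raw = bytes.fromhex(option_hex)
    if len(raw) != 10 or raw[0] >> 4 != 0x6:
        raise ValueError("not an MP_FAIL option")
    (dsn64,) = struct.unpack("!Q", raw[2:10])
    return dsn64


dsn64 = parse_mp_fail("60008710f99bfb4e435d")
print(hex(dsn64))               # 0x8710f99bfb4e435d
print(hex(dsn64 & 0xFFFFFFFF))  # 0xfb4e435d
```

The low 32 bits are the Data Sequence Number we sent in the DSS option with the wrong checksum, so the MP_FAIL points exactly at our segment.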
Fall back without MP_FAIL
Fall back can also occur when the client sets the DSS Data-Level Length to 0 (“infinite mapping”): in this scenario the server acknowledges the data at both the TCP and MPTCP levels and doesn’t send any MP_FAIL (since this case is interpreted as a choice by the client and not as an anomalous event like a wrong DSS checksum), but the fall back is still evident when the client attempts to associate a new subflow and the MP_JOIN is not accepted.
0: 192.168.42.7 [TCP Syn] TCP::SourcePort(de0a) TCP::Option_MPTCP(00811000000000000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (0081b6e7a1b307358b82)
0: 192.168.42.7 [TCP Ack] TCP::SourcePort(de0a) TCP::Option_MPTCP(00811000000000000000b6e7a1b307358b82)
64: *
0: 192.168.42.7 [TCP Ack 72 bytes] TCP::SourcePort(de0a) TCP::SeqNumber(01300001) TCP::Option_MPTCP(2004fb4e435d0000000100000000) TCP::Payload ("GET / HTTP/1.1...")
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300001) TCP::Option_MPTCP(3608200106a8308f000102163efffec5c815)
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300049) TCP::Option_MPTCP(2001fb4e43a5)
0: 192.168.1.102 [TCP Syn] TCP::SourcePort (2c6d) TCP::Option_MPTCP(10020edf2fd810000000)
64: 130.104.230.45 [TCP Rst Ack] -TCP::Option_MPTCP
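For reference, the DSS value used in this probe can be reproduced with a single struct call. The sketch assumes the same layout as the compliant probe (flags 0x04, 32-bit DSN) and copies the zero checksum from the trace; whether a stack expects a zero checksum with an infinite mapping is not something we verified here.

```python
import struct


def infinite_mapping_dss(dsn32: int, ssn: int) -> bytes:
    """DSS payload with Data-Level Length 0 and, as in the trace, checksum 0."""
    return struct.pack("!BBIIHH", 0x20, 0x04, dsn32, ssn, 0, 0)


print(infinite_mapping_dss(0xFB4E435D, 1).hex())  # 2004fb4e435d0000000100000000
```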
Bad MP_JOIN Token
In the last scenario tested, the client sends a wrong Token in the MP_JOIN Syn. The server unsurprisingly replies with a Rst.
0: 192.168.42.7 [TCP Syn] TCP::SourcePort(1594) TCP::Option_MPTCP(00811000000000000000)
64: 130.104.230.45 [TCP Syn Ack] TCP::Option_MPTCP (008191d0ae47af67a0f2)
0: 192.168.42.7 [TCP Ack] TCP::SourcePort(1594) TCP::Option_MPTCP(0081100000000000000091d0ae47af67a0f2)
64: *
0: 192.168.42.7 [TCP Ack 72 bytes] TCP::SourcePort(1594) TCP::SeqNumber(01300001) TCP::Option_MPTCP(2004fb4e435d0000000100483aca) TCP::Payload ("GET / HTTP/1.1...")
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300001) TCP::Option_MPTCP(3608200106a8308f000102163efffec5c815)
64: 130.104.230.45 [TCP Ack] TCP::AckNumber(01300049) TCP::Option_MPTCP(2001fb4e43a5)
0: 192.168.1.102 [TCP Syn] TCP::SourcePort (d26e) TCP::Option_MPTCP(10020200000010000000)
64: 130.104.230.45 [TCP Rst Ack] -TCP::Option_MPTCP
Mobile Tracebox
The MPTCP test described in the first part has been included in the new version of Mobile Tracebox. Screenshots show how to select the destination address and the appropriate probe. To avoid a full traceroute for every packet sent, the minimum TTL can conveniently be set to 64.
Since raw sockets are needed, this probe is available only on rooted Android devices.
Multipath TCP on iOS11: A closer look at the TCP Options
Multipath TCP uses a variety of TCP options to use different paths simultaneously. Several Multipath TCP options are defined in RFC6824:
- subtype 0x0: MP_CAPABLE
- subtype 0x1: MP_JOIN
- subtype 0x2: DSS
- subtype 0x3: ADD_ADDR
- subtype 0x4: REMOVE_ADDR
- subtype 0x5: MP_PRIO
- subtype 0x6: MP_FAIL
- subtype 0x7: MP_FASTCLOSE
In this blog post, we explore in more detail the packet trace collected on an iPhone using iOS11 beta. We start our analysis with the three-way handshake. The trace contains one Multipath TCP connection. Recent versions of Wireshark support Multipath TCP and we use the tcp.options.mptcp.subtype==0 filter to match all the packets that contain the MP_CAPABLE option. This option only appears in the three packets of the initial three-way handshake. Let us first analyse the SYN sent by the iPhone. In our test over an LTE network, iOS11 beta2 advertises the following options:
- MSS set to 1410 bytes. This is a relatively small value that was probably chosen to reduce the risk of fragmentation or Path MTU discovery problems since cellular networks often use tunnels internally
- Selective Acknowledgements are proposed
- The Window scale factor is set to 6 and the iPhone advertises a 64 KByte window.
- The Timestamp option is used as well.
- The MP_CAPABLE option sent by the iPhone does not request the utilisation of the DSS checksum. The DSS checksum was introduced in RFC6824 to detect middlebox interference. Previous versions of iOS did not use this checksum to support Siri because Siri ran over HTTPS and this prevents most middlebox interference. However, when Multipath TCP is used to support a protocol such as HTTP, there is a risk of interference from middleboxes that inject HTTP headers. If you plan to use Multipath TCP on iOS11, you should probably rely on HTTPS and forget HTTP for other reasons than Multipath TCP.
The server, in this trace the Linux implementation running on multipath-tcp.org, replies with Selective Acknowledgements, Timestamps and a Window Scaling factor set to 7, and requires the utilisation of the DSS Checksum.
The MP_CAPABLE option contained in the third ACK sent by the iPhone confirms that the iPhone will use the DSS checksum for this connection as requested by the server.
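The negotiation outcome described here can be summarised in one line of logic: according to RFC 6824, checksums are used in both directions as soon as either end sets the A bit during the handshake. The sketch below uses hypothetical flag bytes (0x01 for the iPhone, 0x81 for the server), since the raw option bytes are not reproduced in this post.

```python
CHECKSUM_REQUIRED = 0x80  # "A" bit of the MP_CAPABLE flags byte


def checksums_in_use(syn_flags: int, synack_flags: int) -> bool:
    """DSS checksums are used if either end set the A bit in the handshake."""
    return bool((syn_flags | synack_flags) & CHECKSUM_REQUIRED)


# Hypothetical flag bytes: client sets only H, server sets A and H.
print(checksums_in_use(0x01, 0x81))  # True
print(checksums_in_use(0x01, 0x01))  # False
```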
The utilisation of the DSS Checksum is clearly visible in the first data packet that is sent by the iPhone. It uses 32-bit Data sequence numbers and data acknowledgement numbers.
The first data packet returned by the Linux server is shown below. It also uses 32-bit data sequence and data acknowledgement numbers.
With iOS11 beta2, the iPhone uses the MP_PRIO option and sets the cellular subflow as a backup subflow. This is immediately visible in the fourth packet of the trace that is shown below.
Apple has already explained earlier that they do not use the ADD_ADDR option because their stack is focussed on clients and they do not see a benefit in advertising client addresses since those are often behind a NAT or firewall. We did not observe ADD_ADDR or REMOVE_ADDR in our first trace.
The MP_JOIN option is used to create subflows. In our trace, this happens at time 4.74 when we enable the WiFi interface. The MP_JOIN option contains the token derived from the key advertised by the server in the MP_CAPABLE option, and its backup flag is reset. This indicates that the WiFi subflow is preferred to the cellular subflow that was initially created. It is interesting to note that iOS11 beta advertises a larger MSS over the WiFi interface than over the cellular one. The same window scaling factor (6) is used.
We did not observe MP_FASTCLOSE in this trace.
We’ll discuss MP_FAIL in another post since it is related to fallbacks to TCP.
MPTCP experiments on iOS 11 beta
MPTCP support was announced for iOS 11 during WWDC 2017. The developer documentation presents a new instance property called multipathServiceType in the URLSessionConfiguration class that can be set to one of the constants of the MultipathServiceType enumeration, which is also defined in the URLSessionConfiguration class. The enumeration contains four constants and the documentation gives a short description of each:
- none : The default service type indicating that Multipath TCP should not be used.
- handover : A Multipath TCP service that provides seamless handover between Wi-Fi and cellular in order to preserve the connection.
- interactive : A service whereby Multipath TCP attempts to use the lowest-latency interface.
- aggregate : A service that aggregates the capacities of other Multipath options in an attempt to increase throughput and minimize latency.
The code below shows a simple usage example:
let config = URLSessionConfiguration.ephemeral
config.multipathServiceType = URLSessionConfiguration.MultipathServiceType.handover
let session = URLSession(configuration: config)
let url = URL(string: "http://multipath-tcp.org/data/uml/vmlinux_64")
let task = session.dataTask(with: url!, completionHandler:{...})
task.resume()
We will present experiments done with iOS in a series of posts on this blog. In our first experiment, we use the handover service type. We start the connection with the wifi interface down and after a few seconds, we turn on the wifi interface. The trace of the connection is available here. We use mptcptrace to see how the subflows are used. Let’s take a look at the Multipath-TCP sequence numbers over time:
As expected, the connection starts on the mobile interface because it is the only interface available at that time. When the wifi interface becomes available, around five seconds after the start of the connection, all the traffic is immediately sent to the wifi subflow.
Let’s take a closer look at what happens during the transition around five seconds after the start of the connection:
On this graph, MPTCP acknowledgements are pictured as blue crosses. In the upper left corner of this zoom, we can see that the client receives out-of-sequence (from Multipath-TCP’s perspective) packets during the transition. This is because iOS tries to terminate the connection on the mobile interface as soon as possible, while the server does not yet know that this interface should not be used anymore. Starting from packet 4647 in the trace, we can see the zero window advertisements and resets sent by the iPhone on the mobile subflow. Once the server receives the reset and detects that some packets will not arrive on the mobile subflow, it reinjects these packets on the wifi subflow. During the reinjections, out-of-order packets are kept in the out-of-order queue of MPTCP on the client side. To observe this out-of-sequence queue, we zoom on the top right corner of the graph:
On this graph, we can observe the MPTCP ACKs that cover the out-of-sequence packets received earlier. In particular we can observe a hole in the middle of the graph. If we zoom on other parts of the graph we can see several holes like this one.
This concludes our first analysis of Multipath-TCP on iOS. Stay tuned for more detailed analysis and tests. In next posts, we will discuss other Multipath TCP services offered by iOS11.
The “Experimental” status of Multipath TCP
Multipath TCP is defined in RFC 6824 and I recently heard feedback from someone working for industry who mentioned that Multipath TCP should not be considered for deployment given its Experimental status. I was surprised by this comment and I think that it would be useful to clarify some facts about the maturity of Multipath TCP.
First, from an administrative viewpoint, the Experimental status of Multipath TCP was decided at the creation of the IETF MPTCP working group. At that time, it was unclear whether it would even be possible to specify a protocol like Multipath TCP and the IESG wanted to encourage experiments with the new protocol. By selecting this option, the IESG prepared a future standardisation of the protocol and this is happening right now with the definition of a standards-track version of Multipath TCP in RFC6824bis. According to the milestones of the IETF MPTCP working group, this revision should be ready in 2017.
Second, from a technical viewpoint, the maturity of a protocol cannot be inferred from the status of its specification. The best way to measure this maturity is to observe the interoperable implementations and the deployment of the protocol. From these two viewpoints, Multipath TCP is a clear success. There are endhost implementations on Linux, FreeBSD, Apple iOS, macOS and Oracle Solaris. Multipath TCP is also supported on various middleboxes including Citrix Netscaler, F5 BIG-IP LTM and Ericsson.
From a deployment viewpoint, Multipath TCP is also a huge success. Hundreds of millions of users of Apple devices (iPhone, iPad, laptops) use Multipath TCP every time they use the Siri voice recognition application. In Korea, a dozen models of high-end smartphones from Samsung and LG include a port of the reference implementation of Multipath TCP in the Linux kernel and use SOCKS proxies to bond WiFi and fast LTE. Several network operators provide those proxies as a commercial service. Other companies such as Swisscom or OVH also rely on SOCKS proxies to bond different types of links together. Another emerging use case is hybrid access networks. In various countries, network operators are required to provide fast broadband services, even in rural areas where deploying fiber is too expensive. Many of these operators want to combine their xDSL and LTE networks in order to improve the bandwidth offered to their customers. Tessares has already deployed a pilot hybrid access network solution that leverages Multipath TCP in Belgium.