Testing Multipath TCP

Once you have installed your mptcp-enabled kernel, you can test that it is working as expected using the echo and discard services available on http://multipath-tcp.org.

The echo service (RFC 862) can be reached with a telnet client on port 7, and will send back every line you send it. The discard service (RFC 863) is available on port 9 and discards all data you send it.

Using those services makes it easy to test your MPTCP stack: those services are not normally used, so when capturing packets to and from them you can be nearly sure you won’t see unrelated packets (i.e. packets from other connections), which would certainly be the case if you tested it with port 80.

Opening a connection

In this post, we will look at the TCP segments exchanged between your host and the discard service running on discard.multipath-tcp.org by using tcpdump (an alternative is to use wireshark).

Let’s first see what happens when we open a connection to the discard service with the command

telnet discard.multipath-tcp.org 9

and let’s capture the packets exchanged in another terminal with the command

tcpdump -n -i any port 9

This captures segments to and from port 9 on all interfaces of the host, which has an ethernet interface (IPv4 and IPv6) and a wifi interface (IPv4). It also avoids name resolution with the -n flag.

Here are the packets captured when the connection is established. The first three packets captured are the classical 3-way handshake:

10:44:24.854234 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [S], seq 2314766615, win 28800, options [mss 1440,sackOK,TS val 1395168 ecr 0,nop,wscale 7,mptcp capable csum {0xb49d03c2011d7aba}], length 0
10:44:24.877157 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421: Flags [S.], seq 3417687790, ack 2314766616, win 28160, options [mss 1440,sackOK,TS val 296602671 ecr 1395168,nop,wscale 7,mptcp capable csum {0x8fb9c7b493f33d4b}], length 0
10:44:24.878529 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [.], ack 3417687791, win 225, options [nop,nop,TS val 1395174 ecr 296602671,mptcp capable csum {0xb49d03c2011d7aba,0x8fb9c7b493f33d4b},mptcp dss ack 2962569294], length 0

If your host is correctly configured to use mptcp, each of these 3 packets should include the option “mptcp capable” as above.
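
This check can also be scripted. Below is a minimal sketch, assuming the capture was saved as text in the tcpdump format shown above. The helper name handshake_is_mptcp_capable is hypothetical and not part of tcpdump or of the MPTCP tools.

```python
import re

# Flags expected on the SYN, SYN+ACK and ACK of a 3-way handshake,
# as printed by tcpdump in the traces above.
HANDSHAKE_FLAGS = ("[S]", "[S.]", "[.]")


def handshake_is_mptcp_capable(lines):
    """Return True if the first three lines are a SYN, SYN+ACK, ACK sequence
    and each of them carries the "mptcp capable" option.

    Hypothetical helper: it only inspects tcpdump's text output.
    """
    first_three = lines[:3]
    if len(first_three) < 3:
        return False
    for line, expected in zip(first_three, HANDSHAKE_FLAGS):
        match = re.search(r"Flags (\[[^\]]*\])", line)
        if match is None or match.group(1) != expected:
            return False
        if "mptcp capable" not in line:
            return False
    return True
```

Fed with the three handshake lines above, the function should return True. A plain TCP handshake, whose options do not include mptcp capable, should give False.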

The 3-way handshake is immediately followed by other packets, 5 in this case:

10:44:24.878538 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [.], ack 3417687791, win 225, options [nop,nop,TS val 1395174 ecr 296602671,mptcp add-addr id 2 130.104.228.97,mptcp dss ack 2962569294], length 0
10:44:24.878547 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [.], ack 3417687791, win 225, options [nop,nop,TS val 1395174 ecr 296602671,mptcp add-addr id 3 192.168.122.1,mptcp dss ack 2962569294], length 0
10:44:24.878551 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [.], ack 3417687791, win 225, options [nop,nop,TS val 1395174 ecr 296602671,mptcp add-addr id 4 130.104.111.30,mptcp dss ack 2962569294], length 0
10:44:24.878557 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [.], ack 3417687791, win 225, options [nop,nop,TS val 1395174 ecr 296602671,mptcp add-addr id 8 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8,mptcp dss ack 2962569294], length 0
10:44:24.904876 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421: Flags [.], ack 2314766616, win 220, options [nop,nop,TS val 296602677 ecr 1395174,mptcp add-addr id 2 37.187.114.89,mptcp dss ack 1565554328], length 0

These are packets with the add-addr option, communicating other addresses used by the client and the server. At this time, no additional mptcp subflow has been opened. This will only happen after a data packet with mptcp options has been received, to be sure no middlebox is interfering.

The client advertises 4 additional addresses:

192.168.122.1   (libvirt bridge)
130.104.228.97  (IPv4)
130.104.111.30  (wifi)
2001:6a8:3080:2:f24d:a2ff:fe96:8ce8  (second IPv6 global scope)

In our case, having the client advertise its addresses does not add any value, but let’s analyse it further for the sake of the experiment.

The first address is the address of a bridge used by libvirt on the client. Advertising this address should be avoided. You can disable mptcp for an interface with the patched iproute2 available from multipath-tcp.org (if you added the apt repository, you can install it with apt-get install iproute2).

In my case, disabling the advertising of the virbr0 interface’s address is achieved with:

ip link set dev virbr0 multipath off

Advertising the IPv4 address on the same interface as the IPv6 address that was used to open the connection makes sense as the path used by each will probably be different, and hence will have different performance characteristics.

The server announces one additional address: 37.187.114.89
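
To list the advertised addresses without reading every line by hand, the add-addr options can be extracted from the saved tcpdump text. This is only a sketch: the regular expressions are written against the output format shown above, and the function name advertised_addresses is an illustrative assumption.

```python
import re

# "mptcp add-addr id <n> <address>" as printed by tcpdump in the traces above.
ADD_ADDR = re.compile(r"mptcp add-addr id (\d+) ([0-9a-fA-F:.]+)")
# Source address of the segment: everything before the last ".port" of the sender.
SENDER = re.compile(r"IP6? (\S+)\.\d+ > ")


def advertised_addresses(lines):
    """Map each sender address to the {address id: advertised address} it announced."""
    result = {}
    for line in lines:
        sender = SENDER.search(line)
        option = ADD_ADDR.search(line)
        if sender and option:
            per_sender = result.setdefault(sender.group(1), {})
            per_sender[int(option.group(1))] = option.group(2)
    return result
```

Applied to the five segments above, this should report ids 2, 3, 4 and 8 for the client and id 2 for the server.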

Here is, for comparison, the trace obtained up to this point after disabling mptcp with the command

echo 0 > /proc/sys/net/mptcp/mptcp_enabled

11:00:32.731162 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522 > 2001:41d0:a:6759::1.9: Flags [S], seq 3444541109, win 28800, options [mss 1440,sackOK,TS val 1637138 ecr 0,nop,wscale 7], length 0
11:00:32.752091 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522: Flags [S.], seq 1047133288, ack 3444541110, win 28560, options [mss 1440,sackOK,TS val 296844644 ecr 1637138,nop,wscale 7], length 0
11:00:32.752128 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522 > 2001:41d0:a:6759::1.9: Flags [.], ack 1047133289, win 225, options [nop,nop,TS val 1637143 ecr 296844644], length 0

There’s no mptcp capable option, and no additional address is advertised. Only the 3 packets of the 3-way handshake are exchanged.

At this time the connection is open, and both hosts using mptcp have advertised their additional addresses and received the other host’s addresses.

Data transfer

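We can now send a line of text to the service: I just type one character and press enter. Telnet sends this as 3 bytes (the character followed by CR and LF), which is why the data segments below have a length of 3.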

Let’s first look at what happens when normal TCP is used:

11:01:16.960802 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522 > 2001:41d0:a:6759::1.9: Flags [P.], seq 3444541110:3444541113, ack 1047133289, win 225, options [nop,nop,TS val 1648195 ecr 296844644], length 3
11:01:16.981867 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522: Flags [.], ack 3444541113, win 224, options [nop,nop,TS val 296855701 ecr 1648195], length 0

Only 2 segments are exchanged, the first sending the data to the server, the second being the ack from the server.

Things are different when using multipath tcp. The first two segments are equivalent to the normal tcp connection:

10:51:54.636308 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [P.], seq 2314766616:2314766619, ack 3417687791, win 225, options [nop,nop,TS val 1507614 ecr 296602677,mptcp dss ack 2962569294 seq 1565554328 subseq 1 len 3 csum 0xbd71], length 3
10:51:54.657885 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421: Flags [.], ack 2314766619, win 220, options [nop,nop,TS val 296715118 ecr 1507614,mptcp dss ack 1565554331], length 0

but those are followed by other segments:

10:51:54.657944 IP 130.104.228.97.52462 > 37.187.114.89.9: Flags [S], seq 508980983, win 29200, options [mss 1460,sackOK,TS val 1507619 ecr 0,nop,wscale 7,mptcp join id 2 token 0x7b14451d nonce 0x1ca1c3df], length 0
10:51:54.657958 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [S], seq 2757512376, win 29200, options [mss 1460,sackOK,TS val 1507619 ecr 0,nop,wscale 7,mptcp join id 3 token 0x7b14451d nonce 0xbd6f9678], length 0
10:51:54.657971 IP 130.104.111.30.45837 > 37.187.114.89.9: Flags [S], seq 77997614, win 29200, options [mss 1460,sackOK,TS val 1507619 ecr 0,nop,wscale 7,mptcp join id 4 token 0x7b14451d nonce 0x6d029f47], length 0
10:51:54.657984 IP6 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280 > 2001:41d0:a:6759::1.9: Flags [S], seq 899129682, win 28800, options [mss 1440,sackOK,TS val 1507619 ecr 0,nop,wscale 7,mptcp join id 8 token 0x7b14451d nonce 0xdad211ee], length 0

Those 4 segments are all SYN segments with the mptcp join option. This is the client trying to open additional subflows. We see subflows are requested from addresses

130.104.228.97 (2 times)
130.104.111.30
2001:6a8:3080:2:f24d:a2ff:fe96:8ce8

The first address listed is the source of 2 requests to open subflows. Note that the id in the second packet is 3, which, if you look at the add-addr segments above, is associated with IP 192.168.122.1. This happens because the private address of the libvirt bridge is used to open a new subflow, but it is NATed by libvirt.
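
The same observation can be made mechanically by comparing the source address of each join SYN with the address advertised under the same id. The sketch below assumes the advertised addresses are given as a dictionary mapping id to address. The function name is hypothetical, and a mismatch is only consistent with NAT, it does not prove it.

```python
import re

# Source address and join id of a SYN carrying the "mptcp join" option,
# in the tcpdump text format used in this post.
JOIN_SYN = re.compile(r"IP6? (\S+)\.\d+ > \S+: Flags \[S\],.*mptcp join id (\d+)")


def mismatched_join_ids(advertised, syn_lines):
    """Return {id: (advertised address, observed source)} for each join SYN
    whose source address differs from the one advertised under that id."""
    mismatches = {}
    for line in syn_lines:
        match = JOIN_SYN.search(line)
        if match is None:
            continue
        source, addr_id = match.group(1), int(match.group(2))
        announced = advertised.get(addr_id)
        if announced is not None and announced != source:
            mismatches[addr_id] = (announced, source)
    return mismatches
```

With the addresses advertised by the client and the four SYN segments above, only id 3 should be reported, with 192.168.122.1 advertised and 130.104.228.97 observed.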

The SYN segments are requesting the opening of additional subflows, and here are the segments completing these 3-way handshakes.

First, the subflow requested from IP 130.104.228.97 and port 52462 is opened:

10:51:54.670368 IP 37.187.114.89.9 > 130.104.228.97.52462: Flags [S.], seq 276814345, ack 508980984, win 28560, options [mss 1460,sackOK,TS val 296715122 ecr 1507619,nop,wscale 7,mptcp join id 2 hmac 0x680d8e1f8915b0c nonce 0x8a55d382], length 0
10:51:54.670420 IP 130.104.228.97.52462 > 37.187.114.89.9: Flags [.], ack 276814346, win 454, options [nop,nop,TS val 1507622 ecr 296715122,mptcp join hmac 0x75c4548dd3b71d3172b6fab1dc1ad62b94ba58a4], length 0

Then the subflow requested from the private IP of the client is completed:

10:51:54.670427 IP 37.187.114.89.9 > 130.104.228.97.34887: Flags [S.], seq 3653300584, ack 2757512377, win 28560, options [mss 1460,sackOK,TS val 296715122 ecr 1507619,nop,wscale 7,mptcp join id 2 hmac 0x7bc3d5be52b067e4 nonce 0x5195517a], length 0
10:51:54.670441 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [.], ack 3653300585, win 682, options [nop,nop,TS val 1507622 ecr 296715122,mptcp join hmac 0x5e10da51f9a7205f97c8a1e90283341f1b17b557], length 0

Finally the third 3-way handshake is completed:

10:51:54.681913 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280: Flags [S.], seq 3184835425, ack 899129683, win 28160, options [mss 1440,sackOK,TS val 296715123 ecr 1507619,nop,wscale 7,mptcp join id 8 hmac 0xfbb9cf34c4bc25bf nonce 0xf9e90440], length 0
10:51:54.681941 IP6 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280 > 2001:41d0:a:6759::1.9: Flags [.], ack 3184835426, win 907, options [nop,nop,TS val 1507625 ecr 296715123,mptcp join hmac 0x420cc6a3bcbd81fadd3298b264e2a20f97bd98ea], length 0

At this time, the subflows are in the PRE_ESTABLISHED state, and cannot be used yet, because the last segment sent by the initiating party is the only one containing its authentication information. An acknowledgement of this last segment is required before data can be sent through the subflow. Here are the 3 acknowledgments:

10:51:54.682476 IP 37.187.114.89.9 > 130.104.228.97.52462: Flags [.], ack 508980984, win 444, options [nop,nop,TS val 296715125 ecr 1507622,mptcp dss ack 1565554331], length 0
10:51:54.682497 IP 37.187.114.89.9 > 130.104.228.97.34887: Flags [.], ack 2757512377, win 667, options [nop,nop,TS val 296715125 ecr 1507622,mptcp dss ack 1565554331], length 0
10:51:54.702843 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280: Flags [.], ack 899129683, win 887, options [nop,nop,TS val 296715129 ecr 1507625,mptcp dss ack 1565554331], length 0

At this time, 3 additional subflows have been set up. The subflow on the wireless interface has not been set up, and we can see new attempts:

10:51:55.654975 IP 130.104.111.30.45837 > 37.187.114.89.9: Flags [S], seq 77997614, win 29200, options [mss 1460,sackOK,TS val 1507869 ecr 0,nop,wscale 7,mptcp join id 4 token 0x7b14451d nonce 0x6d029f47], length 0
10:51:57.658963 IP 130.104.111.30.45837 > 37.187.114.89.9: Flags [S], seq 77997614, win 29200, options [mss 1460,sackOK,TS val 1508370 ecr 0,nop,wscale 7,mptcp join id 4 token 0x7b14451d nonce 0x6d029f47], length 0
10:52:01.666976 IP 130.104.111.30.45837 > 37.187.114.89.9: Flags [S], seq 77997614, win 29200, options [mss 1460,sackOK,TS val 1509372 ecr 0,nop,wscale 7,mptcp join id 4 token 0x7b14451d nonce 0x6d029f47], length 0
10:52:09.682970 IP 130.104.111.30.45837 > 37.187.114.89.9: Flags [S], seq 77997614, win 29200, options [mss 1460,sackOK,TS val 1511376 ecr 0,nop,wscale 7,mptcp join id 4 token 0x7b14451d nonce 0x6d029f47], length 0
10:52:25.714981 IP 130.104.111.30.45837 > 37.187.114.89.9: Flags [S], seq 77997614, win 29200, options [mss 1460,sackOK,TS val 1515384 ecr 0,nop,wscale 7,mptcp join id 4 token 0x7b14451d nonce 0x6d029f47], length 0
10:52:57.746964 IP 130.104.111.30.45837 > 37.187.114.89.9: Flags [S], seq 77997614, win 29200, options [mss 1460,sackOK,TS val 1523392 ecr 0,nop,wscale 7,mptcp join id 4 token 0x7b14451d nonce 0x6d029f47], length 0

This is due to a firewall blocking access to port 9.
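
The timestamps of these attempts show the usual SYN retransmission pattern, with the delay doubling after each unanswered attempt. A small sketch, using the timestamps of the seven SYN segments sent from 130.104.111.30 in the traces above, makes the intervals explicit. The function name is illustrative.

```python
from datetime import datetime


def retransmission_gaps(timestamps):
    """Return the delay, rounded to the nearest second, between consecutive
    tcpdump timestamps given as HH:MM:SS.ffffff strings (same day assumed)."""
    times = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]
    return [round((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


# Timestamps of the SYN segments sent from 130.104.111.30.45837 above.
SYN_TIMES = [
    "10:51:54.657971",
    "10:51:55.654975",
    "10:51:57.658963",
    "10:52:01.666976",
    "10:52:09.682970",
    "10:52:25.714981",
    "10:52:57.746964",
]

print(retransmission_gaps(SYN_TIMES))  # [1, 2, 4, 8, 16, 32]
```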

Connection tear down

Now that we have opened a connection, transferred data and observed subflows being established, we can close the connection. In the telnet session, type the escape sequence (usually ^], i.e. press CTRL-]), press enter, then type quit to exit telnet. At that time the connection is closed.

Let’s first look at what happens in standard TCP:

11:01:45.672776 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522 > 2001:41d0:a:6759::1.9: Flags [F.], seq 3444541113, ack 1047133289, win 225, options [nop,nop,TS val 1655373 ecr 296855701], length 0
11:01:45.694726 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522: Flags [F.], seq 1047133289, ack 3444541114, win 224, options [nop,nop,TS val 296862880 ecr 1655373], length 0
11:01:45.694766 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46522 > 2001:41d0:a:6759::1.9: Flags [.], ack 1047133290, win 225, options [nop,nop,TS val 1655378 ecr 296862880], length 0

In short, both ends send a FIN segment that is acknowledged to close the connection in both directions.

With mptcp, things are more complex as we also have opened multiple subflows.

First, at the MPTCP level, a segment flagged DATA_FIN signals that no more data will be sent. This segment has to be acknowledged in the DSS. As with TCP, this is done in both directions. This is seen in segments 1,2,4 below. As no more data will be transmitted, subflows can be torn down like classical tcp connections. This is what happens from segment 3. We see that segment 4 is used both to signal an acknowledgement at the DSS level and to signal a FIN at the subflow level.

10:58:03.422967 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [.], ack 3653300585, win 907, options [nop,nop,TS val 1599810 ecr 296715125,mptcp dss fin ack 2962569294 seq 1565554331 subseq 0 len 1 csum 0x2f7f], length 0
10:58:03.435575 IP 37.187.114.89.9 > 130.104.228.97.34887: Flags [.], ack 2757512377, win 887, options [nop,nop,TS val 296807315 ecr 1599810,mptcp dss fin ack 1565554332 seq 2962569294 subseq 0 len 1 csum 0x31ad], length 0
10:58:03.435631 IP6 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280 > 2001:41d0:a:6759::1.9: Flags [F.], seq 899129683, ack 3184835426, win 907, options [nop,nop,TS val 1599814 ecr 296715129,mptcp dss ack 2962569294], length 0
10:58:03.435647 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [F.], seq 2757512377, ack 3653300585, win 907, options [nop,nop,TS val 1599814 ecr 296715125,mptcp dss ack 2962569294], length 0
10:58:03.435653 IP 130.104.228.97.52462 > 37.187.114.89.9: Flags [F.], seq 508980984, ack 276814346, win 907, options [nop,nop,TS val 1599814 ecr 296715125,mptcp dss ack 2962569294], length 0
10:58:03.435658 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [F.], seq 2314766619, ack 3417687791, win 907, options [nop,nop,TS val 1599814 ecr 296715118,mptcp dss ack 2962569294], length 0
10:58:03.435669 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [.], ack 3653300585, win 907, options [nop,nop,TS val 1599814 ecr 296807315,mptcp dss ack 2962569295], length 0
10:58:03.447702 IP 37.187.114.89.9 > 130.104.228.97.34887: Flags [F.], seq 3653300585, ack 2757512378, win 887, options [nop,nop,TS val 296807318 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.447735 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [.], ack 3653300586, win 907, options [nop,nop,TS val 1599817 ecr 296807318,mptcp dss ack 2962569295], length 0
10:58:03.447743 IP 37.187.114.89.9 > 130.104.228.97.52462: Flags [F.], seq 276814346, ack 508980985, win 887, options [nop,nop,TS val 296807318 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.447748 IP 130.104.228.97.52462 > 37.187.114.89.9: Flags [.], ack 276814347, win 907, options [nop,nop,TS val 1599817 ecr 296807318,mptcp dss ack 2962569295], length 0
10:58:03.457119 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421: Flags [F.], seq 3417687791, ack 2314766620, win 887, options [nop,nop,TS val 296807319 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.457147 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [.], ack 3417687792, win 907, options [nop,nop,TS val 1599819 ecr 296807319,mptcp dss ack 2962569295], length 0
10:58:03.460016 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280: Flags [F.], seq 3184835426, ack 899129684, win 887, options [nop,nop,TS val 296807319 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.460032 IP6 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280 > 2001:41d0:a:6759::1.9: Flags [.], ack 3184835427, win 907, options [nop,nop,TS val 1599820 ecr 296807319,mptcp dss ack 2962569295], length 0
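
The data-level acknowledgements in these traces can be checked with simple arithmetic: a DSS mapping starting at data sequence number seq with length len should be acknowledged with seq + len, and a DATA_FIN counts for one unit of data sequence space. The sketch below parses the dss option as printed by tcpdump. The function name is hypothetical, and sequence number wrap-around is ignored.

```python
import re

# "mptcp dss [fin] ack <a> seq <s> subseq <ss> len <l>" as printed by tcpdump above.
DSS_MAPPING = re.compile(r"mptcp dss (?:fin )?ack \d+ seq (\d+) subseq \d+ len (\d+)")


def expected_data_ack(line):
    """Return the data-level ack that should answer this segment (seq + len),
    or None if the line carries no DSS mapping. Wrap-around is not handled."""
    match = DSS_MAPPING.search(line)
    if match is None:
        return None
    return int(match.group(1)) + int(match.group(2))
```

For the 3-byte data segment sent earlier (seq 1565554328, len 3) this gives 1565554331, which is the dss ack the server returned. For the client DATA_FIN (seq 1565554331, len 1) it gives 1565554332, and for the server DATA_FIN (seq 2962569294, len 1) it gives 2962569295, matching the acks seen above.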

We can also look at what happens per subflow. Here is the DATA_FIN sent and acknowledged, with the subflow closed afterwards:

10:58:03.422967 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [.], ack 3653300585, win 907, options [nop,nop,TS val 1599810 ecr 296715125,mptcp dss fin ack 2962569294 seq 1565554331 subseq 0 len 1 csum 0x2f7f], length 0
10:58:03.435575 IP 37.187.114.89.9 > 130.104.228.97.34887: Flags [.], ack 2757512377, win 887, options [nop,nop,TS val 296807315 ecr 1599810,mptcp dss fin ack 1565554332 seq 2962569294 subseq 0 len 1 csum 0x31ad], length 0
10:58:03.435647 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [F.], seq 2757512377, ack 3653300585, win 907, options [nop,nop,TS val 1599814 ecr 296715125,mptcp dss ack 2962569294], length 0
10:58:03.435669 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [.], ack 3653300585, win 907, options [nop,nop,TS val 1599814 ecr 296807315,mptcp dss ack 2962569295], length 0
10:58:03.447702 IP 37.187.114.89.9 > 130.104.228.97.34887: Flags [F.], seq 3653300585, ack 2757512378, win 887, options [nop,nop,TS val 296807318 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.447735 IP 130.104.228.97.34887 > 37.187.114.89.9: Flags [.], ack 3653300586, win 907, options [nop,nop,TS val 1599817 ecr 296807318,mptcp dss ack 2962569295], length 0

Hereafter we see the tear down of the 3 remaining subflows:

10:58:03.435631 IP6 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280 > 2001:41d0:a:6759::1.9: Flags [F.], seq 899129683, ack 3184835426, win 907, options [nop,nop,TS val 1599814 ecr 296715129,mptcp dss ack 2962569294], length 0
10:58:03.460016 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280: Flags [F.], seq 3184835426, ack 899129684, win 887, options [nop,nop,TS val 296807319 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.460032 IP6 2001:6a8:3080:2:f24d:a2ff:fe96:8ce8.39280 > 2001:41d0:a:6759::1.9: Flags [.], ack 3184835427, win 907, options [nop,nop,TS val 1599820 ecr 296807319,mptcp dss ack 2962569295], length 0


10:58:03.435653 IP 130.104.228.97.52462 > 37.187.114.89.9: Flags [F.], seq 508980984, ack 276814346, win 907, options [nop,nop,TS val 1599814 ecr 296715125,mptcp dss ack 2962569294], length 0
10:58:03.447743 IP 37.187.114.89.9 > 130.104.228.97.52462: Flags [F.], seq 276814346, ack 508980985, win 887, options [nop,nop,TS val 296807318 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.447748 IP 130.104.228.97.52462 > 37.187.114.89.9: Flags [.], ack 276814347, win 907, options [nop,nop,TS val 1599817 ecr 296807318,mptcp dss ack 2962569295], length 0


10:58:03.435658 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [F.], seq 2314766619, ack 3417687791, win 907, options [nop,nop,TS val 1599814 ecr 296715118,mptcp dss ack 2962569294], length 0
10:58:03.457119 IP6 2001:41d0:a:6759::1.9 > 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421: Flags [F.], seq 3417687791, ack 2314766620, win 887, options [nop,nop,TS val 296807319 ecr 1599814,mptcp dss ack 1565554332], length 0
10:58:03.457147 IP6 2001:6a8:3080:2:95ad:6e51:ba2:31ea.46421 > 2001:41d0:a:6759::1.9: Flags [.], ack 3417687792, win 907, options [nop,nop,TS val 1599819 ecr 296807319,mptcp dss ack 2962569295], length 0
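
Splitting a capture per subflow, as done by hand above, can be automated by grouping the tcpdump lines on their pair of endpoints. This is a sketch under the assumption that each line follows the text format shown in this post. The function name group_by_subflow is illustrative.

```python
import re
from collections import defaultdict

# Source and destination endpoints ("address.port") of a tcpdump text line.
ENDPOINTS = re.compile(r"IP6? (\S+) > (\S+): ")


def group_by_subflow(lines):
    """Group tcpdump lines by subflow, ignoring direction.

    The key is the sorted pair of "address.port" endpoints, so segments
    sent in both directions of a subflow end up in the same list.
    """
    subflows = defaultdict(list)
    for line in lines:
        match = ENDPOINTS.search(line)
        if match:
            key = tuple(sorted(match.groups()))
            subflows[key].append(line)
    return dict(subflows)
```

Applied to the complete teardown trace, this should yield four groups, one per subflow, each in capture order.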

Citing Multipath TCP

A growing number of scientific papers use the Multipath TCP implementation in the Linux kernel to perform experiments, develop new features or compare Multipath TCP with newly proposed techniques. While reading these scientific papers, we often see different ways of citing the Multipath TCP implementation in the Linux kernel. As of this writing, more than twenty developers have contributed to this implementation and the number continues to grow. The full list of contributors is available from: http://multipath-tcp.org/mptcp_stats/authors.html

If you write a scientific paper that uses the Multipath TCP implementation in the Linux kernel, we encourage you to cite it by using the following reference:

Christoph Paasch, Sebastien Barre, et al., Multipath TCP implementation in the Linux kernel, available from http://www.multipath-tcp.org

The corresponding bibtex entry may be found below:

@Misc{MPTCPLinux,
    author =    {Christoph Paasch and Sebastien Barre and others},
    title =     {Multipath TCP implementation in the Linux kernel},
    howpublished = {Available from http://www.multipath-tcp.org}
}

Please also indicate the precise version of the implementation that you used to ease the reproduction of your results. We also strongly encourage you to distribute the software that you used to perform your experiments and the patches that you have written on top of this implementation. This will allow other researchers to reproduce your results.

Recommended Multipath TCP configuration

A growing number of researchers and users are downloading the pre-compiled Linux kernels that include the Multipath TCP implementation. Besides the researchers who performed experiments on improving the protocol or its implementation, we see a growing number of users that deploy Multipath TCP on real machines to benefit from its multihoming capabilities. Several of these users have asked questions on the mptcp-dev mailing list on how to configure Multipath TCP in the Linux kernel. There are several parts of the Multipath TCP implementation that can be tuned.

The first element that can be configured is the path manager. This is the part of the software that controls the establishment of new subflows. The path manager was recently reimplemented with a modular architecture but, as of this writing, only two path managers have been implemented: the fullmesh and the ndiffports path managers.

The fullmesh path manager is the default one and should be used in most deployments. On a client, it will advertise all the IP addresses of the client to the server and listen to all the IP addresses that are advertised by the server. It also listens to events from the network interfaces and reacts by adding/removing addresses when interfaces go up or down. On a server, it allows the server to automatically learn all the available addresses and announce them to the client. Note that in the current implementation the server never creates subflows, even if it learns different addresses from the client. The reason is that the client is often behind a NAT or firewall and creating subflows from the server is not a good idea in this case. The typical use case for this fullmesh path manager is a dual-homed client connected to a single-homed server (e.g. a smartphone connected to a regular server). In this case, two subflows will be established, one over each interface of the dual-homed client. We expect that this is the most popular use case for Multipath TCP. It should be noted that if the client has N addresses and the server M addresses, this path manager will establish N × M subflows. This is probably not optimal in all scenarios.
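
To get a feeling for how quickly the number of subflows grows, the address pairs of a full mesh can be enumerated. The sketch below is not the kernel's path manager. It assumes, as an addition not stated above, that a subflow is only possible between addresses of the same IP version, which is consistent with the traces shown earlier. The function name fullmesh_pairs is hypothetical.

```python
import ipaddress
from itertools import product


def fullmesh_pairs(client_addrs, server_addrs):
    """List the (client, server) address pairs a full mesh would cover.

    Assumption of this sketch: only pairs with the same IP version are kept.
    With addresses of a single family this yields N x M pairs.
    """
    clients = [ipaddress.ip_address(a) for a in client_addrs]
    servers = [ipaddress.ip_address(a) for a in server_addrs]
    return [(str(c), str(s))
            for c, s in product(clients, servers)
            if c.version == s.version]
```

A dual-homed IPv4 client and a single-homed IPv4 server give 2 × 1 = 2 pairs, while 2 client addresses and 3 server addresses of the same family already give 6.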

The ndiffports path manager was designed with a specific use case in mind: exploiting the equal cost multiple paths that are available in a datacenter. This made it possible to demonstrate nice performance results with Multipath TCP in the Amazon EC2 datacenter in a paper presented at SIGCOMM11. It can also be used to perform some tests between single-homed hosts. However, this path manager does not automatically learn the IP addresses on the client and the server and does not react to interface changes. As with the fullmesh path manager, the server never creates subflows. The ndiffports path manager should not be used in production and should be considered as an example of how a path manager can be written inside the Linux kernel.

A second important module in the Multipath TCP implementation in the Linux kernel is the packet scheduler. This scheduler is used every time a new packet needs to be sent. When there are several subflows that are active and have an open congestion window, the default scheduler selects the subflow with the smallest round-trip-time. The various measurements that have been performed during the last few years with the Multipath TCP implementation in the Linux kernel indicate that this scheduler appears to be the best compromise from a performance viewpoint. Recently, the implementation of the scheduler has been made more modular to enable researchers to experiment with other schedulers. A round-robin scheduler has been implemented and evaluated in a recent paper that shows that the default scheduler remains the best choice. Researchers might later come up with a better scheduler that improves the performance of Multipath TCP under specific circumstances, but as of this writing the default rtt-based scheduler remains the best choice.
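
The selection rule of the default scheduler can be illustrated with a few lines of code. This is a simplified sketch of the rule described above, not the kernel implementation. The Subflow class, its fields and the pick_subflow function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Subflow:
    name: str
    srtt_ms: float       # smoothed round-trip time of the subflow
    cwnd: int            # congestion window, in segments
    in_flight: int       # segments sent but not yet acknowledged
    active: bool = True

    def has_open_window(self) -> bool:
        # The congestion window is open if one more segment may be sent.
        return self.in_flight < self.cwnd


def pick_subflow(subflows: Iterable[Subflow]) -> Optional[Subflow]:
    """Among active subflows with an open congestion window,
    return the one with the smallest round-trip time (None if there is none)."""
    candidates = [s for s in subflows if s.active and s.has_open_window()]
    return min(candidates, key=lambda s: s.srtt_ms, default=None)
```

With an ethernet subflow at 20 ms and a wifi subflow at 45 ms, the ethernet subflow is chosen as long as its congestion window is open. Once it is full, the next packet goes to the wifi subflow.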

A third important part of Multipath TCP is the congestion control scheme. The standard congestion control scheme is the Linked Increase Algorithm (LIA) defined in RFC 6356. It provides performance similar to that of the NewReno congestion control algorithm with single-path TCP. An alternative is the OLIA congestion control algorithm. The paper that proposes this algorithm has shown that it gives some benefits over LIA in several environments. Our experience indicates that LIA and OLIA could safely be used as a default in deployments. Recently, a delay-based congestion control scheme tuned for Multipath TCP has been added to the Linux implementation. Users who plan to use this congestion control scheme in specific environments should first perform tests before deploying it.
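
As an illustration, here is a sketch of the coupled increase rule as we read it in RFC 6356, written in plain Python rather than taken from the kernel code. The function names and the representation of a subflow as a (cwnd in bytes, RTT in seconds) pair are assumptions made for the example.

```python
def lia_alpha(subflows):
    """Aggressiveness factor alpha, following our reading of RFC 6356.

    subflows is a list of (cwnd_bytes, rtt_seconds) pairs:
    alpha = cwnd_total * max(cwnd_i / rtt_i^2) / (sum(cwnd_i / rtt_i))^2
    """
    cwnd_total = sum(cwnd for cwnd, _ in subflows)
    best = max(cwnd / (rtt * rtt) for cwnd, rtt in subflows)
    denominator = sum(cwnd / rtt for cwnd, rtt in subflows) ** 2
    return cwnd_total * best / denominator


def lia_increase(subflows, i, bytes_acked, mss):
    """Window increase (in bytes) for an ACK received on subflow i
    during congestion avoidance."""
    cwnd_i, _ = subflows[i]
    cwnd_total = sum(cwnd for cwnd, _ in subflows)
    coupled = lia_alpha(subflows) * bytes_acked * mss / cwnd_total
    uncoupled = bytes_acked * mss / cwnd_i   # what a regular TCP flow would add
    return min(coupled, uncoupled)
```

With a single subflow, alpha evaluates to 1 and the increase is the same as the regular TCP increase, which matches the statement above about single-path behaviour. With two identical subflows, alpha is 0.5, so each subflow grows more slowly than a regular TCP flow would.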

There are two other configuration parameters that could be tuned to improve the performance of Multipath TCP. First, Multipath TCP tends to consume more buffers than regular TCP since data is transmitted over paths with different delays. If you experience performance issues with the default buffer sizes, you might try to increase them, see https://fasterdata.es.net/host-tuning/linux/ for additional information. Second, if Multipath TCP is used on paths having different Maximum Segment Sizes, there are scenarios where the performance can be significantly reduced. A patch that solves this problem has been posted recently. If your version of the Multipath TCP kernel does not include this patch, you might want to force the MTU on all your interfaces on the client to use the same value (or force a lower MTU on the server to ensure that the clients always use the same MSS).

Multipath TCP discussed at Blackhat 2014

The interest in Multipath TCP continues to grow. During IETF90, an engineer from Oracle confirmed that they were working on an implementation of Multipath TCP on Solaris. This indicates that companies see a possible benefit with Multipath TCP. Earlier this week, Catherine Pearce and Patrick Thomas from Neohapsis gave a presentation on how the deployment of Multipath TCP could affect enterprises that rely heavily on firewalls and IDS in their corporate network. This first ‘heads up’ for the security community will likely be followed by many other attempts to analyse the security of Multipath TCP and its implications on the security of an enterprise network.

In parallel with their presentation, Catherine and Patrick have released two software packages that could be useful for Multipath TCP users. Both are based on a first implementation of Multipath TCP inside scapy written by Nicolas Maitre during his Master’s thesis at UCL.

  • mptcp_scanner is a tool that probes remote hosts to verify whether they support Multipath TCP. It would be interesting to see whether an iPhone is detected as such (probably not because there are no servers running on the iPhone). In the long term, we can expect that nmap
  • mptcp_fragmenter is a tool that mimics how a Multipath TCP connection could send data over different subflows. Currently, the tool is very simple: five subflows are used and their source port numbers are fixed. Despite this limitation, it is a good starting point to test the support of Multipath TCP on firewalls. We can expect that new features will be added as firewalls add support for Multipath TCP.