TCP Tracing – Part 6 – Segmentation Offloading

With multi-gigabit per second Ethernet and wireless interfaces, CPUs are quite challenged by the sheer number of packets that need to be handled. Let’s say a (meager) 1 Gbps Ethernet link is well utilized by large file transfers (let’s say 100 MB per second) and a typical maximum segment size (MSS) of 1460 bytes is used. That’s 68,493 packets per second in one direction, not even counting the TCP ACKs in the other direction! Also, Wireshark starts smoking when it has to look at a 60-second trace with 4 million packets inside. But there’s a fix for that with almost no downsides: Segmentation Offloading.
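The numbers above are easy to reproduce. Here is a small sketch in plain shell arithmetic; the function names are made up for this example, and the inputs are the illustrative figures from the text (100,000,000 bytes per second, 1460 bytes MSS), not measurements.

```shell
#!/bin/sh
# Rough packet-rate estimate for a bulk TCP transfer.
# Integer arithmetic only, so results are rounded down.

packets_per_second() {
    # $1 = throughput in bytes per second, $2 = MSS in bytes
    echo $(( $1 / $2 ))
}

packets_in_trace() {
    # $1 = packets per second, $2 = trace length in seconds
    echo $(( $1 * $2 ))
}

pps=$(packets_per_second 100000000 1460)
echo "packets per second: $pps"                               # 68493
echo "packets in a 60 s trace: $(packets_in_trace "$pps" 60)" # 4109580
```

So a one-minute trace of a single busy direction already holds a bit more than 4.1 million data packets.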

Fortunately, the Linux kernel, the network card’s driver and the network hardware support “Segmentation Offloading” by default today. Instead of passing individual IP packets from the Ethernet card to the CPU and main memory, the network card bundles several incoming packets and hands a single huge packet to the CPU and the TCP/IP stack for processing. It does more than just bundle IP packets: it looks at the TCP sequence numbers and combines several TCP/IP packets into a single TCP/IP packet that it then forwards to the kernel. This means that the network card itself, or the NIC’s driver software, needs to understand IP and TCP, as it has to create a new IP and TCP header for the combined packet, based on the information that changes between the headers of the individual packets.
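To make the receive side a bit more tangible, here is a toy model of the bundling step. It is only a sketch of the idea, not what a real NIC or the kernel does: it assumes each packet is described by a "sequence number, payload length" pair, and it merges packets as long as the next sequence number follows directly on the previous payload. Real implementations also compare addresses, ports and flags and cap the size of the merged packet. The function name coalesce_segments is made up for this example.

```shell
#!/bin/sh
# Toy model of receive-side bundling: merge packets whose TCP sequence
# numbers are contiguous into one larger packet.

coalesce_segments() {
    # Reads "seq len" pairs, one per line, from stdin.
    awk '
        NR == 1             { start = $1; total = $2; next }
        $1 == start + total { total += $2; next }   # contiguous: grow the bundle
                            { print "seq=" start " len=" total
                              start = $1; total = $2 }  # gap: start a new bundle
        END                 { if (NR > 0) print "seq=" start " len=" total }
    '
}

# Three back-to-back segments, then one that does not follow directly.
printf '%s\n' "1000 1460" "2460 1460" "3920 1460" "8300 1460" | coalesce_segments
# seq=1000 len=4380
# seq=8300 len=1460
```

The first three packets end up as one 4380 byte packet, which is what a trace taken on the receiving host would show with the offload enabled.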

In the other direction, i.e. when sending packets into the network, the Linux kernel can do the same: instead of sending TCP/IP packets that each carry at most 1460 bytes of payload, it sends a single huge TCP/IP packet to the network card. There, the packet is split up into segments of 1460 bytes each, and individual IP and TCP headers are added. The screenshot above shows an example. Here, more than 40 TCP/IP packets are bundled into a single TCP/IP packet by the kernel IP stack, which is then sent to the network card. In addition, tcpdump has only stored the first 100 bytes of the packet to save space (see part 2 of this series for details).
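The split on the transmit side can be sketched in a few lines of shell as well. Again, this is a simplified model under stated assumptions: it only tracks the payload length and the TCP sequence number of each resulting packet, it ignores sequence number wrap-around, and the function name segment_payload is made up for this example.

```shell
#!/bin/sh
# Toy model of transmit-side segmentation: cut one large payload into
# MSS-sized pieces and advance the sequence number for each piece.

segment_payload() {
    # $1 = payload length in bytes, $2 = MSS in bytes, $3 = first sequence number
    seg_remaining=$1
    seg_mss=$2
    seg_seq=$3
    while [ "$seg_remaining" -gt 0 ]; do
        if [ "$seg_remaining" -gt "$seg_mss" ]; then
            seg_len=$seg_mss
        else
            seg_len=$seg_remaining      # last piece may be shorter
        fi
        echo "seq=$seg_seq len=$seg_len"
        seg_seq=$(( seg_seq + seg_len ))
        seg_remaining=$(( seg_remaining - seg_len ))
    done
}

segment_payload 4000 1460 1000
# seq=1000 len=1460
# seq=2460 len=1460
# seq=3920 len=1080
```

A bundle of 58400 bytes would come out as 40 full-sized packets, which is roughly the order of magnitude described above.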

While this significantly speeds up packet analysis, segmentation offload can get in the way when one wants to have a very close look at each individual packet that is ‘really’ sent over the network. Fortunately, the default segmentation offload can be deactivated on Linux. Here are the three commands I use for the purpose:

# off to handle packets separately - CPU intensive
# on to handle them in bulk - default!

sudo ethtool -K <interface> tso off
sudo ethtool -K <interface> gso off
sudo ethtool -K <interface> gro off
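One detail that is easy to get wrong: ethtool changes features with a capital -K, while the lowercase -k only lists them. ethtool also accepts several feature/value pairs in one call, so the three settings can be switched together. The helper below is just a convenience sketch; the name offload_command and the interface name eth0 are placeholders. It only prints the command line so that it can be checked before it is run with root privileges.

```shell
#!/bin/sh
# Build one ethtool call that switches tso, gso and gro together.
# The command is only printed, not executed.

offload_command() {
    # $1 = interface name, $2 = on|off
    case "$2" in
        on|off) ;;
        *) echo "usage: offload_command <interface> on|off" >&2
           return 1 ;;
    esac
    echo "ethtool -K $1 tso $2 gso $2 gro $2"
}

offload_command eth0 off   # before tracing
offload_command eth0 on    # back to the default afterwards
```

Switching everything back on after the trace is worth remembering, as the setting otherwise stays in place until it is changed again or the interface is reinitialized.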

tso (TCP segmentation offload) and gso (generic segmentation offload) deal with send (transmit) offload, while gro stands for ‘generic receive offload’. Depending on the network card, there are other offloading (i.e. packet bundling / modification) schemes as well. Here’s how to get an overview of them and their current status:

sudo ethtool -k INTERFACE-NAME

Here’s a part of the result that is returned when I have a look at the Wi-Fi interface (an Intel AX200) of my notebook:

martin@m3:~$ sudo ethtool -k wlp3s0
Features for wlp3s0:
[...]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on

generic-segmentation-offload: on

generic-receive-offload: on
[...]
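Since the full list is long, a small filter helps to see only the three settings discussed in this post. The sketch below assumes the output format shown above, in which the top-level features start at the beginning of the line and the sub-features are indented; the function name offload_status is made up for this example, and the sample text is an abbreviated copy of the listing above so the snippet runs without a network card.

```shell
#!/bin/sh
# Keep only the tso / gso / gro summary lines from `ethtool -k` output.

offload_status() {
    # Reads `ethtool -k <interface>` output on stdin.
    grep -E '^(tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):'
}

# Abbreviated sample input, copied from the listing above.
sample='Features for wlp3s0:
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on'

printf '%s\n' "$sample" | offload_status
# tcp-segmentation-offload: on
# generic-segmentation-offload: on
# generic-receive-offload: on
```

On a live system the same filter would be used as: ethtool -k wlp3s0 | offload_status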

For more details have a look at Wikipedia here and here and for some excellent further background on lwn.net here.