Sometimes it can be useful to inspect the state of a TCP endpoint. Things such as the current congestion window, the retransmission timeout (RTO), duplicate ack threshold, etc. are not reflected in the segments that flow over the wire. Therefore, just looking at packet captures can leave you scratching your head as to why a TCP connection is behaving a certain way.
Using the Linux ss utility coupled with crash, it’s not too difficult to inspect some of the internal TCP state for a socket on Linux. Figuring out the meaning of all the variables and how they relate to the variables referenced in the many TCP RFCs and papers is another matter, but at least we can get some idea of what is going on.
First, you can ask ss to give you information about, say, the NFS sockets in use on a given client system:
[cperl@localhost ~]$ ss -eipn '( dport = :nfs )'
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 192.168.1.10:975 192.168.1.200:2049 ino:12453 sk:ffff8802305a0800
ts sack cubic wscale:6,7 rto:201 rtt:1.875/0.75 ato:40 cwnd:10 ssthresh:40 send 61.8Mbps rcv_rtt:1.875 rcv_space:1814280
ESTAB 0 0 192.168.1.10:971 192.168.1.201:2049 ino:16576 sk:ffff88022f14d6c0
ts sack cubic wscale:6,7 rto:202 rtt:2.125/1.75 ato:40 cwnd:10 ssthresh:405 send 54.5Mbps rcv_rtt:5 rcv_space:3011258
Internally, ss uses the tcp_diag kernel module to extract this extra information (this is done via an AF_NETLINK socket).
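To give a sense of what that looks like, here is a minimal, illustrative C sketch of the kind of request ss sends. It assumes the legacy NETLINK_INET_DIAG interface present in 2.6.32-era kernels like the one above (newer kernels expose the same data via NETLINK_SOCK_DIAG and struct inet_diag_req_v2), and it elides the reply-parsing loop a real client would need:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/inet_diag.h>

int main(void)
{
    /* NETLINK_INET_DIAG is the legacy name; newer headers alias it
     * to NETLINK_SOCK_DIAG. */
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_INET_DIAG);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    /* Request: dump all TCP sockets and include INET_DIAG_INFO,
     * which carries a struct tcp_info (rto, cwnd, ssthresh, ...). */
    struct {
        struct nlmsghdr nlh;
        struct inet_diag_req req;
    } msg;
    memset(&msg, 0, sizeof(msg));
    msg.nlh.nlmsg_len = sizeof(msg);
    msg.nlh.nlmsg_type = TCPDIAG_GETSOCK;
    msg.nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ROOT | NLM_F_MATCH;
    msg.req.idiag_family = AF_INET;
    msg.req.idiag_states = ~0U;                    /* all TCP states */
    msg.req.idiag_ext = 1 << (INET_DIAG_INFO - 1); /* include tcp_info */

    struct sockaddr_nl kernel = { .nl_family = AF_NETLINK };
    if (sendto(fd, &msg, sizeof(msg), 0,
               (struct sockaddr *)&kernel, sizeof(kernel)) < 0) {
        perror("sendto");
        return 1;
    }

    /* Each reply holds a struct inet_diag_msg followed by netlink
     * attributes; a real client would loop until NLMSG_DONE and walk
     * the attributes to pull out the struct tcp_info. */
    char buf[8192];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);
    printf("received %zd bytes of socket diagnostics\n", n);

    close(fd);
    return 0;
}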
A lot of interesting TCP state is provided in this output. For example, you can see the current retransmission timeout (“rto”), the current buffer space available for receiving data (“rcv_space”) and the congestion control algorithm (“cubic”). Some of the other variables are interesting too, but going into detail on all of them is beyond the scope of this blog post. You can also see the window scale option for the connection (“wscale”): the number before the comma is the scaling applied to the window offered by the remote endpoint, and the number after the comma is the scaling the remote endpoint will apply to the window we offer (i.e. it’s the Window Scale option we sent in our initial SYN).
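To make the scaling concrete, here is a tiny standalone sketch (the raw window value is invented for illustration, not taken from the output above). With wscale:6,7, a 16-bit window value carried in a segment from the peer is left-shifted by 6 to get the effective window in bytes:

#include <stdio.h>

int main(void)
{
    unsigned int raw_window = 501; /* hypothetical value from a segment */
    unsigned int snd_wscale = 6;   /* first number in "wscale:6,7" */

    /* The effective window is the raw value shifted by the scale. */
    unsigned int effective = raw_window << snd_wscale;
    printf("effective send window: %u bytes\n", effective); /* 32064 */
    return 0;
}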
If you’re really, really interested in the kernel’s internal state, you can also take the address of the struct sock that ss gave you (e.g. sk:ffff8802305a0800) and inspect it with crash:
[cperl@localhost ~]$ sudo crash -e emacs
...
KERNEL: /usr/lib/debug/lib/modules/2.6.32-431.1.2.0.1.el6.x86_64/vmlinux
DUMPFILE: /dev/crash
CPUS: 4
DATE: Tue Jul 1 15:26:19 2014
UPTIME: 1 days, 07:32:48
LOAD AVERAGE: 0.08, 0.05, 0.01
TASKS: 871
NODENAME: localhost
RELEASE: 2.6.32-431.1.2.0.1.el6.x86_64
VERSION: #1 SMP Fri Dec 13 13:06:13 UTC 2013
MACHINE: x86_64 (2992 Mhz)
MEMORY: 7.9 GB
PID: 29732
COMMAND: "crash"
TASK: ffff88013a928080 [THREAD_INFO: ffff88011b548000]
CPU: 1
STATE: TASK_RUNNING (ACTIVE)
crash> struct tcp_sock.rcv_nxt,snd_una,reordering ffff8802305a0800
rcv_nxt = 3794247234
snd_una = 2557966926
reordering = 3
Because of the way Linux lays out these structures in memory, you can just cast the struct sock to a struct tcp_sock. If you leave off the specific members in the struct invocation above, you get a recursive dump of all the fields and the structures embedded within (it’s just too large to be useful in this blog post).
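The cast is safe because struct tcp_sock nests the smaller socket structures at offset zero: a pointer to the struct sock is therefore also a valid pointer to the enclosing struct tcp_sock. Below is a heavily abbreviated sketch of that layout (the real definitions in the kernel headers have many more fields); the tcp_sk() helper at the end is the actual cast the kernel itself uses:

typedef unsigned int u32; /* stand-in for the kernel's fixed-width type */

/* Abbreviated sketch of the nesting in the kernel sources. */
struct sock { /* ... core socket state ... */ };

struct inet_sock {
    struct sock sk;              /* first member */
    /* ... IP-level state ... */
};

struct inet_connection_sock {
    struct inet_sock icsk_inet;  /* first member */
    /* ... connection-oriented state, e.g. the retransmit timer ... */
};

struct tcp_sock {
    struct inet_connection_sock inet_conn; /* first member */
    u32 rcv_nxt;  /* next sequence number we expect to receive */
    u32 snd_una;  /* first byte we want an ack for */
    /* ... */
};

/* From the kernel: the cast that makes the crash invocation above work. */
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
{
    return (struct tcp_sock *)sk;
}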
It’s possible you might not be able to get what you want just using crash and may want to turn to a tool like SystemTap to further figure out what is going on, but this is a decent place to start.