Sometimes it can be useful to inspect the state of a TCP endpoint. Things such as the current congestion window, the retransmission timeout (RTO), the duplicate ACK threshold, etc. are not reflected in the segments that flow over the wire. Therefore, just looking at packet captures can leave you scratching your head as to why a TCP connection is behaving a certain way.
Using the Linux
ss utility coupled with
crash, it's not too difficult to
inspect some of the internal TCP state for a socket on Linux. Figuring out the
meaning of all variables and how they relate to the variables referenced in the
many TCP RFCs and papers is another matter, but at least we can get some idea of
what is going on.
First, you can ask
ss to give you information about, say, NFS sockets in use
on a given client system:
    [cperl@localhost ~]$ ss -eipn '( dport = :nfs )'
    State      Recv-Q Send-Q      Local Address:Port      Peer Address:Port
    ESTAB      0      0            192.168.1.10:975      192.168.1.200:2049   ino:12453 sk:ffff8802305a0800
         ts sack cubic wscale:6,7 rto:201 rtt:1.875/0.75 ato:40 cwnd:10 ssthresh:40 send 61.8Mbps rcv_rtt:1.875 rcv_space:1814280
    ESTAB      0      0            192.168.1.10:971      192.168.1.201:2049   ino:16576 sk:ffff88022f14d6c0
         ts sack cubic wscale:6,7 rto:202 rtt:2.125/1.75 ato:40 cwnd:10 ssthresh:405 send 54.5Mbps rcv_rtt:5 rcv_space:3011258
ss uses the
tcp_diag kernel module to extract this extra information
(the request is made over a netlink socket).
A lot of interesting TCP state is provided in this output. For example, you can see the current retransmission timeout ("rto"), the buffer space currently available for receiving data ("rcv_space"), the congestion control algorithm ("cubic"), and the window scale option for the connection ("wscale:6,7"). In the wscale field, the number before the comma is the scaling applied to the window offered by the remote endpoint, and the number after the comma is the scaling the remote endpoint will be applying to the window offered by us (i.e. it's the Window Scale option we sent in our initial SYN). Some of the other variables are interesting too, but going into detail on all of them is beyond the scope of this blog post.
If you’re really, really interested in the kernel’s internal state, you can also
take the address of the
struct sock that
ss gave you (e.g.
sk:ffff8802305a0800) and inspect it with crash:
    [cperl@localhost ~]$ sudo crash -e emacs
    ...
          KERNEL: /usr/lib/debug/lib/modules/2.6.32-4184.108.40.206.1.el6.x86_64/vmlinux
        DUMPFILE: /dev/crash
            CPUS: 4
            DATE: Tue Jul  1 15:26:19 2014
          UPTIME: 1 days, 07:32:48
    LOAD AVERAGE: 0.08, 0.05, 0.01
           TASKS: 871
        NODENAME: localhost
         RELEASE: 2.6.32-4220.127.116.11.1.el6.x86_64
         VERSION: #1 SMP Fri Dec 13 13:06:13 UTC 2013
         MACHINE: x86_64  (2992 Mhz)
          MEMORY: 7.9 GB
             PID: 29732
         COMMAND: "crash"
            TASK: ffff88013a928080  [THREAD_INFO: ffff88011b548000]
             CPU: 1
           STATE: TASK_RUNNING (ACTIVE)

    crash> struct tcp_sock.rcv_nxt,snd_una,reordering ffff8802305a0800
      rcv_nxt = 3794247234
      snd_una = 2557966926
      reordering = 3
Because of the way Linux lays these structures out in memory (struct sock
is effectively the first member of struct tcp_sock, via the embedded inet
structures), you can just cast the
struct sock to a
struct tcp_sock. If you leave off the specific members in
the "struct" invocation above you can get a recursive dump of all the fields and
the structures embedded within (it's just too large to be useful in this blog
post).
It’s possible you might not be able to get what you want just using
ss and crash, in which case you
may want to turn to a tool like SystemTap to further figure out what is going
on, but this is a decent place to start.