lunes, 15 de mayo de 2017

We all understand free command... don't we???

I know I should be writing the next entry of "Understanding SOMAXCONN parameter Part I" (and I am) but instead I decided to do a little detour first. I faced this question about free a few days ago and noticed odd/different answers for it (including my own xD) so decided to go for a deep dive and found some interesting facts worth sharing.

Free


Free Linux command is one of those tools you will be definitely get exposed to as a sysadmin, and understanding its output is extremely important. A bad interpretation of free output could cause unnecessary panic and you don't want to panic for no reason :D (oh wait... maybe you do, I certainly don't ).

Lets have a look at a simple free output:

juan@test:~$ free
             total       used       free     shared    buffers     cached
Mem:       3007572     584872    2422700      14780      55808     291704
-/+ buffers/cache:     237360    2770212
Swap:            0          0          0
juan@test:~$

In the line starting with "Mem:" we get to see the total available physical memory on the system 3007572 KBytes (3GBytes) and then how it is distributed:
  • used: memory being occupied
  • free: memory that is available in the system
  • shared: amount of memory that is shared between processes
  • buffers: memory used for kernel buffers involving IO (disk, network, etc)
  • cached: memory used for data caching
Then we have a line starting with "-/+ buffers/cache:" which looks kind of cryptic, doesn't it?... it turns out to be a kind of summary of the previous line:
  • There are 237360 KBytes being used, this number is the result of the following math "used - (buffers + cached)". This is the amount of memory the system is actually relying on.
  • There are 2770212 KBytes that could be utilized, this is the result of "free + buffers + cached". What this number actually means is that the kernel should be able to reclaim buffer and cached memory if necessary.
So we have a better understanding of free output now, but lets get more details of where these numbers come from.

The power of the force source


Free binary is shipped as part of procps-ng package in Linux and we can inspect its latest code here. What free actually does for you is summarize memory utilization details coming from /proc/meminfo in a more "human way". So lets dissect this...
  • The main source code can be read here free.c, if you have a look at the lines between 359-377, you can see that there are 6 variables kb_main_total, kb_main_used, kb_main_free, kb_main_shared, kb_main_buffers and kb_main_cached and they are actually the ones containing the values that will show up in free's output. However if you look for these variables on the code you won't find them, they are extern variables included from "proc/sysinfo.h" which means the variables are being initialized somewhere else, particularly in "proc/sysinfo.c". The variables are actually being populated by meminfo() function being called in line 355.
  meminfo();
  /* Translation Hint: You can use 9 character words in
   * the header, and the words need to be right align to
   * beginning of a number. */
  if (flags & FREE_WIDE) {
   printf(_("              total        used        free      shared     buffers       cache   available"));
  } else {
   printf(_("              total        used        free      shared  buff/cache   available"));
  }
  printf("\n");
  printf("%-7s", _("Mem:"));
  printf(" %11s", scale_size(kb_main_total, flags, args));
  printf(" %11s", scale_size(kb_main_used, flags, args));
  printf(" %11s", scale_size(kb_main_free, flags, args));
  printf(" %11s", scale_size(kb_main_shared, flags, args));
  if (flags & FREE_WIDE) {
   printf(" %11s", scale_size(kb_main_buffers, flags, args));
   printf(" %11s", scale_size(kb_main_cached, flags, args));
  } else {
   printf(" %11s", scale_size(kb_main_buffers+kb_main_cached, flags, args));
  }
  printf(" %11s", scale_size(kb_main_available, flags, args));
  printf("\n");

  • Now checking the "proc/sysinfo.c" code where meminfo() function exists, we see:
    • kb_main_buffer is populated with the content of Buffers value from /proc/meminfo (line 691), this is memory used by the kernel to temporarily hold data being sent/received by the system (network IO, disk IO, etc).
    • kb_main_cached is populated with the result of "kb_page_cache + kb_slab_reclaimable" (line 762). These two values come from Cached and SReclaimable respectively (in /proc/meminfo). So the cached value presented by free contains the memory used by the page cache and the reclaimable slab memory (slab memory caches dentry and inodes structures to speed up some fs operations).
  {"Bounce",       &kb_bounce},
  {"Buffers",      &kb_main_buffers}, // important
  {"Cached",       &kb_page_cache},  // important
  {"CommitLimit",  &kb_commit_limit},
  ...
  kb_main_cached = kb_page_cache + kb_slab_reclaimable;
  kb_swap_used = kb_swap_total - kb_swap_free;

It is clear now that free relies completely on the details exposed by the kernel through /proc/meminfo, and the math involved in getting the values isn't really rocket science after all. But here it comes the interesting part...

To have in mind


It looks like the information provided by free can be slightly different between different Linux flavors (because they carry different free versions :D) so I thought it might be worth pointing that out here as well. This could lead to some inconsistencies when some other tools rely on the output of free.

Free version shipped with Ubuntu 14.04 (3.3.9), doesn't even include Slabs under the cached memory, you can see here how kb_main_cached value is just populated with Cached value from /proc/meminfo.

juan@test:~$ free -V
free from procps-ng 3.3.9
juan@test:~$ free
             total       used       free     shared    buffers     cached
Mem:       3007572     530948    2476624      14776      53080     244740
-/+ buffers/cache:     233128    2774444
Swap:            0          0          0
juan@test:~$ grep "^Cached\|^SReclaimable\|^Slab" /proc/meminfo
Cached:           244740 kB
Slab:              32308 kB
SReclaimable:      17572 kB
juan@test:~$

Note: cached = Cached

Free version shipped with CentOS 7 (3.3.10), defines kb_main_cached as "kb_page_cache + kb_slab" (706) this seems a minor thing but not all the slab memory is reclaimable therefore part of this cached content is not really available in case of memory pressure.

[juan@server ~]$ free -V
free from procps-ng 3.3.10
[juan@server ~]$ free -w
              total        used        free      shared     buffers       cache   available
Mem:        1016860       74212      818380        6688         948      123320      810600
Swap:             0           0           0
[juan@server ~]$ grep "^Cached\|^SRecl\|^Slab" /proc/meminfo
Cached:            88876 kB
Slab:              34444 kB
SReclaimable:      13928 kB
[juan@server ~]$

Note: cache = Cached + Slab

Interesting, isn't it? That's the power of open source after all right? Having the chance to really understand what certain piece of software is doing for you and how is doing it!

That's all about free, hope it helps (it helped me at least :P).

sábado, 6 de mayo de 2017

Understanding SOMAXCONN parameter Part I

This post is kind of a fork of something I worked on the previous week and tries to shed some light on what somaxconn parameter means in the context of Linux Network stack. I've tried to prepare something similar to what I did in TCP Keep Alive - how it works, with examples and some traffic captures so I guess this article will be a bit longer that [I|you]'d like xD, but useful (hopefully).

somaxconn


By definition this value is the maximum number of established connections a socket may have queuing waiting to be accepted by the process holding the socket. This is completely different from tcp_syn_max_backlog since this value is for incomplete connection (connections that haven't been ACKed yet).

To give more context to that statement is necessary to understand a bit more about sockets. When a socket is created, before being able to receive connections two things must happen:
  1. It has to be bind() to a local address.
  2. It has to bet set to listen() for connections.
The listen() syscall is key here, it receives 2 parameters:
Long time ago (before kernel 2.4.25) somaxconn value used to be hardcoded in the kernel, nowadays you can read and update using sysctl as with the rest of the network stack parameters. In most systems the default value is 128, like in my CentOS VM:

[root@server juan]# sysctl -a --pattern net.core.somaxconn
net.core.somaxconn = 128
[root@server juan]#

Test scenario


In order to test and confirm the behavior of somaxconn I have the following setup:
  • Client VM (IP 192.168.0.17) running a C binary (source here) that will start 25 TCP connections to the Server VM on port 8080, in each connection it will send 4 bytes "juan". The binary opens 1 connection per second, and after having opened the 25 connections just waits for 2 minutes to finish.
  • Server VM (IP 192.168.0.26) running ncat (nc, netcat, whatever you like to call it) listening on port 8080, and the maximum open files has been set to 9 (ulimit -n 9) to achieve a condition where we can see established connections being queued easier.
A few more details you should keep in mind:
  • The version of nc shipped with CentOS sets a backlog of 10 on the listen() syscall (using strace):
[juan@server ~]$ ulimit -n 9
[juan@server ~]$ strace nc -lk 8080
execve("/usr/bin/nc", ["nc", "-lk", "8080"], [/* 24 vars */]) = 0
brk(0)                                  = 0x1dd7000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe7075e9000
...
bind(3, {sa_family=AF_INET6, sin6_port=htons(8080), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 128) = 0
listen(3, 10)                           = 0
fcntl(3, F_GETFL)
...
  • So we have a backlog of 10 set in the listen syscall and a limit of 9 open files. I will use this value as somaxconn since the meaning is the same.
  • After nc was just started we can see it has already used 5 FD in use (STDIN, STDOUT, STDERR, and two more for the listening sockets on IPv4 and IPv6)
[root@server juan]# lsof -p 14136|grep "CHR\|IPv"
nc      14136 juan    0u   CHR     136,0      0t0        3 /dev/pts/0
nc      14136 juan    1u   CHR     136,0      0t0        3 /dev/pts/0
nc      14136 juan    2u   CHR     136,0      0t0        3 /dev/pts/0
nc      14136 juan    3u  IPv6 488270957      0t0      TCP *:webcache (LISTEN)
nc      14136 juan    4u  IPv4 488270958      0t0      TCP *:webcache (LISTEN)
[root@server juan]#

With all these details in mind and some basic math (2+2=4 :D) we can expect the following behavior:
  • Since 5 out of the 9 available file descriptors are already in use, nc should not be able to receive more than 4 new connections on port 8080. Each new connection will create a new socket that has to be accessible in user space through a file descriptor (have a look at accept() syscall), which can't be created if the max open files limit has been reached.
  • The C binary will issue 25 connections:
    • The first 4 will be answered just fine by nc and the 4 bytes will be read from the socket and printed to STDOUT.
    • The next 10 connections will be indeed accepted by the Network layer of the Server VM, and even the 4 bytes will be received, however nc won't be able to accept the socket and read the 4 bytes. These are the established connections being queued!!!
    •  The remaining 11 connections will look as ESTABLISHED from the client as they have completed the TCP handshake however the 4 bytes won't be ACKnowledged by Network Stack because the queue is full (somaxconn limit has been reached). From the server the connections won't even exist.

Binary execution:

juan@test:~/TCP_connections$ gcc -o tcp_connections tcp_connections.c
juan@test:~/TCP_connections$ ./tcp_connections
Socket opened FD=3, connection number 0
Socket opened FD=4, connection number 1
Socket opened FD=5, connection number 2
Socket opened FD=6, connection number 3
Socket opened FD=7, connection number 4
Socket opened FD=8, connection number 5
Socket opened FD=9, connection number 6
Socket opened FD=10, connection number 7
Socket opened FD=11, connection number 8
Socket opened FD=12, connection number 9
Socket opened FD=13, connection number 10
Socket opened FD=14, connection number 11
Socket opened FD=15, connection number 12
Socket opened FD=16, connection number 13
Socket opened FD=17, connection number 14
Socket opened FD=18, connection number 15
Socket opened FD=19, connection number 16
Socket opened FD=20, connection number 17
Socket opened FD=21, connection number 18
Socket opened FD=22, connection number 19
Socket opened FD=23, connection number 20
Socket opened FD=24, connection number 21
Socket opened FD=25, connection number 22
Socket opened FD=26, connection number 23
Socket opened FD=27, connection number 24
juan@test:~/TCP_connections$

Nc execution on the server


Note: I introduced the new line after the 4th connections just to split them from the rest:
[juan@server ~]$ nc -lvk -p 8080
Ncat: Version 6.40 ( http://nmap.org/ncat )
Ncat: Listening on :::8080
Ncat: Listening on 0.0.0.0:8080
Ncat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49026.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49027.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49028.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49029.
juan

Ncat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49030.
Ncat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49031.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49032.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49033.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49034.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49035.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49036.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49037.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49038.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49039.
juanNcat: Connection from 192.168.0.17.
Ncat: Connection from 192.168.0.17:49040.
juanjuan 

Results


To have a  look at what was happening during the test I measured 3 different locations:
  1. TCP connections on the client using ss.
  2. TCP connections on the server using ss.
  3. Captured network traffic on the server using tcpdump.
These 3 locations provided enough information to understand what is going on and to even find something interesting :D.

TCP connections on the client:


I've chopped off a couple of lines for the sake of space and time.  The samples were taken every 2 seconds and only TCP connections to port 8080 were considered:

root@test:/home/juan# for i in {1..100}; do ss -ntpe|grep "State\|8080"; echo ---; sleep 2; done
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port
ESTAB      0      0              192.168.0.17:49026         192.168.0.26:8080   users:(("tcp_connections",3342,3)) uid:1000 ino:23326 sk:ffff8800b4914fc0
---
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port
ESTAB      0      0              192.168.0.17:49027         192.168.0.26:8080   users:(("tcp_connections",3342,4)) uid:1000 ino:23350 sk:ffff8800b4914140
ESTAB      0      0              192.168.0.17:49028         192.168.0.26:8080   users:(("tcp_connections",3342,5)) uid:1000 ino:23368 sk:ffff8800b4910e80
ESTAB      0      0              192.168.0.17:49026         192.168.0.26:8080   users:(("tcp_connections",3342,3)) uid:1000 ino:23326 sk:ffff8800b4914fc0
---
...
---
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port
ESTAB      0      0              192.168.0.17:49031         192.168.0.26:8080   users:(("tcp_connections",3342,8)) uid:1000 ino:23408 sk:ffff8800b4911d00
ESTAB      0      0              192.168.0.17:49027         192.168.0.26:8080   users:(("tcp_connections",3342,4)) uid:1000 ino:23350 sk:ffff8800b4914140
ESTAB      0      0              192.168.0.17:49029         192.168.0.26:8080   users:(("tcp_connections",3342,6)) uid:1000 ino:23372 sk:ffff8800b4910740
ESTAB      0      0              192.168.0.17:49030         192.168.0.26:8080   users:(("tcp_connections",3342,7)) uid:1000 ino:23390 sk:ffff8800b49115c0
ESTAB      0      0              192.168.0.17:49028         192.168.0.26:8080   users:(("tcp_connections",3342,5)) uid:1000 ino:23368 sk:ffff8800b4910e80
ESTAB      0      0              192.168.0.17:49026         192.168.0.26:8080   users:(("tcp_connections",3342,3)) uid:1000 ino:23326 sk:ffff8800b4914fc0
ESTAB      0      0              192.168.0.17:49033         192.168.0.26:8080   users:(("tcp_connections",3342,10)) uid:1000 ino:23445 sk:ffff8800b4912b80
ESTAB      0      0              192.168.0.17:49034         192.168.0.26:8080   users:(("tcp_connections",3342,11)) uid:1000 ino:23447 sk:ffff8800b49132c0
ESTAB      0      0              192.168.0.17:49032         192.168.0.26:8080   users:(("tcp_connections",3342,9)) uid:1000 ino:23426 sk:ffff8800b4912440
---
...


After a few seconds, we can see a few ESTABLISHED tcp connections using source ports from 49026 to 49034, 9 connections to be more precise, and they all show 0 bytes in Send-Q which suggests the 4 bytes sent by the application were actually acknowledged by the server.

...
---
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port
ESTAB      0      0              192.168.0.17:49031         192.168.0.26:8080   users:(("tcp_connections",3342,8)) uid:1000 ino:23408 sk:ffff8800b4911d00
ESTAB      0      0              192.168.0.17:49027         192.168.0.26:8080   users:(("tcp_connections",3342,4)) uid:1000 ino:23350 sk:ffff8800b4914140
ESTAB      0      0              192.168.0.17:49029         192.168.0.26:8080   users:(("tcp_connections",3342,6)) uid:1000 ino:23372 sk:ffff8800b4910740
ESTAB      0      0              192.168.0.17:49030         192.168.0.26:8080   users:(("tcp_connections",3342,7)) uid:1000 ino:23390 sk:ffff8800b49115c0
ESTAB      0      0              192.168.0.17:49028         192.168.0.26:8080   users:(("tcp_connections",3342,5)) uid:1000 ino:23368 sk:ffff8800b4910e80
ESTAB      0      0              192.168.0.17:49037         192.168.0.26:8080   users:(("tcp_connections",3342,14)) uid:1000 ino:23502 sk:ffff8800b4917400
ESTAB      0      0              192.168.0.17:49035         192.168.0.26:8080   users:(("tcp_connections",3342,12)) uid:1000 ino:23465 sk:ffff8800b4913a00
ESTAB      0      0              192.168.0.17:49036         192.168.0.26:8080   users:(("tcp_connections",3342,13)) uid:1000 ino:23484 sk:ffff8800b4915e40
ESTAB      0      0              192.168.0.17:49026         192.168.0.26:8080   users:(("tcp_connections",3342,3)) uid:1000 ino:23326 sk:ffff8800b4914fc0
ESTAB      0      0              192.168.0.17:49038         192.168.0.26:8080   users:(("tcp_connections",3342,15)) uid:1000 ino:23520 sk:ffff8800b4916580
ESTAB      0      0              192.168.0.17:49033         192.168.0.26:8080   users:(("tcp_connections",3342,10)) uid:1000 ino:23445 sk:ffff8800b4912b80
ESTAB      0      0              192.168.0.17:49034         192.168.0.26:8080   users:(("tcp_connections",3342,11)) uid:1000 ino:23447 sk:ffff8800b49132c0
ESTAB      0      0              192.168.0.17:49040         192.168.0.26:8080   users:(("tcp_connections",3342,17)) uid:1000 ino:23557 sk:ffff8800b4914880
ESTAB      0      0              192.168.0.17:49039         192.168.0.26:8080   users:(("tcp_connections",3342,16)) uid:1000 ino:23539 sk:ffff8800b4915700
ESTAB      0      0              192.168.0.17:49032         192.168.0.26:8080   users:(("tcp_connections",3342,9)) uid:1000 ino:23426 sk:ffff8800b4912440
---
...

A few more seconds later we see even more TCP connections in ESTABLISHED state, including src ports from 49035 to 49040. So we have a total of 15 ESTABLISHED connections with 0 bytes waiting to be ACKnowledged by the server. Remember we were expecting 14 (4 + 10), but instead we got 15 (4 + 11), odd right? I'll get back to this later...

...
---
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port
ESTAB      0      0              192.168.0.17:49031         192.168.0.26:8080   users:(("tcp_connections",3342,8)) uid:1000 ino:23408 sk:ffff8800b4911d00
ESTAB      0      0              192.168.0.17:49027         192.168.0.26:8080   users:(("tcp_connections",3342,4)) uid:1000 ino:23350 sk:ffff8800b4914140
ESTAB      0      0              192.168.0.17:49029         192.168.0.26:8080   users:(("tcp_connections",3342,6)) uid:1000 ino:23372 sk:ffff8800b4910740
ESTAB      0      0              192.168.0.17:49030         192.168.0.26:8080   users:(("tcp_connections",3342,7)) uid:1000 ino:23390 sk:ffff8800b49115c0
ESTAB      0      0              192.168.0.17:49028         192.168.0.26:8080   users:(("tcp_connections",3342,5)) uid:1000 ino:23368 sk:ffff8800b4910e80
ESTAB      0      0              192.168.0.17:49037         192.168.0.26:8080   users:(("tcp_connections",3342,14)) uid:1000 ino:23502 sk:ffff8800b4917400
ESTAB      0      0              192.168.0.17:49035         192.168.0.26:8080   users:(("tcp_connections",3342,12)) uid:1000 ino:23465 sk:ffff8800b4913a00
ESTAB      0      0              192.168.0.17:49036         192.168.0.26:8080   users:(("tcp_connections",3342,13)) uid:1000 ino:23484 sk:ffff8800b4915e40
ESTAB      0      0              192.168.0.17:49026         192.168.0.26:8080   users:(("tcp_connections",3342,3)) uid:1000 ino:23326 sk:ffff8800b4914fc0
ESTAB      0      0              192.168.0.17:49038         192.168.0.26:8080   users:(("tcp_connections",3342,15)) uid:1000 ino:23520 sk:ffff8800b4916580
ESTAB      0      0              192.168.0.17:49033         192.168.0.26:8080   users:(("tcp_connections",3342,10)) uid:1000 ino:23445 sk:ffff8800b4912b80
ESTAB      0      0              192.168.0.17:49034         192.168.0.26:8080   users:(("tcp_connections",3342,11)) uid:1000 ino:23447 sk:ffff8800b49132c0
ESTAB      0      0              192.168.0.17:49040         192.168.0.26:8080   users:(("tcp_connections",3342,17)) uid:1000 ino:23557 sk:ffff8800b4914880
ESTAB      0      4              192.168.0.17:49041         192.168.0.26:8080   timer:(on,1.208ms,3) users:(("tcp_connections",3342,18)) uid:1000 ino:23575 sk:ffff8800b4916cc0
ESTAB      0      0              192.168.0.17:49039         192.168.0.26:8080   users:(("tcp_connections",3342,16)) uid:1000 ino:23539 sk:ffff8800b4915700
ESTAB      0      0              192.168.0.17:49032         192.168.0.26:8080   users:(("tcp_connections",3342,9)) uid:1000 ino:23426 sk:ffff8800b4912440
ESTAB      0      4              192.168.0.17:49042         192.168.0.26:8080   timer:(on,608ms,2) users:(("tcp_connections",3342,19)) uid:1000 ino:23593 sk:ffff8800b4910000
---
...

This new sample shows 2 lines that are different than the rest, the ones using src ports 49041 and 49042.  These were the first 2 connections that couldn't fit in the somaxconn queue!!! And even more, you can see how the 4 bytes that were sent from the client haven't been acknowledged by the server and therefore they show up in the Send-Q column. Note how the connection looks as ESTABLISHED for the client though, this is because the 3-way handshake was completed from the client point of view.

...
---
State      Recv-Q Send-Q        Local Address:Port          Peer Address:Port
ESTAB      0      0              192.168.0.17:49031         192.168.0.26:8080   users:(("tcp_connections",3342,8)) uid:1000 ino:23408 sk:ffff8800b4911d00
ESTAB      0      0              192.168.0.17:49027         192.168.0.26:8080   users:(("tcp_connections",3342,4)) uid:1000 ino:23350 sk:ffff8800b4914140
ESTAB      0      0              192.168.0.17:49029         192.168.0.26:8080   users:(("tcp_connections",3342,6)) uid:1000 ino:23372 sk:ffff8800b4910740
ESTAB      0      4              192.168.0.17:49047         192.168.0.26:8080   timer:(on,460ms,4) users:(("tcp_connections",3342,24)) uid:1000 ino:23632 sk:ffff8800a6efba00
ESTAB      0      4              192.168.0.17:49050         192.168.0.26:8080   timer:(on,132ms,3) users:(("tcp_connections",3342,27)) uid:1000 ino:23635 sk:ffff8800a6efc140
ESTAB      0      0              192.168.0.17:49030         192.168.0.26:8080   users:(("tcp_connections",3342,7)) uid:1000 ino:23390 sk:ffff8800b49115c0
ESTAB      0      0              192.168.0.17:49028         192.168.0.26:8080   users:(("tcp_connections",3342,5)) uid:1000 ino:23368 sk:ffff8800b4910e80
ESTAB      0      4              192.168.0.17:49045         192.168.0.26:8080   timer:(on,4.740ms,5) users:(("tcp_connections",3342,22)) uid:1000 ino:23630 sk:ffff8800a6efcfc0
ESTAB      0      4              192.168.0.17:49049         192.168.0.26:8080   timer:(on,2.336ms,4) users:(("tcp_connections",3342,26)) uid:1000 ino:23634 sk:ffff8800a6efc880
ESTAB      0      0              192.168.0.17:49037         192.168.0.26:8080   users:(("tcp_connections",3342,14)) uid:1000 ino:23502 sk:ffff8800b4917400
ESTAB      0      0              192.168.0.17:49035         192.168.0.26:8080   users:(("tcp_connections",3342,12)) uid:1000 ino:23465 sk:ffff8800b4913a00
ESTAB      0      4              192.168.0.17:49043         192.168.0.26:8080   timer:(on,2.740ms,5) users:(("tcp_connections",3342,20)) uid:1000 ino:23611 sk:ffff8800a6efde40
ESTAB      0      0              192.168.0.17:49036         192.168.0.26:8080   users:(("tcp_connections",3342,13)) uid:1000 ino:23484 sk:ffff8800b4915e40
ESTAB      0      4              192.168.0.17:49044         192.168.0.26:8080   timer:(on,3.732ms,5) users:(("tcp_connections",3342,21)) uid:1000 ino:23616 sk:ffff8800a6efd700
ESTAB      0      4              192.168.0.17:49046         192.168.0.26:8080   timer:(on,5.740ms,5) users:(("tcp_connections",3342,23)) uid:1000 ino:23631 sk:ffff8800a6eff400
ESTAB      0      0              192.168.0.17:49026         192.168.0.26:8080   users:(("tcp_connections",3342,3)) uid:1000 ino:23326 sk:ffff8800b4914fc0
ESTAB      0      4              192.168.0.17:49048         192.168.0.26:8080   timer:(on,1.336ms,4) users:(("tcp_connections",3342,25)) uid:1000 ino:23633 sk:ffff8800a6efecc0
ESTAB      0      0              192.168.0.17:49038         192.168.0.26:8080   users:(("tcp_connections",3342,15)) uid:1000 ino:23520 sk:ffff8800b4916580
ESTAB      0      0              192.168.0.17:49033         192.168.0.26:8080   users:(("tcp_connections",3342,10)) uid:1000 ino:23445 sk:ffff8800b4912b80
ESTAB      0      0              192.168.0.17:49034         192.168.0.26:8080   users:(("tcp_connections",3342,11)) uid:1000 ino:23447 sk:ffff8800b49132c0
ESTAB      0      0              192.168.0.17:49040         192.168.0.26:8080   users:(("tcp_connections",3342,17)) uid:1000 ino:23557 sk:ffff8800b4914880
ESTAB      0      4              192.168.0.17:49041         192.168.0.26:8080   timer:(on,732ms,5) users:(("tcp_connections",3342,18)) uid:1000 ino:23575 sk:ffff8800b4916cc0
ESTAB      0      0              192.168.0.17:49039         192.168.0.26:8080   users:(("tcp_connections",3342,16)) uid:1000 ino:23539 sk:ffff8800b4915700
ESTAB      0      0              192.168.0.17:49032         192.168.0.26:8080   users:(("tcp_connections",3342,9)) uid:1000 ino:23426 sk:ffff8800b4912440
ESTAB      0      4              192.168.0.17:49042         192.168.0.26:8080   timer:(on,1.732ms,5) users:(("tcp_connections",3342,19)) uid:1000 ino:23593 sk:ffff8800b4910000
---
^C
root@test:/home/juan#

The last sample shows the remaining connections from ports 49043 to 49050 all of them in the same condition, 4 bytes not being acknowledged.

From the client side samples we can confirm we got 15 complete TCP connections (src ports from 49026 to 49040) and 10 that even though showed up as ESTABLISHED weren't able to get acknowledged the 4 bytes sent by the client (src ports from 49041 to 49050).

TCP connections on the server


Lets have a look now to the samples taken on the server and see if they match with what we found on the client.

[root@server juan]# for i in {1..100}; do ss -tpn|grep "State\|8080"; echo "---"; sleep 2;done
State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49026               users:(("nc",pid=13705,fd=5))
---
State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49026               users:(("nc",pid=13705,fd=5))
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49028               users:(("nc",pid=13705,fd=7))
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49027               users:(("nc",pid=13705,fd=6))
---
...

The first samples look ok, nothing fancy there, we can see the first 3 TCP connections using src ports from 49026 to 49028.

...
---
State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49026               users:(("nc",pid=13705,fd=5))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49031
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49033
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49029               users:(("nc",pid=13705,fd=8))
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49028               users:(("nc",pid=13705,fd=7))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49032
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49030
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49027               users:(("nc",pid=13705,fd=6))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49034
---
State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49036
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49026               users:(("nc",pid=13705,fd=5))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49035
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49031
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49033
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49029               users:(("nc",pid=13705,fd=8))
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49028               users:(("nc",pid=13705,fd=7))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49032
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49030
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49027               users:(("nc",pid=13705,fd=6))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49034
---
...

And here it got interesting! We have now 4 connections (src ports between 49026 and 49029) which besides being ESTABLISHED have a FD associated to our nc process. However we have another 7 ESTABLISHED connections (src ports from 49030 to 49036) with no FD associated (and no process either :D)!!! These 7 connections are the ones waiting on the queue for nc to issue the accept() call. nc is actually issuing the accept calls, but the calls are failing due to nc having reached the maximum open files (because of the "ulimit -n 9").
 
...
---
State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49036
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49026               users:(("nc",pid=13705,fd=5))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49038
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49035
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49031
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49033
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49029               users:(("nc",pid=13705,fd=8))
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49028               users:(("nc",pid=13705,fd=7))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49032
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49039
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49030
ESTAB      0      0      192.168.0.26:8080               192.168.0.17:49027               users:(("nc",pid=13705,fd=6))
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49037
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49040
ESTAB      4      0      192.168.0.26:8080               192.168.0.17:49034
---
^C
[root@server juan]#

After the binary issued the 25 connections, on the server we can see 4 ESTABLISHED connections (src ports 49026 to 49029) with corresponding FDs and 11 ESTABLISHED connections (src ports 49030 to 49040) with no FD. So... where are the remaining 10 connections that show up as ESTABLISHED on the client??? Well, these 10 connections didn't reach ESTABLISHED state on the server :D as per the documentation of listen():

The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow.  If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.

Considering we are using TCP , the use case matches exactly with the underlined part.

To be continued...