Sunday, October 16, 2016

Linux limits 102 - Open files

This post is kind of a follow-up to Linux limits 101 - Ulimit from a few weeks ago, where we went through ulimit and how it is used to limit user behavior. This time we'll take a look at another limit the kernel can impose on us and therefore make our lives a bit harder: the number of open files.

I think it's worth mentioning (I may have said this before xD) that when we talk about open files in Linux systems we are basically talking about file descriptors (aka file handles). A file descriptor is a data structure used by processes to access files, Unix sockets, network sockets, pipes, etc. Every new process comes by default with 3 file descriptors:
  • FD 0: standard input
  • FD 1: standard output
  • FD 2: standard error
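Just to make this more concrete, here's a tiny bash sketch (the file /etc/hostname is just an example) that opens one extra descriptor on top of those three. Here we pick fd 3 ourselves with exec; a plain open() call would also land on 3, since the kernel always hands out the lowest free number:
exec 3</etc/hostname             # open /etc/hostname for reading on fd 3
read -u 3 line && echo "$line"   # read a line from fd 3
exec 3<&-                        # close fd 3 again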
So just by having, let's say:
juan@test:~$ ps aux|wc -l
142
juan@test:~$
about 142 processes running on the system, we should expect at least around 426 file descriptors to be in use (142x3=426). What if there were a way to know how many file descriptors a particular process is using?

File descriptors a process is using:


Yep, of course that's possible! And as always in Linux, there are at least two different ways. The first approach is, of course, the easiest one: each process has a folder under /proc that provides loads of information; in this case we'll focus on the fd subfolder, where, guess what's going to show up? Indeed, the file descriptors for the process:
root@test:/home/juan# ls /proc/1/fd | wc -l
24
root@test:/home/juan#
The init process (PID 1) has 24 file descriptors in use! We can see more details about them in the following output:
root@test:/home/juan# ls -la /proc/1/fd
total 0
dr-x------ 2 root root  0 Oct 15 10:39 .
dr-xr-xr-x 9 root root  0 Oct 15 10:38 ..
lrwx------ 1 root root 64 Oct 15 10:39 0 -> /dev/null
lrwx------ 1 root root 64 Oct 15 10:39 1 -> /dev/null
lrwx------ 1 root root 64 Oct 15 10:39 10 -> socket:[8662]
lrwx------ 1 root root 64 Oct 15 10:39 11 -> socket:[9485]
l-wx------ 1 root root 64 Oct 15 10:39 12 -> /var/log/upstart/network-manager.log.1 (deleted)
lrwx------ 1 root root 64 Oct 15 10:39 14 -> socket:[10329]
l-wx------ 1 root root 64 Oct 15 10:39 16 -> /var/log/upstart/systemd-logind.log.1 (deleted)
lrwx------ 1 root root 64 Oct 15 10:39 17 -> socket:[8637]
lrwx------ 1 root root 64 Oct 15 10:39 18 -> /dev/ptmx
lrwx------ 1 root root 64 Oct 15 10:39 2 -> /dev/null
lrwx------ 1 root root 64 Oct 15 10:39 20 -> /dev/ptmx
lrwx------ 1 root root 64 Oct 15 10:39 22 -> /dev/ptmx
l-wx------ 1 root root 64 Oct 15 10:39 24 -> /var/log/upstart/modemmanager.log.1 (deleted)
lrwx------ 1 root root 64 Oct 15 10:39 29 -> /dev/ptmx
lr-x------ 1 root root 64 Oct 15 10:39 3 -> pipe:[8403]
lrwx------ 1 root root 64 Oct 15 10:39 30 -> /dev/ptmx
l-wx------ 1 root root 64 Oct 15 10:39 31 -> /var/log/upstart/mysql.log.1 (deleted)
lrwx------ 1 root root 64 Oct 15 10:39 34 -> /dev/ptmx
lrwx------ 1 root root 64 Oct 15 10:39 36 -> /dev/ptmx
l-wx------ 1 root root 64 Oct 15 10:39 4 -> pipe:[8403]
lr-x------ 1 root root 64 Oct 15 10:39 5 -> anon_inode:inotify
lr-x------ 1 root root 64 Oct 15 10:39 6 -> anon_inode:inotify
lrwx------ 1 root root 64 Oct 15 10:39 7 -> socket:[8404]
lrwx------ 1 root root 64 Oct 15 10:39 9 -> socket:[12675]
root@test:/home/juan#
We can see how the file descriptors are represented as symbolic links. A brief summary of the output:
  • The default file descriptors (0, 1 and 2) point to /dev/null, which is OK for a process like init that isn't interactive.
  • There are a few UNIX sockets "socket:[XXXX]" open (7, 9, 10, 11, etc.), probably used to talk to other processes.
  • There's also a pipe "pipe:[8403]" using two fds (3 and 4); that's normal, a pipe provides one fd to read and one to write while the data is buffered in the kernel.
  • The rest of the fds point to:
    • /dev/ptmx, the pseudo-terminal master device.
    • anon_inode:inotify, a mechanism to monitor changes on files; this means init is watching for file events through two inotify instances.
    • Some deleted log files like /var/log/upstart/mysql.log.1, which looks odd. The files were probably rotated while init still had them open (see the quick check right after this list).
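By the way, these "open but deleted" files are easy to hunt down system-wide; a minimal sketch (the lsof +L1 variant simply selects files whose link count has dropped to 0):
# open-but-deleted files via /proc (the path tells you the PID and the fd number)
find /proc/[0-9]*/fd -type l -ls 2>/dev/null | grep '(deleted)'
# or let lsof do the work
lsof +L1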
If for some reason these details weren't enough, you can go hardcore and try the second way: lsof (list open files, makes sense, right?) is the program you need for that.

Let's list all the open files for a particular process using lsof, init in this case:
root@test:/home/juan# lsof -p 1
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/112/gvfs
      Output information may be incomplete.
COMMAND PID USER   FD   TYPE             DEVICE SIZE/OFF   NODE NAME
init      1 root  cwd    DIR                8,1     4096      2 /
init      1 root  rtd    DIR                8,1     4096      2 /
init      1 root  txt    REG                8,1   265848 261189 /sbin/init
init      1 root  mem    REG                8,1    43616 581960 /lib/x86_64-linux-gnu/libnss_files-2.19.so
init      1 root  mem    REG                8,1    47760 555508 /lib/x86_64-linux-gnu/libnss_nis-2.19.so
init      1 root  mem    REG                8,1    97296 555504 /lib/x86_64-linux-gnu/libnsl-2.19.so
init      1 root  mem    REG                8,1    39824 555503 /lib/x86_64-linux-gnu/libnss_compat-2.19.so
init      1 root  mem    REG                8,1    14664 555500 /lib/x86_64-linux-gnu/libdl-2.19.so
init      1 root  mem    REG                8,1   252032 540246 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
init      1 root  mem    REG                8,1   141574 555505 /lib/x86_64-linux-gnu/libpthread-2.19.so
init      1 root  mem    REG                8,1  1840928 581957 /lib/x86_64-linux-gnu/libc-2.19.so
init      1 root  mem    REG                8,1    31792 581956 /lib/x86_64-linux-gnu/librt-2.19.so
init      1 root  mem    REG                8,1    43464 527349 /lib/x86_64-linux-gnu/libjson-c.so.2.0.0
init      1 root  mem    REG                8,1   134296 527439 /lib/x86_64-linux-gnu/libselinux.so.1
init      1 root  mem    REG                8,1   281552 527323 /lib/x86_64-linux-gnu/libdbus-1.so.3.7.6
init      1 root  mem    REG                8,1    38920 527371 /lib/x86_64-linux-gnu/libnih-dbus.so.1.0.0
init      1 root  mem    REG                8,1    96280 527373 /lib/x86_64-linux-gnu/libnih.so.1.0.0
init      1 root  mem    REG                8,1   149120 555506 /lib/x86_64-linux-gnu/ld-2.19.so
init      1 root    0u   CHR                1,3      0t0   1029 /dev/null
init      1 root    1u   CHR                1,3      0t0   1029 /dev/null
init      1 root    2u   CHR                1,3      0t0   1029 /dev/null
init      1 root    3r  FIFO                0,9      0t0   8403 pipe
init      1 root    4w  FIFO                0,9      0t0   8403 pipe
init      1 root    5r  0000               0,10        0   7661 anon_inode
init      1 root    6r  0000               0,10        0   7661 anon_inode
init      1 root    7u  unix 0xffff8800b37c8780      0t0   8404 @/com/ubuntu/upstart
init      1 root    9u  unix 0xffff8800a14e7c00      0t0  12675 @/com/ubuntu/upstart
init      1 root   10u  unix 0xffff8800b37c9a40      0t0   8662 @/com/ubuntu/upstart
init      1 root   11u  unix 0xffff8800b37a1e00      0t0   9485 @/com/ubuntu/upstart
init      1 root   12w   REG                8,1      283 551619 /var/log/upstart/network-manager.log.1 (deleted)
init      1 root   14u  unix 0xffff8800b37a3c00      0t0  10329 @/com/ubuntu/upstart
init      1 root   16w   REG                8,1      451 522345 /var/log/upstart/systemd-logind.log.1 (deleted)
init      1 root   17u  unix 0xffff8800b37cb0c0      0t0   8637 socket
init      1 root   18u   CHR                5,2      0t0   1932 /dev/ptmx
init      1 root   20u   CHR                5,2      0t0   1932 /dev/ptmx
init      1 root   22u   CHR                5,2      0t0   1932 /dev/ptmx
init      1 root   24w   REG                8,1      502 527289 /var/log/upstart/modemmanager.log.1 (deleted)
init      1 root   29u   CHR                5,2      0t0   1932 /dev/ptmx
init      1 root   30u   CHR                5,2      0t0   1932 /dev/ptmx
init      1 root   31w   REG                8,1      881 552236 /var/log/upstart/mysql.log.1 (deleted)
init      1 root   34u   CHR                5,2      0t0   1932 /dev/ptmx
init      1 root   36u   CHR                5,2      0t0   1932 /dev/ptmx
root@test:/home/juan#
Now we can see a few more things, like:
  • Details of each FD, like its type, the device it belongs to, etc.
  • We can also see some entries that aren't really open fds but extra process information:
    • cwd: the current working directory
    • rtd: the root directory
    • txt: init's own binary (the program text)
    • mem: memory-mapped files, in this case a bunch of system libraries. YES, these had an fd when they were mapped, but the fd was closed right after the mmap call succeeded (you can see that by tracing the process with strace, as in the sketch after this list).
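A minimal strace sketch to watch that open/mmap/close dance yourself (cat is just a convenient victim here; depending on your libc the call may show up as open() or openat()):
# every shared library is opened, mapped into memory, and its fd closed again
strace -e trace=open,openat,mmap,close cat /dev/null 2>&1 | head -n 30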
OK, now we know what a file descriptor (or file handle) is, and how to identify them and map them to our processes. Is there any system-wide limit on the number of file descriptors you can open?
 

Max open files, system wide:


If the answer to the previous question were no, I wouldn't have had a reason to write this article in the first place I guess xD, therefore the answer is YES :P. The kernel is cool, and in order to play it safe it has to set limits (I sound like a father now...) to avoid bigger problems.

The maximum number of open files the kernel can handle can be obtained from our beloved /proc, particularly from the file-nr file under the sys/fs directory. Here we can see the numbers for my current system:
root@test:/home/juan# cat /proc/sys/fs/file-nr
2944 0 298505
root@test:/home/juan#
These values mean the following:
  • The first value (2944) indicates the number of allocated file descriptors; these are allocated dynamically by the kernel.
  • The second value (0) is the number of allocated but unused file descriptors. Since kernel 2.6.something, any unused fds are freed, so this value should always be 0.
  • The third value (298505) indicates the maximum number of file descriptors the kernel can allocate (also visible in the file-max file).
Summing up, there are 2944 file descriptors allocated and in use at this precise moment. Almost 3k file descriptors allocated for about 142 processes, interesting, right?
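If you want to cross-check that number against /proc, a rough sketch is to sum the fd entries of every process (run it as root so you can read them all). The two figures won't match exactly, since several descriptors can share the same underlying file handle and the counters move while you look, but they should be in the same ballpark:
# total fd entries across all processes vs. the kernel's own counter
for p in /proc/[0-9]*; do ls "$p/fd" 2>/dev/null; done | wc -l
cat /proc/sys/fs/file-nr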

Just for the sake of it, let's track down the processes using the highest number of file descriptors:
root@test:/home/juan# for i in `ps -Ao pid|grep -v PID`;do count=`ls /proc/$i/fd/ 2> /dev/null|wc -l`; echo "$count $i";done | sort -nr | head
61 1075
48 393
32 1264
31 1365
27 1132
24 1
21 440
20 1265
19 1325
19 1311
root@test:/home/juan#
There we see the top ten (the first column is the number of fds and the second one is the PID). Interestingly enough, there's a process using 61 file descriptors; turns out I had mysqld installed on this VM (had no idea...):
root@test:/home/juan# ps aux|grep 1075
mysql     1075  0.1  1.9 624040 57792 ?        Ssl  10:38   0:07 /usr/sbin/mysqld
root      9664  0.0  0.0  15948  2232 pts/1    S+   12:35   0:00 grep --color=auto 1075
root@test:/home/juan#
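If you're curious about what mysqld is actually holding open, a quick sketch is to group the fd link targets (the PID comes from the listing above and will obviously differ on your system):
# group mysqld's open fds by target: data files, sockets, logs, etc.
for f in /proc/1075/fd/*; do readlink "$f"; done | sort | uniq -c | sort -rn | head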
 

Increasing the limit


If by any chance the almost 300k file descriptors the kernel allows are not enough (some busy systems may reach that limit), you will notice messages like "VFS: file-max limit reached" in dmesg and probably in the messages or syslog files. In that case, you can increase that limit in one of the following ways:
  • The iminproductionpainchangeitrightnowgoddamnit way: just update /proc with the new value, like:
root@test:/home/juan# echo 400000 > /proc/sys/fs/file-max
root@test:/home/juan# cat /proc/sys/fs/file-nr
3072    0       400000
root@test:/home/juan#
  • Or you can be more elegant and use the sysctl command:
root@test:/home/juan# sysctl -w fs.file-max=500000
fs.file-max = 500000
root@test:/home/juan# cat /proc/sys/fs/file-nr
3072    0       500000
root@test:/home/juan#

In any case, don't forget to make the change persistent by doing:
root@test:/home/juan# echo "fs.file-max=500000" >> /etc/sysctl.conf
root@test:/home/juan#
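If you want the running kernel to pick up whatever is in /etc/sysctl.conf right away (instead of waiting for the next boot), sysctl can re-read the file for you; it prints and applies every setting it finds there:
# reload /etc/sysctl.conf
sysctl -p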

If the error showing up in your logs is instead "Too many open files", then the limit you've hit is most likely the ulimit for the user :D, which you know how to deal with because you've read Linux limits 101 - Ulimit.
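A rough sketch to tell which of the two limits you're staring at (the PID placeholder is whatever process is throwing the error):
ulimit -n                                   # per-process limit for the current shell/user
grep 'Max open files' /proc/<PID>/limits    # limits of an already running process
cat /proc/sys/fs/file-nr                    # system-wide usage vs. file-max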

And that's about it!