Thursday, November 17, 2016

Where did my disk space go??? Is df lying to me, or is it du?

I had an interesting situation today and thought it might be a good idea to share it here. Basically, the problem was more or less the following:
  1. A critical production system was running out of disk space and things were going downhill scarily fast; df was already showing utilization at 99%.
  2. The user decided to delete some big log files in order to free up some space.
  3. After that, df was still showing 99%; however, running du on the root directory showed less space being used.
You can imagine how unhappy the user was when he realized that deleting files was not freeing any disk space and he was about to watch the system become totally useless. On top of that, he couldn't explain why df and du wouldn't agree on the amount of free space on the system.

What was happening here? Was df or du lying to the user?


Well... software in general doesn't lie; usually what actually happens is that we don't really understand how it works, and therefore we expect a different behavior. The root cause of the problem mentioned above is the unlink syscall. Why? We can get the answer by asking the oracle (man 2 unlink):
DESCRIPTION
       unlink() deletes a name from the filesystem.  If that name was the last
       link to a file and no processes have the file open, the file is deleted
       and the space it was using is made available for reuse.

       If  the  name  was the last link to a file but any processes still have
       the file open, the file will remain in existence until  the  last  file
       descriptor referring to it is closed.

       ...
Yes :D, it turns out that if you delete a file while there's at least one process holding a file descriptor that points to it, the file's space won't be made available right away.
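If you want to see this behavior in isolation, without a second process involved, the shell alone is enough: open a descriptor on a file, delete the file, and watch the space come back only when the descriptor is closed. A minimal sketch (the path and size here are just for illustration):

dd if=/dev/zero of=/tmp/big_file bs=1M count=512   # create a ~512MB file
exec 3< /tmp/big_file                              # keep it open on fd 3 of this shell
rm /tmp/big_file                                   # unlink the name: du stops counting it
df -h /tmp                                         # ...but df still reports the blocks as used
exec 3<&-                                          # close the descriptor
df -h /tmp                                         # now the space is actually released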

Let's reproduce this:


The easiest way to understand this is by actually reproducing the behavior. I just used dd to fill up the drive, and the result was this:
[ec2-user@ip-172-31-16-117 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G   56K  3.9G   1% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
/dev/xvda1      7.8G  7.7G     0 100% /
[ec2-user@ip-172-31-16-117 ~]$ du -sch / 2>/dev/null
7.7G    /
7.7G    total
[ec2-user@ip-172-31-16-117 ~]$ echo a > file
-bash: echo: write error: No space left on device
[ec2-user@ip-172-31-16-117 ~]$
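For the record, filling the drive took nothing fancier than a dd along these lines (the block size is illustrative, not the exact command I ran; dd simply keeps writing until it hits "No space left on device"):

dd if=/dev/zero of=/home/ec2-user/fillingup bs=1M   # write zeros until the filesystem is full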
The system has clearly run out of disk space at this point, and we can't even add a single byte to a file. The next step is to free up some space, so we check with du where we can find some files to delete:
[ec2-user@ip-172-31-16-117 ~]$ du -sch /* 2>/dev/null
7.0M    /bin
25M     /boot
4.0K    /cgroup
56K     /dev
9.1M    /etc
6.7G    /home
46M     /lib
23M     /lib64
4.0K    /local
16K     /lost+found
4.0K    /media
4.0K    /mnt
43M     /opt
0       /proc
4.0K    /root
8.0K    /run
12M     /sbin
4.0K    /selinux
4.0K    /srv
0       /sys
8.0K    /tmp
802M    /usr
48M     /var
7.7G    total
[ec2-user@ip-172-31-16-117 ~]$
The size of /home is a bit crazy considering the size of the partition itself, so let's delete some files from /home/ec2-user:
[ec2-user@ip-172-31-16-117 ~]$ ls -lah
total 6.7G
drwx------ 3 ec2-user ec2-user 4.0K Nov 17 20:10 .
drwxr-xr-x 3 root     root     4.0K Nov 17 18:42 ..
-rw-r--r-- 1 ec2-user ec2-user   18 Aug 15 23:52 .bash_logout
-rw-r--r-- 1 ec2-user ec2-user  193 Aug 15 23:52 .bash_profile
-rw-r--r-- 1 ec2-user ec2-user  124 Aug 15 23:52 .bashrc
-rw-rw-r-- 1 ec2-user ec2-user 306M Nov 17 20:09 big_file
-rw-rw-r-- 1 ec2-user ec2-user    0 Nov 17 20:10 file
-rw-rw-r-- 1 ec2-user ec2-user 6.4G Nov 17 18:44 fillingup
-rwxrwxr-x 1 ec2-user ec2-user 7.0K Nov 17 20:09 service
-rw-rw-r-- 1 ec2-user ec2-user  309 Nov 17 20:09 service.c
drwx------ 2 ec2-user ec2-user 4.0K Nov 17 18:42 .ssh
-rw------- 1 ec2-user ec2-user 2.1K Nov 17 20:09 .viminfo
[ec2-user@ip-172-31-16-117 ~]$ rm big_file
[ec2-user@ip-172-31-16-117 ~]$
[ec2-user@ip-172-31-16-117 ~]$ echo a > file
-bash: echo: write error: No space left on device
[ec2-user@ip-172-31-16-117 ~]$
I've just deleted a 306 MB file and still can't write a single byte!!! Taking a look at df and du, this is what things look like:
[ec2-user@ip-172-31-16-117 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G   56K  3.9G   1% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
/dev/xvda1      7.8G  7.7G     0 100% /
[ec2-user@ip-172-31-16-117 ~]$ du -sch / 2>/dev/null
7.4G    /
7.4G    total
[ec2-user@ip-172-31-16-117 ~]$
While du shows the expected size, df insists on saying the partition is at 100% utilization. Cool, with just a few steps we have reproduced the situation!

The reason behind the difference between df and du is that du calculates the size of a folder by adding up the sizes of the files inside it, and big_file no longer exists in the folder (its directory entry has been removed), while df reports the real block usage of the file system.
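To put it another way, du walks the directory tree and adds up block counts name by name, while df asks the filesystem itself for its block counters (statfs/statvfs under the hood), so df also accounts for blocks that belong to files that no longer have a name. A rough shell approximation of what each tool does (options and paths are just for illustration):

# roughly what du does: walk the tree and add up the blocks of every name it can still see
find / -xdev -printf '%b\n' 2>/dev/null | awk '{blocks += $1} END {print blocks * 512, "bytes"}'

# roughly what df does: ask the filesystem for its own counters
stat -f -c 'total blocks: %b  free blocks: %f  block size: %S' /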

With all this in mind, the only reason the file system usage wouldn't go down after deleting a file is that there's at least one process keeping a file descriptor that points to the deleted file (you can read more about open files and file descriptors in Linux limits 102 - Open Files). We can easily identify the process holding the file descriptor by simply using our lovely lsof:
[ec2-user@ip-172-31-16-117 ~]$ lsof |grep "\(deleted\)"
service   23177 ec2-user    3r      REG  202,1 319934464  14708 /home/ec2-user/big_file (deleted)
[ec2-user@ip-172-31-16-117 ~]$
Ha! It turns out that process "service" with PID 23177 has a file descriptor pointing to /home/ec2-user/big_file, and that file has been deleted!!! So if we get rid of the service process, the space should finally become available. Let's confirm that:
[ec2-user@ip-172-31-16-117 ~]$ kill -9 23177
[ec2-user@ip-172-31-16-117 ~]$ 
[1]+  Killed                  ./service
[ec2-user@ip-172-31-16-117 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G   56K  3.9G   1% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
/dev/xvda1      7.8G  7.4G  306M  97% /
[ec2-user@ip-172-31-16-117 ~]$ du -sch / 2>/dev/null
7.4G    /
7.4G    total
[ec2-user@ip-172-31-16-117 ~]$
And magic happened!!! When the process was terminated, the fd was finally closed and the last reference to the file disappeared, making the space available again, so df and du finally agree.
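As a side note, lsof is not the only way to spot these files: every open descriptor shows up as a symlink under /proc/<pid>/fd, and the kernel appends "(deleted)" to the link target once the file has been unlinked, so a quick find over /proc gives the same answer:

# list descriptors whose target has been unlinked (run as root to see every process)
sudo find /proc/[0-9]*/fd -type l -lname '*(deleted)' -printf '%p -> %l\n' 2>/dev/null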

Depending on which process is keeping the file open, you may have other options besides killing it, for example:
  • Restarting it if it's a service.
  • Perhaps sending another signal (for example to rotate logs? see the sketch after this list).
  • If for some reason (e.g., the process is stuck in an uninterruptible state) you can't kill it, then the only easy way out is to restart the whole system.
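As a minimal sketch of the signal option above (assuming the culprit is a logging daemon; whether this works depends entirely on the program): many log daemons, rsyslogd among them, reopen their output files when they receive SIGHUP, which closes the descriptor of the old, deleted log without a full restart:

# hypothetical example: ask the log daemon to reopen its files instead of killing it
kill -HUP "$(pidof rsyslogd)"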