Thursday, September 22, 2016

Hard links and Soft links - Do you really understand them?

I've noticed the Hard/Soft link concept seems to be a bit confusing for some people; some don't even know Hard links exist. Well, I must confess that I didn't quite get them at the beginning either, however after playing around a bit and breaking stuff you get to understand (and love) them. Therefore this post is intended to explain them in detail, covering their differences and the reasons behind them.

Test scenario


The test scenario is pretty simple:
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 17:45 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 17:45 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
juan@test:~/hard_soft$
just a folder with 1 file that we are going to use as the target for the links. The content of the file is just the string "File1 content\n", so 14 bytes. We can see the inode number is 457337 -> File1 (every file has an inode number). We also see a number 1 in the third column, that is the number of "Hard Links" the inode has, we'll see this in more detail later on :D.

We can also get similar information (and more) by using the stat program:
juan@test:~/hard_soft$ stat File1
File: ‘File1’
  Size: 14              Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d      Inode: 457337      Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/    juan)   Gid: ( 1000/    juan)
Access: 2016-09-21 17:45:43.744729861 +0100
Modify: 2016-09-21 17:45:43.744729861 +0100
Change: 2016-09-21 17:45:43.744729861 +0100
 Birth: -
juan@test:~/hard_soft$ 
so this is the file we'll use to play with.
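By the way, this same information is available from C through the stat syscall; here's a minimal sketch of mine (not part of the original scenario) that prints the three fields we just looked at:
#include <sys/stat.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
        struct stat sb;

        if(argc != 2)
        {
                fprintf(stderr,"usage: %s <file>\n",argv[0]);
                return 1;
        }
        if(stat(argv[1],&sb) == -1)
        {
                perror("stat");
                return 1;
        }
        //st_ino, st_nlink and st_size are the fields ls -lai and stat showed us
        printf("Inode: %lu  Links: %lu  Size: %ld\n",
               (unsigned long)sb.st_ino,
               (unsigned long)sb.st_nlink,
               (long)sb.st_size);
        return 0;
}
running it against File1 should print the same inode 457337 with 1 link.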

Soft links aka Symbolic links


Let's take a look at the soft ones first. Long story short, creating a soft link means creating a new file that points to a target file, so you end up having 2 files, yeahp, trust me! You can see that happening here:
juan@test:~/hard_soft$ ln -s File1 LinkToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 18:06 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 17:45 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
457342 lrwxrwxrwx  1 juan juan    5 Sep 21 18:06 LinkToFile1 -> File1
juan@test:~/hard_soft$
I just used "ln -s" to create a soft link called LinkToFile1 that points to File1. The particular syscall in use here is symlink, we can see that here:
juan@test:~/hard_soft$ strace ln -s File1 Link1ToFile1
execve("/bin/ln", ["ln", "-s", "File1", "Link1ToFile1"], [/* 22 vars */]) = 
brk(0)                                  = 0x13e0000
...
stat("Link1ToFile1", 0x7ffc1f9a1470)    = -1 ENOENT (No such file or directory)
symlink("File1", "Link1ToFile1")        = 0
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++
juan@test:~/hard_soft$
ln first checks if there's a file already called Link1ToFile1 and since there isn't any, it moves forward and creates the symbolic link with that name pointing to file File1.
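If you wanted to skip ln and do this straight from C, it boils down to that single symlink call; a minimal sketch (the file names are just the ones from this example):
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        //same effect as: ln -s File1 Link1ToFile1
        if(symlink("File1","Link1ToFile1") == -1)
        {
                perror("symlink");
                return 1;
        }
        return 0;
}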

Taking a look at the "ls" output:
  • File LinkToFile1 has its own inode number 457342, therefore it is an independent file. Also the 1 in the third column suggests there's a single link to the inode.
  • Before the permissions we have an "l", which indicates this is not a regular file but a symbolic link.
  • The file size is different, isn't that funny? just 5 bytes! Why?
  • Permissions are kind of open, right? Yeahp, that's normal for soft links, the permissions of the target file are the ones that matter.
Regarding the size of LinkToFile1, what are these 5 bytes?
juan@test:~/hard_soft$ cat LinkToFile1
File1 content
juan@test:~/hard_soft$
oops... of course, doing "cat LinkToFile1" is in the end doing cat to File1! So how can we actually read the content of LinkToFile1? Let's see if strace can help here (wanna know more about strace? take a look at this post):
juan@test:~/hard_soft$ strace cat LinkToFile1
execve("/bin/cat", ["cat", "LinkToFile1"], [/* 22 vars */]) = 0
brk(0)                                  = 0x149f000
...
open("LinkToFile1", O_RDONLY)           = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=14, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "File1 content\n", 65536)       = 14
write(1, "File1 content\n", 14File1 content
)         = 14
read(3, "", 65536)                      = 0
close(3)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++
juan@test:~/hard_soft$
turns out that by default the open syscall recognizes the file as a soft link and follows it (you can avoid this with certain flags, O_NOFOLLOW for example). In the end, the returned FD 3 actually points to File1 and that's why the read syscall returns "File1 content\n". So how can we actually retrieve the content of LinkToFile1 itself? Well, we can use the readlink program (which actually uses the readlink syscall xD) to read the content of a symbolic link, just like this:
juan@test:~/hard_soft$ readlink LinkToFile1
File1
juan@test:~/hard_soft$
Yes :D, the content of LinkToFile1 is "File1", the name of the file (the relative path, actually), and that's why the size is 5 bytes!!! But if the content of LinkToFile1 is a path to File1, what happens if I move File1 somewhere else?
Lets have a look:
juan@test:~/hard_soft$ mv File1 ..
juan@test:~/hard_soft$ ls -lai
total 12
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 18:44 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:44 ..
457342 lrwxrwxrwx  1 juan juan    5 Sep 21 18:06 LinkToFile1 -> File1
juan@test:~/hard_soft$ readlink LinkToFile1
File1
juan@test:~/hard_soft$ cat LinkToFile1
cat: LinkToFile1: No such file or directory
juan@test:~/hard_soft$
exactly, the link breaks and it doesn't work anymore! The same thing happens if we remove the target file, or if we move LinkToFile1 instead:
juan@test:~/hard_soft$ ls
File1  File2  LinkToFile1
juan@test:~/hard_soft$ mv LinkToFile1 ../
juan@test:~/hard_soft$ ll -i ../LinkToFile1
457342 lrwxrwxrwx 1 juan juan 5 Sep 21 18:06 ../LinkToFile1 -> File1
juan@test:~/hard_soft$ cat ../LinkToFile1
cat: ../LinkToFile1: No such file or directory
juan@test:~/hard_soft$
We could work around the moving-the-link issue by using the full path to File1 as the target when creating the link, instead of the relative one:
juan@test:~/hard_soft$ ln -s /home/juan/hard_soft/File1 Link2ToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 18:50 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
457343 lrwxrwxrwx  1 juan juan   26 Sep 21 18:50 Link2ToFile1 -> /home/juan/hard_soft/File1
457342 lrwxrwxrwx  1 juan juan    5 Sep 21 18:06 LinkToFile1 -> File1
juan@test:~/hard_soft$ readlink Link2ToFile1
/home/juan/hard_soft/File1
juan@test:~/hard_soft$
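And if you'd rather call the readlink syscall yourself instead of using the readlink program, here's a small sketch of mine; one gotcha worth a comment: readlink does not NUL-terminate the buffer, you have to do it yourself:
#include <stdio.h>
#include <unistd.h>
#include <limits.h>

int main(int argc, char *argv[])
{
        char target[PATH_MAX];
        ssize_t len;

        if(argc != 2)
        {
                fprintf(stderr,"usage: %s <symlink>\n",argv[0]);
                return 1;
        }
        len=readlink(argv[1],target,sizeof(target)-1);
        if(len == -1)
        {
                perror("readlink");
                return 1;
        }
        target[len]='\0';//readlink does NOT add the trailing '\0'
        printf("%s -> %s (%zd bytes)\n",argv[1],target,len);
        return 0;
}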
What if I delete a Soft link? Since soft links are just files, deleting one of them will just make it go away and nothing will happen to the target/linked file:
juan@test:~/hard_soft$ rm Link*
juan@test:~/hard_soft$ ls -lai
total 12
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 19:14 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
juan@test:~/hard_soft$
What actually happens when you delete a file is that the number of links in the inode is decreased by one (and the entry gets removed from the directory). Once the number of links reaches 0, the file is officially gone (unless there's a running process that still has an FD pointing to it).
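You can actually see that "an open FD keeps the file alive" behaviour from C. The sketch below (careful, it really unlinks File1, so try it against a copy) opens the file, unlinks it, and then happily reads from the now nameless inode:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        char buf[64];
        ssize_t n;
        int fd=open("File1",O_RDONLY);

        if(fd == -1)
        {
                perror("open");
                return 1;
        }
        if(unlink("File1") == -1)//link count drops to 0, but we still hold an FD
        {
                perror("unlink");
                return 1;
        }
        n=read(fd,buf,sizeof(buf)-1);//the data blocks are still there
        if(n > 0)
        {
                buf[n]='\0';
                printf("still readable after unlink: %s",buf);
        }
        close(fd);//now the inode is really released
        return 0;
}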

So... Soft links are just files whose content is the path to the targeted/linked file. Makes perfect sense now, right?!

Hard links


The story is a bit different with Hard Links. When you create one you DO NOT get an extra file, nope you don't. Creating a hard link increases the number of links for a particular inode, let's see an example:
juan@test:~/hard_soft$ ln File1 Link1ToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 19:24 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 Link1ToFile1
juan@test:~/hard_soft$
Note: the syscall in play here is link or linkat.

Interesting! looks like 2 files, but:
  • They both have the same inode number! Therefore they are the same file. Directory wise there are indeed two entries, one for File1 and one for Link1ToFile1, but both point to the same inode 457337.
  • Permissions are the same, and that makes sense because it is the same file (xD you are getting my point, right?), and that also applies to the rest of the properties, MAC times for example.
Does that mean that if I add a third hard link I'll get 3 as the number of links for inode 457337? Yeahp, that's correct:
juan@test:~/hard_soft$ ln File1 Link2ToFile1
juan@test:~/hard_soft$ ls -lai
total 20
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 19:56 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link1ToFile1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link2ToFile1
juan@test:~/hard_soft$
The good thing about hard links is that you can move them around and they just keep working:
juan@test:~/hard_soft$ mv Link1ToFile1 ../
juan@test:~/hard_soft$ cat ../Link1ToFile1
File1 content
juan@test:~/hard_soft$
that's because by moving the hard link, the directory entry in the hard_soft directory was removed and a corresponding one was created in the parent directory (/home/juan), so accessing the link keeps working. Did this change the number of links on inode 457337?
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 20:01 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 20:01 ..
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link2ToFile1
juan@test:~/hard_soft$ ls -lai ../Link1ToFile1
457337 -rw-rw-r-- 3 juan juan 14 Sep 21 17:45 ../Link1ToFile1
juan@test:~/hard_soft$
of course not, inode 457337 still has 3 links.

Then what if I delete a hard link? As we mentioned before, deleting a file decreases the link counter on the inode by one, therefore if we have 3 hard links and we delete one of them we'll be back to 2, as you can see here:
juan@test:~/hard_soft$ ls -lai
total 20
457232 drwxrwxr-x  2 juan juan 4096 Sep 22 19:52 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 22 19:51 ..
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link1ToFile1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link2ToFile1
juan@test:~/hard_soft$ rm Link2ToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 22 19:53 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 22 19:51 ..
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 Link1ToFile1
juan@test:~/hard_soft$
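For completeness, here's what ln does under the hood, reduced to a minimal sketch: one link call, then stat to confirm the link counter went up (the names are just the ones from this example):
#include <sys/stat.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        struct stat sb;

        //same effect as: ln File1 Link1ToFile1
        if(link("File1","Link1ToFile1") == -1)
        {
                perror("link");
                return 1;
        }
        if(stat("File1",&sb) == -1)
        {
                perror("stat");
                return 1;
        }
        //both names now resolve to the same inode, with one more link
        printf("inode %lu now has %lu links\n",
               (unsigned long)sb.st_ino,
               (unsigned long)sb.st_nlink);
        return 0;
}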

Summary


To wrap up the idea, I'll summarize the most important points here:

  • A soft link is a separate file (of link type) whose content is the location of the targeted file.
  • You can have a soft link pointing to a file in a different partition, which is NOT possible with hard links.
  • Hard links don't require extra space or inodes, and they can be moved around (in the same partition) and will keep working fine.
  • Every time you create a file, a hard link is created and that's link 1 in the inode :D.

Monday, September 5, 2016

Linux limits 101 - Ulimit

Resources aren't infinite, and that's old news, right? We are usually worried about disk space and memory utilization, however these are far from being the only resources on a Linux system you should worry about. A few months ago I wrote an entry about cgroups and I mentioned they are a way to limit/assign resources to processes; this time we'll see a different kind of restriction that can also cause some pain in production.

Ulimit - User Limits


User limits are restrictions enforced on processes spawned from your shell, and they are placed in order to keep users under control, somehow. For every resource that is tracked (by the kernel) there are 2 limits, a soft limit and a hard limit. While the hard limit can only be raised by a privileged process (an unprivileged process can only lower it, irreversibly), the soft limit can be raised by any process, up to the hard limit if necessary (more details here).

We can see the User limits by using the bash builtin command ulimit, for example we can see the soft limits with ulimit -a:
juan@test:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11664
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 11664
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
juan@test:~$
and we can see the hard limits with -aH:
juan@test:~$ ulimit -aH
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 11664
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 11664
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
juan@test:~$
There are two different situations here:
  • Some resources like "open files" and "core file size" have a soft limit lower than the hard limit, which means the process itself can raise the soft limit if necessary.
  • Other resources like "max memory size" and "max user processes" have the same value for both soft and hard limits, which means an unprivileged user can only decrease them.
These limits are inherited by child processes after a fork call and they are maintained across an execve call.
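If you want to read these numbers from inside a program instead of via the ulimit builtin, getrlimit is the syscall to use (we'll lean on it again in the bigger example further down); here's a minimal sketch for the open files limit:
#include <sys/resource.h>
#include <stdio.h>

int main(void)
{
        struct rlimit rl;

        if(getrlimit(RLIMIT_NOFILE,&rl) == -1)
        {
                perror("getrlimit");
                return 1;
        }
        //rlim_cur is the soft limit (ulimit -n), rlim_max the hard one (ulimit -Hn)
        printf("open files: soft=%llu hard=%llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
        return 0;
}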

Using the ulimit command, you can also update the values; for example, if we believe the maximum number of open files per process (1024) is not enough, we can go ahead and raise the soft limit with -S, like this:
juan@test:~$ ulimit -n
1024
juan@test:~$ ulimit -S -n 2048
juan@test:~$ ulimit -n
2048
juan@test:~$
now the process (our shell in this case) can open up to 2048 files. If we spawn a new process out of this shell we'll see the limit is still there:
juan@test:~$ /bin/bash
juan@test:~$ ulimit -n
2048
juan@test:~$ exit
exit
juan@test:~$
Using -H we can decrease (or increase, if it's a privileged process) the hard limit for a particular resource, but be careful, you can't increase it back!!!
juan@test:~$ ulimit -H -n 1027
juan@test:~$ ulimit -Hn
1027
juan@test:~$ ulimit -H -n 1028
-bash: ulimit: open files: cannot modify limit: Operation not permitted
juan@test:~$
at this point we have decreased the hard limit from 4096 to 1027, so if we want to open more than 1027 files with this particular process we won't be able to.
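The same one-way door exists in C through setrlimit; this sketch of mine lowers the hard limit and then tries (and fails, with Operation not permitted, unless privileged) to raise it back:
#include <sys/resource.h>
#include <stdio.h>

int main(void)
{
        struct rlimit rl;

        if(getrlimit(RLIMIT_NOFILE,&rl) == -1)
        {
                perror("getrlimit");
                return 1;
        }
        rl.rlim_max=1027;//drop the hard limit, like ulimit -H -n 1027
        if(rl.rlim_cur > rl.rlim_max)
                rl.rlim_cur=rl.rlim_max;//the soft limit can never exceed the hard one
        if(setrlimit(RLIMIT_NOFILE,&rl) == -1)
        {
                perror("setrlimit (lowering)");
                return 1;
        }
        rl.rlim_max=4096;//now try to raise it back...
        if(setrlimit(RLIMIT_NOFILE,&rl) == -1)
                perror("setrlimit (raising)");//EPERM for an unprivileged process
        return 0;
}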
All these changes we've made to the soft and hard limits only persist as long as the shell is alive; if we close that shell and open a new one, the default limits come back into play. So how the heck do I get them to be persistent?

File /etc/security/limits.conf


This is the file used by the pam_limits module to enforce ulimit limits on all user sessions on the system. Just by reading the comments in the file you will be able to understand its syntax; for more, check here. I could easily change the default ulimits for user juan by adding, for example:
juan               soft           nofile             2048
this would increase the soft limit for the number of files a process can open. The change will take effect for the next session, not for the current one.
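The syntax is always domain, type, item, value; so, for example, to also raise the hard limit for juan (the value here is just an example of mine):
juan               hard           nofile             4096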

C examples


Just for the sake of fun, I wrote a small C program that will try to open 2048 files, and will abort if it doesn't succeed. The first program, open_files.c, is here:
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#define SIZE 2048

int main()
{
        int open_files[SIZE];
        int i,keep_it;

        for(i=0;i<SIZE;i++)
        {
                printf("Opening file number %d:\n",i);
                open_files[i]=open("/etc/passwd",O_RDONLY);
                keep_it=errno;//we save errno before doing anything else
                if(open_files[i] == -1)
                {
                        printf("%s\n",strerror(keep_it));//we print the system error that corresponds to errno
                        return open_files[i];
                }
                printf("Opened file number %d, assigned FD=%d:\n",i,open_files[i]);
        }
        printf("%d files have been opened.\n",SIZE);

        return 0;
}
if you compile and run it you should see something like:
juan@test:~/ulimit$ ./open_files
Opening file number 0:
Opened file number 0, assigned FD=3:
Opening file number 1:
Opened file number 1, assigned FD=4:
Opening file number 2:
Opened file number 2, assigned FD=5:
Opening file number 3:
Opened file number 3, assigned FD=6:
Opening file number 4:
Opened file number 4, assigned FD=7:
...
Opening file number 1018:
Opened file number 1018, assigned FD=1021:
Opening file number 1019:
Opened file number 1019, assigned FD=1022:
Opening file number 1020:
Opened file number 1020, assigned FD=1023:
Opening file number 1021:
Too many open files
juan@test:~/ulimit$
a few things to take from the previous run:
  • The first file descriptor returned by the open syscall is 3. Why is that? :D exactly, because FD 0 is STDIN, FD 1 is STDOUT and FD 2 is STDERR, so the first available file descriptor for a new process is 3.
  • As soon as the process tries to open file number 1021 the open call returns -1 and sets errno to EMFILE ("Too many open files"). This is because the maximum number of open files (1024, counting FDs 0, 1 and 2 plus the 1021 files already open) has been reached.
How could we address this? Well, the easiest way would be changing the soft limit before running the program, but that would allow all newly spawned processes to open 2048 files and we might not want that side effect. So let's change the soft limit inside the C program:
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#define SIZE 2048
#define JUMP 100

int main()
{
        int open_files[SIZE];
        int i,keep_it,aux;
        struct rlimit old, new;

        for(i=0;i<SIZE;i++)
        {
                printf("Opening file number %d:\n",i);
                open_files[i]=open("/etc/passwd",O_RDONLY);
                keep_it=errno;//we save errno before doing anything else
                if(open_files[i] == -1)
                {
                        if(keep_it == EMFILE)//EMFILE (24): Too many open files
                        {
                                printf("%s\n",strerror(keep_it));//we print the system error that corresponds to errno
                                printf("Increasing NOFILE in %d\n",JUMP);
                                getrlimit(RLIMIT_NOFILE,&old);
                                printf("Current soft limit %d, current hard limit %d\n",(int)old.rlim_cur,(int)old.rlim_max);
                                new.rlim_max=old.rlim_max;
                                new.rlim_cur=old.rlim_cur+JUMP;
                                aux=setrlimit(RLIMIT_NOFILE,&new);
                                keep_it=errno;
                                if(aux==0)
                                {
                                        i=i-1;//decrease i by 1 to retry this iteration
                                }
                                else
                                {
                                        printf("Couldn't raise the soft limit: %s\n",strerror(keep_it));
                                        return -1;
                                }
                        }
                        else
                        {//some different error
                                return -1;
                        }
                }
                else
                {
                        printf("Opened file number %d, assigned FD=%d:\n",i,open_files[i]);
                }
        }
        printf("%d files have been opened.\n",SIZE);

        return 0;
}

The example gets the current soft and hard limits using the getrlimit syscall and then updates the soft limit using setrlimit. Two rlimit structures were added to the code, old and new, in order to update the limit. The update is done by adding JUMP (100 in this case) to the current soft limit. The rest of the code is pretty much the same :D.

If we run the new code we'll see something like:
juan@test:~/ulimit$ ./open_files_increase_soft
Opening file number 0:
Opened file number 0, assigned FD=3:
Opening file number 1:
Opened file number 1, assigned FD=4:
Opening file number 2:
Opened file number 2, assigned FD=5:
Opening file number 3:
Opened file number 3, assigned FD=6:
Opening file number 4:
Opened file number 4, assigned FD=7:
...
Opening file number 1019:
Opened file number 1019, assigned FD=1022:
Opening file number 1020:
Opened file number 1020, assigned FD=1023:
Opening file number 1021:
Too many open files
Increasing NOFILE in 100
Current soft limit 1024, current hard limit 4096
Opening file number 1021:
Opened file number 1021, assigned FD=1024:
Opening file number 1022:
Opened file number 1022, assigned FD=1025:
...
Opened file number 2043, assigned FD=2046:
Opening file number 2044:
Opened file number 2044, assigned FD=2047:
Opening file number 2045:
Opened file number 2045, assigned FD=2048:
Opening file number 2046:
Opened file number 2046, assigned FD=2049:
Opening file number 2047:
Opened file number 2047, assigned FD=2050:
2048 files have been opened.
juan@test:~/ulimit$
now the process was able to open 2048 files by slowly increasing its soft limit on demand.

Wrapping up 


So whenever you are working with production systems you need to be aware of these limits, unless of course you enjoy getting paged randomly haha. I've seen production systems go unresponsive because of reaching these limits. Bear in mind that when we talk about open files we are talking about file descriptors, therefore this limit also applies to network connections, not just files! On top of that, if the application doesn't capture the error and show it in its logs, it can be pretty hard to spot...