Thursday, September 22, 2016

Hard links and Soft links - Do you really understand them?

I've noticed that the hard/soft link concept seems to confuse some people, and some of them don't even know hard links exist. I must confess I didn't quite get them at the beginning either; however, after playing with them a bit and breaking stuff, you get to understand (and love) them. This post is intended to explain them in detail: their differences and the reasons behind them.

Test scenario


The test scenario is pretty simple:
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 17:45 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 17:45 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
juan@test:~/hard_soft$
just a folder with 1 file that we are going to use as the target for the links. The content of the file is just the string "File1 content\n", so 14 bytes. We can see in the first column that the inode number of File1 is 457337 (every file has an inode number). We also see a 1 in the third column: that is the number of hard links the inode has. We'll look at this in more detail later on :D.

We can also get similar information (and more) by using the stat program:
juan@test:~/hard_soft$ stat File1
File: ‘File1’
  Size: 14              Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d      Inode: 457337      Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/    juan)   Gid: ( 1000/    juan)
Access: 2016-09-21 17:45:43.744729861 +0100
Modify: 2016-09-21 17:45:43.744729861 +0100
Change: 2016-09-21 17:45:43.744729861 +0100
 Birth: -
juan@test:~/hard_soft$ 
so this is the file we'll use to play.
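
If you'd rather poke at the same fields from code, here is a minimal Python sketch (my own illustration, assuming a Linux box; it builds a throwaway File1 in a temporary directory instead of touching ~/hard_soft). The fields st_ino, st_size and st_nlink are the Inode, Size and Links values that stat prints:

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "File1")
    # Same content as the post's test file: 14 bytes.
    with open(path, "wb") as f:
        f.write(b"File1 content\n")

    info = os.stat(path)
    file_size = info.st_size                  # "Size" in stat's output
    link_count = info.st_nlink                # "Links" in stat's output
    is_regular = stat.S_ISREG(info.st_mode)   # "regular file"

    print("inode:", info.st_ino)              # value differs on every run
    print("size:", file_size)
    print("links:", link_count)
    print("regular file:", is_regular)
```

The inode number will differ from the one in the transcripts, but a freshly created regular file should always report a single link.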

Soft links aka Symbolic links


Let's take a look at the soft ones first. Long story short, creating a soft link creates a new file that points to a target file, so you end up with 2 files, yeahp, trust me! You can see that happening here:
juan@test:~/hard_soft$ ln -s File1 LinkToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 18:06 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 17:45 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
457342 lrwxrwxrwx  1 juan juan    5 Sep 21 18:06 LinkToFile1 -> File1
juan@test:~/hard_soft$
I just used "ln -s" to create a soft link called LinkToFile1 that points to File1. The syscall in use here is symlink, as we can see by tracing the creation of another link, Link1ToFile1:
juan@test:~/hard_soft$ strace ln -s File1 Link1ToFile1
execve("/bin/ln", ["ln", "-s", "File1", "Link1ToFile1"], [/* 22 vars */]) = 
brk(0)                                  = 0x13e0000
...
stat("Link1ToFile1", 0x7ffc1f9a1470)    = -1 ENOENT (No such file or directory)
symlink("File1", "Link1ToFile1")        = 0
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++
juan@test:~/hard_soft$
ln first checks whether a file called Link1ToFile1 already exists and, since there isn't one, it moves forward and creates the symbolic link with that name pointing to File1.
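
The same symlink call is reachable from Python through os.symlink. The sketch below is only an illustration under the assumption of a Linux system and a temporary directory; it uses lstat (which describes the link itself) next to stat (which follows the link) to show that the link is a separate file of a different type:

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "File1")
    link = os.path.join(workdir, "LinkToFile1")
    with open(target, "wb") as f:
        f.write(b"File1 content\n")

    # Relative target, equivalent to: ln -s File1 LinkToFile1
    os.symlink("File1", link)

    link_info = os.lstat(link)     # metadata of the link itself
    followed_info = os.stat(link)  # metadata of File1, reached through the link

    link_is_symlink = stat.S_ISLNK(link_info.st_mode)
    followed_is_regular = stat.S_ISREG(followed_info.st_mode)
    different_inodes = link_info.st_ino != followed_info.st_ino

    print("link is a symlink:", link_is_symlink)
    print("target is a regular file:", followed_is_regular)
    print("link has its own inode:", different_inodes)
```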

Taking a look at the "ls" output:
  • File LinkToFile1 has its own inode number, 457342, therefore it is an independent file. Also, the 1 in the third column shows there's a single link to that inode.
  • Before the permissions we have an "l", which indicates this is not a regular file but a symbolic link.
  • The file size is different, isn't that funny? Just 5 bytes! Why?
  • Permissions are kind of open, right? Yeahp, that's normal for soft links; the permissions of the target file are the ones that matter.
Regarding the size of LinkToFile1, what are these 5 bytes?
juan@test:~/hard_soft$ cat LinkToFile1
File1 content
juan@test:~/hard_soft$
oops... of course, doing "cat LinkToFile1" is in the end doing cat on File1! So how can we actually read the content of LinkToFile1? Let's see if strace can help here (wanna know more about strace? take a look at this post):
juan@test:~/hard_soft$ strace cat LinkToFile1
execve("/bin/cat", ["cat", "LinkToFile1"], [/* 22 vars */]) = 0
brk(0)                                  = 0x149f000
...
open("LinkToFile1", O_RDONLY)           = 3
fstat(3, {st_mode=S_IFREG|0664, st_size=14, ...}) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "File1 content\n", 65536)       = 14
write(1, "File1 content\n", 14File1 content
)         = 14
read(3, "", 65536)                      = 0
close(3)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++
juan@test:~/hard_soft$
It turns out that by default the open syscall recognizes the file as a soft link and follows it (you can avoid this with flags such as O_NOFOLLOW). In the end, the returned FD 3 actually points to File1, and that's why the read syscall returns "File1 content\n". So how can we actually retrieve the content of LinkToFile1? Well, we can use the readlink program (which actually uses the readlink syscall xD) to read the content of a symbolic link, just like this:
juan@test:~/hard_soft$ readlink LinkToFile1
File1
juan@test:~/hard_soft$
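
Here is a small Python sketch of the three behaviours just described: a plain open follows the link, an open with O_NOFOLLOW refuses to, and readlink returns what the link stores. It is an illustration only, assuming Linux (where the refused open reports ELOOP) and a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "File1")
    link = os.path.join(workdir, "LinkToFile1")
    with open(target, "wb") as f:
        f.write(b"File1 content\n")
    os.symlink("File1", link)

    # Default behaviour: the link is followed, so we read File1's data.
    fd = os.open(link, os.O_RDONLY)
    followed_data = os.read(fd, 100)
    os.close(fd)

    # With O_NOFOLLOW the open fails when the last path component is a symlink.
    try:
        fd = os.open(link, os.O_RDONLY | os.O_NOFOLLOW)
        os.close(fd)
        nofollow_errno = None
    except OSError as exc:
        nofollow_errno = exc.errno  # ELOOP on Linux

    # readlink returns the path stored inside the link.
    link_content = os.readlink(link)

    print("read through the link:", followed_data)
    print("errno with O_NOFOLLOW:", nofollow_errno)
    print("readlink:", link_content)
```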
Yes :D, the content of LinkToFile1 is File1, the name of the file (the relative path, actually), and that's why the size is 5 bytes!!! But if the content of LinkToFile1 is a path to File1, what happens if I move File1 somewhere else?
Let's have a look:
juan@test:~/hard_soft$ mv File1 ..
juan@test:~/hard_soft$ ls -lai
total 12
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 18:44 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:44 ..
457342 lrwxrwxrwx  1 juan juan    5 Sep 21 18:06 LinkToFile1 -> File1
juan@test:~/hard_soft$ readlink LinkToFile1
File1
juan@test:~/hard_soft$ cat LinkToFile1
cat: LinkToFile1: No such file or directory
juan@test:~/hard_soft$
exactly, the link breaks and doesn't work anymore! The same thing happens if we remove the target file, or if we move LinkToFile1 instead (a relative target is resolved from the directory that contains the link):
juan@test:~/hard_soft$ ls
File1  File2  LinkToFile1
juan@test:~/hard_soft$ mv LinkToFile1 ../
juan@test:~/hard_soft$ ll -i ../LinkToFile1
457342 lrwxrwxrwx 1 juan juan 5 Sep 21 18:06 ../LinkToFile1 -> File1
juan@test:~/hard_soft$ cat ../LinkToFile1
cat: ../LinkToFile1: No such file or directory
juan@test:~/hard_soft$
We could work around the moved-link issue by using the full path to File1 as the target when creating the link, instead of the relative one:
juan@test:~/hard_soft$ ln -s /home/juan/hard_soft/File1 Link2ToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 18:50 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
457343 lrwxrwxrwx  1 juan juan   26 Sep 21 18:50 Link2ToFile1 -> /home/juan/hard_soft/File1
457342 lrwxrwxrwx  1 juan juan    5 Sep 21 18:06 LinkToFile1 -> File1
juan@test:~/hard_soft$ readlink Link2ToFile1
/home/juan/hard_soft/File1
juan@test:~/hard_soft$
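
To compare both kinds of target side by side, the following Python sketch (an illustration, assuming Linux; the directory layout is recreated inside a temporary directory) creates a relative and an absolute link, checks that each link's size is the length of the stored path, and then moves both links one directory up:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    workdir = os.path.join(parent, "hard_soft")
    os.mkdir(workdir)
    target = os.path.join(workdir, "File1")
    with open(target, "wb") as f:
        f.write(b"File1 content\n")

    rel_link = os.path.join(workdir, "LinkToFile1")
    abs_link = os.path.join(workdir, "Link2ToFile1")
    os.symlink("File1", rel_link)  # relative target
    os.symlink(target, abs_link)   # absolute target

    # The size of a symlink is the length of the path it stores.
    rel_size = os.lstat(rel_link).st_size
    abs_size = os.lstat(abs_link).st_size
    abs_path_length = len(os.fsencode(target))

    # Move both links to the parent directory (rename acts on the link itself).
    moved_rel = os.path.join(parent, "LinkToFile1")
    moved_abs = os.path.join(parent, "Link2ToFile1")
    os.rename(rel_link, moved_rel)
    os.rename(abs_link, moved_abs)

    # os.path.exists follows links, so it is False for a broken one.
    rel_works = os.path.exists(moved_rel)
    abs_works = os.path.exists(moved_abs)
    rel_still_a_link = os.path.islink(moved_rel)

    print("relative link still resolves:", rel_works)
    print("absolute link still resolves:", abs_works)
```

Keep in mind the absolute link only survives moving the link; if File1 itself is moved or removed, both links break.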
What if I delete a soft link? Since soft links are just files, deleting one of them just makes it go away and nothing happens to the target/linked file:
juan@test:~/hard_soft$ rm Link*
juan@test:~/hard_soft$ ls -lai
total 12
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 19:14 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  1 juan juan   14 Sep 21 17:45 File1
juan@test:~/hard_soft$
What actually happens when you delete a file is that the number of links in the inode is decreased by one (and the entry gets removed from the directory). Once the number of links reaches 0, the file is officially gone (unless a running process still holds an open FD on it, in which case the data stays around until that FD is closed).
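
The open-FD exception is easy to try out. This Python sketch (an illustration assuming a typical local Linux filesystem; network filesystems may behave differently) removes the only name of a file while a descriptor is still open and then keeps reading through it:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "File1")
    with open(path, "wb") as f:
        f.write(b"File1 content\n")

    fd = os.open(path, os.O_RDONLY)
    links_before = os.fstat(fd).st_nlink

    os.unlink(path)  # removes the directory entry and decrements the link count
    name_exists = os.path.exists(path)
    links_after = os.fstat(fd).st_nlink  # typically 0 on local Linux filesystems

    data = os.read(fd, 100)  # the data is still reachable through the open FD
    os.close(fd)             # only now can the kernel release the inode

    print("links before unlink:", links_before)
    print("name still exists:", name_exists)
    print("links after unlink:", links_after)
    print("data read after unlink:", data)
```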

So... soft links are just files whose content is the path to the targeted/linked file. Makes perfect sense now, right?!

Hard links


The story is a bit different with hard links. When you create one you DO NOT get an extra file, nope, you don't. Creating a hard link adds a new directory entry and increases the number of links of a particular inode. Let's see an example:
juan@test:~/hard_soft$ ln File1 Link1ToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 19:24 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 Link1ToFile1
juan@test:~/hard_soft$
Note: the syscall in play here is link or linkat.

Interesting! It looks like 2 files, but:
  • They both have the same inode number, therefore they are the same file. It looks like two files because, directory-wise, there are indeed two directory entries, one for File1 and one for Link1ToFile1, both pointing to that inode.
  • Permissions are the same, which makes sense because it is the same file (xD you are getting my point, right?), and the same applies to the rest of the properties, MAC times for example.
Does that mean that if I create a third hard link I'll get 3 as the number of links for inode 457337? Yeahp, that's correct:
juan@test:~/hard_soft$ ln File1 Link2ToFile1
juan@test:~/hard_soft$ ls -lai
total 20
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 19:56 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 18:50 ..
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link1ToFile1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link2ToFile1
juan@test:~/hard_soft$
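
The link counting above can be reproduced from Python with os.link, which wraps the link syscall. A minimal sketch, assuming Linux and a temporary directory; the final append is my own addition to show that every name reaches the same data:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    file1 = os.path.join(workdir, "File1")
    link1 = os.path.join(workdir, "Link1ToFile1")
    link2 = os.path.join(workdir, "Link2ToFile1")
    with open(file1, "wb") as f:
        f.write(b"File1 content\n")

    os.link(file1, link1)  # equivalent to: ln File1 Link1ToFile1
    count_after_first = os.stat(file1).st_nlink
    os.link(file1, link2)  # equivalent to: ln File1 Link2ToFile1
    count_after_second = os.stat(file1).st_nlink

    # All three names resolve to one and the same inode.
    inodes = {os.stat(p).st_ino for p in (file1, link1, link2)}
    same_inode = len(inodes) == 1
    same_file = os.path.samefile(file1, link2)

    # Appending through one name is visible through the others.
    with open(link1, "ab") as f:
        f.write(b"more\n")
    with open(file1, "rb") as f:
        content_via_file1 = f.read()

    print("links after first ln:", count_after_first)
    print("links after second ln:", count_after_second)
    print("same inode:", same_inode)
```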
The good thing about hard links is that you can move them around and they just keep working:
juan@test:~/hard_soft$ mv Link1ToFile1 ../
juan@test:~/hard_soft$ cat ../Link1ToFile1
File1 content
juan@test:~/hard_soft$
that's because by moving the hard link, the directory entry in the hard_soft directory was removed and a corresponding one was created in the parent directory (/home/juan), so accessing the link keeps working. Did this change the number of links on inode 457337?
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 21 20:01 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 21 20:01 ..
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link2ToFile1
juan@test:~/hard_soft$ ls -lai ../Link1ToFile1
457337 -rw-rw-r-- 3 juan juan 14 Sep 21 17:45 ../Link1ToFile1
juan@test:~/hard_soft$
of course not, inode 457337 still has 3 links.

Then what if I delete a hard link? As we mentioned before, deleting a file decreases the link counter on the inode by one; therefore, if we have 3 hard links and we delete one of them we'll be back to 2, as you can see here:
juan@test:~/hard_soft$ ls -lai
total 20
457232 drwxrwxr-x  2 juan juan 4096 Sep 22 19:52 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 22 19:51 ..
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link1ToFile1
457337 -rw-rw-r--  3 juan juan   14 Sep 21 17:45 Link2ToFile1
juan@test:~/hard_soft$ rm Link2ToFile1
juan@test:~/hard_soft$ ls -lai
total 16
457232 drwxrwxr-x  2 juan juan 4096 Sep 22 19:53 .
415465 drwxr-xr-x 39 juan juan 4096 Sep 22 19:51 ..
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 File1
457337 -rw-rw-r--  2 juan juan   14 Sep 21 17:45 Link1ToFile1
juan@test:~/hard_soft$
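
Moving and deleting hard links can be sketched in Python as well. This is an illustration only, assuming Linux and a temporary directory that stands in for /home/juan; the last step, removing the original name File1, goes one step further than the transcripts to show that no name is special:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    workdir = os.path.join(parent, "hard_soft")
    os.mkdir(workdir)
    file1 = os.path.join(workdir, "File1")
    link1 = os.path.join(workdir, "Link1ToFile1")
    link2 = os.path.join(workdir, "Link2ToFile1")
    with open(file1, "wb") as f:
        f.write(b"File1 content\n")
    os.link(file1, link1)
    os.link(file1, link2)

    # Moving a hard link only relocates a directory entry.
    moved = os.path.join(parent, "Link1ToFile1")
    os.rename(link1, moved)
    count_after_move = os.stat(file1).st_nlink
    with open(moved, "rb") as f:
        content_after_move = f.read()

    # Deleting a hard link decrements the counter by one.
    os.unlink(link2)
    count_after_delete = os.stat(file1).st_nlink

    # Even removing the original name leaves the data reachable.
    os.unlink(file1)
    count_last = os.stat(moved).st_nlink
    with open(moved, "rb") as f:
        content_last = f.read()

    print("links after move:", count_after_move)
    print("links after deleting one link:", count_after_delete)
    print("links after deleting File1:", count_last)
```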

Summary


To wrap up the idea, I'll summarize the most important points here:

  • A soft link is another file (of type link, though); its content indicates the location of the targeted file.
  • You can have a soft link pointing to a file on a different partition, which is NOT possible with hard links.
  • Hard links don't require an extra inode or extra data blocks (just a directory entry), and they can be moved around (within the same partition) and will keep working fine.
  • Every time you create a file, a hard link is created, and that's link 1 in the inode :D.
