Wednesday, December 13, 2017

Some Zombies just can't be killed - Zombie Processes

We've been told by so many movies and TV series that, no matter how smelly or ugly Zombies are, you can easily kill them by either blowing their heads off with a shotgun or gently detaching the head from the rest of the body (this doesn't really kill the zombie but renders it almost harmless; fun fact: a head can't run while detached from the legs). Unfortunately, Zombie processes don't follow this expected Zombie behavior: they just can't be killed, not even with kill -9 PID.

What in the world is a Zombie process?


Wikipedia has a pretty nice description here, but essentially we are talking about a process that has finished its execution through either the exit or the exit_group syscall but remains in the process table because its parent hasn't collected its exit status yet. You can easily identify a zombie process by its Z state in, for example, top or ps. A nice description can be found in the wait man page:

       A child that terminates, but has not been waited for becomes a
       "zombie".  The kernel maintains a minimal set of information about
       the zombie process (PID, termination status, resource usage
       information) in order to allow the parent to later perform a wait to
       obtain information about the child.  As long as a zombie is not
       removed from the system via a wait, it will consume a slot in the
       kernel process table, and if this table fills, it will not be
       possible to create further processes.  If a parent process
       terminates, then its "zombie" children (if any) are adopted by
       init(1), (or by the nearest "subreaper" process as defined through
       the use of the prctl(2) PR_SET_CHILD_SUBREAPER operation); init(1)
       automatically performs a wait to remove the zombies.

For some reason there's a lot of confusion around this Z state, for example:
  • Orphan processes aren't zombie processes. In fact, for a process to stay a zombie its parent has to be still alive (and not waiting for it). If the parent dies, the child is adopted by init and waited for properly.
  • A Zombie process isn't consuming CPU, and it isn't consuming the memory it used while it was running either; only the small kernel bookkeeping entry remains. However, it still occupies a slot in the process table, and this can become a problem.

What happens when exit or exit_group are called?


As I mentioned before, these are the syscalls used to terminate a process gracefully. You can find the details of these two for kernel 4.14 here and here. I'm not going to describe the whole code because, first, I don't understand all of it and, second, I would probably get it wrong :), but I'm going to point out some interesting parts:
  • For exit_group, things begin in do_group_exit, which takes care of killing all the threads in the current process's thread group and then calls do_exit to finish the process itself.
  • do_exit takes care of a few things as well, for example releasing some of the resources associated with the process, like files and shared memory (lines 858-866). Later on, in line 885, it calls exit_notify, which is in charge of breaking the bad news to the related processes, a kind of "Hey! I'm dying" thing. Within this function a few things happen, the most important being the following:
    • do_notify_parent is called to send the parent process a SIGCHLD signal. It returns true or false depending on whether the dying process can be reaped right away or has to become a Zombie.
      • Have a look at the comment in here. Long story short: if the parent ignores SIGCHLD (by setting its handler to SIG_IGN) or has the SA_NOCLDWAIT flag set, the dying process can reap itself and doesn't become a zombie; in that case do_notify_parent returns true. Otherwise it returns false.
    • Back in exit_notify, you can see in here how the process decides whether or not to become a zombie depending on the autoreap variable.
Something worth mentioning is that Zombie is an exit state (tsk->exit_state), while the scheduling state of the process is actually Dead; you can see this being set by the call to do_task_dead here. So a Zombie process isn't consuming CPU resources at all: from the scheduler's perspective this process is gone, and so are all the other resources that had been allocated to it.

Why can't it be killed?


Well, you can still send a SIGKILL signal to a Zombie process and, as you may know, a process can't escape that signal. However, a signal only has an effect on a process that can still run, and a Zombie never runs again: even though it is still in the process table, there's nothing it can do, there's no code attached to it anymore, no stack, no files, nothing. It is already dead, so there's nothing left to kill.

Example


Let's have a look at all this zombie stuff with a simple example. You can get the code here; it's a simple C program that forks 2 child processes and allocates some memory in each. The parent sleeps for 30 seconds, while the child processes wait for 15 seconds before exiting, hence dying while the parent is still running (sleeping) and becoming spoooky zombiesssss:

juan@test:~$ gcc -o zombie_maker zombie_maker.c
juan@test:~$ ./zombie_maker &
juan@test:~$ Parent PID = 2050, PGRP = 2050, PPID = 1934, PSID = 1934
Child 0 -> PID = 2051, PGRP = 2050, PPID = 2050, PSID = 1934
Child 1 -> PID = 2052, PGRP = 2050, PPID = 2050, PSID = 1934

juan@test:~$ ps aux|grep zombie
juan      2050  0.0  0.0   4216   732 pts/1    S    21:34   0:00 ./zombie_maker
juan      2051  0.0  0.0   6268    92 pts/1    S    21:34   0:00 ./zombie_maker
juan      2052  0.0  0.0   6268    92 pts/1    S    21:34   0:00 ./zombie_maker
juan      2054  0.0  0.0  15960  2268 pts/1    R+   21:34   0:00 grep --color=auto zombie
juan@test:~$ ps aux|grep zombie
juan      2050  0.0  0.0   4216   732 pts/1    S    21:34   0:00 ./zombie_maker
juan      2051  0.0  0.0      0     0 pts/1    Z    21:34   0:00 [zombie_maker] <defunct>
juan      2052  0.0  0.0      0     0 pts/1    Z    21:34   0:00 [zombie_maker] <defunct>
juan      2056  0.0  0.0  15960  2268 pts/1    S+   21:34   0:00 grep --color=auto zombie
juan@test:~$ ps aux|grep zombie
juan      2058  0.0  0.0  15960  2288 pts/1    S+   21:35   0:00 grep --color=auto zombie
[2]+  Done                    ./zombie_maker
juan@test:~$

After 15 seconds, we can see that both child processes not only became Zombies but also show no memory footprint whatsoever on the system (VSZ and RSS are both 0). After 30 seconds, none of the 3 processes exist anymore, but who reaped the Zombie children from the system???

Yep, init did! Attaching to init with strace, we can see the waitid calls returning the details of the Zombie processes:

root@test:/home/juan# strace -p 1
Process 1 attached
select(54, [3 5 6 7 10 11 15 16 17 19 20 24 26 30 34 53], [], [7 10 11 15 16 17], NULL) = ? ERESTARTNOHAND (To be restarted if no handler)
...
waitid(P_ALL, 0, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=2051, si_status=3, si_utime=0, si_stime=0}, WNOHANG|WEXITED|WSTOPPED|WCONTINUED, NULL) = 0
waitid(P_ALL, 0, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=2052, si_status=3, si_utime=0, si_stime=0}, WNOHANG|WEXITED|WSTOPPED|WCONTINUED, NULL) = 0
select(54, [3 5 6 7 10 11 15 16 17 19 20 24 26 30 34 53], [], [7 10 11 15 16 17], NULL^CProcess 1 detached
 
root@test:/home/juan#

Running zombie_maker under strace will show you that the parent process does get interrupted by the SIGCHLD signals, but since no handler is installed the default disposition applies and the signal is ignored. PLEASE don't confuse that default "ignore" with the one that happens when the handler is explicitly set to SIG_IGN; this is literally what the documentation says:

       POSIX.1-2001 specifies that if the disposition of SIGCHLD is set to
       SIG_IGN or the SA_NOCLDWAIT flag is set for SIGCHLD (see
       sigaction(2)), then children that terminate do not become zombies and
       a call to wait() or waitpid() will block until all children have
       terminated, and then fail with errno set to ECHILD.  (The original
       POSIX standard left the behavior of setting SIGCHLD to SIG_IGN
       unspecified.  Note that even though the default disposition of
       SIGCHLD is "ignore", explicitly setting the disposition to SIG_IGN
       results in different treatment of zombie process children.)

Summary


No matter how brave or root you are, you can't kill a Zombie process :D. Killing its parent is usually the way to get rid of it. When the parent dies, init adopts the little zombie bastards and reaps them off the process table, as we saw in the example. Also, as mentioned, a Zombie process consumes almost no system resources, so unless you have hundreds of them you shouldn't worry much.

The kernel code is getting crazy; it would be nice if we had more comments in place :P.
