As its name suggests, a process in uninterruptible sleep (D in ps/top/etc.) cannot be interrupted, hence it can't be killed either! And unlike a process in Z state, it might actually be holding on to memory you can't get back!
How do you get into D state?
Most of the time people associate processes in D state with IO activity, and that's not wrong; it's exactly what man ps says:
PROCESS STATE CODES
Here are the different values that the s, stat and state output specifiers (header "STAT" or "S") will display to describe the state of a process:
  D    uninterruptible sleep (usually IO)
  R    running or runnable (on run queue)
  S    interruptible sleep (waiting for an event to complete)
  T    stopped, either by a job control signal or because it is being traced
  W    paging (not valid since the 2.6.xx kernel)
  X    dead (should never be seen)
  Z    defunct ("zombie") process, terminated but not reaped by its parent
However, processes that go into D state due to IO usually come back to R once the IO operation that was holding them completes (unless something goes nuts). Now, there are some scenarios that don't involve IO and can also lead to D state, for example when using serialization mechanisms like mutexes. When you try to acquire a mutex that is already locked you have two options: you either go into interruptible sleep (S) or into uninterruptible sleep (D). Which one happens depends on what the code is supposed to do and on whether it wants to be able to receive signals and such while waiting for the mutex to become available. If you are curious about mutexes in the kernel, have a look at this or a shorter and perhaps more human version here.
So how do you detect a process in D state?
That's rather simple! You can use the regular tools like ps and top and look for the State field/column. For example:
juan@test:~$ ps aux|grep test
root     20592  0.0  0.0   4220   656 pts/2    S+   21:36   0:00 ./test
root     20593  0.0  0.0   4216   640 pts/24   D+   21:36   0:00 ./test
juan     20982  0.0  0.0  15964  2268 pts/27   R+   23:26   0:00 grep --color=auto test
juan@test:~$
in this case, process 20593 is in D+ state, i.e. uninterruptible sleep, and it is a foreground process.
What to do with a process in D state? How to know what is actually blocking it:
This is where things get interesting. As I mentioned before, D state is usually temporary: most of the time, if you see a process in that state, as soon as you run ps again the process will have changed to either R or S. BUT... if you believe the process is simply refusing to leave D state, then what?
Lots of people answer this question with "I kill the process"... well, that's exactly why it's called UNinterruptible: it will absolutely ignore your most violent attempts to kill it. Any signal you send to the process will be queued and delivered eventually, once the process leaves D state... if it ever does, of course.
Fortunately there are a few things that can help you debug this lovely situation and prevent it from happening in the future.
Have a look at the process stack:
This is a good initial approach: by having a look at the process stack you can get an idea of which code path took the process into D state. Luckily, looking at the stack isn't that hard nowadays; you can do it through /proc/<PID>/stack:
juan@test:~$ sudo cat /proc/20593/stack
[<ffffffffc04ee045>] dev_open+0x15/0x40 [ebbcharmutex]
[<ffffffff811f0a7f>] chrdev_open+0x9f/0x1d0
[<ffffffff811e9527>] do_dentry_open+0x1f7/0x340
[<ffffffff811ead87>] vfs_open+0x57/0x60
[<ffffffff811faccc>] do_last+0x4ec/0x1190
[<ffffffff811fb9f0>] path_openat+0x80/0x600
[<ffffffff811fd10a>] do_filp_open+0x3a/0x90
[<ffffffff811eb109>] do_sys_open+0x129/0x280
[<ffffffff811eb27e>] SyS_open+0x1e/0x20
[<ffffffff817b788d>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
juan@test:~$
Nice! So the process is blocked in kernel space, having entered through an open syscall (SyS_open+0x1e/0x20) on some sort of character device (chrdev_open+0x9f/0x1d0), and there's a module involved here called ebbcharmutex. Since this is a stack trace, we know with high certainty that the process is blocked inside the dev_open call in particular, and that the code for that function belongs to the ebbcharmutex module.
For more details about ebbcharmutex, I'd suggest going to the source (a really nice website about Linux and embedded systems); anyway, here's a piece of the code:
/** @brief The device open function that is called each time the device is opened
 *  This will only increment the numberOpens counter in this case.
 *  @param inodep A pointer to an inode object (defined in linux/fs.h)
 *  @param filep A pointer to a file object (defined in linux/fs.h)
 */
static int dev_open(struct inode *inodep, struct file *filep){
   // if(!mutex_trylock(&ebbchar_mutex)){   // Try to acquire the mutex (returns 0 on fail)
   //    printk(KERN_ALERT "EBBChar: Device in use by another process");
   //    return -EBUSY;
   // }
   mutex_lock(&ebbchar_mutex);
   numberOpens++;
   printk(KERN_INFO "EBBChar: Device has been opened %d time(s)\n", numberOpens);
   return 0;
}
as you can see, the first active line of the function calls mutex_lock(), which sleeps uninterruptibly, hence this is where our process is stuck.
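For contrast, and assuming we were free to change the module, the usual way to avoid parking callers in D state here would be the kernel's mutex_lock_interruptible(), which puts the waiter in S state so a signal can still wake it. A hedged sketch, not the module's actual code:

```c
/* Hypothetical variant of dev_open(), NOT the real ebbcharmutex code:
 * mutex_lock_interruptible() sleeps in S (interruptible) state, so a
 * signal such as SIGKILL can still get through while we wait. */
static int dev_open(struct inode *inodep, struct file *filep){
   if (mutex_lock_interruptible(&ebbchar_mutex))
      return -ERESTARTSYS;   /* a signal arrived while we were waiting */
   numberOpens++;
   printk(KERN_INFO "EBBChar: Device has been opened %d time(s)\n", numberOpens);
   return 0;
}
```

The trade-off is that every caller now has to handle the "interrupted" error path, which is exactly the "depends on what the code is supposed to do" choice mentioned earlier.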
Now what? Well... technically there's not much you can do to unlock the process; all you can try to do is identify who might be holding the lock and not releasing it (which, of course, requires some serious understanding of the problem and the code itself). In my simple scenario you can see that I had two test processes, one of them in S+ state (scroll up, I'm not going to paste it twice :P). If we check the open files for that process we can see it opened /dev/ebbchar, so it is the one holding the lock:
root@test:/home/juan# lsof -p 20592
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
test    20592 root  cwd    DIR    8,1     4096 466847 /home/juan/ModTest/exploringBB/extras/kernel/ebbcharmutex
test    20592 root  rtd    DIR    8,1     4096      2 /
test    20592 root  txt    REG    8,1     9102 466927 /home/juan/ModTest/exploringBB/extras/kernel/ebbcharmutex/test
test    20592 root  mem    REG    8,1  1857312 583879 /lib/x86_64-linux-gnu/libc-2.19.so
test    20592 root  mem    REG    8,1   149120 583895 /lib/x86_64-linux-gnu/ld-2.19.so
test    20592 root    0u   CHR  136,2      0t0      5 /dev/pts/2
test    20592 root    1u   CHR  136,2      0t0      5 /dev/pts/2
test    20592 root    2u   CHR  136,2      0t0      5 /dev/pts/2
test    20592 root    3u   CHR  250,0      0t0  69224 /dev/ebbchar
root@test:/home/juan#
If you can't really understand why the process isn't releasing the lock, then, since it is in S+ state, you can kill it, and this should free the lock and in turn release the other, stuck process:
root@test:/home/juan# kill -9 20592
root@test:/home/juan# ps aux|grep test
root     20593  0.0  0.0   4220   640 pts/24   S+   05:39   0:00 ./test
root     21460  0.0  0.0  15964  2140 pts/26   S+   12:06   0:00 grep --color=auto test
root@test:/home/juan#
perfect, now process 20593 is in S state!
The kernel doesn't like processes in D state either:
Can't blame it, right? A long time in D state is usually a sign of some kind of issue, either high load or locking problems. Because of this, the kernel ships a hung task detector with four tunables to detect and act on processes stuck in D state. Here you can see the default setup for Ubuntu:
root@test:/home/juan# cat /proc/sys/kernel/hung_task_timeout_secs
120
root@test:/home/juan# cat /proc/sys/kernel/hung_task_panic
0
root@test:/home/juan# cat /proc/sys/kernel/hung_task_check_count
4194304
root@test:/home/juan# cat /proc/sys/kernel/hung_task_warnings
0
root@test:/home/juan#
What do these values mean (for more precision read this):
- hung_task_timeout_secs: if a task stays in D state without being scheduled for more than this number of seconds, a warning is reported.
- hung_task_panic: tells the kernel whether it should panic when it detects a hung task. 0 for no panic, 1 for panic. This could be used, for example, to force a reboot of the system.
- hung_task_check_count: upper bound on the number of tasks the detector will examine on each run; on most systems the default is effectively "all of them".
- hung_task_warnings: maximum number of hung task warnings to report. Every time a task is detected as hung this number gets decreased; when it reaches 0, no more reports are issued.
Here is what one of these hung task reports looks like in the kernel log:

Dec 20 19:57:25 test kernel: [78751.283729] INFO: task test:20593 blocked for more than 120 seconds.
Dec 20 19:57:25 test kernel: [78751.283737] Tainted: G OE 3.19.0-51-generic #58~14.04.1-Ubuntu
Dec 20 19:57:25 test kernel: [78751.283739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 19:57:25 test kernel: [78751.283741] test            D ffff88003610bb68     0 20593  18722 0x00000000
Dec 20 19:57:25 test kernel: [78751.283748]  ffff88003610bb68 ffff88007ce9f5c0 0000000000013e80 ffff88003610bfd8
Dec 20 19:57:25 test kernel: [78751.283753]  0000000000013e80 ffff8800b771f5c0 ffff88007ce9f5c0 ffff88007ce9f5c0
Dec 20 19:57:25 test kernel: [78751.283756]  ffffffffc04f0100 ffffffffc04f0104 ffff88007ce9f5c0 00000000ffffffff
Dec 20 19:57:25 test kernel: [78751.283760] Call Trace:
Dec 20 19:57:25 test kernel: [78751.283773] [<ffffffff817b3a29>] schedule_preempt_disabled+0x29/0x70
Dec 20 19:57:25 test kernel: [78751.283778] [<ffffffff817b5745>] __mutex_lock_slowpath+0x95/0x100
Dec 20 19:57:25 test kernel: [78751.283781] [<ffffffff817b57b5>] ? mutex_lock+0x5/0x37
Dec 20 19:57:25 test kernel: [78751.283785] [<ffffffff817b57d3>] mutex_lock+0x23/0x37
Dec 20 19:57:25 test kernel: [78751.283790] [<ffffffffc04ee045>] ? dev_open+0x15/0x40 [ebbcharmutex]
Dec 20 19:57:25 test kernel: [78751.283793] [<ffffffffc04ee045>] dev_open+0x15/0x40 [ebbcharmutex]
Dec 20 19:57:25 test kernel: [78751.283798] [<ffffffff811f0a7f>] chrdev_open+0x9f/0x1d0
Dec 20 19:57:25 test kernel: [78751.283802] [<ffffffff811e9527>] do_dentry_open+0x1f7/0x340
Dec 20 19:57:25 test kernel: [78751.283805] [<ffffffff811f09e0>] ? cdev_put+0x30/0x30
Dec 20 19:57:25 test kernel: [78751.283808] [<ffffffff811ead35>] ? vfs_open+0x5/0x60
Dec 20 19:57:25 test kernel: [78751.283812] [<ffffffff811ead87>] vfs_open+0x57/0x60
Dec 20 19:57:25 test kernel: [78751.283815] [<ffffffff811faccc>] do_last+0x4ec/0x1190
Dec 20 19:57:25 test kernel: [78751.283820] [<ffffffff811fb9f0>] path_openat+0x80/0x600
Dec 20 19:57:25 test kernel: [78751.283825] [<ffffffff810a6ef5>] ? update_stats_wait_end+0x5/0xd0
Dec 20 19:57:25 test kernel: [78751.283828] [<ffffffff817b38b5>] ? _cond_resched+0x5/0x40
Dec 20 19:57:25 test kernel: [78751.283832] [<ffffffff811fd10a>] do_filp_open+0x3a/0x90
Dec 20 19:57:25 test kernel: [78751.283836] [<ffffffff81209cd7>] ? __alloc_fd+0xa7/0x130
Dec 20 19:57:25 test kernel: [78751.283839] [<ffffffff811fd0d5>] ? do_filp_open+0x5/0x90
Dec 20 19:57:25 test kernel: [78751.283843] [<ffffffff811eb109>] do_sys_open+0x129/0x280
Dec 20 19:57:25 test kernel: [78751.283846] [<ffffffff811eafe5>] ? do_sys_open+0x5/0x280
Dec 20 19:57:25 test kernel: [78751.283850] [<ffffffff811eb27e>] SyS_open+0x1e/0x20
Dec 20 19:57:25 test kernel: [78751.283853] [<ffffffff817b788d>] system_call_fastpath+0x16/0x1b
The report provides an even more detailed stack trace along with some other details. And if you look at the top of that trace, it becomes crystal clear that the process is stuck trying to acquire a mutex (__mutex_lock_slowpath).
Summing this up:
Processes in D state are rare and can be an indication of system issues like intense IO, high load, or locking problems. Being uninterruptible, they can't be woken up to handle signals: anything you send them stays queued and undelivered, so the process can't be killed. Fortunately, there are a few things you can do to gather more information and get a chance to troubleshoot the problem.