Deadlock between clone_area(), Kernel Area Operation, I/O, and Page Fault
Reported by: bonefish | Owned by: bonefish
Has a Patch: no | Platform: All
When cloning a kernel area into userland (as done, e.g., by the message deliverer in the registrar), the following deadlock can occur:
- thread 1: clone_area() read-locks the kernel address space.
- thread 2: Some thread wants to create/delete a kernel area. It blocks trying to write-lock the kernel address space.
- thread 3 (I/O scheduler notifier): Some sub-I/O-request (e.g. from the block cache) goes through the I/O scheduler and is finished. The notifier thread calls the iteration callback, which creates more subrequests and tries to schedule them. lock_memory() is invoked, which tries to read-lock the kernel address space and blocks on its R/W lock behind the waiting writer (thread 2).
- thread 4 (team mate of thread 1): Page faults on a mapped file. The page fault handler read-locks the team's address space and tries to read in the page in question. Since the I/O scheduler notifier thread is blocked, this thread blocks too, waiting for the I/O request to finish.
- thread 1: clone_area() tries to write-lock the team's address space and blocks, since thread 4 has it read-locked.
To sum it up:
- thread 1: blocks trying to write-lock a team's address space (read-locked by thread 4)
- thread 2: blocks trying to write-lock the kernel's address space (read-locked by thread 1)
- thread 3: blocks trying to read-lock the kernel's address space (queued behind waiting writer thread 2)
- thread 4: waits for I/O (to be finished by thread 3)
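The summary above is a four-node cycle in a wait-for graph. As a rough illustration only (this is a model of the dependencies listed in this ticket, not kernel code; the dictionary and the find_cycle() helper are made up for the example), the cycle can be checked like this:

```python
# Wait-for graph for the four threads in the summary.
# An entry A: B means "A cannot proceed until B does".
WAITS_FOR = {
    "thread 1": "thread 4",  # wants the team address space write lock; thread 4 holds a read lock
    "thread 2": "thread 1",  # wants the kernel address space write lock; thread 1 holds a read lock
    "thread 3": "thread 2",  # wants the kernel address space read lock; queued behind writer thread 2
    "thread 4": "thread 3",  # waits for I/O that thread 3 has to finish
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    current = start
    while current is not None and current not in path:
        path.append(current)
        current = waits_for.get(current)
    if current is None:
        return None  # the chain ended at a thread that is not waiting
    return path[path.index(current):]


cycle = find_cycle(WAITS_FOR, "thread 1")
print(" -> ".join(cycle + [cycle[0]]))
# prints: thread 1 -> thread 4 -> thread 3 -> thread 2 -> thread 1
```

Removing any single edge breaks the cycle. In particular, if thread 4 did not hold the team's address space lock while waiting for I/O, the entry for thread 1 would disappear and every chain would end at a thread that can make progress.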
I've seen this twice already while booting (out of maybe 20 boots). It seems more likely to happen with my soon-to-be-committed optimization to pre-map pages of mapped files.
A solution would be to drop a team's address space lock while handling a page fault. There's already a TODO to that effect in vm_soft_fault(), though it mentions performance reasons only.
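The general shape of such a fix could look like the sketch below: drop the lock before the blocking read, then re-acquire it and revalidate before mapping the page. This is a simplified model, not what vm_soft_fault() does or would do. AddressSpace, handle_page_fault(), the generation counter, and the read_page callback are all hypothetical names invented for the example, and a plain mutex stands in for the R/W lock.

```python
import threading


class AddressSpace:
    """Hypothetical stand-in for a team's address space (not the kernel's type)."""

    def __init__(self):
        self.lock = threading.Lock()  # simplification: a plain mutex, not an R/W lock
        self.generation = 0           # assumed counter, bumped when the area layout changes
        self.mapped_pages = {}        # fault address -> page contents


def handle_page_fault(space, address, read_page):
    """Resolve a fault without holding space.lock across the blocking read."""
    while True:
        with space.lock:
            if address in space.mapped_pages:
                return space.mapped_pages[address]
            seen_generation = space.generation

        # Lock dropped: a waiting writer such as clone_area() can proceed here.
        page = read_page(address)

        with space.lock:
            # Revalidate: the layout may have changed while the lock was dropped.
            if space.generation == seen_generation:
                space.mapped_pages[address] = page
                return page
        # The layout changed in the meantime; retry the fault from scratch.


# Usage: the read callback can take the lock itself, which shows that the
# fault handler is not holding it while the I/O is in flight.
demo_space = AddressSpace()


def demo_read(address):
    with demo_space.lock:
        pass  # a writer could modify the address space at this point
    return "page at %d" % address


demo_page = handle_page_fault(demo_space, 4096, demo_read)
```

The cost of this pattern is the revalidation step: anything looked up before the lock was dropped has to be treated as stale afterwards, which is why the sketch retries when the generation changed.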