Ticket #1917 (closed bug: fixed)

Opened 2 months ago

Last modified 2 months ago

vm_cache_remove_consumer() Deadlock

Reported by: bonefish Assigned to: axeld
Priority: critical Milestone: R1/alpha1
Component: System/Kernel Version: R1 development
Cc: Platform: All

Description

r24356

vm_cache_remove_consumer() calls release_ref() on the store while the cache is locked. For vnode stores this will call vfs_put_vnode(), which in turn might call the underlying FS's remove_vnode() hook (if the file has been removed). BFS' hook calls file_cache_set_size() which attempts to lock the cache a second time.

Possible solutions:

  • Make vm_cache::lock a recursive lock. This will solve this problem, but the locking order will still be problematic, i.e. calling the VFS (and thus the respective FS) while holding the cache lock. The usual locking order is reverse.
  • Move release_ref() in vm_cache_remove_consumer() before locking the cache. That sounds promising, but I don't see all the consequences yet. The page writer doesn't hold the cache lock either when releasing the store ref.

Change History

03/15/08 04:42:29 changed by axeld

It should always be sure to release the ref without holding the lock I suppose, but we might want to have a closer look again before choosing that option :-)

03/24/08 00:23:01 changed by bonefish

  • status changed from new to closed.
  • resolution set to fixed.

Applied the proposed change in r24548. As I see it, the worst thing that can happen is, that there can be two caches for a vnode for a short time, but that shouldn't be a problem, since when the release_ref() returns, all of the old cache's pages should either be removed (remove_vnode()) or at least written back and unmapped (put_vnode()).

Considering this fixed unless proven otherwise. :-)