Opened 9 months ago

Closed 7 months ago

#15015 closed bug (fixed)

Race condition in BFS initialization leads to KDL when volume is not mountable.

Reported by: pulkomandy Owned by: axeld
Priority: normal Milestone: Unscheduled
Component: File Systems/BFS Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All


I had a partition with a corrupt root node, making it unmountable. This led to a panic in BlockAllocator::_Initialize, as the thread was scheduled after the BlockAllocator object had already been cleared (the panic shows the pointer to it as 0xcccccccc).

There is a lock between the thread and the destructor, but I'm not sure about the semantics of recursive_lock which is used with ownership transfer here. It looks like if the thread is not started yet, the locking isn't effective?

Change History (5)

comment:1 by waddlesplash, 9 months ago

Ownership transfer just changes the thread_id of the lock holder:

So if the thread_id is valid, that should be OK. recursive_lock usually panic()s if you try to unlock it from the wrong thread.
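To illustrate the point about ownership transfer, here is a minimal user-space sketch. It is not Haiku's recursive_lock: every name in it (ToyRecursiveLock, toy_lock, toy_transfer, toy_unlock, toy_thread_id) is hypothetical, no real blocking is modelled, and a wrong-thread unlock returns false where a kernel would panic().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a kernel thread id (assumption, not Haiku's type).
using toy_thread_id = int32_t;

// Toy recursive lock: tracks only the current holder and the recursion depth.
struct ToyRecursiveLock {
    toy_thread_id holder = -1;  // -1 means "not held"
    int recursion = 0;
};

// Acquire the lock for `caller`. Returns false where a real lock would block.
bool toy_lock(ToyRecursiveLock& lock, toy_thread_id caller)
{
    if (lock.holder == caller) {
        lock.recursion++;
        return true;
    }
    if (lock.holder != -1)
        return false;
    lock.holder = caller;
    lock.recursion = 1;
    return true;
}

// Ownership transfer: only the holder field changes, nothing else.
void toy_transfer(ToyRecursiveLock& lock, toy_thread_id newHolder)
{
    assert(lock.recursion > 0);  // only a held lock can be handed over
    lock.holder = newHolder;
}

// Release the lock. Returns false if `caller` is not the holder
// (the case in which a kernel lock would typically panic).
bool toy_unlock(ToyRecursiveLock& lock, toy_thread_id caller)
{
    if (lock.holder != caller)
        return false;
    if (--lock.recursion == 0)
        lock.holder = -1;
    return true;
}
```

In this model, once thread 1 has transferred the lock to thread 2, thread 1 can no longer unlock it, even if thread 2 has not run yet.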

comment:2 by pulkomandy, 9 months ago

Yes, the problem is that we do that, and then destroy the lock from the thread that doesn't own it anymore. When the new thread is finally scheduled, it thinks it holds the lock, but in fact the lock was already deleted and cleared to 0xcccccccc.

I guess the lock destroy should check that the current thread owns the lock?

comment:3 by waddlesplash, 9 months ago

Yes, it should. Mutex destruction already does this.
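The ownership check discussed above could look roughly like the following sketch. It is a user-space model under stated assumptions, not the code from any Haiku revision: SketchLock, sketch_destroy and sketch_thread_id are hypothetical names, and the refusal is reported as a return value where a kernel would more likely panic().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a kernel thread id (assumption, not Haiku's type).
using sketch_thread_id = int32_t;

// Minimal lock state: who holds it, and whether it has been torn down.
struct SketchLock {
    sketch_thread_id holder = -1;  // -1 means "not held"
    bool destroyed = false;
};

// Destroy the lock on behalf of `caller`.
// Refuses (returns false) when another thread currently owns the lock,
// which is the situation described in this ticket: ownership was handed
// to a thread that has not been scheduled yet.
bool sketch_destroy(SketchLock& lock, sketch_thread_id caller)
{
    assert(!lock.destroyed);  // destroying twice is a separate bug
    if (lock.holder != -1 && lock.holder != caller)
        return false;
    lock.holder = -1;
    lock.destroyed = true;
    return true;
}
```

With such a check, the thread that gave the lock away cannot tear it down; the failure surfaces at the destroy call rather than later, when the new owner runs against cleared memory.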

comment:4 by waddlesplash, 9 months ago

I added some more asserts to mutexes in hrev53104 which should catch this case.

comment:5 by axeld, 7 months ago

Resolution: fixed
Status: new → closed

Should be fixed in hrev53184.
