Opened 4 days ago
Last modified 45 hours ago
#18451 new bug
Assertion failure (mutex was not actually locked) in libroot hoard malloc
| Reported by: | waddlesplash | Owned by: | nobody |
|---|---|---|---|
| Priority: | normal | Milestone: | Unscheduled |
| Component: | System/libroot.so | Version: | R1/beta4 |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Platform: | All | | |
Description
This triggers unreliably (about 1 time in 5-10) with certain applications, following the addition of those assertions (the new mutex code is not needed to trigger it).
The stack trace after the assert is always just this:
0x7f6d30f69b60 0xce64b42d98 BPrivate::processHeap::free(void*) + 0x138
0x7f6d30f69b90 0xce64b43f84 free + 0x44
...
Change History (6)
comment:1 by , 3 days ago
comment:2 by , 3 days ago
The most reliable ways to trigger this problem are with QtWebEngine and GTKWebKit. With QtWebEngine, CanvasMark is a reliable reproducer; with GTKWebKit, Haiku's own forums seem to trigger it.
comment:3 by , 3 days ago
Oho, it's also possible to trigger it by opening the Haiku forums in WebPositive, not just in GTKWebKit.
follow-up: 5 comment:4 by , 3 days ago
The full stacktrace would be useful. This may just be some kind of heap corruption.
comment:5 by , 3 days ago
Replying to pulkomandy:
The full stacktrace would be useful. This may just be some kind of heap corruption.
Example here.
thread 3597: pthread func
state: Call (mutex was not actually locked!)

Frame IP Function Name
-----------------------------------------------
00000000 0xce64ab8da7 _kern_debugger + 0x7
    Disassembly:
    _kern_debugger:
    0x000000ce64ab8da0: 48c7c0e5000000 mov $0xe5, %rax
    0x000000ce64ab8da7: 0f05 syscall <--
0x7f6d30f69b60 0xce64b42d98 BPrivate::processHeap::free(void*) + 0x138
0x7f6d30f69b90 0xce64b43f84 free + 0x44
0x7f6d30f69bf0 0x14afeb6e206 _ZNSt6vectorISt4pairIt13scoped_refptrIN2cc4TaskEEESaIS5_EE17_M_realloc_insertIJS5_EEEvN9__gnu_cxx17(, ) + 0x116
0x7f6d30f69c60 0x14afeb703c9 cc::TaskGraphWorkQueue::GetNextTaskToRun(unsigned short) + 0x269
0x7f6d30f69d30 0x14b0059dd0b content::CategorizedWorkerPool::RunTaskInCategoryWithLockAcquired(cc::TaskCategory) + 0x2b
0x7f6d30f69d80 0x14b0059dfea content::CategorizedWorkerPool::Run(std::vector<cc::TaskCategory, std::allocator<cc::TaskCategory> > const&, base::ConditionVariable*) + 0x6a
0x7f6d30f69db0 0x14afe2e5b48 base::_GLOBAL__N_1::ThreadFunc(void*) + 0x48
0x7f6d30f69dd0 0xce64ac7105 pthread_thread_entry(void*, void*) + 0x15
00000000 0x7faeead5b258 commpage_thread_exit + 0
comment:6 by , 45 hours ago
So we're looking at this lock, I guess:
https://cgit.haiku-os.org/haiku/tree/src/system/libroot/posix/malloc_hoard2/processheap.cpp#n203
Notice how the superblock is retrieved from the memory block being freed:
superblock *sb = b->getSuperblock();
This means a corrupt block (either because some data was overwritten, or because the software is trying to free memory that was not allocated by malloc) will result in a pointer not pointing to a superblock at all.
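To make that failure mode concrete, here is a minimal sketch. It is not the hoard code: `ToySuperblock`, `carve`, `ownerAddress` and the layout (an owner address stored immediately in front of each payload) are all assumptions made up for illustration. It only shows how a small overflow in one block can silently change the "superblock" recovered for its neighbour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a superblock; the real class holds much more.
struct ToySuperblock {
    int lockWord;  // stands in for the per-superblock lock
};

constexpr std::size_t kHeader = sizeof(std::uintptr_t);  // owner address stored before the payload
constexpr std::size_t kPayload = 16;                      // user-visible bytes per block
constexpr std::size_t kSlot = kHeader + kPayload;
constexpr std::size_t kSlots = 2;

// Write the header for slot `index` and return the user pointer.
unsigned char* carve(unsigned char* arena, std::size_t index, const ToySuperblock* owner) {
    assert(index < kSlots);
    unsigned char* slot = arena + index * kSlot;
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(owner);
    std::memcpy(slot, &address, sizeof address);
    return slot + kHeader;
}

// What a free() path would do first: read the owner address back from the header.
std::uintptr_t ownerAddress(const unsigned char* userPtr) {
    std::uintptr_t address;
    std::memcpy(&address, userPtr - kHeader, sizeof address);
    return address;
}

// Returns true if overflowing block A changes the owner recorded for block B.
bool overflowRedirectsOwner() {
    unsigned char arena[kSlots * kSlot] = {};
    ToySuperblock sb{0};
    const std::uintptr_t expected = reinterpret_cast<std::uintptr_t>(&sb);

    unsigned char* a = carve(arena, 0, &sb);
    unsigned char* b = carve(arena, 1, &sb);
    const bool intactBefore = ownerAddress(b) == expected;

    // Buggy caller: writes kHeader bytes past the end of A's payload.
    std::memset(a, 0x41, kPayload + kHeader);

    return intactBefore && ownerAddress(b) != expected;
}
```

In this toy layout, freeing `b` after the overflow would go on to treat whatever lives at the overwritten address as a superblock, lock included.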
There is an assert to check if the superblock is valid, but the validation done is quite weak:
- Two values (numBlocks and sizeClass) must be greater than zero (effectively checking only the sign bit)
- numAvailable must be less than numBlocks
(block::isValid, which is called earlier in the function, is even weaker: it is just a "return 1", and there isn't much more that can be checked.)
HEAP_DEBUG is not set, so we don't have the _magic field, which could be used for a more reliable check. And the isValid function in the superblock does not even try to use it anyway.
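A rough sketch of why the range checks are weak compared to a magic value. Everything here is hypothetical (`ToySuperblock`, `kToyMagic`, `weakIsValid`, `strongIsValid`); the weak check only mirrors the conditions as described above and is not copied from the hoard sources.

```cpp
#include <cstdint>

// Hypothetical, simplified layout; field names follow the discussion above.
struct ToySuperblock {
    std::uint32_t magic;  // stands in for a debug-only _magic field
    int numBlocks;
    int numAvailable;
    int sizeClass;
};

constexpr std::uint32_t kToyMagic = 0xCAFEF00D;  // arbitrary value, an assumption

// The range checks as described in this comment.
bool weakIsValid(const ToySuperblock& sb) {
    return sb.numBlocks > 0 && sb.sizeClass > 0 && sb.numAvailable < sb.numBlocks;
}

// Same checks, plus a comparison against a known tag.
bool strongIsValid(const ToySuperblock& sb) {
    return sb.magic == kToyMagic && weakIsValid(sb);
}

// Bytes that were never a superblock (here: ASCII text) can still pass the weak check.
ToySuperblock garbageThatLooksValid() {
    return ToySuperblock{0x41414141u, 0x41414141, 0x20202020, 0x41414141};
}
```

Any memory whose first few words happen to be positive integers in the right order passes the range checks, while a magic comparison rejects it unless the tag matches exactly.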
So, is it possible that we are simply looking at an invalid superblock, and the value being used is not at all a mutex, because we didn't actually get a pointer to a superblock?



Things I have tried so far:
~superblock() (but is this ever called?) No results.