Opened 3 years ago
Closed 3 years ago
#17444 closed bug (fixed)
KDL when using Falkon
Reported by: | hitech | Owned by: | waddlesplash |
---|---|---|---|
Priority: | critical | Milestone: | R1/beta4 |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | #17454, #17455 |
Platform: | All | | |
Description
I've experienced three KDLs on two different PCs running Haiku nightly hrev55673 with Falkon open. In two of the cases I wasn't near the computer at the time of the crash, so I literally didn't do anything to provoke the KDL.
/var/log> grep -i "assert" /var/log/syslog.old
KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/locks/lock.cpp:931): lock->holder == waiter.thread->id
KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/locks/lock.cpp:931): lock->holder == waiter.thread->id
KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/locks/lock.cpp:931): lock->holder == waiter.thread->id
The problem is in this assertion: https://git.haiku-os.org/haiku/tree/src/system/kernel/locks/lock.cpp#n929
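For readers unfamiliar with this kind of check, here is a minimal sketch of how an invariant like `lock->holder == waiter.thread->id` can arise. It assumes a mutex that hands ownership directly to the next waiter on unlock; that design is an assumption made for illustration, and `ToyMutex`, `ToyWaiter`, `ToyThread` and `HandoffHolds` are hypothetical names, not Haiku's code.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical toy model; none of these types come from the Haiku kernel.
struct ToyThread {
    int32_t id;
};

struct ToyWaiter {
    ToyThread* thread;
};

struct ToyMutex {
    int32_t holder = -1;             // id of the owning thread, -1 when free
    std::deque<ToyWaiter*> waiters;  // threads queued for the lock

    // Returns true if the lock was taken immediately, false if the waiter
    // was queued and would now block.
    bool LockOrQueue(ToyWaiter* waiter) {
        if (holder == -1) {
            holder = waiter->thread->id;
            return true;
        }
        waiters.push_back(waiter);
        return false;
    }

    // Unlock with direct handoff: ownership is transferred to the next
    // waiter *before* it is woken. Returns the waiter to wake, or nullptr.
    ToyWaiter* Unlock() {
        if (waiters.empty()) {
            holder = -1;
            return nullptr;
        }
        ToyWaiter* next = waiters.front();
        waiters.pop_front();
        holder = next->thread->id;
        return next;
    }
};

// The check a waiter would make after waking up: if the unlock path woke it,
// the lock must already be recorded as belonging to it.
inline bool HandoffHolds(const ToyMutex& lock, const ToyWaiter& waiter) {
    return lock.holder == waiter.thread->id;
}
```

In this model the check only fails when a queued thread is woken by something other than the unlock path, which is the kind of stray wakeup discussed in the comments below.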
This assertion was added about two years ago in hrev53340.
The regression is suspected to have been introduced in hrev55671.
I'm attaching the syslog for the assertion message and the call stack as a picture.
Attachments (2)
Change History (15)
by , 3 years ago
Attachment: | photo_2021-12-03_12-34-33.jpg added |
---|
comment:1 by , 3 years ago
Component: | System → System/Kernel |
---|---|
Description: | modified (diff) |
Owner: | changed from | to
Platform: | x86-64 → All |
Status: | new → assigned |
follow-up: 3 comment:2 by , 3 years ago
I'm not sure how this is possible, at first glance. The thread should only be woken up by the ConditionVariable while we are inside ConditionVariableEntry::Wait(); if the timeout occurs, the entry removes itself from the variable before returning so it can't be woken up at all, and if it returns early due to already having been woken up, it will have been removed from the variable by the variable itself. So in no circumstance should we get to the mutex_lock while the entry is still part of the variable.
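The expectation described above can be written down as a small single-threaded model. This is only a sketch: `ToyVariable`, `ToyEntry`, `RemoveOnTimeout` and `IsDetached` are hypothetical names, and the real ConditionVariable code uses its own locking, which is not modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy model, not Haiku's ConditionVariable implementation.
struct ToyVariable;

struct ToyEntry {
    ToyVariable* variable = nullptr;  // non-null while queued on a variable
    bool woken = false;
};

struct ToyVariable {
    std::vector<ToyEntry*> entries;

    void Add(ToyEntry* entry) {
        entry->variable = this;
        entry->woken = false;
        entries.push_back(entry);
    }

    // Notify path: the variable itself removes the entry, then wakes it.
    bool NotifyOne() {
        if (entries.empty())
            return false;
        ToyEntry* entry = entries.front();
        entries.erase(entries.begin());
        entry->variable = nullptr;
        entry->woken = true;
        return true;
    }

    // Timeout path: the entry removes itself before the wait returns.
    void RemoveOnTimeout(ToyEntry* entry) {
        auto it = std::find(entries.begin(), entries.end(), entry);
        if (it != entries.end())
            entries.erase(it);
        entry->variable = nullptr;
    }
};

// Expected invariant once a wait has returned by either path: the entry is
// detached, so no later notify can reach the thread.
inline bool IsDetached(const ToyEntry& entry, const ToyVariable& variable) {
    return entry.variable == nullptr
        && std::find(variable.entries.begin(), variable.entries.end(), &entry)
            == variable.entries.end();
}
```

In the model the two removal paths never overlap because everything runs on one thread. With real concurrency, both paths have to be serialized against each other for the invariant to hold.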
The easiest thing to do to diagnose this may be to add a debug field indicating what woke up a thread last. I think you have a way to build Haiku images for testing, yes?
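As a rough illustration of the debug-field idea, the sketch below records the source of the most recent wakeup on a thread structure. Everything here is hypothetical (`DebugThread`, `WakeSource`, `Unblock`, `Describe`); it shows the shape of the instrumentation, not an actual kernel patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names throughout; not taken from the Haiku kernel.
enum class WakeSource : uint8_t {
    None,
    ConditionVariable,
    MutexUnlock,
    Timeout
};

struct DebugThread {
    int32_t id = -1;
    bool blocked = false;
    WakeSource lastWaker = WakeSource::None;  // debug-only field
};

inline void Block(DebugThread& thread) {
    thread.blocked = true;
}

// Every wakeup path passes its own tag, so a later failed assertion can
// report which path woke the thread last.
inline void Unblock(DebugThread& thread, WakeSource source) {
    thread.lastWaker = source;  // record the source before waking
    thread.blocked = false;
}

inline std::string Describe(WakeSource source) {
    switch (source) {
        case WakeSource::ConditionVariable: return "condition variable";
        case WakeSource::MutexUnlock:       return "mutex unlock";
        case WakeSource::Timeout:           return "timeout";
        case WakeSource::None:              break;
    }
    return "none";
}
```

Printing `Describe(thread.lastWaker)` next to the failed assertion would then show whether the stray wakeup came from the condition variable or from somewhere else.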
comment:3 by , 3 years ago
Replying to waddlesplash:
The easiest thing to do to diagnose this may be to add a debug field indicating what woke up a thread last. I think you have a way to build Haiku images for testing, yes?
I have never built the whole Haiku image, but I suppose I can do it. Please note that the same crash occurred on two different computers with different hardware, but I'll test on only one of them, my main computer, because transferring the image to the other one is a problem.
follow-up: 8 comment:4 by , 3 years ago
Hmm, do you have any drivers loaded from a non-packaged directory? I find it strange that you can reproduce this on two different computers but nobody else has reported anything like it. (I note that there is also a media server KDL in your syslog, indicating this might be unrelated to QtWebEngine altogether.)
comment:6 by , 3 years ago
OK, I reread the code and determined there was indeed a race condition. Hopefully hrev55694 should fix this, please test.
comment:7 by , 3 years ago
Milestone: | Unscheduled → R1/beta4 |
---|---|
Priority: | normal → critical |
comment:8 by , 3 years ago
Replying to waddlesplash:
Hmm, do you have any drivers loaded from a non-packaged directory? I find it strange that you can reproduce this on two different computers but nobody else has reported anything like it. (I note that there is also a media server KDL in your syslog, indicating this might be unrelated to QtWebEngine altogether.)
One of these computers is a fresh Haiku installation without any non-packaged drivers. The other has an input server filter that enables the multimedia keys on the keyboard. It lives in the non-packaged directory because I'm too shy about my drawing skills to draw an icon and submit it to HaikuPorts :)
I'll test hrev55694. Thanks!
comment:11 by , 3 years ago
Blocking: | 17454 added |
---|
comment:12 by , 3 years ago
Two days of uptime, including extensive use of Falkon, and no KDLs so far. Either the bug is solved, or I can't reproduce it.
comment:13 by , 3 years ago
Blocking: | 17455 added |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
Very good. There's one more KDL related to ConditionVariables remaining, #17455, but that one is pretty rare and unrelated to this issue, so I think this ticket is safe to close.
The stack