Opened 2 years ago

Closed 2 years ago

#17444 closed bug (fixed)

KDL when using Falkon

Reported by: hitech
Owned by: waddlesplash
Priority: critical
Milestone: R1/beta4
Component: System/Kernel
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking: #17454, #17455
Platform: All

Description (last modified by diver)

I've experienced three KDLs on 2 different PCs with Haiku nightly hrev55673 and Falkon running. In two of the cases, I wasn't near the computer at the time of the crash, so I literally didn't do anything to provoke the KDL.

/var/log> grep -i "assert" /var/log/syslog.old 
KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/locks/lock.cpp:931): lock->holder == waiter.thread->id
KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/locks/lock.cpp:931): lock->holder == waiter.thread->id
KERN: PANIC: ASSERT FAILED (../haiku-git/src/system/kernel/locks/lock.cpp:931): lock->holder == waiter.thread->id
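
To tally such panics by source location instead of reading the grep output by eye, a small userspace helper along these lines could be used. This is only a sketch: the name `CountAssertsByLocation` is made up for illustration, and it assumes the `ASSERT FAILED (<path>:<line>): <expression>` layout seen in the syslog lines above.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: count "ASSERT FAILED" panics per source location.
// Assumes each matching line looks like:
//   KERN: PANIC: ASSERT FAILED (<path>:<line>): <expression>
std::map<std::string, int> CountAssertsByLocation(std::istream& log)
{
	static const std::string kMarker = "ASSERT FAILED (";
	std::map<std::string, int> counts;
	std::string line;
	while (std::getline(log, line)) {
		std::size_t start = line.find(kMarker);
		if (start == std::string::npos)
			continue;
		start += kMarker.size();
		// The location ends at the first ')' after the marker.
		std::size_t end = line.find(')', start);
		if (end == std::string::npos)
			continue;
		std::string location = line.substr(start, end - start);
		// Keep only "file:line", dropping the build-tree prefix.
		std::size_t slash = location.rfind('/');
		if (slash != std::string::npos)
			location.erase(0, slash + 1);
		counts[location]++;
	}
	return counts;
}
```

Passing a `std::ifstream` opened on `/var/log/syslog.old` would yield one map entry per distinct assertion site, keyed by strings such as `lock.cpp:931`.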

The problem is in this assertion: https://git.haiku-os.org/haiku/tree/src/system/kernel/locks/lock.cpp#n929

This assertion was added about two years ago in hrev53340.

The regression is suspected to have been introduced in hrev55671.

I'm attaching the syslog for the assertion message and the call stack as a picture.

Attachments (2)

photo_2021-12-03_12-34-33.jpg (243.8 KB) - added by hitech 2 years ago.
The stack
syslog.old (512.0 KB) - added by hitech 2 years ago.
The syslog


Change History (15)

by hitech, 2 years ago

The stack

by hitech, 2 years ago

Attachment: syslog.old added

The syslog

comment:1 by diver, 2 years ago

Component: changed from System to System/Kernel
Description: modified (diff)
Owner: changed from nobody to waddlesplash
Platform: changed from x86-64 to All
Status: changed from new to assigned

comment:2 by waddlesplash, 2 years ago

I'm not sure how this is possible, at first glance. The thread should only be woken up by the ConditionVariable while we are inside ConditionVariableEntry::Wait(); if the timeout occurs, the entry removes itself from the variable before returning so it can't be woken up at all, and if it returns early due to already having been woken up, it will have been removed from the variable by the variable itself. So in no circumstance should we get to the mutex_lock while the entry is still part of the variable.
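
The invariant described here can be written down as a toy model. The sketch below is not Haiku's kernel code: `ToyVariable`, `ToyEntry` and `FinishWait` are hypothetical names, and it runs single-threaded, so it cannot exhibit any interleaving of the two paths. It only states what each path is expected to guarantee on its own: once the wait is over, the entry is no longer linked to the variable.

```cpp
#include <algorithm>
#include <vector>

struct ToyVariable;

// Stand-in for a wait entry. `variable` is non-null only while the entry
// is still queued on a variable.
struct ToyEntry {
	ToyVariable* variable = nullptr;
};

// Stand-in for a condition variable holding its queued entries.
struct ToyVariable {
	std::vector<ToyEntry*> entries;

	void Add(ToyEntry& entry)
	{
		entry.variable = this;
		entries.push_back(&entry);
	}

	// Wake-up path: the variable itself unlinks every entry it wakes.
	void NotifyAll()
	{
		for (ToyEntry* entry : entries)
			entry->variable = nullptr;
		entries.clear();
	}

	// Timeout path: the entry is unlinked before the wait returns.
	void RemoveOnTimeout(ToyEntry& entry)
	{
		entries.erase(std::remove(entries.begin(), entries.end(), &entry),
			entries.end());
		entry.variable = nullptr;
	}
};

// Models the end of a wait. Returns true if the entry is unlinked, i.e. if
// it would be safe to go on and take the mutex.
bool FinishWait(ToyVariable& variable, ToyEntry& entry, bool timedOut)
{
	if (timedOut && entry.variable != nullptr)
		variable.RemoveOnTimeout(entry);
	return entry.variable == nullptr;
}
```

In this model `FinishWait()` returns true after either a timeout or a notification, and false only for an entry that is still queued, which is the state that should never be observable at the mutex_lock.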

The easiest thing to do to diagnose this may be to add a debug field indicating what woke up a thread last. I think you have a way to build Haiku images for testing, yes?
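
As a rough idea of what such a debug field might look like, here is a userspace sketch. It is an assumption-laden illustration, not a patch: `ToyThread`, `lastWaker` and `WakeThread` are hypothetical names and do not correspond to fields or functions in Haiku's thread structure.

```cpp
#include <string>

// Stand-in for a kernel thread structure.
struct ToyThread {
	int id = 0;
	bool waiting = false;
	// Hypothetical debug field: label of whatever last woke this thread.
	const char* lastWaker = "none";
};

// Wake `thread` and record who did it. Returns false, leaving the record
// untouched, if the thread was not waiting.
bool WakeThread(ToyThread& thread, const char* waker)
{
	if (!thread.waiting)
		return false;
	thread.waiting = false;
	thread.lastWaker = waker;
	return true;
}
```

With a field like this in place, the label could be printed when the assertion fires, showing whether the thread that reached the mutex_lock was last woken by the condition variable or by something else.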

in reply to:  2 comment:3 by hitech, 2 years ago

Replying to waddlesplash:

> The easiest thing to do to diagnose this may be to add a debug field indicating what woke up a thread last. I think you have a way to build Haiku images for testing, yes?

I've never built the whole Haiku image, but I suppose yes, I can do it. Please note that the same crash occurred on two different computers with different HW, but I'll test only on one of them, my main computer, because it's a problem to transfer the image to the other one.

comment:4 by waddlesplash, 2 years ago

Hmm, do you have any drivers loaded from a non-packaged directory? I find it strange you can reproduce this on two different computers but nobody else has reported anything like this. (I note that there is a media server KDL also in your syslog as well, indicating this might be unrelated to QtWebEngine altogether.)

comment:5 by diver, 2 years ago

Another user has reported the same KDL on Telegram a few days ago.

comment:6 by waddlesplash, 2 years ago

OK, I reread the code and determined there was indeed a race condition. Hopefully hrev55694 fixes this; please test.

comment:7 by waddlesplash, 2 years ago

Milestone: changed from Unscheduled to R1/beta4
Priority: changed from normal to critical

in reply to:  4 comment:8 by hitech, 2 years ago

Replying to waddlesplash:

> Hmm, do you have any drivers loaded from a non-packaged directory? I find it strange you can reproduce this on two different computers but nobody else has reported anything like this. (I note that there is a media server KDL also in your syslog as well, indicating this might be unrelated to QtWebEngine altogether.)

One of these computers is a fresh Haiku installation without any non-packaged drivers. The other has an input server filter that enables the multimedia keys on the keyboard. It is located in the non-packaged directory because I'm too shy of my drawing skills to draw an icon and to submit it to HaikuPorts :)

I'll test hrev55694. Thanks!

comment:9 by waddlesplash, 2 years ago

There was another corner case that I just fixed in hrev55700.

comment:10 by waddlesplash, 2 years ago

... and another in hrev55701.

comment:11 by waddlesplash, 2 years ago

Blocking: 17454 added

comment:12 by hitech, 2 years ago

2 days of uptime, including extensive use of Falkon, and no KDLs so far. Either the bug is solved, or I can't reproduce it.

comment:13 by waddlesplash, 2 years ago

Blocking: 17455 added
Resolution: fixed
Status: changed from assigned to closed

Very good. There's one more KDL related to ConditionVariables remaining, #17455, but that one is pretty rare and unrelated to this issue, so I think this ticket is safe to close.
