Ticket #1745 (closed bug: fixed)

Opened 17 months ago

Last modified 17 months ago

Deskbar sometimes hangs on start

Reported by: andreasf Owned by: axeld
Priority: normal Milestone: R1/alpha1
Component: Kits/Interface Kit Version: R1 development
Cc: Blocked By:
Platform: x86 Blocking:

Description

When Virtual Memory is enabled, the Deskbar hangs on startup: the feather, clock and ProcessController are visible, with an oversized, plain grey box below them. It is not redrawn (the time is not updated) and does not respond to mouse input. After killing it and restarting the desktop, it works okay.

This is mostly reproducible with SMP enabled. Haven't seen it with SMP disabled yet, nor with VM disabled.

Change History

  Changed 17 months ago by axeld

  • priority changed from normal to high
  • summary changed from Deskbar hangs with VM enabled to Deskbar sometimes hangs on start
  • milestone changed from R1 to R1/alpha

Virtual memory is currently a no-op, and I've also seen this without SMP; I just haven't found the time to investigate it. I've updated the title of the bug to better match the problem :-)

  Changed 17 months ago by stippi

I am using QEMU 0.7.2 on ZETA and can reproduce this problem quite reliably (maybe 50-60% of the boots?). Ingo, maybe a good candidate for your KDL usage demonstration?

  Changed 17 months ago by stippi

Oh, I forgot to add some more info: First, my QEMU session is definitely not running as an SMP system. Second, I mostly don't see the ProcessController replicant, just the empty space which seems to account for 3 items (Tracker, Terminal and ProcessController launched from the Bootscript). Sometimes when Deskbar hangs like this, Tracker hangs too, but not always. When Tracker does hang, and I use the Terminal to kill Deskbar, Tracker suddenly continues to start up and runs fine afterwards. I think the reason why this came up was the addition of ProcessController to the Bootscript, but it probably just shows an underlying problem which existed before.

  Changed 17 months ago by bonefish

  • owner changed from axeld to bonefish
  • status changed from new to assigned

I couldn't reproduce it in VMware or on real hardware yet, but it seems I can with QEMU.

  Changed 17 months ago by bonefish

  • owner changed from bonefish to axeld
  • status changed from assigned to new

Fixed one reason for the deadlock in r23865: BLooper::_PostMessage() was holding the BLooperList lock while calling BMessenger::SendMessage(). I believe it is still incorrect that BWindow::UpdateIfNeeded() calls DispatchMessage() while holding the message queue lock.
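
A minimal sketch of the general pattern behind that kind of fix, not the actual r23865 change: look the target up while holding the list lock, drop the lock, and only then send. All names here (SketchTarget, SketchLooperList, RunPostSketch) are hypothetical stand-ins, and std::mutex takes the place of whatever lock type the real code uses.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a message target; not a Haiku class.
struct SketchTarget {
    std::mutex lock;
    std::vector<std::string> received;

    // Stand-in for a send that may block on the target's own lock.
    void SendMessage(const std::string& message)
    {
        std::lock_guard<std::mutex> guard(lock);
        received.push_back(message);
    }
};

// Hypothetical stand-in for a global list of loopers.
class SketchLooperList {
public:
    void Add(int id, std::shared_ptr<SketchTarget> target)
    {
        std::lock_guard<std::mutex> guard(fLock);
        fTargets[id] = std::move(target);
    }

    bool PostMessage(int id, const std::string& message)
    {
        std::shared_ptr<SketchTarget> target;
        {
            // Hold the list lock only for the lookup.
            std::lock_guard<std::mutex> guard(fLock);
            auto found = fTargets.find(id);
            if (found == fTargets.end())
                return false;
            target = found->second;
        }   // list lock released before the potentially blocking send

        target->SendMessage(message);
        return true;
    }

private:
    std::mutex fLock;
    std::map<int, std::shared_ptr<SketchTarget>> fTargets;
};

// Small driver: one message to a known target, one to an unknown id.
bool RunPostSketch()
{
    SketchLooperList list;
    auto target = std::make_shared<SketchTarget>();
    list.Add(1, target);

    bool sent = list.PostMessage(1, "hello");
    bool missing = list.PostMessage(2, "nobody");
    assert(target->received.size() == 1);
    return sent && !missing;
}
```

The shared_ptr keeps the target alive after the list lock is released, which is the part that is easy to get wrong when shrinking the locked region.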

  Changed 17 months ago by axeld

  • priority changed from high to normal
  • component changed from - General to Kits/Interface Kit

Very nice! That doesn't really sound like a very deadlock-like scenario, though (the locking order is always looper then message queue). It wouldn't be too difficult to change the loop to release the message lock before calling DispatchMessage(), though.

Do you still see the hang happening?

  Changed 17 months ago by bonefish

Replying to axeld:

> Very nice! That doesn't really sound like a very deadlock-like scenario, though (the locking order is always looper then message queue). It wouldn't be too difficult to change the loop to release the message lock before calling DispatchMessage(), though.

It's not the looper lock with which I see deadlock potential. The user can easily create a deadlock by acquiring e.g. some data lock in Draw() and, in another thread, holding the same lock while sending a message to the window. Then you've got two code paths with opposing locking order (1. UpdateIfNeeded() [message queue lock] -> Draw() [data lock], 2. [data lock] -> BMessage::_SendMessage() [message queue lock]), and the user isn't even to blame.

> Do you still see the hang happening?

Nope. I just wanted to keep the ticket open until the UpdateIfNeeded() code is fixed.
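
To make the two code paths concrete, here is a rough sketch of a dispatch loop that takes each message out under the queue lock and releases that lock before the handler runs. It is an illustration under assumptions, not the BWindow code: SketchWindow, PostMessage(), the draw callback and RunSketch() are hypothetical, and a non-recursive std::mutex stands in for the message queue lock. The handler takes a data lock and, while holding it, posts a follow-up message (path 2), which is only safe because the queue lock is no longer held at that point.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>

// Hypothetical stand-in for a window with a message queue; not the Haiku API.
class SketchWindow {
public:
    // Path 2: callers may already hold a data lock when they get here,
    // so this takes only the queue lock, and only briefly.
    void PostMessage(int what)
    {
        std::lock_guard<std::mutex> queueLock(fQueueLock);
        fQueue.push_back(what);
    }

    // Path 1: remove each message under the queue lock, then release the
    // lock before the handler (the Draw() stand-in) is called.
    int UpdateIfNeeded(const std::function<void(int)>& draw)
    {
        int dispatched = 0;
        for (;;) {
            int what;
            {
                std::lock_guard<std::mutex> queueLock(fQueueLock);
                if (fQueue.empty())
                    break;
                what = fQueue.front();
                fQueue.pop_front();
            }   // queue lock released here

            draw(what);
            ++dispatched;
        }
        return dispatched;
    }

private:
    std::mutex fQueueLock;
    std::deque<int> fQueue;
};

// Driver: the handler holds a data lock while posting to the same window.
int RunSketch()
{
    SketchWindow window;
    std::mutex dataLock;
    int drawn = 0;

    window.PostMessage(3);
    int dispatched = window.UpdateIfNeeded([&](int what) {
        std::lock_guard<std::mutex> data(dataLock);
        ++drawn;
        if (what > 1)
            window.PostMessage(what - 1);   // data lock -> queue lock
    });

    assert(dispatched == drawn);
    return dispatched;
}
```

With this shape the dispatch side never holds the queue lock and the data lock at the same time, so the only remaining order is data lock then queue lock, and the two paths can no longer oppose each other.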

  Changed 17 months ago by axeld

  • status changed from new to closed
  • resolution set to fixed

I've fixed UpdateIfNeeded() in r23870, no matter if it's related to this bug or not :-)
