Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#9148 closed bug (fixed)

Enumerating interfaces hangs in kernel on first boot in safe mode

Reported by: mmadia
Owned by: anevilyak
Priority: high
Milestone: R1/beta1
Component: Network & Internet/Stack
Version: R1/alpha4
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

As mentioned in 9128#comment:7, NetworkStatus is capable of locking Deskbar when net_server is not running.

While thinking through the various combinations of isFirstBoot, isReadOnly, and isSafeMode, along with what the user may do in those scenarios (e.g., booting a CD into safe mode and installing to a writable device vs. dd'ing anyboot to a USB stick and doing a first boot into safe mode because Haiku didn't fully boot in normal mode), the most complete solution I can think of is to modify NetworkStatus to gracefully install itself into Deskbar when net_server is not present.

Perhaps a new network status icon could be made to indicate the lack of net_server? Or would that qualify as kNetworkStatusNoConnection?

Attachments (4)

bt.png (35.4 KB) - added by augiedoggie 11 years ago.
backtrace of deskbar window (requested by AnEvilYak)
bt2.png (27.3 KB) - added by augiedoggie 11 years ago.
backtrace of deskbar window right after boot
mutex_info.png (40.5 KB) - added by augiedoggie 11 years ago.
gozer.diff (559 bytes) - added by augiedoggie 11 years ago.
diff by anevilyak that seems to fix the problem


Change History (13)

comment:1 by anevilyak, 11 years ago

I've looked into this code and so far I don't really see how it could cause a hang. Nothing it does while installing itself in Deskbar nor in its constructor/AttachedToWindow() in any way talks to net_server (all of it is done via ioctls directly to the network stack at that point), so that doesn't appear to be the reason it's hanging. When it does communicate with net_server later, it uses BMessengers which would fail immediately if net_server isn't actually running. I don't really see any logical reason for it to block like this unless one of the ioctls is itself blocking, which would make it more of a kernel/network stack problem.

by augiedoggie, 11 years ago

Attachment: bt.png added

backtrace of deskbar window (requested by AnEvilYak)

comment:2 by anevilyak, 11 years ago

Component: changed from Applications/NetworkStatus to Kits/Network Kit
Status: changed from new to assigned
Summary: changed from "Must be able to install itself in Deskbar w/o net_server running." to "Enumerating interfaces hangs in kernel on first boot in safe mode"

Based on augiedoggie's results in VirtualBox (as seen in attachment:bt.png), the problem does indeed seem to reside in the network stack. Switching component/summary.

comment:3 by anevilyak, 11 years ago

Component: changed from Kits/Network Kit to Network & Internet/Stack

by augiedoggie, 11 years ago

Attachment: bt2.png added

backtrace of deskbar window right after boot

comment:4 by augiedoggie, 11 years ago

The first backtrace was after I had attempted to kill the NetworkStatus thread. Uploaded a new one that was taken right after boot although it points to the same area.

comment:5 by bonefish, 11 years ago

@augiedoggie: Please enter the kernel debugger, get the info for the thread (thread -s <thread ID>), get the info for the mutex it is waiting on (mutex <mutex address>) and a stack trace for the holder of the mutex.

by augiedoggie, 11 years ago

Attachment: mutex_info.png added

comment:6 by anevilyak, 11 years ago

Interestingly the mutex seems to have been smashed/corrupted. The only one net_timer acquires is sTimerLock, which is initialized to the name "net timer", which doesn't appear in the mutex info.

comment:7 by bonefish, 11 years ago (in reply to: comment:6)

Replying to anevilyak:

Interestingly the mutex seems to have been smashed/corrupted. The only one net_timer acquires is sTimerLock, which is initialized to the name "net timer", which doesn't appear in the mutex info.

The culprit is obviously uninit_timers(). It waits for the timer thread only after destroying the mutex, which might still be used by the former.
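The ordering problem described here can be illustrated with a small, portable sketch. This is not the Haiku network stack code or the actual fix: it uses standard C++ threads instead of kernel primitives, and every name in it (TimerState, init_timers_sketch(), uninit_timers_sketch()) is hypothetical. It only shows the safe teardown order, assuming a single timer thread that takes the lock on every iteration: ask the thread to quit, wait for it to exit, and only then destroy the lock.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the timer module's state. The names are
// illustrative only and are not taken from the Haiku sources.
struct TimerState {
    std::mutex lock;                  // plays the role of the timer lock
    std::condition_variable wakeup;   // used to wake the timer thread early
    bool quit = false;                // protected by lock
};

static std::unique_ptr<TimerState> sState;
static std::thread sTimerThread;

// Timer thread body: holds the lock except while waiting.
static void timer_loop(TimerState* state)
{
    std::unique_lock<std::mutex> locker(state->lock);
    while (!state->quit) {
        // ... process expired timers here ...
        state->wakeup.wait_for(locker, std::chrono::milliseconds(1));
    }
}

void init_timers_sketch()
{
    sState = std::make_unique<TimerState>();
    sTimerThread = std::thread(timer_loop, sState.get());
}

void uninit_timers_sketch()
{
    assert(sState != nullptr);

    // 1. Ask the thread to quit while the lock still exists.
    {
        std::lock_guard<std::mutex> locker(sState->lock);
        sState->quit = true;
    }
    sState->wakeup.notify_all();

    // 2. Wait for the thread to exit; it may still touch the lock until then.
    sTimerThread.join();

    // 3. Only now is it safe to destroy the lock and the rest of the state.
    sState.reset();
}
```

Swapping steps 2 and 3 reproduces the kind of bug described in this comment: the thread could wake up and try to acquire a lock that has already been destroyed.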

by augiedoggie, 11 years ago

Attachment: gozer.diff added

diff by anevilyak that seems to fix the problem

comment:8 by anevilyak, 11 years ago

Owner: changed from axeld to anevilyak
Status: changed from assigned to in-progress

comment:9 by anevilyak, 11 years ago

Resolution: set to fixed
Status: changed from in-progress to closed

Fixed in hrev44382. Thanks for helping track it down, augiedoggie!
