#3856 closed bug (fixed)
page fault in LegacyDevice::UninitDevice
Reported by: | marcusoverhagen | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System/Kernel | Version: | R1/pre-alpha1 |
Keywords: | Cc: | fredrik.holmqvist@… | |
Blocked By: | Blocking: | ||
Platform: | All |
Description
When switching with a USB KVM switch between machines, haiku will sometimes crash when a file descriptor belonging to the usb mouse is close()ed. See attachment.
Attachments (1)
Change History (14)
by , 16 years ago
Attachment: | 020520097499small.jpg added |
---|
comment:1 by , 16 years ago
Unfortunately I don't have a USB KVM, and I also haven't seen this bug yet. Feel free to investigate :-)
comment:2 by , 16 years ago
I've seen that too in hrev30603 while unplugging my Kinesis keyboard to which a mouse was attached.
As for a possible cause: The there are several places in legacy_drivers.cpp where the device is deleted right after a devfs_unpublish_device(, true). It is not guaranteed that after vfs_disconnect_vnode() returns there won't be anymore calls using that node, though, since it only marks the FDs disconnected. Any thread that is in put_fd() past the check could cause this problem. Unless I'm mistaken even more generally, anyone who has already acquired a reference to the vnode (or even a file_descriptor) at this point can still call into the devfs with the "stale" node.
comment:3 by , 16 years ago
Yes, indeed, the legacy device must not be deleted before the vnode has been released. This is also what makes the whole disconnect mechanism work.
comment:4 by , 16 years ago
Status: | new → assigned |
---|
comment:5 by , 16 years ago
Should be fixed in hrev30726. I'm keeping this bug open until I got a confirmation (at least for a few weeks, that is), as I can't reproduce this.
comment:6 by , 16 years ago
Hy Axel.
As I sent on an email to haiku-development, it is now crashing for me (sorry, no KDL image as I have no camera with me right now). The culprit seems to be the ps2 bus manager and the crash is in LegacyDevice::Remove(). It is obviously related to not having any ps2 devices connected to the machine as when I plug in a PS2 keyboard it works.
comment:7 by , 16 years ago
Cc: | added |
---|
That revision broke it for me as well with a kdl for ps2 service. I used auto_stack_trace but it didn't list anything except the header for stack traces. Either the computer hang, auto_stack_trace is broken or there is none. Reverting the files from hrev30726 at least proves that it is to blame. No ps2 here as well, just usb mouse and keyboard connected to a hub.
comment:8 by , 16 years ago
I can reproduce this also if I boot with my ps2 keyboard unplugged, backtrace as follows:
PANIC: vm_page_fault: unhandled page fault in kernel space at 0x2c, ip 0x8007f9bc stack trace for thread 136 "ps2 service" kernel stack: 0x9157c000 to 0x91580000 frame caller <image>:function + offset 0 9157fa34 (+ 48) 800659f1 <kernel_x86>:invoke_debugger_command + 0x00f5 1 9157fa64 (+ 64) 800657e1 <kernel_x86> invoke_pipe_segment(debugger_command_pipe*: 0x80136b20, int32: 0, 0x0 "<NULL>") + 0x0079 2 9157faa4 (+ 64) 80065b68 <kernel_x86>:invoke_debugger_command_pipe + 0x009c 3 9157fae4 (+ 48) 80067118 <kernel_x86> ExpressionParser<0x9157fb98>::_ParseCommandPipe(0x9157fb94) + 0x0234 4 9157fb14 (+ 64) 80066552 <kernel_x86> ExpressionParser<0x9157fb98>::EvaluateCommand(0x80127200 "bt", 0x9157fb94) + 0x02ba 5 9157fb54 (+ 224) 80068540 <kernel_x86>:evaluate_debug_command + 0x0088 6 9157fc34 (+ 64) 8006386a <kernel_x86> kernel_debugger_loop() + 0x01ae 7 9157fc74 (+ 32) 80064771 <kernel_x86>:kernel_debugger + 0x004d 8 9157fc94 (+ 192) 80064719 <kernel_x86>:panic + 0x0029 9 9157fd54 (+ 80) 800ca551 <kernel_x86>:vm_page_fault + 0x0139 10 9157fda4 (+ 64) 800daf65 <kernel_x86>:page_fault_exception + 0x00d9 11 9157fde4 (+ 12) 800de696 <kernel_x86>:int_bottom + 0x0036 kernel iframe at 0x9157fdf0 (end = 0x9157fe40) eax 0x1 ebx 0x2c ecx 0x8012bd60 edx 0x0 esi 0x8132c3b8 edi 0x8119f398 ebp 0x9157fe58 esp 0x9157fe24 eip 0x8007f9bc eflags 0x10286 vector: 0xe, error code: 0x0 12 9157fdf0 (+ 104) 8007f9bc <kernel_x86> LegacyDevice<0x8132c3b8>::Removed(0x812de780, 0x9157fe98) + 0x004c 13 9157fe58 (+ 48) 80076f14 <kernel_x86> devfs_delete_vnode(devfs*: 0x8119f3c0, devfs_vnode*: 0x812de780, false) + 0x006c 14 9157fe88 (+ 48) 8007807c <kernel_x86> devfs_remove_vnode(fs_volume*: 0x8119f398, fs_vnode*: 0x81330d8c, true) + 0x005c 15 9157feb8 (+ 48) 8009fea8 <kernel_x86> free_vnode(vnode*: 0x81330d8c, true) + 0x00ac 16 9157fee8 (+ 48) 800a00a4 <kernel_x86> dec_vnode_ref_count(vnode*: 0x81330d8c, false, true) + 0x0140 17 9157ff18 (+ 48) 800a4105 <kernel_x86>:put_vnode + 0x0069 18 9157ff48 (+ 48) 80079986 <kernel_x86>:devfs_unpublish_device + 0x0086 19 9157ff78 (+ 48) 91576c23 </boot/system/add-ons/kernel/bus_managers/ps2>:ps2_dev_unpublish + 0x002f 20 9157ffa8 (+ 48) 915794c0 </boot/system/add-ons/kernel/bus_managers/ps2>:ps2_service_thread + 0x00d8 21 9157ffd8 (+ 32) 800592ab <kernel_x86> _create_kernel_thread_kentry() + 0x001b 22 9157fff8 (+1856503816) 80059248 <kernel_x86> thread_kthread_exit() + 0x0000
comment:9 by , 16 years ago
After some further tracing, I see why it's crashing, but not what leads to this condition:
fRemovedFromParent is false, but fDriver is NULL. Thus fDriver->devices.Remove(this); causes the above mentioned panic, since this condition isn't checked. Should it be possible for this to occur at all, or is something more drastic wrong if fDriver is NULL at that point?
comment:10 by , 16 years ago
Boots and seems to correctly handle device replugging/unplugging properly for me with the following diff:
Index: src/system/kernel/device_manager/legacy_drivers.cpp =================================================================== --- src/system/kernel/device_manager/legacy_drivers.cpp (revision 30769) +++ src/system/kernel/device_manager/legacy_drivers.cpp (working copy) @@ -1184,7 +1184,7 @@ { RecursiveLocker _(sLock); - if (!fRemovedFromParent) + if (!fRemovedFromParent && fDriver != NULL) fDriver->devices.Remove(this); delete this;
comment:11 by , 16 years ago
Thanks Rene! I wanted to investigate this today, but didn't find the time. I'll have a look if fDriver is allowed to be NULL at that point; it sounds a bit odd indeed, at least.
comment:12 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Legacy drivers can be published without an actual driver, with the hooks only - so it's indeed perfectly okay that fDriver is NULL.
I've applied your patch, thanks!
KDL screenshot