Opened 16 years ago
Closed 16 years ago
#2894 closed bug (fixed)
Input server crashes at boot on Amilo Li2735
Reported by: | jackburton | Owned by: | axeld |
---|---|---|---|
Priority: | critical | Milestone: | R1/alpha1 |
Component: | Servers/input_server | Version: | R1/pre-alpha1 |
Keywords: | | Cc: | fredrik.holmqvist@… |
Blocked By: | | Blocking: | |
Platform: | All |
Description (last modified by )
Attachments (9)
Change History (34)
by , 16 years ago
Attachment: | Immagine.png added |
---|
comment:1 by , 16 years ago
Owner: | changed from | to
---|
comment:3 by , 16 years ago
On a laptop with no mouse attached, just a touchpad (Synaptics, so possibly related to the new driver and stuff). Clevo 120TN-R: Intel DC 2,5GHz, Intel 965. http://www.clevo.com.tw/en/products/prodinfo_2.asp?productid=63
comment:4 by , 16 years ago
Tested following Rene Gollant's suggestion:
cd src/servers/input ; svn up -r 28240
svn up -r 28240 headers/os/add-ons/input_server/InputServerDevice.h
but that seemed to hang hard, as the terminal cursor is not blinking (or the mouse and keyboard don't work).
comment:5 by , 16 years ago
Milestone: | R1 → R1/alpha1 |
---|---|
Priority: | normal → critical |
Stefano, can you add some more debug output to the MouseDevice? I haven't yet tried on my laptop, but the other machines don't expose this problem.
It would be particularly interesting to see which paths are added and removed. At least I don't see anything particularly wrong with the code itself; it could of course also be a problem in the BPathMonitor, as this one wasn't used before.
comment:6 by , 16 years ago
I have some logs but they are on the other machine, which I forgot at home. Looking at the code I also noticed that in PathList.cpp, the path_entry constructor does not initialize the ref_count member, and ref_count is not initialized anywhere else.
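To illustrate the kind of fix this observation points to, here is a minimal sketch of a reference-counted entry whose constructor sets its counter explicitly. The struct layout, the initial value of 1, and the release() helper are assumptions for illustration; the real path_entry in PathList.cpp is not shown in this ticket and may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the path_entry in PathList.cpp; the real
// class layout is not shown in this ticket.
struct path_entry {
    std::string path;
    int32_t     ref_count;

    // Initialize ref_count explicitly. Without this, the member holds an
    // indeterminate value and any acquire/release logic built on it can
    // remove entries too early or never. The starting value of 1 is an
    // assumption of this sketch.
    explicit path_entry(const std::string& p)
        : path(p), ref_count(1)
    {
    }
};

// Hypothetical helper: drops one reference and returns true when the
// last reference is gone, i.e. the entry may be removed from its list.
inline bool release(path_entry& entry)
{
    assert(entry.ref_count > 0);
    return --entry.ref_count == 0;
}
```

With a garbage ref_count, a check like the one in release() could succeed or fail arbitrarily from one run to the next, which would fit a crash that only shows up on some boots.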
comment:7 by , 16 years ago
btw, after typing "continue" ten times or so, I'm able to use the system normally.
comment:8 by , 16 years ago
That's a good hint that the missing ref_count initialization is to blame here; since the input_server is restarted by the app_server when it's gone, you have another chance to have ref_counts that don't bring it down again.
This should be fixed in hrev28277. Thanks for the investigation, and please close this bug if you can confirm it being fixed :-)
comment:9 by , 16 years ago
I'll try as soon as I can. In the meantime, maybe tqh could check if it's fixed.
by , 16 years ago
Attachment: | input_server_crash.jpg added |
---|
by , 16 years ago
Attachment: | thread_info_75(input_server).jpg added |
---|
by , 16 years ago
Attachment: | bt_of_thread_75(input_server).jpg added |
---|
comment:10 by , 16 years ago
Tested with hrev28277, but I'm getting a different crash now. I've added more images of the crash debug output.
comment:12 by , 16 years ago
The thread crashes when trying to retrieve the next message. This usually happens when memory was corrupted before, for example when processing the previous message or during setup in case it's the first message it's trying to process.
comment:13 by , 16 years ago
Replying to jackburton:
I had that crash too, also before this change.
I meant, that's the other crash I was talking about when I wrote: "although the stack trace changed once or twice" in the description.
comment:14 by , 16 years ago
Seems the previous bug also did not disappear completely. Additional debug output follows.
by , 16 years ago
Attachment: | last_part_of_syslog_thread_82.jpg added |
---|
by , 16 years ago
Attachment: | last_part_of_syslog_thread_82.2.jpg added |
---|
by , 16 years ago
Attachment: | sc_of_thread_82(add-on_manager).jpg added |
---|
comment:15 by , 16 years ago
Happens here too with hrev28289 on my dev laptop (Asus A8J) that worked nicely a week ago :) Backtrace from gdb follows.
by , 16 years ago
Attachment: | aldeck_bt.JPG added |
---|
comment:16 by , 16 years ago
I suspect it's related to the touchpad. I have the laptop here, will try to supply some debug output when (if) I have time.
comment:17 by , 16 years ago
It is indeed related to the touchpad. I investigated how the PS/2 driver recognizes the touchpad and found a lead, I hope. I first used VirtualBox to see, from its syslog, how a normal PS/2 mouse is found: it is detected on the first PS/2 mouse probe. On real hardware (touchpad), the mouse probe runs four times; the first three fail to find a mouse, and the last one finds it and lets you use it. The crash occurs after the failed ps2/mouse dev nodes are unpublished; input_server crashes after that. The PS/2 mouse was found the same way before, so the same problem already existed before hrev28241, but it wasn't crashing input_server. The hrev28241 change just reveals the bug by crashing input_server.
Hope it helps.
comment:18 by , 16 years ago
Description: | modified (diff) |
---|
I am currently investigating this. Publishing my findings so far:
herdemir is correct: Somehow, the PS/2 driver publishes a mouse twice, even when no PS/2 mouse is attached at all. On the input_server side, an InputDeviceListItem is created in _RegisterDevices(). Such objects have a member "fDevice" which is constructed in such a way that its "name" member points to memory owned by the original input_device_ref provided by the MouseDevice. Later, strcmp() is called to find the device with the same pointer for both names; I don't know if that even works.
I've fixed this in my local tree, but I can still reproduce corrupted memory when I unplug my USB mouse. It always crashes in the heap management asserts the second time I re-plug the mouse (in _RegisterDevices()).
What also happens is that InputServer::_RegisterDevices() will not let you register the same device name twice. This is documented and correct behavior. But at least with the current implementation, if two devices are added with the same name, and the input_device_ref is deleted for the second instance in the MouseDevice, there will be a mix-up and the InputDeviceListItem::fDevice::name member will point to freed memory. I don't know if that is what's actually happening though, because I don't see the output I added when removing devices. Here is some syslog output, stripped of unrelated messages:
KERN: loaded driver /boot/beos/system/add-ons/kernel/drivers/dev/input/ps2_hid
KERN: loaded driver /boot/beos/system/add-ons/kernel/drivers/dev/input/usb_hid
KERN: InputServer::RegisterDevices() device_ref: USB Keyboard 1
KERN: MouseInputDevice::_AddDevice(/dev/input/mouse/usb/0), name: Usb Mouse 1
KERN: InputServer::RegisterDevices() device_ref: Usb Mouse 1
KERN: InputServer::RegisterDevices() device_ref: Wacom Tablets
KERN: wacom: device_open() open: 2
KERN: ps2: devfs_publish_device input/mouse/ps2/0, status = 0x00000000
KERN: void AddOnManager::MessageReceived(BMessage *) what: NMP_
KERN: MouseInputDevice::_AddDevice(/dev/input/mouse/ps2/0), name: PS/2 Mouse 1
KERN: InputServer::RegisterDevices() device_ref: PS/2 Mouse 1
KERN: ps2: probe_mouse reset failed
KERN: ps2: probing mouse input/mouse/ps2/0 failed
KERN: void AddOnManager::MessageReceived(BMessage *) what: NMP_
KERN: MouseInputDevice::_AddDevice(/dev/input/mouse/ps2/0), name: PS/2 Mouse 1
KERN: InputServer::RegisterDevices() device_ref already exists: PS/2 Mouse 1
KERN: ps2: devfs_publish_device input/keyboard/at/0, status = 0x00000000
KERN: void AddOnManager::MessageReceived(BMessage *) what: NMP_
KERN: ps2: devfs_unpublish_device input/mouse/ps2/0, status = 0x00000000
KERN: InputServer::RegisterDevices() device_ref: AT Keyboard 1
KERN: ps2: keyboard found
KERN: void AddOnManager::MessageReceived(BMessage *) what: NMP_
KERN: InputServer::RegisterDevices() device_ref already exists: AT Keyboard 1
KERN: void AddOnManager::MessageReceived(BMessage *) what: NMP_
KERN: MouseInputDevice::_RemoveDevice(/dev/input/mouse/ps2/0), name: PS/2 Mouse 1
KERN: InputServer::UnregisterDevices() device_ref: PS/2 Mouse 1
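The dangling-name hazard described in this comment can be sketched as follows. This is only an illustration under assumptions: device_ref and DeviceListItem are simplified, hypothetical stand-ins for input_device_ref and InputDeviceListItem, and copying the name is one possible remedy, not necessarily what the actual fix does.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical, simplified stand-in; the real input_device_ref has more
// members than shown here.
struct device_ref {
    char* name;
};

// Hypothetical stand-in for a list item that remembers a device name.
class DeviceListItem {
public:
    // Copy the name instead of keeping the caller's pointer, so the item
    // stays valid after the caller frees or reuses its own device_ref.
    explicit DeviceListItem(const device_ref& ref)
        : fName(ref.name != nullptr ? ref.name : "")
    {
    }

    // Compare by content; this never touches the caller's buffer.
    bool HasName(const char* name) const
    {
        return std::strcmp(fName.c_str(), name) == 0;
    }

private:
    std::string fName;
};
```

Because the item owns its copy, deleting the ref of a rejected duplicate registration cannot invalidate the name held by the first instance.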
comment:19 by , 16 years ago
Axel, could you check if publishing a node in the devfs twice will still trigger a node monitor event B_ENTRY_CREATED the second time? Is this intended?
comment:20 by , 16 years ago
I had crashing problems at around the same time as the original bug as well, dumping me into gdb. I don't think the input server is to blame though since I was able to move my cursor (with a track point) and resize or drag the terminal window around (no UI response otherwise).
Now with build 28303 the problem still exists but I no longer get cursor movement when this happens. Also, I recently had a single bootup when the program in question didn't crash. Here's a link to my complete syslog: http://dl.getdropbox.com/u/128703/syslog
comment:21 by , 16 years ago
Ok, I've found the sucker. Only took me all day. The problem was introduced in hrev28242 by switching to the BObjectList and configuring it to "own the contained objects". A RemoveItem() therefore already deletes the item, but the original code that deleted it was left in place. I will commit this soon after I have cleaned it all up again. I've also found a few other problems...
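The double delete described here can be sketched with a small owning container. OwningList below is a hypothetical class that only mimics the "owns the contained objects" behaviour as it is described in this comment; it is not BObjectList's API, and Device and RemoveDevice() are likewise made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical owning list, not BObjectList.
template <typename T>
class OwningList {
public:
    // Takes ownership of the item.
    void AddItem(T* item) { fItems.emplace_back(item); }

    // Removes the item and, because the list owns it, deletes it too.
    bool RemoveItem(T* item)
    {
        auto it = std::find_if(fItems.begin(), fItems.end(),
            [item](const std::unique_ptr<T>& p) { return p.get() == item; });
        if (it == fItems.end())
            return false;
        fItems.erase(it);  // the unique_ptr destructor deletes the item
        return true;
    }

    std::size_t CountItems() const { return fItems.size(); }

private:
    std::vector<std::unique_ptr<T>> fItems;
};

// Hypothetical device type that counts live instances.
struct Device {
    static int sLive;
    Device() { ++sLive; }
    ~Device() { --sLive; }
};
int Device::sLive = 0;

// Correct removal: the list deletes the item, so the caller must not.
inline void RemoveDevice(OwningList<Device>& list, Device* device)
{
    list.RemoveItem(device);
    // delete device;  // double free if uncommented: already deleted above
}
```

Leaving a manual delete in place after such a list takes ownership corrupts the heap, which would match crashes that surface later in unrelated code such as message retrieval.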
comment:25 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Thanks for the feedback, guys!
If I add a "return B_OK" before line 483 in
MouseInputDevice::_HandleMonitor(BMessage* message),
thus skipping the device removal, I don't get the crash anymore. Obviously something is double-freed or something like that. Assigning to Axel, who might know better.