Opened 16 years ago
Closed 16 years ago
#2481 closed bug (fixed)
USB mouse and keyboard sometimes stop working
Reported by: | stippi | Owned by: | mmlr |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | Drivers/USB | Version: | R1/pre-alpha1 |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description
I searched the tickets for "USB Mouse" and didn't find this exact issue reported yet. It is not a new issue, I have been getting this from time to time for ages, but mistakenly thought that it had been reported before.
Sometimes, usually not shortly after booting but after a while of using Haiku, my USB mouse and keyboard suddenly stop working. The mouse's optical sensor turns off. My PS/2 devices continue to work. I looked at the syslog and found these lines:
[...]
usb_uhci: host controller halted
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
[...]
The last two lines keep repeating. Usually, hot plugging works well for me, but in this situation, hot plugging (switching the KVM) does not recover from the error.
Attachments (2)
Change History (35)
comment:1 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:2 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Ah darn. Just happened again (GCC4, hrev29533). :-(
follow-up: 4 comment:3 by , 16 years ago
I can reproduce a similar scenario here. I haven't had the USB drop off, but if I switch the KVM away from the system and then back, I get this in the syslog:
KERN: usb error uhci 1: td (0x02614f60) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256d3a0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256d980) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256d9a0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256d9c0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256d9e0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256da00) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256da20) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256da40) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256da60) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256da80) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256daa0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256dac0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dae0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256db00) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256db20) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256db40) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256db60) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256db80) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dba0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256dbc0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dbe0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256dc00) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dc20) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb hub 23: port 1: device removed
KERN: usb error uhci 1: td (0x0256dca0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dcc0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb hub 23: port 2: device removed
KERN: usb error uhci 1: td (0x0256dde0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256de00) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb hub 23: port 3: device removed
KERN: usb error uhci 1: td (0x0256dec0) error: status: 0x2d0507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dee0) error: status: 0x2d0507ff; token: 0x00e08669;
KERN: usb hub 23: port 1: new device connected
KERN: usb hub 23: port 2: new device connected
KERN: usb hub 23: port 3: new device connected
KERN: usb hub 81: port 1: new device connected
comment:4 by , 16 years ago
Replying to schmedly:
I can reproduce a similar scenario here. I haven't had the USB drop off, but if switch the KVM away from the system and then back I get this in the syslog:
This is not an error. This just means that all the transfers in between the unplug event and the detection of the unplug failed. They can safely be ignored.
comment:5 by , 16 years ago
I should have been more specific... I have not had the USB drop off during normal usage.
If I switch away from the system with the KVM and then come back to it, the USB keyboard and mouse are no longer responsive. Those log entries are from that exact scenario.
comment:6 by , 16 years ago
I also see this error in hrev30543 (with ata_stack, but I don't think that matters). After using Haiku for about 5 to 15 minutes, the USB mouse and keyboard suddenly stop working, and the syslog contains the message "usb uhci: host controller halted".
comment:7 by , 16 years ago
USB is stable (> 1 hour now) when not using my KVM switch to switch to different machines. Perhaps using the KVM switch CS1764 somehow triggers a bug.
comment:10 by , 16 years ago
Interesting. Sadly, "Host Controller Halted" is not a very informative error code on its own. The specs say it is set when in single-stepping mode after fetching a descriptor (which we're not using), when the run/stop bit goes to 0 (which we never do), or when an internal error occurs (the only candidate). They also list a few explicit conditions that cause it to be set: PCI master/target aborts (unlikely; I've never seen that anywhere, at least), suspend mode (which we aren't using), again run/stop (which we only set to run on initialization), or the two actual error cases, invalid max length or invalid PID. In case it is a host error, the "Host System Error" bit would be set as well; in case it's a protocol error, the "Process Error" bit should be set (likely the device not properly shutting down the connection). Any of you who can reproduce this, please change TRACE_MODULE() to TRACE_MODULE_ERROR() on lines 1716 and 1721 in src/add-ons/kernel/busses/usb/uhci.cpp to verify that. In case it's a protocol error, it's possible that we can just re-enable the schedule, but we'll likely need some cleanup to happen, and I'd like to cross-reference other implementations before making any change.
follow-up: 13 comment:12 by , 16 years ago
It also stops here for me (sometimes it runs for hours, sometimes for minutes) and I am not using any KVM switches.
*BUT* I am using the keyboard connected to the computer and the mouse connected to the keyboard (the keyboard acts as a hub).
comment:13 by , 16 years ago
Replying to bga:
It also stops here for me (sometimes it runs for hours, sometimes for minutes) and I am not using any KVM switches.
Then it's likely an unrelated issue. That it's behind a hub shouldn't matter. There is at least one known issue related to hubs, but it will either work or not, not stop after some time.
There's already another bug report open for that issue I think. If you can make out an actual error in syslog when this happens that'd be helpful. Best to just leave a "tail -F /var/log/syslog" running in a visible Terminal somewhere so you can catch it even without input devices. Also if this is a host controller going down, you should be able to connect to other ports controlled by other controllers to get the devices back.
comment:14 by , 16 years ago
I can reproduce this bug without ever using the KVM to switch computers. I am not sure that what BGA is seeing is something else necessarily.
follow-up: 16 comment:15 by , 16 years ago
This is what I got now with hrev30575:
KERN: usb uhci: process error
KERN: usb uhci: host controller halted
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
The last two lines keep repeating every second or so. So it's a "process error", whatever that means.
follow-up: 17 comment:16 by , 16 years ago
Replying to stippi:
KERN: usb uhci: process error
KERN: usb uhci: host controller halted
Thanks, so it's the device that somehow messes up either the transfer length or the PID values. I'll cross-reference other implementations and see how best to resolve such an error. In the best case, we may be able to simply acknowledge the error and restart the schedule.
Replying to stippi:
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
The last two lines keep repeating every second or so. So it's a "process error", whatever that means.
It means that an unexpected value for the PID (identifying the contents of the arriving or outgoing packet) or an invalid transfer length was received. It's most probably a device bug.
That the lines repeat is expected. In the error case the schedule is stopped and interrupts are disabled for the controller, so no transfers will work anymore and instead simply time out.
comment:17 by , 16 years ago
Replying to mmlr:
It means that an unexpected value for the PID (identifying the contents of the arriving or outgoing packet) or an invalid transfer length was received. It's most probably a device bug.
Actually it isn't. It must be something coming from the software side. The UHCI specs declare the process error a "catastrophic failure", but I wouldn't quite go that far ;-). It still can only be either a wrong PID or max length, which are set by us. I've reviewed the PIDs and there's not really anything that could go wrong; they are set in any case. Even if the wrong ones were set, that wouldn't trigger a process error but one of the other normal error cases. This leaves the max length, which I reviewed as well and couldn't make out anything suspicious. But if anything, it's the more likely candidate, as it is the more complex of the two and has a chance of being miscalculated or invalid. Of course it could also be a general memory corruption, but I'd say that's rather unlikely, as it should affect more use cases.
I can reproduce this error in one setup here, though only sporadically, and will try tracking it down now.
comment:18 by , 16 years ago
Awesome, thanks! It's like the last remaining "general" bug I keep hitting...
comment:19 by , 16 years ago
FYI, my problem is indeed exactly the same as the one reported by stippi:
2009-05-06 22:01:57 KERN: usb uhci: process error
2009-05-06 22:01:57 KERN: usb uhci: host controller halted
2009-05-06 22:01:59 KERN: usb error control pipe 65: timeout waiting for queued request to complete
2009-05-06 22:01:59 KERN: usb error hub 64: error updating port status
2009-05-06 22:02:00 KERN: usb error control pipe 65: timeout waiting for queued request to complete
2009-05-06 22:02:00 KERN: usb error hub 64: error updating port status
2009-05-06 22:02:01 KERN: usb error control pipe 65: timeout waiting for queued request to complete
comment:21 by , 16 years ago
I think I've found the cause of this. It's probably not a wrongly set descriptor field at all, but a descriptor leak. This leak only happens on control transfers apparently and this explains why you all saw it and I couldn't. Control transfers are usually only used in initial device setup and for some rare later configuration.
But in the case of hubs (and the KVM is a hub), it is used to poll the port status. This means that when there are hubs involved, it would leak about 3 descriptors per second. In the end, if the controller lived long enough, it would have exhausted the physical memory pool, but it seems that well before that, the controller stops with a process error. Probably the queue gets too long after some time (the descriptors are inactive but still part of the queue that the controller has to parse through on every iteration). It's possible that when the linked descriptor list becomes too long, the controller simply doesn't have enough time to skim through all of it and then throws a process error.
Going to fix the leak in any case (not sure where it happens yet) and then check if the other host controller drivers contain the same error.
comment:22 by , 16 years ago
Hi Michael.
Great that you are looking into this. Does your theory account for the fact that sometimes I see the problem after 5 minutes and sometimes after several hours?
comment:23 by , 16 years ago
I take it back. The leak is actually a symptom of the process error, not the other way around. And it's not even a leak at all because canceling correctly cleans up. Since all values that according to specs could cause that process error look fine I'm back to the start. I'm clueless.
comment:25 by , 16 years ago
I am very happy to report that I have run Haiku in three sessions that lasted longer than 5 to 6 hours each and have not seen this problem again. Usually, I would see it after 30 minutes to 2 hours at most. Awesome! What about you, BGA?
comment:26 by , 16 years ago
Sorry, I forgot to comment here. Yes, I used Haiku yesterday for several hours and saw no problem. The only reason I cannot say for sure that this is fixed is that it used to happen quite randomly, but based on stippi's report I guess this is really sorted out.
Thanks Michael.
follow-up: 28 comment:27 by , 16 years ago
Been running for nearly 24 hours here and no problem at all with the dedicated keyboard or mouse ports. I did however notice that the shared USB ports exhibit behavior similar to the original problem. Can anyone else verify this?
comment:28 by , 16 years ago
Replying to schmedly:
Been running for nearly 24 hours here and no problem at all with the dedicated keyboard or mouse ports. I did however notice that the shared USB ports exhibit behavior similar to the original problem. Can anyone else verify this?
I don't quite understand. What do you mean by 'shared ports' and what behaviour do you mean exactly? The one of the original report or your initial comment?
follow-up: 30 comment:29 by , 16 years ago
On this KVM model (ATEN CS-1764) there are four USB ports. Two are the dedicated console (keyboard and mouse) and two are extra ports for device sharing. I'm assuming the distinction is that PS/2 emulation may be done for the console ports.
While the most recent changes allowed those dedicated console ports to work for more than 24 hours in my test, the original behavior of the keyboard or mouse 'dropping off randomly' still occurs with other items attached to the device-sharing ports.
In this case I have two keyboards: one older Apple keyboard attached to the console port (which now works perfectly) and another (the new aluminum Apple keyboard) attached to a device-sharing port. This is because I prefer the aluminum keyboard, BUT the console ports do not provide enough power for it to work. Odd situation, I know, but it allowed me to notice that the 'dropping off' behavior continues.
follow-up: 31 comment:30 by , 16 years ago
Replying to schmedly:
I'm assuming the distinction is that PS/2 emulation may be done for the console ports.
That makes no real sense, there's no emulation going on anywhere.
While the most recent changes allowed those dedicated console ports to work for more than 24 hours in my test.. the original behavior of the 'keyboard or mouse dropping off randomly' still occurs with other items which are attached to the device sharing ports.
What does the syslog say when this happens?
Also, does this really happen randomly, or is it related to switching?
A listusb -v output would make sense in any case to judge what device topology is present.
by , 16 years ago
Attachment: | listusb-aten1764.txt added |
---|
listusb output (topology for aten cs-1764)
comment:31 by , 16 years ago
Replying to mmlr:
Replying to schmedly:
I'm assuming the distinction is that PS/2 emulation may be done for the console ports.
That makes no real sense, there's no emulation going on anywhere.
Sorry, I was told when USB KVMs first started showing up that they still used a form of PS/2 emulation internally. My point was just that the console ports are separate from the shared device ports.
What does the syslog say when this happens?
Also does this really happen randomely or is it related to switching?
A listusb -v output would make sense in any case to judge what device topology is present.
And sorry for the double post of the file, BeZilla errored. That's what I get for doing it in Haiku. I'll do more switching and add syslog next.
comment:32 by , 16 years ago
Okay it's more than 2 days of uptime, and the problem isn't exactly what I thought. Sometimes when switching back to Haiku, if the screensaver is active the screen will stay blank. However, if I switch away and switch back again, the display comes up. This behavior does not occur when the screensaver is not enabled.
comment:33 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
So I take this as a "fixed".
This was still an issue until a short while ago, but one of the very last USB fixes seems to have cured this problem for me. Today I have an uptime of 13:40 and have not had an issue with the USB devices. For a couple of days now, I have been using my desktop Haiku installation with my USB KVM switch again and have not had this issue. Maybe two weeks ago, I dug up my PS/2 KVM switch, because back then it was definitely still a problem. The revision right now is hrev29421. I will re-open if I ever encounter this again. Pretty cool, this was annoying!