Opened 11 years ago

Closed 10 years ago

#2481 closed bug (fixed)

USB mouse and keyboard sometimes stop working

Reported by: stippi Owned by: mmlr
Priority: normal Milestone: R1
Component: Drivers/USB Version: R1/pre-alpha1
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

I searched the tickets for "USB Mouse" and didn't find this exact issue reported yet. It is not a new issue, I have been getting this from time to time for ages, but mistakenly thought that it had been reported before.

Sometimes, mostly not shortly after booting, but after a while of using Haiku, my USB mouse and keyboard suddenly stop working. The sensor of the mouse goes off. My PS/2 devices continue to work. I looked at the syslog and found these lines:

[...]
usb_uhci: host controller halted
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
USB ControlPipe: timeout waiting for queued request to complete
USB Hub 2: error updating port status
[...]

The last two lines keep repeating. Usually, the hot plugging works good for me, but after this situation, the hot plugging (switching the KVM) does not recover from the error.

Attachments (2)

listusb-atenkvm.txt (10.0 KB) - added by schmedly 10 years ago.
listusb output (topology for aten cs-1764)
listusb-aten1764.txt (10.0 KB) - added by schmedly 10 years ago.
listusb output (topology for aten cs-1764)

Download all attachments as: .zip

Change History (35)

comment:1 Changed 10 years ago by stippi

Resolution: fixed
Status: newclosed

This was still an issue until a short while ago, but one of the very last USB fixes seems to have cured this problem for me. Today I have an uptime of 13:40 minutes and have not had an issue with the USB devices. Since a couple of days, I am using my desktop Haiku installation with my USB KVM switch again and have not had this issue. Maybe two weeks before, I dug up my PS/2 KVM switch, because then it was definitely still a problem. Revision right now is hrev29421. I will re-open if I ever encounter this again. Pretty cool, this was annoying!

comment:2 Changed 10 years ago by stippi

Resolution: fixed
Status: closedreopened

Ah darn. Just happened again (GCC4, hrev29533). :-(

comment:3 Changed 10 years ago by schmedly

I can reproduce a similar scenario here. I haven't had the USB drop off, but if switch the KVM away from the system and then back I get this in the syslog:

KERN: usb error uhci 1: td (0x02614f60) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256d3a0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256d980) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256d9a0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256d9c0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256d9e0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256da00) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256da20) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256da40) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256da60) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256da80) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256daa0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256dac0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dae0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256db00) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256db20) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256db40) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256db60) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256db80) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dba0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256dbc0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dbe0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb error uhci 1: td (0x0256dc00) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dc20) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb hub 23: port 1: device removed
KERN: usb error uhci 1: td (0x0256dca0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dcc0) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb hub 23: port 2: device removed
KERN: usb error uhci 1: td (0x0256dde0) error: status: 0x254507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256de00) error: status: 0x254507ff; token: 0x00e08669;
KERN: usb hub 23: port 3: device removed
KERN: usb error uhci 1: td (0x0256dec0) error: status: 0x2d0507ff; token: 0x00e08769;
KERN: usb error uhci 1: td (0x0256dee0) error: status: 0x2d0507ff; token: 0x00e08669;
KERN: usb hub 23: port 1: new device connected
KERN: usb hub 23: port 2: new device connected
KERN: usb hub 23: port 3: new device connected
KERN: usb hub 81: port 1: new device connected

comment:4 in reply to:  3 Changed 10 years ago by mmlr

Replying to schmedly:

I can reproduce a similar scenario here. I haven't had the USB drop off, but if switch the KVM away from the system and then back I get this in the syslog:

This is not an error. This just means that all the transfers in between the unplug event and the detection of the unplug failed. They can safely be ignored.

comment:5 Changed 10 years ago by schmedly

I should have been more specific... I have not had the USB drop off during normal usage.

If I switch away from the system with the KVM and then come back to it, the USB keyboard and mouse are no longer responsive. Those log entries are from that exact scenario.

comment:6 Changed 10 years ago by marcusoverhagen

I also see this error in hrev30543 (with ata_stack, but I don't think that matters) After using haiku for about 5 to 15 minutes, suddenly usb mouse and keyboard no longer work. in the syslog there is a message "usb uhci: host controller halted"

comment:7 Changed 10 years ago by marcusoverhagen

USB is stable (> 1 hour now) when not using my KVM switch to switch to different machines. Perhaps using the KVM switch CS1764 somehow triggers a bug.

comment:8 Changed 10 years ago by schmedly

I have the same KVM.

comment:9 Changed 10 years ago by stippi

Yes, same KVM here. ATEN CS-1764.

comment:10 Changed 10 years ago by mmlr

Iteresting. Sadly "Host Controller Halted" is not a very informative error code on it's own. The specs list it to be set when in single stepping mode after fetching a descriptor (which we're not using), when the run/stop bit goes to 0 (which we never do) or when an internal error occurs (only candidate). They also list a few explicit conditions that cause it to get set namely PCI master/target aborts (unlikely, I've never seen that anywhere at least), suspend mode (which we aren't using), again run/stop which we are only setting to run on initialization, or the two actual error cases invalid max length or invalid PID. In case it is a host error the "Host System Error" bit would be set as well, and in case it's a protocol error the "Process Error" bit should be set (likely the device not properly shutting down the connection). Any of you that can reproduce this please change the TRACE_MODULE() to TRACE_MODULE_ERROR() in lines 1716 and 1721 in src/add-ons/kernel/busses/usb/uhci.cpp to verfiy that. In case it's a protocol error it's possible that we can just reenable the schedule, but we'll likely need some cleanup to happen and I'd like to cross reference other implementations before doing any change.

comment:11 Changed 10 years ago by mmlr

I enabled it generally hrev30565 now, so you can just update as well.

comment:12 Changed 10 years ago by bga

It also stops here for me (sometimes it runs for hours, sometimes for minutes) and I am not using any KVM switches.

*BUT* i am using the keyboard connected to the computer and the mouse connected to the keyboard (the keyboard acts as a hub).

comment:13 in reply to:  12 Changed 10 years ago by mmlr

Replying to bga:

It also stops here for me (sometimes it runs for hours, sometimes for minutes) and I am not using any KVM switches.

Then it's likely an unrelated issue. That it's behind a hub shouldn't matter. There is at least one known issue related to hubs, but it will either work or not, not stop after some time.

There's already another bug report open for that issue I think. If you can make out an actual error in syslog when this happens that'd be helpful. Best to just leave a "tail -F /var/log/syslog" running in a visible Terminal somewhere so you can catch it even without input devices. Also if this is a host controller going down, you should be able to connect to other ports controlled by other controllers to get the devices back.

comment:14 Changed 10 years ago by stippi

I can reproduce this bug without ever using the KVM to switch computers. I am not sure that what BGA is seeing is something else necessarily.

comment:15 Changed 10 years ago by stippi

This is what I got now with hrev30575:

KERN: usb uhci: process error
KERN: usb uhci: host controller halted
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status
KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status

The last two lines keep repeating every second or so. So it's a "process error", whatever that means.

comment:16 in reply to:  15 ; Changed 10 years ago by mmlr

Replying to stippi:

KERN: usb uhci: process error
KERN: usb uhci: host controller halted

Thank's, so it's the device that somehow messes up either the transfer length or the PID values. I'll cross reference other implementations and see how to best resolve such an error. In the best case we may be able to simply acknowledge the error and restart the schedule.

Replying to stippi:

KERN: usb error control pipe 44: timeout waiting for queued request to complete
KERN: usb error hub 43: error updating port status

The last two lines keep repeating every second or so. So it's a "process error", whatever that means.

It means that an unexpected value for the PID (identifying the contents of the arriving or outgoing packet) or an invalid transfer length was received. It's most probably a device bug.

That the lines repeat is expected. In the error case the schedule is stopped and interrupts are disabled for the controller, so no transfers will work anymore and instead simply time out.

comment:17 in reply to:  16 Changed 10 years ago by mmlr

Replying to mmlr:

It means that an unexpected value for the PID (identifying the contents of the arriving or outgoing packet) or an invalid transfer length was received. It's most probably a device bug.

Actually it isn't. It must be something coming from the software side. The UHCI specs declare the process error as a "catastrophic failure", but I wouldn't quite go that far ;-). It still can only be either a wrong PID or max length, that gets set by us. I've reviewed the PIDs and there's not really anything that could go wrong. They are set in any case. Even if the wrong ones would be set that wouldn't trigger a process error but one of the other normal error cases. This leaves the max length which I reviewed as well and couldn't make out anything suspicious. But if at all it's the more likely candidate as it is the more complex of both and has a chance of being miscalculated or invalid. Of course it could also be a general memory corruption, but I'd say that's rather unlikely as it should affect more use cases.

I can reproduce this error in one setup here, though only relatively randomly and will try tracking it down now.

comment:18 Changed 10 years ago by stippi

Awesome, thanks! It's like the last remaining "general" bug I keep hitting...

comment:19 Changed 10 years ago by bga

FYI, my problem is indeed exactly the same as the one reported by stippi:

2009-05-06 22:01:57 KERN: usb uhci: process error
2009-05-06 22:01:57 KERN: usb uhci: host controller halted
2009-05-06 22:01:59 KERN: usb error control pipe 65: timeout waiting for queued request to complete
2009-05-06 22:01:59 KERN: usb error hub 64: error updating port status
2009-05-06 22:02:00 KERN: usb error control pipe 65: timeout waiting for queued request to complete
2009-05-06 22:02:00 KERN: usb error hub 64: error updating port status
2009-05-06 22:02:01 KERN: usb error control pipe 65: timeout waiting for queued request to complete

comment:20 Changed 10 years ago by bga

BTW, this time it happened with a 30 minute uptime more or less.

comment:21 Changed 10 years ago by mmlr

I think I've found the cause of this. It's probably not a wrongly set descriptor field at all, but a descriptor leak. This leak only happens on control transfers apparently and this explains why you all saw it and I couldn't. Control transfers are usually only used in initial device setup and for some rare later configuration.

But in the case of hubs (and the KVM is a hub) it is used to poll the port status. This means that when there are hubs involved it would leak about 3 descriptors per second. In the end, if the controller lived long enough, it would have exhausted the physical memory pool, but it seems that way before that the controller stops with a process error. Probably because the queue gets too long after some time (the descriptors are inactive but still part of the queue that the controller has to parse through on every iteration). It's possible that when the linked descriptor list becomes too long that the controller simply doesn't have enough time to skim through all of them and then throws a process error.

Going to fix the leak in any case (not sure where it happens yet) and then check if the other host controller drivers contain the same error.

comment:22 Changed 10 years ago by bga

Hi Michael.

Great that you are looking into this. Does your theory account for the fact that sometimes I see the problem after 5 minutes and sometimes after several hours?

comment:23 Changed 10 years ago by mmlr

I take it back. The leak is actually a symptom of the process error, not the other way around. And it's not even a leak at all because canceling correctly cleans up. Since all values that according to specs could cause that process error look fine I'm back to the start. I'm clueless.

comment:24 Changed 10 years ago by mmlr

Hopefully fixed in hrev30714, please test.

comment:25 Changed 10 years ago by stippi

I am very happy to report that I have run Haiku in three sessions that lasted longer than 5 to 6 hours each and have not seen this problem again. Usually, I would see it after 30 minutes to 2 hours at most. Awesome! What about you, BGA?

comment:26 Changed 10 years ago by bga

Sorry, I forgot to comment here. Yes, I used haiku yesterday for several hours and saw no problem. The only reason I can not say for sure that this is fixed is because it used to happen wuite randomly but based on stippi's report I guess this is really sorted up.

Thanks Michael.

comment:27 Changed 10 years ago by schmedly

Been running for nearly 24 hours here and no problem at all with the dedicated keyboard or mouse ports. I did however notice that the shared USB ports exhibit behavior similar to the original problem. Can anyone else verify this?

comment:28 in reply to:  27 Changed 10 years ago by mmlr

Replying to schmedly:

Been running for nearly 24 hours here and no problem at all with the dedicated keyboard or mouse ports. I did however notice that the shared USB ports exhibit behavior similar to the original problem. Can anyone else verify this?

I don't quite understand. What do you mean by 'shared ports' and what behaviour do you mean exactly? The one of the original report or your initial comment?

comment:29 Changed 10 years ago by schmedly

On this KVM model (ATEN CS-1764) there are four USB ports. Two are the dedicated console (keyboard and mouse) and two are extra ports for device sharing. I'm assuming the distinction is that PS/2 emulation may be done for the console ports.

While the most recent changes allowed those dedicated console ports to work for more than 24 hours in my test.. the original behavior of the 'keyboard or mouse dropping off randomly' still occurs with other items which are attached to the device sharing ports.

In this case I have two keyboards. One older Apple keyboard attached to the console port (which now works perfectly) and another (the new aluminum Apple keyboard) attached to a device sharing port. This is because I prefer the aluminum keyboard BUT the console ports do not provide enough power for it to work. Odd situation I know, but it allowed me to notice the 'dropping off' behavior continue.

comment:30 in reply to:  29 ; Changed 10 years ago by mmlr

Replying to schmedly:

I'm assuming the distinction is that PS/2 emulation may be done for the console ports.

That makes no real sense, there's no emulation going on anywhere.

While the most recent changes allowed those dedicated console ports to work for more than 24 hours in my test.. the original behavior of the 'keyboard or mouse dropping off randomly' still occurs with other items which are attached to the device sharing ports.

What does the syslog say when this happens?

Also does this really happen randomely or is it related to switching?

A listusb -v output would make sense in any case to judge what device topology is present.

Changed 10 years ago by schmedly

Attachment: listusb-atenkvm.txt added

listusb output (topology for aten cs-1764)

Changed 10 years ago by schmedly

Attachment: listusb-aten1764.txt added

listusb output (topology for aten cs-1764)

comment:31 in reply to:  30 Changed 10 years ago by schmedly

Replying to mmlr:

Replying to schmedly:

I'm assuming the distinction is that PS/2 emulation may be done for the console ports.

That makes no real sense, there's no emulation going on anywhere.

Sorry, I was told when USB KVM's first started showing up that they still used a form of PS/2 emulation internally. My point was just that the console ports are separate from the shared device port.

What does the syslog say when this happens?

Also does this really happen randomely or is it related to switching?

A listusb -v output would make sense in any case to judge what device topology is present.

And sorry for the double post of the file, BeZilla errored. That's what I get for doing it in Haiku. I'll do more switching and add syslog next.

comment:32 Changed 10 years ago by schmedly

Okay it's more than 2 days of uptime, and the problem isn't exactly what I thought. Sometimes when switching back to Haiku, if the screensaver is active the screen will stay blank. However, if I switch away and switch back again, the display comes up. This behavior does not occur when the screensaver is not enabled.

comment:33 Changed 10 years ago by mmlr

Resolution: fixed
Status: reopenedclosed

So I take this as a "fixed".

Note: See TracTickets for help on using tickets.