Opened 19 months ago

Closed 5 months ago

#13768 closed bug (fixed)

USB hub spamming syslog

Reported by: humdinger Owned by: nobody
Priority: normal Milestone: R1/beta2
Component: Drivers/USB/XHCI Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

This is hrev51536.

Since the USB3 changes in hrev51533 I un-blacklisted xhci and now see the syslog being continously spammed with:

KERN: usb hub 2: KERN: port 15 is not in an over current state
KERN: usb hub 2: port 15 was reset
KERN: usb hub 2: port 16 is not suspended
KERN: usb hub 2: port 16 is not in an over current state
KERN: usb hub 2: port 16 was reset
KERN: usb hub 2: port 17 is not suspended
KERN: usb hub 2: port 17 is not in an over current state
KERN: usb hub 2: port 17 was reset
KERN: usb hub 2: port 18 is not suspended
KERN: usb hub 2: port 18 is not in an over current state
KERN: usb hub 2: port 18 was reset
KERN: usb hub 2: port 15 is not suspended
KERN: usb hub 2: KERN: port 15 is not in an over current state
KERN: usb hub 2: port 15 was reset
KERN: usb hub 2: port 16 is not suspended
KERN: usb hub 2: port 16 is not in an over current state
KERN: usb hub 2: port 16 was reset
KERN: usb hub 2: port 17 is not suspended
KERN: usb hub 2: port 17 is not in an over current state
KERN: usb hub 2: port 17 was reset
KERN: usb hub 2: port 18 is not suspended
KERN: usb hub 2: port 18 is not in an over current state
KERN: usb hub 2: port 18 was reset
KERN: usb hub 2: port 15 is not suspended
KERN: usb hub 2: KERN: port 15 is not in an over current state
KERN: usb hub 2: port 15 was reset
KERN: usb hub 2: port 16 is not suspended
KERN: usb hub 2: port 16 is not in an over current state
KERN: usb hub 2: port 16 was reset
KERN: usb hub 2: port 17 is not suspended
KERN: usb hub 2: KERN: port 17 is not in an over current state
KERN: usb hub 2: port 17 was reset
KERN: usb hub 2: port 18 is not suspended
KERN: usb hub 2: port 18 is not in an over current state
KERN: usb hub 2: port 18 was reset

Attachments (8)

syslog_blacklisted_xhci.txt (258.3 KB) - added by humdinger 19 months ago.
All OK when XHCI is blacklisted
syslog_spam.txt (267.9 KB) - added by humdinger 19 months ago.
Spamming after a 'warm' reboot
syslog_No_spam.txt (258.7 KB) - added by humdinger 19 months ago.
No spamming when clean booting from powered off
cold_vs_warm.diff (192.0 KB) - added by waddlesplash 19 months ago.
0001-xhci-fix-hub-descriptor-command-request.patch (1.1 KB) - added by GregCrain 18 months ago.
syslog (291.9 KB) - added by hugeping 8 months ago.
usb3.patch (1.9 KB) - added by hugeping 8 months ago.
Clear status fix
0001-xhci-GetPortStatus-ClearPortFeature-PS_PLC-PS_WRC-fi.patch (2.4 KB) - added by hugeping 8 months ago.

Download all attachments as: .zip

Change History (31)

comment:1 Changed 19 months ago by waddlesplash

Component: Drivers/USB/XHCIDrivers/USB
Owner: changed from nobody to mmlr

Those messages are coming from the USB hub manager, which has them as TRACE_ALWAYS which seems odd: http://xref.plausible.coop/source/xref/haiku/src/add-ons/kernel/bus_managers/usb/Hub.cpp#325

comment:2 Changed 19 months ago by humdinger

I've switched to another port and that one isn't spamming the syslog (at least for this session...)

comment:3 in reply to:  1 ; Changed 19 months ago by mmlr

Component: Drivers/USBDrivers/USB/XHCI
Owner: changed from mmlr to nobody

Replying to waddlesplash:

Those messages are coming from the USB hub manager, which has them as TRACE_ALWAYS which seems odd

It's not odd, it's supposed to always be logged. These are change indicator bits, which are then acknowledged through the following clear feature request.

Since it's unlikely that the bits actually toggle the whole time, I'd guess that the clear feature request to reset them just doesn't go through. That would then be an issue in the corresponding driver.

What's odd is that it apparently doesn't happen on another port. Is this a root port of the machine or is there an actual hub in between? A syslog may shed some more light on the actual topology and possible issues in setting up any ports.

comment:4 Changed 19 months ago by humdinger

New info: Apparently the spamming only happens when having rebooted. When cold starting after a real power-off shutdown, there's no syslog spamming.

Hardwareinfo:

listusb

062a:4102 /dev/bus/usb/0/1 "MosArt Semiconductor Corp." "2.4G Wireless Mouse" ver. 0312
8087:07da /dev/bus/usb/0/6 "Intel Corp." "" ver. 7869
04f2:b35a /dev/bus/usb/0/7 "Chicony Electronics Co., Ltd" "Chicony USB 2.0 Camera" ver. 9162
0000:0000 /dev/bus/usb/0/hub "HAIKU Inc." "XHCI RootHub" ver. 0300
8087:8008 /dev/bus/usb/1/0/hub "Intel Corp." "" ver. 0005
0000:0000 /dev/bus/usb/1/hub "HAIKU Inc." "EHCI RootHub" ver. 0200
8087:8000 /dev/bus/usb/2/0/hub "Intel Corp." "" ver. 0005
0000:0000 /dev/bus/usb/2/hub "HAIKU Inc." "EHCI RootHub" ver. 0200

listdev

device Serial bus controller (USB controller, EHCI) [c|3|20]
  vendor 8086: Intel Corporation
  device 8c26: 8 Series/C220 Series Chipset Family USB EHCI #1

device Serial bus controller (USB controller, EHCI) [c|3|20]
  vendor 8086: Intel Corporation
  device 8c2d: 8 Series/C220 Series Chipset Family USB EHCI #2

device Serial bus controller (USB controller, XHCI) [c|3|30]
  vendor 8086: Intel Corporation
  device 8c31: 8 Series/C220 Series Chipset Family USB xHCI

I attach a spamming and a non-spamming syslog, both when my wireless mouse toggle is in the USB3 port on the left of my norebook.

comment:5 Changed 19 months ago by mmlr

The syslogs are unfortunately incomplete. The USB initialization, where the hub topology would be visible isn't in there as it happens quite early.

comment:6 Changed 19 months ago by waddlesplash

My machine has the same issue. Would just one syslog plus a listdev be good enough? Or do you really need the USB init output (if the latter, perhaps I need to write a userbootscript to copy the syslogs immediately after startup...)

Changed 19 months ago by humdinger

Attachment: syslog_blacklisted_xhci.txt added

All OK when XHCI is blacklisted

Changed 19 months ago by humdinger

Attachment: syslog_spam.txt added

Spamming after a 'warm' reboot

Changed 19 months ago by humdinger

Attachment: syslog_No_spam.txt added

No spamming when clean booting from powered off

comment:9 Changed 19 months ago by humdinger

I removed the useless syslogs and up'ed 3 new ones
1 - when the xhci is blacklisted - all OK
2 - after a 'warm' reboot - spamming (no blacklisting)
3 - when freshly booting after a powerdown - no spamming (no blacklisting)

comment:10 Changed 19 months ago by waddlesplash

Weirdly, these exact lines appear in both the warm and cold boots:

KERN: usb xhci -1: starting XHCI host controller
KERN: usb xhci -1: successfully started the controller
KERN: usb uhci: no devices found
KERN: usb ohci: no devices found
KERN: add_memory_type_range(802, 0xf7e1c000, 0x1000, 0)
KERN: sitd entry size 64, itd entry size 128
KERN: usb ehci -1: successfully started the controller
KERN: add_memory_type_range(807, 0xf7e1b000, 0x1000, 0)
KERN: sitd entry size 64, itd entry size 128
KERN: usb ehci -1: successfully started the controller
KERN: usb hub 2: port 1: new device connected
KERN: usb error xhci -1: unsuccessful command Parameter (17)
KERN: usb xhci -1: cancel queued transfers for pipe 0x827a3ec8 (0)
KERN: usb hub 2: port 6: new device connected
KERN: usb error xhci -1: unsuccessful command Parameter (17)
Last message repeated 4 times
KERN: usb xhci -1: cancel queued transfers for pipe 0x827a3ec8 (0)
KERN: usb hub 2: port 7: new device connected
KERN: usb error xhci -1: unsuccessful command Parameter (17)
KERN: usb xhci -1: cancel queued transfers for pipe 0x827a3ec8 (0)

Immediately after this in the "warm" boot the spamming begins, whereas in the "cold" boot (and after one round of spam in the "warm" boot):

KERN: usb hub 7: port 0: new device connected
KERN: usb hub 12: port 0: new device connected
KERN: get_boot_partitions(): boot volume message:

Attaching a diff the two boots for easy comparison (notable: APIC ticks are different, add_memory_type_range is off-by-one.)

Changed 19 months ago by waddlesplash

Attachment: cold_vs_warm.diff added

comment:11 Changed 18 months ago by GregCrain

Has a Patch: set

comment:12 Changed 18 months ago by GregCrain

So I found a bug in the xhci driver when getting the descriptor for a hub. I attached a patch here. I don't know if it impacts this ticket or not, since I haven't reproduced the exact same issue in this ticket. But your error log shows a lot of hub errors.

On my machine, with a USB2 hub connected on the XHCI port, it wouldn't even iterate. The command to get the hub descriptor was incorrect, and it created a stall error. The stall is not handled in the driver, and it looks like the hub driver continues trying to address it with no response, which might be the spamming issue that you see.

This is also related to ticket #13792.

comment:13 Changed 18 months ago by humdinger

I'm sad to say that this didn't fix my syslog spamming issue. Would've been a sensational one-line fix. :)

comment:15 Changed 17 months ago by pulkomandy

Has a Patch: unset

comment:16 in reply to:  3 ; Changed 15 months ago by KapiX

Replying to Michael Lotz:

It's not odd, it's supposed to always be logged. These are change indicator bits, which are then acknowledged through the following clear feature request.

I'd argue for TRACE_ALWAYS to be disabled until this issue is fixed. It causes syslog to grow about 1.5 KiB every second which wears out SSD without purpose. Unless someone is actively investigating the issue, these traces are useless.

comment:17 in reply to:  16 Changed 14 months ago by GregCrain

I'd argue for TRACE_ALWAYS to be disabled until this issue is fixed. It causes syslog to grow about 1.5 KiB every second which wears out SSD without purpose. Unless someone is actively investigating the issue, these traces are useless.

I would be good to disable logging until the problem is fixed. It's not really giving any useful information for the open issues, and just slowing the system. I don't see this spamming issue on my NEC/Renesas chipset, but it occurs consistently on an Intel chipset.

I had a computer and hard drive go bad, and now that I've got things set back up, I've been spending a lot of time trying to understand where and how things are failing with the xhci. I've added a bunch of debug output in my own build.

On my system at some point, usually after scanning partitions of a bulk device, interrupts completely stop for Normal or Control requests. All status registers, slot contexts and endpoint contexts report no errors, running, not stalled or halted.

The Ring Doorbell function gets rang until endpoint transfers used is exceeded, up to XHCI_MAX_TRANSFERS, with no interrupts. No-Op Command IRQ continues to work fine though.

Then I get "_LinkDescriptorForPipe max transfers count exceeded 8". I'm a bit baffled.

comment:18 Changed 8 months ago by hugeping

Same problem here. xhci spams syslog. Here is the log with USB tracing enabled:

https://dev.haiku-os.org/attachment/ticket/13768/syslog

Last edited 8 months ago by hugeping (previous) (diff)

Changed 8 months ago by hugeping

Attachment: syslog added

Changed 8 months ago by hugeping

Attachment: usb3.patch added

Clear status fix

comment:19 Changed 8 months ago by hugeping

Ok, i have made the patch: https://dev.haiku-os.org/attachment/ticket/13768/usb3.patch that works for me. This is still dirty, but i hope it is correct. Please, verify it. The problem was in error mixing bits number vs bitmask while clearing status.

Last edited 8 months ago by hugeping (previous) (diff)

comment:20 Changed 8 months ago by tqh

Can verify that patch does fix log issue (on my machine), have not done any actual XHCI testing.

Last edited 8 months ago by tqh (previous) (diff)

comment:21 Changed 8 months ago by korli

About the patch:

  • The features used in XHCI::ClearPortFeature() should be C_PORT_LINK_STATE and C_PORT_BH_PORT_RESET.
  • The same applies for Hub::Explore(): USB_REQUEST_CLEAR_FEATURE should be used with C_PORT_LINK_STATE and C_PORT_BH_PORT_RESET.

comment:22 Changed 8 months ago by hugeping

I have attached cleaned patch. https://dev.haiku-os.org/attachment/ticket/13768/0001-xhci-GetPortStatus-ClearPortFeature-PS_PLC-PS_WRC-fi.patch

Currently i have no idea how to work with gerrit and i have no time, exuse me. Will try it later, after reading manuals.

comment:24 Changed 7 months ago by pulkomandy

Milestone: UnscheduledR1/beta2

comment:25 Changed 5 months ago by waddlesplash

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.