Opened 9 years ago

Closed 7 years ago

#5372 closed bug (fixed)

atheroswifi doesn't disable interrupts on shutdown Aspire One Netbook / Intel 82801G

Reported by: kallisti5
Owned by: nobody
Priority: normal
Milestone: R1
Component: Drivers/Network
Version: R1/Development
Keywords:
Cc: kallisti5@…
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

Attaching USB devices does not work as of hrev35410 x86gcc2hybrid on the Acer Aspire One Netbook.

I tried multiple USB drives, all with the same result.

syslog:

KERN: usb hub 22: KERN: port 2: new device connected
KERN: usb error control pipe 26: KERN: timeout waiting for queued request to complete
KERN: usb error control pipe 26: KERN: timeout waiting for queued request to complete
KERN: usb error ehci 4: qtd (0x039ebc00) error: 0x00080248
KERN: usb error control pipe 26: timeout waiting for queued request to complete
KERN: usb error ehci 4: qtd (0x039ebe00) error: 0x00080248
KERN: usb error ehci 4: KERN: error while setting device address

Hardware: Intel 82801G controller, Intel Atom CPU.

Can't get full syslog at this time due to lack of working network and usb drivers.

Attachments (2)

syslog-aspireone.gz (37.3 KB) - added by kallisti5 9 years ago.
syslog from aspire one
devices-aspireone (3.7 KB) - added by kallisti5 9 years ago.


Change History (25)

comment:1 in reply to:  description Changed 9 years ago by mmlr

Replying to kallisti5:

Attaching USB devices does not work as of hrev35410 x86gcc2hybrid on the Acer Aspire One Netbook.

When you say "as of", does that mean a previous revision worked? If so, what was the last one you tested? Did you possibly change any BIOS settings since it last worked?

When this happens, do you also see an unusually high CPU load or the like?

comment:2 Changed 9 years ago by kallisti5

the cpu load looks normal. USB sticks used to work on this machine many moons ago (not sure what rev though, maybe in the r28xxx range)

I can boot from a Haiku USB stick.

But after installing the OS to disk, restarting, and plugging in a USB stick, the stick does not show up.

no editable bios settings which would change usb behaviour.

comment:3 Changed 9 years ago by kallisti5

just a quick note, same issue is still seen on hrev35580

comment:4 Changed 9 years ago by axeld

I also have a stick with the same symptoms - I'm almost sure it worked fine a few months ago. However, other USB devices work just fine here.

comment:5 Changed 9 years ago by axeld

Just tried on another machine running Haiku: it also does not work there. On the EeePC, it only works if it's already plugged in during boot.

Should I open another bug report for this possibly unrelated problem (the error message and code are the same, though)?

comment:6 Changed 9 years ago by mmlr

I find that very curious really. The timeouts are most likely the result of an interrupt issue, probably someone stealing USB interrupts. There haven't been many changes to USB in the past months, so if it worked before, it's likely not related to USB itself. From other reports, by Urias for example, it seems that the hda driver sometimes provokes these issues. If the hda driver is in use on your systems, could you try once with it removed, and also check whether it makes a difference whether the system is cold or warm booted?

comment:7 Changed 9 years ago by kallisti5

Cc: kallisti5@… added

Functionality still seems pretty random even without hda driver.

Removed hda driver (moved it to desktop)
Warm boot, insert flash drive, flash drive does not work with issue above
Cold boot, insert flash drive, flash drive works
Cold boot, insert flash drive, flash drive works
Warm boot, insert flash drive, flash drive does not work with issue above
Warm boot, insert flash drive, flash drive does not work with issue above

comment:8 Changed 9 years ago by axeld

The cold boot/warm boot difference doesn't look random at all.

Anyway, I guess in my case it's something different, as USB itself works well, and the HDA driver only shares an interrupt with UHCI (removing it makes no difference).

I can bring the stick to the next BeGeistert :-)

comment:9 Changed 9 years ago by kallisti5

Just a quick note: I get the same error message when plugging and unplugging a USB mouse on this system. This seems to be a general USB bug on the Intel 82801G chipset. Performing a cold start resolved the mouse issue just like it resolved the USB stick issue.

Changed 9 years ago by kallisti5

Attachment: syslog-aspireone.gz added

syslog from aspire one

Changed 9 years ago by kallisti5

Attachment: devices-aspireone added

comment:10 Changed 9 years ago by mmlr

Please don't attach zipped stuff, it's just so unhandy. Better to trim or split the log in these size-wise edge cases.

Oh and congratulations: you just won the prize for the worst possible interrupt mapping! Everything that has an interrupt is mapped to line 11. And better yet as the syslog tells:

KERN: Disabling unhandled io interrupt 11

Which should pretty much render all devices depending on interrupts useless: audio, USB, wired and wireless network. Even storage would be affected if it used the SATA interface (it doesn't, though, so it falls back to legacy interrupts 14 and 15).

It happens between AHCI and UHCI init. Since AHCI is not in use it's unlikely this triggers something, so a possible scenario is that the UHCI controller is left in a state with pending interrupts which neither the system reset nor the explicit host controller reset clear (which would be a quirk). The init of the UHCI driver then enables interrupts after having set up the interrupt handler which causes an interrupt storm. Since at this point in time UHCI has the only interrupt handler installed the interrupt code decides that it isn't a shared interrupt and therefore disables the whole line.

Another possibility, though, is that the interrupt line gets enabled because the first interrupt handler is being installed, and some other device is actually causing the storm, leading to the same end result. It's a bit hard to tell, but since there is explicit "storm protection" in the UHCI code that clears disabled UHCI interrupts, this indeed strikes me as the more likely case. So even if the UHCI controller did still have pending interrupts when enabling them, they are cleared at the first interrupt handler call at the latest.
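To make the scenario above concrete, here is a minimal, self-contained C model of a shared interrupt line that gets disabled after too many unhandled interrupts. It is only a sketch: `irq_line`, `UNHANDLED_LIMIT`, and the helper names are hypothetical and do not reflect the actual Haiku kernel code or its thresholds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLERS 4
#define UNHANDLED_LIMIT 100 /* hypothetical threshold, not Haiku's real value */

/* A handler returns true if its own device raised the interrupt. */
typedef bool (*irq_handler)(void *data);

struct irq_line {
	irq_handler handlers[MAX_HANDLERS];
	void *data[MAX_HANDLERS];
	size_t handler_count;
	unsigned unhandled_streak;
	bool enabled;
};

/* Installing a handler unmasks the line, as described in the comment above. */
static void irq_line_install(struct irq_line *line, irq_handler handler,
	void *data)
{
	assert(line->handler_count < MAX_HANDLERS);
	line->handlers[line->handler_count] = handler;
	line->data[line->handler_count] = data;
	line->handler_count++;
	line->enabled = true;
}

/* Ask every handler; disable the line after too many unclaimed interrupts. */
static bool irq_line_dispatch(struct irq_line *line)
{
	bool handled = false;

	if (!line->enabled)
		return false;

	for (size_t i = 0; i < line->handler_count; i++) {
		if (line->handlers[i](line->data[i]))
			handled = true;
	}

	if (handled) {
		line->unhandled_streak = 0;
		return true;
	}

	if (++line->unhandled_streak >= UNHANDLED_LIMIT)
		line->enabled = false;
	return false;
}

/* Handler for a device modelled as a single "interrupt pending" flag. */
static bool flag_handler(void *data)
{
	bool *pending = data;
	if (!*pending)
		return false;
	*pending = false;
	return true;
}

/*
 * Only one driver (think UHCI) has a handler installed, but the interrupts
 * come from another device whose driver is not loaded yet. Returns whether
 * the line is still enabled after `count` such interrupts.
 */
static bool line_survives_foreign_storm(unsigned count)
{
	struct irq_line line = {0};
	bool uhci_pending = false;

	irq_line_install(&line, flag_handler, &uhci_pending);
	for (unsigned i = 0; i < count; i++)
		irq_line_dispatch(&line);
	return line.enabled;
}
```

In this model the driver with the handler behaves correctly and still loses its line, which matches the observation that every device on interrupt 11 stops working once the line is disabled.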

To blame is therefore likely any of the other devices sharing this interrupt line. What you can try is to remove drivers for them one by one. The thing is that you need to cold boot once you experience the problem to make sure you don't just carry over the problem from the last reboot which makes things a bit tedious. The process would be to boot, remove a driver, if problem already exists cold boot, warm reboot to check if problem comes up and continue with removing the next driver if yes. Drivers in question would be ehci, hda, intel_extreme, the wired rtl network driver and the atheros wireless one if installed. Personally I'd start with removing ehci.

In general this is a pretty nasty problem though, because a driver can't really do anything about situations like these. It can only control the interrupts of its own device, and not reset interrupt states for others. Therefore, if such a state is present on boot, the first unsuspecting driver happening to install an interrupt handler will trigger it. A solution would be to separate device init and interrupt enabling into two separate driver calls, which might not necessarily be possible for all device types (since some may depend on interrupts for initial device setup).
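The two-call idea can be sketched roughly as follows. This is a toy model under stated assumptions: the `device` struct and both phase functions are hypothetical names, not part of any existing Haiku driver interface.

```c
#include <stdbool.h>
#include <stddef.h>

struct device {
	bool irq_enabled; /* device-level interrupt enable bit */
	bool irq_pending; /* stale interrupt left over from before the reboot */
};

/* Phase 1: bring the device to a quiet state; never enables interrupts. */
static void device_init_quiet(struct device *dev)
{
	dev->irq_enabled = false;
	dev->irq_pending = false;
}

/* Phase 2: only called once every device on the line went through phase 1. */
static void device_enable_interrupts(struct device *dev)
{
	dev->irq_enabled = true;
}

/* The shared line is asserted if any device is enabled and has work pending. */
static bool line_asserted(const struct device *devices, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (devices[i].irq_enabled && devices[i].irq_pending)
			return true;
	}
	return false;
}

/* Run phase 1 for all devices before phase 2 for any of them. */
static void bring_up_all(struct device *devices, size_t count)
{
	for (size_t i = 0; i < count; i++)
		device_init_quiet(&devices[i]);
	for (size_t i = 0; i < count; i++)
		device_enable_interrupts(&devices[i]);
}
```

The point of the ordering is that no handler is exposed to the line while another device on it may still carry stale state. As noted above, this would not work for devices that need interrupts during their initial setup.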

Another possible solution would be what Be did: disabled interrupt lines get re-enabled after a while if there is a handler, which might just let the system get far enough to call the drivers of the other devices and therefore clear the problem.
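A minimal sketch of that re-enable policy, assuming a simple tick counter as the time source. `REENABLE_DELAY_TICKS` and the struct layout are hypothetical and are not taken from BeOS or Haiku sources.

```c
#include <stdbool.h>
#include <stdint.h>

#define REENABLE_DELAY_TICKS 1000 /* hypothetical delay */

struct irq_line_state {
	bool enabled;
	unsigned handler_count;
	uint64_t disabled_at; /* tick at which the line was disabled */
};

/* Called by the storm protection when it gives up on the line. */
static void line_disable(struct irq_line_state *line, uint64_t now)
{
	line->enabled = false;
	line->disabled_at = now;
}

/*
 * Called periodically. Re-enables a disabled line once the delay has passed,
 * but only if at least one handler is installed. Returns true if the line
 * was re-enabled by this call.
 */
static bool line_maybe_reenable(struct irq_line_state *line, uint64_t now)
{
	if (line->enabled || line->handler_count == 0)
		return false;
	if (now - line->disabled_at < REENABLE_DELAY_TICKS)
		return false;
	line->enabled = true;
	return true;
}
```

If the storm is still ongoing when the line comes back, the storm protection would simply disable it again, so this only helps once the driver of the offending device has had a chance to load and quiet its hardware.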

Also, I'm not sure if we actually unload all drivers when shutting down. At least the USB host controller drivers don't necessarily clean up after themselves properly. Usually this isn't a problem, but in your case it might just be. Since the USB stack is B_KEEP_LOADED, though, I'm not even sure whether it is ever destroyed at all, which would give the drivers the chance for cleanups. If it is, a good idea would be to additionally disable/clear interrupts on host controller driver teardown.
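What such a teardown step might look like, sketched against a simulated controller rather than real hardware. The register layout is a simplification and `fake_controller` is a hypothetical stand-in; it only assumes the common pattern of an interrupt enable register plus a write-1-to-clear status register.

```c
#include <stdint.h>

/* Simulated host controller registers (not a real register map). */
struct fake_controller {
	uint16_t interrupt_enable; /* one bit per interrupt source */
	uint16_t status;           /* pending bits, write-1-to-clear */
};

/* Emulates a write to a write-1-to-clear status register. */
static void write_status(struct fake_controller *hc, uint16_t value)
{
	hc->status &= (uint16_t)~value;
}

/*
 * Teardown: mask all interrupt sources first so no new interrupt can be
 * raised, then acknowledge whatever is still pending so that nothing is
 * carried over into a warm reboot.
 */
static void controller_teardown(struct fake_controller *hc)
{
	hc->interrupt_enable = 0;
	write_status(hc, hc->status);
}
```

Masking before acknowledging avoids a window in which a freshly raised interrupt could slip in between the two steps.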

comment:11 in reply to:  10 Changed 9 years ago by kallisti5

Replying to mmlr:

Please don't attach zipped stuff, it's just so unhandy. Better to trim or split the log in these size-wise edge cases.

Sorry, I wanted to get the syslog and all its goodness in to make sure I didn't miss anything, and the log was too big.

Oh and congratulations: you just won the prize for the worst possible interrupt mapping!

Haha, yeah.. given this netbook's "modern" BIOS, I can't change the interrupts for anything. The BIOS setup is pretty basic. (FYI, I am running the latest BIOS version available for this system.) I am thinking of adding a 3G Mini PCIe card to this baby as well in the future.. let's hope that's not one more thing on int 11 ;)

What you can try is to remove drivers for them one by one. The thing is that you need to cold boot once you experience the problem to make sure you don't just carry over the problem from the last reboot which makes things a bit tedious. The process would be to boot, remove a driver, if problem already exists cold boot, warm reboot to check if problem comes up and continue with removing the next driver if yes. Drivers in question would be ehci, hda, intel_extreme, the wired rtl network driver and the atheros wireless one if installed. Personally I'd start with removing ehci.

I'll start this process this afternoon when I get home. If I can trace it to one driver I'll post it here.

Thanks for looking into this!

comment:12 Changed 9 years ago by kallisti5

Blocked By: 5492 added

comment:13 Changed 9 years ago by kallisti5

Component: changed from Drivers/USB to Drivers/Network
Owner: changed from mmlr to nobody
Summary: changed from "USB errors on Aspire One Netbook / Intel 82801G" to "atheroswifi doesn't disable interrupts on shutdown Aspire One Netbook / Intel 82801G"

atheroswifi doesn't disable interrupts on shutdown.

with atheroswifi driver installed:
Cold boot -> usb works -> warm reboot -> usb doesn't work with errors in description.

without atheroswifi driver installed:
Cold boot -> usb works -> warm reboot -> usb works.

<@mmlr_mc> in freebsd they take care to not share interrupts, so their drivers usually don't behave well when we doing it anyway

comment:14 in reply to:  13 ; Changed 9 years ago by mmadia

Blocked By: 5 added

Replying to kallisti5:

<@mmlr_mc> in freebsd they take care to not share interrupts, so their drivers usually don't behave well when we doing it anyway

I'm guessing this can be blocked by #5 as well?

comment:15 in reply to:  14 Changed 9 years ago by colin

Replying to mmadia:

Replying to kallisti5:

<@mmlr_mc> in freebsd they take care to not share interrupts, so their drivers usually don't behave well when we doing it anyway

I'm guessing this can be blocked by #5 as well?

I would say so.
As a side note: to me it looks like the atheroswifi driver is attached to irq 11 while the real hardware serves a different irq (wild guessing based on experience :). I just rechecked the wifi compat layer, and everything is in place there so that interrupts are disabled when the driver gets unloaded.
So maybe the driver never gets unloaded. This could be checked by enabling the TRACE in freebsd_network/driver.c and watching for "uninit_driver" messages printed when _fbsd_uninit_driver() is called.

comment:16 Changed 8 years ago by umccullough

Cc: umccullough added

comment:17 Changed 8 years ago by scottmc

can you recheck this with a recent Haiku build?

comment:18 Changed 8 years ago by umccullough

Cc: umccullough removed

I know interrupt issues are pretty much gone on my Acer Aspire One since the IOAPIC changes at least.

comment:19 Changed 7 years ago by modeenf

Does the problem still exist?

comment:20 Changed 7 years ago by umccullough

kallisti5, can you please retest this?

comment:21 Changed 7 years ago by kallisti5

no longer have the laptop :)

Given the recent changes, I think we can call this one resolved. Feel free to re-open if there are other complaints though.

comment:22 Changed 7 years ago by umccullough

Blocked By: 5, 5492 removed

comment:23 Changed 7 years ago by umccullough

Resolution: fixed
Status: changed from new to closed

Seems to be fixed on my Aspire One per progress in ticket #5
