Opened 10 years ago

Closed 8 years ago

Last modified 8 years ago

#5551 closed bug (fixed)

ATI/AMD SB700/SB800/SBx00 USB Interrupt Issues

Reported by: MrSunshine Owned by: mmlr
Priority: normal Milestone: R1
Component: Drivers/USB Version: R1/Development
Keywords: ati_via_dropped_interrupts Cc: vegarwa@…
Blocked By: Blocking: #6026, #6191, #6222, #6223, #6527, #7477
Has a Patch: yes Platform: All

Description (last modified by mmlr)

ID: 0x4396 Vendor: 0x1002

When inserting a USB stick I get this:
KERN: usb error control pipe 63: timeout waiting for queued request to complete
KERN: usb error ehci 5: qtd (0x03cd3c80) error 0x00080248
KERN: usb error control pipe 63: timeout waiting for queued request to complete
KERN: usb error ehci 5: qtd (0x03cd3c80) error 0x00080248
KERN: usb error ehci 5: error while setting device address
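
For readers decoding the error value: assuming the number printed after "error" is the raw qTD token dword (an assumption about what the driver logs, not something stated in this ticket), the field layout from the EHCI specification can be applied to it. A minimal Python sketch; the function and dictionary names are made up for illustration:

```python
# Bit layout of the qTD token dword as described in the EHCI specification.
# Assumption: the value in the log line is this token, printed unmodified.
STATUS_BITS = {
    7: "active",
    6: "halted",
    5: "data buffer error",
    4: "babble detected",
    3: "transaction error",
    2: "missed micro-frame",
    1: "split transaction state",
    0: "ping state / error",
}
PID_CODES = {0: "OUT", 1: "IN", 2: "SETUP", 3: "reserved"}


def decode_qtd_token(token):
    """Split a 32-bit EHCI qTD token into its named fields."""
    return {
        "status": [name for bit, name in STATUS_BITS.items() if token & (1 << bit)],
        "pid": PID_CODES[(token >> 8) & 0x3],
        "error_counter": (token >> 10) & 0x3,
        "current_page": (token >> 12) & 0x7,
        "interrupt_on_complete": bool(token & (1 << 15)),
        "bytes_to_transfer": (token >> 16) & 0x7FFF,
        "data_toggle": (token >> 31) & 0x1,
    }


fields = decode_qtd_token(0x00080248)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Under that assumption, 0x00080248 reads as an 8-byte SETUP transfer that halted with a transaction error and an error counter of zero, which would fit the failed address assignment in the log.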

Attachments (29)

ehci_sb600.diff (1.8 KB ) - added by mmlr 9 years ago.
ehci_sb600.2.diff (2.7 KB ) - added by mmlr 9 years ago.
syslog (115.0 KB ) - added by stargatefan 9 years ago.
syslog.2 (119.3 KB ) - added by stargatefan 9 years ago.
from recent 41373 build.
syslog.3 (113.9 KB ) - added by stargatefan 9 years ago.
syslog.4 (104.8 KB ) - added by stargatefan 9 years ago.
syslog.5 (164.1 KB ) - added by stargatefan 9 years ago.
syslog.6 (269.9 KB ) - added by stargatefan 9 years ago.
syslog.7 (105.1 KB ) - added by stargatefan 9 years ago.
syslog.8 (210.5 KB ) - added by stargatefan 9 years ago.
syslog.9 (142.9 KB ) - added by stargatefan 9 years ago.
tqh_syslog_sb600.txt (214.4 KB ) - added by tqh 9 years ago.
syslog for my SB600 which boots ok for reference.
syslog.10 (171.0 KB ) - added by stargatefan 9 years ago.
debugone.jpg (333.2 KB ) - added by stargatefan 9 years ago.
debugtwo.jpg (321.3 KB ) - added by stargatefan 9 years ago.
syslog.11 (167.4 KB ) - added by stargatefan 9 years ago.
syslog.12 (166.8 KB ) - added by stargatefan 9 years ago.
syslogohcichanges (373.4 KB ) - added by stargatefan 9 years ago.
syslog.13 (167.2 KB ) - added by stargatefan 9 years ago.
44758.pdf (1.0 MB ) - added by stargatefan 9 years ago.
45482.pdf (2.5 MB ) - added by stargatefan 9 years ago.
syslog.14 (170.8 KB ) - added by stargatefan 9 years ago.
41550 build gcc2
syslog.15 (163.7 KB ) - added by stargatefan 9 years ago.
41559
amdquircks (2.2 KB ) - added by stargatefan 9 years ago.
45481.pdf (557.5 KB ) - added by stargatefan 8 years ago.
42413_sb7xx_rpr_pub_1.00.pdf (617.2 KB ) - added by stargatefan 8 years ago.
sb700 register programming errata
46155_sb600_rrg_pub_3.03.pdf (1.7 MB ) - added by stargatefan 8 years ago.
sb600 errata page 42 on
syslog.16 (281.1 KB ) - added by stargatefan 8 years ago.
syslog.17 (272.4 KB ) - added by Xbertl 8 years ago.
28092011

Change History (98)

comment:1 by vegardw, 10 years ago

Cc: vegarwa@… added

comment:2 by mmlr, 10 years ago

Description: modified (diff)

This is almost certainly an interrupt issue. Someone may eat the USB interrupts (sharing problem) or they aren't correctly configured (i.e. the advertised interrupt lines aren't actually correct). If it is, there's nothing that could be done from the USB stack side of things.

comment:3 by mmlr, 10 years ago

Maybe hrev36234 changes this, though it's still most likely an interrupt issue.

comment:4 by mmlr, 9 years ago

Blocking: 6527 added

(In #6527) It's an ATI/AMD SBx00 chipset again. Closing this as a duplicate of #5551 as it is an interrupt issue on these chipsets that generally affects at least USB.

comment:5 by mmlr, 9 years ago

Summary: SB700/SB800 USB EHCI Controller not working → ATI/AMD SB700/SB800/SBx00 USB Interrupt Issues

Making this a general ticket for the SBx00 chipset USB problems.

comment:6 by mmlr, 9 years ago

Blocking: 6222 added

(In #6222) Another SBx00 chipset. It is a generic problem, no real idea what and I don't have the hardware to try it on...

comment:7 by mmlr, 9 years ago

Blocking: 6223 added

(In #6223) Replying to X512:

Maybe a duplicate of #5551. This notebook has an ATI SB700/SB800 USB controller.

Indeed, looks exactly like it. Thanks for the note!

comment:8 by mmlr, 9 years ago

Blocking: 7477 added

(In #7477) It's the same. Closing as a duplicate. I'm obviously aware of the impact of #5551, but it's kinda hard to debug and it probably would simply go away with using ACPI for IRQ routing, so it's questionable whether it's really worth it...

comment:9 by mmlr, 9 years ago

Blocking: 6026 added

(In #6026) It's a SB600 chipset and therefore a duplicate of #5551.

comment:10 by mmlr, 9 years ago

Blocking: 6191 added

(In #6191) This is another ATI/AMD SB700/SB800 issue and therefore a duplicate of #5551.

by mmlr, 9 years ago

Attachment: ehci_sb600.diff added

comment:11 by mmlr, 9 years ago

I've attached a patch that ports one of the many SBx00 quirk workarounds from NetBSD/Linux. It's difficult to tell whether it applies to the devices at hand, as there's absolutely no documentation for these bugs, so it's not easy to judge what's going on at all. If anyone with one of those super-broken SBx00 chipsets could apply it and report back, that'd be helpful.

comment:12 by stargatefan, 9 years ago

http://www.coreboot.org/Datasheets#AMD_SB700.2FSB710.2FSB750

Link to documentation on the AMD southbridges from coreboot. Will report back tonight on the patch.

comment:13 by mmlr, 9 years ago

Keywords: ati_via_dropped_interrupts added

The generic quirk of dropped interrupts seen in the BSDs might be at work here as well, causing more sporadic problems. Adding a keyword to not forget about that.

by mmlr, 9 years ago

Attachment: ehci_sb600.2.diff added

comment:14 by mmlr, 9 years ago

Has a Patch: set

comment:15 by mmlr, 9 years ago

Replying to stargatefan:

http://www.coreboot.org/Datasheets#AMD_SB700.2FSB710.2FSB750 Link to documentation on the AMD southbridges from coreboot. Will report back tonight on the patch.

Thanks, the SB700 errata and the SB600 documentation describe this bit as (Advanced) Periodic List Cache. According to the errata it is broken and the Linux workaround (the same as the NetBSD one this patch is based on) is the way to go to work around it. Updated the patch with the additional info to use descriptive vendor, device, register and value names.

comment:16 by stargatefan, 9 years ago

Figured out my issue with the patch, got the patch applied and did a build. Still no luck. I am attaching my syslog to this ticket.


by stargatefan, 9 years ago

Attachment: syslog added

comment:17 by X512, 9 years ago

[please delete this comment]


in reply to:  17 comment:18 by X512, 9 years ago

I rebuilt only the USB bus modules and replaced them. USB drives still do not work. Linux works fine. Also, there is a long pause before the disk icon during boot. If I remove the USB drivers, there is no pause.


in reply to:  11 comment:19 by stargatefan, 9 years ago

Replying to mmlr:

I've attached a patch that ports one of the many SBx00 quirk workarounds from NetBSD/Linux. It's difficult to tell whether it applies to the devices at hand, as there's absolutely no documentation for these bugs, so it's not easy to judge what's going on at all. If anyone with one of those super-broken SBx00 chipsets could apply it and report back, that'd be helpful.

It still won't work on my motherboard (the one I attached the syslog for, in another ticket or maybe this one). I also tried to boot on a 760G chipset and it flat out won't boot at all. I have not tried scrubbing the USB subsystem out of the image, but given the boot debugger data I bet it would help. It still looks like an interrupt problem: the on-screen debug output hangs when it tries to route interrupts.

I saw you made some commits today regarding the IO-APIC, IRQ routing tables, etc. Once you finish those I will retest with both boards and report back with the results.

by stargatefan, 9 years ago

Attachment: syslog.2 added

from recent 41373 build.

comment:20 by anevilyak, 9 years ago

That log appears to be missing the early part of boot where all the interrupt configuration happens. Also note that one must currently enter the boot menu and pick "Enable IOAPIC" in the safe mode options in order for those changes to take effect.

comment:21 by mmlr, 9 years ago

Please test this again with hrev41402 or newer and explicitly enabling the IO-APIC using the "Enable IO-APIC" menu item in the bootloader safemode settings menu. Please see #5 (comment 39) for additional info.

comment:22 by stargatefan, 9 years ago

Tested with the 41421 gcc4 nightly, attaching syslog. Still not working right, and the system is very unstable. Boot is really long: a good 45 seconds to 1 minute from the PCI card icon to the disk drive icon.


by stargatefan, 9 years ago

Attachment: syslog.3 added

comment:23 by anevilyak, 9 years ago

That log has the same problem the previous one did, it's missing the beginning of the boot process where all the interesting information for this ticket would be. Please enlarge the syslog size in the kernel settings and try again.

in reply to:  23 comment:24 by stargatefan, 9 years ago

Replying to anevilyak:

That log has the same problem the previous one did, it's missing the beginning of the boot process where all the interesting information for this ticket would be. Please enlarge the syslog size in the kernel settings and try again.

I am not sure why this is occurring, but there is no old syslog. According to my reading of http://www.haiku-os.org/documents/dev/system_logging, the log file should be reset and renamed oldsyslog once it reaches 512 KB, which has not happened yet. I checked the config, but for some reason it's just not putting the data into the log.

Update: I cleaned out the log file and did a reboot, and it looks like it has the info you're after.


by stargatefan, 9 years ago

Attachment: syslog.4 added

by stargatefan, 9 years ago

Attachment: syslog.5 added

by stargatefan, 9 years ago

Attachment: syslog.6 added

comment:25 by stargatefan, 9 years ago

Is there any way you can build the binaries for the USB patch for me? I am having some problems with building; it's been crashing like crazy. I don't know why, but it is what it is. Thanks. I will test the resulting binaries and see if that takes care of the USB issues.

comment:26 by anevilyak, 9 years ago

That syslog's more like it...however, it confirms that the IO-APIC isn't being used. Are you by any chance disabling ACPI on that system? The IO-APIC isn't being configured due to the ACPI module either not being loaded or not having initialized properly.

comment:27 by stargatefan, 9 years ago

In this log I switched from APIC MPS table 1.4 to MPS table 1.1. I really don't have any idea what the difference is.

I will note that during bootup, the BIOS screen shows the USB hub controllers on shared IRQs. I have tried to find a way to release the BIOS from IRQ setup, but I don't see a way to do so. I will look again. Here is an updated log with MPS table version 1.1 instead of the 1.4 used for the previous logs.

I hope this sheds some light on the subject.

by stargatefan, 9 years ago

Attachment: syslog.7 added

comment:28 by stargatefan, 9 years ago

New log file with ACPI disabled on the motherboard altogether. I don't think the settings from the safe mode menu are being kept, or the kernel settings file is misconfigured. Operator error, maybe? Regardless of what I change, the IOAPIC does not seem to become active. I did not clear the log file after syslog.7, so syslog.8 is a boot with the MPS table rev 1.1 and with APIC disabled altogether.


by stargatefan, 9 years ago

Attachment: syslog.8 added

comment:29 by stargatefan, 9 years ago

Not trying to spam the ticket but I wanted to keep the sequence of testing.

This time I went into the kernel settings file and uncommented the #apic line, and it appears to have forced ACPI on. I can assure you it is enabled in the BIOS. I have tested every config. I am going to check for a BIOS update and see if that helps the situation. In the meantime, here is the latest log file, syslog.9, with APIC MPS 1.4 and the changes made to the kernel settings file. I hope I did that right, but it's the first time I have seen IOAPIC in the log.

BIOS reflash had no effect. I also tried disabling HPET, no success there either.


by stargatefan, 9 years ago

Attachment: syslog.9 added

comment:30 by anevilyak, 9 years ago

The last one looks better, except it seems to have not picked up the setting from the menu properly (needs to be looked into). Can you try adding enable_ioapic true to your kernel settings file?

comment:31 by mmlr, 9 years ago

Replying to stargatefan:

Not trying to spam the ticket but I wanted to keep the sequence of testing.

That's fine, your testing efforts are certainly appreciated!

This time I went into the kernel settings file and uncommented the #apic line, and it appears to have forced ACPI on. I can assure you it is enabled in the BIOS. I have tested every config. I am going to check for a BIOS update and see if that helps the situation. In the meantime, here is the latest log file, syslog.9, with APIC MPS 1.4 and the changes made to the kernel settings file. I hope I did that right, but it's the first time I have seen IOAPIC in the log.

If ACPI is disabled in the BIOS then the ACPI module will fail to load, resulting in IO-APICs not being used, so that explains some of the syslogs. MPS means Multi-Processor Specification and the number denotes the version of the specs (there are really only 1.1 or 1.4). The MP tables were used to communicate the multi processor relevant part of the system configuration back when ACPI didn't exist. Now that pretty much every BIOS supports ACPI, the MP tables are only there for legacy support. While Haiku could also use the MP tables for IO-APIC configuration (as they also include the PCI interrupt routing information) this is not implemented, and with ACPI now being supported pretty much universally, it's not really sensible to implement it at all. In short: the MPS version setting doesn't matter to Haiku, only the ACPI setting does (and MPS and ACPI don't belong together, so if the BIOS labels the MPS tables with ACPI then that'd just be a mistake on the BIOS maker side).

BIOS reflash had no effect. I also tried disabling HPET, no success there either.

Yeah, as you reported it looks like the setting to enable the IO-APIC really didn't stick. The syslog actually tells that it didn't find the IO-APICs enabled. I've seen this happen over here as well some time and I think it is just a bug with the safemode settings. Unfortunately this makes it very unhandy for you to test, as you'd need to build a new kernel with the setting forced on. I'll try to investigate that problem and give you a heads up once you could try again.

in reply to:  30 ; comment:32 by stargatefan, 9 years ago

Replying to anevilyak:

The last one looks better, except it seems to have not picked up the setting from the menu properly (needs to be looked into). Can you try adding enable_ioapic true to your kernel settings file?

Absolutely, I can do this. My question might seem dumb: to make this active in the kernel settings file, it should be "enable_ioapic true" without the # in front of it, correct? I read somewhere a while back that removing the # makes the settings take effect, but I could be wrong about this. Any clarification is much appreciated.

in reply to:  31 comment:33 by stargatefan, 9 years ago

Replying to mmlr:

Replying to stargatefan:

Not trying to spam the ticket but I wanted to keep the sequence of testing.

That's fine, your testing efforts are certainly appreciated!

This time I went into the kernel settings file and uncommented the #apic line, and it appears to have forced ACPI on. I can assure you it is enabled in the BIOS. I have tested every config. I am going to check for a BIOS update and see if that helps the situation. In the meantime, here is the latest log file, syslog.9, with APIC MPS 1.4 and the changes made to the kernel settings file. I hope I did that right, but it's the first time I have seen IOAPIC in the log.

If ACPI is disabled in the BIOS then the ACPI module will fail to load, resulting in IO-APICs not being used, so that explains some of the syslogs. MPS means Multi-Processor Specification and the number denotes the version of the specs (there are really only 1.1 or 1.4). The MP tables were used to communicate the multi processor relevant part of the system configuration back when ACPI didn't exist. Now that pretty much every BIOS supports ACPI, the MP tables are only there for legacy support. While Haiku could also use the MP tables for IO-APIC configuration (as they also include the PCI interrupt routing information) this is not implemented, and with ACPI now being supported pretty much universally, it's not really sensible to implement it at all. In short: the MPS version setting doesn't matter to Haiku, only the ACPI setting does (and MPS and ACPI don't belong together, so if the BIOS labels the MPS tables with ACPI then that'd just be a mistake on the BIOS maker side).

BIOS reflash had no effect. I also tried disabling HPET, no success there either.

Yeah, as you reported it looks like the setting to enable the IO-APIC really didn't stick. The syslog actually tells that it didn't find the IO-APICs enabled. I've seen this happen over here as well some time and I think it is just a bug with the safemode settings. Unfortunately this makes it very unhandy for you to test, as you'd need to build a new kernel with the setting forced on. I'll try to investigate that problem and give you a heads up once you could try again.

Yes, I thought I was seeing that in the log, but I don't know the system/kernel well enough to really interpret the log in a meaningful way.

IOAPIC is enabled in the BIOS. It just asks for an MPS table version. Not knowing the difference, I tried both, took logs, and gave them to you.

If you can make a build with the kernel IOAPIC mode forced on, make an ISO and I will test. Unfortunately I can't build Haiku lately; it just constantly crashes during the build. It actually brings the system totally down too. I have been hesitant to put in a ticket, but if I can get some backtraces and other logs I will start a ticket for the problem. I will say it has been getting progressively worse over the last 3 weeks with every nightly. Otherwise the system seems stable enough outside of crashing during builds.

I don't get it. But if you provide a nightly ISO image (USB anyboot images don't work here: no USB), I will gladly test. Just drop a link to Dropbox or something else here.

in reply to:  32 ; comment:34 by anevilyak, 9 years ago

Replying to stargatefan:

Absolutely, I can do this. My question might seem dumb: to make this active in the kernel settings file, it should be "enable_ioapic true" without the # in front of it, correct? I read somewhere a while back that removing the # makes the settings take effect, but I could be wrong about this. Any clarification is much appreciated.

Correct, # in a setting file denotes a comment, ergo any text after it until end of line is ignored. Ergo the line you want to add should not have a # preceding it if you want it to actually have any effect.
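
To illustrate the comment rule described above, here is a small Python sketch of a simplified "name value" settings reader. It is not Haiku's actual settings parser, and the function name and sample text are hypothetical; it only shows that a line starting with # contributes nothing, while the same line without the # is picked up.

```python
def parse_settings(text):
    """Return {name: value} for every active line; '#' starts a comment."""
    settings = {}
    for line in text.splitlines():
        # Everything from '#' to the end of the line is ignored.
        active = line.split("#", 1)[0].strip()
        if not active:
            continue
        parts = active.split(None, 1)
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


# Hypothetical file content: the first line is commented out, the second is active.
sample = """\
# enable_ioapic true
enable_ioapic true
"""

print(parse_settings(sample))
```

With only the commented line present, the result would be empty, which matches the behaviour described for the kernel settings file.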

by tqh, 9 years ago

Attachment: tqh_syslog_sb600.txt added

syslog for my SB600 which boots ok for reference.

in reply to:  34 comment:35 by stargatefan, 9 years ago

Replying to anevilyak:

Replying to stargatefan:

Absolutely, I can do this. My question might seem dumb: to make this active in the kernel settings file, it should be "enable_ioapic true" without the # in front of it, correct? I read somewhere a while back that removing the # makes the settings take effect, but I could be wrong about this. Any clarification is much appreciated.

Correct, # in a setting file denotes a comment, ergo any text after it until end of line is ignored. Ergo the line you want to add should not have a # preceding it if you want it to actually have any effect.

I thought that was the case, thank you for clearing that up though.

To tqh: my high-speed USB does not work at all. Low-speed USB is OK and the machine runs, although building code does cause crashing. I don't know if it's an interrupt-related problem, which is why I haven't posted a ticket yet. I will post a debug backtrace here though, just in case it is a hardware stability issue.

comment:36 by tqh, 9 years ago

stargatefan, oops, I haven't tested USB that much, although mouse and keyboard work. Will test some more and see how it goes on my machine.

comment:37 by stargatefan, 9 years ago

Looks like IOAPIC is working on this chipset. I will test on the non-booting 760 chipset machine and report back with a separate log. This is an 870 chipset.

Log attached; syslog.10 should be the proper log name.

USB is still failing, but if I could get the build system working I could try the USB patches. Can someone attach a patched binary to try?

by stargatefan, 9 years ago

Attachment: syslog.10 added

by stargatefan, 9 years ago

Attachment: debugone.jpg added

by stargatefan, 9 years ago

Attachment: debugtwo.jpg added

comment:38 by stargatefan, 9 years ago

debugone and debugtwo are from a 760 AMD chipset motherboard. It won't even attempt booting on the latest nightlies. That's from revision 40xxx, IIRC about 2 months ago. Anyway, I don't know if it's related; it might not be. But I figured it might be beneficial. I also noticed there is no APIC or IOAPIC enabling in the BIOS. Could be a screwy BIOS.

in reply to:  37 comment:39 by mmlr, 9 years ago

Replying to stargatefan:

Looks like IOAPIC is working on this chipset. I will test on the non-booting 760 chipset machine and report back with a separate log. This is an 870 chipset. Log attached; syslog.10 should be the proper log name.

From syslog.10 it looks like you're missing the ACPI module. Not that it's disabled, but it's apparently completely missing from your install. Because of that IO-APICs were not enabled there either.

The safemode settings in the bootloader should now be fixed (hrev41505 and newer). So you should be able to try again with setting "Enable IO-APIC" (on the machine referred to in comment:33).

comment:40 by stargatefan, 9 years ago

Testing result with 41508

Now that I know how to get the build system functional again without crashing, I'll try the patches a bit later.

The depressing part is what stops the crashing :-(

Attached new log.

by stargatefan, 9 years ago

Attachment: syslog.11 added

by stargatefan, 9 years ago

Attachment: syslog.12 added

comment:41 by stargatefan, 9 years ago

syslog.12 is with the attached patch from this ticket. Still no resolution on the high-speed USB.

Added a new log; this time I removed the OHCI driver. I was thinking the USB lines might be internally multiplexed inside the SB, and by the looks of the last boot cycles in the log that may very well be the case. So the question becomes how to determine the address changes.

Got suspicious; I have seen this type of stuff on C.A.N.

Hope this log sheds a bit of light. BTW, no matter how you slice it, the Linux patch doesn't work. Is there a Linux USB driver for the 600/700 chipsets that works?

I also found this older ticket on a similar BSD issue, especially with hubs.

http://gnats.netbsd.org/40056

It didn't resolve, but it had some different ideas on the bus layout. I will keep digging around to see if I can get anything related enough to be useful.

Another patch, for ASPM problems:

http://www.spinics.net/lists/linux-usb/msg41277.html

I will try to find more information tomorrow.


by stargatefan, 9 years ago

Attachment: syslogohcichanges added

in reply to:  41 comment:42 by mmlr, 9 years ago

Replying to stargatefan:

syslog.12 is with the attached patch from this ticket. Still no resolution on the high-speed USB.

You are still missing the ACPI module:

KERN: module: Search for bus_managers/acpi/v1 failed.
KERN: acpi module not available, not configuring io-apics

If this is a general interrupt issue then it's really important to check if it'll work with IO-APICs. Please check what happened to your "/boot/system/add-ons/kernel/bus_managers/acpi" file. Is it missing or is it from an incompatible install?
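
For anyone checking their own syslog for the same condition, a quick Python sketch that scans log text for the two messages quoted above. The function name is made up for illustration, and the marker strings are copied from this ticket rather than from the kernel sources, so they may differ between revisions.

```python
# Messages quoted in this ticket that indicate the ACPI module was not found.
ACPI_MARKERS = (
    "Search for bus_managers/acpi/v1 failed",
    "acpi module not available, not configuring io-apics",
)


def find_acpi_problems(syslog_text):
    """Return (line_number, line) pairs for every line containing a marker."""
    hits = []
    for number, line in enumerate(syslog_text.splitlines(), start=1):
        if any(marker in line for marker in ACPI_MARKERS):
            hits.append((number, line.strip()))
    return hits


# Sample built from log lines that appear in this ticket.
sample = "\n".join([
    "KERN: usb error ehci 5: error while setting device address",
    "KERN: module: Search for bus_managers/acpi/v1 failed.",
    "KERN: acpi module not available, not configuring io-apics",
])

for number, line in find_acpi_problems(sample):
    print(f"line {number}: {line}")
```

An empty result only means these exact messages were not logged; it does not prove that the IO-APICs were actually configured.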

Added a new log; this time I removed the OHCI driver. I was thinking the USB lines might be internally multiplexed inside the SB, and by the looks of the last boot cycles in the log that may very well be the case. So the question becomes how to determine the address changes.

Not really sure what you're saying here. The USB lines as in the actual USB connector are obviously shared among the EHCI and companion host controller (OHCI in this case). They are routed to one or the other by means of EHCI config (that's what happens on "low-speed device connected, giving up port ownership"). It doesn't have an effect on this issue though. The problem is that interrupts from the EHCI controller don't arrive for whatever reason. This can be the controller being programmed incorrectly by the driver and therefore not generating interrupts at all, the controller being broken and needing a workaround or the interrupt routing being broken and not actually routing the interrupt to the legacy PIC or not at the vector indicated by the BIOS. If the latter then it's quite likely that using the IO-APIC would solve the issue, as then the legacy PIC routing wouldn't be used anymore.

Hope this log sheds a bit of light. BTW, no matter how you slice it, the Linux patch doesn't work. Is there a Linux USB driver for the 600/700 chipsets that works?

You mean the patch attached here? Or you mean it doesn't work for you under linux either?

I also found this older ticket on a similar BSD issue, especially with hubs.

http://gnats.netbsd.org/40056

Yes, I know. That, and by extension the Linux patches that are linked to, are the basis for the patch attached here. Some versions of these chipsets are simply broken (the AMD errata document the issues) and need such workarounds; however, those are almost always edge cases. In the case of the attached patch it'd require multiple devices being plugged in and heavily used to trigger that issue at all. So the early error in initialization seen by most people in this ticket (and its duplicates) is most certainly just an interrupt generation or routing issue, not a well-hidden chipset bug. That's why I'm so eager to check the situation with IO-APICs on.

comment:43 by mmlr, 9 years ago

Can you try with hrev41514? There's a glimpse of hope that the BIOS just disabled PCI interrupts for the devices at hand which would now be changed on initialization (and a corresponding line would be printed to debug output).

comment:44 by stargatefan, 9 years ago

rev 41516 gcc2 h

by stargatefan, 9 years ago

Attachment: syslog.13 added

comment:45 by kallisti5, 9 years ago

I see the same issue on my SB700/SB800:

kallisti5@eris:~$ lspci -nn | grep -i USB
00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller [1002:4397]
00:12.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1 Controller [1002:4398]
00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI Controller [1002:4396]
00:13.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller [1002:4397]
00:13.1 USB Controller [0c03]: ATI Technologies Inc SB700 USB OHCI1 Controller [1002:4398]
00:13.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI Controller [1002:4396]
00:14.5 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller [1002:4399]

Will test latest revision when it builds again :)
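
To check quickly whether a machine has the EHCI controller from this ticket's description (vendor 0x1002, device 0x4396), the `lspci -nn` output can be filtered by its trailing [vendor:device] pair. A Python sketch, assuming output shaped like the listing above; the function name is hypothetical.

```python
import re

# Matches "<slot> ... [vvvv:dddd]" where the ID pair closes the line.
ID_PATTERN = re.compile(r"^(\S+) .*\[([0-9a-f]{4}):([0-9a-f]{4})\]\s*$")


def find_devices(lspci_output, vendor, device):
    """Return the PCI slots whose trailing [vendor:device] pair matches."""
    slots = []
    for line in lspci_output.splitlines():
        match = ID_PATTERN.match(line.strip())
        if match and (match.group(2), match.group(3)) == (vendor, device):
            slots.append(match.group(1))
    return slots


# Two lines taken from the listing above.
sample = "\n".join([
    "00:12.0 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller [1002:4397]",
    "00:12.2 USB Controller [0c03]: ATI Technologies Inc SB700/SB800 USB EHCI Controller [1002:4396]",
])

print(find_devices(sample, "1002", "4396"))
```

A match only identifies the controller; it says nothing about whether the interrupt problem is present on that board.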

comment:46 by stargatefan, 9 years ago

mmlr "micheal" right ?

Do you think it is possible that during the interrupt reset the USB controller goes into the low-power hibernate state? I see it in the errata on page 26 of this document. Apparently there are a number of power states; they are not available in the OHCI, and according to the block diagrams the EHCI must receive a power state message to begin working.

Is the IO-APIC implementation dispatching a power state message to wake the controller up after the reset?

See pages 24-27 of the attached PDF on the 810-850 chipsets. I have a suspicion that this applies to most of the AMD 600-800 chipsets, and it could explain why the interrupt request is being ignored.

by stargatefan, 9 years ago

Attachment: 44758.pdf added

in reply to:  46 ; comment:47 by mmlr, 9 years ago

Replying to stargatefan:

mmlr "micheal" right ?

Michael, yes.

Do you think it is possible that during the interrupt reset the USB controller goes into the low-power hibernate state? I see it in the errata on page 26 of this document. Apparently there are a number of power states; they are not available in the OHCI, and according to the block diagrams the EHCI must receive a power state message to begin working.

I find that highly unlikely, and I haven't seen anything in that direction in any of the controller driver sources of any other OS. The reset procedure is narrowly defined by EHCI, and if there were problems like these you'd find corresponding quirks in most implementations.

Is the IO-APIC implementation dispatching a power state message to wake the controller up after the reset?

No and it wouldn't be the place to do anything like that either. If at all then it'd be in ACPI.

See pages 24-27 of the attached PDF on the 810-850 chipsets. I have a suspicion that this applies to most of the AMD 600-800 chipsets, and it could explain why the interrupt request is being ignored.

Most any PCI device has those power modes, so that's not really an argument. Otherwise you couldn't really suspend a laptop for example or hibernate a PC.

in reply to:  47 comment:48 by stargatefan, 9 years ago

Replying to mmlr:

Replying to stargatefan:

mmlr "micheal" right ?

Michael, yes.

I find that highly unlikely, and I haven't seen anything in that direction in any of the controller driver sources of any other OS. The reset procedure is narrowly defined by EHCI, and if there were problems like these you'd find corresponding quirks in most implementations.

Well, there are lots of problems with these chipsets on Linux, BSD and Windows (the only other OSes people use). My question was: is it possible that some boards have BIOS quirks or even chipset quirks that fall outside of the specifications? If the IOAPIC is doing everything properly and the workarounds aren't helping, then something else must be culpable.

Is the IO-APIC implementation dispatching a power state message to wake the controller up after the reset?

No and it wouldn't be the place to do anything like that either. If at all then it'd be in ACPI.

When my BIOS shows the IRQ screen, it shows several shared IRQs for USB on both the OHCI and the EHCI buses. I will snap a picture and post it in a bit.

Most any PCI device has those power modes, so that's not really an argument. Otherwise you couldn't really suspend a laptop for example or hibernate a PC.

I understand that, but maybe these chipsets or BIOSes are just quirky in the way they handle resets? How hard would it be to test such a thing? If the IOAPIC is working, then either the EHCI bus is turning off or it's operating outside of the specification. I know that to get this chipset working in Windows, you absolutely have to install the AMD driver or you get no USB 2.0. I had completely forgotten about that too. So maybe something is falling outside of the bus/PCI specification?

Anyway, I am going to get a picture of the BIOS IRQ routing screen; maybe it'll shed some light. If there is anything else I can get for you, let me know.

comment:49 by stargatefan, 9 years ago

attaching PDF of register programming information for sb810/850

by stargatefan, 9 years ago

Attachment: 45482.pdf added

by stargatefan, 9 years ago

Attachment: syslog.14 added

41550 build gcc2

comment:50 by stargatefan, 9 years ago

Also, I added a PCI USB card. I sort of needed some USB ports, and booting from CDs was starting to get pricey. That should be hub 32 and 37.

Interesting to note: it looks like the log indicates that USB is on IRQ 2 and IRQ 7, which is what the BIOS reports. First time I have seen that.

Last edited 9 years ago by stargatefan

in reply to:  50 ; comment:51 by mmlr, 9 years ago

And again...

KERN: module: Search for bus_managers/acpi/v1 failed.

...your ACPI module is missing, but I might now understand why: This is a CD you're booting, right? If that is the case then that'd explain why this happens, as the ACPI module is still missing from the bootable CD part. Can you retry that with a USB stick? For that early part it wouldn't even matter if the stick fully booted. I'd be interested in seeing 1) whether or not IO-APICs get used and if so 2) if it makes any difference on that board.

Replying to stargatefan:

Interesting to note: it looks like the log indicates that USB is on IRQ 2 and IRQ 7, which is what the BIOS reports. First time I have seen that.

Obviously, since we're not doing any routing...

in reply to:  51 ; comment:52 by tangobravo, 9 years ago

Replying to mmlr:

And again...

KERN: module: Search for bus_managers/acpi/v1 failed.

...your ACPI module is missing, but I might now understand why: This is a CD you're booting, right? If that is the case then that'd explain why this happens, as the ACPI module is still missing from the bootable CD part.

Does ACPI break CD booting or something? It would be good to add it to the CD for Alpha 3 if possible, as your recent IO-APIC work looks like it fixes lots of booting troubles, and people might miss out on those fixes if their first attempt to boot Haiku is with a CD. If it's not possible to add to the CD we should definitely recommend booting from USB as the preferred method.

in reply to:  52 ; comment:53 by mmlr, 9 years ago

Replying to tangobravo:

Does ACPI break CD booting or something? It would be good to add it to the CD for Alpha 3 if possible, as your recent IO-APIC work looks like it fixes lots of booting troubles, and people might miss out on those fixes if their first attempt to boot Haiku is with a CD.

Obviously. The reason for it not being included is the same as for many things: nobody did it / needed it / noticed it. I've noticed earlier, but hadn't completed my tests then. It's now included as of hrev41550 (or hrev41551 in the alpha branch), please retest with a CD including that change or with a different boot medium.

in reply to:  51 comment:54 by stargatefan, 9 years ago

Replying to mmlr:

And again...

KERN: module: Search for bus_managers/acpi/v1 failed.

...your ACPI module is missing, but I might now understand why: This is a CD you're booting, right? If that is the case then that'd explain why this happens, as the ACPI module is still missing from the bootable CD part. Can you retry that with a USB stick? For that early part it wouldn't even matter if the stick fully booted. I'd be interested in seeing 1) whether or not IO-APICs get used and if so 2) if it makes any difference on that board.

Replying to stargatefan:

Interesting to note: it looks like the log indicates that USB is on IRQ 2 and IRQ 7, which is what the BIOS reports. First time I have seen that.

Obviously, since we're not doing any routing...

I get the same result on the HDD boot too. Yes, that log is from a CD boot. I will retest again tonight.

in reply to:  53 comment:55 by stargatefan, 9 years ago

Replying to mmlr:

Replying to tangobravo:

Does ACPI break CD booting or something? It would be good to add it to the CD for Alpha 3 if possible, as your recent IO-APIC work looks like it fixes lots of booting troubles, and people might miss out on those fixes if their first attempt to boot Haiku is with a CD.

Obviously. The reason for it not being included is the same as for many things: nobody did it / needed it / noticed it. I've noticed earlier, but hadn't completed my tests then. It's now included as of hrev41550 (or hrev41551 in the alpha branch), please retest with a CD including that change or with a different boot medium.

No problem. That was with an svn build of 41550. Am I updating the wrong svn? I have just been running the svn up command from the directory boot/haiku/haiku.

Last edited 9 years ago by stargatefan

comment:56 by diver, 9 years ago

You need hrev41551 or hrev41552.

comment:57 by stargatefan, 9 years ago

log from 41559

Found this info on possibly related errors too. I don't know if they will be helpful, but I figured I had time to dig and see what's out there.

http://www.mail-archive.com/unionfs-cvs@www.fsl.cs.sunysb.edu/msg00497.html

http://www.kernel.org/pub/linux/kernel/people/gregkh/usb/2.5/usb-ehci-2-2.5.72.patch

http://www.mail-archive.com/linux-usb-devel@lists.sourceforge.net/msg30054.html

http://lkml.indiana.edu/hypermail/linux/kernel/0412.0/0131.html

http://mail-index.netbsd.org/netbsd-bugs/2011/04/19/msg022334.html

Do you think it's possible that the USB 3.0 chip is causing trouble? I don't think it should, but is it possible? It can be disabled in the BIOS, but I think I did that a while back with no effect.

Also found this link. I put the text in the attached file. It's a long list of commits on this page.

https://lkml.org/lkml/2011/3/21/311

It had some stuff that looks related to my chipset about inverse polarity, if I read that correctly. I attached the code snippets too.

Last edited 9 years ago by stargatefan

by stargatefan, 9 years ago

Attachment: syslog.15 added

41559

by stargatefan, 9 years ago

Attachment: amdquircks added

comment:58 by stargatefan, 8 years ago

Attaching SB800-specific documentation on enabling USB controllers during reset, pages 34-39. I found some interesting things in there and I didn't see any of these implementations in the Haiku source. It could be that I don't know it very well, however, or I could be looking in the wrong places. According to the doc there are a few specific setup procedures that must be followed to get the EHCI ports up after the ephy events and IRQ enumeration. I will poke around for the SB600 and SB700 docs to go with these. If anything, it's just more errata for the archives.

Hope it's helpful.

45481.pdf

by stargatefan, 8 years ago

Attachment: 45481.pdf added

by stargatefan, 8 years ago

sb700 register programming errata

by stargatefan, 8 years ago

sb600 errata page 42 on

comment:59 by mmlr, 8 years ago

I already have all of those documents. And you're looking at the resume state documentation that describes the required programming when waking the controllers from a sleep state. This is not what's going on here, as we're not resuming from another power state.

As for the bunch of links in the earlier comment: Now that's just random Linux EHCI patches. It's really getting crowded in this ticket, so please refrain from just adding stuff that happens to match the keyword "EHCI". It takes a lot of time to follow such links and evaluate their relevance. And in the end the distilled fixes of all those problem reports simply end up in the main sources of the BSDs and Linux, which I've already looked through for relevant information. So the individual debug attempts and interim patches just add confusion and consume extra time without getting this any further.

What's needed here is a systematic approach to find out what is actually going on. So far everything just looked like missing interrupts, so my hope was that with IO-APICs this would resolve itself. Since it didn't, I will have to add more debug output to check if the transfers were actually executed and the interrupts are just missing or if they weren't executed at all. I'm pretty sure that I remember that polling worked in the past on such chipsets. If that is correct, then it means that it simply is an interrupt issue (only), and all those errata, while possibly concerning other problems that may crop up, aren't really helpful for this specific ticket.
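As an illustration of the check described above (was a transfer executed with only the interrupt missing, or was it never executed at all), here is a minimal C++ sketch that decodes an EHCI qTD token dword. The bit layout follows the EHCI specification. The names `QtdToken`, `DecodeQtdToken`, `TransferState` and `ClassifyTransfer` are hypothetical and are not taken from the Haiku sources; this is not the debug output that was actually added to the driver.

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of an EHCI qTD token dword (bit layout per the EHCI spec).
// All names in this block are hypothetical and for illustration only.
struct QtdToken {
    bool active;            // bit 7: controller has not finished this qTD yet
    bool halted;            // bit 6: queue stopped because of a serious error
    bool dataBufferError;   // bit 5
    bool babble;            // bit 4
    bool transactionError;  // bit 3 (XactErr)
    bool missedMicroFrame;  // bit 2
    unsigned pid;           // bits 9:8: 0 = OUT, 1 = IN, 2 = SETUP
    unsigned errorCounter;  // bits 11:10 (CERR), counts down on errors
    bool interruptOnComplete; // bit 15 (IOC)
    unsigned bytesToTransfer; // bits 30:16, bytes still outstanding
};

inline QtdToken DecodeQtdToken(uint32_t token)
{
    QtdToken result;
    result.active = (token & (1u << 7)) != 0;
    result.halted = (token & (1u << 6)) != 0;
    result.dataBufferError = (token & (1u << 5)) != 0;
    result.babble = (token & (1u << 4)) != 0;
    result.transactionError = (token & (1u << 3)) != 0;
    result.missedMicroFrame = (token & (1u << 2)) != 0;
    result.pid = (token >> 8) & 0x3;
    result.errorCounter = (token >> 10) & 0x3;
    result.interruptOnComplete = (token & (1u << 15)) != 0;
    result.bytesToTransfer = (token >> 16) & 0x7fff;
    return result;
}

enum class TransferState {
    StillPending,       // Active set: the controller has not processed it
    CompletedOk,        // Active clear, no error bits: the transfer ran
    CompletedWithError  // Active clear, at least one error bit set
};

// If this reports CompletedOk but no interrupt was ever seen for the
// transfer, the hardware did its work and only the interrupt went missing.
inline TransferState ClassifyTransfer(const QtdToken& token)
{
    if (token.active)
        return TransferState::StillPending;
    if (token.halted || token.dataBufferError || token.babble
        || token.transactionError || token.missedMicroFrame)
        return TransferState::CompletedWithError;
    return TransferState::CompletedOk;
}
```

Assuming the value printed in the "qtd (...) error 0x00080248" lines of the ticket description is the raw token dword, `DecodeQtdToken(0x00080248)` gives a SETUP stage with 8 bytes outstanding, the Halted and Transaction Error bits set, and an error counter of 0.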

in reply to:  59 comment:60 by stargatefan, 8 years ago

Replying to mmlr:

I already have all of those documents. And you're looking at the resume state documentation that describes the required programming when waking the controllers from a sleep state. This is not what's going on here, as we're not resuming from another power state.

As for the bunch of links in the earlier comment: Now that's just random Linux EHCI patches. It's really getting crowded in this ticket, so please refrain from just adding stuff that happens to match the keyword "EHCI". It takes a lot of time to follow such links and evaluate their relevance. And in the end the distilled fixes of all those problem reports simply end up in the main sources of the BSDs and Linux, which I've already looked through for relevant information. So the individual debug attempts and interim patches just add confusion and consume extra time without getting this any further.

What's needed here is a systematic approach to find out what is actually going on. So far everything just looked like missing interrupts, so my hope was that with IO-APICs this would resolve itself. Since it didn't, I will have to add more debug output to check if the transfers were actually executed and the interrupts are just missing or if they weren't executed at all. I'm pretty sure that I remember that polling worked in the past on such chipsets. If that is correct, then it means that it simply is an interrupt issue (only), and all those errata, while possibly concerning other problems that may crop up, aren't really helpful for this specific ticket.

Sorry, I knew you were busy and figured I would try to save you some legwork. I have been specifically hunting down stuff related only to the AMD EHCI implementation and interrupt requests and failures.

Do you have some spare DDR3 RAM and an HDD hanging around? I think I can get an SB800 board and CPU to you if you're in the States. Certainly this should help?

The latest changes to the IO-APIC or EHCI stack have now caused breakage with the NEC chip card in this machine. I am attaching that log file: syslog.16.

Last edited 8 years ago by stargatefan

by stargatefan, 8 years ago

Attachment: syslog.16 added

comment:61 by mmlr, 8 years ago

I have a SB600 machine available now and I can reproduce the issue. While I haven't yet figured out what's going on, I was able to confirm that the transfers actually work and the status registers are fine, just the interrupts are missing.

I've implemented a simplistic poll mode for EHCI in hrev41688. If you set "ehci_polling on" in your kernel settings file and/or using the new advanced debug options in the bootloader, the EHCI controller should at least be usable. The performance is reasonable and the resource consumption is relatively low, so this should be fine for a temporary workaround.
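For readers unfamiliar with the idea, a poll mode of this kind can be sketched roughly as follows. This is not the code from hrev41688; it only shows the general technique of reading the EHCI status register periodically and acknowledging pending bits, instead of waiting for an interrupt that never arrives. The USBSTS bit positions and the write-1-to-clear behaviour come from the EHCI specification; `FakeStatusRegister` and `PollOnce` are hypothetical names, and the fake register stands in for the memory-mapped hardware register.

```cpp
#include <cassert>
#include <cstdint>

// EHCI USBSTS bits that a driver would normally learn about via interrupt.
constexpr uint32_t kStsUsbInt = 1u << 0;          // transfer completed
constexpr uint32_t kStsUsbErrorInt = 1u << 1;     // transfer completed with error
constexpr uint32_t kStsPortChange = 1u << 2;      // port change detect
constexpr uint32_t kStsHostSystemError = 1u << 4; // host system error
constexpr uint32_t kStsAsyncAdvance = 1u << 5;    // interrupt on async advance
constexpr uint32_t kStsHandledMask = kStsUsbInt | kStsUsbErrorInt
    | kStsPortChange | kStsHostSystemError | kStsAsyncAdvance;

// Hypothetical stand-in for the memory-mapped USBSTS register. The handled
// status bits are write-1-to-clear, which Write() emulates.
struct FakeStatusRegister {
    uint32_t value = 0;

    uint32_t Read() const { return value; }
    void Write(uint32_t bits) { value &= ~bits; }
};

// One polling pass: read the status, acknowledge whatever is pending and
// return the bits that an interrupt handler would otherwise have seen.
// A real driver would call this from a timer or a dedicated thread and
// then run its normal completion handling on the returned bits.
inline uint32_t PollOnce(FakeStatusRegister& usbsts)
{
    uint32_t pending = usbsts.Read() & kStsHandledMask;
    if (pending != 0)
        usbsts.Write(pending);
    return pending;
}
```

The trade-off is the usual one for polling: completions are noticed at most one polling interval late and some CPU time is spent on passes that find nothing, which fits the description of a temporary workaround.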

comment:62 by mmlr, 8 years ago

Resolution: fixed
Status: new → closed

Should finally be fixed in hrev41690. At least on my SB600 machine, EHCI works fine now. Please reopen if you still run into those errors.

comment:63 by stargatefan, 8 years ago

EHCI does work now. Congrats.

comment:64 by Xbertl, 8 years ago

Resolution: fixed
Status: closed → reopened

I have a "MSI K9A2 Platinum (NB:790FX; SB:SB600)" and EHCI is not working

Build: x86 GCC 2 Hybrid hrev42768

by Xbertl, 8 years ago

Attachment: syslog.17 added

28092011

comment:65 by diver, 8 years ago

Blocking: 8002 added

in reply to:  64 comment:66 by mmlr, 8 years ago

Replying to Xbertl:

I have a "MSI K9A2 Platinum (NB:790FX; SB:SB600)" and EHCI is not working

That's a different issue. Apparently, looking at the very last line of the log, the interrupts arrive at the old (BIOS configured) vector, not the one advertised by ACPI. There's not much one can do in such cases, as the ACPI info simply seems broken. You could try updating your BIOS to see if that helps. In either case, please create a new ticket for that problem and, if you can, attach a log from a Linux boot as well. Also try disabling the IO-APIC and/or ACPI in the boot menu to see if it works when not using that routing information.

comment:67 by mmlr, 8 years ago

Blocking: 8002 removed

(In #8002) This looks like a more generic interrupt issue, as it affects both OHCI and EHCI. Does it also affect networking, i.e. does networking work? You mention that you tried without ACPI, can you also post a log with ACPI on to check the interrupt routing info it provides?

comment:68 by mmlr, 8 years ago

Resolution: fixed
Status: reopened → closed

This issue was definitely fixed by hrev41690, so new issues aren't likely to be related to this one anymore. Please open new tickets for further issues that need to be looked into.

comment:69 by mmlr, 8 years ago

Someone tried to contact me privately (via e-mail) about this ticket, but the mail_daemon (again) ate the mail before I could take a look at it. So if that person sees this, please re-send your message to me or post here.
