Opened 10 years ago

Closed 10 years ago

#4101 closed bug (fixed)

nVidia CK804 OHCI USB controller detects only 1 connected device

Reported by: monni Owned by: mmlr
Priority: normal Milestone: R1
Component: Drivers/USB Version: R1/pre-alpha1
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

Revision: 31621 / gcc4

When I boot with EHCI driver removed, Haiku does only show USB mouse connected even though I have Bluetooth dongle connected. Power led on dongle doesn't get lit.

If I boot with EHCI driver still in place, power led on dongle works correctly but there is no other activity on that port and Bluetooth driver soft-locks when trying to initialize. Removing the dongle resumes "normal" operation of software.

With USB tracing enabled syslog doesn't say anything about device insertion or status change.

Attachments (3)

syslog.ohci (136.1 KB ) - added by monni 10 years ago.
syslog without ehci & usb_disk
syslog (388.1 KB ) - added by monni 10 years ago.
syslog with normal boot
listusb (8.6 KB ) - added by monni 10 years ago.
listusb with normal boot

Download all attachments as: .zip

Change History (29)

comment:1 by mmlr, 10 years ago

Please attach a full syslog. This info doesn't help really, as I don't happen to have said hardware.

There are numerous reasons for this description to match, but I guess you already invalidated most of them. For one, it's possible that said port is EHCI only. Then it's always possible that it's a hardware fault on said port/controller. It's also possible that the device does only work in USB 2.0 mode. You can verify most of that by plugging in the different devices at different ports. Please tell what you've already tried.

Also what output do you get when the "soft-lock" happens? What exactly do you mean by "soft-lock"?

If you are able to, please also check with a gcc2 build to verify that there's no problem with how the driver is compiled (similar effects have been seen with the radeon driver for example).

by monni, 10 years ago

Attachment: syslog.ohci added

syslog without ehci & usb_disk

by monni, 10 years ago

Attachment: syslog added

syslog with normal boot

by monni, 10 years ago

Attachment: listusb added

listusb with normal boot

comment:2 by monni, 10 years ago

Added syslogs with just non-working devices...

I tried...

  1. replacing USB mouse with PS/2 mouse
  2. removing ehci, disabling it in BIOS
  3. removing ehci and usb_disk (uses EHCI)
  4. booting with just internal card reader, plugging in dongle when already running
  5. booting with internal card reader, bluetooth dongle, USB DVB-T tuner, USB mouse
  6. booting with internal card reader, bluetooth dongle, USB mouse
  7. booting with internal card reader, bluetooth dongle, PS/2 mouse

I can't compile same build with gcc2, because this machine is 64-bit.

comment:3 by mmlr, 10 years ago

Status: newassigned

Can you please retry with a revision >= hrev32551. Especially with multiple devices connected OHCI would likely break down easily before this change.

comment:4 by monni, 10 years ago

Tried with hrev32554... When I look at vector lengths in syslog, it starts with "normal" values, but after a few transfers, all seem to be 4 bytes long.

in reply to:  4 comment:5 by mmlr, 10 years ago

Replying to monni:

Tried with hrev32554... When I look at vector lengths in syslog, it starts with "normal" values, but after a few transfers, all seem to be 4 bytes long.

Most often that is fine. 4 byte transfers are usually either from hub status polling or from mouse devices. The length of the transfers are in fact coming from the user of the USB stack, i.e. the drivers using it.

The question is if the behaviour changed regarding when only OHCI is present.

The soft lock when accessing the dongle is most likely a problem with the bluetooth driver, so if no OHCI specific problems persist I would like to change the component to bluetooth and reassign this bug.

comment:6 by monni, 10 years ago

I disabled EHCI in BIOS and noticed that power light on bluetooth dongle does work now, but it still fails after driver tries to send data. I also tried with USB disk but it doesn't work even though it is detected. "Access light" on the drive does blink for like 3 times and then stops.

in reply to:  6 comment:7 by mmlr, 10 years ago

Replying to monni:

I disabled EHCI in BIOS and noticed that power light on bluetooth dongle does work now, but it still fails after driver tries to send data. I also tried with USB disk but it doesn't work even though it is detected. "Access light" on the drive does blink for like 3 times and then stops.

Let's not mix different problems in this one bug report. Does the bluetooth device now experience the same issue as with EHCI enabled? If so, then please file a new bug report for that problem and assign it to bluetooth, as that's likely a driver problem.

About the drive, does that one work when you have EHCI enabled? If not, then it probably uses a protocol that is unsupported by the (quite limited) usb_disk driver.

I'd like to find out if the OHCI problems are gone now so that this ticket can be closed.

comment:8 by monni, 10 years ago

USB drive works with EHCI enabled.

comment:9 by oruizdorantes, 10 years ago

Hi Michael,

We have been debugging a bit, may it can spot us some light.

The first frame(reported in the callback) is the following :

Len=10 Data = ## e:4:1:3:c:0:0:0:0:0: ##

Lets say only the first 6 bytes are valid (bluetooth)data. Remaining as you can see is filled with 4 zeros which starts provoking problems. I would be expecting from this frame to be 6 bytes len, or having valid data on those 4 bytes(not 0).

Is this expectable?

in reply to:  9 comment:10 by mmlr, 10 years ago

Replying to oruizdorantes:

Lets say only the first 6 bytes are valid (bluetooth)data. Remaining as you can see is filled with 4 zeros which starts provoking problems. I would be expecting from this frame to be 6 bytes len, or having valid data on those 4 bytes(not 0).

What transfer length do you use when doing mentioned transfer? The thing is, USB doesn't know about the data it is moving, it basically just allocates as much as you hand in and then sees what the device says. Usually the actual length field is then what the device returned. It is possible (and not too unlikely) that there is an error in transfer length handling in OHCI. Just to verify that, could you try the same on a UHCI controller instead and see if it behaves differently? If it does return the correct actual length there (or different data), then it is a problem in OHCI, because both should really behave the same. If it does the same on UHCI, then it could as well be a problem on the bluetooth side (or both controller drivers could be buggy, but since UHCI is quite heavily tested I think that's rather unlikely). I really have no idea of the bluetooth side of things, so I'm just pointing out what I can tell from this info. In any case, if you could tell me the exact way you scheduled that transfer (transfer type and length basically), I will look over OHCI in that regard.

comment:11 by oruizdorantes, 10 years ago

What transfer length do you use when doing mentioned transfer?

For that pipe the reported max_packet_size.

You can check in kernel/drivers/bluetooth/h2/h2generic/h2generic.c line 307

new_bt_dev->max_packet_size_intr_in = ep->descr->max_packet_size;

Just to verify that, could you try the same on a UHCI controller instead and see if it behaves differently?

Hmm, I developed the stack with a similar device to that one BCM2035(product ID is different) in a UHCI, & always size was a coherent value with valid data. But Monni could confirm that for his particular dongle.

My concern is not that I get 0's at end(driver "could" somehow discard it or realize is invalid), but after that buffer, next one comming valid, data but incomplete, therefore blocks the stack waiting for the remaining part. So maybe those 0's might give a clue...

I really have no idea of the bluetooth side of things, so I'm just pointing out what I can tell from this info. In any case, if you could tell me the exact way you scheduled that transfer (transfer type and length basically), I will look over OHCI in that regard.

It is a simple incoming Interrupt one:

kernel/drivers/bluetooth/h2/h2generic/h2transactions.c

254 status = usb->queue_interrupt(bdev->intr_in_ep->handle, buf, size, 255 event_complete, (void*)bdev);

Although this one is triggered because we issued a control transfer previously. In ohter words, this incoming interrupt is a reply of a previous ongoing control transfer. But in a USB side would not matter, would it? (submit_tx_command() in same file)

In the meanwhile I remember Monni's dongle has some special flags in Linux driver which I will review, but I think 90% we have that covered

in reply to:  11 comment:12 by mmlr, 10 years ago

Replying to oruizdorantes:

What transfer length do you use when doing mentioned transfer?

For that pipe the reported max_packet_size.

I more meant in that concrete case, to see if the length reported had any relation to it.

Just to verify that, could you try the same on a UHCI controller instead and see if it behaves differently?

Hmm, I developed the stack with a similar device to that one BCM2035(product ID is different) in a UHCI, & always size was a coherent value with valid data. But Monni could confirm that for his particular dongle.

It'd really help to know, as I don't see anything obviously problematic when looking over it.

My concern is not that I get 0's at end(driver "could" somehow discard it or realize is invalid), but after that buffer, next one comming valid, data but incomplete, therefore blocks the stack waiting for the remaining part. So maybe those 0's might give a clue...

So it's like the first one is too long with useless data and the second one is too short? What lengths are we talking about, do they make any sense, like is the extra bytes matching what is missing on the second one or something similar?

In ohter words, this incoming interrupt is a reply of a previous ongoing control transfer. But in a USB side would not matter, would it? (submit_tx_command() in same file)

It doesn't matter, no. The USB stack has no knowledge of that, the pipes are completely independent.

A few remarks about that driver:

  • The files need a _serious_ review concerning coding style
  • There is a synchronous submit_request() method you might want to use for sending commands. I don't know if it makes sense, just mentioning it in case you haven't seen it.
  • You may want to think about locking the device structures you are accessing from the completed hooks as you really never know when these are going to be called and when your other threads (from which you schedule the transfers) are preempted. Right now it looks like non-vital data, but always keep in mind that the thread where you schedule the transfer might be preempted right after scheduling, possibly leading to the hook being called before you return from your submit function. So you must always make sure that whatever the hook triggers is prepared for this situation.
  • In a similar note, everything you do from that hook will block all other USB transfers on the same controller, be they outgoing or incoming. It is therefore vital that the hooks are processed as quickly as possible. You could look at them like interrupt handlers. Therefore, if there is any heavy stuff going on you must employ a DPC to handle the work in a separate thread. Otherwise you will degrade performance and may trigger timeouts for other devices. Accordingly you must also never block in such a handler, or call functions that may block.

comment:13 by mmlr, 10 years ago

Just realizing that BTW: The max packet size that is reported for the endpoint is only the max packet size the device can transfer per one USB transaction. This has no influence or meaning outside of USB. Meaning that if there is a max packet size of 8 bytes this does by no means mean that you can or should only schedule 8 bytes at a time. The only thing it means is that if you schedule more bytes that those it will (internally) split these up into multiple USB packets. With that context, is it possible that the event you are receiving simply doesn't fit into that buffer?

comment:14 by mmlr, 10 years ago

Could you please try hrev32680? This could very well fix the issues seen here. My above comment is still valid though and the use of the max packet size field should be reviewed in any case.

comment:15 by oruizdorantes, 10 years ago

So it's like the first one is too long with useless data and the second one is too short?

First is too long with usefull data(even complete packet) and 0's. Second one is again with useful data, but too short 2 bytes, missing 12 more... I am able to say I miss 12 because, the incoming 2 bytes already contained information about the size.

What lengths are we talking about,

These bluetooth packets on UCHI use to go from 3 to 256 bytes. And they dont use to come splitted in any way.

do they make any sense, like is the extra bytes matching what is missing on the second one or something similar?

From the example I had, I could not extract any further matching.

  • The files need a _serious_ review concerning coding style

Feel free to comment the most appearing ones. I guess this is the Identifier rule for functions and var names(underscores), But I expected it was ok in a C file.

  • You may want to think about locking the device structures [...]

Yep, this is in the TODO list once the number of threads in the playground is clear.

Meaning that if there is a max packet size of 8 bytes this does by no means mean that you can or should only schedule 8 bytes at a time. The only thing it means is that if you schedule more bytes that those it will (internally) split these up into multiple USB packets.

So would be fine to schedule the size just as the biggest packet I am expecting to receive?

With that context, is it possible that the event you are receiving simply doesn't fit into that buffer?

I remember those interrupt endpoints use to be 16, but I have been receiving completelly in one hook buffers of 256. So being honest never understood the size parameter in the queue call, as I could get bigger or smaller buffers.

Well lets wait Monni's test.

comment:16 by monni, 10 years ago

Test pretty much failed as Haiku doesn't boot anymore... Hangs after last icon but before display should turn black.

in reply to:  15 comment:17 by mmlr, 10 years ago

Replying to oruizdorantes:

So it's like the first one is too long with useless data and the second one is too short?

First is too long with usefull data(even complete packet) and 0's. Second one is again with useful data, but too short 2 bytes, missing 12 more... I am able to say I miss 12 because, the incoming 2 bytes already contained information about the size.

Yeah I guess the mixed up transfer length of the last packet would explain this totally.

  • The files need a _serious_ review concerning coding style

Feel free to comment the most appearing ones. I guess this is the Identifier rule for functions and var names(underscores), But I expected it was ok in a C file.

Take h2generic.c: 272: No spaces after "(".

273: We don't indent like that, we only indent one tab, in this case though you'd indent two because of the sub-expression "() pair" you are in, the operators "
/&&" go on the next line.

305: The only opening brace that opens on the next line is the one after the "case:" because this one has a different semantical meaning. All others go on the same line. 307: 80 chars per line. 344: Spaces after commas and operators. 369: No braces for single line blocks (ifs, loops) 370: I would avoid special characters in code (the upside down question mark isn't an ascii char).

In general, variable names are supposed to be obvious, and camel cased. So "ep" -> "endPoint", "new_bt_dev" -> "newBTDevice" and all such. They are even inconsistently named, some are camel cased, some not, some are abbreviated some aren't, you get the drift.

  • You may want to think about locking the device structures [...]

Yep, this is in the TODO list once the number of threads in the playground is clear.

Through the async callbacks the thread count already is > 1, so locking already makes sense.

Meaning that if there is a max packet size of 8 bytes this does by no means mean that you can or should only schedule 8 bytes at a time. The only thing it means is that if you schedule more bytes that those it will (internally) split these up into multiple USB packets.

So would be fine to schedule the size just as the biggest packet I am expecting to receive?

Yes that's exactly what you'd want to do.

With that context, is it possible that the event you are receiving simply doesn't fit into that buffer?

I remember those interrupt endpoints use to be 16, but I have been receiving completelly in one hook buffers of 256. So being honest never understood the size parameter in the queue call, as I could get bigger or smaller buffers.

If you ever get bigger sizes than the buffers you handed in, this is a severe bug and must be reported. Because in that case random buffers would be overwritten, which must not happen.

in reply to:  16 comment:18 by mmlr, 10 years ago

Replying to monni:

Test pretty much failed as Haiku doesn't boot anymore... Hangs after last icon but before display should turn black.

Is this a general regression or is it related to hrev32680 specifically?

comment:19 by monni, 10 years ago

I'm pretty sure it's not hrev32680 specifically because when I reverted to older revision and manually re-applied it, TRACE()'s I added before the changed lines show totally normal values...

From h2generic I now see a lot of 16 byte packets and one 14 byte packet, which is expected as far as I know.

in reply to:  19 comment:20 by mmlr, 10 years ago

Replying to monni:

From h2generic I now see a lot of 16 byte packets and one 14 byte packet, which is expected as far as I know.

So it is fixed? What about the other problems? Can this ticket be closed?

comment:21 by monni, 10 years ago

I think we can close this ticket.

in reply to:  19 ; comment:22 by oruizdorantes, 10 years ago

From h2generic I now see a lot of 16 byte packets and one 14 byte packet, which is expected as far as I know.

Is now bt_dev_info command now showing coherent info?

in reply to:  22 comment:23 by monni, 10 years ago

Replying to oruizdorantes:

From h2generic I now see a lot of 16 byte packets and one 14 byte packet, which is expected as far as I know.

Is now bt_dev_info command now showing coherent info?

I can tell that better when I manage to get latest revision to boot. I will try partial "svn update" to update just Bluetooth code.

comment:24 by mmlr, 10 years ago

Any update on this? From what I gather the issue has been fixed, so I'd like to close it.

comment:25 by monni, 10 years ago

Last time I checked, it was working as far what goes to detecting plugged-in devices.

comment:26 by mmlr, 10 years ago

Resolution: fixed
Status: in-progressclosed

Ok, closing, thanks for the feedback!

Note: See TracTickets for help on using tickets.