Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#15657 closed bug (fixed)

My system will not boot nightly cd's newer than hrev53659

Reported by: WildKeccak Owned by: waddlesplash
Priority: normal Milestone: R1/beta2
Component: Drivers/USB/XHCI Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: x86-64

Description

I am booting off an external LG Blu-ray drive, using DVD media. The system is x86_64: AMD Ryzen 5 3400G, X570 chipset, Radeon RX 5700 XT video card. December 31, 2019 at 7:23:55 AM.

Attachments (8)

IMG-0349.JPG (2.1 MB ) - added by WildKeccak 4 years ago.
minicom.cap (100.4 KB ) - added by WildKeccak 4 years ago.
hrev53864, serial log
syslog.asc (291.2 KB ) - added by WildKeccak 4 years ago.
System log from a successful boot of a (definitely) homebrew CD of hrev53671
minicom.cap.2 (118.4 KB ) - added by WildKeccak 4 years ago.
A serial capture of hrev53672 struggling to boot
minicom.cap.3 (108.7 KB ) - added by WildKeccak 4 years ago.
An extra capture of hrev53672
minicom.cap.4 (103.1 KB ) - added by WildKeccak 4 years ago.
syslog.2 (118.7 KB ) - added by WildKeccak 4 years ago.
screenshot1.png (145.3 KB ) - added by WildKeccak 4 years ago.
A (modest) success

Change History (35)

by WildKeccak, 4 years ago

Attachment: IMG-0349.JPG added

comment:2 by pulkomandy, 4 years ago

Should be fixed in hrev53777.

comment:3 by WildKeccak, 4 years ago

The issue remains. I think hrev53672 and hrev53673 are relevant.

comment:4 by pulkomandy, 4 years ago

Component: System → Drivers/USB/XHCI
Keywords: hrev53659 usb removed
Owner: changed from nobody to waddlesplash

comment:5 by diver, 4 years ago

Yep, looks like hrev53672 is the culprit.

comment:6 by waddlesplash, 4 years ago

The messages in the picture have appeared in other XHCI tickets long before I made that change. Can you please confirm that a hrev < hrev53672 actually works? (Make sure to use the exact same USB drive and port when testing as this can affect results.)

comment:7 by waddlesplash, 4 years ago

If hrev53672 is really the culprit, I'll need a full syslog from a failed boot.

comment:8 by WildKeccak, 4 years ago

The file is never made, even on a flash stick. The mobo is MSI X570-A Pro.

comment:9 by waddlesplash, 4 years ago

Yes, you will have to capture it via serial log, or by taking pictures of the syslog pages using "on screen debug output" mode.

comment:10 by waddlesplash, 4 years ago

Please retest after hrev53855.

by WildKeccak, 4 years ago

Attachment: minicom.cap added

hrev53864, serial log

comment:11 by WildKeccak, 4 years ago

It still doesn't work.

comment:12 by waddlesplash, 4 years ago

usb xhci 1: transfer error on slot 3 endpoint 1: Length invalid

This error also showed up in #15333 recently. Interesting.

comment:13 by waddlesplash, 4 years ago

OK, actually this is just due to the StopEndpoint command being invoked on the endpoint; we should ignore it.

It is very odd that there are no errors before the first timeout, after which it seems everything just fails. But since there is just one commit that is the clear break, I guess trying to determine what I did wrong may not be so difficult.
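The Stop Endpoint point above can be sketched as a small filter. This is only an illustration: the numeric values follow the TRB Completion Code table in the xHCI specification as I read it (Stopped = 26, Stopped - Length Invalid = 27, Stopped - Short Packet = 28), while the enum and function names are hypothetical and are not the Haiku driver's identifiers.

```cpp
#include <cstdint>

// Hypothetical names; the values are xHCI TRB completion codes.
enum CompletionCode : uint8_t {
    kSuccess = 1,
    kStallError = 6,
    kShortPacket = 13,
    kStopped = 26,
    kStoppedLengthInvalid = 27,
    kStoppedShortPacket = 28,
};

// Returns true when a transfer event only reports that a Stop Endpoint
// command halted the transfer, so it need not be logged as an error.
constexpr bool
IsStopEndpointArtifact(uint8_t code)
{
    return code == kStopped || code == kStoppedLengthInvalid
        || code == kStoppedShortPacket;
}
```

With a filter like this, a "Length invalid" event would be treated as benign, while a stall (code 6) would still be reported.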

comment:14 by waddlesplash, 4 years ago

So, I just noticed that the stalling pipe and requests are pipe 0 (i.e. the default control pipe.) The changes in hrev53672 should not change pipe 0 initialization at all (the refactors should not amount to a functional change, and the SubmitNormalRequest items will not affect it either, as Control transfers go through SubmitControlRequest.)

WildKeccak, if possible, can you do a git-revert hrev53672 locally, rebuild Haiku ("@nightly-anyboot" target, probably) and test to see if it is still broken under that? Either way, please post a syslog. (Also, if you have a syslog from a successful boot with this setup, that will also be helpful.)

comment:15 by WildKeccak, 4 years ago

Working on the build, but I've noticed that there is activity on the DVD drive, long after the computer has "crashed".

by WildKeccak, 4 years ago

Attachment: syslog.asc added

System log from a successful boot of a (definitely) homebrew CD of hrev53671

comment:16 by WildKeccak, 4 years ago

Some ingredients are missing when I make my own nightlies, such as the boot logo and the ability to press space and get the special boot menu. I don't know how to enable them.

comment:17 by pulkomandy, 4 years ago

The logo is not included because only official images provided by Haiku have it (you can use "--distro-compatibility official" argument to configure to enable it in your own builds).

However the boot menu should be there in either case, but the timing can be a bit tight. For the BIOS loader it's easier to hold shift, but for the EFI one you have to spam the space key very quickly, I guess.

by WildKeccak, 4 years ago

Attachment: minicom.cap.2 added

A serial capture of hrev53672 struggling to boot

by WildKeccak, 4 years ago

Attachment: minicom.cap.3 added

An extra capture of hrev53672

comment:18 by waddlesplash, 4 years ago

Can you add a TRACE_ALWAYS statement where "endpoint->max_burst_payload" is assigned, and capture a new serial log? That's the only thing that changed in hrev53672, so somehow that is the cause here.
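A rough sketch of what such a trace could look like, written so that it compiles outside the kernel. Everything here except the field name max_burst_payload is an assumption: TraceAlways is a stand-in for the driver's TRACE_ALWAYS macro (assumed to be printf-style), and the Endpoint struct and SetMaxBurstPayload helper are hypothetical.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Stand-in for TRACE_ALWAYS: collects output in a string so it can be
// inspected here. In the driver the output would go to the syslog/serial.
std::string gTraceLog;

void
TraceAlways(const char* format, ...)
{
    char buffer[128];
    va_list args;
    va_start(args, format);
    vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    gTraceLog += buffer;
}

// Hypothetical, cut-down endpoint: only the field named in this ticket.
struct Endpoint {
    unsigned max_burst_payload;
};

void
SetMaxBurstPayload(Endpoint* endpoint, unsigned value)
{
    endpoint->max_burst_payload = value;
    // The trailing newline keeps each traced value on its own log line.
    TraceAlways("max_burst_payload is %u\n", endpoint->max_burst_payload);
}
```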

by WildKeccak, 4 years ago

Attachment: minicom.cap.4 added

comment:19 by WildKeccak, 4 years ago

Yah, I should've added a newline, but I did that.

The format is 'trbSize is n'.

At first, n is 1, and then it's 512 until the end.

Attachment will be here momentarily

comment:20 by WildKeccak, 4 years ago

Attachment from working laptop momentarily

trbSize is 64, period.
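For comparing captures like these, a small helper can collapse the 'trbSize is n' traces into runs of equal values. This is a hypothetical sketch, not part of Haiku; it searches for the marker text rather than splitting on lines, so it also copes with traces that were printed without a newline.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Returns (value, repeat count) pairs for consecutive "trbSize is n" traces.
std::vector<std::pair<uint32_t, size_t>>
SummarizeTrbSizes(const std::string& log)
{
    const std::string marker = "trbSize is ";
    std::vector<std::pair<uint32_t, size_t>> runs;
    size_t pos = 0;
    while ((pos = log.find(marker, pos)) != std::string::npos) {
        pos += marker.size();
        size_t end = pos;
        uint32_t value = 0;
        // Parse the decimal number that follows the marker.
        while (end < log.size() && log[end] >= '0' && log[end] <= '9') {
            value = value * 10 + (log[end] - '0');
            end++;
        }
        if (end == pos)
            continue;   // marker without a number; skip it
        if (!runs.empty() && runs.back().first == value)
            runs.back().second++;
        else
            runs.emplace_back(value, 1);
        pos = end;
    }
    return runs;
}
```

A capture that starts with one 1 and continues with 512 until the end would come back as two runs: (1, 1) followed by (512, count).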

by WildKeccak, 4 years ago

Attachment: syslog.2 added

comment:21 by waddlesplash, 4 years ago

In your working version, usb_disk never initializes, so it may be the case that the 1-sized TRB is always constructed anyway.

Otherwise, 512 should be an entirely reasonable size and in accordance with the specification here. However, I noticed there is a bug in the breaking commit which causes TD Size to be computed incorrectly, which may be relevant.
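For reference, here is a minimal sketch of how TD Size is defined for xHCI 1.0 and later controllers, as I read the specification: the number of packets still outstanding in the transfer descriptor after the current TRB, capped at 31, and 0 for the last TRB. The function and parameter names are hypothetical; this is not the code from the breaking commit or from the fix, and the ticket does not say what exactly was wrong there.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// tdTotalLength:      bytes in the whole TD
// bytesBeforeThisTRB: bytes covered by the TD's earlier TRBs
// trbLength:          bytes covered by this TRB
uint32_t
ComputeTDSize(uint32_t tdTotalLength, uint32_t bytesBeforeThisTRB,
    uint32_t trbLength, uint32_t maxPacketSize, bool isLastTRB)
{
    assert(maxPacketSize > 0);
    assert(bytesBeforeThisTRB + trbLength <= tdTotalLength);

    if (isLastTRB)
        return 0;

    // TD Packet Count = ROUNDUP(TD length / max packet size)
    uint32_t packetCount
        = (tdTotalLength + maxPacketSize - 1) / maxPacketSize;
    // Packets Transferred = ROUNDDOWN(bytes up to and including this TRB
    // / max packet size)
    uint32_t packetsTransferred
        = (bytesBeforeThisTRB + trbLength) / maxPacketSize;

    // The TD Size field is 5 bits wide, so the value saturates at 31.
    return std::min<uint32_t>(31, packetCount - packetsTransferred);
}
```

For a 2048-byte TD split into two 1024-byte TRBs with a 512-byte max packet size, the first TRB gets TD Size 2 and the last gets 0.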

comment:22 by WildKeccak, 4 years ago

Is this happening? https://www.ti.com/lit/pdf/sllz076

Number 5

comment:23 by waddlesplash, 4 years ago

The four XHCI controllers you have appear to be two of:

PCI:   vendor 1022: Advanced Micro Devices, Inc. [AMD]
PCI:   device 149c: Matisse USB 3.0 Host Controller

and then:

PCI:   vendor 1022: Advanced Micro Devices, Inc. [AMD]
PCI:   device 15e0: Raven USB 3.1

PCI:   vendor 1022: Advanced Micro Devices, Inc. [AMD]
PCI:   device 15e1: Raven USB 3.1

I don't see any indications that these are really TI controllers in disguise...

Also, the problem here is actually a transfer stall, as noted above. So we are actually getting an error condition from the hardware. So, no, it's not item 5 under any circumstance.

comment:24 by waddlesplash, 4 years ago

See if hrev53889 changes anything here.

comment:25 by WildKeccak, 4 years ago

It works!

by WildKeccak, 4 years ago

Attachment: screenshot1.png added

A (modest) success

comment:26 by waddlesplash, 4 years ago

Resolution: fixed
Status: new → closed

Thanks for testing!

comment:27 by nielx, 4 years ago

Milestone: Unscheduled → R1/beta2

Assign tickets with status=closed and resolution=fixed within the R1/beta2 development window to the R1/beta2 Milestone
