Opened 2 years ago

Closed 18 months ago

Last modified 18 months ago

#17484 closed bug (fixed)

NVMe: No interrupts delivered (both MSI and MSI-X)

Reported by: smallstepforman
Owned by: waddlesplash
Priority: normal
Milestone: R1/beta4
Component: Drivers/Disk/NVMe
Version:
Keywords:
Cc: korli
Blocked By: #17334
Blocking:
Platform: All

Description

Boot fails at the 4th boot-screen icon (disk scan). See the attached image (nvme.jpg). Panic: vfs_mount_boot_file_system

I attempted to boot from a USB flash stick (nightly x64 hrev55736), as well as from an external USB hard disk with an approximately two-week-older version of Haiku_x64.

Note - when I use on screen paging, I get spinlock timeouts (spinlock.jpg)

Applying all debug options makes no difference - it always fails after the 4th icon (disk scan).

KDiskDeviceManager::InitialDeviceScan() returned error: No such file or directory

HP Omen 16 Advantage Edition laptop: AMD 5800H, RX6600M video, 32 GB RAM.

nvme0: Samsung 970 Evo Plus (1 TB)
Partition table: gpt
Heads: 255, Sectors/track: 2, Cylinders: 3,830,441
Total sectors: 1,953,525,168
Sector size: 512
Partitions:

  • future BFS (256 GB)
  • Ext4 (Mint 20.2, 256 GB)
  • NTFS (data, 454 GB)

nvme1: MTFDHBA512TDV-1AZ1AABHA (512 GB)
Partition table: gpt
Heads: 255, Sectors/track: 2, Cylinders: 1,961,206
Total sectors: 1,000,215,216
Sector size: 512
Partitions:

  • EFI (260 MB)
  • Microsoft reserved (16 MB)
  • NTFS (Windows 11, 476 GB)
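
As a sanity check on the figures above, the raw capacity of each drive follows from total sectors times sector size. The snippet below is only a small arithmetic sketch; the `capacity_bytes` helper and the drive labels are names chosen here for illustration, and the sector counts are copied from the report.

```python
SECTOR_SIZE = 512  # bytes, as reported for both drives

# Total sector counts as listed in the ticket description.
drives = {
    "nvme0 (Samsung 970 Evo Plus)": 1_953_525_168,
    "nvme1 (MTFDHBA512TDV-1AZ1AABHA)": 1_000_215_216,
}


def capacity_bytes(total_sectors, sector_size=SECTOR_SIZE):
    """Raw capacity in bytes for a drive with the given sector count."""
    return total_sectors * sector_size


if __name__ == "__main__":
    for name, sectors in drives.items():
        size = capacity_bytes(sectors)
        # Show both decimal (GB) and binary (GiB) units, since
        # partitioning tools differ in which one they report.
        print(f"{name}: {size:,} bytes, "
              f"{size / 10**9:.1f} GB, {size / 2**30:.1f} GiB")
```

Both results match the advertised 1 TB and 512 GB sizes in decimal units, so the reported sector counts look consistent with the hardware.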

Attachments (6)

nvme.jpg (3.4 MB ) - added by smallstepforman 2 years ago.
nvme panic
spinlock.jpg (2.5 MB ) - added by smallstepforman 2 years ago.
spinlock error when on screen debug messages
syslog (157.3 KB ) - added by smallstepforman 2 years ago.
Syslog
listdev.txt (5.2 KB ) - added by smallstepforman 2 years ago.
listdev
syslog.2 (94.2 KB ) - added by smallstepforman 2 years ago.
syslog from successful (long) post
syslog.3 (152.3 KB ) - added by smallstepforman 2 years ago.
Syslog from hrev55541

Change History (26)

by smallstepforman, 2 years ago

Attachment: nvme.jpg added

nvme panic

by smallstepforman, 2 years ago

Attachment: spinlock.jpg added

spinlock error when on screen debug messages

comment:1 by waddlesplash, 2 years ago

The spinlock error is "expected" when using onscreen paging, and can be mitigated by disabling SMP when using onscreen paging.

comment:2 by waddlesplash, 2 years ago

Did this used to work with an older version of Haiku?

comment:3 by waddlesplash, 2 years ago

Actually, I notice that we do not even attempt to allocate MSI or MSI-X interrupts for the NVMe devices; this is probably the real issue. That would likely be due to failed initialization of the MSI module, which could be for any number of reasons. Any chance you can find anything related to that in the log, or maybe get a more complete log?
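
For readers unfamiliar with the allocation order being discussed, drivers commonly try MSI-X first, then MSI, and only then fall back to a legacy line interrupt. The sketch below models that general pattern only; it is not Haiku's NVMe driver code, and `IrqMode`, `choose_irq_mode`, and the `try_allocate` callback are hypothetical names introduced for illustration.

```python
from enum import Enum


class IrqMode(Enum):
    MSI_X = "msi-x"
    MSI = "msi"
    LEGACY = "legacy"


def choose_irq_mode(msi_module_ok, device_caps, try_allocate):
    """Return the first interrupt mode that can actually be set up.

    msi_module_ok -- whether the (hypothetical) MSI support module initialized
    device_caps   -- set of IrqMode values the device advertises
    try_allocate  -- callback returning True if allocation of a mode succeeds
    """
    # Without a working MSI module, neither MSI-X nor MSI is even attempted.
    candidates = [IrqMode.MSI_X, IrqMode.MSI] if msi_module_ok else []
    for mode in candidates:
        if mode in device_caps and try_allocate(mode):
            return mode
    return IrqMode.LEGACY
```

In this model, a log that shows no MSI or MSI-X allocation attempt at all corresponds to `msi_module_ok` being false, which is the situation this comment suspects.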

comment:4 by smallstepforman, 2 years ago

OK, I tried to disable SMP and on screen paging, and after over 20 minutes of nvme timeouts I actually booted to the USB Haiku installer. I will now try to delete and repartition the nvme partition I dedicated to Haiku and see if a valid GUID will help. On Linux, the command line gdisk -l /dev/nvme0p1 shows some errors with the partition tables. Funnily enough, it was repartitioned with Linux Mint.

Will report back once I figure out how to correctly write the Haiku BFS GUID.

comment:5 by waddlesplash, 2 years ago

If the system boots with only SMP disabled and onscreen paging, then that means MSI allocation succeeded, so this is probably a variation on #17334 or something similar.

Please test with the closest nightly build to hrev55520 (i.e. either exactly that revision or the closest higher number) and see if it works any better.

by smallstepforman, 2 years ago

Attachment: syslog added

Syslog

comment:6 by smallstepforman, 2 years ago

OK, I've confirmed that Haiku will indeed boot, but I had to wait at least 10 minutes (closer to 15 min) before all the nvme timeouts ran their cycles. Once I booted to desktop, both nvme drives seemed to work OK (I haven't benchmarked yet).

Attached syslog to see what may be causing this enormous boot delay.

comment:7 by smallstepforman, 2 years ago

Confirmed with no debug settings or any other modifications applied: a vanilla Haiku boot on a freshly installed hrev55736. This is a new laptop, so this is its very first installation of Haiku. Other than the enormous boot delay, nothing is out of the ordinary. I hope the attached syslog helps.

comment:8 by smallstepforman, 2 years ago

Can we change the title of this ticket to "long boot time with nvme"? Haiku does successfully boot after 10 minutes or so if I just leave it alone. I will attach a fresh syslog of hrev55754, as well as the output of listdev.

comment:9 by waddlesplash, 2 years ago

Blocked By: 17334 added
Platform: x86-64 → All
Priority: high → normal
Summary: nvme trampoline → NVMe: No interrupts delivered (MSI-X)

The boot time is long because no interrupts are delivered. This may be a duplicate or variation on #17334.

Please test with the hrev specified in comment:5 and see what differences there are.

by smallstepforman, 2 years ago

Attachment: listdev.txt added

listdev

by smallstepforman, 2 years ago

Attachment: syslog.2 added

syslog from successful (long) post

comment:10 by smallstepforman, 2 years ago

I am downloading 55541 and will attempt to boot that. I'll report how that went in the next hour (slow internet today).

comment:11 by smallstepforman, 2 years ago

Same issue with hrev55541. Attaching syslog.

by smallstepforman, 2 years ago

Attachment: syslog.3 added

Syslog from hrev55541

comment:12 by waddlesplash, 2 years ago

Cc: korli added
Summary: NVMe: No interrupts delivered (MSI-X) → NVMe: No interrupts delivered (both MSI and MSI-X)

So MSI also does not work. This is very strange.

korli: Might you have any ideas of things I can try in the NVMe driver?

comment:13 by smallstepforman, 2 years ago

Also confirmed that an older nightly (hrev5507) has the same issue (exceptionally long boot time). So the MSI code changes in hrev55520 (before/after) have no impact on the boot time on this board.

comment:14 by korli, 2 years ago

@waddlesplash sorry, no ideas ATM

comment:15 by korli, 2 years ago

@waddlesplash maybe having two NVME connected could trigger a problem somehow?

comment:16 by beaglejoe, 2 years ago (in reply to: 15)

Replying to korli:

@waddlesplash maybe having two NVME connected could trigger a problem somehow?

I have the 500 GB Samsung 970 EVO Plus as the only NVME drive and it boots fine.

comment:17 by waddlesplash, 18 months ago

Milestone: Unscheduled → R1/beta4
Resolution: → fixed
Status: new → closed

Implemented a polling fallback mode in hrev56540.

comment:18 by smallstepforman, 18 months ago

I can confirm that this resolves the issue. The boot time went from 14-15 minutes to 30 seconds. Thank you Augustin, this makes booting into this laptop a pleasure and no longer something I dread.

I hope the NVME performance doesn't suffer too much ...

comment:19 by smallstepforman, 18 months ago

BTW, regarding the "maybe having two NVME connected could trigger a problem", on my desktop PC I also have 2 NVME drives and Haiku boots fine on that box.

comment:20 by waddlesplash, 18 months ago

NVMe performance will only suffer if the polling fallback mode has to be used in the first place, i.e. only on devices that do not support interrupt mode. The Samsung device looks like it has working interrupts; it is the Micron drive that appears not to.

I was testing in VMware with the polling mode forcibly enabled; it was noticeably slower, but probably still within the realm of what some USB disk speeds are, so hopefully it's not too bad.
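
The effect of a polling fallback on boot time can be illustrated with a toy model. This is not the code from hrev56540: `FakeQueue`, `Driver`, the 5-second timeout, and the rule "switch to polling once a command is found completed without an interrupt" are all assumptions made for illustration.

```python
class FakeQueue:
    """Stand-in for an NVMe completion queue (hypothetical model)."""

    def __init__(self, interrupts_work):
        self.interrupts_work = interrupts_work
        self.completed = []

    def submit(self, tag):
        # In this model the device completes every command immediately.
        self.completed.append(tag)

    def wait_for_interrupt(self, timeout_s):
        """Return (interrupt_seen, seconds_spent_waiting)."""
        if self.interrupts_work:
            return True, 0.0
        return False, timeout_s  # the full timeout elapses

    def reap(self):
        """Drain finished tags, as polling the completion queue would."""
        done, self.completed = self.completed, []
        return done


class Driver:
    def __init__(self, queue, timeout_s=5.0, allow_polling=True):
        self.queue = queue
        self.timeout_s = timeout_s
        self.allow_polling = allow_polling
        self.polling = False
        self.waited_s = 0.0  # total time lost waiting for interrupts

    def do_io(self, tag):
        self.queue.submit(tag)
        if self.polling:
            # Fallback mode: check the queue directly, no interrupt wait.
            return tag in self.queue.reap()
        seen, waited = self.queue.wait_for_interrupt(self.timeout_s)
        self.waited_s += waited
        done = self.queue.reap()
        if not seen and tag in done and self.allow_polling:
            # Command finished but no interrupt arrived: stop relying on them.
            self.polling = True
        return tag in done
```

With 100 commands and the assumed 5-second timeout, a driver created with `allow_polling=False` on a queue whose interrupts never arrive accumulates 500 seconds of waiting in this model, while the default driver pays the timeout once and then polls. These are model numbers, not measurements from the affected laptop.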
