Opened 8 years ago

Closed 2 years ago

Last modified 14 months ago

#9910 closed enhancement (fixed)

NVMe devices support

Reported by: korli Owned by: waddlesplash
Priority: high Milestone: R1/beta2
Component: Drivers/Disk/NVMe Version: R1/Development
Keywords: Cc: luroh, phoudoin, korli
Blocked By: Blocking: #14999
Platform: All


Qemu 1.6+ offers support for this device type.

Change History (37)

comment:1 by korli, 8 years ago

Version: R1/alpha4.1R1/Development

comment:2 by korli, 8 years ago

Type: bugenhancement

comment:3 by kallisti5, 5 years ago

High level: NVMe is a alternate attachment of SSD devices directly to the PCIe bus. NVMe devices use a "lightweight" communication protocol which is used instead of AHCI.

NVMe essentially replaces AHCI for SSD devices. We likely need to get more information on NVMe and see if AHCI can be used until native NVMe support is created.

Since i'm pretty sure AHCI is still available on NVMe devices, i'm dropping the priority of this ticket for now. If the AHCI emulation ever goes away (or isn't available 100% of the time) we should bump up the priority of this one.

comment:4 by kallisti5, 5 years ago

Priority: normallow

comment:5 by kallisti5, 5 years ago

Keywords: NVMe added

in reply to:  3 comment:6 by korli, 5 years ago

Replying to kallisti5:

Since i'm pretty sure AHCI is still available on NVMe devices, i'm dropping the priority of this ticket for now. If the AHCI emulation ever goes away (or isn't available 100% of the time) we should bump up the priority of this one.

I really wonder where you got the idea that NVMe devices can be used with AHCI emulation. Care to elaborate since you're "sure"? :)

comment:7 by kallisti5, 5 years ago

Hm. I did a bit more reading and you're correct. I saw mentions of NVMe aware BIOS AHCI emulation, but it was more of a "BIOS has support to boot directly from NVMe devices"

So yeah, NVMe is actually *really* important as we won't support installing to systems with NVMe devices (and those systems seem to be growing in numbers)

Bumping this one to high.

The "workaround" for now is to purchase M.2 SSD's that aren't NVMe :-|

comment:8 by kallisti5, 5 years ago

Priority: lowhigh

comment:9 by kallisti5, 5 years ago

Linux kernel NVMe PCIe identification:

FreeBSD driver:

It doesn't look like *too* daunting of a project (esp since QEMU seems to emulate it)

Maybe someone out there will feel zesty and take a crack at it :-)

comment:10 by kallisti5, 5 years ago

I don't think we need to implement a bus like AHCI, IDE, SCSI, etc. Since NVMe devices are attached like PCIe devices, there really isn't a common controller. As such any implementation should likely live here:

The norflash or usb_disk driver would make a good template as it attaches a random block device as a drive:

Examples of drivers detecting supported PCI/PCIe devices:

comment:11 by luroh, 4 years ago

Cc: luroh added

comment:12 by phoudoin, 4 years ago

Cc: phoudoin added

comment:13 by phoudoin, 4 years ago

VirtualBox emulates NVMe controllers too since 5.1.22, FYI.

comment:14 by waddlesplash, 3 years ago

There's a BSD-licensed userspace NVMe implementation here:

It needs libpciaccess (which we have a working port of) and libnuma (which we do not), but it doesn't look like porting it to kernel space and Haiku-specific APIs would be too hard.

comment:15 by KapiX, 3 years ago

Blocking: 14454 added

comment:16 by cb88, 3 years ago

All M.2 SSD's can optionally support SATA, however it's common for the faster ones to only support NVME without backward compatibility. Both B and M key interfaces support Sata, but even then there is no guarantee that the mobo or ssd supports both. I'd imagine in bios emulation of AHCI for purely NVME devices is almost nonexistant.

B key = 2x PCIE + SATA (you have to check your drive and mobo specs)
M key = 4x PCIE + SATA       "                              "
B+M key = SATA (always as it lacks the PCIE I/O)

An M keyed drive is least likely to support SATA.

Haiku should probably put Host Memory Buffer support for NVMe drives on the TODO list also. As it allows a massive speed, up for recent cheaper, and therefore common, "cacheless" drives. This basically means dedicating a chunk of memory to the SSD similar to integrated graphics.

Last edited 3 years ago by cb88 (previous) (diff)

comment:17 by waddlesplash, 3 years ago

AFAIK, "Host Memory Buffer" support is pretty trivial to add once you bring up the controller itself.

I started work on a NVMe driver offline based on libnvme; I should get it posted somewhere.

comment:18 by waddlesplash, 2 years ago

Blocking: 14999 added

comment:19 by cb88, 2 years ago

Wouldn't it make sense to just write a native driver perhaps by adapting a BSD's nvme driver? The libnvme userspace driver seems like it would probably be a lot different from what an in kernel driver should be. And it relies on Linux infrastructure with hugetlbfs etc...

The OpenBSD driver is relatively small... about 4-5k LOC I think across a few files. The FreeBSD driver appears more well organized.

Just for reference the OpenBSD driver was ported to NetBSD and FreeBSD has a driver that appears that it may be more fleshed out, which has contributions from Netflix so is probably solid enough for production use. There are a few more related files in the ic directory also.

comment:20 by waddlesplash, 2 years ago

The libnvme userspace driver seems like it would probably be a lot different from what an in kernel driver should be.

No, it really isn't. It doesn't use interrupts and polls the DMA structures instead, but our USB HCI drivers often do this as a fallback anyway. Interrupts are easy enough to add later on.

And it relies on Linux infrastructure with hugetlbfs etc...

Only because this is the easiest way to get a physically contiguous buffer in an optimized manner. This code is all split out into their "common" directory, it's easy enough to remove it and replace it with code that works on Haiku instead.

libnvme has the distinct advantage over the BSD drivers in that it is both more complete in terms of features, and is also relatively "unopinionated", providing read/write and control APIs like a library, leaving it up to the consumer to decide what to do with that. The BSD drivers are more tuned to their device driver layers.

comment:21 by cb88, 2 years ago

That makes sense, especially if you don't have to wrangle the code into doing what *you* want as much and if the interface isn't really holding you back since the common bits are broken out. Also, with NVMe devices you may actually want to stick with polled IO in many cases, as it's lower latency and NVMe latency can be low enough that the overhead of interrupts vs polling flips backwards from what it would usually be. Which is why Linux added polled IO for NVMe in 4.5

If low priority threads got Interrupt driven IO and everything else polled IO perhaps that would make sense... probably would require benchmarking to tell the minute differences though since it's on the order of tens of microseconds. The cumulative effect may be noticeable though. Since in heavy IO applications it made the difference of tens of percent in performance.

comment:22 by waddlesplash, 2 years ago

Owner: changed from nobody to waddlesplash
Status: newin-progress

Initial (read-only) driver imported in hrev53068.

comment:23 by waddlesplash, 2 years ago

Blocking: 14454 removed

comment:24 by waddlesplash, 2 years ago

Component: Drivers/DiskDrivers/Disk/NVMe
Keywords: NVMe removed

comment:25 by waddlesplash, 2 years ago

Cc: korli added

So, as I mentioned in the commit, it worked initially only because I added a hack for it to find the PCI device. The device manager calls supports_device() for it on a number of items, but only 3 of them are PCI, and none of them have class storage.

korli, if you could take a look here and try to figure out why the device manager seems to be confused here, I'd appreciate it. Building the driver, putting it in /system/non-packaged/add-ons/kernel/drivers/disk and rebooting is sufficient for the kernel to load it; putting it inside the Haiku package and in the boot modules makes no difference.

comment:26 by waddlesplash, 2 years ago

It seems that only "busses" are probed for drivers:

Well, NVMe devices are not "busses", they're just attached to the PCI bus directly. So is it impossible to make this work without defining a "bus"? That sounds a little ridiculous. If that's the case, I'll probably rewrite the NVMe driver to be a legacy driver until the device manager is refactored/rewritten.

comment:27 by waddlesplash, 2 years ago

OK, I figured it out; the NVMe disk driver is now included on the images as of hrev53069. Still read-only however.

comment:28 by waddlesplash, 2 years ago

Resolution: fixed
Status: in-progressclosed

Write support and a initial performance optimizations merged in hrev53079, so this is now fixed. Please open new tickets for any further issues in the driver.

comment:29 by korli, 2 years ago

@waddlesplash we try to outsource external code when possible. I think it would be possible with libnvme, as a source package (although a gcc2 patch is needed). This would be another ticket anyway. what do you think?

comment:30 by waddlesplash, 2 years ago

I strongly disagree. Kernel components should have their source code in the tree unless there is some very good reason not to do so. As libnvme is not very large, the benefits of outsourcing are much smaller than the hasstle it would cause in bootstrapping, etc.

comment:31 by tqh, 2 years ago

I agree with waddlesplash, I've had that pain with GNU-EFI package, and some other package I can't remember. For essential kernel code I'd rather have everything available and instantly editable. If you need to fix anything in an external package you need to:

  • get it locally
  • figure out how to build
  • edit files
  • generate patch and packaging
  • figure out how to use that package
  • figure out how to release

The roundtrip for any kernel development with this is horrible.

comment:32 by cb88, 2 years ago

As a git submodule perhaps? There is probably also pain with maintaining these in tree libraries in sync with upstream right?

comment:33 by kallisti5, 2 years ago

Making it a build-package is ideal as long as Haiku will build when the build-package is unavailable. (aka, no libnvme build-package available?, skip building nvme driver)

Let's keep the bootstrap package list as small as possible.

comment:34 by waddlesplash, 2 years ago

No, the pain of maintaining the in tree libraries is very low. It's actually much less than maintaining the build packages.

The build packages make sense for external noncritical (eg WebKit) or external but large (eg GCC) software. Smaller things like libnvme or the FreeBSD drivers should stay in tree.

There is also the problem for kernel add-ons that at least some of the Haiku specific kernel APIs, eg mutexes, recurisve locks, etc. actually *are not compatible* between KDEBUG and release kernels due to struct layout differences. We really need to fix that, but until someone does, it's impossible to have kernel add-ons in HaikuPorts that use these APIS and work on both.

comment:35 by pulkomandy, 2 years ago

I would also prefer that we minimize the amount of imported code (and Jamfile maintenance) in our git tree.

Managing external sources is painful, but we should fix/improve that, not workaround it by importing 3rd party sources, which only makes them more annoying to update (no one wants to be maintaining the Jamfiles for these when updating to a new upstream version).

We already have qrencode outsourced (for use in the KDL add-on), so there is even an example of 3rd party code for kernel that works.

comment:36 by waddlesplash, 2 years ago

Personally I don't mind maintaining the FreeBSD drivers and now libnvme in the tree; merging stuff is not that bad at all. It's much easier to maintain our own modifications this way. I think glibc is the only thing with complicated Jamfiles really, maintaining those isn't an issue. So at least the components I'm responsible for, I'd greatly prefer they stay in the tree.

comment:37 by nielx, 14 months ago

Milestone: R1R1/beta2

Assign tickets with status=closed and resolution=fixed within the R1/beta2 development window to the R1/beta2 Milestone

Note: See TracTickets for help on using tickets.