Opened 6 years ago

Closed 5 weeks ago

Last modified 2 weeks ago

#9910 closed enhancement (fixed)

NVMe devices support

Reported by: korli Owned by: waddlesplash
Priority: high Milestone: R1
Component: Drivers/Disk/NVMe Version: R1/Development
Keywords: Cc: luroh, phoudoin, korli
Blocked By: Blocking: #14999
Has a Patch: no Platform: All

Description

Qemu 1.6+ offers support for this device type.

Change History (36)

comment:1 Changed 6 years ago by korli

Version: R1/alpha4.1R1/Development

comment:2 Changed 6 years ago by korli

Type: bugenhancement

comment:3 Changed 3 years ago by kallisti5

High level: NVMe is a alternate attachment of SSD devices directly to the PCIe bus. NVMe devices use a "lightweight" communication protocol which is used instead of AHCI.

https://en.wikipedia.org/wiki/NVM_Express

NVMe essentially replaces AHCI for SSD devices. We likely need to get more information on NVMe and see if AHCI can be used until native NVMe support is created.

Since i'm pretty sure AHCI is still available on NVMe devices, i'm dropping the priority of this ticket for now. If the AHCI emulation ever goes away (or isn't available 100% of the time) we should bump up the priority of this one.

comment:4 Changed 3 years ago by kallisti5

Priority: normallow

comment:5 Changed 3 years ago by kallisti5

Keywords: NVMe added

comment:6 in reply to:  3 Changed 3 years ago by korli

Replying to kallisti5:

Since i'm pretty sure AHCI is still available on NVMe devices, i'm dropping the priority of this ticket for now. If the AHCI emulation ever goes away (or isn't available 100% of the time) we should bump up the priority of this one.

I really wonder where you got the idea that NVMe devices can be used with AHCI emulation. Care to elaborate since you're "sure"? :)

comment:7 Changed 3 years ago by kallisti5

Hm. I did a bit more reading and you're correct. I saw mentions of NVMe aware BIOS AHCI emulation, but it was more of a "BIOS has support to boot directly from NVMe devices"

So yeah, NVMe is actually *really* important as we won't support installing to systems with NVMe devices (and those systems seem to be growing in numbers)

Bumping this one to high.

The "workaround" for now is to purchase M.2 SSD's that aren't NVMe :-|

comment:8 Changed 3 years ago by kallisti5

Priority: lowhigh

comment:9 Changed 3 years ago by kallisti5

Linux kernel NVMe PCIe identification: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/nvme/host/pci.c#n2101

FreeBSD driver: https://github.com/freebsd/freebsd/tree/master/sys/dev/nvme

It doesn't look like *too* daunting of a project (esp since QEMU seems to emulate it)

Maybe someone out there will feel zesty and take a crack at it :-)

comment:10 Changed 3 years ago by kallisti5

I don't think we need to implement a bus like AHCI, IDE, SCSI, etc. Since NVMe devices are attached like PCIe devices, there really isn't a common controller. As such any implementation should likely live here:

http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/drivers/disk

The norflash or usb_disk driver would make a good template as it attaches a random block device as a drive:

http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/drivers/disk/norflash http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/drivers/disk/usb/usb_disk

Examples of drivers detecting supported PCI/PCIe devices:

http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/drivers/graphics/nvidia/driver.c http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/drivers/graphics/radeon_hd/driver.cpp

comment:11 Changed 2 years ago by luroh

Cc: luroh added

comment:12 Changed 2 years ago by phoudoin

Cc: phoudoin added

comment:13 Changed 2 years ago by phoudoin

VirtualBox emulates NVMe controllers too since 5.1.22, FYI.

comment:14 Changed 12 months ago by waddlesplash

There's a BSD-licensed userspace NVMe implementation here: https://github.com/hgst/libnvme

It needs libpciaccess (which we have a working port of) and libnuma (which we do not), but it doesn't look like porting it to kernel space and Haiku-specific APIs would be too hard.

comment:15 Changed 9 months ago by KapiX

Blocking: 14454 added

comment:16 Changed 9 months ago by cb88

All M.2 SSD's can optionally support SATA, however it's common for the faster ones to only support NVME without backward compatibility. Both B and M key interfaces support Sata, but even then there is no guarantee that the mobo or ssd supports both. I'd imagine in bios emulation of AHCI for purely NVME devices is almost nonexistant.

B key = 2x PCIE + SATA (you have to check your drive and mobo specs)
M key = 4x PCIE + SATA       "                              "
B+M key = SATA (always as it lacks the PCIE I/O)

An M keyed drive is least likely to support SATA.

Haiku should probably put Host Memory Buffer support for NVMe drives on the TODO list also. As it allows a massive speed, up for recent cheaper, and therefore common, "cacheless" drives. This basically means dedicating a chunk of memory to the SSD similar to integrated graphics. https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2015/20150813_FJ31_Chen_Dorgello.pdf

Last edited 9 months ago by cb88 (previous) (diff)

comment:17 Changed 9 months ago by waddlesplash

AFAIK, "Host Memory Buffer" support is pretty trivial to add once you bring up the controller itself.

I started work on a NVMe driver offline based on libnvme; I should get it posted somewhere.

comment:18 Changed 6 weeks ago by waddlesplash

Blocking: 14999 added

comment:19 Changed 6 weeks ago by cb88

Wouldn't it make sense to just write a native driver perhaps by adapting a BSD's nvme driver? The libnvme userspace driver seems like it would probably be a lot different from what an in kernel driver should be. And it relies on Linux infrastructure with hugetlbfs etc...

The OpenBSD driver is relatively small... about 4-5k LOC I think across a few files. The FreeBSD driver appears more well organized.

Just for reference the OpenBSD driver was ported to NetBSD and FreeBSD has a driver that appears that it may be more fleshed out, which has contributions from Netflix so is probably solid enough for production use.

https://github.com/NetBSD/src/blob/trunk/sys/dev/pci/nvme_pci.c https://github.com/NetBSD/src/blob/trunk/sys/dev/ic/nvme.c There are a few more related files in the ic directory also.

https://github.com/freebsd/freebsd/tree/master/sys/dev/nvme

comment:20 Changed 6 weeks ago by waddlesplash

The libnvme userspace driver seems like it would probably be a lot different from what an in kernel driver should be.

No, it really isn't. It doesn't use interrupts and polls the DMA structures instead, but our USB HCI drivers often do this as a fallback anyway. Interrupts are easy enough to add later on.

And it relies on Linux infrastructure with hugetlbfs etc...

Only because this is the easiest way to get a physically contiguous buffer in an optimized manner. This code is all split out into their "common" directory, it's easy enough to remove it and replace it with code that works on Haiku instead.

libnvme has the distinct advantage over the BSD drivers in that it is both more complete in terms of features, and is also relatively "unopinionated", providing read/write and control APIs like a library, leaving it up to the consumer to decide what to do with that. The BSD drivers are more tuned to their device driver layers.

comment:21 Changed 6 weeks ago by cb88

That makes sense, especially if you don't have to wrangle the code into doing what *you* want as much and if the interface isn't really holding you back since the common bits are broken out. Also, with NVMe devices you may actually want to stick with polled IO in many cases, as it's lower latency and NVMe latency can be low enough that the overhead of interrupts vs polling flips backwards from what it would usually be. Which is why Linux added polled IO for NVMe in 4.5

If low priority threads got Interrupt driven IO and everything else polled IO perhaps that would make sense... probably would require benchmarking to tell the minute differences though since it's on the order of tens of microseconds. The cumulative effect may be noticeable though. Since in heavy IO applications it made the difference of tens of percent in performance.

comment:22 Changed 6 weeks ago by waddlesplash

Owner: changed from nobody to waddlesplash
Status: newin-progress

Initial (read-only) driver imported in hrev53068.

comment:23 Changed 6 weeks ago by waddlesplash

Blocking: 14454 removed

comment:24 Changed 6 weeks ago by waddlesplash

Component: Drivers/DiskDrivers/Disk/NVMe
Keywords: NVMe removed

comment:25 Changed 6 weeks ago by waddlesplash

Cc: korli added

So, as I mentioned in the commit, it worked initially only because I added a hack for it to find the PCI device. The device manager calls supports_device() for it on a number of items, but only 3 of them are PCI, and none of them have class storage.

korli, if you could take a look here and try to figure out why the device manager seems to be confused here, I'd appreciate it. Building the driver, putting it in /system/non-packaged/add-ons/kernel/drivers/disk and rebooting is sufficient for the kernel to load it; putting it inside the Haiku package and in the boot modules makes no difference.

comment:26 Changed 6 weeks ago by waddlesplash

It seems that only "busses" are probed for drivers: http://xref.plausible.coop/source/xref/haiku/src/system/kernel/device_manager/device_manager.cpp#1742

Well, NVMe devices are not "busses", they're just attached to the PCI bus directly. So is it impossible to make this work without defining a "bus"? That sounds a little ridiculous. If that's the case, I'll probably rewrite the NVMe driver to be a legacy driver until the device manager is refactored/rewritten.

comment:27 Changed 6 weeks ago by waddlesplash

OK, I figured it out; the NVMe disk driver is now included on the images as of hrev53069. Still read-only however.

comment:28 Changed 5 weeks ago by waddlesplash

Resolution: fixed
Status: in-progressclosed

Write support and a initial performance optimizations merged in hrev53079, so this is now fixed. Please open new tickets for any further issues in the driver.

comment:29 Changed 2 weeks ago by korli

@waddlesplash we try to outsource external code when possible. I think it would be possible with libnvme, as a source package (although a gcc2 patch is needed). This would be another ticket anyway. what do you think?

comment:30 Changed 2 weeks ago by waddlesplash

I strongly disagree. Kernel components should have their source code in the tree unless there is some very good reason not to do so. As libnvme is not very large, the benefits of outsourcing are much smaller than the hasstle it would cause in bootstrapping, etc.

comment:31 Changed 2 weeks ago by tqh

I agree with waddlesplash, I've had that pain with GNU-EFI package, and some other package I can't remember. For essential kernel code I'd rather have everything available and instantly editable. If you need to fix anything in an external package you need to:

  • get it locally
  • figure out how to build
  • edit files
  • generate patch and packaging
  • figure out how to use that package
  • figure out how to release

The roundtrip for any kernel development with this is horrible.

comment:32 Changed 2 weeks ago by cb88

As a git submodule perhaps? There is probably also pain with maintaining these in tree libraries in sync with upstream right?

comment:33 Changed 2 weeks ago by kallisti5

Making it a build-package is ideal as long as Haiku will build when the build-package is unavailable. (aka, no libnvme build-package available?, skip building nvme driver)

Let's keep the bootstrap package list as small as possible.

comment:34 Changed 2 weeks ago by waddlesplash

No, the pain of maintaining the in tree libraries is very low. It's actually much less than maintaining the build packages.

The build packages make sense for external noncritical (eg WebKit) or external but large (eg GCC) software. Smaller things like libnvme or the FreeBSD drivers should stay in tree.

There is also the problem for kernel add-ons that at least some of the Haiku specific kernel APIs, eg mutexes, recurisve locks, etc. actually *are not compatible* between KDEBUG and release kernels due to struct layout differences. We really need to fix that, but until someone does, it's impossible to have kernel add-ons in HaikuPorts that use these APIS and work on both.

comment:35 Changed 2 weeks ago by pulkomandy

I would also prefer that we minimize the amount of imported code (and Jamfile maintenance) in our git tree.

Managing external sources is painful, but we should fix/improve that, not workaround it by importing 3rd party sources, which only makes them more annoying to update (no one wants to be maintaining the Jamfiles for these when updating to a new upstream version).

We already have qrencode outsourced (for use in the KDL add-on), so there is even an example of 3rd party code for kernel that works.

comment:36 Changed 2 weeks ago by waddlesplash

Personally I don't mind maintaining the FreeBSD drivers and now libnvme in the tree; merging stuff is not that bad at all. It's much easier to maintain our own modifications this way. I think glibc is the only thing with complicated Jamfiles really, maintaining those isn't an issue. So at least the components I'm responsible for, I'd greatly prefer they stay in the tree.

Note: See TracTickets for help on using tickets.