Opened 23 months ago

Closed 21 months ago

Last modified 21 months ago

#15818 closed bug (fixed)

NVMe data loss

Reported by: KapiX
Owned by: waddlesplash
Priority: normal
Milestone: R1/beta2
Component: Drivers/Disk/NVMe
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

Symptoms:

  1. Invalid opcode errors when running executables from the drive; ninja crashes while trying to rebuild WebKit. After rebooting, I had to rebuild the entire project.
  2. Now my WebKit git repo is corrupted (unknown index entry format 0x74650000).

Nothing suspicious in the syslog. I have been running Linux from the same drive without issues for 3 months now.

Drive: Intel 760p 512GB.

Change History (8)

comment:1 by KapiX, 23 months ago

hrev53992 64-bit, checkfs doesn't complain.

comment:2 by KapiX, 23 months ago

rm -f .git/index; git reset -> #15144

comment:3 by KapiX, 23 months ago

Component: - General → Drivers/Disk/NVMe
Owner: changed from nobody to waddlesplash

comment:4 by waddlesplash, 23 months ago

There's also #15123 which is a series of very strange KDLs under VMware, which I couldn't reproduce on QEMU, so I guess I should try again there.

X512 mentioned that the bug may be in the kernel disk cache code (the stack traces there seem to indicate it), which may not be able to tolerate NVMe returning results in parallel. However, the "completion event for unknown cmd" is much more suspicious; I still don't know what to make of that.

comment:5 by waddlesplash, 23 months ago

KDL probably resolved in hrev53997, but corruption remains.

comment:6 by waddlesplash, 23 months ago

Actually, the "corruption" may have just been on the underlying disk, compounded by some BFS driver issues that appear when corruption is present. See the comment in #15123; however, I'd wait to retest until Diver confirms that it is gone.

comment:7 by waddlesplash, 21 months ago

Resolution: fixed
Status: new → closed

Fixed in hrev54102, but be a bit cautious at first.

comment:8 by nielx, 21 months ago

Milestone: Unscheduled → R1/beta2

Assign tickets with status=closed and resolution=fixed within the R1/beta2 development window to the R1/beta2 Milestone
