Opened 3 months ago
Closed 3 months ago
#19105 closed bug (fixed)
FAT: KDLs on device removal without unmount
Reported by: | Catmengi | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | File Systems/FAT | Version: | R1/beta5 |
Keywords: | r1beta5-fixes | Cc: | Jim906 |
Blocked By: | Blocking: | ||
Platform: | All |
Description (last modified by )
After disconnecting pen drive it leaves phantom /dev/disk/usb/* node. Partition isn't unmounting which leads to "general protection exception" 0x0 Tested on: x64 laptop (i3 6006u "acer travelmate p259")
Steps to recreate: connect pendrive, let haiku mount partition, disconnect pendrive, /dev/disk/usb/* will stay. If you didn't open mounted or waited seconds (vary on cpu speed) partition before and try to open it you will have kernel panic
This issue doesnt exist in previous versions of haiku. So this issue is from bsd fat driver. It mostly dont proper handle hard unmount event / or dont have proper checks of existence of device!
Attachments (3)
Change History (48)
by , 3 months ago
Attachment: | IMG_20240917_142227.jpg added |
---|
comment:1 by , 3 months ago
Description: | modified (diff) |
---|
comment:2 by , 3 months ago
Description: | modified (diff) |
---|
comment:3 by , 3 months ago
Component: | Drivers → File Systems/FAT |
---|---|
Priority: | high → normal |
If you didn't Unmount/"Eject" before disconnecting the pendrive, then some misbehavior is expected because there may be writes that weren't flushed to the disk, and you may get data loss. The system shouldn't crash, of course, but disconnecting disks without unmounting them isn't good behavior regardless.
comment:4 by , 3 months ago
Note that crash is going even without writing anything to it. It crashes on attempt to read this phantom device
comment:5 by , 3 months ago
Also phantom /dev/disk/usb nodes is quite bad for tty users (and maybe system)
comment:6 by , 3 months ago
The "phantom nodes" should go away if you unmount the disk properly before removing it. If you don't mount the device at all after plugging it in, they should also go away when you remove it in that case too.
comment:7 by , 3 months ago
Attempt to open right click button leads to gui freeze, then kernel panic. I'll add an attachment
by , 3 months ago
Attachment: | 17265755240137742572183679333737.jpg added |
---|
comment:9 by , 3 months ago
Without. With inserted is everything ok, BUT /dev/disk/usb/* still exist
comment:11 by , 3 months ago
Additional info: partition disappeared now, in 5 seconds ± but /dev path still exists, kernel's device nodes in this path is deleted
comment:12 by , 3 months ago
I would expect the /dev file to disappear only after both unmount and then removal of the device, I think.
comment:13 by , 3 months ago
Files in /dev/disk/usb/*/* path is deleting after hot unplug drive. This is ok) path/directory isn't deleting itself
comment:14 by , 3 months ago
This somehow leading that every pen-drive reconnection create new /dev/disk/usb entry. Leading to garbage in /dev after time
comment:15 by , 3 months ago
Please, can you change ticket to kernel/other suitable? This bug isnt related to FAT FS, its related to /dev
comment:16 by , 3 months ago
Summary: | /dev/disk/usb phantom nodes after disconnecting pen drive → FAT: KDLs on device removal without unmount |
---|
All the images you have posted are kernel panics in the FAT driver, so it definitely is. Whether or not the node "disappears" is probably related to this.
comment:17 by , 3 months ago
Ok, i agree with this. Currently the issue that this phantom disk crashes kernel. It's not unmounted on hot unplug. /dev/disk/usb/* path isnt deleted leading to useless garbage in it. Possible, the part of kernel that work with pen drive know that it disconnected, but there isnt any automatic unmounting mechanism(or it is) for hot unplug in FAT driver (befs driver possible have this, it isnt crashing os and partition content is not visible after hot unplug). I think this issue can be solved in this way(adding some kind of signaling from device driver, usb drive in this case to fat driver to force unmount partitions that was on this device). But im not an expert in os architecture and haiku's codebase, so this may not be an ideal solution.
comment:18 by , 3 months ago
Even attempt to open right click menu of this fat partition crashing the whole system. This isnt the case with BeFS, so this possible really fat issue. Befs unmount property automatically.
comment:19 by , 3 months ago
If you are pulling the device without unmounting it in the right-click menu, you are definitely going to lose data whether you're on BFS or FAT.
comment:21 by , 3 months ago
Sure, which is why this ticket exists. But I'm just noting that even once the kernel crashes are fixed, pulling devices isn't good behavior anyway and will certainly get you in trouble.
comment:22 by , 3 months ago
Description: | modified (diff) |
---|
comment:23 by , 3 months ago
Description: | modified (diff) |
---|
comment:25 by , 3 months ago
Is it ok if you (waddlesplash) change this bug priority to high? Or it isnt too much critical?
comment:26 by , 3 months ago
Also may this be an issue of the new fat driver only? As i read from the beta 5 change log, it use new fat driver from freebsd which may have another method of handling disconnect of the device. I need to do some test on haiku hrev1 beta 4
comment:27 by , 3 months ago
Please be patient and don't comment every day. Haiku is a project mostly run by volunteers, things take time.
comment:28 by , 3 months ago
I discovered that this issue only exist in haiku hrev1/beta 5. This issue doesn't exist in beta 4
comment:29 by , 3 months ago
Description: | modified (diff) |
---|
comment:30 by , 3 months ago
Sorry for yet another comment but this is new info about error. Partition can unmount itself. But if you attempt to "ls" immediately in this disconnected disk you will have crash, if you will wait 5 seconds it will work properly
comment:31 by , 3 months ago
Description: | modified (diff) |
---|
comment:32 by , 3 months ago
Found that this error appear in line 721 kernel_interface.cpp i have no idea what causing it
comment:33 by , 3 months ago
The stack trace appears to show that "brelse" is the crashing function, and that's in vfs_bio.c not kernel_interface.cpp.
comment:34 by , 3 months ago
I am able to fix this issue by adding this line
if(bp->b_vp == NULL || bp->b_vp->v_rdev == NULL || bp->b_vp->v_rdev->si_mountpt == NULL) return;
into src/add-ons/kernel/fat/bsd/kern/vfs_bio.c into start of brelse()
(sorry i cant make this patch now, because i modified this file before this fix)
comment:35 by , 3 months ago
This mod caused another issue "called on busy vnode", i'll add screnshot of it
by , 3 months ago
Attachment: | 17271170438896877801345546111705.jpg added |
---|
comment:36 by , 3 months ago
Looks like that i fixed this in this patch:
if(bp->b_vp == NULL || bp->b_vp->v_rdev == NULL) { put_buf(bp); return; }
it not crashes in my basic disconnect-ls test now
comment:38 by , 3 months ago
Note about this patch, it should be inserted at start of brelse, before variables, or i will try to create a commit my self, but i cant done this right now, sorry
comment:39 by , 3 months ago
Cc: | added |
---|
comment:40 by , 3 months ago
bsdVolume should only be needed, declared and defined when bp->b_vreg == NULL. Thus add a return in the first if block, then declare bsdVolume and others. Then comes the if block for bp->b_owned == false.
comment:41 by , 3 months ago
I submitted a patch for review (https://review.haiku-os.org/c/haiku/+/8363), based on korli's advice.
comment:42 by , 3 months ago
Unhandled page fault because it accessing element of structure pointer that is NULL
comment:43 by , 3 months ago
this works:
void brelse(struct buf* bp) { if (bp->b_vreg != NULL) { put_buf(bp); return; } struct mount* bsdVolume = bp->b_vp->v_rdev->si_mountpt; void* blockCache = bsdVolume->mnt_cache; bool readOnly = MOUNTED_READ_ONLY(VFSTOMSDOSFS(bsdVolume)); if (bp->b_owned == false) { if (readOnly == true) block_cache_set_dirty(blockCache, bp->b_blkno, false, -1); block_cache_put(blockCache, bp->b_blkno); put_buf(bp); } else { uint32 cBlockCount = bp->b_bufsize / CACHED_BLOCK_SIZE; uint32 i; for (i = 0; i < cBlockCount && bp->b_bcpointers[i] != NULL; ++i) { if (readOnly == true) block_cache_set_dirty(blockCache, bp->b_blkno + i, false, -1); block_cache_put(blockCache, bp->b_blkno + i); bp->b_bcpointers[i] = NULL; } put_buf(bp); } return; }
comment:44 by , 3 months ago
Sorry, i miss understood your commit message in gerrit, i downloaded it latest content and it working now)
comment:45 by , 3 months ago
Keywords: | r1beta5-fixes added |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Fixed in hrev58165 +beta5.
Panic log