Opened 2 months ago

Closed 2 months ago

#19105 closed bug (fixed)

FAT: KDLs on device removal without unmount

Reported by: Catmengi Owned by: nobody
Priority: normal Milestone: Unscheduled
Component: File Systems/FAT Version: R1/beta5
Keywords: r1beta5-fixes Cc: Jim906
Blocked By: Blocking:
Platform: All

Description (last modified by Catmengi)

After disconnecting pen drive it leaves phantom /dev/disk/usb/* node. Partition isn't unmounting which leads to "general protection exception" 0x0 Tested on: x64 laptop (i3 6006u "acer travelmate p259")

Steps to recreate: connect pendrive, let haiku mount partition, disconnect pendrive, /dev/disk/usb/* will stay. If you didn't open mounted or waited seconds (vary on cpu speed) partition before and try to open it you will have kernel panic

This issue doesnt exist in previous versions of haiku. So this issue is from bsd fat driver. It mostly dont proper handle hard unmount event / or dont have proper checks of existence of device!

Attachments (3)

IMG_20240917_142227.jpg (4.6 MB ) - added by Catmengi 2 months ago.
Panic log
17265755240137742572183679333737.jpg (1.8 MB ) - added by Catmengi 2 months ago.
17271170438896877801345546111705.jpg (3.6 MB ) - added by Catmengi 2 months ago.

Change History (48)

by Catmengi, 2 months ago

Attachment: IMG_20240917_142227.jpg added

Panic log

comment:1 by Catmengi, 2 months ago

Description: modified (diff)

comment:2 by Catmengi, 2 months ago

Description: modified (diff)

comment:3 by waddlesplash, 2 months ago

Component: DriversFile Systems/FAT
Priority: highnormal

If you didn't Unmount/"Eject" before disconnecting the pendrive, then some misbehavior is expected because there may be writes that weren't flushed to the disk, and you may get data loss. The system shouldn't crash, of course, but disconnecting disks without unmounting them isn't good behavior regardless.

comment:4 by Catmengi, 2 months ago

Note that crash is going even without writing anything to it. It crashes on attempt to read this phantom device

comment:5 by Catmengi, 2 months ago

Also phantom /dev/disk/usb nodes is quite bad for tty users (and maybe system)

comment:6 by waddlesplash, 2 months ago

The "phantom nodes" should go away if you unmount the disk properly before removing it. If you don't mount the device at all after plugging it in, they should also go away when you remove it in that case too.

comment:7 by Catmengi, 2 months ago

Attempt to open right click button leads to gui freeze, then kernel panic. I'll add an attachment

by Catmengi, 2 months ago

comment:8 by waddlesplash, 2 months ago

This is with the disk still inserted?

comment:9 by Catmengi, 2 months ago

Without. With inserted is everything ok, BUT /dev/disk/usb/* still exist

comment:10 by Catmengi, 2 months ago

NEW INFO: device nodes is deleting but /dev/disk/usb/* is not deleted

comment:11 by Catmengi, 2 months ago

Additional info: partition disappeared now, in 5 seconds ± but /dev path still exists, kernel's device nodes in this path is deleted

comment:12 by waddlesplash, 2 months ago

I would expect the /dev file to disappear only after both unmount and then removal of the device, I think.

comment:13 by Catmengi, 2 months ago

Files in /dev/disk/usb/*/* path is deleting after hot unplug drive. This is ok) path/directory isn't deleting itself

comment:14 by Catmengi, 2 months ago

This somehow leading that every pen-drive reconnection create new /dev/disk/usb entry. Leading to garbage in /dev after time

comment:15 by Catmengi, 2 months ago

Please, can you change ticket to kernel/other suitable? This bug isnt related to FAT FS, its related to /dev

comment:16 by waddlesplash, 2 months ago

Summary: /dev/disk/usb phantom nodes after disconnecting pen driveFAT: KDLs on device removal without unmount

All the images you have posted are kernel panics in the FAT driver, so it definitely is. Whether or not the node "disappears" is probably related to this.

comment:17 by Catmengi, 2 months ago

Ok, i agree with this. Currently the issue that this phantom disk crashes kernel. It's not unmounted on hot unplug. /dev/disk/usb/* path isnt deleted leading to useless garbage in it. Possible, the part of kernel that work with pen drive know that it disconnected, but there isnt any automatic unmounting mechanism(or it is) for hot unplug in FAT driver (befs driver possible have this, it isnt crashing os and partition content is not visible after hot unplug). I think this issue can be solved in this way(adding some kind of signaling from device driver, usb drive in this case to fat driver to force unmount partitions that was on this device). But im not an expert in os architecture and haiku's codebase, so this may not be an ideal solution.

comment:18 by Catmengi, 2 months ago

Even attempt to open right click menu of this fat partition crashing the whole system. This isnt the case with BeFS, so this possible really fat issue. Befs unmount property automatically.

comment:19 by waddlesplash, 2 months ago

If you are pulling the device without unmounting it in the right-click menu, you are definitely going to lose data whether you're on BFS or FAT.

comment:20 by Catmengi, 2 months ago

Doesn't matter system shouldn't crash

comment:21 by waddlesplash, 2 months ago

Sure, which is why this ticket exists. But I'm just noting that even once the kernel crashes are fixed, pulling devices isn't good behavior anyway and will certainly get you in trouble.

comment:22 by Catmengi, 2 months ago

Description: modified (diff)

comment:23 by Catmengi, 2 months ago

Description: modified (diff)

comment:24 by Catmengi, 2 months ago

Is anything about this issue changed yet?

comment:25 by Catmengi, 2 months ago

Is it ok if you (waddlesplash) change this bug priority to high? Or it isnt too much critical?

comment:26 by Catmengi, 2 months ago

Also may this be an issue of the new fat driver only? As i read from the beta 5 change log, it use new fat driver from freebsd which may have another method of handling disconnect of the device. I need to do some test on haiku hrev1 beta 4

comment:27 by waddlesplash, 2 months ago

Please be patient and don't comment every day. Haiku is a project mostly run by volunteers, things take time.

comment:28 by Catmengi, 2 months ago

I discovered that this issue only exist in haiku hrev1/beta 5. This issue doesn't exist in beta 4

comment:29 by Catmengi, 2 months ago

Description: modified (diff)

comment:30 by Catmengi, 2 months ago

Sorry for yet another comment but this is new info about error. Partition can unmount itself. But if you attempt to "ls" immediately in this disconnected disk you will have crash, if you will wait 5 seconds it will work properly

comment:31 by Catmengi, 2 months ago

Description: modified (diff)

comment:32 by Catmengi, 2 months ago

Found that this error appear in line 721 kernel_interface.cpp i have no idea what causing it

comment:33 by waddlesplash, 2 months ago

The stack trace appears to show that "brelse" is the crashing function, and that's in vfs_bio.c not kernel_interface.cpp.

comment:34 by Catmengi, 2 months ago

I am able to fix this issue by adding this line

	if(bp->b_vp == NULL || bp->b_vp->v_rdev == NULL || bp->b_vp->v_rdev->si_mountpt == NULL)
		return;

into src/add-ons/kernel/fat/bsd/kern/vfs_bio.c into start of brelse()

(sorry i cant make this patch now, because i modified this file before this fix)

comment:35 by Catmengi, 2 months ago

This mod caused another issue "called on busy vnode", i'll add screnshot of it

by Catmengi, 2 months ago

comment:36 by Catmengi, 2 months ago

Looks like that i fixed this in this patch:

	if(bp->b_vp == NULL || bp->b_vp->v_rdev == NULL) {
		put_buf(bp);
		return;
	}

it not crashes in my basic disconnect-ls test now

comment:37 by Catmengi, 2 months ago

Also _HaikuAutoCreated creates immediately

comment:38 by Catmengi, 2 months ago

Note about this patch, it should be inserted at start of brelse, before variables, or i will try to create a commit my self, but i cant done this right now, sorry

comment:39 by korli, 2 months ago

Cc: Jim906 added

comment:40 by korli, 2 months ago

bsdVolume should only be needed, declared and defined when bp->b_vreg == NULL. Thus add a return in the first if block, then declare bsdVolume and others. Then comes the if block for bp->b_owned == false.

comment:41 by Jim906, 2 months ago

I submitted a patch for review (https://review.haiku-os.org/c/haiku/+/8363), based on korli's advice.

comment:42 by Catmengi, 2 months ago

Unhandled page fault because it accessing element of structure pointer that is NULL

comment:43 by Catmengi, 2 months ago

this works:

void
brelse(struct buf* bp)
{
	if (bp->b_vreg != NULL) {
		put_buf(bp);
		return;
	}
	struct mount* bsdVolume = bp->b_vp->v_rdev->si_mountpt;
	void* blockCache = bsdVolume->mnt_cache;
	bool readOnly = MOUNTED_READ_ONLY(VFSTOMSDOSFS(bsdVolume));
	if (bp->b_owned == false) {
		if (readOnly == true)
			block_cache_set_dirty(blockCache, bp->b_blkno, false, -1);
		block_cache_put(blockCache, bp->b_blkno);
		put_buf(bp);
	} else {
		uint32 cBlockCount = bp->b_bufsize / CACHED_BLOCK_SIZE;
		uint32 i;
		for (i = 0; i < cBlockCount && bp->b_bcpointers[i] != NULL; ++i) {
			if (readOnly == true)
				block_cache_set_dirty(blockCache, bp->b_blkno + i, false, -1);
			block_cache_put(blockCache, bp->b_blkno + i);
			bp->b_bcpointers[i] = NULL;
		}

		put_buf(bp);
	}

	return;
}

comment:44 by Catmengi, 2 months ago

Sorry, i miss understood your commit message in gerrit, i downloaded it latest content and it working now)

comment:45 by waddlesplash, 2 months ago

Keywords: r1beta5-fixes added
Resolution: fixed
Status: newclosed

Fixed in hrev58165 +beta5.

Note: See TracTickets for help on using tickets.