Opened 7 weeks ago

Closed 6 weeks ago

#19105 closed bug (fixed)

FAT: KDLs on device removal without unmount

Reported by: Catmengi
Owned by: nobody
Priority: normal
Milestone: Unscheduled
Component: File Systems/FAT
Version: R1/beta5
Keywords: r1beta5-fixes
Cc: Jim906
Blocked By:
Blocking:
Platform: All

Description (last modified by Catmengi)

After disconnecting a pen drive, a phantom /dev/disk/usb/* node is left behind. The partition is not unmounted, which leads to a "general protection exception" 0x0. Tested on: x64 laptop (i3 6006u, "acer travelmate p259").

Steps to reproduce: connect a pen drive, let Haiku mount the partition, then disconnect the pen drive; /dev/disk/usb/* will stay. If you had not opened the mounted partition before, or did not wait a few seconds (how long varies with CPU speed), and then try to open it, you get a kernel panic.

This issue doesn't exist in previous versions of Haiku, so it comes from the BSD FAT driver. Most likely it doesn't properly handle the hard unmount event, or it lacks proper checks that the device still exists!

Attachments (3)

IMG_20240917_142227.jpg (4.6 MB) - added by Catmengi 7 weeks ago.
Panic log
17265755240137742572183679333737.jpg (1.8 MB) - added by Catmengi 7 weeks ago.
17271170438896877801345546111705.jpg (3.6 MB) - added by Catmengi 6 weeks ago.

Change History (48)

by Catmengi, 7 weeks ago

Attachment: IMG_20240917_142227.jpg added

Panic log

comment:1 by Catmengi, 7 weeks ago

Description: modified (diff)

comment:2 by Catmengi, 7 weeks ago

Description: modified (diff)

comment:3 by waddlesplash, 7 weeks ago

Component: Drivers → File Systems/FAT
Priority: high → normal

If you didn't Unmount/"Eject" before disconnecting the pendrive, then some misbehavior is expected because there may be writes that weren't flushed to the disk, and you may get data loss. The system shouldn't crash, of course, but disconnecting disks without unmounting them isn't good behavior regardless.

comment:4 by Catmengi, 7 weeks ago

Note that the crash happens even without writing anything to it. It crashes on an attempt to read this phantom device.

comment:5 by Catmengi, 7 weeks ago

Also, phantom /dev/disk/usb nodes are quite bad for tty users (and maybe for the system).

comment:6 by waddlesplash, 7 weeks ago

The "phantom nodes" should go away if you unmount the disk properly before removing it. If you don't mount the device at all after plugging it in, they should also go away when you remove it.

comment:7 by Catmengi, 7 weeks ago

Attempting to open the right-click menu leads to a GUI freeze, then a kernel panic. I'll add an attachment.

by Catmengi, 7 weeks ago

Attachment: 17265755240137742572183679333737.jpg added

comment:8 by waddlesplash, 7 weeks ago

This is with the disk still inserted?

comment:9 by Catmengi, 7 weeks ago

Without. With it inserted everything is OK, BUT /dev/disk/usb/* still exists.

comment:10 by Catmengi, 7 weeks ago

NEW INFO: the device nodes are deleted, but /dev/disk/usb/* is not.

comment:11 by Catmengi, 7 weeks ago

Additional info: the partition now disappears after about 5 seconds, but the /dev path still exists; the kernel's device nodes under this path are deleted.

comment:12 by waddlesplash, 7 weeks ago

I would expect the /dev file to disappear only after both unmount and then removal of the device, I think.

comment:13 by Catmengi, 7 weeks ago

Files in the /dev/disk/usb/*/* path are deleted after hot-unplugging the drive. This is ok) The path/directory itself isn't deleted.

comment:14 by Catmengi, 7 weeks ago

This somehow leads to every pen drive reconnection creating a new /dev/disk/usb entry, which leaves garbage in /dev over time.

comment:15 by Catmengi, 7 weeks ago

Please, can you change the ticket to kernel or another suitable component? This bug isn't related to the FAT FS, it's related to /dev.

comment:16 by waddlesplash, 7 weeks ago

Summary: /dev/disk/usb phantom nodes after disconnecting pen drive → FAT: KDLs on device removal without unmount

All the images you have posted are kernel panics in the FAT driver, so it definitely is. Whether or not the node "disappears" is probably related to this.

comment:17 by Catmengi, 7 weeks ago

OK, I agree with this. The current issue is that this phantom disk crashes the kernel. It is not unmounted on hot unplug, and the /dev/disk/usb/* path isn't deleted, which leaves useless garbage in it. Possibly the part of the kernel that works with the pen drive knows that it was disconnected, but the FAT driver has no automatic unmounting mechanism for hot unplug (or maybe it has one). The BeFS driver possibly has this: it doesn't crash the OS, and the partition content is not visible after hot unplug. I think this issue could be solved by adding some kind of signaling from the device driver (the USB drive in this case) to the FAT driver, to force-unmount the partitions that were on this device. But I'm not an expert in OS architecture or Haiku's codebase, so this may not be an ideal solution.

comment:18 by Catmengi, 7 weeks ago

Even attempting to open the right-click menu of this FAT partition crashes the whole system. This isn't the case with BeFS, so this possibly really is a FAT issue. BeFS unmounts properly and automatically.

comment:19 by waddlesplash, 7 weeks ago

If you are pulling the device without unmounting it in the right-click menu, you are definitely going to lose data whether you're on BFS or FAT.

comment:20 by Catmengi, 7 weeks ago

Doesn't matter, the system shouldn't crash.

comment:21 by waddlesplash, 7 weeks ago

Sure, which is why this ticket exists. But I'm just noting that even once the kernel crashes are fixed, pulling devices isn't good behavior anyway and will certainly get you in trouble.

comment:22 by Catmengi, 7 weeks ago

Description: modified (diff)

comment:23 by Catmengi, 7 weeks ago

Description: modified (diff)

comment:24 by Catmengi, 6 weeks ago

Has anything about this issue changed yet?

comment:25 by Catmengi, 6 weeks ago

Is it OK if you (waddlesplash) change this bug's priority to high? Or is it not that critical?

comment:26 by Catmengi, 6 weeks ago

Also, might this be an issue of the new FAT driver only? As I read in the beta 5 changelog, it uses a new FAT driver from FreeBSD, which may handle disconnection of the device differently. I need to do some tests on Haiku R1/beta4.

comment:27 by waddlesplash, 6 weeks ago

Please be patient and don't comment every day. Haiku is a project mostly run by volunteers, things take time.

comment:28 by Catmengi, 6 weeks ago

I discovered that this issue only exists in Haiku R1/beta5. It doesn't exist in beta 4.

comment:29 by Catmengi, 6 weeks ago

Description: modified (diff)

comment:30 by Catmengi, 6 weeks ago

Sorry for yet another comment, but this is new info about the error. The partition can unmount itself. But if you attempt to "ls" this disconnected disk immediately, you get a crash; if you wait 5 seconds, it works properly.

comment:31 by Catmengi, 6 weeks ago

Description: modified (diff)

comment:32 by Catmengi, 6 weeks ago

Found that this error appears at line 721 of kernel_interface.cpp. I have no idea what is causing it.

comment:33 by waddlesplash, 6 weeks ago

The stack trace appears to show that "brelse" is the crashing function, and that's in vfs_bio.c not kernel_interface.cpp.

comment:34 by Catmengi, 6 weeks ago

I am able to fix this issue by adding these lines

	if (bp->b_vp == NULL || bp->b_vp->v_rdev == NULL || bp->b_vp->v_rdev->si_mountpt == NULL)
		return;

at the start of brelse() in src/add-ons/kernel/fat/bsd/kern/vfs_bio.c

(Sorry, I can't make a patch right now, because I had modified this file before this fix.)

comment:35 by Catmengi, 6 weeks ago

This mod caused another issue, "called on busy vnode". I'll add a screenshot of it.

by Catmengi, 6 weeks ago

Attachment: 17271170438896877801345546111705.jpg added

comment:36 by Catmengi, 6 weeks ago

It looks like I fixed this with this patch:

	if (bp->b_vp == NULL || bp->b_vp->v_rdev == NULL) {
		put_buf(bp);
		return;
	}

It no longer crashes in my basic disconnect-ls test.
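
The difference between the guards in comment 34 and comment 36 can be illustrated with a small standalone sketch. Everything in it is a mock: `mock_buf`, `mock_vnode`, `mock_cdev`, `mock_get_buf`, `mock_put_buf` and the `outstanding` counter are hypothetical stand-ins, not the driver's real types or functions. The idea that the buffer left unreleased by the first guard is what triggered "called on busy vnode" is only an assumption drawn from the order of the comments, not a confirmed diagnosis.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures (not the real ones). */
struct mock_cdev { void* si_mountpt; };
struct mock_vnode { struct mock_cdev* v_rdev; };
struct mock_buf { struct mock_vnode* b_vp; int held; };

/* Number of buffers handed out and not yet given back. */
static int outstanding = 0;

static void
mock_get_buf(struct mock_buf* bp, struct mock_vnode* vp)
{
	bp->b_vp = vp;
	bp->held = 1;
	outstanding++;
}

static void
mock_put_buf(struct mock_buf* bp)
{
	assert(bp != NULL && bp->held);
	bp->held = 0;
	outstanding--;
}

/* Guard in the style of comment 34: bails out without giving the buffer back. */
static void
release_guard_only(struct mock_buf* bp)
{
	if (bp->b_vp == NULL || bp->b_vp->v_rdev == NULL
		|| bp->b_vp->v_rdev->si_mountpt == NULL)
		return;
	mock_put_buf(bp);
}

/* Guard in the style of comment 36: gives the buffer back, then bails out. */
static void
release_guard_and_put(struct mock_buf* bp)
{
	if (bp->b_vp == NULL || bp->b_vp->v_rdev == NULL) {
		mock_put_buf(bp);
		return;
	}
	mock_put_buf(bp);
}

/* Simulates one release after the device vanished; returns buffers still held. */
static int
leaked_after_unplug(void (*release)(struct mock_buf*))
{
	struct mock_vnode vnode = { NULL };	/* v_rdev is gone after the unplug */
	struct mock_buf buf;

	outstanding = 0;
	mock_get_buf(&buf, &vnode);
	release(&buf);
	return outstanding;
}
```

In this mock, `leaked_after_unplug(release_guard_only)` leaves one buffer held, while `leaked_after_unplug(release_guard_and_put)` leaves none.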

comment:37 by Catmengi, 6 weeks ago

Also, _HaikuAutoCreated is created immediately.

comment:38 by Catmengi, 6 weeks ago

A note about this patch: it should be inserted at the start of brelse(), before the variables. Otherwise I will try to create a commit myself, but I can't do this right now, sorry.

comment:39 by korli, 6 weeks ago

Cc: Jim906 added

comment:40 by korli, 6 weeks ago

bsdVolume should only be needed, declared and defined when bp->b_vreg == NULL. Thus add a return in the first if block, then declare bsdVolume and others. Then comes the if block for bp->b_owned == false.
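
The ordering described here can be sketched in isolation. The types below and `mock_brelse` are hypothetical stand-ins that only mirror the field names mentioned in this ticket, and the block cache handling is left out. The sketch assumes, as the discussion suggests, that `b_vp` cannot be trusted on the `b_vreg != NULL` path, so nothing behind `b_vp` is read until that case has returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring the field names used in this ticket. */
struct mock_mount { int mnt_id; };
struct mock_cdev { struct mock_mount* si_mountpt; };
struct mock_vnode { struct mock_cdev* v_rdev; };
struct mock_buf {
	struct mock_vnode* b_vp;
	void* b_vreg;
	bool released;
};

static void
mock_put_buf(struct mock_buf* bp)
{
	assert(bp != NULL);
	bp->released = true;
}

/*
 * Returns the volume whose cache would be consulted, or NULL when the
 * buffer took the early path and b_vp was never dereferenced.
 */
static struct mock_mount*
mock_brelse(struct mock_buf* bp)
{
	struct mock_mount* volume;

	if (bp->b_vreg != NULL) {
		/* Early path: release and return before touching b_vp. */
		mock_put_buf(bp);
		return NULL;
	}

	/* Only reached when b_vreg == NULL, so the volume lookup lives here. */
	volume = bp->b_vp->v_rdev->si_mountpt;
	/* (block cache handling omitted in this sketch) */
	mock_put_buf(bp);
	return volume;
}
```

A buffer with `b_vreg` set and `b_vp == NULL` goes through `mock_brelse` without a fault, because the volume lookup is never reached on that path.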

comment:41 by Jim906, 6 weeks ago

I submitted a patch for review (https://review.haiku-os.org/c/haiku/+/8363), based on korli's advice.

comment:42 by Catmengi, 6 weeks ago

Unhandled page fault, because it is accessing a member through a structure pointer that is NULL.

comment:43 by Catmengi, 6 weeks ago

this works:

void
brelse(struct buf* bp)
{
	if (bp->b_vreg != NULL) {
		put_buf(bp);
		return;
	}
	struct mount* bsdVolume = bp->b_vp->v_rdev->si_mountpt;
	void* blockCache = bsdVolume->mnt_cache;
	bool readOnly = MOUNTED_READ_ONLY(VFSTOMSDOSFS(bsdVolume));
	if (bp->b_owned == false) {
		if (readOnly == true)
			block_cache_set_dirty(blockCache, bp->b_blkno, false, -1);
		block_cache_put(blockCache, bp->b_blkno);
		put_buf(bp);
	} else {
		uint32 cBlockCount = bp->b_bufsize / CACHED_BLOCK_SIZE;
		uint32 i;
		for (i = 0; i < cBlockCount && bp->b_bcpointers[i] != NULL; ++i) {
			if (readOnly == true)
				block_cache_set_dirty(blockCache, bp->b_blkno + i, false, -1);
			block_cache_put(blockCache, bp->b_blkno + i);
			bp->b_bcpointers[i] = NULL;
		}

		put_buf(bp);
	}

	return;
}

comment:44 by Catmengi, 6 weeks ago

Sorry, I misunderstood your commit message in Gerrit. I downloaded its latest content and it is working now)

comment:45 by waddlesplash, 6 weeks ago

Keywords: r1beta5-fixes added
Resolution: fixed
Status: new → closed

Fixed in hrev58165 +beta5.
