Opened 7 years ago

Last modified 4 years ago

#9449 new bug

bfs: SegFault while unmounting at shutdown

Reported by: jahaiku
Owned by: axeld
Priority: normal
Milestone: R1
Component: File Systems/BFS
Version: R1/alpha4.1
Keywords:
Cc:
Blocked By:
Blocking: #10443
Has a Patch: no
Platform: All

Description

This is a gcc4 build of hrev45257 on real hardware (EeePC).

For a long time I have installed Haiku by writing an anyboot image to a USB stick and booting from it. Now I get a reproducible segfault with these steps:

  1. Boot from the anyboot USB stick
  2. Choose language and keyboard in "Welcome Haiku"
  3. Choose "Installer" and NOT "Boot to Desktop"
  4. Set up the disk (the HDD in the EeePC) with DriveSetup (I used it raw, without partitions)
  5. Install Haiku onto it
  6. After the install has finished, click "Reboot" in the Installer
  7. You will see activity on the HDD, perhaps the unmount of the HDD
  8. You will see activity on the booted USB stick, and then the segfault occurs

Stack trace attached as two JPGs.

Attachments (5)

bfs1.jpg (353.0 KB ) - added by jahaiku 7 years ago.
bfs2.jpg (327.1 KB ) - added by jahaiku 7 years ago.
SAM_0580.JPG (577.9 KB ) - added by dsjonny 7 years ago.
bfs3.jpg (287.1 KB ) - added by jahaiku 7 years ago.
SAM_0611.JPG (312.5 KB ) - added by dsjonny 7 years ago.

Change History (22)

by jahaiku, 7 years ago

Attachment: bfs1.jpg added

by jahaiku, 7 years ago

Attachment: bfs2.jpg added

comment:1 by jahaiku, 7 years ago

I installed without this segfault 2-3 weeks ago. Perhaps it is a side effect of the latest changes to the partitioning code (GPT support added).

comment:2 by dsjonny, 7 years ago

I get the same problem after pressing the "Reboot" button in the Installer. I have attached an image too. (I have been seeing this error for several nightly images now.)

by dsjonny, 7 years ago

Attachment: SAM_0580.JPG added

comment:3 by axeld, 7 years ago

jahaiku: is the debugger message the same as for dsjonny? Please always make sure to include that one in the screen shots.

comment:4 by dsjonny, 7 years ago

bfs1.jpg (the second part of the image) looks very similar to my image; that's why I attached mine too.

in reply to:  3 comment:5 by jahaiku, 7 years ago

Replying to axeld:

jahaiku: is the debugger message the same as for dsjonny? Please always make sure to include that one in the screen shots.

I think it is the same. My bfs1.jpg shows the stack trace that was shown automatically when Haiku entered the kernel debugger. bfs2.jpg was taken right after the first one, after additionally typing "bt" in the kernel debugger. This is on an EeePC, which has only 800x480, so it is a bit difficult to see the complete backtraces.

comment:6 by axeld, 7 years ago

The "message" command shows the KDL message again.

comment:7 by jahaiku, 7 years ago

Here is the screen shot (bfs3.jpg) of the message of the StackTrace.

by jahaiku, 7 years ago

Attachment: bfs3.jpg added

comment:8 by dsjonny, 7 years ago

I got it again. After I installed Haiku to my SSD from USB (using a nightly anyboot image), I pressed the "Reboot" button in the Installer and got a KDL.

I found these lines in the syslog; maybe they help:

usb_disk: unhandled ioctl 10101
usb error ehci -1: qtd (0x7ecbb80) error: 0x88008c40
usb_disk: operation 0x2a failed at the SCSI level
usb_disk: write fails with 0x8000a003
PageWriteWrapper: Failed to write page 0x83c32d18: No media present
usb error ehci -1: qtd (0x7ed3500) error: 0x08008c40
usb_disk: operation 0x2a failed at the SCSI level
usb_disk: write fails with 0x8000a003
PageWriteWrapper: Failed to write page 0x83c32d18: No media present
usb error ehci -1: qtd (0x7ed7f80) error: 0x08008c40
usb_disk: operation 0x2a failed at the SCSI level
usb_disk: write fails with 0x8000a003
bfs: could not write log area: No media present!
usb error ehci -1: qtd (0x7ed8a80) error: 0x02008c40
usb_disk: operation 0x2a failed at the SCSI level
usb_disk: write fails with 0x8000a003
usb_disk: operation 0x35 failed at the SCSI level
bfs: writing current log entry failed: I/O error
usb error ehci -1: qtd (0x7edc100) error: 0x10008c40
usb_disk: operation 0x2a failed at the SCSI level
usb_disk: write fails with 0x8000a003
PageWriteWrapper: Failed to write page 0x83c32d18: No media present
usb error ehci -1: qtd (0x7ede800) error: 0x10008c40
usb_disk: operation 0x2a failed at the SCSI level
usb_disk: write fails with 0x8000a003
PageWriteWrapper: Failed to write page 0x83c32d18: No media present

Maybe the system unmounts the disks before it has written everything it wants to write?

I have also attached a new image to the ticket.
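
For readers of the syslog excerpt above, here is a minimal sketch of how the numeric codes can be decoded. The SCSI opcode names (0x2a is WRITE(10), 0x35 is SYNCHRONIZE CACHE(10)) come from the SCSI command set. The mapping of 0x8000a003 to Haiku's B_DEV_NO_MEDIA is an assumption based on the "No media present" text that accompanies it in the log. The function and constant names are hypothetical and not part of usb_disk.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: name the SCSI opcodes that usb_disk reports as failed.
// Only the opcodes relevant to this log are listed.
std::string ScsiOpcodeName(uint8_t opcode)
{
	switch (opcode) {
		case 0x28: return "READ(10)";
		case 0x2a: return "WRITE(10)";
		case 0x35: return "SYNCHRONIZE CACHE(10)";
		default: return "unknown";
	}
}

// Assumption: 0x8000a003 is the status printed as "No media present"
// (believed to be B_DEV_NO_MEDIA in Haiku's error codes).
const uint32_t kAssumedNoMediaStatus = 0x8000a003u;

// Returns true if a logged status value means the medium is gone.
bool IsNoMediaStatus(uint32_t status)
{
	return status == kAssumedNoMediaStatus;
}
```

Read this way, the log shows repeated WRITE(10) failures and one failed SYNCHRONIZE CACHE(10), all after the device had already stopped reporting a medium.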

by dsjonny, 7 years ago

Attachment: SAM_0611.JPG added

comment:9 by korli, 6 years ago

Can you reproduce with a current nightly?

comment:10 by ithamar, 4 years ago

I have seen the same KDL as in the last picture when removing an R/W-mounted BFS partition on a USB disk. I suspect the USB device is already gone when the sync call is made, and BlockWriter calls panic() because it cannot write some blocks back to the device.

I'm looking into the USB removal case, I'll post my findings here.

comment:11 by vidrep, 4 years ago

I believe my ticket #10443 is a duplicate.

comment:12 by pulkomandy, 4 years ago

Blocking: 10443 added

comment:13 by ithamar, 4 years ago

OK, in the case of my USB disk removal, the USB stack's notification that the device was removed arrives after BFS tries to update the journal. Since the BlockWriter in the block cache panics on failed writes (and reads), the panic comes before the notification. I simply tried turning the panics into TRACE_ALWAYS() calls; I then see a bunch of them in the syslog just before the "Device Removed" notification comes by, and things seem to be fine.

I've had a quick look at the Installer code, but I don't see it ejecting the boot volume or anything like that. Is this triggered from some other part of the system? Somehow it seems the USB (boot) disk has disappeared by the time the block cache tries to write to it. In my removal case there's no hope of ever writing the blocks, but in this Installer case I guess it should get the chance.
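
The change described in this comment (report a failed write instead of panicking) can be illustrated with a small standalone sketch. This is not the Haiku block cache code: WriteFunc, WriteResult, and WriteBackBlocks are hypothetical names, and a message on stderr stands in for TRACE_ALWAYS().

```cpp
#include <cstdio>
#include <functional>
#include <vector>

// Hypothetical stand-in for a device write: returns 0 on success,
// a negative error code on failure.
using WriteFunc = std::function<int(int blockNumber)>;

struct WriteResult {
	int written = 0;
	int failed = 0;
};

// Writes back all dirty blocks. A failed write is logged and counted
// instead of aborting, so the caller can still handle a later
// "device removed" notification.
WriteResult WriteBackBlocks(const std::vector<int>& dirtyBlocks,
	const WriteFunc& writeBlock)
{
	WriteResult result;
	for (int block : dirtyBlocks) {
		int status = writeBlock(block);
		if (status < 0) {
			// The behaviour discussed in the ticket would panic here.
			std::fprintf(stderr,
				"block writer: could not write block %d: error %d\n",
				block, status);
			result.failed++;
			continue;
		}
		result.written++;
	}
	return result;
}
```

With a write function that always fails (simulating a device that is already gone), the loop finishes and reports every block as failed instead of stopping at the first one. The trade-off is the one raised in this thread: logging keeps the system running, but it can also hide bugs that a panic would have exposed.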

comment:14 by vidrep, 4 years ago

Ticket #10462 is related and can be added to the batch.

in reply to:  13 ; comment:15 by axeld, 4 years ago

Replying to ithamar:

I've simply tried making the panics TRACE_ALWAYS() and I see a bunch of them in the syslog then, just before the "Device Removed" notification comes by, and things seem to be fine.

The panic there should definitely be removed for releases. Or maybe we're already at a point where we can do that. It's just there to help stumble upon bugs that might otherwise be silently ignored.

in reply to:  15 comment:16 by ithamar, 4 years ago

Replying to axeld:

The panic there should definitely be removed for releases. Or maybe we're already at a point where we can do that. It's just there to help stumble upon bugs that might otherwise be silently ignored.

Agreed, though in the case of this ticket, I'm still surprised to see it triggered. The USB removal case is clear-cut, device is gone, but here it seems USB becomes inaccessible at a time it should not be yet. I hope to have some time this week to see if I can reproduce this issue and figure out where the problem is coming from....

comment:17 by vidrep, 4 years ago

While doing USB anyboot installs for another ticket, I noted that not all builds of the same revision cause a KDL after clicking "reboot" in the installer. Using hrev49132, and using the same USB flash drive, I get a KDL with x86_gcc2 and x86_64, but not with x86. I'll try x86_gcc4 next and edit the comments.