Opened 16 years ago

Closed 15 years ago

#2400 closed bug (fixed)

[vfs]: vnode is not becoming unbusy

Reported by: emitrax
Owned by: emitrax
Priority: high
Milestone: R1
Component: System/Kernel
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

I packed the whole Haiku source code and put it on a USB stick. Trying to unpack it with Haiku on VMware causes a KDL after a while. Screenshots attached.

Assigning this to myself since it's part of my HCD.

Attachments (5)

bfs_crash.jpg (131.7 KB ) - added by emitrax 16 years ago.
unpacking_crash-st-page-writer.jpg (64.8 KB ) - added by emitrax 16 years ago.
unpacking_crash-st-traced.jpg (166.7 KB ) - added by emitrax 16 years ago.
bfs crash r26082.JPG (148.9 KB ) - added by atomozero 16 years ago.
bt.jpg (295.2 KB ) - added by luroh 16 years ago.


Change History (14)

by emitrax, 16 years ago

Attachment: bfs_crash.jpg added

comment:1 by emitrax, 16 years ago

Summary: [BFS]: KDL while unpacking tarball containing haiku source code → [vfs]: vnode is not becoming unbusy

This seems to be more of a VFS bug. This time it was the page writer that caused the crash.

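by emitrax, 16 years ago

Attachment: unpacking_crash-st-page-writer.jpg added

by emitrax, 16 years ago

Attachment: unpacking_crash-st-traced.jpg added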

by atomozero, 16 years ago

Attachment: bfs crash r26082.JPG added

comment:2 by atomozero, 16 years ago

I have reproduced the bug with a Centrino laptop (1.40 GHz) and 512 MB of RAM (Haiku sees only 503 MB in the About window).

comment:3 by emitrax, 16 years ago

Component: File Systems/BFS → System/Kernel

Although the situation has improved since the introduction of the I/O scheduler, this still happens under heavy I/O operations.

The problem may be related to the ide/scsi module, as Ingo (or Axel, I don't really remember) suggested.

Quoting Axel from the haiku-gsoc mailing list, since this happens in low-memory situations: "Codepaths necessary to recover from extreme memory shortage should have some preallocated buffers around they can use then". (Marcus?)

comment:4 by bonefish, 16 years ago

I believe I have already mentioned this on the mailing list: this doesn't need to be any kind of deadlock due to memory problems. The "low memory handler" stack trace even suggests that it is still running (but was preempted). I still think it's possible that syncing the file in question simply took longer than the 10 seconds get_vnode() waits. IIRC you've already tried to increase the timeout without success, which could mean that the low memory handler was starved for some reason. Enabling scheduler tracing would help to verify this.

If you have a test case that allows to reproduce the problem more or less reliably, I could have a look.
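The timeout scenario from comment:4 can be pictured with a small sketch. This is an analogy in shell, not Haiku's kernel code: the marker file standing in for the busy flag, the `wait_unbusy` name, and the one-second polling interval are all made up for illustration. Only the idea of a bounded wait (10 seconds in the comment above) comes from the ticket.

```shell
#!/bin/sh
# Hypothetical illustration: wait for a "busy" marker to clear, but give
# up after a timeout. The real code waits on a vnode, not on a file.

# wait_unbusy MARKER TIMEOUT_SECONDS
# Returns 0 once MARKER no longer exists, or 1 if it still exists after
# TIMEOUT_SECONDS seconds of polling.
wait_unbusy() {
	marker=$1
	remaining=$2
	while [ -e "$marker" ]; do
		if [ "$remaining" -le 0 ]; then
			# Still busy. The owner may simply be slow or starved,
			# so a timeout here does not prove a deadlock.
			return 1
		fi
		sleep 1
		remaining=$((remaining - 1))
	done
	return 0
}
```

A caller would run something like `wait_unbusy /tmp/busy.marker 10` and treat a non-zero return as "still busy after the timeout". If whoever owns the marker is still making progress but needs longer than the limit, the waiter fails even though nothing is actually stuck, which is the situation the comment suspects.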

comment:5 by emitrax, 16 years ago (in reply to comment:4)

Replying to bonefish:

If you have a test case that allows to reproduce the problem more or less reliably, I could have a look.

I used to trigger this by unpacking the haiku source tree from a usb stick, but I don't know if that's still the case.

comment:6 by emitrax, 16 years ago

I thought I had added scheduler tracing support, but it seems I needed to run jam -a to include it. Obviously, I realized that only when I triggered the bug, a bit too late.

Anyway, while I rebuild the image in order to re-run the test, if you want to look into it, try the following to trigger it:

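#!/bin/sh
# Stress test: repeatedly create and delete a file of a*b*c bytes.
# With bs=1, dd writes one byte per write call.
for a in $(seq 1 1000)
do
  for b in $(seq 1 1000)
  do
    for c in $(seq 1 1000)
    do
      dd if=/dev/zero of=file bs=1 count=$((a * b * c))
      rm file
    done
  done
done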
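For a shorter run, the same create/delete churn can be sketched with a configurable bound. This is not the script used in the ticket: the `churn` function, its two arguments, and the scratch directory are hypothetical, and there is no claim that a smaller bound still triggers the bug.

```shell
#!/bin/sh
# churn LIMIT DIR: hypothetical scaled-down variant of the loop above.
# Creates and deletes a file of a*b*c bytes in DIR for every a, b, c in
# 1..LIMIT, then prints how many files were written.
churn() {
	limit=$1
	dir=$2
	written=0
	a=1
	while [ "$a" -le "$limit" ]; do
		b=1
		while [ "$b" -le "$limit" ]; do
			c=1
			while [ "$c" -le "$limit" ]; do
				# dd reports its statistics on stderr; silence them.
				dd if=/dev/zero of="$dir/file" bs=1 count=$((a * b * c)) 2>/dev/null
				rm "$dir/file"
				written=$((written + 1))
				c=$((c + 1))
			done
			b=$((b + 1))
		done
		a=$((a + 1))
	done
	echo "$written"
}
```

For example, `churn 10 /path/to/scratch` would write and remove 1000 files, the largest being 1000 bytes.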

comment:7 by luroh, 16 years ago

emitrax: how's the build coming along? :-)
Seriously though, would you say this ticket is still valid?
Running your above script for hours on real hw results not in a KDL but in a freeze. Pic of "<F12> + bt" attached. Should I open a new ticket?

by luroh, 16 years ago

Attachment: bt.jpg added

comment:8 by luroh, 15 years ago

36733, gcc2, trunk.

Tried to repeat the original problem, unpacking the Haiku source code from a USB stick in VMware, but no KDL. After many hours, it had successfully extracted the 800 MB zip file. Also tried running the above script for 10 hours without any problem. Time to close this one?

comment:9 by bonefish, 15 years ago

Resolution: fixed
Status: new → closed

Yep, closing. Please reopen if the problem persists.
