Opened 12 years ago

Closed 12 years ago

#1003 closed bug (fixed)

KDL while expanding a file

Reported by: jackburton Owned by: axeld
Priority: blocker Milestone: R1
Component: System/Kernel Version: R1/pre-alpha1
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

After some time, I was thrown into KDL. Screenshots follow.

After the KDL, I can't create files any more onto that partition.

Attachments (5)

IMG004.JPG (174.4 KB) - added by jackburton 12 years ago.
KDL
IMG005.JPG (159.9 KB) - added by jackburton 12 years ago.
backtrace
backtrace2.jpg (38.8 KB) - added by jackburton 12 years ago.
new backtrace
unzip-kdl-sc1.jpg (201.3 KB) - added by serpentor 12 years ago.
kdl.png (25.0 KB) - added by jackburton 12 years ago.
new kdl


Change History (24)

comment:1 Changed 12 years ago by jackburton

Interestingly, I can't seem to be able to reproduce it inside qemu, so you'll have to wait a bit for the screenshot. Maybe it's related to our IDE driver ?

Changed 12 years ago by jackburton

Attachment: IMG004.JPG added

KDL

Changed 12 years ago by jackburton

Attachment: IMG005.JPG added

backtrace

comment:2 Changed 12 years ago by jackburton

And sorry for the crappiness of the shots.

comment:3 Changed 12 years ago by jackburton

Since it happens only on real hardware here, I'll share the configuration: it's a laptop with an AMD Sempron 2600+ and 512 MB RAM (64 MB taken by the onboard graphics adapter, an unsupported SiS chipset).

comment:4 Changed 12 years ago by jackburton

Axel, I see that in bfs_free_cookie() we ReadLock() the inode, but since we mess with the index, shouldn't we WriteLock() instead ?

comment:5 Changed 12 years ago by jackburton

BTW, I was just able to reproduce it on qemu.

comment:6 Changed 12 years ago by jackburton

Using a writelock in bfs_free_cookie() fixes the problem.
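
To illustrate the read lock versus write lock distinction discussed above, here is a minimal sketch. It is not the BFS code: it assumes the standard `std::shared_mutex` in place of the kernel's ReadLock()/WriteLock(), and `IndexedInode`, `Insert`, `Contains`, `Count`, and `InsertFromThreads` are hypothetical names made up for the example. The point is only that any path which modifies the shared index has to take the exclusive lock, while pure lookups can share it.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <shared_mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an inode with an index shared between threads.
// None of these names are taken from the BFS sources.
class IndexedInode {
public:
    // Read-only lookup: any number of readers may hold the shared lock.
    bool Contains(const std::string& key) const {
        std::shared_lock<std::shared_mutex> lock(fLock);
        return fIndex.count(key) != 0;
    }

    // Mutation: takes the exclusive lock, because an insert may
    // restructure the container while other threads are looking at it.
    void Insert(const std::string& key) {
        std::unique_lock<std::shared_mutex> lock(fLock);
        fIndex.insert(key);
    }

    std::size_t Count() const {
        std::shared_lock<std::shared_mutex> lock(fLock);
        return fIndex.size();
    }

private:
    mutable std::shared_mutex fLock;
    std::set<std::string> fIndex;
};

// Inserts perThread distinct keys from each of `threads` threads and
// returns the final number of index entries.
std::size_t InsertFromThreads(IndexedInode& inode, int threads, int perThread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; t++) {
        workers.emplace_back([&inode, t, perThread]() {
            for (int i = 0; i < perThread; i++) {
                inode.Insert("t" + std::to_string(t) + "-k" + std::to_string(i));
            }
        });
    }
    for (std::thread& worker : workers)
        worker.join();
    return inode.Count();
}
```

If `Insert` took only the shared lock, two threads could modify the container at the same time, which is undefined behaviour and roughly the kind of race the ReadLock() in bfs_free_cookie() would allow.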

comment:7 Changed 12 years ago by axeld

I've fixed the locking issue in hrev20079, thanks for the note. However, I don't think this can really prevent the bug from happening; at least it doesn't look like the source of the problem. The index inode is of course write locked when updating it, so it should never crash in BPlusTree::SplitNode(). What do you think? How often did you try again?

comment:8 in reply to:  7 Changed 12 years ago by jackburton

Replying to axeld:

However, I don't think this can really prevent the bug from happening; at least it doesn't look like the source of the problem. The index inode is of course write locked when updating it, so it should never crash in BPlusTree::SplitNode(). What do you think? How often did you try again?

I tried twice, once on real hardware and once on qemu. I can try again later, but before, it was happening 100% of the time on real hardware.

comment:9 Changed 12 years ago by jackburton

Ok, I just could reproduce it again. It crashed again in the same function, but the backtrace looks different.

Changed 12 years ago by jackburton

Attachment: backtrace2.jpg added

new backtrace

comment:10 Changed 12 years ago by jackburton

Apparently this was fixed with Ingo's 20402. I'll leave this open still for a while, though.

comment:11 in reply to:  10 Changed 12 years ago by jackburton

Replying to jackburton:

Apparently this was fixed with Ingo's 20402. I'll leave this open still for a while, though.

Unfortunately it still applies, although now the system hangs completely instead of dying in KDL.

comment:12 Changed 12 years ago by gotaku

I also get dropped into KDL when I try to unzip BeOS5-DevTools.zip in VMware. It doesn't always happen, but it seems to more often than not.

Changed 12 years ago by serpentor

Attachment: unzip-kdl-sc1.jpg added

comment:13 Changed 12 years ago by serpentor

I was dropped into KDL while unzipping BeShare.zip that I had just downloaded from BeBits. I am running build 20883 and have attached a picture of the stack crawl after the crash (unzip-kdl-sc1.jpg).

comment:14 Changed 12 years ago by jackburton

Axel, with the patch I've sent you (BPlusTree::_SplitNode()), the behaviour of this bug has changed a bit (for the better ?): Now I can unzip almost all the zip file, although near the end it KDLs again with a different backtrace. Note that, near the point where it KDLd before, the system stalls for ~20 seconds with the cpu at 100%.

Changed 12 years ago by jackburton

Attachment: kdl.png added

new kdl

comment:15 in reply to:  14 Changed 12 years ago by jackburton

Replying to jackburton:

Axel, with the patch I've sent you (BPlusTree::_SplitNode()), the behaviour of this bug has changed a bit (for the better ?): Now I can unzip almost all the zip file, although near the end it KDLs again with a different backtrace. Note that, near the point where it KDLd before, the system stalls for ~20 seconds with the cpu at 100%.

The above applies on vmware player (virtual machine with 128 MB RAM). On real hardware (512 MB RAM) I am able to unzip the file correctly.

comment:16 in reply to:  14 Changed 12 years ago by axeld

Replying to jackburton:

Axel, with the patch I've sent you (BPlusTree::_SplitNode()), the behaviour of this bug has changed a bit (for the better ?)

How often did you try after the patch? It sounds a bit strange that a fixed memory leak could prevent the invalid memory access. But then, given the state of our current VM, maybe that indeed has the power to do it. Now it just looks like any other "out of memory" problem indeed.

comment:17 in reply to:  16 Changed 12 years ago by jackburton

Replying to axeld:

How often did you try after the patch?

I tried at least six or seven times.

It sounds a bit strange that a fixed memory leak could prevent the invalid memory access. But then, given the state of our current VM, maybe that indeed has the power to do it.

Yeah, seemed strange to me too.

comment:18 Changed 12 years ago by gotaku

I just tried it again after reading the above comments, but I notice no change in the status of this bug. I still get an unhandled page fault when trying to unzip the BeOS DevTools.

comment:19 Changed 12 years ago by axeld

Resolution: fixed
Status: new → closed

I fixed the original problem in hrev21480: the block cache might have freed blocks that were still in use due to a bug in BFS. Since the later reported out-of-memory KDL is the same as in bug #517, I'm closing this one now.
