Opened 13 years ago

Closed 2 months ago

Last modified 2 months ago

#8392 closed bug (not reproducible)

bfs errors

Reported by: jstressman
Owned by: axeld
Priority: normal
Milestone: R1
Component: File Systems/BFS
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

I'm seeing some odd BFS behavior on hrev43844.

After looking through my log after another system freeze, I noticed a ton of errors like the following:

bfs: invalid node [0xd7430800] read from offset 518144 (block 7858), inode at 4294
bfs: BPlusTree::_SeekDown() could not open node 518144
bfs: Remove:1958: General system error
bfs: invalid node [0xd7430800] read from offset 518144 (block 7858), inode at 4294
bfs: BPlusTree::_SeekDown() could not open node 518144
bfs: Remove:1958: General system error

While running 'checkfs' on the drive, I get the notice shown in the attached image.

I then looked at the log and saw many more assorted errors and warnings. Those are in the attached text file.

This didn't fix the bfs problem either. I still see the same message now each time I try running checkfs.
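For anyone triaging a long syslog like the attached one, a small script can show whether the errors all point at the same node or at many. The following is only a sketch: the regular expression is inferred from the three log lines quoted above and is an assumption about the message format, and `summarize_invalid_nodes` is a hypothetical helper, not part of any Haiku tool.

```python
import re
from collections import Counter

# Pattern inferred only from the log lines quoted in this ticket; other
# BFS messages may be formatted differently.
INVALID_NODE = re.compile(
    r"bfs: invalid node \[(0x[0-9a-fA-F]+)\] read from offset (\d+) "
    r"\(block (\d+)\), inode at (\d+)"
)


def summarize_invalid_nodes(lines):
    """Count 'invalid node' messages per (inode, offset, block)."""
    counts = Counter()
    for line in lines:
        match = INVALID_NODE.search(line)
        if match:
            _address, offset, block, inode = match.groups()
            counts[(int(inode), int(offset), int(block))] += 1
    return counts


sample = [
    "bfs: invalid node [0xd7430800] read from offset 518144 (block 7858), inode at 4294",
    "bfs: BPlusTree::_SeekDown() could not open node 518144",
    "bfs: Remove:1958: General system error",
    "bfs: invalid node [0xd7430800] read from offset 518144 (block 7858), inode at 4294",
]

for (inode, offset, block), n in summarize_invalid_nodes(sample).items():
    print(f"inode {inode}: offset {offset} (block {block}) reported {n} time(s)")
```

In the excerpt above, every "invalid node" message refers to the same offset in the tree at inode 4294, which suggests a single damaged node rather than widespread corruption, at least within those six lines.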

Attachments (4)

bfserror1.png (58.0 KB ) - added by jstressman 13 years ago.
bfserrors2.txt (48.6 KB ) - added by jstressman 13 years ago.
IMG_1432.JPG (4.0 MB ) - added by SeanCollins 13 years ago.
IMG_1431.JPG (1.9 MB ) - added by SeanCollins 13 years ago.

Change History (10)

by jstressman, 13 years ago

Attachment: bfserror1.png added

by jstressman, 13 years ago

Attachment: bfserrors2.txt added

comment:1 by axeld, 13 years ago

Unfortunately, checkfs is not capable of repairing the B+trees yet, so it's normal that those errors don't go away; I've only recently implemented verifying the trees, so that you at least know something is wrong. I'll add a help text explaining this until I find the time to make it able to repair the trees (which might end up in another tool, though, as it's very time- and memory-consuming).

The bug itself is most likely a duplicate of the existing open BFS bugs, and is most probably related to one or more bugs in the block cache. Unit tests for that thing are high on my to do list.
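As a rough illustration of what "verifying" a tree means in general, the sketch below walks a toy in-memory B+tree and reports keys that are out of order or outside the range their parent allows. It is an assumption-laden model only: the `Node` class, the `verify` function, and the integer keys are hypothetical and bear no relation to the actual checkfs code or the BFS on-disk layout. It also shows why verification is cheaper than repair: detecting a bad key needs only one pass, while fixing it would require rebuilding from the underlying data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]
    # Interior nodes have len(keys) + 1 children; leaves have none.
    children: List["Node"] = field(default_factory=list)


def verify(node: Node, low: Optional[int] = None,
           high: Optional[int] = None) -> List[str]:
    """Return a list of problems found in the subtree rooted at node.

    Every key in the subtree must satisfy low <= key < high
    (None means unbounded on that side).
    """
    problems = []
    if node.keys != sorted(node.keys):
        problems.append(f"keys out of order: {node.keys}")
    for key in node.keys:
        if (low is not None and key < low) or (high is not None and key >= high):
            problems.append(f"key {key} outside [{low}, {high})")
    if node.children:
        if len(node.children) != len(node.keys) + 1:
            problems.append(f"node {node.keys} has {len(node.children)} children")
            return problems
        # Child i is bounded by the separator keys on either side of it.
        bounds = [low] + node.keys + [high]
        for i, child in enumerate(node.children):
            problems += verify(child, bounds[i], bounds[i + 1])
    return problems


good = Node([10], [Node([1, 5]), Node([10, 12])])
bad = Node([10], [Node([1, 11]), Node([10, 12])])
print(verify(good))
print(verify(bad))
```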

comment:2 by SeanCollins, 13 years ago

I am attaching KDL screenshots that I took with my camera; they look to be related. Hopefully this information is useful. If it needs a separate ticket, please let me know.

by SeanCollins, 13 years ago

Attachment: IMG_1432.JPG added

by SeanCollins, 13 years ago

Attachment: IMG_1431.JPG added

comment:3 by waddlesplash, 2 months ago

Resolution: not reproducible
Status: new → closed

I think after 13 years we can close this. Errors after a hang/hard reboot are occasionally expected, I think; and there have been many changes to BFS in the interim, including some work on repairing index trees.

comment:4 by X512, 2 months ago

> Errors after a hang/hard reboot are occasionally expected, I think

No, BFS corruption after a kernel crash or unexpected shutdown is not expected, because BFS implements journaling. The only expected problem is leaked allocated blocks after an unexpected shutdown.

In my experience, though, BFS survives unexpected shutdowns quite well.

comment:5 by waddlesplash, 2 months ago

I suppose that's true, though if we crashed before the journal was fully flushed, depending on what order blocks were written into it, we could still wind up in an inconsistent state, I think?

But either way, yes, it's been many years and I don't recall seeing issues like this consistently.

comment:6 by X512, 2 months ago

> depending on what order blocks were written into it, we could still wind up in an inconsistent state

If that is true, it is a bug and it should be fixed. The journal is expected to handle all cases of unexpected shutdown/unmount.
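To make the expectation concrete, here is a minimal sketch of the general write-ahead journaling idea: a transaction's block writes are applied during replay only if its commit record reached the log, so a crash partway through logging leaves the previous consistent state. This is a generic toy model, not the BFS journal code or its on-disk format; `ToyJournaledDisk` and its methods are hypothetical names.

```python
class ToyJournaledDisk:
    """Toy model of a disk with a write-ahead journal (not BFS's format)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # the "on-disk" blocks
        self.journal = []           # the "on-disk" log entries, in write order

    def log_transaction(self, txn_id, writes, crash_before_commit=False):
        """Append a transaction's writes to the journal, then its commit."""
        for block_no, data in writes.items():
            self.journal.append(("write", txn_id, block_no, data))
        if crash_before_commit:
            return  # simulated crash: the commit record never hits the log
        self.journal.append(("commit", txn_id))

    def replay(self):
        """On mount after a crash, apply only fully committed transactions."""
        committed = {entry[1] for entry in self.journal if entry[0] == "commit"}
        for entry in self.journal:
            if entry[0] == "write" and entry[1] in committed:
                _, _, block_no, data = entry
                self.blocks[block_no] = data
        self.journal.clear()


disk = ToyJournaledDisk({1: "inode v1", 2: "tree v1"})
disk.log_transaction(1, {1: "inode v2", 2: "tree v2"})
disk.log_transaction(2, {1: "inode v3", 2: "tree v3"}, crash_before_commit=True)
disk.replay()
print(disk.blocks)  # {1: 'inode v2', 2: 'tree v2'}
```

In this model the uncommitted transaction 2 is discarded as a whole, so the inode and tree blocks never disagree. An implementation that could apply some of transaction 2's writes without the rest would exhibit the ordering problem raised in comment 5, which is the case described here as a bug.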
