Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#9763 closed bug (fixed)

Deleting 62k files spams the syslog

Reported by: humdinger Owned by: nobody
Priority: normal Milestone: R1
Component: File Systems/BFS Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

This is hrev45646.

I have deleted (SHIFT+DEL) a folder with over 62k files. Takes a while...

In Tracker the "TrackerTaskLoop" thread is very busy and in the kernel team the "syslog sender". The syslog is practically spammed with this entry:

KERN: bfs: Match:900: General system error
KERN: bfs: KERN: query value conversion to 0x54494d45 requested!

Hmm. checkfs says:

in (inode = 7927634), has blocks already set
        145620 nodes checked,
        0 blocks not allocated,
        512 blocks already set,
        0 blocks could be freed

        files           137596
        directories     7311
        attributes      353
        attr. dirs      321
        indices         39

        direct block runs               150036 (21.68 GiB)
        indirect block runs             4616 (in 121 array blocks, 8.45 GiB)
        double indirect block runs      512 (in 1 array blocks, 2.00 MiB)

The syslog shows many entries of this kind:

KERN: block 10032295 is already set!!!
KERN: bfs: double indirect->runs->run: block_run(153, 5287, 2): blocks 10032295 - 10032296 are already set!
KERN: block 10032670 is already set!!!
KERN: bfs: double indirect->runs->run: block_run(153, 5662, 2): blocks 10032670 - 10032671 are already set!
KERN: block 10032968 is already set!!!
KERN: bfs: double indirect->runs->run: block_run(153, 5960, 2): blocks 10032968 - 10032969 are already set!
KERN: block 10033740 is already set!!!

Not good, I suppose. Do I need to re-Initialize the partition?

Change History (13)

comment:1 by humdinger, 11 years ago

I now have backed up everything and archived a few folders with mail and source etc. into a ~2gb tar.xz. I then initialized the partition and copied the backup'ed folders back, ran checkfs -> OK. I then unpacked the 2gb archive and checkfs shows again stuff like

KERN: block 12173565 is already set!!!
KERN: bfs: double indirect->runs->run: block_run(185, 49405, 2): blocks 12173565 - 12173566 are already set!
KERN: block 12173631 is already set!!!
KERN: bfs: KERN: double indirect->runs->run: block_run(185, 49471, 2): blocks 12173631 - 12173632 are already set!
KERN: block 12173697 is already set!!!
KERN: bfs: KERN: double indirect->runs->run: block_run(185, 49537, 2): blocks 12173697 - 12173698 are already set!

Any ideas how to proceed?

comment:2 by axeld, 11 years ago

That sounds like you found a reliable way to shredder your file system -- congratulations, you lucky bastard! Is there any way I could reproduce this here locally? I assume you wouldn't like to make your private and sensible data accessible to me?

The "is already set" error is pretty much the worst thing that could happen, and means that two files claim the same block on disk -- which means one of them, or even both are definitely lost.

If you have time on your hands, you could try to create a number of smaller archives (first two, then four, etc.), and see if you could reproduce this with one of those archives already? Does it happen again if you create another archive from the same unpacked data?

If you can reproduce the issue, one thing we could do is to shredder the data on the disk, create an archive from that, and try again. It's possible that the file name, attributes or sizes create the problem, but it's very unlikely that the data itself is the problem. Just in case you would prefer to keep your data to yourself.

comment:3 by humdinger, 11 years ago

I'm afraid I have accidentally somehow removed the corruption...
As seen from the checkfs output above, the corruption was rported to be in the folder "in" (which is a mail folder). In order to narrow in the affected files, I created three folders on the same partition - "A1", "A2", A3" - and moved into each 1/4 of the mails from "in". I then renamed "in" to "A4".
A new run of checkfs showed the misbehaving inode in folder "A4". I proceeded to create new folders - "A4B1", "A4B2", A4B3" - moved again 1/4 of the mails of "A4" into those, and renamed "A4" to "A4B4".
Another run of checkbfs now shows no errors at all(!).

What's going on? Can the number of files in a folder have an influence? The "A4" folder had around 10.000 files in it.

comment:4 by axeld, 11 years ago

But you still have the archive that produces the corruption? The interesting part is to create the corruption -- once it's there, it is pretty much impossible to find the cause for it.

Anyway, if you just moved the mails, the corruption must have been in one of the directory B+trees - which means that potentially mails got lost. Did you have any syslog activity when moving the mails? What do you mean by misbehaving inode?

Also, can you provide a "bfsinfo -s" output from the partition?

comment:5 by humdinger, 11 years ago

I don't have a syslog when moving. I now have a 160mb archive of the in folder, that produces the error when unpacked. I tried that by unarchiving it onto another partition "test2". Here's the bfsinfo output of those two partitions:

~/Desktop> bfsinfo -s /test2
Copyright (c) 2001-2010 pinc Software.

disk_super_block:
  name           = test2
  magic1         = 0x42465331 (BFS1) valid
  fs_byte_order  = 0x42494745 (BIGE, little endian)
  block_size     = 2048
  block_shift    = 11
  num_blocks     = 1576448
  used_blocks    = 949875
  inode_size     = 2048
  magic2         = 0xdd121031 (...1) valid
  blocks_per_ag  = 2
  ag_shift       = 15
  num_ags        = 49
  flags          = 0x434c454e (CLEN)
  log_blocks     = (0, 98, 4096)(null)
  log_start      = 1921
  log_end        = 1921
  magic3         = 0x15b6830e (....) valid
  root_dir       = (8, 0, 1)(null)
  indices        = (0, 4194, 1)(null)

~/Desktop> bfsinfo -s /HiQ-Data/
Copyright (c) 2001-2010 pinc Software.

disk_super_block:
  name           = HiQ-Data
  magic1         = 0x42465331 (BFS1) valid
  fs_byte_order  = 0x42494745 (BIGE, little endian)
  block_size     = 2048
  block_shift    = 11
  num_blocks     = 26032128
  used_blocks    = 16865646
  inode_size     = 2048
  magic2         = 0xdd121031 (...1) valid
  blocks_per_ag  = 4
  ag_shift       = 16
  num_ags        = 398
  flags          = 0x434c454e (CLEN)
  log_blocks     = (0, 1590, 4096)(null)
  log_start      = 731
  log_end        = 731
  magic3         = 0x15b6830e (....) valid
  root_dir       = (8, 0, 1)(null)
  indices        = (0, 5686, 1)(null)

I didn't mean anything by that "misbehaving inode", just commenting on the checkfs-output "in (inode = 7927634), has blocks already set".

Since I know you're a decent person and won't snoop around more than necessary in my mails, I'm about to send you the link where I uploaded it to my dropbox, which took for ever...

comment:6 by axeld, 11 years ago

Great, thanks! Just got it. How much RAM do you have, BTW? I'll probably find the time to look into this tonight or tomorrow.

comment:7 by humdinger, 11 years ago

I have a Core2Duo T7200 @ 2ghz with 2gb RAM in soon 7 year old notebook with nvidia 7600Go graphics. Thanks!

comment:8 by axeld, 11 years ago

Alright, are you unpacking in a particular directory?

comment:9 by humdinger, 11 years ago

Nope. In case of my second test partition, I simply unzipped (from Terminal) in /test2. That will extract into an "in" folder.

comment:10 by axeld, 11 years ago

I can reproduce the problem! Looks like all double indirect blocks are affected. This probably just means checkfs is broken, though. Continuing the investigation.

comment:11 by axeld, 11 years ago

Resolution: fixed
Status: newclosed

Indeed, pretty unspectacular, and there was a even a fix already from Andreas Henriksson from last GSoC that I could use. At least I now know that humdinger is married thrice already, and has 6 kids he hasn't seen since 5 years.

Fixed in hrev45682.

comment:12 by humdinger, 11 years ago

And you have to blab it out... In the words of my wives and the eldest: "Meet my lawyer!" :)

This ticket may not have uncovered a serious bug, but I do trust my Haiku a bit more now, knowing that the assumed data-mangling was a false alarm. Thanks!

comment:13 by diver, 11 years ago

Component: - GeneralFile Systems/BFS
Note: See TracTickets for help on using tickets.