Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#2558 closed bug (fixed)

File corruption

Reported by: andreasf
Owned by: bonefish
Priority: normal
Milestone: R1
Component: File Systems/BFS
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: x86

Description

I'm seeing apparently random file corruption on my two BFS volumes at hrev26698. While compiling, e.g., expat, compilation fails with various errors resulting from (newly) corrupted files, such as Makefile containing NUL chars or source files such as lib/xmlparse.c and lib/xmltok.h containing garbage. Same happens for system headers such as be/support/Errors.h on the freshly initialized boot volume.

Since I haven't run Haiku for some time, DeadYak suggested checking hrev26532. Will do.
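One of the symptoms reported above, text files that suddenly contain NUL characters, can be checked for mechanically after a build. The following is a minimal sketch, not part of the original report: the function name `find_files_with_nul` is hypothetical, and it assumes the scanned tree holds only text files (object files and other binaries legitimately contain NUL bytes and would show up as false positives). It does not detect other kinds of garbage.

```python
import os


def find_files_with_nul(root, chunk_size=65536):
    """Return the sorted paths of files under root that contain a NUL byte.

    Hypothetical helper for illustration; assumes the tree should contain
    only text files, so any NUL byte is treated as suspicious.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    # Read in chunks so large files are not loaded at once.
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        if b"\x00" in chunk:
                            hits.append(path)
                            break
            except OSError:
                # Unreadable entries are skipped rather than reported.
                continue
    return sorted(hits)
```

Running it on a freshly unpacked source tree before and after a configure/make cycle would show whether new NUL-containing files appeared in between.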

Change History (14)

comment:1 by andreasf, 11 years ago

Afterwards, chkbfs says that the volume is read-only and fixes some block allocation mismatches.

comment:2 by andreasf, 11 years ago

No problems in hrev26532!

comment:3 by andreasf, 11 years ago

hrev26700 still fails: Expat's configure hangs after "config.status: creating expat_config.h" with high CPU activity on both cores. Makefile.in contained garbage again. Unable to Ctrl+C it, and in KDL 'threads' appears to report sh running; no improvement after continuing.

After a restart (from the menu), chkbfs fixed 32565 blocks allocated that should not be. (sounds scary)

comment:4 by bonefish, 11 years ago

Since hrev26692 pretty much all I/O goes through the very new (and likely still buggy) I/O request framework. Can you please check whether hrev26691 still works OK.

comment:5 by emitrax, 11 years ago

I encountered similar behavior last night after running a configure script. KDL showed that the thread running most of the time was the page writer, although the system was not in a low-memory situation.

in reply to:  4 comment:6 by andreasf, 11 years ago

Replying to bonefish:

> Can you please check whether hrev26691 still works OK.

hrev26691 appears to work okay.

comment:7 by andreasf, 11 years ago

I encountered the hang on hrev26691 as well. Should we move that to a new ticket? It seems unrelated to the corruption.

comment:8 by emitrax, 11 years ago

Could you provide some more debug information when the hang happens? You should retry with the latest revision, though, since in http://dev.haiku-os.org/changeset/26710 Ingo fixed a busy loop in vm_page_allocate_page().

If it happens again and you can capture serial debug output or some screenshots, it would be nice to know which thread is running and its stack trace (bt command). And if your application, like your configure, is waiting (thread <id_of_your_thread>), it would help to know what it is waiting for (check the "waiting for" field in the output of the thread KDL command).

in reply to:  8 comment:9 by andreasf, 11 years ago

A busy loop fix does not sound as if the file corruption were fixed...? In that case I'd be hesitant to shoot my source files.

I have not yet encountered such a hang again, still on hrev26691.

in reply to:  9 comment:10 by bonefish, 11 years ago

Owner: changed from axeld to bonefish
Status: new → assigned

Replying to andreasf:

> A busy loop fix does not sound as if the file corruption were fixed...? In that case I'd be hesitant to shoot my source files.

If hrev26692 shreds your data, the current revision still will. hrev26710 only fixes a system hang when the kernel heap needs to grow while all potentially usable pages are bound in caches. It doesn't even fix this completely (i.e. the system will still hang, but a few more threads will continue to run) as documented in hrev26711. So, if you like your data, don't use any revision newer than hrev26691 for the moment. I'll try to track the problems down, but that might take a few days. If you have any more hints on how to easily reproduce it, that would be much appreciated.

> I have not yet encountered such a hang again, still on hrev26691.

If it is the problem fixed by hrev26710 it should be relatively old and rare, I think.

comment:11 by emitrax, 11 years ago

Could #2522 be related to this one?

in reply to:  11 comment:12 by bonefish, 11 years ago

Replying to emitrax:

> Could #2522 be related to this one?

No, #2522 predates hrev26692, which Andreas has confirmed introduced this bug.

comment:13 by bonefish, 11 years ago

Resolution: fixed
Status: assigned → closed

Fixed in hrev26751.

comment:14 by andreasf, 11 years ago

Works for me, too, at hrev26751. Thanks!
