Opened 12 years ago

Closed 12 years ago

#8969 closed bug (duplicate)

multiple KDLs, leading to corrupt partition

Reported by: mmadia
Owned by: axeld
Priority: blocker
Milestone: R1/alpha4
Component: System/Kernel
Version: R1/Development
Keywords:
Cc:
Blocked By: #8942
Blocking:
Platform: All

Description (last modified by mmadia)

hrevr1alpha4-44589

When building bezilla on hrevr1alpha4-44589, Haiku reliably and consistently enters KDL. On the next boot, the partition that was being written to during the KDL is corrupt and can only be mounted read-only. Note: hrev43937 was the last revision I used to build BeZillaBrowser (see hrev43952).

(initial KDL)

PANIC: no space in log after sync (1424 for 2043 blocks)!
Welcome to Kernel Debugging Land...
Thread 14908 "nsinstall" running on CPU 0
stack trace for thread 14908 "nsinstall"
    kernel stack: 0xcd536000 to 0xcd53a000
      user stack: 0x7efef000 to 0x7ffef000
frame               caller     <image>:function + offset
 0 cd539af8 (+  32) 80124efa   <kernel_x86>:arch_debug_stack_trace + 0x0012
 1 cd539b18 (+  16) 80091d87   <kernel_x86> stack_trace_trampoline(NULL) + 0x000b
 2 cd539b28 (+  12) 8012a366   <kernel_x86>:arch_debug_call_with_fault_handler + 0x001b
 3 cd539b34 (+  48) 80093816   <kernel_x86>:debug_call_with_fault_handler + 0x005e
 4 cd539b64 (+  64) 80091fa7   <kernel_x86> kernel_debugger_loop(0x8016ed57 "PANIC: ", 0x815364a0 "no space in log after sync (%ld for %ld blocks)!", 0xcd539c10 "", int32: 0) + 0x021b
 5 cd539ba4 (+  48) 8009230b   <kernel_x86> kernel_debugger_internal(0x8016ed57 "PANIC: ", 0x815364a0 "no space in log after sync (%ld for %ld blocks)!", 0xcd539c10 "", int32: 0) + 0x0053
 6 cd539bd4 (+  48) 80093b90   <kernel_x86>:panic + 0x0024
 7 cd539c04 (+ 224) 81527154   <bfs> Journal<0xd602f250>::_WriteTransactionToLog(0xebc6386e) + 0x0324
 8 cd539ce4 (+  48) 81527be5   <bfs> Journal<0xd602f250>::_TransactionDone(true) + 0x00d5
 9 cd539d14 (+  48) 81527a12   <bfs> Journal<0xd602f250>::Unlock(Transaction*: 0xcd539d94, true) + 0x004e
10 cd539d44 (+  96) 8152ef04   <bfs> bfs_create_symlink(fs_volume*: 0xd02b8798, fs_vnode*: 0xdfbaa420, 0xcd539de4 "nsIInterfaceRequestor.idl", 0xdf50a570 "/use-the-source/bezilla/mozilla/xpcom/base/nsIInterfaceRequestor.idl", int32: 0) + 0x02e0
11 cd539da4 (+ 320) 800e0643   <kernel_x86> common_create_symlink(int32: -1, 0xdf505070 "/generated", 0xdf50a570 "/use-the-source/bezilla/mozilla/xpcom/base/nsIInterfaceRequestor.idl", int32: 0, false) + 0x0073
12 cd539ee4 (+  96) 800e5adc   <kernel_x86>:_user_create_symlink + 0x010c
13 cd539f44 (+ 100) 8012b5d0   <kernel_x86>:handle_syscall + 0x00cd
user iframe at 0xcd539fa8 (end = 0xcd53a000)
 eax 0x73           ebx 0x2d2f14        ecx 0x7ffecdc0   edx 0xffff0114
 esi 0x7ffef6aa     edi 0x44            ebp 0x7ffecdec   esp 0xcd539fdc
 eip 0xffff0114  eflags 0x3216     user esp 0x7ffecdc0
 vector: 0x63, error code: 0x0
14 cd539fa8 (+   0) ffff0114   <commpage>:commpage_syscall + 0x0004
15 7ffecdec (+8592) 0020259b   <nsinstall>:main + 0x0b97
16 7ffeef7c (+  48) 0020120f   <nsinstall>:_start + 0x005b
17 7ffeefac (+  48) 00106236   </boot/system/runtime_loader@0x00100000>:unknown + 0x6236
18 7ffeefdc (+   0) 7ffeefec   719937:nsinstall_14908_stack@0x7efef000 + 0xffffec
kdebug>
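
For readers unfamiliar with this panic: the message suggests that the transaction needed more log blocks than were free, even after a sync. Below is a minimal sketch of that kind of check for a generic circular log. It assumes the two numbers in the message mean "free blocks for needed blocks", which is a reading of the format string and is not confirmed in this ticket. The function names and the example log size are hypothetical and are not taken from the BFS source.

```python
def free_log_blocks(start, end, log_size):
    """Free blocks in a circular log whose used region runs from start to end.

    Hypothetical helper: models a generic circular log, not the BFS code.
    """
    if start <= end:
        # Used region is [start, end); everything else is free.
        return log_size - (end - start)
    # Used region wraps around; only the gap between end and start is free.
    return start - end


def can_write_entry(needed, free):
    """An entry fits only if it needs no more blocks than are free."""
    return needed <= free


# Numbers from the panic above, read as "1424 free for 2043 needed"
# (an assumption about the meaning of the two values).
fits = can_write_entry(needed=2043, free=1424)
print(fits)
```

Under that reading, the entry cannot fit, which would match the code path that ends in panic() in frame 7.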

This happened when I unmounted /generated, after exiting KDL:

bfs: inode at 0 requested!
bfs: check: Could not open inode at 0
PANIC: cache destroy: still has partial slabs
Welcome to Kernel Debugging Land...
Thread 113 "mount_server" running on CPU 0
stack trace for thread 113 "mount_server"
    kernel stack: 0x81a1e000 to 0x81a22000
      user stack: 0x7efef000 to 0x7ffef000
frame               caller     <image>:function + offset
 0 81a21c48 (+  32) 80124efa   <kernel_x86>:arch_debug_stack_trace + 0x0012
 1 81a21c68 (+  16) 80091d87   <kernel_x86> stack_trace_trampoline(NULL) + 0x000b
 2 81a21c78 (+  12) 8012a366   <kernel_x86>:arch_debug_call_with_fault_handler + 0x001b
 3 81a21c84 (+  48) 80093816   <kernel_x86>:debug_call_with_fault_handler + 0x005e
 4 81a21cb4 (+  64) 80091fa7   <kernel_x86> kernel_debugger_loop(0x8016ed57 "PANIC: ", 0x80182b40 "cache destroy: still has partial slabs", 0x81a21d60 "", int32: 0) + 0x021b
 5 81a21cf4 (+  48) 8009230b   <kernel_x86> kernel_debugger_internal(0x8016ed57 "PANIC: ", 0x80182b40 "cache destroy: still has partial slabs", 0x81a21d60 "", int32: 0) + 0x0053
 6 81a21d24 (+  48) 80093b90   <kernel_x86>:panic + 0x0024
 7 81a21d54 (+  48) 800fd034   <kernel_x86> delete_object_cache_internal(ObjectCache*: 0xd62e6bc8) + 0x0064
 8 81a21d84 (+  64) 800fe177   <kernel_x86>:delete_object_cache + 0x026b
 9 81a21dc4 (+  64) 8004676f   <kernel_x86>:_._11block_cache + 0x0043
10 81a21e04 (+  48) 8004a9ce   <kernel_x86>:block_cache_delete + 0x015e
11 81a21e34 (+  64) 8152bbeb   <bfs> Volume<0x8221b700>::Unmount(0x0) + 0x00eb
12 81a21e74 (+  48) 8152cf3c   <bfs> bfs_unmount(fs_volume*: 0xd02b8798) + 0x0024
13 81a21ea4 (+  96) 800e2f48   <kernel_x86> fs_unmount(0xd0030a48 "/generated", int32: -1, uint32: 0x0 (0), false) + 0x0578
14 81a21f04 (+  64) 800e4b32   <kernel_x86>:_user_unmount + 0x007a
15 81a21f44 (+ 100) 8012b5d0   <kernel_x86>:handle_syscall + 0x00cd
user iframe at 0x81a21fa8 (end = 0x81a22000)
 eax 0x5f           ebx 0x615f14        ecx 0x7ffee740   edx 0xffff0114
 esi 0x7ffee7f0     edi 0x1803de68      ebp 0x7ffee76c   esp 0x81a21fdc
 eip 0xffff0114  eflags 0x3203     user esp 0x7ffee740
 vector: 0x63, error code: 0x0
16 81a21fa8 (+   0) ffff0114   <commpage>:commpage_syscall + 0x0004
17 7ffee76c (+ 160) 00439ceb   <libbe.so> BPartition<0x1803de68>::Unmount(uint32: 0x0 (0)) + 0x0087
18 7ffee80c (+  80) 00206299   <_APP_> AutoMounter<0x7ffeedb8>::_UnmountAndEjectVolume(BPartition*: 0x1803de68, BPath&: 0x7ffee8f0, 0x7ffee90c "generated") + 0x0085
19 7ffee85c (+ 496) 00206745   <_APP_> AutoMounter<0x7ffeedb8>::_UnmountAndEjectVolume(BMessage*: 0x1801b0e0) + 0x0251
20 7ffeea4c (+ 144) 00204f39   <_APP_> AutoMounter<0x7ffeedb8>::MessageReceived(BMessage*: 0x1801b0e0) + 0x0151
21 7ffeeadc (+  48) 002f67a3   <libbe.so> BLooper<0x7ffeedb8>::DispatchMessage(BMessage*: 0x1801b0e0, BHandler*: 0x7ffeedb8) + 0x005b
22 7ffeeb0c (+ 496) 002ed1dd   <libbe.so> BApplication<0x7ffeedb8>::DispatchMessage(BMessage*: 0x1801b0e0, BHandler*: 0x7ffeedb8) + 0x0405
23 7ffeecfc (+  64) 002f8111   <libbe.so> BLooper<0x7ffeedb8>::task_looper(0x7ffeedb8) + 0x0211
24 7ffeed3c (+  64) 002ebc3d   <libbe.so> BApplication<0x7ffeedb8>::Run(0x1) + 0x0075
25 7ffeed7c (+ 512) 0020731f   <_APP_>:main + 0x002f
26 7ffeef7c (+  48) 002049cb   <_APP_>:_start + 0x005b
27 7ffeefac (+  48) 00106236   </boot/system/runtime_loader@0x00100000>:unknown + 0x6236
28 7ffeefdc (+   0) 7ffeefec   4089:mount_server_113_stack@0x7efef000 + 0xffffec
kdebug> this happened when i unmounted /generated.
Unknown command "this". Enter "help" to get a list of all supported commands.
kdebug> exit

Mounting the partition on the following boot (not in attachment):

Last message repeated 1 time
USER 'liblocale.so'[951]: app application/x-vnd.Haiku-Terminal send to client failed: Bad port ID
KERN: bfs: Replay log, disk was not correctly unmounted...
KERN: run count: 1777779432, array max: 1023, max runs: 126
KERN: bfs: Log entry has broken header!
KERN: bfs: KERN: replaying log entry from 333 failed: Bad data
KERN: bfs: KERN: Replaying log failed, data may be corrupted, volume read-only.
KERN: bfs: bfs: volume doesn't have indices!
KERN: bfs: KERN: mounted "generated" (root node at 524288, device = /dev/disk/ata/2/master/3_5)
KERN: usb error hub 13: error updating port status
KERN: slab memory manager: created area 0xdf801000 (11758)
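
The "run count / array max / max runs" line above can be read as a header sanity check that failed before the log entry was replayed. Below is a minimal sketch of such a check. It assumes that "array max" is the capacity stored in the on-disk header, that "max runs" is the capacity expected for the volume's block size, and that the run count must not exceed the capacity. These meanings are inferred from the log line only; the function and parameter names are hypothetical.

```python
def log_header_looks_valid(run_count, array_max, expected_max_runs):
    """Return True if the header values are mutually consistent.

    Assumed rule (inferred from the syslog line, not from the BFS source):
    the stored capacity must equal the expected capacity, and the number
    of runs must fit within it.
    """
    return array_max == expected_max_runs and 0 <= run_count <= array_max


# Values taken from the syslog line above.
values = dict(run_count=1777779432, array_max=1023, expected_max_runs=126)
header_ok = log_header_looks_valid(**values)
print(header_ok)
```

With the logged values, both conditions fail, which is consistent with the "Log entry has broken header!" message and the volume falling back to read-only.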

Attachments (1)

serial-debug-output.txt (36.2 KB) - added by mmadia 12 years ago.


Change History (9)

by mmadia, 12 years ago

Attachment: serial-debug-output.txt added

comment:1 by mmadia, 12 years ago

Description: modified

comment:2 by scottmc, 12 years ago

This is very much like what I ran into with #8961. I was working in a VM, so I had to trash the VM and start over. It made everything on the drive read-only. How big was the partition you were working in? I wonder if it's getting too close to being full.

Perhaps we need to go back to an earlier revision where this didn't happen, to help pinpoint what caused it. I wouldn't be surprised if it was one of the many rebuilt OptionalPackages, or something that changed in the virtual memory code, or maybe even a combination of both. Either way, we need to get it figured out to be able to move forward.

comment:3 by rhester72, 12 years ago

I don't think it's any of the OptionalPackages. I pulled the 6/10 nightly VMDK (hrevr1alpha4-44589, with essentially zero changes) for VirtualBox 4.1.22 and attempted to self-build Haiku with it on a blank 8GB BFS volume. I encountered the same initial KDL twice, both times during the build process. I don't recall which thread was running during the first, but it was unzip during the second. (How are you getting the captures out of the VM to paste into the ticket?) I reset the VM on both occasions without disk corruption or anything being forced read-only, but I don't know what the effect would have been if I'd formally exited the KDL within Haiku itself.

It's been a while since I attempted a self-build, so I don't know how long this has been happening. I had been working for a number of hours on the same Haiku nightly without incident prior to this. Perhaps it's related to total memory consumption crossing some threshold? (I got the KDL with both 1GB and 2GB of RAM in the VM; I doubled it thinking that was the issue.)

Rodney


comment:4 by tqh, 12 years ago

I ran into this on trunk when I tried to run haikuporter's install script. I have a big partition (10GB), so disk space is not the problem.

I'm at hrev44624 (not hrev44587 as this post first said) at the moment.


comment:5 by tqh, 12 years ago

Oops, I looked in my GSoC Haiku tree; the correct hrev is hrev44624. (I'll correct the above post.)

comment:6 by scottmc, 12 years ago

This is probably a dupe of #8942 also.

comment:7 by scottmc, 12 years ago

Blocked By: 8942 added

comment:8 by scottmc, 12 years ago

Resolution: duplicate
Status: new → closed