Opened 9 years ago

Closed 8 years ago

Last modified 7 years ago

#6237 closed bug (fixed)

PANIC: ASSERT FAILED (src/system/kernel/slab/MemoryManager.cpp:973): chunkPointer; remaining: 1/3, area: 0x90800000, meta chunk: 0 on r37262

Reported by: anevilyak Owned by: bonefish
Priority: normal Milestone: R1
Component: System/Kernel Version: R1/Development
Keywords: slab memory manager Cc: axeld, henrimoi@…
Blocked By: Blocking: #8030, #9010
Has a Patch: no Platform: All

Description

Occurs during boot. Debug info as follows:

kdebug> bt
stack trace for thread 62 "net_server"
    kernel stack: 0x81765000 to 0x81769000
      user stack: 0x7efef000 to 0x7ffef000
frame               caller     <image>:function + offset
 0 817687d8 (+  32) 80077f81   <kernel_x86> invoke_command_trampoline(void*: 0x81768858) + 0x0015
 1 817687f8 (+  12) 800f6342   <kernel_x86>:arch_debug_call_with_fault_handler + 0x001b
 2 81768804 (+  48) 80075a14   <kernel_x86>:debug_call_with_fault_handler + 0x0053
 3 81768834 (+  64) 800782fe   <kernel_x86>:invoke_debugger_command + 0x00bb
 4 81768874 (+  48) 8007841b   <kernel_x86> invoke_pipe_segment(debugger_command_pipe*: 0x2, int32: 0, char*: NULL) + 0x0083
 5 817688a4 (+  32) 800784e3   <kernel_x86>:invoke_debugger_command_pipe + 0x008b
 6 817688c4 (+ 128) 8007c4ca   <kernel_x86> ExpressionParser<0x817689b4>::_ParseCommandPipe(int&: 0x817689b0) + 0x0ae8
 7 81768944 (+  80) 8007eefe   <kernel_x86> ExpressionParser<0x817689b4>::EvaluateCommand(char const*: 0x801589c0 "bt", int&: 0x817689b0) + 0x0822
 8 81768994 (+ 192) 8007fbce   <kernel_x86>:evaluate_debug_command + 0x0084
 9 81768a54 (+  64) 80076ab1   <kernel_x86> kernel_debugger_loop(char const*: 0x0 "<NULL>", char const*: 0x80141734 "ASSERT FAILED (%s:%d): %s; remaining: %lu/%lu, area: %p, meta chunk: %ld
", char*: 0x81768ac4, int32: -2146997173) + 0x0302
10 81768a94 (+  48) 80076c69   <kernel_x86> kernel_debugger_internal(char const*: 0x0 "<NULL>", char const*: 0x81768c5c "", char*: 0x81768ae4, int32: -2146996679) + 0x0110
11 81768ac4 (+  32) 80076e4c   <kernel_x86>:panic + 0x0023
12 81768ae4 (+  80) 800ccdaa   <kernel_x86> MemoryManager<0x8016a558>::_GetChunks(DoublyLinkedList<MemoryManager::MetaChunk, DoublyLinkedListStandardGetLink<MemoryManager::MetaChunk> >*: 0x1000, uint32: 0x3 (3), uint32: 0x81768c14, MemoryManager::MetaChunk*&: 0x81768c10, MemoryManager::Chunk*&: NULL) + 0x018e
13 81768b34 (+ 112) 800ccee7   <kernel_x86> MemoryManager<0x1000>::_AllocateChunks(uint32: 0x3 (3), uint32: 0x0 (0), uint32: 0x81768c14, MemoryManager::MetaChunk*&: 0x81768c10, MemoryManager::Chunk*&: 0x9083b0a0) + 0x0061
14 81768ba4 (+ 128) 800cd323   <kernel_x86> MemoryManager<0x2260>::AllocateRaw(uint32: 0x0 (0), uint32: 0x81768c54, void*&: 0x800c9ab4) + 0x0107
15 81768c24 (+  64) 800c9add   <kernel_x86> block_alloc(uint32: 0x2260 (8800), uint32: 0x0 (0), uint32: 0x0 (0)) + 0x007f
16 81768c64 (+  32) 800c9bd1   <kernel_x86>:malloc + 0x0012
17 81768c84 (+ 160) 8004fbf5   <kernel_x86>:load_kernel_add_on + 0x05d1
18 81768d24 (+ 288) 80093cf0   <kernel_x86>:legacy_driver_probe + 0x0589
19 81768e44 (+  64) 8008f8ca   <kernel_x86> scan_for_drivers_if_needed(devfs_vnode*: 0x80d7fba0) + 0x00e8
20 81768e84 (+  64) 8008f954   <kernel_x86> devfs_open_dir(fs_volume*: 0x82207450, fs_vnode*: 0x9083b4b0, void**: 0x81768ef4) + 0x0049
21 81768ec4 (+  64) 800b5862   <kernel_x86> open_dir_vnode(vnode*: NULL, false) + 0x0027
22 81768f04 (+  32) 800beeec   <kernel_x86> dir_open(int32: -1867977984, char*: 0x401, false) + 0x0039
23 81768f24 (+  32) 800befc7   <kernel_x86>:_user_open_dir + 0x00a1
24 81768f44 (+ 100) 800f7541   <kernel_x86>:handle_syscall + 0x00be
user iframe at 0x81768fa8 (end = 0x81769000)
 eax 0x66           ebx 0x456330        ecx 0x7ffee6bc   edx 0xffff0114
 esi 0x0            edi 0x210efd        ebp 0x7ffee6e8   esp 0x81768fdc
 eip 0xffff0114  eflags 0x212      user esp 0x7ffee6bc
 vector: 0x63, error code: 0x0
25 81768fa8 (+   0) ffff0114   <commpage>:commpage_syscall + 0x0004
26 7ffee6e8 (+  48) 00391c6e   <libbe.so> BDirectory::BDirectory(char const*: 0x7ffee89c "¨EE") + 0x0050
27 7ffee718 (+ 560) 0020990f   <_APP_> NetServer<0x7ffeed78>::_ConfigureDevices(int32: 3, char const*: 0x210efd "/dev/net", BMessage*: NULL) + 0x002d
28 7ffee948 (+ 272) 00209cfa   <_APP_> NetServer<0x7ffeed78>::_BringUpInterfaces() + 0x01e8
29 7ffeea58 (+  96) 00209d7d   <_APP_> NetServer<0x7ffeed78>::ReadyToRun() + 0x0049
30 7ffeeab8 (+ 544) 002b7dea   <libbe.so> BApplication<0x7ffeed78>::DispatchMessage(BMessage*: 0x1801ab40, BHandler*: 0x7ffeed78) + 0x0348
31 7ffeecd8 (+  64) 002c2927   <libbe.so> BLooper<0x7ffeed78>::task_looper() + 0x0205
32 7ffeed18 (+  48) 002b839a   <libbe.so> BApplication<0x7ffeed78>::Run() + 0x0064
33 7ffeed48 (+ 544) 002086f7   <_APP_>:main + 0x0076
34 7ffeef68 (+  52) 002075cd   <_APP_>:_start + 0x0051
35 7ffeef9c (+  64) 0010534f   </boot/system/runtime_loader@0x00100000>:unknown + 0x534f
36 7ffeefdc (+   0) 7ffeefec   1562:net_server_main_stack@0x7efef000 + 0xffffec
kdebug> area 0x90800000
AREA: 0x82381060
name:           'slab area'
owner:          0x1
id:             0xb3
base:           0x90800000
size:           0x800000
protection:     0x10
wiring:         0x1
memory_type:    0x0
cache:          0x820089a0
cache_type:     null
cache_offset:   0x0
cache_next:     0x00000000
cache_prev:     0x00000000
page mappings:  0
kdebug> slab_area -c 0x90800000
chunk        base       cache  object size  cache name
    0  0x90803000  ---  small meta chunk:  116/ 125 used, 119 - 124 free ------------
    0  0x90803000  0x820031f0           32  block allocator: 32
    1  0x90804000  0x82006dc0          384  block allocator: 384
    2  0x90805000  0x82006dc0          384  block allocator: 384
    3  0x90806000  0x820034c0           80  block allocator: 80
    4  0x90807000  0x82003880          160  block allocator: 160
    5  0x90808000  0x820031f0           32  block allocator: 32
    6  0x90809000  0x82003100           24  block allocator: 24
    7  0x9080a000  raw allocation up to 0x9082cfff
   42  0x9082d000  raw allocation up to 0x9082ffff
   45  0x90830000  0x820032e0           48  block allocator: 48
   46  0x90831000  0x82003000           16  block allocator: 16
   47  0x90832000  0x820035b0           96  block allocator: 96
   48  0x90833000  0x82006dc0          384  block allocator: 384
   49  0x90834000  0x82006dc0          384  block allocator: 384
   50  0x90835000  0x820034c0           80  block allocator: 80
   51  0x90836000  0x82003880          160  block allocator: 160
   52  0x90837000  0x82003880          160  block allocator: 160
   53  0x90838000  0x82003880          160  block allocator: 160
   54  0x90839000  0x82006dc0          384  block allocator: 384
   55  0x9083a000  0x820032e0           48  block allocator: 48
   56  0x9083b000  0x820034c0           80  block allocator: 80
   57  0x9083c000  0x820032e0           48  block allocator: 48
   58  0x9083d000  0x820035b0           96  block allocator: 96
   59  0x9083e000  0x82006dc0          384  block allocator: 384
   60  0x9083f000  0x820031f0           32  block allocator: 32
   61  0x90840000  0x82003880          160  block allocator: 160
   62  0x90841000  0x82006dc0          384  block allocator: 384
   63  0x90842000  0x82003c40          320  block allocator: 320
   64  0x90843000  0x820034c0           80  block allocator: 80
   65  0x90844000  0x82003880          160  block allocator: 160
   66  0x90845000  0x82006dc0          384  block allocator: 384
   67  0x90846000  0x820034c0           80  block allocator: 80
   68  0x90847000  0x82003790          128  block allocator: 128
   69  0x90848000  0x82003880          160  block allocator: 160
   70  0x90849000  0x820034c0           80  block allocator: 80
   71  0x9084a000  0x82006dc0          384  block allocator: 384
   72  0x9084b000  0x820035b0           96  block allocator: 96
   73  0x9084c000  0x82003880          160  block allocator: 160
   74  0x9084d000  0x82006dc0          384  block allocator: 384
   75  0x9084e000  0x82003100           24  block allocator: 24
   76  0x9084f000  0x82003000           16  block allocator: 16
   77  0x90850000  0x820032e0           48  block allocator: 48
   78  0x90851000  raw allocation up to 0x90853fff
   81  0x90854000  raw allocation up to 0x90858fff
   86  0x90859000  0x82006dc0          384  block allocator: 384
   87  0x9085a000  0x82006dc0          384  block allocator: 384
   88  0x9085b000  0x82006dc0          384  block allocator: 384
   89  0x9085c000  0x82003880          160  block allocator: 160
   90  0x9085d000  0x820035b0           96  block allocator: 96
   91  0x9085e000  0x820033d0           64  block allocator: 64
   92  0x9085f000  raw allocation up to 0x90861fff
   95  0x90862000  0x82006dc0          384  block allocator: 384
   96  0x90863000  0x820034c0           80  block allocator: 80
   97  0x90864000  0x82006dc0          384  block allocator: 384
   98  0x90865000  0x82003880          160  block allocator: 160
   99  0x90866000  0x82006dc0          384  block allocator: 384
  100  0x90867000  0x82006dc0          384  block allocator: 384
  101  0x90868000  0x820032e0           48  block allocator: 48
  102  0x90869000  0x82006dc0          384  block allocator: 384
  103  0x9086a000  0x820034c0           80  block allocator: 80
  104  0x9086b000  0x820035b0           96  block allocator: 96
  105  0x9086c000  0x82006dc0          384  block allocator: 384
  110  0x90871000  0x82003c40          320  block allocator: 320
  111  0x90872000  0x820033d0           64  block allocator: 64
  112  0x90873000  raw allocation up to 0x90879fff
  121  0x9087c000  0x82003790          128  block allocator: 128
    1  0x90880000  ---  large meta chunk:    1/   1 used, 1   -   0 free ------------
    0  0x90880000  0x80b7ec80         2048  block cache buffers
    2  0x90900000  ---  large meta chunk:    1/   1 used, 1   -   0 free ------------
    0  0x90900000  0x80b7ec80         2048  block cache buffers
    3  0x90980000  ---  large meta chunk:    1/   1 used, 1   -   0 free ------------
    0  0x90980000  0x80b7ec80         2048  block cache buffers
    4  0x90a00000  ---  large meta chunk:    1/   1 used, 1   -   0 free ------------
    0  0x90a00000  0x80b7ec80         2048  block cache buffers
    5  0x90a80000  --- medium meta chunk:    1/   8 used, 1   -   7 free ------------
    0  0x90a80000  0x82006500         1280  block allocator: 1280
    6  0x90b00000  ---  empty meta chunk --------------------------------------------
    7  0x90b80000  ---  empty meta chunk --------------------------------------------
    8  0x90c00000  ---  empty meta chunk --------------------------------------------
    9  0x90c80000  ---  empty meta chunk --------------------------------------------
   10  0x90d00000  ---  empty meta chunk --------------------------------------------
   11  0x90d80000  ---  empty meta chunk --------------------------------------------
   12  0x90e00000  ---  empty meta chunk --------------------------------------------
   13  0x90e80000  ---  empty meta chunk --------------------------------------------
   14  0x90f00000  ---  empty meta chunk --------------------------------------------
   15  0x90f80000  ---  empty meta chunk --------------------------------------------
kdebug>

Will see if SMP or anything else affects it and report back. Let me know if there's anything else I can dig up from the debugger.

Attachments (1)

syslog-alpha3.txt (455.2 KB) - added by heto 8 years ago.
Syslog captured using the save feature in the bootloader

Download all attachments as: .zip

Change History (20)

comment:1 Changed 9 years ago by anevilyak

Seems it's not reliably reproducible either, just rebooted about 10 times with the same exact code and haven't seen it again.

comment:2 in reply to:  1 ; Changed 9 years ago by bonefish

Replying to anevilyak:

Seems it's not reliably reproducible either, just rebooted about 10 times with the same exact code and haven't seen it again.

There isn't anything substantial in the MemoryManager that has changed in the last months. Only small adjustments due to VM interface changes. This looks like a structural bug though: The algorithm didn't find one of the three chunks it was allocating in the free chunk list. According to the slab_area output the contiguous free chunk range hint for the first meta chunk is the chunks 119 - 124. And according to the chunk listing they actually are free. I looked through the code, but I couldn't find any problem. The free list maintenance looks fine and it is only accessed while locked, so there can't be any race conditions either.

I guess for testing purposes I could add an aggressive free list check after any operation. Will add that to my TODO list.

comment:3 in reply to:  2 ; Changed 9 years ago by anevilyak

Replying to bonefish:

There isn't anything substantial in the MemoryManager that has changed in the last months. Only small adjustments due to VM interface changes. This looks like a

Indeed, first time I've seen it and I've been using that slab memory manager option pretty much since you introduced it.

I guess for testing purposes I could add an aggressive free list check after any operation. Will add that to my TODO list.

Will keep an eye out. On the off chance I run into it again meanwhile, is there any other slab-related info that'd be helpful from the debugger or is the area output about as useful as we'll get?

comment:4 in reply to:  3 Changed 9 years ago by bonefish

Replying to anevilyak:

Will keep an eye out. On the off chance I run into it again meanwhile, is there any other slab-related info that'd be helpful from the debugger or is the area output about as useful as we'll get?

Without tracing (SLAB_MEMORY_MANAGER_TRACING), I'm afraid, there isn't anymore interesting info one can get. With tracing enabled, all operations on the concerned meta chunk (the tracing data contain the meta chunk address instead of the index, though) would be of interest. Might be quite a few, though.

comment:5 Changed 9 years ago by bonefish

Keywords: slab memory manager added
Status: newin-progress

I could reproduce the panic once after several boots. Will poke around a bit...

comment:6 in reply to:  5 Changed 9 years ago by bonefish

Cc: axeld added

Replying to bonefish:

I could reproduce the panic once after several boots. Will poke around a bit...

Despite dozens of tries on all three of my machines (Core i7, Core 2 Due, P4) I didn't see panic even one more time. I tried with and without tracing, and with and without the paranoid checks I introduced in hrev37360. I even implemented a bit of allocator stress testing (not checked in), which didn't trigger the problem either.

So if one of you guys who can reproduce the bug more reliably could enable DEBUG_SLAB_MEMORY_MANAGER_PARANOID_CHECKS and see what happens, that would be helpful.

comment:7 Changed 9 years ago by anevilyak

Paranoia settings and slab mem manager tracing enabled, will let you know if/when I see it again.

comment:8 Changed 9 years ago by axeld

The two settings so far reliably defeated the problem, unfortunately. Before, 80+% of all boots resulted in the panic :-/

comment:9 Changed 9 years ago by anevilyak

I haven't seen it so far either with those settings, though in fairness, I've only ever encountered it once myself so far anyways.

comment:10 Changed 8 years ago by bonefish

Status: in-progressassigned

comment:11 Changed 8 years ago by anevilyak

Still haven't seen this one, shall we close it, and just reopen if I happen to encounter it again?

comment:12 Changed 8 years ago by bonefish

If Axel, who seemed to be able to reproduce the issue most reliably, agrees, closing is fine by me as well.

comment:13 Changed 8 years ago by axeld

Resolution: fixed
Status: assignedclosed

I'm not sure if my current version enables any extra tracing; I would assume it doesn't, and I haven't seen this bug since ages. So I guess closing it shouldn't harm :-)

comment:14 Changed 8 years ago by heto

Cc: henrimoi@… added
Resolution: fixed
Status: closedreopened

I can reproduce this on alpha 3. Trying to install from a cd with an anyboot image it KDLs about halfway through the installation.

I’ll attach a syslog from the installation attempt; I should be able to build Haiku if there are any tracing options to be enabled.

Changed 8 years ago by heto

Attachment: syslog-alpha3.txt added

Syslog captured using the save feature in the bootloader

comment:15 Changed 8 years ago by bonefish

Resolution: fixed
Status: reopenedclosed

Fixed in hrev42385. Unfortunately it is not related to the bug reported originally, as the new issue was only introduced in hrev37701, thus postdating the ticket.

comment:16 Changed 8 years ago by heto

Confirming that the fix seems to have worked.

comment:17 Changed 8 years ago by Premislaus

comment:18 Changed 7 years ago by bonefish

Blocking: 8030 added

(In #8030) The description is a bit confusing, but I assume the KDL is what this ticket is reporting. The KDL was already reported in #6237 (after reopening it) and fixed in hrev42385. Closing as duplicate.

comment:19 Changed 7 years ago by axeld

Blocking: 9010 added

(In #9010) Indeed, thanks for the investigation!

Note: See TracTickets for help on using tickets.