#1359 closed bug (fixed)

PANIC: vm_cache_insert_page()

Reported by:	jonas.kirilla	Owned by:	bonefish
Priority:	blocker	Milestone:	R1
Component:	System/Kernel	Version:	R1/pre-alpha1
Keywords:		Cc:	axeld
Blocked By:		Blocking:
Platform:	All

Description

This panic is reproducible within seconds on my hardware by running either of these in Terminal:

while (true); do sleep 0.001; done;
while (true); do times > /dev/null; done;

This is with Haiku hrev21827 on real hardware.

Sample: (KDL1)

PANIC: vm_cache_insert_page(): there's already page 0x928dfc08 with cache offset 4095 in cache 0x90af3680; inserting page 0x928e0188
Welcome to Kernel Debugging Land...
Running on CPU 0
kdebug> bt
stack trace for thread 0x7e "sh"
    kernel stack: 0x9c80d000 to 0x9c811000
      user stack: 0x7efe7000 to 0x7ffe7000
frame            caller     <image>:function + offset
9c810c50 (+  52) 8008635b   <kernel>:invoke_command + 0x0073
9c810c84 (+  48) 800864a2   <kernel>:kernel_debugger_loop + 0x0102
9c810cb4 (+  32) 80086efa   <kernel>:kernel_debugger + 0x00b2
9c810cd4 (+ 192) 80086e3d   <kernel>:panic + 0x0029
9c810d94 (+  80) 8005ad28   <kernel>:vm_cache_insert_page + 0x00a8
9c810de4 (+ 208) 8005892c   <kernel>:vm_soft_fault__FUlbT1 + 0x0868
9c810eb4 (+  64) 80057ecd   <kernel>:vm_page_fault + 0x0031
9c810ef4 (+ 176) 8008f68a   <kernel>:i386_handle_trap + 0x023a
iframe at 0x9c810fac (end = 0x9c811000)
 eax 0x0            ebx 0x32bedc        ecx 0x7ffe6c68   edx 0x0
 esi 0xf8           edi 0x0             ebp 0x7ffe6cac   esp 0x9c810fdc
 eip 0x30ffcf    eflags 0x10207    user esp 0x7ffe6c88
 vector: 0xe, error code: 0x7
9c810fa4 (+   0) 0030ffcf   </boot/beos/system/lib/libroot.so@0x00295000>:unknown + 0x7afcf
7ffe6cac (+  32) 0023c1ad   </bin/sh@0x00200000>:unknown + 0x3c1ad
7ffe6ccc (+  48) 0022762c   </bin/sh@0x00200000>:unknown + 0x2762c
7ffe6cfc (+  96) 00226b4b   </bin/sh@0x00200000>:unknown + 0x26b4b
7ffe6d5c (+  96) 00223cb7   </bin/sh@0x00200000>:unknown + 0x23cb7
7ffe6dbc (+  80) 00223682   </bin/sh@0x00200000>:unknown + 0x23682
7ffe6e0c (+  48) 00225f54   </bin/sh@0x00200000>:unknown + 0x25f54
7ffe6e3c (+  48) 00225e79   </bin/sh@0x00200000>:unknown + 0x25e79
7ffe6e6c (+  80) 00223e56   </bin/sh@0x00200000>:unknown + 0x23e56
7ffe6ebc (+  80) 00223682   </bin/sh@0x00200000>:unknown + 0x23682
7ffe6f0c (+  48) 0021f05e   </bin/sh@0x00200000>:unknown + 0x1f05e
7ffe6f3c (+  64) 0021d142   </bin/sh@0x00200000>:unknown + 0x1d142
7ffe6f7c (+  48) 00215c7f   </bin/sh@0x00200000>:unknown + 0x15c7f
7ffe6fac (+  48) 001007c8   1190:runtime_loader_seg0ro@0x00100000 + 0x7c8
7ffe6fdc (+   0) 7ffe6fec   1187:/bin/sh_main_stack@0x7efe7000 + 0xffffec
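
For readers decoding the iframe above: vector 0xe is the x86 page-fault exception, and the low three bits of its error code say what kind of access faulted. The following is a minimal sketch of that decoding. The macro, struct, and function names are made up for illustration and are not taken from the Haiku sources; only the bit meanings come from the x86 architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low three bits of the x86 page-fault error code (vector 0xe).
 * Names are hypothetical; the bit layout is architectural. */
#define PF_PRESENT 0x1u /* set: access violation on a present page; clear: page not present */
#define PF_WRITE   0x2u /* set: faulting access was a write; clear: a read */
#define PF_USER    0x4u /* set: fault was raised while running in user mode */

struct pf_info {
    bool present;
    bool write;
    bool user;
};

/* Split a page-fault error code into its three basic flags. */
static struct pf_info decode_pf_error(uint32_t code)
{
    struct pf_info info;
    info.present = (code & PF_PRESENT) != 0;
    info.write   = (code & PF_WRITE) != 0;
    info.user    = (code & PF_USER) != 0;
    return info;
}
```

Applied to the value in this trace, decode_pf_error(0x7) sets all three flags: a user-mode write to a page that was present but not writable.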

Attachments (5)

KDL1 - sh - while (true) do sleep 0.001 done (2.7 KB ) - added by jonas.kirilla 18 years ago.
KDL2 - sh - while (true) do sleep 0.001 done (2.6 KB ) - added by jonas.kirilla 18 years ago.
KDL3 - sh - while (true) do times to dev null done (2.4 KB ) - added by jonas.kirilla 18 years ago.
haiku-sysinfo.txt (1.1 KB ) - added by jonas.kirilla 18 years ago.
beos-sysinfo.txt (624 bytes ) - added by jonas.kirilla 18 years ago.

Change History (12)

by jonas.kirilla, 18 years ago

Attachment:	KDL1 - sh - while (true) do sleep 0.001 done added

by jonas.kirilla, 18 years ago

Attachment:	KDL2 - sh - while (true) do sleep 0.001 done added

by jonas.kirilla, 18 years ago

Attachment:	KDL3 - sh - while (true) do times to dev null done added

by jonas.kirilla, 18 years ago

Attachment:	haiku-sysinfo.txt added

by jonas.kirilla, 18 years ago

Attachment:	beos-sysinfo.txt added

comment:1 by axeld, 18 years ago

Owner:	changed from axeld to bonefish
Priority:	normal → blocker

bonefish, wasn't that supposed to be fixed now? :-)

comment:2 by bonefish, 18 years ago

Replying to axeld:

bonefish, wasn't that supposed to be fixed now? :-)

I fixed one occurrence. Looks like this wasn't the only one. Scary.

But hey, why does this become my bug now? I only added the panic(). :-)

comment:3 by bonefish, 18 years ago

Cc:	axeld added

I think I've spotted the problem. In fault_get_page() in the part handling a write fault and a page found in a lower cache:

		mutex_unlock(&cache->lock);
		mutex_lock(&topCache->lock);

		// Insert the new page into our cache, and replace it with the dummy page if necessary

		// if we inserted a dummy page into this cache, we have to remove it now
		if (dummyPage.state == PAGE_STATE_BUSY && dummyPage.cache == topCache)
			fault_remove_dummy_page(dummyPage, true);

		vm_cache_insert_page(topCache, page, cacheOffset);

After "cache" has been unlocked, we don't hold a lock on either cache, and vm_cache_remove_consumer() can happily replace our dummy page with a page from a to-be-removed lower cache (probably the one we've found). We insert our fresh page in either case. I suppose adding a check for whether there's already a page should solve the problem.
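
To make the suggested check concrete, here is a minimal, single-threaded sketch of the "look again after re-locking, then insert" pattern. It uses a toy cache rather than the kernel's data structures. Every name in it (toy_cache, toy_lookup, toy_insert, toy_insert_or_reuse) is hypothetical, locking is only indicated in comments, and it is not the change that went into the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 8

struct page {
    int id;
};

/* Toy stand-in for a VM cache: at most one page per cache offset. */
struct toy_cache {
    struct page *slots[CACHE_SLOTS];
};

/* Return the page stored at the given offset, or NULL if there is none. */
static struct page *toy_lookup(const struct toy_cache *cache, size_t offset)
{
    return offset < CACHE_SLOTS ? cache->slots[offset] : NULL;
}

/* Unchecked insert: fails on a collision, which is the situation the
 * panic() in vm_cache_insert_page() reports. */
static bool toy_insert(struct toy_cache *cache, struct page *page, size_t offset)
{
    if (offset >= CACHE_SLOTS || cache->slots[offset] != NULL)
        return false;
    cache->slots[offset] = page;
    return true;
}

/* Checked insert, intended to run right after the top cache's lock has
 * been re-acquired. If a page appeared at this offset while no lock was
 * held, return that page (the caller would then discard its fresh one);
 * otherwise insert the fresh page. Returns NULL for an invalid offset. */
static struct page *toy_insert_or_reuse(struct toy_cache *cache,
    struct page *fresh, size_t offset)
{
    struct page *existing = toy_lookup(cache, offset);
    if (existing != NULL)
        return existing;
    return toy_insert(cache, fresh, offset) ? fresh : NULL;
}
```

The design point is that the lookup and the insert happen under the same lock acquisition, so nothing can slip a page in between the check and the insert.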

I have a similarly bad feeling about the end of fault_find_page(), when no page has been found. We don't recheck whether a page has appeared in the chosen cache and will insert a clear page at the beginning of fault_get_page() at any rate.

Opinions?

comment:4 by axeld, 18 years ago

I remember you wanted more colors in your bug list ;-) Anyway, that sounds like the cause of the problem, so we would need to add a check there for whether a page has appeared in the meantime before inserting the other page.

comment:5 by bonefish, 18 years ago

Should be fixed in hrev21841.

comment:6 by jonas.kirilla, 18 years ago

Yes, it's solid as a rock now. Thanks! :)

comment:7 by bonefish, 18 years ago

Resolution:	→ fixed
Status:	new → closed
