Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#2059 closed bug (fixed)

KDL during svn checkout in block notifier/writer

Reported by: anevilyak
Owned by: bonefish
Priority: critical
Milestone: R1/alpha1
Component: System/Kernel
Version: R1/pre-alpha1
Keywords:
Cc: marcusoverhagen
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

After ~1GB of files having been downloaded via svn, Haiku KDLed with:

PANIC: vm_page_fault: unhandled page fault in kernel space at 0x200246 ip 0x86d9c800

stack trace for thread 8 "block notifier/writer"
<snip>usual debugger parser functions
<kernel>:panic + 0x0029
<kernel>:vm_page_fault + 0x00ab
<kernel>:page_fault_exception + 0x00b1
<kernel>:int_bottom + 0x001d (nearest)
iframe at 0x8013ad9c (end = 0x8013adf4)

eax 0x86d9c800 ebx 0x80131d68 ecx 0x1 edx 0x200246
esi 0x80131d88 edi 0x8013ae40 ebp 0x8013ae48 esp 0x8013add0
eip 0x86d9c800 eflags 0x210287
vector: 0xe, error code: 0x0

<kernel>:flush_pending_notifications__Fv + 0x0069
<kernel>:block_notifier_and_writer__FPv + 0x0055
<kernel>:_create_kernel_thread_kentry__Fv + 0x001b
<kernel>:thread_kthread_exit__Fv + 0x0000

This was on hrev24880.

Attachments (1)

NotificationKDL2.PNG (332.1 KB) - added by bga 6 years ago.
General Protection Exception


Change History (18)

comment:1 Changed 6 years ago by anevilyak

I just realized this was with the old ide stack; I usually use ata but switched back to ide to build an image for someone, then forgot to switch it back. Not sure whether that can influence it, but I'm rebuilding with ata in place now to see if I can reproduce it in that scenario.

comment:2 Changed 6 years ago by anevilyak

On a further note, I've tried a complete svn checkout + rm -rf twice now with ata, and I cannot reproduce this crash; perhaps some interplay/race condition in how the old IDE stack works?

comment:3 Changed 6 years ago by axeld

Judging from the stack crawl, it doesn't look like there is any obvious connection to ata vs. ide, at least.
This looks like the cache list was corrupted. I don't know what could have caused this, though; theoretically it could be anything...

comment:4 Changed 6 years ago by anevilyak

Yeah, I realize the crawl doesn't really point fingers at the ide stack directly, but I thought I'd point it out anyway since I cannot seem to reproduce that crash with the other stack. I thought it might be possible that the ide stack was destroying a data block after having passed it to the cache, or something along those lines; I wasn't certain how they interact. I can try switching back to ide and see if it crashes consistently if you'd like.

comment:5 Changed 6 years ago by axeld

No need to; if it doesn't pop up anymore, we can just close this ticket. If it does, then we have a reason to keep it open ;-)
I guess it'll come back. They all come back :-))

comment:6 Changed 6 years ago by anevilyak

Understood, will let you know if I run into it again since I'm probably going to be doing this kind of thing a lot the next few days :) If I do hit it again I'll leave it in the kernel debugger so you can let me know any possibly useful info I can try to trace out of it.

comment:7 Changed 6 years ago by ddew

I've just run into what looks like the same issue on hrev24968 using the new ata stack. I've left it in KDL if you want me to run some tests or need more info.

comment:8 Changed 6 years ago by bonefish

#2150 is a dup of this one.

I also ran into the problem, but all I could find out is that the crash happened when the notification was removed from the list. The list had been corrupted.

The crash happened in a low-memory situation, BTW. I was running the OpenSSH test suite and due to the net buffer data header leak it had consumed virtually all memory. Furthermore another bug caused the test to write an ever-growing file at the same time.
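
To illustrate the kind of check that can reveal a corrupted list before a removal crashes on it, here is a minimal sketch of a doubly linked list with a consistency walk. All names (`Node`, `List`, `list_append`, `list_remove`, `is_consistent`) are hypothetical and chosen for illustration only; this is not the block cache's actual notification list or the fix that went into the tree.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical list element; a real notification would carry more fields.
struct Node {
    Node* prev = nullptr;
    Node* next = nullptr;
};

// Hypothetical list head with explicit head/tail pointers.
struct List {
    Node* head = nullptr;
    Node* tail = nullptr;
};

// Appends node at the tail of the list.
void list_append(List& list, Node* node)
{
    node->prev = list.tail;
    node->next = nullptr;
    if (list.tail != nullptr)
        list.tail->next = node;
    else
        list.head = node;
    list.tail = node;
}

// Unlinks node from the list. If the links were already damaged (for
// example by a racing removal), this is where garbage gets dereferenced.
void list_remove(List& list, Node* node)
{
    if (node->prev != nullptr)
        node->prev->next = node->next;
    else
        list.head = node->next;
    if (node->next != nullptr)
        node->next->prev = node->prev;
    else
        list.tail = node->prev;
    node->prev = nullptr;
    node->next = nullptr;
}

// Walks the list from the head and verifies that every back link matches
// the node visited before it. maxNodes bounds the walk so that a cycle
// introduced by corruption cannot loop forever.
bool is_consistent(const List& list, size_t maxNodes)
{
    const Node* previous = nullptr;
    size_t count = 0;
    for (const Node* node = list.head; node != nullptr; node = node->next) {
        if (node->prev != previous || ++count > maxNodes)
            return false;
        previous = node;
    }
    return list.tail == previous;
}
```

Calling a check like `is_consistent()` under the list's lock before and after each modification, in a debug build, would narrow down which code path damages the links. It only detects the corruption; it does not say which thread caused it.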

comment:9 Changed 6 years ago by bga

I got the same crash, and it was not in a low-memory condition at all (unless the cache taking up almost all available memory also counts as low memory). Basically, I booted Haiku and tried to create a 3 GB file using dd on a partition that had 3.6 GB available. After a while, I got the KDL. BTW, I managed to trigger it twice in a row using the same steps (I didn't try more than twice, though). See bug #2151.

comment:10 Changed 6 years ago by bga

Got the crash again when checking out the source tree with svn from inside Haiku. The KDL was almost the same but the specific error it reported was different, so I am attaching a new screenshot just for reference.

Changed 6 years ago by bga

General Protection Exception

comment:11 Changed 6 years ago by stippi

  • Milestone changed from R1 to R1/alpha1
  • Priority changed from normal to critical

This one is critical for R1/alpha.

comment:12 Changed 6 years ago by stippi

Still with us in hrev25463.

comment:13 Changed 6 years ago by marcusoverhagen

  • Cc marcusoverhagen added

I also get this with hrev25523 while checking out haiku source.
Used memory in ActivityMonitor was 694 MB when it crashed (on a 4GB machine).

vm_soft_fault: kernel thread accessing invalid user memory!
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x30010800, ip 0x30010800, write 0, user 0, thread 0xb
CPU 1 halted!
CPU 3 halted!
CPU 0 halted!
PANIC: vm_page_fault: unhandled page fault in kernel space at 0x30010800, ip 0x30010800

Welcome to Kernel Debugging Land...
Running on CPU 2
kdebug> sc
stack trace for thread 11 "block notifier/writer"
    kernel stack: 0x80151000 to 0x80155000
frame            caller     <image>:function + offset
80154a60 (+  48) 800496e7   <kernel>:invoke_debugger_command + 0x00cf
80154a90 (+  64) 8004a490   <kernel>:_ParseCommand__16ExpressionParserRi + 0x01f8
80154ad0 (+  48) 80049e82   <kernel>:EvaluateCommand__16ExpressionParserPCcRi + 0x01de
80154b00 (+ 224) 8004b5a4   <kernel>:evaluate_debug_command + 0x0088
80154be0 (+  64) 80048222   <kernel>:kernel_debugger_loop__Fv + 0x017a
80154c20 (+  48) 80048ed5   <kernel>:kernel_debugger + 0x010d
80154c50 (+ 192) 80048dbd   <kernel>:panic + 0x0029
80154d10 (+  64) 8009410f   <kernel>:vm_page_fault + 0x00ab
80154d50 (+  64) 8009e289   <kernel>:page_fault_exception + 0x00b1
80154d90 (+  12) 800a197d   <kernel>:int_bottom + 0x001d (nearest)
iframe at 0x80154d9c (end = 0x80154df4)
 eax 0x30010800     ebx 0xa485ebe4      ecx 0x0          edx 0x200246
 esi 0xa485ec04     edi 0x80154e40      ebp 0x80154e48   esp 0x80154dd0
 eip 0x30010800  eflags 0x210287
 vector: 0xe, error code: 0x0

[*** READ/WRITE FAULT ***]
kdebug> 
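
For reading the "vector: 0xe, error code: 0x0" line in the iframe above, here is a minimal sketch that decodes the x86 page-fault error code. The struct and function names are hypothetical, not kernel API. The bit layout is the standard x86 one; the last helper is only a heuristic and rests on the assumption that a fault address equal to eip means the CPU tried to fetch code from an unmapped address, for instance after a call through a bad pointer.

```cpp
#include <cassert>
#include <cstdint>

// Decoded view of the error code the CPU pushes for vector 0xe.
struct PageFaultInfo {
    bool present;      // bit 0: false = page not present, true = protection violation
    bool write;        // bit 1: false = read access, true = write access
    bool user;         // bit 2: false = supervisor mode, true = user mode
    bool instruction;  // bit 4: instruction fetch (only reported with NX/SMEP enabled)
};

// Splits the raw error code into its individual flags.
PageFaultInfo decode_page_fault(uint32_t errorCode)
{
    PageFaultInfo info;
    info.present = (errorCode & 0x01) != 0;
    info.write = (errorCode & 0x02) != 0;
    info.user = (errorCode & 0x04) != 0;
    info.instruction = (errorCode & 0x10) != 0;
    return info;
}

// Heuristic (assumption, see above): the faulting address is the
// instruction pointer itself, so the fetch of the next instruction failed.
bool looks_like_wild_jump(uint32_t faultAddress, uint32_t eip)
{
    return faultAddress == eip;
}
```

With error code 0x0 this yields a supervisor-mode read of a non-present page, and for the values in this trace (fault at 0x30010800, eip 0x30010800) the heuristic matches. That would be consistent with a corrupted list entry, but it does not prove it.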

comment:14 Changed 6 years ago by marcusoverhagen

This might be related, too.

PANIC: free(): free failed for address 0x8014bd98

Welcome to Kernel Debugging Land...
Running on CPU 0
kdebug> sc
stack trace for thread 11 "block notifier/writer"
    kernel stack: 0x80151000 to 0x80155000
frame            caller     <image>:function + offset
80154b08 (+  48) 800496e7   <kernel>:invoke_debugger_command + 0x00cf
80154b38 (+  64) 8004a490   <kernel>:_ParseCommand__16ExpressionParserRi + 0x01f8
80154b78 (+  48) 80049e82   <kernel>:EvaluateCommand__16ExpressionParserPCcRi + 0x01de
80154ba8 (+ 224) 8004b5a4   <kernel>:evaluate_debug_command + 0x0088
80154c88 (+  64) 80048222   <kernel>:kernel_debugger_loop__Fv + 0x017a
80154cc8 (+  48) 80048ed5   <kernel>:kernel_debugger + 0x010d
80154cf8 (+ 192) 80048dbd   <kernel>:panic + 0x0029
80154db8 (+  48) 8002e814   <kernel>:free + 0x0074
80154de8 (+  96) 800238ee   <kernel>:flush_pending_notifications__FP11block_cache + 0x01a6
80154e48 (+  64) 80023999   <kernel>:flush_pending_notifications__Fv + 0x006d
80154e88 (+ 336) 800259c1   <kernel>:block_notifier_and_writer__FPv + 0x0055
80154fd8 (+  32) 80040d6f   <kernel>:_create_kernel_thread_kentry__Fv + 0x001b
80154ff8 (+2146086920) 80040d04   <kernel>:thread_kthread_exit__Fv + 0x0000
kdebug> 

comment:15 Changed 6 years ago by bonefish

  • Owner changed from axeld to bonefish
  • Status changed from new to assigned

comment:16 Changed 6 years ago by bonefish

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in hrev25525.

comment:17 Changed 6 years ago by stippi

Works beautifully now! Thanks a lot!
