Opened 14 years ago

Last modified 12 years ago

#6736 closed bug

[Network stack] crashes while trying to quit Fuppes — at Version 14

Reported by: diver
Owned by: axeld
Priority: normal
Milestone: R1
Component: Network & Internet/IPv4
Version: R1/Development
Keywords: multicast ipv4
Cc:
Blocked By:
Blocking:
Platform: All

Description (last modified by diver)

FUPPES is a free, multiplatform UPnP A/V Media Server.

Minimal Fuppes package for Haiku is attached.
Unzip it to /boot, start /boot/apps/Fuppes/fuppes from a terminal, and try to quit it via Ctrl+C.

Change History (20)

by diver, 14 years ago

Attachment: fuppes_kdl.png added

by diver, 14 years ago

Attachment: fuppes_kdl2.png added

by diver, 14 years ago

Attachment: fuppes_kdl3.png added

by diver, 14 years ago

Attachment: fuppes_kdl4.png added

comment:1 by diver, 14 years ago

Actually, I'm not sure about the 1st and 4th screenshots, but they were observed while trying to quit Fuppes.

Last edited 14 years ago by diver

comment:2 by diver, 14 years ago

Keywords: multicast added

comment:4 by mmlr, 13 years ago

Resolution: fixed
Status: new → closed

Should be fixed in hrev42899. The stack traces do not point at the actual problem, because the heap gets corrupted and the crashes only happen on later use. Please reopen if you encounter it again.

comment:5 by diger, 13 years ago

Bug reproduced on hrev42904

I tried fuppes & libbriza testapps

comment:6 by siarzhuk, 13 years ago

It looks like it is only partially fixed. Unfortunately, it is still observed with my sis19x network driver on revisions after hrev42899. The typical stack crawl looks like the one in the attached fuppes_kdl2.png (JoinGroup + 0x00f6). Once it crashed in the same place as shown in fuppes_kdl3.png (Clear() + 0x0060). Note that sis19x is a native driver, so the FreeBSD compat layer is not involved, and the hrev42899 changes are unrelated in this exact case.

I tried to catch the case about a week ago but failed. It looks like the suspect is the "MultiHashTable<MulticastStateHash>* sMulticastState" object. At least, commenting it out makes the KDLs disappear.

Note that typical reproduce sequence is:
a) start fuppes;
b) Ctrl-C to quit fuppes;
c) start fuppes -> fall through into KDL;

Sometimes steps b) and c) have to be repeated a few times to trigger the KDL.
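A stripped-down stand-in for Fuppes might make the sequence above easier to repeat while tracing. The following is only a sketch under assumptions: it uses the standard BSD socket calls and the SSDP group address 239.255.255.250 that UPnP discovery uses, and it assumes that joining a group and then closing the socket without an explicit IP_DROP_MEMBERSHIP exercises the same kernel path. The ticket does not say how Fuppes itself tears down its sockets, and the helper names `make_ssdp_request` and `join_and_abandon` are hypothetical.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cstring>

// 239.255.255.250 is the SSDP multicast group used by UPnP discovery.
static const char* const kSsdpGroup = "239.255.255.250";

// Builds the membership request for the SSDP group on the default interface.
ip_mreq make_ssdp_request()
{
	ip_mreq request;
	std::memset(&request, 0, sizeof(request));
	inet_pton(AF_INET, kSsdpGroup, &request.imr_multiaddr);
	request.imr_interface.s_addr = htonl(INADDR_ANY);
	return request;
}

// Joins the group and closes the socket without IP_DROP_MEMBERSHIP, leaving
// the cleanup to the network stack (assumption: this mimics quitting Fuppes).
// Returns true if the join itself succeeded.
bool join_and_abandon()
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return false;

	ip_mreq request = make_ssdp_request();
	bool joined = setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &request,
		sizeof(request)) == 0;

	close(fd);
	return joined;
}
```

Calling `join_and_abandon()` several times in a row from a small test program would correspond to repeating steps b) and c).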

Are there any suggestions on how to trace or debug this?

PS: The test was performed with a version of sis19x that just answers B_OK to B_ETHER_ADDMULTI / B_ETHER_REMMULTI ioctl requests.

comment:7 by axeld, 13 years ago

Multicast has been pretty much broken in the ipv4/ipv6 modules for quite some time.

comment:8 by mmlr, 13 years ago

Resolution: fixed (deleted)
Status: closed → reopened

At least screenshots 1 and 4 look exactly like the ones that happened here due to the FreeBSD compatibility layer issue fixed in hrev42904. So there seem to be two separate bugs here, one of which was fixed, the other still open. Even though it'd be nicer to have separate bug reports for each, let's just reopen this one.

Replying to siarzhuk:

It looks like it is only partially fixed. Unfortunately, it is still observed with my sis19x network driver on revisions after hrev42899. The typical stack crawl looks like the one in the attached fuppes_kdl2.png (JoinGroup + 0x00f6). Once it crashed in the same place as shown in fuppes_kdl3.png (Clear() + 0x0060). Note that sis19x is a native driver, so the FreeBSD compat layer is not involved, and the hrev42899 changes are unrelated in this exact case.

I see. The FreeBSD ones still use the same upper layers though, so it's entirely possible to run into that issue with a FreeBSD driver as well. In my limited test case I was doing something unrelated, so I didn't stress test the multicast mechanism after the fix.

Are there any suggestions on how to trace or debug this?

Adding/enabling tracing to see what's really going on would make sense. I can of course take another look to see if I can spot anything when using the indicated software. I used my own SSDP implementation and always killed the app, so it's possible that different code paths were used if the software mentioned does a proper cleanup on getting the signal.

comment:9 by mmlr, 13 years ago (in reply to: 7)

Replying to axeld:

Multicast has been pretty much broken in the ipv4/ipv6 modules for quite some time.

...

Fixing at least the KDLs would be my immediate goal anyway.

comment:10 by Disreali, 13 years ago (in reply to: 7)

Replying to axeld:

Multicast has been pretty much broken in the ipv4/ipv6 modules for quite some time.

Is there a ticket for that?

comment:11 by siarzhuk, 13 years ago

Well, I have caught the case: LeaveGroup is not called when multicast group membership is dropped. It looks like the groups hash map then still contains an invalid pointer to the deleted object, so the next attempt to add the same group fails while iterating over the map. I have no idea about all that template kung-fu, but the attached patch solves at least this problem with the fuppes application. :-)
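The failure mode described here (a shared table that keeps a pointer to an already deleted state object) can be illustrated outside the kernel. The sketch below is not the Haiku ipv4 code and not the attached patch: `GroupState` and `GroupTable` are hypothetical stand-ins, and `std::unordered_multimap` replaces the kernel's MultiHashTable. It only shows the general remedy, namely that a state object unlinks itself from the table before it goes away, so that a later join for the same group never walks a stale entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <unordered_map>

// Hypothetical stand-ins, not the real types of the Haiku ipv4 module.
struct GroupState;
using GroupTable = std::unordered_multimap<uint32_t, GroupState*>;

static const uint32_t kGroup = 0xEFFFFFFAu;	// 239.255.255.250

struct GroupState {
	GroupTable&	table;	// shared table indexing every live state
	uint32_t	group;	// multicast group address, host byte order

	GroupState(GroupTable& t, uint32_t g) : table(t), group(g)
	{
		table.emplace(group, this);
	}

	// Unlink from the shared table before the object disappears, so no
	// later lookup can follow a dangling pointer.
	~GroupState()
	{
		auto range = table.equal_range(group);
		for (auto it = range.first; it != range.second; ++it) {
			if (it->second == this) {
				table.erase(it);
				break;
			}
		}
	}

	GroupState(const GroupState&) = delete;
	GroupState& operator=(const GroupState&) = delete;
};

// Number of table entries left after one state has been torn down.
std::size_t entries_after_teardown()
{
	GroupTable table;
	{
		GroupState state(table, kGroup);
	}	// state destroyed here; its destructor removes the table entry
	return table.size();
}

// Joins, tears down, joins again: the table must hold only the new state.
bool rejoin_after_teardown()
{
	GroupTable table;
	{
		GroupState first(table, kGroup);
	}
	GroupState second(table, kGroup);
	auto range = table.equal_range(kGroup);
	return std::distance(range.first, range.second) == 1
		&& range.first->second == &second;
}
```

Without the erase in the destructor, the second join in `rejoin_after_teardown()` would find the stale entry of `first` while iterating over the group's bucket, which matches the start, quit, start sequence that triggers the KDL.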

by siarzhuk, 13 years ago

Attachment: ipv4.patch added

A workaround for the KDL on dropping a multicast group.

comment:12 by siarzhuk, 13 years ago

patch: 01

comment:13 by diver, 12 years ago

Tried again with hrev44559.

vm_soft_fault: va 0x0 not covered by area in address space
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x24, ip 0xcd7ee73d, write 0, user 0, thread 0x2a1
PANIC: vm_page_fault: unhandled page fault in kernel space at 0x24, ip 0xcd7ee73d

Welcome to Kernel Debugging Land...
Thread 673 "fuppes" running on CPU 0
stack trace for thread 673 "fuppes"
    kernel stack: 0x806cd000 to 0x806d1000
      user stack: 0x7efef000 to 0x7ffef000
frame               caller     <image>:function + offset
 0 806d0794 (+  32) 801241e2   <kernel_x86>:arch_debug_stack_trace + 0x0012
 1 806d07b4 (+  16) 800910cf   <kernel_x86> stack_trace_trampoline(NULL) + 0x000b
 2 806d07c4 (+  12) 8012964e   <kernel_x86>:arch_debug_call_with_fault_handler + 0x001b
 3 806d07d0 (+  48) 80092b5e   <kernel_x86>:debug_call_with_fault_handler + 0x005e
 4 806d0800 (+  64) 800912ef   <kernel_x86> kernel_debugger_loop(0x8016d6f7 "PANIC: ", 0x80182ae0 "vm_page_fault: unhandled page fault in kernel space at 0x%lx, ip 0x%lx\n", 0x806d08ac "$", int32: 0) + 0x021b
 5 806d0840 (+  48) 80091653   <kernel_x86> kernel_debugger_internal(0x8016d6f7 "PANIC: ", 0x80182ae0 "vm_page_fault: unhandled page fault in kernel space at 0x%lx, ip 0x%lx\n", 0x806d08ac "$", int32: 0) + 0x0053
 6 806d0870 (+  48) 80092ed8   <kernel_x86>:panic + 0x0024
 7 806d08a0 (+ 144) 80106f8d   <kernel_x86>:vm_page_fault + 0x0129
 8 806d0930 (+  80) 8012583e   <kernel_x86> page_fault_exception(iframe*: 0x806d098c) + 0x017e
 9 806d0980 (+  12) 8012a5fd   <kernel_x86>:int_bottom + 0x003d
kernel iframe at 0x806d098c (end = 0x806d09dc)
 eax 0x0            ebx 0xcd7f43e4      ecx 0xd2870620   edx 0xd2870600
 esi 0x82009488     edi 0x5             ebp 0x806d0a54   esp 0x806d09c0
 eip 0xcd7ee73d  eflags 0x13282
 vector: 0xe, error code: 0x0
10 806d098c (+ 200) cd7ee73d   </boot/system/add-ons/kernel/network/protocols/ipv4> IPv4Multicast<0xd2870600>::JoinGroup() + 0x0249
11 806d0a54 (+ 160) cd7f2332

by diver, 12 years ago

Attachment: Fuppes_kdl.zip added

comment:14 by diver, 12 years ago

Component: Network & Internet/Stack → Network & Internet/IPv4
Description: modified
Keywords: ipv4 added