Opened 11 years ago
Closed 10 years ago
#10328 closed bug (fixed)
[Network Stack] crashes in socket_free
Reported by: | diver | Owned by: | nobody |
---|---|---|---|
Priority: | high | Milestone: | R1/beta1 |
Component: | Network & Internet/Stack | Version: | R1/Development |
Keywords: | Cc: | degea@… | |
Blocked By: | Blocking: | #10814 | |
Platform: | All |
Description
This is hrev46546 gcc2 hybrid in VirtualBox.
Got this KDL a few times after firing up Web+ (right after boot) and going to https://money.yandex.ru
Attachments (1)
Change History (23)
by , 11 years ago
comment:1 by , 11 years ago
Cc: | added |
---|
comment:2 by , 11 years ago
If I didn't screw up searching, the inlined atomic_add() is here and the fUseCount variable is in class BWeakReferenceable here.
I guess the KDL hints at the net_socket_private having been deleted before, thus being reset to deadbeef, including its BWeakReferenceable part (and/or) its WeakPointer member and its fUseCount member... So when atomic_add() dereferences the weakpointer to access its fUseCount it dereferences 0xdeadbeef plus the offset to that usecount variable, == 0xdeadbef7.. So this would be a "heap corruption/double free()" scenario.. Sounds correct to any of you kernel gurus?
Maybe diver could do a dis or even dis -b20 to check how edx ended up the way it is..
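To make the address arithmetic in this comment concrete, here is a minimal sketch. It assumes a 32-bit layout in which fUseCount sits 8 bytes into the weak pointer object (a vtable pointer and a reference count in front of it). The struct FakeWeakPointer32 and the helper member_address() are hypothetical names made up for illustration and are not taken from the Haiku headers.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical 32-bit layout, NOT copied from the Haiku headers: a vtable
// pointer and a reference count, followed by fUseCount.
struct FakeWeakPointer32 {
	uint32_t vtable;
	uint32_t fReferenceCount;
	uint32_t fUseCount;
	uint32_t fObject;
};

// Address that gets touched when a member at `offset` is accessed through
// the pointer value `base`. Nothing is dereferenced here.
constexpr uint32_t
member_address(uint32_t base, uint32_t offset)
{
	return base + offset;
}

// Fill pattern the comment assumes freed memory is reset to.
constexpr uint32_t kFreedPattern = 0xdeadbeef;
constexpr uint32_t kUseCountOffset = offsetof(FakeWeakPointer32, fUseCount);

static_assert(kUseCountOffset == 8, "assumed layout puts fUseCount at +8");
static_assert(member_address(kFreedPattern, kUseCountOffset) == 0xdeadbef7,
	"same address as the fault reported in the panic message");
```

If the real offset of fUseCount differs from 8, the explanation above would not account for 0xdeadbef7, so the disassembly is still worth checking.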
comment:3 by , 11 years ago
hrev46686. Turning on and off Traffic/Photos in the collapsible view in the upper right corner at http://maps.google.com (with classic interface) reproduces this bug.
comment:4 by , 11 years ago
Component: | Kits/Network Kit → Network & Internet/Stack |
---|
comment:5 by , 11 years ago
Summary: | [Network Kit] KDL when accessing https sites → [Network Stack] crashes in socket_free |
---|
comment:6 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Sorry, I'm afraid I can't really help here. I'll let someone else have a look.
comment:8 by , 11 years ago
This is getting really, really bad -- I can't really browse anywhere without this happening. I haven't found any sites that cause it to happen 100% of the time, but it seems to happen repeatedly after browsing for any significant amount of time.
Perhaps a committer can look at Coverity and see if there are any open warnings in the network stack?
comment:9 by , 11 years ago
Just had a similar KDL (occurs very rarely here), starting from BSecureSocket::WaitForData() instead of BSecureSocket::Disconnect() if I recall.
I say "if I recall" because I don't have the full backtrace: I expected it would be in previous_syslog but unfortunately the interesting part is corrupted/truncated by some trailing 'color codes' themselves followed by seemingly random binary garbage:
write access attempted on write-protected area 0x54 at 0xdeadb000
vm_page_fault: vm_soft_fault returned error 'Permission denied' on fault at 0xdeadbef7, ip 0x8008d6be, write 1, user 0, thread 0xb45
PANIC: vm_page_fault: unhandled page fault in kernel space at 0xdeadbef7, ip 0x8008d6be
Welcome to Kernel Debugging Land...
Thread 2885 "BUrlProtocol.HTTP" running on CPU 1
stack trace for thread 2885 "BUrlProtocol.HTTP"
kernel stack: 0x8152a000 to 0x8152e000
user stack: 0x7a33b000 to 0x7a37b000
frame caller <image>:function + offset
0 8152dc94 (+ 32) 801413b6 <kernel_x86> arch_debug_stack_trace + 0x12
1 8152dcb4 (+ 16) 800a131f <kernel_x86> stack_trace_trampoline(NULL) + 0x0b
2 8152dcc4 (+ 12) 801330fe <kernel_x86> arch_debug_call_with_fault_handler + 0x1b
3 8152dcd0 (+ 48) 800a2e8a <kernel_x86> debug_call_with_fault_handler + 0x5a
4 8152dd00 (+ 64) 800a153b <kernel_x86> kernel_debugger_loop([34m0x80183f77[0m [36m"PANIC: "[0m, [34m0x8019a920[0m [36m"vm_page_fault: unhandled page fault in kernel space at 0x%lx, ip 0x%lx
Also forgot to run a dis :-/
comment:10 by , 11 years ago
Milestone: | R1 → R1/alpha5 |
---|---|
Priority: | normal → high |
comment:11 by , 11 years ago
Rebuilding the network stack with TRACE_SOCKET defined would let us see whether the net_socket_private destructor had already been called earlier. If it had not, we could at least go looking for heap corruption.
comment:15 by , 10 years ago
Blocked By: | 11098 added |
---|---|
Resolution: | → duplicate |
Status: | assigned → closed |
comment:16 by , 10 years ago
Resolution: | duplicate |
---|---|
Status: | closed → reopened |
No, this one is different. Only the 0xdeadbef7 crashes in common_{poll|select|wait_for_object} are #11098.
comment:17 by , 10 years ago
Blocked By: | 11098 removed |
---|
comment:18 by , 10 years ago
Milestone: | R1/alpha5 → R1/beta1 |
---|
comment:19 by , 10 years ago
I haven't seen this one in a while either now. Can anyone still reproduce it?
comment:21 by , 10 years ago
Haven't seen it in many weeks. That kind of overlaps with the time at which youtube videos started acting up (they abort playing after a few seconds), so I'm not completely ruling out that it comes back if/when videos work again later, but yes, for now this KDL seems to be gone!
comment:22 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Ok then, let's close for now.
Gotta follow the action on this in light of the other tracked kernel problems..
Here's socket_free() for reference: http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/network/stack/net_socket.cpp#n460
I suppose gcc inlined an atomic_add() in lieu of ReleaseReference().
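As a rough illustration of why a reference release can show up as a bare atomic_add() in the backtrace, here is a minimal sketch of a reference-counted object. It uses std::atomic rather than the kernel's atomic_add(), and the class SketchReferenceable is a hypothetical stand-in, not Haiku's implementation. The point is only that the release path is small enough for the compiler to inline down to a single atomic add of -1.

```cpp
#include <atomic>
#include <cassert>
#include <stdint.h>

// Hypothetical stand-in for a reference-counted object. The method names
// mirror the ones mentioned in this ticket, but the code is an
// illustration only.
class SketchReferenceable {
public:
	SketchReferenceable() : fReferenceCount(1) {}

	// Returns the count before the increment.
	int32_t AcquireReference()
	{
		return fReferenceCount.fetch_add(1);
	}

	// Returns the count before the decrement. The body is one atomic add
	// of -1, which a compiler can easily inline into the caller. A real
	// implementation would destroy the object when the previous count
	// was 1; that step is left out here.
	int32_t ReleaseReference()
	{
		int32_t previous = fReferenceCount.fetch_sub(1);
		// Releasing more often than acquiring is the double free case.
		assert(previous > 0);
		return previous;
	}

	int32_t CountReferences() const
	{
		return fReferenceCount.load();
	}

private:
	std::atomic<int32_t> fReferenceCount;
};
```

If the object had already been freed, the counter read through a stale pointer would be whatever the allocator left in that memory, which fits the fault address seen in this ticket.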