Opened 17 years ago
Closed 16 years ago
#2143 closed bug (fixed)
KDL in net timer.
Reported by: | bga | Owned by: | axeld |
---|---|---|---|
Priority: | high | Milestone: | R1/alpha1 |
Component: | Network & Internet/Stack | Version: | R1/pre-alpha1 |
Keywords: | Cc: | tqh | |
Blocked By: | Blocking: | ||
Platform: | All |
Description
This is with the latest revision, hrev25113. I got a KDL in the net timer thread when starting Firefox. I would guess this is related to Ingo's recent changes, but I am not sure.
Attachments (1)
Change History (13)
by , 17 years ago
comment:1 by , 17 years ago
Component: | System/Kernel → Network & Internet/Stack |
---|---|
Milestone: | R1 → R1/alpha1 |
Owner: | changed from | to
Can you reproduce this? When I tried, Firefox started without problem and I could visit Haiku's website.
I wouldn't rule out that my changes are to blame, but until that is proven I'm moving this bug to the net stack component.
After a quick glance at the code, I saw that uninit_timers() doesn't look good, BTW. Destroying a benaphore won't really be noticed by anyone unless one locks it before doing that. Even then I'm not sure that there are no races with the timer thread.
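The concern above can be sketched in portable C++. This is only an illustration under stated assumptions, not the net stack's code: `TimerWorker`, `Stop()` and `HasExited()` are hypothetical names, and `std::mutex` with `std::condition_variable` stands in for the benaphore. The point it shows is that the shutdown path sets a quit flag while holding the lock, wakes the worker, and waits for it to exit before anything is destroyed, so the worker's check can actually fire.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a timer thread and its lock; none of these
// names are taken from the Haiku sources.
class TimerWorker {
public:
    TimerWorker() : fThread([this] { _Run(); }) {}

    ~TimerWorker() { Stop(); }

    // Shut down in an order the worker can observe: set the flag under the
    // lock, wake the worker, then join it before any member is destroyed.
    void Stop()
    {
        {
            std::lock_guard<std::mutex> locker(fLock);
            fQuit = true;
        }
        fWakeUp.notify_all();
        if (fThread.joinable())
            fThread.join();
    }

    // True once the worker loop has seen the quit flag and returned.
    bool HasExited()
    {
        std::lock_guard<std::mutex> locker(fLock);
        return fExited;
    }

private:
    void _Run()
    {
        std::unique_lock<std::mutex> locker(fLock);
        while (!fQuit) {
            // Sleep until woken or until the next (made-up) 10 ms deadline.
            fWakeUp.wait_for(locker, std::chrono::milliseconds(10));
        }
        fExited = true;
    }

    std::mutex fLock;
    std::condition_variable fWakeUp;
    bool fQuit = false;
    bool fExited = false;
    // Declared last so the lock and flags exist before the thread starts.
    std::thread fThread;
};
```

Calling `Stop()` on a `TimerWorker` returns only after the loop has exited, which is the property a bare "destroy the lock" teardown does not give.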
follow-up: 5 comment:2 by , 17 years ago
uninit_timers() is only called when the stack is being unloaded - at that time, no protocol should be active anymore (and therefore, no timers). So that's at least unrelated to the problem we're seeing here :-)
comment:3 by , 17 years ago
Cc: | added |
---|
comment:5 by , 17 years ago
Replying to axeld:
uninit_timers() is only called when the stack is being unloaded - at that time, no protocol should be active anymore (and therefore, no timers). So that's at least unrelated to the problem we're seeing here :-)
I didn't think it was related. I just found it weird that the timer thread checks the acquisition of the benaphore at all, while uninit_timers() doesn't do anything that would ever trigger this check in the first place.
comment:6 by , 17 years ago
Probably related (hrev25537):
PANIC: vm_page_fault: unhandled page fault in kernel space at 0xdeadbeef, ip 0xdeadbeef
Welcome to Kernel Debugging Land...
Running on CPU 0
kdebug> sc
stack trace for thread 66 "net timer"
kernel stack: 0x807ca000 to 0x807ce000
frame caller <image>:function + offset
807cdc20 (+ 48) 8004e69b <kernel>:invoke_debugger_command + 0x00cf
807cdc50 (+ 64) 8004f444 <kernel>:_ParseCommand__16ExpressionParserRi + 0x01f8
807cdc90 (+ 48) 8004ee36 <kernel>:EvaluateCommand__16ExpressionParserPCcRi + 0x01de
807cdcc0 (+ 224) 80050558 <kernel>:evaluate_debug_command + 0x0088
807cdda0 (+ 64) 8004d1d6 <kernel>:kernel_debugger_loop__Fv + 0x017a
807cdde0 (+ 48) 8004de89 <kernel>:kernel_debugger + 0x010d
807cde10 (+ 192) 8004dd71 <kernel>:panic + 0x0029
807cded0 (+ 64) 8009acef <kernel>:vm_page_fault + 0x00ab
807cdf10 (+ 64) 800a4de1 <kernel>:page_fault_exception + 0x00b1
807cdf50 (+ 12) 800a857d <kernel>:int_bottom + 0x001d (nearest)
iframe at 0x807cdf5c (end = 0x807cdfb4)
eax 0xdeadbeef ebx 0x807c9498 ecx 0x9115f000 edx 0x200246
esi 0x91be9e6c edi 0x0 ebp 0x807cdfd8 esp 0x807cdf90
eip 0xdeadbeef eflags 0x210287
vector: 0xe, error code: 0x0
807cdf5c (+ 124) deadbeef
807cdfd8 (+ 32) 800446ef <kernel>:_create_kernel_thread_kentry__Fv + 0x001b
807cdff8 (+2139299848) 80044684 <kernel>:thread_kthread_exit__Fv + 0x0000
If TCP does indeed use the timers as explained on the commit mailing list recently, it might be a good idea to remove the race condition, as it perfectly explains these kinds of problems.
comment:7 by , 16 years ago
I haven't run into this again. Is this still an issue for anyone? If not, maybe it can be closed.
comment:8 by , 16 years ago
I'm the only cc, and I've never seen it. I just watch for Firefox-related problems. So I think it can be closed.
comment:9 by , 16 years ago
The original problem still persists and is waiting for a fix; I just haven't gotten around to doing it.
comment:10 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:11 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Hi,
I got two KDLs on hrev30140, both very similar (I would say the same). So I think I'm getting close to a way to reproduce it. It involves the network preflet and the ftp command line client.
I get this error:
ARP host 0294a8c0 updated with different hardware address 00:0c:29:4a:0e:7d.
ARP host 0294a8c0 updated with different hardware address 00:0c:29:4a:0e:7d.
vm_soft_fault: va 0xdeadb000 not covered by area in address space
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0xdeadbef3, ip 0x800bac37, write 1, user 0, thread 0x38
PANIC: vm_page_fault: unhandled page fault in kernel space at 0xdeadbef3, ip 0x800bac37
Welcome to Kernel Debugging Land...
Thread 56 "net timer" running on CPU 0
kdebug> bt
stack trace for thread 56 "net timer"
kernel stack: 0x8023d000 to 0x80241000
frame caller <image>:function + offset
0 80240b6c (+ 48) 80060271 <kernel_x86>:invoke_debugger_command + 0x00f5
1 80240b9c (+ 64) 80060061 <kernel_x86> invoke_pipe_segment(debugger_command_pipe*: 0x8012a220, int32: 0, 0x0 "<NULL>") + 0x0079
2 80240bdc (+ 64) 800603e8 <kernel_x86>:invoke_debugger_command_pipe + 0x009c
3 80240c1c (+ 48) 80061998 <kernel_x86> ExpressionParser<0x80240cd0>::_ParseCommandPipe(0x80240ccc) + 0x0234
4 80240c4c (+ 64) 80060dd2 <kernel_x86> ExpressionParser<0x80240cd0>::EvaluateCommand(0x8011ab60 "bt", 0x80240ccc) + 0x02ba
5 80240c8c (+ 224) 80062dc0 <kernel_x86>:evaluate_debug_command + 0x0088
6 80240d6c (+ 64) 8005e162 <kernel_x86> kernel_debugger_loop() + 0x01ae
7 80240dac (+ 32) 8005eff1 <kernel_x86>:kernel_debugger + 0x004d
8 80240dcc (+ 192) 8005ef99 <kernel_x86>:panic + 0x0029
9 80240e8c (+ 80) 800c1c31 <kernel_x86>:vm_page_fault + 0x0139
10 80240edc (+ 64) 800d1c1d <kernel_x86>:page_fault_exception + 0x00d9
11 80240f1c (+ 12) 800d5316 <kernel_x86>:int_bottom + 0x0036
kernel iframe at 0x80240f28 (end = 0x80240f78)
eax 0x81025e40 ebx 0x80567568 ecx 0xdeadbeef edx 0xdeadbeef
esi 0x80567c94 edi 0x81025e40 ebp 0x80240f78 esp 0x80240f5c
eip 0x800bac37 eflags 0x10282
vector: 0xe, error code: 0x2
12 80240f28 (+ 80) 800bac37 <kernel_x86>:list_remove_link + 0x000b
13 80240f78 (+ 32) 800bad1c <kernel_x86>:list_remove_item + 0x0018
14 80240f98 (+ 64) 8056457a </boot/system/add-ons/kernel/network/stack> timer_thread(NULL) + 0x009a
15 80240fd8 (+ 32) 800548ff <kernel_x86> _create_kernel_thread_kentry() + 0x001b
16 80240ff8 (+2145120264) 8005489c <kernel_x86> thread_kthread_exit() + 0x0000
Basically, the connection between my Haiku guest and my Ubuntu host gets lost while I'm in FTP, I open the network preflet to change the IP address (probably useless), and tada, KDL.
I'm not sure if it's really related to this ticket, but I think it's pretty close (net timer thread, etc..)
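One possible reading of the numbers in this trace, offered as a sketch rather than a diagnosis: the fault is a write to 0xdeadbef3 inside list_remove_link, with ecx and edx both holding 0xdeadbeef. If the list link is a plain pair of next/prev pointers (an assumption about the layout, modelled below as the hypothetical `Link32`), and both fields of the timer's link had been overwritten with 0xdeadbeef (a pattern commonly used to poison freed memory), then unlinking it would first write to `next->prev`, which on 32-bit x86 sits 4 bytes past `next`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of a two-pointer list link as it would be laid out on
// 32-bit x86 (4-byte pointers). The field order is an assumption.
struct Link32 {
    uint32_t next;
    uint32_t prev;
};

// Poison value seen in ecx/edx in the trace.
constexpr uint32_t kPoison = 0xdeadbeef;

// Address that an unlink step of the form "link->next->prev = link->prev"
// would write to for the given link.
constexpr uint32_t FirstWriteAddress(const Link32& link)
{
    return link.next + static_cast<uint32_t>(offsetof(Link32, prev));
}
```

With both fields set to `kPoison`, `FirstWriteAddress()` yields 0xdeadbef3, the fault address reported above. That would be consistent with the timer thread removing an entry whose memory was no longer valid, though the trace alone does not prove it.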
comment:12 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Even though it's the same stack trace, this is a whole different bug. If you're not sure, making a comment without reopening the bug would be preferred.
I'm currently looking into that bug in particular, btw, so there is no need to open another ticket for this.
KDL image.