#2594 closed bug (fixed)
PANIC: ASSERT FAILED (src/add-ons/kernel/network/protocols/tcp/BufferQueue.cpp:304): buffer != __null
Reported by: | stippi | Owned by: | axeld |
---|---|---|---|
Priority: | high | Milestone: | R1/alpha1 |
Component: | Network & Internet/TCP | Version: | R1/pre-alpha1 |
Keywords: | | Cc: | mattmadia@… |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
Revision is hrev26680, with no kernel patches from newer revisions applied. I left the system running overnight with a Transmission download (0.70 for BONE version). When I returned to it in the morning, it showed the above panic. Here is the backtrace:
stack trace for thread 2681 "torrent 0x18084e10"
...
<kernel>:panic
</boot/beos/.../protocols/tcp>:Get__11BufferQueueUlbPP10net_buffer + 0x0069
</boot/beos/.../protocols/tcp>:ReadData__11TCPEndpointUlUlPP10net_buffer + 0x0307
</boot/beos/.../protocols/tcp>:tcp_read_data__FP12net_protocolUlUlPP10net_buffer + 0x0029
</boot/beos/.../network/stack>:socket_receive__FP10net_socketP6msghdrPvUli + 0x0087
</boot/beos/.../network/stack>:stack_interface_recvfrp__FP10net_socketPvUliP8sockaddrPUi + 0x008c
<kernel>:common_recvfrom__FiPvUliP8sockaddrPUib + 0x0055
<kernel>:_user_recvfrom + 0x0091
syscall stuff
...
Attachments (7)
Change History (34)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
Priority: | normal → high |
---|---|
Summary: | [TPC] PANIC: ASSERT FAILED (src/add-ons/kernel/network/protocols/tcp/BufferQueue.cpp:304): buffer != __null → PANIC: ASSERT FAILED (src/add-ons/kernel/network/protocols/tcp/BufferQueue.cpp:304): buffer != __null |
comment:4 by , 16 years ago
Cc: | added |
---|
I too can reproduce this KDL reliably with the same version of Transmission on hrev28822.
Adding almost complete output of bt.
What other commands should I run while in KDL?
by , 16 years ago
Attachment: | kdl_BufferQueue.txt added |
---|
comment:5 by , 16 years ago
I don't know the kernel debugger commands for the network stack; Axel will probably tell you more. In the meantime, it might be useful to enable the debug output. That's done by uncommenting line 15 in src/add-ons/kernel/network/protocols/tcp/BufferQueue.cpp.
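For readers unfamiliar with this kind of switch, here is a minimal sketch of a compile-time trace macro that is turned on by uncommenting a single #define. It is an illustration only: the macro names TRACE_BUFFER_QUEUE and TRACE, the use of std::printf, and the helper functions are assumptions, not a copy of what line 15 of BufferQueue.cpp contains.

```cpp
#include <cstdio>

// Hypothetical switch: uncomment this line to compile the trace output in.
//#define TRACE_BUFFER_QUEUE

#ifdef TRACE_BUFFER_QUEUE
	// A kernel module would call a kernel print function here; std::printf
	// is only a userland stand-in for this sketch.
#	define TRACE(x) std::printf x
#else
	// With the switch commented out, trace statements compile to nothing.
#	define TRACE(x)
#endif

// Reports whether tracing was compiled in.
bool tracing_enabled()
{
#ifdef TRACE_BUFFER_QUEUE
	return true;
#else
	return false;
#endif
}

// Hypothetical helper that traces its arguments and returns the sequence
// number that follows the segment.
unsigned add_segment(unsigned sequence, unsigned size)
{
	TRACE(("add_segment: sequence %u, size %u\n", sequence, size));
	return sequence + size;
}
```

The double parentheses in TRACE((...)) let the whole argument list pass through as a single macro parameter, so the disabled variant costs nothing at run time.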
comment:6 by , 16 years ago
Status: | new → assigned |
---|
comment:7 by , 16 years ago
by , 16 years ago
Attachment: | BufferQueue.cpp.1.diff added |
---|
comment:10 by , 16 years ago
I checked with the old code plus Verify() (using trace messages instead of panics) and found that the buffer queue gets broken many minutes before the buffer != null panic happens.
Also, the bug would probably corrupt the data sent through TCP, because data from some segments would be duplicated in the buffer queue.
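As a rough illustration of the kind of consistency check described above, the following sketch walks a list of queued segments and logs the first place where one segment overlaps (and so would duplicate data from) the next, instead of panicking. The Segment struct and verify_queue() are hypothetical simplifications, not the actual Verify() from the patch, and the sketch ignores TCP sequence number wraparound.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical, simplified stand-in for a queued TCP segment.
struct Segment {
	uint32_t sequence;	// sequence number of the first payload byte
	uint32_t size;		// number of payload bytes
};

// Returns true if every segment ends at or before the start of the next one,
// i.e. the queue is ordered and no bytes are stored twice. Logs the first
// violation rather than panicking. Sequence wraparound is not handled.
bool verify_queue(const std::vector<Segment>& queue)
{
	for (size_t i = 0; i + 1 < queue.size(); i++) {
		// Use 64-bit arithmetic so sequence + size cannot overflow.
		uint64_t end = (uint64_t)queue[i].sequence + queue[i].size;
		if (end > queue[i + 1].sequence) {
			std::printf("verify: segment %zu ends at %llu but next starts "
				"at %u\n", i, (unsigned long long)end,
				(unsigned)queue[i + 1].sequence);
			return false;
		}
	}
	return true;
}
```

Running a check like this after every queue modification is what makes it possible to see the corruption at the moment it is introduced, rather than minutes later when a read finally trips over it.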
comment:11 by , 16 years ago
Thanks Adrian! I've cleaned up your patch, and fixed a few more problems in hrev28878. I will look into writing a test for BufferQueue next week, though, to make sure it's really okay now (seeing how many bugs proof-reading revealed already).
I will close this bug once the test app is in place.
by , 16 years ago
Attachment: | img_1568.jpg added |
---|
comment:13 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Everything should work fine now, with hrev28883.
comment:15 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
How exactly can you reproduce this so fast? :-)
Thanks for the note, it seems to be really hard to get that right. I fixed one occurrence of that assert with the test app, so I thought I got it. At least that's the least important assert :-)
comment:16 by , 16 years ago
I can reproduce it in just a few minutes with Transmission + 30 torrents :-)
follow-up: 18 comment:17 by , 16 years ago
Would a Haiku binary of v0.7x help with testing? Adek336, does this still occur with the v1.42 at http://www.haikuware.com/view-details/development/app-installation/transmission-142 ?
comment:18 by , 16 years ago
Replying to mmadia:
Would a Haiku binary of v0.7x help with testing? Adek336, does this still occur with the v1.42 at http://www.haikuware.com/view-details/development/app-installation/transmission-142 ?
Indeed, it happens with both Transmission 0.4 and 1.42, and the bug is clearly on the kernel side of things. (By the way, the Transmission 1.42 daemon crashes into the userland debugger very quickly; does it crash as quickly for you as well?)
comment:20 by , 16 years ago
buffer == null | next->seque". |
by , 16 years ago
Photograph of bt in KDL, AMD X2 CPU, both cores enabled.
by , 16 years ago
Attachment: | smp-disabled.jpg added |
---|
Photograph of bt in KDL, AMD X2 CPU, SMP disabled via the boot options menu.
comment:21 by , 16 years ago
Added two photographs of the resulting KDL. Tested on revision hrev28947~49 using Transmission 0.70-bone.
comment:22 by , 16 years ago
Milestone: | R1 → R1/alpha1 |
---|
Maybe I should just stop adding new assertions to the code ;-)
Next time that happens, what's more interesting than a stack crawl is a dump of the buffer passed in and the buffer queue itself (via dumping the TCP connection).
by , 16 years ago
Attachment: | img_1845.jpg added |
---|
comment:24 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Thanks, Adrian! Turns out I tried to reproduce the bug with a version of the module that didn't have the assert enabled...
Anyway, it's fixed now, since hrev28958 - the assert was just wrong.
by , 16 years ago
Attachment: | img_1846.jpg added |
---|
comment:25 by , 16 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
There's still an empty buffer there.
comment:26 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Thanks, fixed in hrev28967, finally.
I guess you didn't use the debugger to investigate the issue a bit? The tcp module has some useful KDL commands for problems like that :-)
In any case, it's an interesting bug. It seems to be an internal bug in the BufferQueue, maybe caused by a lack of memory, but it could have some other cause, too. So I hope this happens again, preferably to me this time.
How large was the downloaded file, and how fast was it?