Opened 15 years ago
Last modified 5 years ago
#5949 new bug
KDL: Could not read block 18110: bytesRead: -1, error: Operation timed out
Reported by: | stippi | Owned by: | axeld |
---|---|---|---|
Priority: | low | Milestone: | R1 |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | Cc: | dziadek@… | |
Blocked By: | Blocking: | #9578 | |
Platform: | All |
Description
I've searched the tickets, I believe this may be a new issue.
The system (rXXXXX) was running for half a day. I attached a harddisk via USB 2.0, connected via FTP from another machine and copied a file of about 700 MB to a BFS partition on the USB disk, when I returned a while later, the machine had KDL'd. This is the backtrace:
get_cached_block(block_cache*: 0xdeb9fa63, int64: -2397609950474928127, bool*: 0x80040e10, true) + 0x0296 block_cache_get_etc + 0x004c block_cache_get + 0x001c CachedNode<...>::InternalSetTo(Transaction*: NULL, int64: 57344) CachedNode<...>::SetTo() BPlusTree<...>::_SeekDown(Stack...) BPlusTree<...>::Insert(Transaction&: ...) Index<...>::Update(Transaction&: ...) Index<...>::UpdateSize(Transaction&: ...) bfs_free_cookie(fs_volume*: 0x829f42f8, fs_vnode*: 0xd4c69ae4, void*: 0xcce31898) file_free_fd(file_descriptor*: 0xd182b0c8) put_fd FDGetter::~FDGetter() common_user_io _user_write handle_syscall ...syscall stuff... <_APP_>:statfilecmd (nearest) <_APP_>:store <_APP_>:yyparse <_APP_>:main <_APP_>:_start
Typed this by hand and will now try to use the new syslog boot loader features to retrieve the real thing... :-)
Attachments (2)
Change History (12)
comment:1 by , 15 years ago
comment:2 by , 15 years ago
Yes, seems like I didn't have this enabled by default. I could only enable the feature in the boot loader, not get the syslog from the previous session. :-\ Sorry.
comment:3 by , 12 years ago
Hi, I experienced similar error today when I tried to copy files to USB stick with BFS partition. I run Haiku hrev... in VirtualBox in Windows 7 64-bit. VM has 1536MBs of RAM and 2 CPU cores. Below is KDL output, but please look at the serial.txt file as it contains much more valuable information what was going before! The error shown here is "Device not ready", however previously I got "Operation timed out" as well. I can reproduce bug, so I am able to provide more info, or even send VM snapshot if desired.
PANIC: could not read block 3145815: bytesRead: -1, error: Device not ready Welcome to Kernel Debugging Land... Thread 379 "MoveTask" running on CPU 1 stack trace for thread 379 "MoveTask" kernel stack: 0x81417000 to 0x8141b000 user stack: 0x781db000 to 0x7821b000 frame caller <image>:function + offset 0 8141a908 (+ 32) 80131b3e <kernel_x86> arch_debug_stack_trace + 0x12 1 8141a928 (+ 16) 800927af <kernel_x86> stack_trace_trampoline(NULL) + 0x0b 2 8141a938 (+ 12) 801247d2 <kernel_x86> arch_debug_call_with_fault_handler + 0x1b 3 8141a944 (+ 48) 80094252 <kernel_x86> debug_call_with_fault_handler + 0x5e 4 8141a974 (+ 64) 800929cf <kernel_x86> kernel_debugger_loop([34m0x80170997[0m [36m"PANIC: "[0m, [34m0x8014a1a0[0m [36m"could not read block %lld: bytesRead: %zd, error: %s "[0m, [34m0x8141aa20[0m [36m"W"[0m, int32: [34m1[0m) + 0x21b 5 8141a9b4 (+ 48) 80092d33 <kernel_x86> kernel_debugger_internal([34m0x80170997[0m [36m"PANIC: "[0m, [34m0x8014a1a0[0m [36m"could not read block %lld: bytesRead: %zd, error: %s "[0m, [34m0x8141aa20[0m [36m"W"[0m, int32: [34m1[0m) + 0x53 6 8141a9e4 (+ 48) 800945de <kernel_x86> panic + 0x36 7 8141aa14 (+ 112) 80047910 <kernel_x86> get_cached_block(block_cache*: [34m0xcdcb93d8[0m, int64: [34m3145815[0m, [34m0x8141aac3[0m, [34mtrue[0m) + 0x264 8 8141aa84 (+ 64) 800479ad <kernel_x86> get_writable_cached_block(block_cache*: [34m0xcdcb93d8[0m, int64: [34m3145815[0m, int64: [34m3145815[0m, int64: [34m1[0m, int32: [34m1[0m, [34mfalse[0m) + 0x61 9 8141aac4 (+ 80) 8004b4ce <kernel_x86> block_cache_get_writable_etc + 0x76 10 8141ab14 (+ 64) 8004b51e <kernel_x86> block_cache_get_writable + 0x22 11 8141ab54 (+ 96) 8152433c <bfs> CachedNode<[32m0x8141ac28[0m>::InternalSetTo(Transaction*: [34m0x8141adc4[0m, int64: [34m0[0m) + 0x128 12 8141abb4 (+ 64) 815241a8 <bfs> CachedNode<[32m0x8141ac28[0m>::SetToWritableHeader(Transaction&: [34m0x8141adc4[0m) + 0x68 13 8141abf4 (+ 80) 8152497f <bfs> BPlusTree<[32m0xea6c4d90[0m>::SetTo(Transaction&: [34m0x8141adc4[0m, Inode*: [34m0xea6c5e38[0m, int32: [34m1024[0m) + 0xaf 14 8141ac44 (+ 64) 81524731 <bfs> __9BPlusTreeR11TransactionP5Inodel + 0x65 15 8141ac84 (+ 80) 8152dbcd <bfs> InodeAllocator<[32m0x8141ad4c[0m>::CreateTree([34m0x8141adc4[0m) + 0x69 16 8141acd4 (+ 144) 81534ea3 <bfs> Inode<[32m0x8141adc4[0m>::Create(Transaction&: [34m0xea6c30b0[0m, Inode*: [34m0x8141ae04[0m, [34m0x41ed[0m [31m"<???>"[0m, int32: [34m0[0m, int32: [34m0[0m, uint32: [34m0x0[0m ([34m0[0m), [34m0x8141adbc[0m, NULL, Inode*: NULL, fs_vnode_ops*: NULL, uint32: [34m0x0[0m ([34m0[0m)) + 0x4ff 17 8141ad64 (+ 112) 8154094d <bfs> bfs_create_dir(fs_volume*: [34m0xce01a930[0m, fs_vnode*: [34m0xea6bd018[0m, [34m0x8141ae04[0m [36m"LC_MESSAGES"[0m, int32: [34m493[0m) + 0x109 18 8141add4 (+ 304) 800e0810 <kernel_x86> dir_create(int32: [34m41[0m, [34m0xcddd1938[0m [36m"."[0m, int32: [34m493[0m, [34mfalse[0m) + 0x68 19 8141af04 (+ 64) 800e6790 <kernel_x86> _user_create_dir + 0x98 20 8141af44 (+ 100) 80127470 <kernel_x86> handle_syscall + 0xcd user iframe at 0x8141afa8 (end = 0x8141b000) eax 0x70 ebx 0x4f14c0 ecx 0x78219870 edx 0xffff0114 esi 0x78219998 edi 0x78219dd8 ebp 0x7821989c esp 0x8141afdc eip 0xffff0114 eflags 0x3206 user esp 0x78219870 vector: 0x63, error code: 0x0 21 8141afa8 (+ 0) ffff0114 <commpage> commpage_syscall + 0x04 22 7821989c (+ 704) 005d50a0 <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x4ec 23 78219b5c (+ 720) 005d5312 <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e 24 78219e2c (+ 720) 005d5312 <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e 25 7821a0fc (+ 720) 005d5312 <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e 26 7821a3cc (+ 720) 005d5312 <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e 27 7821a69c (+1712) 005d625b <libtracker.so> BPrivate<[32m0x7821add0[0m>::MoveItem(BEntry*: [34m0x7821ae34[0m, BDirectory*: [34m0xffffffff[0m, BPoint*: [34m0x5463736c[0m, uint32: [34m0x0[0m ([34m0[0m), [34m0x7821ae00[0m [36m"đv/€Ěn"[0m, BPrivate::Undo&: [34m0x7821ae04[0m, BPrivate::CopyLoopControl*: [34m0x5d2f35[0m) + 0x86f 28 7821ad4c (+ 464) 005d38a5 <libtracker.so> delete_point__8BPrivatePv (nearest) + 0x9a9 29 7821af1c (+ 48) 005e13bf <libtracker.so> __cl__Q28BPrivatet33FourParamFunctionObjectWorkaround4ZPt11BObjectList1Z9entry_refZP6BEntryZP5BListZUl + 0x2b 30 7821af4c (+ 48) 0066fe57 <libtracker.so> BPrivate::Thread<[32m0x182124f8[0m>::Run([34m0x0[0m) + 0x2b 31 7821af7c (+ 48) 0066fc9c <libtracker.so> BPrivate::SimpleThread<[32m0x182124f8[0m>::RunBinder(NULL) + 0x28 32 7821afac (+ 48) 00724893 <libroot.so> _get_next_team_info (nearest) + 0x5f 33 7821afdc (+ 0) 7821afec 5299:MoveTask_379_stack@0x781d7000 + 0x43fec kdebug>
Few lines from serial output:
PageWriteWrapper: Failed to write page 0x83024dcc: General system error PageWriteWrapper: Failed to write page 0x83024e00: General system error PageWriteWrapper: Failed to write page 0x83024e34: General system error usb error ohci -1: td error: 0x00000005 usb error ohci -1: td error: 0x00000004 usb hub 2: port 0: device removed usb hub 2: port 0 disabled usb_disk: receiving the command status wrapper failed PANIC: could not read block 3145815: bytesRead: -1, error: Device not ready
comment:4 by , 12 years ago
Cc: | added |
---|
comment:5 by , 12 years ago
I've added the actual error message to the debug output in hrev45402. Please try to reproduce the problem and attach an updated serial log. This time it should actually tell why "sending or receiving of the data failed"...
by , 12 years ago
Attachment: | serial-hrev45437.txt added |
---|
I updated my system to latest revision and reproduced a bug, so here's serial output. Unfortunately I think it doesn't bring any news, only error messages such as: "No error" or "General system error"
comment:6 by , 12 years ago
Now I have also experienced this error while copying files to FAT32 partition, so it seems it's not connected with BFS support.
comment:7 by , 12 years ago
The updated log actually does bring news. The error that is printed is the status code returned from the USB stack. This is "No error" and means that from the USB stack side of things worked out well, i.e. the transfer did not fail per se. The other condition to trigger the error code path is that the reported transfer length does not match the scheduled/expected transfer length. This can be a problem on many levels. The actual length is the sum of all the lengths of each indivial packet sent/received by the USB bus driver, so it is possible that there is a bug causing this to be miscalculated, or USB packets might indeed not be scheduled properly at all.
It is also possible that this is a defect in VirtualBox. It is common for emulated hardware to sometimes take shortcuts and not fully emulate a certain device which might lead to problems with "by the book" drivers like our EHCI one. It might also just be a bug that is only triggered by the pattern in which our EHCI driver does things that hasn't been uncovered yet due to other systems working differently. Since I have seen this particular combination of errors only reported with VirtualBox I tend in that direction.
Generally the issues reported here are two different ones. The timeout is usually really just that, a timeout. It is usually caused by transfers being so large that they simply have to exceed the relatively short timeout. The timeout is static and doesn't take bus speed or bandwidth constraints into account. Hence you are almost guaranteed to run into it when transferring large files with USB 1.1 as it simply is too slow to ever make it within that timeout. It is also possible that if the bus is busy due to other devices, the transfer doesn't complete quickly enough.
The issue of "sending or receiving of the data failed" as explained above, is different from the timeout. I would suggest opening a new ticket for that issue. Please attach the VirtualBox log files to that new ticket, they might indicate what's going on from the VirtualBox side. For example the EHCI driver might try something VirtualBox doesn't expect or it might do something plain buggy that hardware EHCI implementations just handle silently.
comment:8 by , 7 years ago
Blocking: | 9578 added |
---|
comment:9 by , 7 years ago
I ran into this some time ago, and found a single file that reliably reproduced the bug. That's attached in #9578, in case someone else wants to take a look.
comment:10 by , 5 years ago
Priority: | normal → low |
---|
I suppose getting the syslog failed for some reason?