Opened 9 years ago

Last modified 8 weeks ago

#5949 new bug

KDL: Could not read block 18110: bytesRead: -1, error: Operation timed out

Reported by: stippi Owned by: axeld
Priority: low Milestone: R1
Component: System/Kernel Version: R1/Development
Keywords: Cc: dziadek@…
Blocked By: Blocking: #9578
Has a Patch: no Platform: All

Description

I've searched the tickets, I believe this may be a new issue.

The system (rXXXXX) was running for half a day. I attached a harddisk via USB 2.0, connected via FTP from another machine and copied a file of about 700 MB to a BFS partition on the USB disk, when I returned a while later, the machine had KDL'd. This is the backtrace:

get_cached_block(block_cache*: 0xdeb9fa63, int64: -2397609950474928127, bool*: 0x80040e10, true) + 0x0296
block_cache_get_etc + 0x004c
block_cache_get + 0x001c
CachedNode<...>::InternalSetTo(Transaction*: NULL, int64: 57344)
CachedNode<...>::SetTo()
BPlusTree<...>::_SeekDown(Stack...)
BPlusTree<...>::Insert(Transaction&: ...)
Index<...>::Update(Transaction&: ...)
Index<...>::UpdateSize(Transaction&: ...)
bfs_free_cookie(fs_volume*: 0x829f42f8, fs_vnode*: 0xd4c69ae4, void*: 0xcce31898)
file_free_fd(file_descriptor*: 0xd182b0c8)
put_fd
FDGetter::~FDGetter()
common_user_io
_user_write
handle_syscall
...syscall stuff...
<_APP_>:statfilecmd (nearest)
<_APP_>:store
<_APP_>:yyparse
<_APP_>:main
<_APP_>:_start

Typed this by hand and will now try to use the new syslog boot loader features to retrieve the real thing... :-)

Attachments (2)

serial.txt (61.2 KB ) - added by DziadekAlzheimer 7 years ago.
Serial output from system where the error occured
serial-hrev45437.txt (62.4 KB ) - added by DziadekAlzheimer 6 years ago.
I updated my system to latest revision and reproduced a bug, so here's serial output. Unfortunately I think it doesn't bring any news, only error messages such as: "No error" or "General system error"

Download all attachments as: .zip

Change History (12)

comment:1 by bonefish, 9 years ago

I suppose getting the syslog failed for some reason?

comment:2 by stippi, 9 years ago

Yes, seems like I didn't have this enabled by default. I could only enable the feature in the boot loader, not get the syslog from the previous session. :-\ Sorry.

comment:3 by DziadekAlzheimer, 7 years ago

Hi, I experienced similar error today when I tried to copy files to USB stick with BFS partition. I run Haiku hrev... in VirtualBox in Windows 7 64-bit. VM has 1536MBs of RAM and 2 CPU cores. Below is KDL output, but please look at the serial.txt file as it contains much more valuable information what was going before! The error shown here is "Device not ready", however previously I got "Operation timed out" as well. I can reproduce bug, so I am able to provide more info, or even send VM snapshot if desired.

PANIC: could not read block 3145815: bytesRead: -1, error: Device not ready

Welcome to Kernel Debugging Land...
Thread 379 "MoveTask" running on CPU 1
stack trace for thread 379 "MoveTask"
    kernel stack: 0x81417000 to 0x8141b000
      user stack: 0x781db000 to 0x7821b000
frame               caller     <image>:function + offset
 0 8141a908 (+  32) 80131b3e   <kernel_x86> arch_debug_stack_trace + 0x12
 1 8141a928 (+  16) 800927af   <kernel_x86> stack_trace_trampoline(NULL) + 0x0b
 2 8141a938 (+  12) 801247d2   <kernel_x86> arch_debug_call_with_fault_handler + 0x1b
 3 8141a944 (+  48) 80094252   <kernel_x86> debug_call_with_fault_handler + 0x5e
 4 8141a974 (+  64) 800929cf   <kernel_x86> kernel_debugger_loop([34m0x80170997[0m [36m"PANIC: "[0m, [34m0x8014a1a0[0m [36m"could not read block %lld: bytesRead: %zd, error: %s
"[0m, [34m0x8141aa20[0m [36m"W"[0m, int32: [34m1[0m) + 0x21b
 5 8141a9b4 (+  48) 80092d33   <kernel_x86> kernel_debugger_internal([34m0x80170997[0m [36m"PANIC: "[0m, [34m0x8014a1a0[0m [36m"could not read block %lld: bytesRead: %zd, error: %s
"[0m, [34m0x8141aa20[0m [36m"W"[0m, int32: [34m1[0m) + 0x53
 6 8141a9e4 (+  48) 800945de   <kernel_x86> panic + 0x36
 7 8141aa14 (+ 112) 80047910   <kernel_x86> get_cached_block(block_cache*: [34m0xcdcb93d8[0m, int64: [34m3145815[0m, [34m0x8141aac3[0m, [34mtrue[0m) + 0x264
 8 8141aa84 (+  64) 800479ad   <kernel_x86> get_writable_cached_block(block_cache*: [34m0xcdcb93d8[0m, int64: [34m3145815[0m, int64: [34m3145815[0m, int64: [34m1[0m, int32: [34m1[0m, [34mfalse[0m) + 0x61
 9 8141aac4 (+  80) 8004b4ce   <kernel_x86> block_cache_get_writable_etc + 0x76
10 8141ab14 (+  64) 8004b51e   <kernel_x86> block_cache_get_writable + 0x22
11 8141ab54 (+  96) 8152433c   <bfs> CachedNode<[32m0x8141ac28[0m>::InternalSetTo(Transaction*: [34m0x8141adc4[0m, int64: [34m0[0m) + 0x128
12 8141abb4 (+  64) 815241a8   <bfs> CachedNode<[32m0x8141ac28[0m>::SetToWritableHeader(Transaction&: [34m0x8141adc4[0m) + 0x68
13 8141abf4 (+  80) 8152497f   <bfs> BPlusTree<[32m0xea6c4d90[0m>::SetTo(Transaction&: [34m0x8141adc4[0m, Inode*: [34m0xea6c5e38[0m, int32: [34m1024[0m) + 0xaf
14 8141ac44 (+  64) 81524731   <bfs> __9BPlusTreeR11TransactionP5Inodel + 0x65
15 8141ac84 (+  80) 8152dbcd   <bfs> InodeAllocator<[32m0x8141ad4c[0m>::CreateTree([34m0x8141adc4[0m) + 0x69
16 8141acd4 (+ 144) 81534ea3   <bfs> Inode<[32m0x8141adc4[0m>::Create(Transaction&: [34m0xea6c30b0[0m, Inode*: [34m0x8141ae04[0m, [34m0x41ed[0m [31m"<???>"[0m, int32: [34m0[0m, int32: [34m0[0m, uint32: [34m0x0[0m ([34m0[0m), [34m0x8141adbc[0m, NULL, Inode*: NULL, fs_vnode_ops*: NULL, uint32: [34m0x0[0m ([34m0[0m)) + 0x4ff
17 8141ad64 (+ 112) 8154094d   <bfs> bfs_create_dir(fs_volume*: [34m0xce01a930[0m, fs_vnode*: [34m0xea6bd018[0m, [34m0x8141ae04[0m [36m"LC_MESSAGES"[0m, int32: [34m493[0m) + 0x109
18 8141add4 (+ 304) 800e0810   <kernel_x86> dir_create(int32: [34m41[0m, [34m0xcddd1938[0m [36m"."[0m, int32: [34m493[0m, [34mfalse[0m) + 0x68
19 8141af04 (+  64) 800e6790   <kernel_x86> _user_create_dir + 0x98
20 8141af44 (+ 100) 80127470   <kernel_x86> handle_syscall + 0xcd
user iframe at 0x8141afa8 (end = 0x8141b000)
 eax 0x70          ebx 0x4f14c0       ecx 0x78219870  edx 0xffff0114
 esi 0x78219998    edi 0x78219dd8     ebp 0x7821989c  esp 0x8141afdc
 eip 0xffff0114 eflags 0x3206    user esp 0x78219870
 vector: 0x63, error code: 0x0
21 8141afa8 (+   0) ffff0114   <commpage> commpage_syscall + 0x04
22 7821989c (+ 704) 005d50a0   <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x4ec
23 78219b5c (+ 720) 005d5312   <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e
24 78219e2c (+ 720) 005d5312   <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e
25 7821a0fc (+ 720) 005d5312   <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e
26 7821a3cc (+ 720) 005d5312   <libtracker.so> CopyAttributes__8BPrivatePQ28BPrivate15CopyLoopControlP5BNodeT2PvUl (nearest) + 0x75e
27 7821a69c (+1712) 005d625b   <libtracker.so> BPrivate<[32m0x7821add0[0m>::MoveItem(BEntry*: [34m0x7821ae34[0m, BDirectory*: [34m0xffffffff[0m, BPoint*: [34m0x5463736c[0m, uint32: [34m0x0[0m ([34m0[0m), [34m0x7821ae00[0m [36m"đv/€Ěn"[0m, BPrivate::Undo&: [34m0x7821ae04[0m, BPrivate::CopyLoopControl*: [34m0x5d2f35[0m) + 0x86f
28 7821ad4c (+ 464) 005d38a5   <libtracker.so> delete_point__8BPrivatePv (nearest) + 0x9a9
29 7821af1c (+  48) 005e13bf   <libtracker.so> __cl__Q28BPrivatet33FourParamFunctionObjectWorkaround4ZPt11BObjectList1Z9entry_refZP6BEntryZP5BListZUl + 0x2b
30 7821af4c (+  48) 0066fe57   <libtracker.so> BPrivate::Thread<[32m0x182124f8[0m>::Run([34m0x0[0m) + 0x2b
31 7821af7c (+  48) 0066fc9c   <libtracker.so> BPrivate::SimpleThread<[32m0x182124f8[0m>::RunBinder(NULL) + 0x28
32 7821afac (+  48) 00724893   <libroot.so> _get_next_team_info (nearest) + 0x5f
33 7821afdc (+   0) 7821afec   5299:MoveTask_379_stack@0x781d7000 + 0x43fec
kdebug> 

Few lines from serial output:

PageWriteWrapper: Failed to write page 0x83024dcc: General system error
PageWriteWrapper: Failed to write page 0x83024e00: General system error
PageWriteWrapper: Failed to write page 0x83024e34: General system error
usb error ohci -1: td error: 0x00000005
usb error ohci -1: td error: 0x00000004
usb hub 2: port 0: device removed
usb hub 2: port 0 disabled
usb_disk: receiving the command status wrapper failed
PANIC: could not read block 3145815: bytesRead: -1, error: Device not ready

by DziadekAlzheimer, 7 years ago

Attachment: serial.txt added

Serial output from system where the error occured

comment:4 by DziadekAlzheimer, 7 years ago

Cc: dziadek@… added

comment:5 by mmlr, 6 years ago

I've added the actual error message to the debug output in hrev45402. Please try to reproduce the problem and attach an updated serial log. This time it should actually tell why "sending or receiving of the data failed"...

by DziadekAlzheimer, 6 years ago

Attachment: serial-hrev45437.txt added

I updated my system to latest revision and reproduced a bug, so here's serial output. Unfortunately I think it doesn't bring any news, only error messages such as: "No error" or "General system error"

comment:6 by DziadekAlzheimer, 6 years ago

Now I have also experienced this error while copying files to FAT32 partition, so it seems it's not connected with BFS support.

comment:7 by mmlr, 6 years ago

The updated log actually does bring news. The error that is printed is the status code returned from the USB stack. This is "No error" and means that from the USB stack side of things worked out well, i.e. the transfer did not fail per se. The other condition to trigger the error code path is that the reported transfer length does not match the scheduled/expected transfer length. This can be a problem on many levels. The actual length is the sum of all the lengths of each indivial packet sent/received by the USB bus driver, so it is possible that there is a bug causing this to be miscalculated, or USB packets might indeed not be scheduled properly at all.

It is also possible that this is a defect in VirtualBox. It is common for emulated hardware to sometimes take shortcuts and not fully emulate a certain device which might lead to problems with "by the book" drivers like our EHCI one. It might also just be a bug that is only triggered by the pattern in which our EHCI driver does things that hasn't been uncovered yet due to other systems working differently. Since I have seen this particular combination of errors only reported with VirtualBox I tend in that direction.

Generally the issues reported here are two different ones. The timeout is usually really just that, a timeout. It is usually caused by transfers being so large that they simply have to exceed the relatively short timeout. The timeout is static and doesn't take bus speed or bandwidth constraints into account. Hence you are almost guaranteed to run into it when transferring large files with USB 1.1 as it simply is too slow to ever make it within that timeout. It is also possible that if the bus is busy due to other devices, the transfer doesn't complete quickly enough.

The issue of "sending or receiving of the data failed" as explained above, is different from the timeout. I would suggest opening a new ticket for that issue. Please attach the VirtualBox log files to that new ticket, they might indicate what's going on from the VirtualBox side. For example the EHCI driver might try something VirtualBox doesn't expect or it might do something plain buggy that hardware EHCI implementations just handle silently.

comment:8 by waddlesplash, 15 months ago

Blocking: 9578 added

comment:9 by waddlesplash, 15 months ago

I ran into this some time ago, and found a single file that reliably reproduced the bug. That's attached in #9578, in case someone else wants to take a look.

comment:10 by waddlesplash, 8 weeks ago

Priority: normallow
Note: See TracTickets for help on using tickets.