Opened 7 years ago

Closed 3 years ago

#14104 closed bug (fixed)

Concurrent read/write to FAT32 partition deadlocks

Reported by: nzimmermann
Owned by: nobody
Priority: normal
Milestone: R1/beta4
Component: File Systems/FAT
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

I was trying to compile inside a FAT32 partition for testing. As soon as I invoked cmake, the Terminal where I had typed 'cmake' hung completely.

Apparently the system is hanging.

Here's the output of a serial debug session:

threads:

0x82a20460    899  waiting   sem       4132          -  15  0x82334000  892  EscParse
0x82a20d00    900  waiting   mutex     0x82d4ae90    -  15  0x82338000  892  w>Terminal: CMakeTmp: cmake
0x829f6330    951  waiting   cvar      0x9981e4a8    -  10  0x8233c000  951  bash
0x82a2dc00    988  waiting   sem       5902          -  10  0x82340000  988  cmake
0x82a22f80   1113  waiting   other                   -  10  0x82344000 1113  make

Apparently the cmake process invoked make, which is waiting for data in dosfs_read. The make thread is waiting for "page events".

kdebug> sc 1113
stack trace for thread 1113 "make"
    kernel stack: 0x82344000 to 0x82348000
      user stack: 0x70bbd000 to 0x71bbd000
frame               caller     <image>:function + offset
 0 82347b74 (+ 224) 8009c743   <kernel_x86> reschedule(int32: 6) + 0x1007
 1 82347c54 (+  48) 8009c7ed   <kernel_x86> scheduler_reschedule + 0x61
 2 82347c84 (+  96) 8008df3e   <kernel_x86> thread_block + 0x10a
 3 82347ce4 (+  64) 8013a7b2   <kernel_x86> VMCache<0x98059008>::WaitForPageEvents(vm_page*: 0x8e94f908, uint32: 0x1 (1), true) + 0x6e
 4 82347d24 (+ 176) 80057ff3   <kernel_x86> cache_io(0x9b809498, 0xdf32f2d0, int64: 0, uint32: 0x185eb5f0, 0x82347f2c, false) + 0x2cb
 5 82347dd4 (+  96) 80058e72   <kernel_x86> file_cache_read + 0x36
 6 82347e34 (+  80) 8204b94d   </boot/system/add-ons/kernel/file_systems/fat> dosfs_read + 0x16d
 7 82347e84 (+  64) 800fe47b   <kernel_x86> file_read(file_descriptor*: 0xb5812400, int64: 0, 0x185eb5f0, 0x82347f2c) + 0x67
 8 82347ec4 (+  80) 800e8d21   <kernel_x86> common_user_io(int32: 3, int64: -1, 0x185eb5f0, uint32: 0x1376 (4982), false) + 0x185
 9 82347f14 (+  48) 800e91ec   <kernel_x86> _user_read + 0x28
10 82347f44 (+ 100) 80143fff   <kernel_x86> handle_syscall + 0xdc
user iframe at 0x82347fa8 (end = 0x82348000)
 eax 0x8e          ebx 0x14e2610      ecx 0x71bbbbbc  edx 0x60cd1114
 esi 0x185ea100    edi 0x185ea100     ebp 0x71bbbbf8  esp 0x82347fdc
 eip 0x60cd1114 eflags 0x3202    user esp 0x71bbbbbc
 vector: 0x63, error code: 0x0
11 82347fa8 (+   0) 60cd1114   <commpage> commpage_syscall + 0x04
12 71bbbbf8 (+  48) 0146046a   <libroot.so> _IO_file_read + 0x2a
13 71bbbc28 (+  64) 0145fcaf   <libroot.so> _IO_new_file_underflow + 0x107
14 71bbbc68 (+  48) 0146174a   <libroot.so> _IO_default_uflow + 0x26
15 71bbbc98 (+  48) 01461661   <libroot.so> __uflow + 0xcd
16 71bbbcc8 (+  64) 01463954   <libroot.so> _IO_getline_info + 0x5c
17 71bbbd08 (+  64) 014638eb   <libroot.so> _IO_getline + 0x33
18 71bbbd48 (+  64) 014629ea   <libroot.so> fgets + 0x56
19 71bbbd88 (+  64) 01c3e493   <make> find_percent_cached (nearest) + 0x3cb
20 71bbbdc8 (+ 304) 01c3c5c6   <make> eval_buffer (nearest) + 0x1b7a
21 71bbbef8 (+ 112) 01c3aa10   <make> read_all_makefiles (nearest) + 0x640
22 71bbbf68 (+  80) 01c3a585   <make> read_all_makefiles + 0x1b5
23 71bbbfb8 (+3456) 01c36332   <make> main + 0x142e
24 71bbcd38 (+  48) 01c2486f   <make> _start + 0x5b
25 71bbcd68 (+  48) 017c4f16   </boot/system/runtime_loader@0x017b3000> <unknown> + 0x11f16
26 71bbcd98 (+   0) 60cd1250   <commpage> commpage_thread_exit + 0x00

On the other hand the "page writer" thread hangs in dosfs_write_pages:

kdebug> sc 12
stack trace for thread 12 "page writer"
    kernel stack: 0x82616000 to 0x8261a000
frame               caller     <image>:function + offset
 0 82619a34 (+ 224) 8009c743   <kernel_x86> reschedule(int32: 6) + 0x1007
 1 82619b14 (+  48) 8009c7ed   <kernel_x86> scheduler_reschedule + 0x61
 2 82619b44 (+  96) 8008df3e   <kernel_x86> thread_block + 0x10a
 3 82619ba4 (+  48) 80098028   <kernel_x86> _mutex_lock + 0x174
 4 82619bd4 (+  48) 80096a9e   <kernel_x86> recursive_lock_lock + 0x52
 5 82619c04 (+ 208) 8204d8c5   </boot/system/add-ons/kernel/file_systems/fat> dosfs_write_pages + 0x95
 6 82619cd4 (+  80) 80109861   <kernel_x86> VnodeIO<0x82619db4>::IO(int64: 0, 0x82535000, 0x82619d78) + 0x51
 7 82619d24 (+  96) 80106eba   <kernel_x86> synchronous_io(IORequest*: 0xef84acd8, DoIO&: 0x82619db4) + 0x7e
 8 82619d84 (+  64) 80106fe1   <kernel_x86> vfs_vnode_io + 0x6d
 9 82619dc4 (+  64) 80107218   <kernel_x86> vfs_asynchronous_write_pages(vnode*: 0x9200bc70, NULL, int64: 0, generic_io_vec*: 0x82aa2590, uint32: 0x1 (1), uint64: 0x1000 (4096), uic
10 82619e04 (+  80) 8005a22d   <kernel_x86> VMVnodeCache<0x98059cf8>::WriteAsync(int64: 0, generic_io_vec*: 0x82aa2590, uint32: 0x1 (1), uint64: 0x1000 (4096), uint32: 0x3 (3), Asy9
11 82619e54 (+  96) 8012dae2   <kernel_x86> PageWriteTransfer<0x82aa256c>::Schedule(uint32: 0x2 (2)) + 0xb6
12 82619eb4 (+  80) 8012dd6e   <kernel_x86> PageWriterRun<0x82619f88>::Go(0x9037d720) + 0x6e
13 82619f04 (+ 176) 8012e43b   <kernel_x86> page_writer(NULL) + 0x50f
14 82619fb4 (+  48) 8008935b   <kernel_x86> common_thread_entry(0x82619ff0) + 0x3b

This seems to be an issue with concurrent reading and writing on FAT32. Other file systems are probably affected as well.
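
Read together, the two traces suggest a circular wait. Below is a minimal sketch of that reasoning in Python. It rests on two assumptions that the debugger output does not show directly: that dosfs_read still holds the volume's recursive lock while it blocks in file_cache_read, and that the page make is waiting for is one the page writer has marked busy. The thread names and IDs come from the traces above; the `WAITS_FOR` table and `find_cycle` helper are hypothetical and only illustrate the idea.

```python
# Wait-for edges: "X waits for something currently held by Y".
# Both "held by" facts are ASSUMPTIONS inferred from the stack traces,
# not something the kernel debugger output states directly.
WAITS_FOR = {
    # make blocks in WaitForPageEvents (via dosfs_read -> file_cache_read),
    # assumed to be waiting on a page the page writer has marked busy.
    "make (thread 1113)": "page writer (thread 12)",
    # page writer blocks in recursive_lock_lock inside dosfs_write_pages,
    # assumed to be waiting on the volume lock held by make.
    "page writer (thread 12)": "make (thread 1113)",
}


def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    seen = []
    node = start
    while node in waits_for:
        if node in seen:
            # We came back to a thread already on the path: circular wait.
            return seen[seen.index(node):]
        seen.append(node)
        node = waits_for[node]
    # The chain ended at a thread that is not waiting on anyone.
    return None


if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR, "make (thread 1113)")
    print(" -> ".join(cycle + [cycle[0]]) if cycle else "no cycle")
```

If either assumption is wrong (for example, if the lock is held by a third thread), the edges change and the cycle may run through more threads, but the same check applies.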

Change History (4)

comment:1 by waddlesplash, 6 years ago

Was this a partition on a USB drive? That may have been the problem. Please retest under a recent nightly.

comment:2 by waddlesplash, 5 years ago

Component: File Systems/FAT → - General

"waiting for page events" means it is waiting for data to be written out or read from the underlying device. So the real cause of this is the underlying device stalling out. A syslog would reveal more.

comment:3 by waddlesplash, 3 years ago

Component: - General → File Systems/FAT

Or not. Indeed, FAT seems to have locking problems in read and write.
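
One common way to avoid this kind of locking problem is to never block on I/O while holding the lock that the I/O path itself needs. The sketch below illustrates that pattern with ordinary Python threads. It is not the FAT driver's code and makes no claim about what the eventual fix changed; `volume_lock`, `page_ready`, `reader`, `writer` and `run` are all hypothetical stand-ins.

```python
import threading

volume_lock = threading.RLock()   # stands in for the volume's recursive lock
page_ready = threading.Event()    # stands in for the "page event" a reader waits on
log = []


def reader():
    # Do the bookkeeping that needs the lock, then release it BEFORE blocking.
    with volume_lock:
        log.append("reader: looked up file under lock")
    # The lock is no longer held here, so the writer can make progress.
    page_ready.wait(timeout=5)
    log.append("reader: page available")


def writer():
    # The writer needs the same lock to write the page out.
    with volume_lock:
        log.append("writer: wrote page under lock")
    page_ready.set()


def run():
    """Start both threads and report whether both finished."""
    r = threading.Thread(target=reader)
    w = threading.Thread(target=writer)
    r.start()
    w.start()
    r.join(timeout=10)
    w.join(timeout=10)
    return not r.is_alive() and not w.is_alive()


if __name__ == "__main__":
    print("both threads finished:", run())
```

If `reader` instead called `page_ready.wait()` inside the `with volume_lock:` block, and the writer happened to start after the reader had taken the lock, the writer could never acquire it to set the event. That is the shape of the hang suspected in this ticket.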

comment:4 by waddlesplash, 3 years ago

Milestone: Unscheduled → R1/beta4
Resolution: fixed
Status: new → closed

Fixed in hrev55676.
