Opened 9 years ago

Closed 9 years ago

#5680 closed bug (fixed)

Intermittent hang on startup with r36030

Reported by: anevilyak Owned by: bonefish
Priority: normal Milestone: R1
Component: System/Kernel Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

hrev36030 seems to have introduced some form of deadlock regression for me on startup, though I haven't yet managed to track down the exact issue. On some boots it succeeds, on others I don't get past the disks icon at boot. Getting into KDL reveals that both CPUs are running the idle thread at the time. Some details as follows, and serial attached:

kdebug> threads
thread         id  state     wait for   object  cpu pri  stack      team  name
0x82917800     31  waiting   sem           114    -  10  0x820a7000    1  locked_pool_enlarger
0x82917200     32  waiting   sem           120    -  20  0x820ac000    1  scsi_bus_service
0x8013cbe0      1  running           -            0   0  0x81001000    1  idle thread 1
0x82916c00     33  waiting   sem           150    -  20  0x81567000    1  scsi_bus_service
0x8013d1c0      2  running           -            1   0  0x818a4000    1  idle thread 2
0x82916600     34  waiting   sem           180    -  20  0x81570000    1  scsi_bus_service
0x8291d800      3  waiting   cvar   0x8013cb6c    -  15  0xffff8000    1  undertaker
0x82916000     35  waiting   sem           210    -  20  0x81579000    1  scsi_bus_service
0x8291d200      4  zzz                            -   5  0x82000000    1  kernel daemon
0x82915a00     36  waiting   sem           242    -  20  0x81584000    1  scsi_bus_service
0x8291cc00      5  zzz                            -   5  0x82004000    1  resource resizer
0x82915400     37  waiting   sem           274    -  20  0x81590000    1  scsi_bus_service
0x8291c600      6  zzz                            -   1  0x82008000    1  page scrubber
0x82914e00     38  waiting   cvar   0x82a63828    -  12  0x81618000    1  scsi scheduler 1
0x8291c000      7  waiting   cvar   0x80147f50    -  11  0x8200c000    1  page writer
0x82914800     39  waiting   cvar   0x82a63850    -  12  0x8161c000    1  scsi notifier 1
0x8291ba00      8  waiting   cvar   0x80147f90    -  10  0x82010000    1  page daemon
0x82914200     40  waiting   mutex  0x82808c8c    -   5  0x81620000    1  bfs block allocator
0x8291b400      9  waiting   cvar   0x80147e10    - 110  0x82014000    1  object cache resizer
0x8291ae00     10  waiting   sem            27    -   5  0x82018000    1  low resource manager
0x8291a800     11  waiting   sem            31    -   5  0x8201c000    1  block notifier/writer
0x8291a200     12  waiting   cvar   0x81623b30    -  10  0x82020000    1  main2
0x82919c00     13  waiting   sem            38    -  10  0x82088000    1  acpi_task
0x82919600     26  waiting   sem            93    -  20  0x82094000    1  ohci finish thread
0x82919000     27  waiting   sem            99    -  10  0x82099000    1  ehci finish thread
0x82918a00     28  waiting   sem           100    -  10  0x8209d000    1  ehci cleanup thread
0x82918400     29  waiting   cvar   0x81623b30    -   5  0x820a3000    1  usb explore
0x82917e00     30  waiting   rwlock 0x82a63978    -  10  0x8189d000    1  media checker

kdebug> thread 12
THREAD: 0x8291a200
id:                 12 (0xc)
name:               "main2"
all_next:           0x00000000
team_next:          0x8291a800
q_next:             0x00000000
priority:           10 (next 10, I/O: -1)
state:              waiting
next_state:         waiting
cpu:                0x00000000
sig_pending:        0x0 (blocked: 0x0, temp enabled: 0x0)
in_kernel:          1
waiting for:        condition variable 0x81623b30
fault_handler:      0x00000000
args:               0x00000000 0x00000000
entry:              0x8004e275
team:               0x82828e00, "kernel_team"
  exit.sem:         35
  exit.status:      0x0 (No error)
  exit.reason:      0x0
  exit.signal:      0x0
  exit.waiters:
kernel_stack_area:  107
kernel_stack_base:  0x82020000
user_stack_area:    -1
user_stack_base:    0x00000000
user_local_storage: 0x00000000
user_thread:        0x00000000
kernel_errno:       0x80006000 (Bad file descriptor)
kernel_time:        3331183
user_time:          0
flags:              0x0
architecture dependant section:
        esp: 0x820232bc
        ss: 0x00000010
        fpu_state at 0x8291a5e0
kdebug> cvar 0x81623b30
condition variable 0x81623b30
  object:  0x828031f0 (wait for slabs)
  threads: 12 29
kdebug> bt 29
stack trace for thread 29 "usb explore"
    kernel stack: 0x820a3000 to 0x820a7000
frame               caller     <image>:function + offset
 0 820a6b18 (+  48) 80067ed7   <kernel_x86> _ZL17affine_reschedulev() + 0x0289
 1 820a6b48 (+  48) 80045fcc   <kernel_x86> _ZN22ConditionVariableEntry4WaitEmx() + 0x01a6
 2 820a6b78 (+ 112) 800bfcc7   <kernel_x86> _ZL29object_cache_reserve_internalP11ObjectCachemm() + 0x00ae
 3 820a6be8 (+  32) 800c0732   <kernel_x86> object_cache_alloc() + 0x009a
 4 820a6c08 (+  48) 800bb4d0   <kernel_x86> _Z11block_allocmmm() + 0x0049
 5 820a6c38 (+  32) 800bb5d8   <kernel_x86> malloc() + 0x0012
 6 820a6c58 (+  48) 817705d4   <ohci> _ZN4OHCI19_AddPendingTransferEP8TransferP24ohci_endpoint_descriptorP15ohci_general_tdS5_S5_b() + 0x0044
 7 820a6c88 (+  96) 81772f18   <ohci> _ZN4OHCI14_SubmitRequestEP8Transfer() + 0x0242
 8 820a6ce8 (+  48) 8177327a   <ohci> _ZN4OHCI14SubmitTransferEP8Transfer() + 0x0062
 9 820a6d18 (+  48) 818986f4   <usb> _ZN11ControlPipe12QueueRequestEhhtttPvmPFvS0_lS0_mES0_() + 0x00ca
10 820a6d48 (+  80) 81898e7f   <usb> _ZN11ControlPipe11SendRequestEhhtttPvmPm() + 0x00cf
11 820a6d98 (+  80) 81895208   <usb> _ZN6Device13GetDescriptorEhhtPvmPm() + 0x0060
12 820a6de8 (+  96) 818967a4   <usb> _ZN6DeviceC2EP6ObjectahR21usb_device_descriptora9usb_speedb() + 0x0296
13 820a6e48 (+  80) 818975c5   <usb> _ZN3HubC1EP6ObjectahR21usb_device_descriptora9usb_speedb() + 0x003d
14 820a6e98 (+ 176) 818980a4   <usb> _ZN10BusManager14AllocateDeviceEP3Hubah9usb_speed() + 0x02c8
15 820a6f48 (+  80) 818971be   <usb> _ZN3Hub7ExploreEPP11change_item() + 0x01d4
16 820a6f98 (+  64) 818946b1   <usb> _ZN5Stack13ExploreThreadEPv() + 0x0093
17 820a6fd8 (+  32) 80061144   <kernel_x86> _ZL28_create_kernel_thread_kentryv() + 0x0015
18 820a6ff8 (+2113245192) 80064a1e   <kernel_x86> _ZL19thread_kthread_exitv() + 0x0000
kdebug> bt 12
stack trace for thread 12 "main2"
    kernel stack: 0x82020000 to 0x82024000
frame               caller     <image>:function + offset
 0 82023308 (+  48) 80067ed7   <kernel_x86> _ZL17affine_reschedulev() + 0x0289
 1 82023338 (+  48) 80045fcc   <kernel_x86> _ZN22ConditionVariableEntry4WaitEmx() + 0x01a6
 2 82023368 (+ 112) 800bfcc7   <kernel_x86> _ZL29object_cache_reserve_internalP11ObjectCachemm() + 0x00ae
 3 820233d8 (+  32) 800c0732   <kernel_x86> object_cache_alloc() + 0x009a
 4 820233f8 (+  48) 800bb4d0   <kernel_x86> _Z11block_allocmmm() + 0x0049
 5 82023428 (+  32) 800bb5d8   <kernel_x86> malloc() + 0x0012
 6 82023448 (+  32) 800d3e78   <kernel_x86> _ZN6VMArea4WireEmmb() + 0x0014
 7 82023468 (+ 176) 800c74dd   <kernel_x86> lock_memory_etc() + 0x013c
 8 82023518 (+  48) 8008e9de   <kernel_x86> _ZN8IOBuffer10LockMemoryElb() + 0x0056
 9 82023548 (+  64) 80088f84   <kernel_x86> _ZN11IOScheduler15ScheduleRequestEP9IORequest() + 0x002a
10 82023588 (+ 208) 8186443e   <scsi_disk> _ZL8das_readPvxS_Pm() + 0x0079
11 82023658 (+  48) 80080356   <kernel_x86> _ZN20AbstractModuleDevice4ReadEPvxS0_Pm() + 0x001e
12 82023688 (+  80) 800838c0   <kernel_x86> _ZL10devfs_readP9fs_volumeP8fs_vnodePvxS3_Pm() + 0x00af
13 820236d8 (+  64) 800a467d   <kernel_x86> _ZL9file_readP15file_descriptorxPvPm() + 0x0044
14 82023718 (+  80) 8009ce39   <kernel_x86> _kern_read() + 0x00bf
15 82023768 (+  48) 800ed490   <kernel_x86> read_pos() + 0x0036
16 82023798 (+ 112) 8003f5bd   <kernel_x86> _ZL16get_cached_blockP11block_cachexPbb() + 0x01e0
17 82023808 (+  64) 8003f67b   <kernel_x86> block_cache_get_etc() + 0x0044
18 82023848 (+  48) 8003f6db   <kernel_x86> block_cache_get() + 0x001c
19 82023878 (+  80) 816ff27a   <bfs> _ZN10CachedNode13InternalSetToEP11Transactionx() + 0x0142
20 820238c8 (+  48) 816ff5dd   <bfs> _ZN10CachedNode11SetToHeaderEv() + 0x0065
21 820238f8 (+  96) 817005d9   <bfs> _ZN9BPlusTree5SetToEP5Inode() + 0x0077
22 82023958 (+  32) 817008f3   <bfs> _ZN9BPlusTreeC1EP5Inode() + 0x0059
23 82023978 (+  64) 817089ce   <bfs> _ZN5InodeC1EP6Volumex() + 0x00f0
24 820239b8 (+ 144) 8171488d   <bfs> _ZN6Volume5MountEPKcm() + 0x024b
25 82023a48 (+  64) 817151b2   <bfs> _ZL9bfs_mountP9fs_volumePKcmS2_Px() + 0x004c
26 82023a88 (+ 416) 800ad6ad   <kernel_x86> _ZL8fs_mountPcPKcS1_mS1_b() + 0x0738
27 82023c28 (+  48) 800adc43   <kernel_x86> _kern_mount() + 0x0050
28 82023c58 (+ 832) 800b2b20   <kernel_x86> vfs_mount_boot_file_system() + 0x0369
29 82023f98 (+  64) 8004e30d   <kernel_x86> _ZL5main2Pv() + 0x0098
30 82023fd8 (+  32) 80061144   <kernel_x86> _ZL28_create_kernel_thread_kentryv() + 0x0015
31 82023ff8 (+2113781768) 80064a1e   <kernel_x86> _ZL19thread_kthread_exitv() + 0x0000
kdebug>

Let me know what else might be useful, since this is at least fairly reliably reproducible here. Note that I'm also using the slab for kernel heap option if that makes any difference in this instance (since I notice the object cache is implicated in the backtraces above).

Attachments (1)

deadlock_backtrace.txt (48.0 KB ) - added by anevilyak 9 years ago.
Syslog leading up to boot deadlock.

Download all attachments as: .zip

Change History (5)

by anevilyak, 9 years ago

Attachment: deadlock_backtrace.txt added

Syslog leading up to boot deadlock.

comment:1 by anevilyak, 9 years ago

Update: confirmed that the problem does not occur with slab as malloc disabled.

comment:2 by anevilyak, 9 years ago

Also, with it enabled, it appears to actually deadlock 100% reliably, the question is simply when ; if not at boot, it eventually hangs at some point later when the GUI is up.

comment:3 by bonefish, 9 years ago

Status: newin-progress

comment:4 by bonefish, 9 years ago

Resolution: fixed
Status: in-progressclosed

Fixed in hrev36033.

Note: See TracTickets for help on using tickets.