Opened 16 years ago
Closed 16 years ago
#2597 closed bug (fixed)
PANIC: page fault, but interrupts were disabled. Touching address 0x6e697478 from eip 0x80042f08
Reported by: | anevilyak | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System/Kernel | Version: | R1/pre-alpha1 |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description
I got this panic from hrev25905 while doing an svn checkout to test if the corruption problem was gone. I did have Vision running at the same time and the panic seems to have happened in one of its network threads. Also, oddly there was a line of output before the PANIC line that looked as follows:
wait interval 997100, scan pages 536, free 16333, target 50.
backtrace:
stack trace for thread 289 "s>wegotdeathstar"
    kernel stack: 0xc4f08000 to 0xc4f0c000
      user stack: 0x70104000 to 0x70144000
frame            caller     <image>:function + offset
 0 c4f0ba80 (+  48) 800577d5   <kernel>:invoke_debugger_command + 0x00ed
 1 c4f0bab0 (+  64) 800575cd   <kernel>:invoke_pipe_segment__FP21debugger_command_pipelPc + 0x0079
 2 c4f0baf0 (+  64) 80057915   <kernel>:invoke_debugger_command_pipe + 0x009d
 3 c4f0bb30 (+  48) 800587f0   <kernel>:_ParseCommandPipe__16ExpressionParserRi + 0x0234
 4 c4f0bb60 (+  48) 800581a6   <kernel>:EvaluateCommand__16ExpressionParserPCcRi + 0x01de
 5 c4f0bb90 (+ 224) 80059bbc   <kernel>:evaluate_debug_command + 0x0088
 6 c4f0bc70 (+  64) 80055c4a   <kernel>:kernel_debugger_loop__Fv + 0x01ae
 7 c4f0bcb0 (+  48) 800567e3   <kernel>:kernel_debugger + 0x0117
 8 c4f0bce0 (+ 192) 800566c1   <kernel>:panic + 0x0029
 9 c4f0bda0 (+  48) 800b9154   <kernel>:page_fault_exception + 0x0060
10 c4f0bdd0 (+  12) 800bc746   <kernel>:int_bottom + 0x0036 (nearest)
kernel iframe at 0xc4f0bddc (end = 0xc4f0be2c)
 eax 0x80042efb   ebx 0x6e69746c   ecx 0x800fb62c   edx 0x38
 esi 0x1          edi 0x90e487a0   ebp 0xc4f0be44   esp 0xc4f0be10
 eip 0x80042f08   eflags 0x10006
 vector: 0xe, error code: 0x0
11 c4f0bddc (+ 104) 80042f08   <kernel>:create_sem_etc + 0x00a4
12 c4f0be44 (+  48) 8004346e   <kernel>:create_sem + 0x001e
13 c4f0be74 (+  64) 800524d1   <kernel>:create_select_sync__FiRP11select_sync + 0x0071
14 c4f0beb4 (+  64) 80052646   <kernel>:common_select__FiP6fd_setN21xPClb + 0x009e
15 c4f0bef4 (+  80) 800533fe   <kernel>:_user_select + 0x019a
16 c4f0bf44 (+ 100) 800bc981   <kernel>:pre_syscall_debug_done + 0x0002 (nearest)
user iframe at 0xc4f0bfa8 (end = 0xc4f0c000)
 eax 0x6e         ebx 0x95c2c0     ecx 0x70143830   edx 0xffff0104
 esi 0x0          edi 0x0          ebp 0x7014386c   esp 0xc4f0bfdc
 eip 0xffff0104   eflags 0x217     user esp 0x70143830
 vector: 0x63, error code: 0x0
17 c4f0bfa8 (+   0) ffff0104
18 7014386c (+1856) 002a24d8   </boot/apps/Vision/Vision@0x00200000>:unknown + 0xa24d8
19 70143fac (+  48) 008d3bc0   </boot/beos/system/lib/libroot.so@0x008af000>:unknown + 0x24bc0
20 70143fdc (+   0) 70143fec   8499:s>wegotdeathstar_289_stack@0x70104000 + 0x3ffec
Staying in KDL in case more data is needed.
Change History (6)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
It looks like the semaphore list may have become corrupted somehow. I tried printing it with the 'sems' command, and after several pages it stopped with READ/WRITE FAULT. The last few entries in the list were:
0x9fcaf274 130769 -1 1 0 select
0x9fcb2d28 131058 -1 1 0 select
[*** READ/WRITE FAULT ***]
kdebug>
Pertinent information for those two semaphores:
kdebug> sem 0x9fcb2d28
SEM: 0x9fcb2d28
id: 131058 (0x1fff2)
name: 'select'
owner: 1
count: -1
queue: 271
last acquired by: 0, count: 0
last released by: 0, count: 0
kdebug> thread 271
THREAD: 0x916b1800
id: 271 (0x10f)
name: "ssh"
all_next: 0x916a8000
team_next: 0x00000000
q_next: 0x80105fe0
priority: 10 (next 10)
state: waiting
next_state: waiting
cpu: 0x00000000
sig_pending: 0x0 (blocked: 0x0)
in_kernel: 1
waiting for: semaphore 131058
fault_handler: 0x00000000
args: 0x9100af78 0x00000000
entry: 0x8004a538
team: 0x90d82e88, "ssh"
exit.sem: 3791
exit.status: 0x0 (No error)
exit.reason: 0x0
exit.signal: 0x0
exit.waiters:
kernel_stack_area: 7173
kernel_stack_base: 0xa16d8000
user_stack_area: 7175
user_stack_base: 0x7efef000
user_local_storage: 0x7ffef000
kernel_errno: 0x0 (No error)
kernel_time: 5001786
user_time: 5879385
flags: 0x0
architecture dependant section:
esp: 0xa16dbd68
ss: 0x00000010
fpu_state at 0x916b1b80
kdebug> sem 0x9fcaf274
SEM: 0x9fcaf274
id: 130769 (0x1fed1)
name: 'select'
owner: 1
count: -1
queue: 296
last acquired by: 0, count: 0
last released by: 0, count: 0
kdebug> thread 296
THREAD: 0x916da800
id: 296 (0x128)
name: "sshd"
all_next: 0x9169b000
team_next: 0x00000000
q_next: 0x80105fe0
priority: 10 (next 10)
state: waiting
next_state: waiting
cpu: 0x00000000
sig_pending: 0x0 (blocked: 0x0)
in_kernel: 1
waiting for: semaphore 130769
fault_handler: 0x00000000
args: 0xc56aad20 0x00000000
entry: 0x8004a538
team: 0x90e72d14, "sshd"
exit.sem: 51553
exit.status: 0x0 (No error)
exit.reason: 0x0
exit.signal: 0x0
exit.waiters:
kernel_stack_area: 8830
kernel_stack_base: 0xc4fd9000
user_stack_area: 8832
user_stack_base: 0x7efef000
user_local_storage: 0x7ffef000
kernel_errno: 0x0 (No error)
kernel_time: 62168
user_time: 25679
flags: 0x0
architecture dependant section:
esp: 0xc4fdcd68
ss: 0x00000010
fpu_state at 0x916dab80
Hope that helps.
comment:3 by , 16 years ago
Also, the args passed to create_sem_etc:
kdebug> call 11 -3
thread 289, s>wegotdeathstar
c4f0bddc 80042f08 <kernel>:create_sem_etc(0x0 (0), 0x800e8010, 0x1 (1))
kdebug> string 0x800e8010
0x800e8010 "select"
comment:4 by , 16 years ago
The "wait interval..." line is harmless. The page daemon prints it to the syslog, but there is a small race condition while entering the kernel debugger during which the output can get redirected to the already active blue screen.
Regarding the page fault in create_sem_etc(): the dereferenced address is in ebx, into which sFreeSemsHead should just have been loaded at that point. Either sFreeSemsHead itself was overwritten directly, or the sem_entry::u::unused::next pointer of the previous head was overwritten. Interestingly, the value looks like a string ("ltin"). Semaphore names aren't stored in the structure itself, though; they are allocated on the heap. So the semaphore code itself is probably not to blame.
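As a quick check of the "looks like a string" observation, the ebx value from the kernel iframe can be decoded as four little-endian ASCII bytes. This is only a sketch for illustration: the helper name register_as_ascii is made up here, and the two constants are copied from the panic message and the iframe dump above.

```python
import struct


def register_as_ascii(value: int) -> str:
    """Interpret a 32-bit register value as four little-endian bytes of text."""
    raw = struct.pack("<I", value)  # x86 stores the least significant byte first
    return raw.decode("ascii", errors="replace")


# Values copied from the ticket: ebx in the kernel iframe and the faulting address.
ebx = 0x6E69746C
fault_address = 0x6E697478

print(register_as_ascii(ebx))    # the bytes of ebx read as text
print(hex(fault_address - ebx))  # offset of the access relative to ebx
```

The faulting address lies a few bytes past ebx, which fits a field access through a pointer held in ebx rather than a wild jump.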
The sem_entry structures live in a dedicated area and their pointers are never given out to other code, so I don't see any obvious way in which a sem_entry could easily be overwritten. Either the overwrite hit that address range by sheer accident (e.g. a sem_entry pointer was still somewhere on the stack), or someone wrote past the bounds of an adjacent area. "sems" prints the semaphore array in order and it managed to print several pages of entries, so, if anything, the user of the following area might be to blame. It would be interesting to know where the sem_entry area ends and which area comes next.
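To illustrate why an overwritten next pointer only shows up later, here is a toy model of a free list threaded through a fixed array of slots. It is not the kernel's code: the class SlotFreeList and its fields are hypothetical, and slot indices stand in for sem_entry pointers. The corrupted slot is handed out without complaint; the failure only appears on the following allocation, when the bogus value has become the list head.

```python
class SlotFreeList:
    """Toy free list over a fixed array; indices play the role of pointers."""

    def __init__(self, count: int):
        # Each free slot stores the index of the next free slot (None ends the list).
        self.next_free = [i + 1 for i in range(count)]
        self.next_free[-1] = None
        self.head = 0

    def allocate(self) -> int:
        slot = self.head
        if slot is None:
            raise MemoryError("no free slots")
        if not 0 <= slot < len(self.next_free):
            # A real kernel has no such check; there this is the page fault.
            raise RuntimeError(f"free list head {slot:#x} is outside the array")
        self.head = self.next_free[slot]  # trusts whatever is stored in the slot
        return slot


free_list = SlotFreeList(4)
free_list.allocate()                 # hands out slot 0, head is now 1
free_list.next_free[1] = 0x6E69746C  # stray write lands on slot 1's next pointer
free_list.allocate()                 # still succeeds, but head is now garbage
try:
    free_list.allocate()             # the damage only surfaces here
except RuntimeError as error:
    print(error)
```

In this model the allocation that fails is two steps removed from the stray write, which is consistent with the crash happening in an unrelated select() call.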
comment:5 by , 16 years ago
Will check that if I encounter this again, presumably via the areas command? For what it's worth, I haven't run into it in 26911 yet.
comment:6 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Closing this one, as I haven't encountered it again since my last comment close to a year ago.
Er, that should've been hrev26905 of course :)