Opened 11 years ago

Closed 10 years ago

#2597 closed bug (fixed)

PANIC: page fault, but interrupts were disabled. Touching address 0x6e697478 from eip 0x80042f08

Reported by: anevilyak
Owned by: axeld
Priority: normal
Milestone: R1
Component: System/Kernel
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

I hit this panic on hrev25905 while doing an svn checkout to test whether the corruption problem was gone. Vision was running at the same time, and the panic seems to have happened in one of its network threads. Oddly, there was also a line of output just before the PANIC line:

wait interval 997100, scan pages 536, free 16333, target 50.

backtrace:

stack trace for thread 289 "s>wegotdeathstar"
    kernel stack: 0xc4f08000 to 0xc4f0c000
      user stack: 0x70104000 to 0x70144000
frame            caller     <image>:function + offset
 0 c4f0ba80 (+  48) 800577d5   <kernel>:invoke_debugger_command + 0x00ed
 1 c4f0bab0 (+  64) 800575cd   <kernel>:invoke_pipe_segment__FP21debugger_command_pipelPc + 0x0079
 2 c4f0baf0 (+  64) 80057915   <kernel>:invoke_debugger_command_pipe + 0x009d
 3 c4f0bb30 (+  48) 800587f0   <kernel>:_ParseCommandPipe__16ExpressionParserRi + 0x0234
 4 c4f0bb60 (+  48) 800581a6   <kernel>:EvaluateCommand__16ExpressionParserPCcRi + 0x01de
 5 c4f0bb90 (+ 224) 80059bbc   <kernel>:evaluate_debug_command + 0x0088
 6 c4f0bc70 (+  64) 80055c4a   <kernel>:kernel_debugger_loop__Fv + 0x01ae
 7 c4f0bcb0 (+  48) 800567e3   <kernel>:kernel_debugger + 0x0117
 8 c4f0bce0 (+ 192) 800566c1   <kernel>:panic + 0x0029
 9 c4f0bda0 (+  48) 800b9154   <kernel>:page_fault_exception + 0x0060
10 c4f0bdd0 (+  12) 800bc746   <kernel>:int_bottom + 0x0036 (nearest)
kernel iframe at 0xc4f0bddc (end = 0xc4f0be2c)
 eax 0x80042efb     ebx 0x6e69746c      ecx 0x800fb62c   edx 0x38
 esi 0x1            edi 0x90e487a0      ebp 0xc4f0be44   esp 0xc4f0be10
 eip 0x80042f08  eflags 0x10006
 vector: 0xe, error code: 0x0
11 c4f0bddc (+ 104) 80042f08   <kernel>:create_sem_etc + 0x00a4
12 c4f0be44 (+  48) 8004346e   <kernel>:create_sem + 0x001e
13 c4f0be74 (+  64) 800524d1   <kernel>:create_select_sync__FiRP11select_sync + 0x0071
14 c4f0beb4 (+  64) 80052646   <kernel>:common_select__FiP6fd_setN21xPClb + 0x009e
15 c4f0bef4 (+  80) 800533fe   <kernel>:_user_select + 0x019a
16 c4f0bf44 (+ 100) 800bc981   <kernel>:pre_syscall_debug_done + 0x0002 (nearest)
user iframe at 0xc4f0bfa8 (end = 0xc4f0c000)
 eax 0x6e           ebx 0x95c2c0        ecx 0x70143830   edx 0xffff0104
 esi 0x0            edi 0x0             ebp 0x7014386c   esp 0xc4f0bfdc
 eip 0xffff0104  eflags 0x217      user esp 0x70143830
 vector: 0x63, error code: 0x0
17 c4f0bfa8 (+   0) ffff0104
18 7014386c (+1856) 002a24d8   </boot/apps/Vision/Vision@0x00200000>:unknown + 0xa24d8
19 70143fac (+  48) 008d3bc0   </boot/beos/system/lib/libroot.so@0x008af000>:unknown + 0x24bc0
20 70143fdc (+   0) 70143fec   8499:s>wegotdeathstar_289_stack@0x70104000 + 0x3ffec

Staying in KDL in case more data is needed.

Change History (6)

comment:1 Changed 11 years ago by anevilyak

Er, that should've been hrev26905 of course :)

comment:2 Changed 11 years ago by anevilyak

It looks like the semaphore list may have become corrupted somehow. I tried printing it with the 'sems' command, and after several pages it stopped with READ/WRITE FAULT. The last few entries in the list were:

0x9fcaf274 130769    -1      1      0  select
0x9fcb2d28 131058    -1      1      0  select

[*** READ/WRITE FAULT ***]
kdebug>

Pertinent information for those two semaphores:

kdebug> sem 0x9fcb2d28
SEM: 0x9fcb2d28
id:      131058 (0x1fff2)
name:    'select'
owner:   1
count:   -1
queue:   271
last acquired by: 0, count: 0
last released by: 0, count: 0
kdebug> thread 271
THREAD: 0x916b1800
id:                 271 (0x10f)
name:               "ssh"
all_next:           0x916a8000
team_next:          0x00000000
q_next:             0x80105fe0
priority:           10 (next 10)
state:              waiting
next_state:         waiting
cpu:                0x00000000
sig_pending:        0x0 (blocked: 0x0)
in_kernel:          1
waiting for:        semaphore 131058
fault_handler:      0x00000000
args:               0x9100af78 0x00000000
entry:              0x8004a538
team:               0x90d82e88, "ssh"
  exit.sem:         3791
  exit.status:      0x0 (No error)
  exit.reason:      0x0
  exit.signal:      0x0
  exit.waiters:
kernel_stack_area:  7173
kernel_stack_base:  0xa16d8000
user_stack_area:    7175
user_stack_base:    0x7efef000
user_local_storage: 0x7ffef000
kernel_errno:       0x0 (No error)
kernel_time:        5001786
user_time:          5879385
flags:              0x0
architecture dependant section:
        esp: 0xa16dbd68
        ss: 0x00000010
        fpu_state at 0x916b1b80
kdebug> sem 0x9fcaf274
SEM: 0x9fcaf274
id:      130769 (0x1fed1)
name:    'select'
owner:   1
count:   -1
queue:   296
last acquired by: 0, count: 0
last released by: 0, count: 0
kdebug> thread 296
THREAD: 0x916da800
id:                 296 (0x128)
name:               "sshd"
all_next:           0x9169b000
team_next:          0x00000000
q_next:             0x80105fe0
priority:           10 (next 10)
state:              waiting
next_state:         waiting
cpu:                0x00000000
sig_pending:        0x0 (blocked: 0x0)
in_kernel:          1
waiting for:        semaphore 130769
fault_handler:      0x00000000
args:               0xc56aad20 0x00000000
entry:              0x8004a538
team:               0x90e72d14, "sshd"
  exit.sem:         51553
  exit.status:      0x0 (No error)
  exit.reason:      0x0
  exit.signal:      0x0
  exit.waiters:
kernel_stack_area:  8830
kernel_stack_base:  0xc4fd9000
user_stack_area:    8832
user_stack_base:    0x7efef000
user_local_storage: 0x7ffef000
kernel_errno:       0x0 (No error)
kernel_time:        62168
user_time:          25679
flags:              0x0
architecture dependant section:
        esp: 0xc4fdcd68
        ss: 0x00000010
        fpu_state at 0x916dab80

Hope that helps.

comment:3 Changed 11 years ago by anevilyak

Also, the arguments passed to create_sem_etc:

kdebug> call 11 -3
thread 289, s>wegotdeathstar
c4f0bddc 80042f08   <kernel>:create_sem_etc(0x0 (0), 0x800e8010, 0x1 (1))
kdebug> string 0x800e8010
0x800e8010 "select"

comment:4 Changed 11 years ago by bonefish

The "wait interval..." line is harmless. It is printed by the page daemon to the syslog, but there is a small race condition while entering the kernel debugger in which the output can get redirected to the already active blue screen.

Regarding the page fault in create_sem_etc(): the dereferenced address is derived from ebx (the faulting address is ebx + 0xc), into which sFreeSemsHead should just have been loaded at that point. Either sFreeSemsHead itself was overwritten directly, or the sem_entry::u::unused::next pointer of the previous head was overwritten. Interestingly, the value looks like a string ("ltin"). Semaphore names aren't stored in the structure itself, though; they are allocated on the heap. So the semaphore code itself is probably not to blame.
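For reference, the "looks like a string" observation can be checked with a few lines of Python. This is only a sketch: the helper name register_as_ascii is made up for illustration, and the two values are copied from the kernel iframe and the PANIC line above.

```python
import struct


def register_as_ascii(value: int) -> str:
    """Show a 32-bit register value as the four bytes it occupies in
    little-endian (x86) memory, with non-printable bytes shown as '.'."""
    raw = struct.pack("<I", value)
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


ebx = 0x6E69746C            # ebx from the kernel iframe
fault_address = 0x6E697478  # address from the PANIC line

print(register_as_ascii(ebx))    # the bytes of ebx as text
print(hex(fault_address - ebx))  # offset of the faulting access from ebx
```

The first line of output is the text that ended up where a pointer was expected; the second is the offset of the access relative to that bogus pointer.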

The sem_entry structures live in a dedicated area and their pointers are never handed out to other code, so I don't see any obvious way a sem_entry could easily be overwritten. Either the overwrite hit that address range by sheer accident (e.g. a sem_entry pointer was still somewhere on the stack), or someone wrote past the bounds of an adjacent area. "sems" prints the semaphore array in order and it managed to print several entries, so, if anything, the user of the following area might be to blame. It would be interesting to know where the sem_entry area ends and which area comes next.
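To illustrate why a clobbered next link only blows up on a later allocation, here is a toy model of an intrusive free list. It is a simplified sketch and not the kernel's actual code: the class FreeList, its fields, and the slot count are hypothetical, and slot indices stand in for sem_entry pointers.

```python
class FreeList:
    """Toy intrusive free list: each free slot stores the index of the
    next free slot, and 'head' plays the role of the free-list head."""

    def __init__(self, size: int):
        self.size = size
        self.next = list(range(1, size)) + [None]  # slot i links to i + 1
        self.head = 0

    def allocate(self) -> int:
        slot = self.head
        if slot is None:
            raise MemoryError("no free slots")
        if not 0 <= slot < self.size:
            # Stands in for dereferencing a garbage pointer in the kernel.
            raise IndexError(f"corrupt free-list head: {slot:#x}")
        self.head = self.next[slot]  # garbage in the link becomes the head
        return slot


pool = FreeList(4)
pool.next[0] = 0x6E69746C  # a stray write clobbers the head's next link

first = pool.allocate()    # still succeeds, but loads the garbage into head
try:
    pool.allocate()        # the following allocation trips over it
    crashed = False
except IndexError:
    crashed = True

print(first, hex(pool.head), crashed)
```

In this model the allocation that consumes the damaged entry completes normally; the failure shows up one allocation later, in whichever thread happens to create the next semaphore.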

comment:5 Changed 11 years ago by anevilyak

Will check that if I encounter this again, presumably via the areas command? For what it's worth I haven't run into it in 26911 yet.

comment:6 Changed 10 years ago by anevilyak

Resolution: fixed
Status: new → closed

Closing this one, as I haven't encountered it again since my last comment close to a year ago.
