Opened 11 years ago

Closed 10 years ago

#3487 closed bug (fixed)

[gdb] repeatedly used stepi causes double fault

Reported by: Adek336
Owned by: bonefish
Priority: normal
Milestone: R1
Component: System/Kernel
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All


~> gdb ls
gdb> b main
gdb> run
gdb> stepi

Now hold the Enter key for 20-30 seconds so that the stepi command is executed many times. Eventually, a double fault happens.

Change History (6)

comment:1 by Adek336, 11 years ago

~> gdb ls
gdb> b main
gdb> run
gdb> stepi
<enter, enter, ...>

comment:2 by bonefish, 11 years ago

Owner: changed from axeld to bonefish

I could reproduce this in VMware with just a single stepi. I attached gdb to an "EscParse" thread of a Terminal and hit some keys in the Terminal before doing the stepi.

Welcome to Kernel Debugging Land...
Thread 165 "EscParse" running on CPU 0
kdebug> sc
stack trace for thread 165 "EscParse"
    kernel stack: 0x826d8000 to 0x826dc000
      user stack: 0x70041000 to 0x70081000
frame               caller     <image>:function + offset
 0 801383e8 (+  48) 8006015d   <kernel_x86>:invoke_debugger_command + 0x00f5
 1 80138418 (+  64) 8005ff4d   <kernel_x86> invoke_pipe_segment(debugger_command_pipe*: 0x8012a400, int32: 0, 0x0 "<NULL>") + 0x0079
 2 80138458 (+  64) 800602d4   <kernel_x86>:invoke_debugger_command_pipe + 0x009c
 3 80138498 (+  48) 8006185c   <kernel_x86> ExpressionParser<0x8013854c>::_ParseCommandPipe(0x80138548) + 0x0234
 4 801384c8 (+  64) 80060c96   <kernel_x86> ExpressionParser<0x8013854c>::EvaluateCommand(0x8011ad60 "sc", 0x80138548) + 0x02ba
 5 80138508 (+ 224) 80062c84   <kernel_x86>:evaluate_debug_command + 0x0088
 6 801385e8 (+  64) 8005e05a   <kernel_x86> kernel_debugger_loop() + 0x01ae
 7 80138628 (+  32) 8005eedd   <kernel_x86>:kernel_debugger + 0x004d
 8 80138648 (+ 192) 8005ee85   <kernel_x86>:panic + 0x0029
 9 80138708 (+  48) 800d0bfb   <kernel_x86>:double_fault_exception + 0x0087
10 80138738 (+  12) 800d4512   <kernel_x86>:int_bottom_vm86 + 0x004d
kernel iframe at 0x80138744 (end = 0x80138794)
 eax 0xb            ebx 0x76e030        ecx 0x70080c30   edx 0xffff0104
 esi 0x0            edi 0x18054e50      ebp 0x70080c5c   esp 0x0
 eip 0x800d4743  eflags 0x10103
 vector: 0x8, error code: 0x0
11 80138744 (+   0) 800d4743   <kernel_x86>:x86_sysenter + 0x0000
70080c5c -- read fault
kdebug> area 0x70080c30
AREA: 0x810e07d0
name:           'EscParse_165_stack'
owner:          0x9f
id:             0x10c1
base:           0x70041000
size:           0x41000
protection:     0x3b
wiring:         0x0
memory_type:    0x0
cache:          0x81078a2c
cache_type:     RAM
cache_offset:   0x0
cache_next:     0x00000000
cache_prev:     0x00000000
page mappings:  2
kdebug> dl 0x70080c30 1
[0x70080c30]    read fault

If the stack crawl can be trusted, the crash happens while executing a sysenter. That is somewhat weird, since the only exception a sysenter raises is a #GP when SYSENTER_CS_MSR is 0. We initialize it during the boot process and don't touch it again, though.

Did you test this on real hardware?

comment:3 by Adek336, 11 years ago

No, I tested that on VMware; I forgot to mention that.

comment:4 by bonefish, 10 years ago

Component: - General → System/Kernel
Version: R1/pre-alpha1 → R1/Development

The bug can still be reproduced in hrev35591. Tested in qemu and on real hardware with the method described in comment 2. I don't think my interpretation was quite correct. Since eip already points to the first instruction in x86_sysenter, the sysenter instruction has obviously been executed already. Since single-step exceptions are generated after instruction execution, and since the TF flag in eflags is still set, I'd say the single-step exception is being generated at that point and for some reason that causes a double fault.
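
For reference, the eflags value 0x10103 from the kernel iframe in comment 2 can be decoded with a short script. This is only an illustrative sketch: the bit positions are the architectural x86 EFLAGS bits, while the names decode_eflags and EFLAGS_BITS are made up for this example and are not part of the kernel or gdb.

```python
# Architectural x86 EFLAGS bit positions (subset relevant here).
# EFLAGS_BITS and decode_eflags are hypothetical helper names for illustration.
EFLAGS_BITS = {
    "CF": 0,   # carry
    "PF": 2,   # parity
    "AF": 4,   # auxiliary carry
    "ZF": 6,   # zero
    "SF": 7,   # sign
    "TF": 8,   # trap flag (single-step)
    "IF": 9,   # interrupt enable
    "DF": 10,  # direction
    "OF": 11,  # overflow
    "NT": 14,  # nested task
    "RF": 16,  # resume
    "VM": 17,  # virtual-8086 mode
}


def decode_eflags(value):
    """Return the names of the flags set in an EFLAGS value, lowest bit first."""
    return [name for name, bit in EFLAGS_BITS.items() if value & (1 << bit)]


# Value taken from the "kernel iframe" dump in comment 2.
print(decode_eflags(0x10103))  # prints ['CF', 'TF', 'RF']
```

The decoded value shows TF still set and IF clear at the first instruction of x86_sysenter, which matches the reading above that the single-step exception becomes pending right after the sysenter has executed.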

comment:5 by bonefish, 10 years ago

Status: new → in-progress

comment:6 by bonefish, 10 years ago

Resolution: fixed
Status: in-progress → closed

Fixed in hrev35620.
