Opened 3 years ago
Closed 3 years ago
#17531 closed bug (fixed)
KDL (GPE) in syscall entry (thread_hit_debug_event_internal)
Reported by: | waddlesplash | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | R1/beta4 |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | Cc: | korli | |
Blocked By: | Blocking: | ||
Platform: | All |
Description
I was running jam, and hit Ctrl+C at the prompt while it was still in the startup phase and got a KDL.
PANIC: Unexpected exception "General Protection Exception" occurred in kernel mode! Error code: 0x0
Welcome to Kernel Debugging Land...
Thread 34781 "jam" running on CPU 1
stack trace for thread 34781 "jam"
kernel stack: 0xffffffff8153b000 to 0xffffffff81540000
user stack: 0x00007fc8194c8000 to 0x00007fc81a4c8000
frame caller <image>:function + offset
0 ffffffff8153eee8 (+ 24) ffffffff801446dc <kernel_x86_64> arch_debug_call_with_fault_handler + 0x16
1 ffffffff8153ef00 (+ 80) ffffffff800adbe8 <kernel_x86_64> debug_call_with_fault_handler + 0x78
2 ffffffff8153ef50 (+ 96) ffffffff800af203 <kernel_x86_64> kernel_debugger_loop(char const*, char const*, __va_list_tag*, int) + 0xf3
3 ffffffff8153efb0 (+ 80) ffffffff800af59e <kernel_x86_64> kernel_debugger_internal(char const*, char const*, __va_list_tag*, int) + 0x6e
4 ffffffff8153f000 (+ 240) ffffffff800af8f7 <kernel_x86_64> panic + 0xb7
5 ffffffff8153f0f0 (+ 856) ffffffff80145ecc <kernel_x86_64> int_bottom + 0x80
kernel iframe at 0xffffffff8153f448 (end = 0xffffffff8153f510)
rax 0x0 rbx 0x0 rcx 0xa0
rdx 0xffffffff96355cd8 rsi 0xffffffff96355cd8 rdi 0xffffffffa1d78150
rbp 0xffffffff8153fdb8 r8 0x0 r9 0x0
r10 0x1 r11 0x7e r12 0x0
r13 0x200 r14 0xffffffffa1d77f40 r15 0x0
rip 0xffffffff800bd7d9 rsp 0xffffffff8153f518 rflags 0x10046
vector: 0xd, error code: 0x0
6 ffffffff8153f448 (+2416) ffffffff800bd7d9 <kernel_x86_64> thread_hit_debug_event_internal[clone .constprop.0] (debug_debugger_message, void const*, int, bool, bool&) + 0x549
7 ffffffff8153fdb8 (+ 144) ffffffff800bd92e <kernel_x86_64> thread_hit_debug_event(debug_debugger_message, void const*, int, bool) + 0x3e
8 ffffffff8153fe48 (+ 160) ffffffff800bddda <kernel_x86_64> user_debug_pre_syscall + 0x7a
9 ffffffff8153fee8 (+ 72) ffffffff801462ea <kernel_x86_64> x86_64_syscall_entry + 0x216
user iframe at 0xffffffff8153ff30 (end = 0xffffffff8153fff8)
rax 0x95 rbx 0x7fc81a4c8648 rcx 0x11ef70bc13c
rdx 0x1 rsi 0x7fc81a4bf720 rdi 0xffffffff
rbp 0x7fc81a4bf640 r8 0x80 r9 0x13
r10 0x7fc81a4bf660 r11 0x206 r12 0x1
r13 0x7fc81a4c8658 r14 0x0 r15 0x0
rip 0x11ef70bc13c rsp 0x7fc81a4bf628 rflags 0x206
vector: 0x63, error code: 0x0
10 ffffffff8153ff30 (+140499536574224) 0000011ef70bc13c <libroot.so> _kern_read_stat + 0x0c
11 00007fc81a4bf640 (+ 160) 0000018d6b963c8b <jam> file_time + 0x2f
12 00007fc81a4bf6e0 (+1136) 0000018d6b965582 <jam> jcache + 0x7b
13 00007fc81a4bfb50 (+ 32) 0000018d6b96a370 <jam> yyline + 0x107
14 00007fc81a4bfb70 (+10304) 0000018d6b96a52c <jam> yylex + 0x5d
15 00007fc81a4c23b0 (+8592) 0000018d6b96c267 <jam> yyparse + 0x28b
16 00007fc81a4c4540 (+ 128) 0000018d6b96831d <jam> parse_file + 0x40
17 00007fc81a4c45c0 (+ 64) 0000018d6b96fd9f <jam> compile_include + 0x101
18 00007fc81a4c4600 (+ 80) 0000018d6b97005e <jam> compile_on + 0xd1
19 00007fc81a4c4650 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
20 00007fc81a4c4690 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
21 00007fc81a4c46f0 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
22 00007fc81a4c4730 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
23 00007fc81a4c4790 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
24 00007fc81a4c47d0 (+ 144) 0000018d6b970491 <jam> evaluate_rule + 0x2cd
25 00007fc81a4c4860 (+ 176) 0000018d6b970179 <jam> compile_rule + 0xed
26 00007fc81a4c4910 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
27 00007fc81a4c4950 (+ 128) 0000018d6b96834b <jam> parse_file + 0x6e
28 00007fc81a4c49d0 (+ 64) 0000018d6b96fd9f <jam> compile_include + 0x101
29 00007fc81a4c4a10 (+ 80) 0000018d6b97005e <jam> compile_on + 0xd1
30 00007fc81a4c4a60 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
31 00007fc81a4c4aa0 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
32 00007fc81a4c4b00 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
33 00007fc81a4c4b40 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
34 00007fc81a4c4ba0 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
35 00007fc81a4c4be0 (+ 144) 0000018d6b970491 <jam> evaluate_rule + 0x2cd
36 00007fc81a4c4c70 (+ 176) 0000018d6b970179 <jam> compile_rule + 0xed
37 00007fc81a4c4d20 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
38 00007fc81a4c4d60 (+ 128) 0000018d6b96834b <jam> parse_file + 0x6e
39 00007fc81a4c4de0 (+ 64) 0000018d6b96fd9f <jam> compile_include + 0x101
40 00007fc81a4c4e20 (+ 80) 0000018d6b97005e <jam> compile_on + 0xd1
41 00007fc81a4c4e70 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
42 00007fc81a4c4eb0 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
43 00007fc81a4c4f10 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
44 00007fc81a4c4f50 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
45 00007fc81a4c4fb0 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
46 00007fc81a4c4ff0 (+ 144) 0000018d6b970491 <jam> evaluate_rule + 0x2cd
47 00007fc81a4c5080 (+ 176) 0000018d6b970179 <jam> compile_rule + 0xed
48 00007fc81a4c5130 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
49 00007fc81a4c5170 (+ 128) 0000018d6b96834b <jam> parse_file + 0x6e
50 00007fc81a4c51f0 (+ 64) 0000018d6b96fd9f <jam> compile_include + 0x101
51 00007fc81a4c5230 (+ 80) 0000018d6b97005e <jam> compile_on + 0xd1
52 00007fc81a4c5280 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
53 00007fc81a4c52c0 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
54 00007fc81a4c5320 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
55 00007fc81a4c5360 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
56 00007fc81a4c53c0 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
57 00007fc81a4c5400 (+ 144) 0000018d6b970491 <jam> evaluate_rule + 0x2cd
58 00007fc81a4c5490 (+ 176) 0000018d6b970179 <jam> compile_rule + 0xed
59 00007fc81a4c5540 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
60 00007fc81a4c5580 (+ 128) 0000018d6b96834b <jam> parse_file + 0x6e
61 00007fc81a4c5600 (+ 64) 0000018d6b96fd9f <jam> compile_include + 0x101
62 00007fc81a4c5640 (+ 80) 0000018d6b97005e <jam> compile_on + 0xd1
63 00007fc81a4c5690 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
64 00007fc81a4c56d0 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
65 00007fc81a4c5730 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
66 00007fc81a4c5770 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
67 00007fc81a4c57d0 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
68 00007fc81a4c5810 (+ 144) 0000018d6b970491 <jam> evaluate_rule + 0x2cd
69 00007fc81a4c58a0 (+ 176) 0000018d6b970179 <jam> compile_rule + 0xed
70 00007fc81a4c5950 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
71 00007fc81a4c5990 (+ 128) 0000018d6b96834b <jam> parse_file + 0x6e
72 00007fc81a4c5a10 (+ 64) 0000018d6b96fd9f <jam> compile_include + 0x101
73 00007fc81a4c5a50 (+ 80) 0000018d6b97005e <jam> compile_on + 0xd1
74 00007fc81a4c5aa0 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
75 00007fc81a4c5ae0 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
76 00007fc81a4c5b40 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
77 00007fc81a4c5b80 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
78 00007fc81a4c5be0 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
79 00007fc81a4c5c20 (+ 144) 0000018d6b970491 <jam> evaluate_rule + 0x2cd
80 00007fc81a4c5cb0 (+ 176) 0000018d6b970179 <jam> compile_rule + 0xed
81 00007fc81a4c5d60 (+ 64) 0000018d6b96fc9c <jam> compile_if + 0x79
82 00007fc81a4c5da0 (+ 64) 0000018d6b97053e <jam> compile_rules + 0x4a
83 00007fc81a4c5de0 (+ 96) 0000018d6b96ff54 <jam> compile_local + 0x148
84 00007fc81a4c5e40 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
85 00007fc81a4c5e80 (+ 128) 0000018d6b96834b <jam> parse_file + 0x6e
86 00007fc81a4c5f00 (+ 64) 0000018d6b96fd9f <jam> compile_include + 0x101
87 00007fc81a4c5f40 (+ 64) 0000018d6b970599 <jam> compile_rules + 0xa5
88 00007fc81a4c5f80 (+ 128) 0000018d6b96834b <jam> parse_file + 0x6e
89 00007fc81a4c6000 (+4448) 0000018d6b962f2c <jam> main + 0x767
90 00007fc81a4c7160 (+ 48) 0000018d6b9627be <jam> _start + 0x3e
91 00007fc81a4c7190 (+ 48) 0000008252a78f55 </boot/system/runtime_loader@0x0000008252a68000> <unknown> + 0x10f55
92 00007fc81a4c71c0 (+ 0) 00007ff4a7649260 <commpage> commpage_thread_exit + 0x00
kdebug>
Attachments (1)
Change History (14)
comment:1 by , 3 years ago
Version: | R1/beta3 → R1/Development |
---|
comment:2 by , 3 years ago
comment:3 by , 3 years ago
Cc: | added |
---|
Yeah, this is reproducible for me. Ctrl+C during strace of jam seems to generally trigger a KDL.
CC korli: perhaps your recent changes to strace may have "uncovered" this? (Probably not as they were all in userland?) I'll look into fixes later.
comment:5 by , 3 years ago
Might you have some idea of where to start looking on this (or how to get a more useful message than GPE)?
comment:6 by , 3 years ago
The problem seems to be that the stack pointer is only 8-byte aligned rather than 16-byte aligned, and GCC generated aligned SSE instructions for a copy. The question then is how we wound up with a misaligned stack pointer in syscall entry, or why this only manifests in user_debug_pre_syscall...
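The misalignment can be read straight off the dump in the description: the kernel iframe reports rsp 0xffffffff8153f518 and rbp 0xffffffff8153fdb8. The sketch below is plain userland C++, not kernel code, and the helper name misalignment16 is made up for illustration. It assumes the kernel follows the System V x86-64 calling convention with the usual push %rbp / mov %rsp, %rbp prologue, under which rbp would normally be a multiple of 16. Aligned SSE moves such as movaps raise a general protection fault when their memory operand is not 16-byte aligned, which would fit the GPE seen here.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes by which `address` exceeds the previous 16-byte boundary
// (0 means the address is 16-byte aligned). Hypothetical helper name.
constexpr uint64_t
misalignment16(uint64_t address)
{
	return address % 16;
}

// Register values copied from the kernel iframe in the KDL output of the
// ticket description.
constexpr uint64_t kFaultRsp = 0xffffffff8153f518;
constexpr uint64_t kFaultRbp = 0xffffffff8153fdb8;

// Both values sit 8 bytes past a 16-byte boundary.
static_assert(misalignment16(kFaultRsp) == 8, "rsp is off by 8");
static_assert(misalignment16(kFaultRbp) == 8, "rbp is off by 8");
```

Both registers come out 8 bytes off, so any 16-byte-aligned stack slot the compiler computed relative to them would land on an address that is only 8-byte aligned.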
comment:7 by , 3 years ago
So! I added some alignment checks:
diff --git a/src/system/kernel/debug/user_debugger.cpp b/src/system/kernel/debug/user_debugger.cpp
index aa1f760759..8d840d382a 100644
--- a/src/system/kernel/debug/user_debugger.cpp
+++ b/src/system/kernel/debug/user_debugger.cpp
@@ -65,6 +65,7 @@ static void schedule_profiling_timer(Thread* thread, bigtime_t interval);
 static int32 profiling_event(timer* unused);
 static status_t ensure_debugger_installed();
 static void get_team_debug_info(team_debug_info &teamDebugInfo);
+extern void alignment_check(void* ptr);
 
 
 static inline status_t
@@ -734,6 +735,7 @@ thread_hit_debug_event_internal(debug_debugger_message event,
 	// update the thread debug info
 	bool destroyThreadInfo = false;
 	thread_debug_info threadDebugInfo;
+	alignment_check(&threadDebugInfo);
 
 	state = disable_interrupts();
 	threadDebugInfoLocker.Lock();
@@ -832,6 +834,7 @@ user_debug_pre_syscall(uint32 syscall, void *args)
 {
 	// check whether a debugger is installed
 	Thread *thread = thread_get_current_thread();
+	alignment_check(&thread);
 	int32 teamDebugFlags = atomic_get(&thread->team->debug_info.flags);
 	if (!(teamDebugFlags & B_TEAM_DEBUG_DEBUGGER_INSTALLED))
 		return;
and in another file (as putting this inline just gets optimized out):
void
alignment_check(void* ptr)
{
	if ((intptr_t(ptr) % 16) != 0)
		panic("BAD ALIGNMENT!");
}
And indeed:
PANIC: BAD ALIGNMENT!
Welcome to Kernel Debugging Land...
Thread 453 "true" running on CPU 0
stack trace for thread 453 "true"
kernel stack: 0xffffffff81aba000 to 0xffffffff81abf000
user stack: 0x00007fee9bacb000 to 0x00007fee9cacb000
frame caller <image>:function + offset
0 ffffffff81abe2e0 (+ 24) ffffffff80144c3c <kernel_x86_64> arch_debug_call_with_fault_handler + 0x16
1 ffffffff81abe2f8 (+ 80) ffffffff800ae258 <kernel_x86_64> debug_call_with_fault_handler + 0x78
2 ffffffff81abe348 (+ 96) ffffffff800af873 <kernel_x86_64> kernel_debugger_loop(char const*, char const*, __va_list_tag*, int) + 0xf3
3 ffffffff81abe3a8 (+ 80) ffffffff800afc0e <kernel_x86_64> kernel_debugger_internal(char const*, char const*, __va_list_tag*, int) + 0x6e
4 ffffffff81abe3f8 (+ 240) ffffffff800aff67 <kernel_x86_64> panic + 0xb7
5 ffffffff81abe4e8 (+2224) ffffffff800bdbf8 <kernel_x86_64> thread_hit_debug_event_internal[clone .constprop.0] (debug_debugger_message, void const*, int, bool, bool&) + 0x2d8
6 ffffffff81abed98 (+ 144) ffffffff800bdfde <kernel_x86_64> thread_hit_debug_event(debug_debugger_message, void const*, int, bool) + 0x3e
7 ffffffff81abee28 (+ 192) ffffffff800be4bc <kernel_x86_64> user_debug_pre_syscall + 0xac
8 ffffffff81abeee8 (+ 72) ffffffff8014684a <kernel_x86_64> x86_64_syscall_entry + 0x216
user iframe at 0xffffffff81abef30 (end = 0xffffffff81abeff8)
rax 0xc9 rbx 0x0 rcx 0x1abfa603224
rdx 0x7e9ef2b010 rsi 0x5 rdi 0x224e
rbp 0x7fee9cacad80 r8 0x0 r9 0x1
r10 0x7fee9cacacec r11 0x246 r12 0x0
r13 0x7e9ef2b010 r14 0x7e9ef2b010 r15 0x1abfa80bac0
rip 0x1abfa603224 rsp 0x7fee9cacad58 rflags 0x246
vector: 0x63, error code: 0x0
9 ffffffff81abef30 (+140664926944848) 000001abfa603224 </boot/system/runtime_loader@0x000001abfa5ee000> <unknown> + 0x15224
10 00007fee9cacad80 (+ 144) 000001abfa5f7be0 </boot/system/runtime_loader@0x000001abfa5ee000> <unknown> + 0x9be0
11 00007fee9cacae10 (+ 48) 000001abfa5fefe3 </boot/system/runtime_loader@0x000001abfa5ee000> <unknown> + 0x10fe3
12 00007fee9cacae40 (+ 0) 00007fd3e58d2260 <commpage> commpage_thread_exit + 0x00
kdebug>
comment:8 by , 3 years ago
Adding another check in thread_hit_debug_event also fires. I guess this must be another case of RBP/RSP being mismatched...?
comment:9 by , 3 years ago
The prologues of both functions do
pushq %rbp
movq %rsp, %rbp
as expected, so I would expect that if RSP was aligned in the first one (which apparently it is) it would also be aligned in all subsequent ones, right?
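To make that expectation concrete, here is a small userland model of the stack pointer arithmetic (a sketch only, not kernel code). It assumes every function in the chain is compiler-generated, uses the push %rbp / mov %rsp, %rbp prologue, and reserves a multiple of 16 bytes for locals; the function names and the sample addresses are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// What one call does to the stack pointer in this model: `call` pushes an
// 8-byte return address, the prologue pushes the 8-byte saved %rbp, and %rbp
// then takes the value of %rsp.
static uint64_t
frame_pointer_after_call(uint64_t rspBeforeCall)
{
	return rspBeforeCall - 8 - 8;
}

// Follows `depth` nested calls, where each function reserves `localsSize`
// bytes (assumed to be a multiple of 16) below its frame pointer before
// calling the next one. Returns how far the innermost frame pointer is from
// a 16-byte boundary.
static uint64_t
innermost_misalignment(uint64_t rspBeforeFirstCall, int depth,
	uint64_t localsSize)
{
	uint64_t rsp = rspBeforeFirstCall;
	uint64_t rbp = 0;
	for (int i = 0; i < depth; i++) {
		rbp = frame_pointer_after_call(rsp);
		rsp = rbp - localsSize;
	}
	return rbp % 16;
}
```

In this model, innermost_misalignment(0x1000, 3, 128) is 0 while innermost_misalignment(0x1008, 3, 128) is 8: each call moves the stack pointer by a multiple of 16, so whatever offset the first caller had is carried unchanged into every nested frame. If that holds for the kernel build too, the 8-byte offset would have to come from before the first compiler-generated call, for example from the assembly entry path.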
comment:10 by , 3 years ago
Indeed, aligning the stack looks healthy. I can't reproduce the KDL with this change: https://review.haiku-os.org/c/haiku/+/5113
comment:11 by , 3 years ago
Milestone: | Unscheduled → R1/beta4 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Indeed fixed in hrev55957.
comment:12 by , 3 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
The message reoccurred because of "page scrubber" while using bepdf.
by , 3 years ago
Attachment: | IMG_0368.JPG added |
---|
comment:13 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
This is a totally unrelated issue which is probably just another symptom of #13205. Please attach the KDL to that ticket instead.
I was running jam in strace at the time, probably relevant.