Opened 2 years ago
Closed 20 months ago
#18133 closed bug (fixed)
page fault in _kern_write
Reported by: | jessicah | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | R1/beta5 |
Component: | System/Kernel | Version: | R1/beta4 |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
Running yarn --verbose for vscode reliably triggers a page fault in make (this needs an updated version of nodejs that I still need to publish):
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x0, ip 0x0, write 0, user 0, exec 1, thread 0x331
PANIC: vm_page_fault: unhandled page fault in kernel space at 0x0, ip 0x0
Welcome to Kernel Debugging Land...
Thread 817 "make" running on CPU 2
stack trace for thread 817 "make"
kernel stack: 0xffffffff936b8000 to 0xffffffff936bd000
user stack: 0x00007fe16ee68000 to 0x00007fe16fe68000
frame caller <image>:function + offset
0 ffffffff936bc728 (+ 24) ffffffff8014559c <kernel_x86_64> arch_debug_call_with_fault_handler + 0x16
1 ffffffff936bc740 (+ 80) ffffffff800aec28 <kernel_x86_64> debug_call_with_fault_handler + 0x78
2 ffffffff936bc790 (+ 96) ffffffff800b0243 <kernel_x86_64> kernel_debugger_loop(char const*, char const*, __va_list_tag*, int) + 0xf3
3 ffffffff936bc7f0 (+ 80) ffffffff800b05de <kernel_x86_64> kernel_debugger_internal(char const*, char const*, __va_list_tag*, int) + 0x6e
4 ffffffff936bc840 (+ 240) ffffffff800b0937 <kernel_x86_64> panic + 0xb7
5 ffffffff936bc930 (+ 256) ffffffff8012eaf8 <kernel_x86_64> vm_page_fault + 0x258
6 ffffffff936bca30 (+ 64) ffffffff80151458 <kernel_x86_64> x86_page_fault_exception + 0x168
7 ffffffff936bca70 (+ 904) ffffffff80146d8c <kernel_x86_64> int_bottom + 0x80
kernel iframe at 0xffffffff936bcdf8 (end = 0xffffffff936bcec0)
rax 0xffffffff801b9040 rbx 0x0 rcx 0x331
rdx 0x2 rsi 0x0 rdi 0xffffff81178582a8
rbp 0xffffffff936bcf20 r8 0x1 r9 0x0
r10 0xffffffff936bce60 r11 0x206 r12 0x1
r13 0xffffffff80001301 r14 0x10def828f660 r15 0xffffffff9b2febc0
rip 0x0 rsp 0xffffffff936bcec8 rflags 0x10246
vector: 0xe, error code: 0x10
8 ffffffff936bcdf8 (+ 296) 0000000000000000
9 ffffffff936bcf20 (+ 16) ffffffff8014708f <kernel_x86_64> x86_64_syscall_entry + 0xfb
user iframe at 0xffffffff936bcf30 (end = 0xffffffff936bcff8)
rax 0x91 rbx 0x5c rcx 0x1518d209b6c
rdx 0x10def828f660 rsi 0xffffffffffffffff rdi 0x1
rbp 0x7fe16fe65f60 r8 0x5c r9 0x22
r10 0x5c r11 0x206 r12 0x10def828f660
r13 0x5c r14 0x1518d4e59a0 r15 0x10def821f07c
rip 0x1518d209b6c rsp 0x7fe16fe65f48 rflags 0x206
vector: 0x63, error code: 0x0
10 ffffffff936bcf30 (+140608043388976) 000001518d209b6c <libroot.so> _kern_write + 0x0c
11 00007fe16fe65f60 (+ 48) 000001518d23e38a <libroot.so> _IO_new_file_write + 0x3a
12 00007fe16fe65f90 (+ 48) 000001518d23de81 <libroot.so> _IO_file_setbuf (nearest) + 0x81
13 00007fe16fe65fc0 (+ 32) 000001518d23ec91 <libroot.so> _IO_do_write + 0x21
14 00007fe16fe65fe0 (+ 32) 000001518d23efd5 <libroot.so> _IO_file_overflow + 0x105
15 00007fe16fe66000 (+ 80) 000001518d2401bc <libroot.so> _IO_default_xsputn + 0x8c
16 00007fe16fe66050 (+ 80) 000001518d23e8a3 <libroot.so> _IO_new_file_xsputn + 0x193
17 00007fe16fe660a0 (+ 48) 000001518d2418b2 <libroot.so> fputs + 0x62
18 00007fe16fe660d0 (+ 64) 0000025009c0ea78 <make> child_access (nearest) + 0x1c8
19 00007fe16fe66110 (+ 288) 0000025009c0f126 <make> output_start + 0x86
20 00007fe16fe66230 (+ 96) 0000025009c0a9f4 <make> construct_command_argv (nearest) + 0x314
21 00007fe16fe66290 (+ 96) 0000025009c0ac78 <make> construct_command_argv (nearest) + 0x598
22 00007fe16fe662f0 (+ 80) 0000025009c0b6be <make> reap_children (nearest) + 0x83e
23 00007fe16fe66340 (+ 176) 0000025009c0bc78 <make> new_job + 0x248
24 00007fe16fe663f0 (+ 176) 0000025009c1684b <make> notice_finished_file (nearest) + 0x136b
25 00007fe16fe664a0 (+ 128) 0000025009c16ced <make> notice_finished_file (nearest) + 0x180d
26 00007fe16fe66520 (+ 176) 0000025009c15c8c <make> notice_finished_file (nearest) + 0x7ac
27 00007fe16fe665d0 (+ 128) 0000025009c16ced <make> notice_finished_file (nearest) + 0x180d
28 00007fe16fe66650 (+ 176) 0000025009c15c8c <make> notice_finished_file (nearest) + 0x7ac
29 00007fe16fe66700 (+ 128) 0000025009c16ced <make> notice_finished_file (nearest) + 0x180d
30 00007fe16fe66780 (+ 176) 0000025009c15c8c <make> notice_finished_file (nearest) + 0x7ac
31 00007fe16fe66830 (+ 128) 0000025009c170eb <make> update_goal_chain + 0x12b
32 00007fe16fe668b0 (+3264) 0000025009bfd438 <make> main + 0x15c8
33 00007fe16fe67570 (+ 48) 0000025009bfddce <make> _start + 0x3e
34 00007fe16fe675a0 (+ 48) 000001d141e85ae5 </boot/system/runtime_loader@0x000001d141e76000> <unknown> + 0xfae5
35 00007fe16fe675d0 (+ 0) 00007fffffd5e258 <commpage> commpage_thread_exit + 0x00
Change History (10)
follow-up: 3 comment:1 by , 2 years ago
comment:2 by , 2 years ago
Component: | System/libroot.so → System/Kernel |
---|
follow-up: 4 comment:3 by , 2 years ago
Replying to jessicah:
Seems to be faulting at https://github.com/haiku/haiku/blob/master/src/system/kernel/arch/x86/64/interrupts.S#L451?
This is the syscall call. This would mean the iframe is corrupted.
comment:4 by , 2 years ago
Replying to korli:
This is the syscall call. This would mean the iframe is corrupted.
That's what I thought might be happening. Any ideas on how to debug it, or what might be causing it? It's 100% reproducible here.
comment:5 by , 2 years ago
One possibility is to add a huge buffer (i.e. over a page in size) to the top of _user_write and see whether this fixes the crash. That would indicate a stack overflow. You should then be able to use mprotect to make the page unwritable and catch where the overflow happens.
comment:6 by , 23 months ago
Grab http://haiku.nz/files/node.zip. My IP may change from time to time, so let me know if you can't download it.
unzip node.zip
cp -r install/* ~/config/non-packaged/
pkgman install -y c_ares yarn
git clone --depth=1 https://github.com/microsoft/vscode
cd vscode
yarn
comment:7 by , 20 months ago
Note that this appears to be an "execute" fault, i.e. instruction fetch:
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x0, ip 0x0, write 0, user 0, exec 1, thread 0x331
I guess this was run on a machine/VM without SMAP/SMEP enabled, because otherwise the SMEP check would have triggered instead.
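For reference, the `error code: 0x10` shown in the kernel iframe can be decoded with the standard x86 page-fault error-code bit layout, which is where the "execute" reading comes from. The following is only a minimal Python sketch: the function and constant names are made up for illustration and are not identifiers from the Haiku kernel.

```python
# x86 page-fault error code bits, as defined by the architecture.
PF_PRESENT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1     # the access was a write
PF_USER = 1 << 2      # the access came from user mode
PF_RESERVED = 1 << 3  # a reserved bit was set in a paging structure
PF_INSTR = 1 << 4     # the access was an instruction fetch


def decode_page_fault(error_code: int) -> dict:
    """Split an x86 page-fault error code into named boolean flags.

    Hypothetical helper for illustration; not part of any kernel API.
    """
    return {
        "present": bool(error_code & PF_PRESENT),
        "write": bool(error_code & PF_WRITE),
        "user": bool(error_code & PF_USER),
        "reserved": bool(error_code & PF_RESERVED),
        "exec": bool(error_code & PF_INSTR),
    }


# Error code taken from the kernel iframe in the KDL output of this ticket.
flags = decode_page_fault(0x10)
print(flags)
```

For 0x10 only the instruction-fetch bit is set, which agrees with `write 0, user 0, exec 1` in the log line: the kernel tried to fetch an instruction from address 0x0.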
comment:8 by , 20 months ago
While inspecting the code, I noticed a possibility. I can't download the zip, so I can't be sure at the moment this is the exact same bug. But I highly suspect it is:
PANIC: SMEP violation user-mapped address 0x0000000000000000 touched from kernel 0x0000000000000000
Welcome to Kernel Debugging Land...
Thread 2100 "tcp_connection_test" running on CPU 1
stack trace for thread 2100 "tcp_connection_test"
kernel stack: 0xffffffff82564000 to 0xffffffff82569000
user stack: 0x00007fa8e0660000 to 0x00007fa8e1660000
frame caller <image>:function + offset
0 ffffffff82568828 (+ 24) ffffffff80146cfc <kernel_x86_64> arch_debug_call_with_fault_handler + 0x16
1 ffffffff82568840 (+ 80) ffffffff800b0418 <kernel_x86_64> debug_call_with_fault_handler + 0x78
2 ffffffff82568890 (+ 96) ffffffff800b1a83 <kernel_x86_64> kernel_debugger_loop(char const*, char const*, __va_list_tag*, int) + 0xf3
3 ffffffff825688f0 (+ 80) ffffffff800b1e1e <kernel_x86_64> kernel_debugger_internal(char const*, char const*, __va_list_tag*, int) + 0x6e
4 ffffffff82568940 (+ 240) ffffffff800b2177 <kernel_x86_64> panic + 0xb7
5 ffffffff82568a30 (+ 64) ffffffff80152c32 <kernel_x86_64> x86_page_fault_exception + 0x122
6 ffffffff82568a70 (+ 904) ffffffff801484ec <kernel_x86_64> int_bottom + 0x80
kernel iframe at 0xffffffff82568df8 (end = 0xffffffff82568ec0)
rax 0xffffffff801bb040 rbx 0x0 rcx 0x834
rdx 0x2 rsi 0x0 rdi 0xffffffff983dfe70
rbp 0xffffffff82568f20 r8 0x1 r9 0x5
r10 0x3 r11 0x7f r12 0x1
r13 0xffffffff80001301 r14 0x134b08fe148 r15 0xffffffff9852c280
rip 0x0 rsp 0xffffffff82568ec8 rflags 0x10246
vector: 0xe, error code: 0x10
7 ffffffff82568df8 (+ 296) 0000000000000000
8 ffffffff82568f20 (+ 16) ffffffff801487ef <kernel_x86_64> x86_64_syscall_entry + 0xfb
user iframe at 0xffffffff82568f30 (end = 0xffffffff82568ff8)
rax 0x91 rbx 0x134b08fe148 rcx 0x476b5058bc
rdx 0x134b08fe148 rsi 0xffffffffffffffff rdi 0x4
rbp 0x7fa8e165f7c0 r8 0x476b7e51d8 r9 0x5
r10 0x5 r11 0x202 r12 0x4
r13 0x3 r14 0x7fa8e1660658 r15 0x0
rip 0x476b5058bc rsp 0x7fa8e165f7a8 rflags 0x202
vector: 0x63, error code: 0x0
9 ffffffff82568f30 (+140365421045904) 000000476b5058bc <libroot.so> _kern_write + 0x0c
10 00007fa8e165f7c0 (+ 176) 00000134b08fdd1e <_APP_> main + 0x21e
11 00007fa8e165f870 (+ 48) 00000134b08fdf5f <_APP_> _start + 0x3f
12 00007fa8e165f8a0 (+ 48) 00000174bd03eae5 </boot/system/runtime_loader@0x00000174bd02f000> <unknown> + 0xfae5
13 00007fa8e165f8d0 (+ 0) 00007fffff2fa258 <commpage> commpage_thread_exit + 0x00
comment:9 by , 20 months ago
The problem that caused the above KDL is fixed in hrev56959. If it's indeed the same as this ticket, then this ticket is also fixed. So, please test.
comment:10 by , 20 months ago
Milestone: | Unscheduled → R1/beta5 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
yarn succeeded without any KDL after the fix, indeed.