#14177 closed bug (fixed)
[KDL] in vfs_vnode_io, lots of failures writing back pages
Reported by: | jessicah | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | R1/beta2 |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | vfs | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All |
Description
System KDL'd in vfs_vnode_io with GPE. Looks something like:
PANIC: Unexpected exception "General Protection Exception" occurred in kernel mode! Error code: 0x0
Thread 9 "page writer" running on CPU 0
stack trace for thread 9 "page writer"
    kernel stack: 0xffffffff81cc6000 to 0xffffffff81ccb000
....
kernel iframe at 0xffffffff81ccac28 (end = 0xfffffff81ccacf0)
 rax 0xdeadbeefdeadbeef  rbx 0xffffffff92597680  rcx 0x1
 rdx 0xffffffff92e5f868  rsi 0xffffffff90765b90  rdi 0xffffffff92597680
 rbp 0xffffffff81ccad30   r8 0x1000               r9 0x7
 r10 0xffffffff          r11 0x0                 r12 0xffffffff92e5f868
 r13 0xffffffff90765b90  r14 0x1                 r15 0x1000
 rip 0xffffffff800fdedb  rsp 0xffffffff81ccacf0  flags 0x10286
 vector: 0xd, error code: 0x0
10 ffffffff81ccac28 (+ 264) ffffffff800fdedb <kernel_x86_64> vfs_vnode_io + 0x1a
More detail in attached screenshot.
Running in VirtualBox with 3 vCPUs, hrev51985. It happened while running haikuporter --no-source-package llvm -j3
Also, noticed PageWriteWrapper suddenly had serious issues writing pages, e.g.
KERN: acquire_advisory_lock(vnode = 0xffffffffa287dd80, flock = 0xffffffff80798eb0, wait = yes)
KERN: low resource pages: normal -> note
KERN: low resource pages: note -> normal
KERN: bfs: bfs_io:502: Invalid Argument
KERN: Last message repeated 251 times.
KERN: PageWriteWrapper: Failed to write page 0xffffffff82824a40: Invalid Argument
KERN: PageWriteWrapper: Failed to write page 0xffffffff82824ae0: Invalid Argument
KERN: PageWriteWrapper: Failed to write page 0xffffffff82824b80: Invalid Argument
KERN: PageWriteWrapper: Failed to write page 0xffffffff82819e60: Invalid Argument
Attachments (2)
Change History (12)
by , 7 years ago
Attachment: | syslog-page-writer.txt added |
---|
by , 7 years ago
Attachment: | general protection fault.PNG added |
---|
follow-up: 3 comment:1 by , 7 years ago
follow-up: 4 comment:2 by , 7 years ago
follow-up: 5 comment:3 by , 7 years ago
Replying to waddlesplash:
KDL message is the same as #14160. Possibly related to GCC7 upgrade then; it would be nice to figure out the root of the problem instead of just disabling that optimization for even more files. (Are we mishandling the SSE registers somewhere? Is something getting misaligned?)
My comment from the other ticket still stands: "It would be nice to point to the documentation reference for the GCC flag."
comment:4 by , 7 years ago
Replying to waddlesplash:
Maybe related: #11920 (see especially Simon South's last comment) and #10509 comment 13 ("still crashing due to unaligned stack access using movdqa").
It's a GPE because the address is not in canonical form, otherwise it would be a normal pagefault.
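For reference, here is a minimal C sketch of the two conditions mentioned in this comment. It assumes 48-bit virtual addresses (4-level paging), where an address is canonical only if bits 63..47 are all equal, and it uses the 16-byte alignment that movdqa requires for a memory operand (a misaligned operand also raises a general protection exception). The function names are hypothetical and not taken from the Haiku sources, and the sketch does not establish which register the faulting instruction actually dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses (4-level paging on x86_64). */
#define VIRTUAL_ADDRESS_BITS 48

/* An address is canonical when bits 63..47 are all zeros or all ones. */
bool is_canonical(uint64_t address)
{
	uint64_t top = address >> (VIRTUAL_ADDRESS_BITS - 1);
	uint64_t all_ones = (UINT64_C(1) << (64 - VIRTUAL_ADDRESS_BITS + 1)) - 1;
	return top == 0 || top == all_ones;
}

/* movdqa needs its memory operand aligned to 16 bytes. */
bool is_aligned_16(uint64_t address)
{
	return (address & 0xF) == 0;
}

/* Values copied from the register dump in the ticket description. */
void check_ticket_values(void)
{
	/* rax: not canonical, so dereferencing it would raise a GPE. */
	assert(!is_canonical(UINT64_C(0xdeadbeefdeadbeef)));
	/* rip: an ordinary canonical kernel address. */
	assert(is_canonical(UINT64_C(0xffffffff800fdedb)));
	/* rsp at the time of the fault is 16-byte aligned. */
	assert(is_aligned_16(UINT64_C(0xffffffff81ccacf0)));
}
```

Under these assumptions, the rax value from the dump fails the canonical check while rip passes it, which fits the explanation above of why this shows up as a GPE rather than a page fault.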
comment:5 by , 7 years ago
Replying to korli:
My comment from the other ticket still stands: "It would be nice to point to the documentation reference for the GCC flag."
I dumped the disassembly of the kernel with and without the optimization, but couldn't find a difference between the two. Maybe I'm doing something wrong.
follow-up: 7 comment:6 by , 7 years ago
Since "rtl-stv1" is a GCC pass, not a specific optimization group that can be disabled, the only real documentation is here: https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html#Developer-Options
On IRC, the GCC developers informed me that pass is a vectorization pass. If you do diffs of assembly of files with and without it, you will see that an awful lot of code now uses SSE registers.
Maybe I'm doing something wrong.
Changing kernel flags or even ObjectC++Flags or the like does not cause jam to rebuild, apparently. So if you are trying to test with/without a certain flag, you will need to delete the kernel objects directory in between runs (something like rm -rf generated/objects/haiku/x86_64/system/kernel iirc.)
comment:7 by , 7 years ago
Replying to waddlesplash:
On IRC, the GCC developers informed me that pass is a vectorization pass. If you do diffs of assembly of files with and without it, you will see that an awful lot of code now uses SSE registers.
AFAICT this was already the case with GCC 5.4.
Changing kernel flags or even ObjectC++Flags or the like does not cause jam to rebuild, apparently. So if you are trying to test with/without a certain flag, you will need to delete the kernel objects directory in between runs (something like rm -rf generated/objects/haiku/x86_64/system/kernel iirc.)
That's actually how I built each kernel.
comment:9 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
All other linked tickets of GPEs were reported fixed following stack alignment changes. So closing this one as fixed also; nobody seems to have seen it since.
comment:10 by , 5 years ago
Milestone: | Unscheduled → R1/beta2 |
---|
Assign tickets with status=closed and resolution=fixed within the R1/beta2 development window to the R1/beta2 Milestone