Opened 3 years ago
Closed 23 months ago
#17468 closed bug (fixed)
riscv64 images built with icu compiled under gcc 11.x lockup at boot
Reported by: | kallisti5 | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | System/Kernel | Version: | R1/beta3 |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | riscv64 |
Description
The unmatched is no longer booting in recent commits...
ahci: failed to get pci x86 module
module: Search for bus_managers/pci/x86/v1 failed.
ahci: failed to get pci x86 module
module: Search for bus_managers/pci/x86/v1 failed.
ahci: failed to get pci x86 module
module: Search for bus_managers/pci/x86/v1 failed.
ahci: failed to get pci x86 module
vm_soft_fault: va 0x34f5de1000 not covered by area in address space
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x34f5de1ff0, ip 0x257ee83c4c, write 1, user 1, exec 0, thread 0x2ce
thread_hit_serious_debug_event(): Failed to install debugger: thread: 718: Bad port ID
error starting "/boot/system/servers/launch_daemon" error = -1
Attachments (1)
Change History (24)
by , 3 years ago
Attachment: | boot_logs.txt added |
---|
comment:1 by , 3 years ago
comment:3 by , 3 years ago
I've confirmed that gcc11 is to blame by building the latest Haiku code with the gcc8 buildtools repo. Haiku boots as usual when compiled with gcc8 buildtools.
The first thing to test is likely getting a gcc11 syslibs package and updating the build-package repo with it.
comment:4 by , 3 years ago
I've bootstrapped haiku for riscv64 and updated our build-packages. The sifive unmatched still isn't booting with the same error.
gcc 11 introduced some major problem.
comment:5 by , 3 years ago
It is infinite recursion in runtime_loader.
FP: 0x3dd877aaa0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877aac0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877aae0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab00, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab20, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab40, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab60, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab80, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877aba0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877abc0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
kdebug>
comment:6 by , 3 years ago
0000000000114632 <memset>:
114632: 01 11       addi sp, sp, -32
114634: 22 e8       sd s0, 16(sp)
114636: 26 e4       sd s1, 8(sp)
114638: 06 ec       sd ra, 24(sp)
11463a: 00 10       addi s0, sp, 32
11463c: aa 84       mv s1, a0
11463e: 19 c6       beqz a2, 0x11464c <memset+0x1a>
114640: 93 f5 f5 0f andi a1, a1, 255
114644: 97 40 ff ff auipc ra, 1048564
114648: e7 80 c0 00 jalr 12(ra)
11464c: e2 60       ld ra, 24(sp) // <-- HERE
11464e: 42 64       ld s0, 16(sp)
114650: 26 85       mv a0, s1
114652: a2 64       ld s1, 8(sp)
114654: 05 61       addi sp, sp, 32
114656: 82 80       ret
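For readers following along: a plain byte-store loop is the kind of pattern an optimizing compiler may replace with a call to memset. Inside a function that is itself named memset, such a call re-enters the function, which would be consistent with the repeated return address 0x1464c in the backtrace above. The following is only a minimal sketch of a fill routine that avoids this, not the Haiku source. The names sketch_memset and sketch_memset_selfcheck are hypothetical, and the volatile qualifier is one illustrative way to force the loop to stay as written.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical name: deliberately not called memset, so this sketch can be
 * compiled next to a real C library without clashing with it.
 */
void *sketch_memset(void *dest, int value, size_t count)
{
	/*
	 * Stores through a volatile pointer are observable accesses, so the
	 * compiler has to emit them one by one instead of substituting a
	 * library call for the whole loop.
	 */
	volatile unsigned char *out = dest;
	unsigned char byte = (unsigned char)value; /* same effect as "andi a1, a1, 255" */

	while (count-- > 0)
		*out++ = byte;

	return dest;
}

/* Small self-check: fills the middle of a buffer and verifies the edges. */
void sketch_memset_selfcheck(void)
{
	unsigned char buffer[8] = {0};
	void *result = sketch_memset(buffer + 1, 0x1ab, 6);

	assert(result == buffer + 1);
	assert(buffer[0] == 0 && buffer[7] == 0);
	for (size_t i = 1; i < 7; i++)
		assert(buffer[i] == 0xab);
}
```

The volatile approach costs some performance because it also blocks vectorization; a compiler flag that disables the substitution leaves the source untouched, which is the route taken later in this ticket.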
follow-up: 9 comment:7 by , 3 years ago
That problem should have already been resolved by hrev55661.
comment:8 by , 3 years ago
After fixing the infinite recursion by globally applying -fno-builtin, it now crashes in libicu:
PANIC: thread_hit_serious_debug_event
Welcome to Kernel Debugging Land...
Thread 282 "launch_daemon" running on CPU 0
Stack:
FP: 0xffffffc000004a80
FP: 0xffffffc000004aa0, PC: 0xffffffc000152d8a <kernel_riscv64> arch_debug_call_with_fault_handler + 32
FP: 0xffffffc000004af0, PC: 0xffffffc0000d3b88 <kernel_riscv64> debug_call_with_fault_handler.localalias + 128
FP: 0xffffffc000004b80, PC: 0xffffffc0000d4ee8 <kernel_riscv64> _ZL20kernel_debugger_loopPKcS0_Pvi + 324
FP: 0xffffffc000004bf0, PC: 0xffffffc0000d5290 <kernel_riscv64> _ZL24kernel_debugger_internalPKcS0_Pvi + 284
FP: 0xffffffc000004c30, PC: 0xffffffc0000d5524 <kernel_riscv64> panic + 92
FP: 0xffffffc000004ca0, PC: 0xffffffc0000e1e36 <kernel_riscv64> _ZL30thread_hit_serious_debug_event22debug_debugger_messagePKvi + 38
FP: 0xffffffc000004d00, PC: 0xffffffc0000e21c4 <kernel_riscv64> user_debug_exception_occurred + 78
FP: 0xffffffc000004de0, PC: 0xffffffc00013f91a <kernel_riscv64> vm_page_fault + 460
FP: 0xffffffc000004ed0, PC: 0xffffffc0001541fc <kernel_riscv64> STrap + 800
FP: 0xffffffc000004ff0, PC: 0xffffffc000151d38 <kernel_riscv64> SVecU + 120
STrap(exception execPageFault)
sstatus: (ie: {u}, pie: {s}, spp: u, fs: dirty, xs: off, sum: 0, mxr: 0, uxl: 2, sd: 1)
stval: 0x3205051300134516
ra: 0x0000003efcd2c292 t6: 0x000000000000000e
sp: 0x0000003f0182ca00 gp: 0x0000000000000000
tp: 0x0000003f0182d000 t0: 0x000000000000000c
t1: 0x0000003acc8b11dc t2: 0x0000000000000000
t5: 0x0000000000000040 s1: 0xffffffffffffffff
a0: 0x00000039f9fd8100 a1: 0x0000000000000001
a2: 0x0000000000000020 a3: 0x00000039fadc3000
a4: 0x0000000100000000 a5: 0x3205051300134517
a6: 0x0000000000000016 a7: 0x00000006b4d3a62c
s2: 0x0000002e990323e8 s3: 0xfffffffffffffffe
s4: 0x00000027cdf81c12 s5: 0xfffffffffffffffe
s6: 0x0000003efcd9bf18 s7: 0x0000003acc8bf140
s8: 0xfffffffffffffffd s9: 0xfffffffffffffffd
s10: 0xffffffffffffffff s11: 0xffffffffffffffff
t3: 0x0000003acc8b58bc t4: 0x0000000000000000
fp: 0x0000003f0182ca90 epc: 0x3205051300134516
FP: 0x3f0182ca90, PC: 0x3205051300134516 0x3205051300134516
FP: 0x0, PC: 0x2e98f01270 <libicuuc.so.67> _ZN6icu_676UMutex8getMutexEv + 122
kdebug>
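A note on reading the dump above: epc and stval are both 0x3205051300134516, which is the value in a5 (0x3205051300134517) with the low bit cleared. That matches what a RISC-V jalr does to its target, so the fault looks like an indirect jump through a garbage pointer rather than a bad data access. The sketch below only illustrates that arithmetic. The function names are hypothetical, and the Sv39 check is an assumption inferred from the kernel addresses starting at 0xffffffc0...; the ticket does not state the paging mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: Sv39 paging. A virtual address is then valid only if
 * bits 63..38 are all zero or all one (sign extension of bit 38).
 */
bool is_sv39_canonical(uint64_t va)
{
	uint64_t top = va >> 38; /* the 26 bits 63..38 */
	return top == 0 || top == 0x3ffffffULL;
}

/* jalr computes rs1 + imm and clears bit 0 of the result. */
uint64_t jalr_target(uint64_t rs1, int64_t imm)
{
	return (rs1 + (uint64_t)imm) & ~(uint64_t)1;
}

/* Checks the values from the register dump in this comment. */
void fault_triage_selfcheck(void)
{
	/* a5 with the low bit cleared gives the faulting epc. */
	assert(jalr_target(0x3205051300134517ULL, 0) == 0x3205051300134516ULL);

	/* The faulting address is not a valid Sv39 address at all. */
	assert(!is_sv39_canonical(0x3205051300134516ULL));

	/* A kernel PC and the user-space ra from the same dump are valid. */
	assert(is_sv39_canonical(0xffffffc000152d8aULL));
	assert(is_sv39_canonical(0x0000003efcd2c292ULL));
}
```

This does not say where the bad pointer came from; it only narrows the question to how a5 was loaded before the jump in icu_67::UMutex::getMutex().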
comment:9 by , 3 years ago
Replying to waddlesplash:
That problem should have already been resolved by hrev55661.
Now it occurs in runtime_loader, not libroot.so. Jamfile is here: https://git.haiku-os.org/haiku/tree/src/system/runtime_loader/arch/riscv64/Jamfile.
comment:11 by , 3 years ago
Replying to waddlesplash:
The jamfile reuses the already-built .o from libroot.
But not for riscv64, see the Jamfile above. It uses the generic C memset, not assembly code.
comment:13 by , 3 years ago
The libicu crash may be caused by an incorrectly built gcc_syslibs[_devel] package. develop/headers/c++/riscv64-unknown-haiku/bits/gthr-default.h contains a stub instead of the pthread implementation.
comment:14 by , 3 years ago
odd. I have https://github.com/haikuports/haikuports.cross/blob/master/sys-devel/gcc_bootstrap/gcc_bootstrap-11.2.0_2021_07_28.recipe#L185 configured for pthread.
--enable-threads=posix
comment:15 by , 3 years ago
Note that the same problem was present in gcc8 bootstrap, but older ICU was used.
comment:16 by , 3 years ago
ack. ok that makes sense. I'll dig into the ICU issue.
As a note here, we added the same no-builtin fix for arch_string in hrev55753 across all architectures. That adjusted the behavior of memset:
0000000000112c4a <memset>:
112c4a: 41 11       addi sp, sp, -16
112c4c: 22 e0       sd s0, 0(sp)
112c4e: 06 e4       sd ra, 8(sp)
112c50: 2a 84       mv s0, a0
112c52: 09 c6       beqz a2, 0x112c5c <memset+0x12>
112c54: 93 f5 f5 0f andi a1, a1, 255
112c58: ef 50 9f 9f jal 0x108650 <.plt+0x590>
112c5c: a2 60       ld ra, 8(sp)
112c5e: 22 85       mv a0, s0
112c60: 02 64       ld s0, 0(sp)
112c62: 41 01       addi sp, sp, 16
112c64: 82 80       ret
I can confirm here my unmatched desktop is booting again after hrev55753
comment:17 by , 3 years ago
comment:18 by , 3 years ago
so.. still seeing the original crash on my unmatched even after hrev55754
SiFive unmatched:
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0xffffff9300134516, ip 0xffffff9300134516, write
thread_hit_serious_debug_event(): Failed to install debugger: thread: 702: Bad port ID
Compiling kernel memset.o
.../riscv64-unknown-haiku-gcc ... -fno-builtin-fork -fno-builtin-vfork -march=rv64gc -nostdinc -finline -fno-builtin -Wno-main ... -c "../src/system/libroot/posix/string/arch/generic/memset.c" ... "objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/memset.o"
Linking kernel memset:
.../riscv64-unknown-haiku-ld -Bstatic -Bsymbolic -nostdlib -znocombreloc -no-undefined -r objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/arch_elf.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/arch_uart_sifive.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/sbi_syscalls.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/debug_uart.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/debug_uart_8250.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/arch_cpu.o objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/byteorder.o objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/memcpy.o objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/memset.o -o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/boot_arch_riscv64.o
compiling libroot memset:
.../riscv64-unknown-haiku-gcc -O2 -Wall -Wno-multichar -Wpointer-arith -Wsign-compare -Wmissing-prototypes -fno-strict-aliasing -fno-delete-null-pointer-checks -fno-builtin-fork -fno-builtin-vfork -march=rv64gc -nostdinc -fno-builtin -c "../src/system/libroot/posix/string/arch/riscv64/../generic/memset.c" ... -o "objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/memset.o"
Linking libroot memset:
.../riscv64-unknown-haiku-ld -r objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/memcpy.o objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/memset.o -o objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/posix_string_arch_riscv64.o
memset from bootloader post hrev55754:
0000000000009b9a <memset>:
9b9a: 0ff5f593 zext.b a1,a1
9b9e: 00c50733 add a4,a0,a2
9ba2: 87aa     mv a5,a0
9ba4: c611     beqz a2,9bb0 <memset+0x16>
9ba6: 0785     addi a5,a5,1
9ba8: feb78fa3 sb a1,-1(a5)
9bac: fee79de3 bne a5,a4,9ba6 <memset+0xc>
9bb0: 8082     ret
to me, everything looks correct :-|
comment:19 by , 3 years ago
Haiku riscv64 hrev55862 successfully boots in TinyEMU when ICU is downgraded to icu-57.2-2.
comment:20 by , 3 years ago
Summary: | unmatched hang - bad address → riscv64 images built with icu compiled under gcc 11.x lockup at boot |
---|
indeed. I've pushed that as a temporary workaround in hrev56122.
The important bit seems to be that icu-57.2-2 was compiled with gcc 8.x. The non-functional icu 66 and icu 70 were built with gcc 11.2.0 or 11.3.0.
It seems less tied to ICU version, and more tied to gcc toolchain used.
comment:21 by , 2 years ago
comment:23 by , 23 months ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
It might be related to the recent gcc11 merge. We are still using gcc8 syslibs. Going to unbootstrap and see if new build-packages solves it.