Opened 3 years ago
Closed 23 months ago
#17468 closed bug (fixed)
riscv64 images built with icu compiled under gcc 11.x lockup at boot
Reported by: | kallisti5 | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | System/Kernel | Version: | R1/beta3 |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | riscv64 |
Description
The unmatched is no longer booting in recent commits...
ahci: failed to get pci x86 module
module: Search for bus_managers/pci/x86/v1 failed.
ahci: failed to get pci x86 module
module: Search for bus_managers/pci/x86/v1 failed.
ahci: failed to get pci x86 module
module: Search for bus_managers/pci/x86/v1 failed.
ahci: failed to get pci x86 module
vm_soft_fault: va 0x34f5de1000 not covered by area in address space
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x34f5de1ff0, ip 0x257ee83c4c, write 1, user 1, exec 0, thread 0x2ce
thread_hit_serious_debug_event(): Failed to install debugger: thread: 718: Bad port ID
error starting "/boot/system/servers/launch_daemon" error = -1
Attachments (1)
Change History (24)
by , 3 years ago
Attachment: | boot_logs.txt added |
---|
comment:1 by , 3 years ago
comment:3 by , 3 years ago
I've confirmed that gcc11 is to blame by building the latest Haiku code with the gcc8 buildtools repo.
The first thing to test is likely getting a gcc11 syslibs package and updating the build-package repo with it.
comment:4 by , 3 years ago
I've bootstrapped haiku for riscv64 and updated our build-packages. The sifive unmatched still isn't booting with the same error.
gcc 11 introduced some major problem.
comment:5 by , 3 years ago
It is infinite recursion in runtime_loader.
FP: 0x3dd877aaa0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877aac0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877aae0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab00, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab20, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab40, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab60, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877ab80, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877aba0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
FP: 0x3dd877abc0, PC: 0x33b765164c </boot/system/runtime_loader> 0x1464c
kdebug>
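A quick arithmetic check on the trace above: every frame reports the same PC, and the frame pointer appears to advance by a fixed amount each time, which is what a function calling itself would produce. The sketch below is only an illustration; the helper name `constant_fp_stride` is hypothetical and not part of any Haiku tool.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the common difference between consecutive
 * frame pointers, or 0 if the differences are not all equal (or if fewer
 * than two frames are given).
 */
uint64_t constant_fp_stride(const uint64_t *fp, size_t count)
{
	if (count < 2)
		return 0;

	uint64_t stride = fp[1] - fp[0];
	for (size_t i = 2; i < count; i++) {
		if (fp[i] - fp[i - 1] != stride)
			return 0;
	}
	return stride;
}
```

Fed the first few FP values from the trace (0x3dd877aaa0, 0x3dd877aac0, 0x3dd877aae0, 0x3dd877ab00), this gives a stride of 0x20, i.e. 32 bytes per frame.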
comment:6 by , 3 years ago
0000000000114632 <memset>:
114632: 01 11        addi sp, sp, -32
114634: 22 e8        sd s0, 16(sp)
114636: 26 e4        sd s1, 8(sp)
114638: 06 ec        sd ra, 24(sp)
11463a: 00 10        addi s0, sp, 32
11463c: aa 84        mv s1, a0
11463e: 19 c6        beqz a2, 0x11464c <memset+0x1a>
114640: 93 f5 f5 0f  andi a1, a1, 255
114644: 97 40 ff ff  auipc ra, 1048564
114648: e7 80 c0 00  jalr 12(ra)
11464c: e2 60        ld ra, 24(sp) // <-- HERE
11464e: 42 64        ld s0, 16(sp)
114650: 26 85        mv a0, s1
114652: a2 64        ld s1, 8(sp)
114654: 05 61        addi sp, sp, 32
114656: 82 80        ret
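For anyone reading the listing: the function body contains no store loop at all, only a call (`auipc`/`jalr`) followed by the epilogue, and the line marked HERE is the return address that repeats in the backtrace. That is consistent with the compiler having turned a byte-fill loop into a call to `memset`, which, inside `memset` itself, recurses forever. The sketch below shows the kind of loop in question. It is a hypothetical illustration (the name `sketch_memset` is made up, and the body is not copied from Haiku's generic memset.c).

```c
#include <stddef.h>

/*
 * Hypothetical byte-wise fill, for illustration only.
 * A compiler that recognizes this loop as "a memset" may emit a call to
 * memset instead of the loop; if the function being compiled IS memset,
 * the result calls itself.
 */
void *sketch_memset(void *dest, int value, size_t count)
{
	unsigned char *p = (unsigned char *)dest;
	unsigned char byte = (unsigned char)value; /* like "andi a1, a1, 255" */

	while (count-- > 0)
		*p++ = byte;

	return dest;
}
```

Whether a given build keeps the loop or replaces it with a call depends on the compiler version and flags, so it is worth checking the disassembly rather than assuming.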
follow-up: 9 comment:7 by , 3 years ago
That problem should have already been resolved by hrev55661.
comment:8 by , 3 years ago
After fixing the infinite recursion by globally applying -fno-builtin, it now crashes in libicu:
PANIC: thread_hit_serious_debug_event
Welcome to Kernel Debugging Land...
Thread 282 "launch_daemon" running on CPU 0
Stack:
FP: 0xffffffc000004a80
FP: 0xffffffc000004aa0, PC: 0xffffffc000152d8a <kernel_riscv64> arch_debug_call_with_fault_handler + 32
FP: 0xffffffc000004af0, PC: 0xffffffc0000d3b88 <kernel_riscv64> debug_call_with_fault_handler.localalias + 128
FP: 0xffffffc000004b80, PC: 0xffffffc0000d4ee8 <kernel_riscv64> _ZL20kernel_debugger_loopPKcS0_Pvi + 324
FP: 0xffffffc000004bf0, PC: 0xffffffc0000d5290 <kernel_riscv64> _ZL24kernel_debugger_internalPKcS0_Pvi + 284
FP: 0xffffffc000004c30, PC: 0xffffffc0000d5524 <kernel_riscv64> panic + 92
FP: 0xffffffc000004ca0, PC: 0xffffffc0000e1e36 <kernel_riscv64> _ZL30thread_hit_serious_debug_event22debug_debugger_messagePKvi + 38
FP: 0xffffffc000004d00, PC: 0xffffffc0000e21c4 <kernel_riscv64> user_debug_exception_occurred + 78
FP: 0xffffffc000004de0, PC: 0xffffffc00013f91a <kernel_riscv64> vm_page_fault + 460
FP: 0xffffffc000004ed0, PC: 0xffffffc0001541fc <kernel_riscv64> STrap + 800
FP: 0xffffffc000004ff0, PC: 0xffffffc000151d38 <kernel_riscv64> SVecU + 120
STrap(exception execPageFault)
sstatus: (ie: {u}, pie: {s}, spp: u, fs: dirty, xs: off, sum: 0, mxr: 0, uxl: 2, sd: 1)
stval: 0x3205051300134516
ra: 0x0000003efcd2c292 t6: 0x000000000000000e
sp: 0x0000003f0182ca00 gp: 0x0000000000000000
tp: 0x0000003f0182d000 t0: 0x000000000000000c
t1: 0x0000003acc8b11dc t2: 0x0000000000000000
t5: 0x0000000000000040 s1: 0xffffffffffffffff
a0: 0x00000039f9fd8100 a1: 0x0000000000000001
a2: 0x0000000000000020 a3: 0x00000039fadc3000
a4: 0x0000000100000000 a5: 0x3205051300134517
a6: 0x0000000000000016 a7: 0x00000006b4d3a62c
s2: 0x0000002e990323e8 s3: 0xfffffffffffffffe
s4: 0x00000027cdf81c12 s5: 0xfffffffffffffffe
s6: 0x0000003efcd9bf18 s7: 0x0000003acc8bf140
s8: 0xfffffffffffffffd s9: 0xfffffffffffffffd
s10: 0xffffffffffffffff s11: 0xffffffffffffffff
t3: 0x0000003acc8b58bc t4: 0x0000000000000000
fp: 0x0000003f0182ca90 epc: 0x3205051300134516
FP: 0x3f0182ca90, PC: 0x3205051300134516 0x3205051300134516
FP: 0x0, PC: 0x2e98f01270 <libicuuc.so.67> _ZN6icu_676UMutex8getMutexEv + 122
kdebug>
comment:9 by , 3 years ago
Replying to waddlesplash:
That problem should have already been resolved by hrev55661.
Now it occurs in runtime_loader, not libroot.so. The Jamfile is here: https://git.haiku-os.org/haiku/tree/src/system/runtime_loader/arch/riscv64/Jamfile.
comment:11 by , 3 years ago
Replying to waddlesplash:
The jamfile reuses the already-built .o from libroot.
But not for riscv64, see the Jamfile above. It uses the generic C memset, not assembly code.
comment:13 by , 3 years ago
The libicu crash may be caused by an incorrectly built gcc_syslibs[_devel] package. develop/headers/c++/riscv64-unknown-haiku/bits/gthr-default.h contains a stub instead of the pthread implementation.
comment:14 by , 3 years ago
odd. I have https://github.com/haikuports/haikuports.cross/blob/master/sys-devel/gcc_bootstrap/gcc_bootstrap-11.2.0_2021_07_28.recipe#L185 configured for pthread.
--enable-threads=posix
comment:15 by , 3 years ago
Note that the same problem was present in gcc8 bootstrap, but older ICU was used.
comment:16 by , 3 years ago
ack. ok that makes sense. I'll dig into the ICU issue.
As a note here, we added the same no-builtin fix for arch_string in hrev55753 across all architectures. That adjusted the behavior of memset
0000000000112c4a <memset>:
112c4a: 41 11        addi sp, sp, -16
112c4c: 22 e0        sd s0, 0(sp)
112c4e: 06 e4        sd ra, 8(sp)
112c50: 2a 84        mv s0, a0
112c52: 09 c6        beqz a2, 0x112c5c <memset+0x12>
112c54: 93 f5 f5 0f  andi a1, a1, 255
112c58: ef 50 9f 9f  jal 0x108650 <.plt+0x590>
112c5c: a2 60        ld ra, 8(sp)
112c5e: 22 85        mv a0, s0
112c60: 02 64        ld s0, 0(sp)
112c62: 41 01        addi sp, sp, 16
112c64: 82 80        ret
I can confirm here my unmatched desktop is booting again after hrev55753
comment:17 by , 3 years ago
comment:18 by , 3 years ago
so.. still seeing the original crash on my unmatched even after hrev55754
SiFive unmatched:
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0xffffff9300134516, ip 0xffffff9300134516, write
thread_hit_serious_debug_event(): Failed to install debugger: thread: 702: Bad port ID
Compiling kernel memset.o
.../riscv64-unknown-haiku-gcc ... -fno-builtin-fork -fno-builtin-vfork -march=rv64gc -nostdinc -finline -fno-builtin -Wno-main ... -c "../src/system/libroot/posix/string/arch/generic/memset.c" ... "objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/memset.o"
Linking kernel memset:
.../riscv64-unknown-haiku-ld -Bstatic -Bsymbolic -nostdlib -znocombreloc -no-undefined -r objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/arch_elf.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/arch_uart_sifive.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/sbi_syscalls.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/debug_uart.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/debug_uart_8250.o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/arch_cpu.o objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/byteorder.o objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/memcpy.o objects/haiku/riscv64/release/system/kernel/lib/arch/riscv64/memset.o -o objects/haiku/riscv64/release/system/boot/arch/riscv64/efi/boot_arch_riscv64.o
compiling libroot memset:
.../riscv64-unknown-haiku-gcc -O2 -Wall -Wno-multichar -Wpointer-arith -Wsign-compare -Wmissing-prototypes -fno-strict-aliasing -fno-delete-null-pointer-checks -fno-builtin-fork -fno-builtin-vfork -march=rv64gc -nostdinc -fno-builtin -c "../src/system/libroot/posix/string/arch/riscv64/../generic/memset.c" ... -o "objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/memset.o"
Linking libroot memset:
.../riscv64-unknown-haiku-ld -r objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/memcpy.o objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/memset.o -o objects/haiku/riscv64/release/system/libroot/posix/string/arch/riscv64/posix_string_arch_riscv64.o
memset from bootloader post hrev55754:
0000000000009b9a <memset>:
9b9a: 0ff5f593  zext.b a1,a1
9b9e: 00c50733  add a4,a0,a2
9ba2: 87aa      mv a5,a0
9ba4: c611      beqz a2,9bb0 <memset+0x16>
9ba6: 0785      addi a5,a5,1
9ba8: feb78fa3  sb a1,-1(a5)
9bac: fee79de3  bne a5,a4,9ba6 <memset+0xc>
9bb0: 8082      ret
to me, everything looks correct :-|
comment:19 by , 3 years ago
Haiku riscv64 hrev55862 boots successfully in TinyEMU when ICU is downgraded to icu-57.2-2.
comment:20 by , 3 years ago
Summary: | unmatched hang - bad address → riscv64 images built with icu compiled under gcc 11.x lockup at boot |
---|
indeed. I've pushed that as a temporary workaround in hrev56122.
The important bit seems to be that icu-57.2-2 was compiled with gcc 8.x. The non-functional icu 66 and icu 70 were built with gcc 11.2.0 or 11.3.0.
It seems less tied to ICU version, and more tied to gcc toolchain used.
comment:21 by , 3 years ago
comment:23 by , 23 months ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
It might be related to the recent gcc11 merge. We are still using gcc8 syslibs. Going to unbootstrap and see if new build-packages solves it.