Opened 3 years ago
Closed 3 years ago
#17511 closed bug (fixed)
qemu riscv64 no longer booting
Reported by: | kallisti5 | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | R1/beta4 |
Component: | System/Boot Loader/EFI | Version: | R1/beta3 |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | riscv64 |
Description
The gcc 11 upgrade introduced some major bugs in booting Haiku on the Unmatched board and in qemu.
qemu-system-riscv64 -M virt -m 1G -device ati-vga -kernel u-boot.bin \
-drive file=haiku-mmc.image,format=raw,if=virtio \
-usb -device usb-ehci,id=echi -device usb-kbd -device usb-tablet
.
.
Booting /EFI\BOOT\BOOTRISCV64.EFI
PLIC contexts
context 1: 2
cpu id: 0
GOP protocol not found
Welcome to the Haiku boot loader!
add_partitions_for(0x00000000bc6c7258, mountFS = no)
add_partitions_for(fd = 0, mountFS = no)
0x00000000bc6c72b0 Partition::Partition
0x00000000bc6c72b0 Partition::Scan()
check for partitioning_system: GUID Partition Map
check for partitioning_system: Intel Partition Map
priority: 810
check for partitioning_system: Intel Extended Partition
0x00000000bc6c74a0 Partition::Partition
0x00000000bc6c72b0 Partition::AddChild 0x00000000bc6c74a0
0x00000000bc6c74a0 Partition::SetParent 0x00000000bc6c72b0
new child partition!
0x00000000bc6c75b8 Partition::Partition
0x00000000bc6c72b0 Partition::AddChild 0x00000000bc6c75b8
0x00000000bc6c75b8 Partition::SetParent 0x00000000bc6c72b0
new child partition!
0x00000000bc6c72b0 Partition::Scan(): scan child 0x00000000bc6c74a0 (start = 2048, size = 33554432, parent = 0x00000000bc6c72b0)!
0x00000000bc6c74a0 Partition::Scan()
check for partitioning_system: GUID Partition Map
check for partitioning_system: Intel Partition Map
check for partitioning_system: Intel Extended Partition
0x00000000bc6c72b0 Partition::Scan(): scan child 0x00000000bc6c75b8 (start = 33556480, size = 314572800, parent = 0x00000000bc6c72b0)!
0x00000000bc6c75b8 Partition::Scan()
check for partitioning_system: GUID Partition Map
check for partitioning_system: Intel Partition Map
check for partitioning_system: Intel Extended Partition
0x00000000bc6c72b0 Partition::~Partition
0x00000000bc6c74a0 Partition::SetParent 0x0000000000000000
0x00000000bc6c75b8 Partition::SetParent 0x0000000000000000
0x00000000bc6c74a0 Partition::_Mount check for file_system: BFS Filesystem
0x00000000bc6c74a0 Partition::_Mount check for file_system: FAT32 Filesystem
0x00000000bc6c74a0 Partition::_Mount check for file_system: TAR Filesystem
0x00000000bc6c74a0 Partition::~Partition
0x00000000bc6c75b8 Partition::_Mount check for file_system: BFS Filesystem
PackageVolumeInfo::SetTo()
PackageVolumeInfo::_InitState(): failed to parse activated-packages: No such file or directory
load kernel kernel_riscv64...
Unhandled exception: Load access fault
EPC: 00000000be6dfd9e RA: 00000000be6e07bc TVAL: af9f6b284653f724
EPC: 000000007e980d9e RA: 000000007e9817bc reloc adjusted
Code: f4a6 ecce e0da fc5e f862 f466 f06a ec6e (4783 0a95)
UEFI image [0x00000000be6c7000:0x00000000be7237cf] pc=0x18d9e '/EFI\BOOT\BOOTRISCV64.EFI'
resetting ...
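For anyone reading the dump above: the pc=0x18d9e on the "UEFI image" line is just the faulting EPC expressed relative to the image load base. A minimal sketch of that arithmetic follows, using only values printed in the log. The helper name `image_offset` is made up for illustration, and whether the resulting offsets line up directly with a disassembly of the loader depends on how the EFI binary was linked, which I have not checked.

```python
def image_offset(addr: int, image_start: int, image_end: int) -> int:
    """Return addr relative to the image base; raise if addr is outside the image."""
    if not image_start <= addr <= image_end:
        raise ValueError(f"{addr:#x} is outside [{image_start:#x}:{image_end:#x}]")
    return addr - image_start


# Values copied from the exception dump above.
IMAGE_START = 0xBE6C7000
IMAGE_END = 0xBE7237CF
EPC = 0xBE6DFD9E  # faulting instruction
RA = 0xBE6E07BC   # return address of the faulting function

epc_off = image_offset(EPC, IMAGE_START, IMAGE_END)
ra_off = image_offset(RA, IMAGE_START, IMAGE_END)

print(f"epc offset in image: {epc_off:#x}")
print(f"ra offset in image:  {ra_off:#x}")
```

Both addresses fall inside BOOTRISCV64.EFI, so the load access fault happens in the Haiku loader's own code rather than in u-boot code.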
Change History (5)
comment:2 by , 3 years ago
With the trace statements above (and debug enabled for the kernel and bootloader), I ran across this:
PCI: pci_module_init
pci_controller_init()
sizeof(PciDbi): 0x1000
hostCtrlType: ecam
reg[0]: (0x30000000, 0x10000000)
configRegs: (0x30000000, 0x10000000)
interrupt-map:
bus: 0, dev: 0, fn: 0, childIrq: 1, parentIrq: (3, 32)
bus: 0, dev: 0, fn: 0, childIrq: 2, parentIrq: (3, 33)
bus: 0, dev: 0, fn: 0, childIrq: 3, parentIrq: (3, 34)
bus: 0, dev: 0, fn: 0, childIrq: 4, parentIrq: (3, 35)
bus: 0, dev: 1, fn: 0, childIrq: 1, parentIrq: (3, 33)
bus: 0, dev: 1, fn: 0, childIrq: 2, parentIrq: (3, 34)
bus: 0, dev: 1, fn: 0, childIrq: 3, parentIrq: (3, 35)
bus: 0, dev: 1, fn: 0, childIrq: 4, parentIrq: (3, 32)
bus: 0, dev: 2, fn: 0, childIrq: 1, parentIrq: (3, 34)
bus: 0, dev: 2, fn: 0, childIrq: 2, parentIrq: (3, 35)
bus: 0, dev: 2, fn: 0, childIrq: 3, parentIrq: (3, 32)
bus: 0, dev: 2, fn: 0, childIrq: 4, parentIrq: (3, 33)
bus: 0, dev: 3, fn: 0, childIrq: 1, parentIrq: (3, 35)
bus: 0, dev: 3, fn: 0, childIrq: 2, parentIrq: (3, 32)
bus: 0, dev: 3, fn: 0, childIrq: 3, parentIrq: (3, 33)
bus: 0, dev: 3, fn: 0, childIrq: 4, parentIrq: (3, 34)
ranges:
IOPORT (0x01000000): child: 00000000, parent: 03000000, len: 10000
MMIO32 (0x02000000): child: 40000000, parent: 40000000, len: 40000000
MMIO64 (0x03000000): child: 400000000, parent: 400000000, len: 400000000
AllocRegs()
PANIC: Unexpected exception occurred in kernel mode!
Welcome to Kernel Debugging Land...
Thread 14 "main2" running on CPU 0
Stack:
FP: 0xffffffc0029c9e70
FP: 0xffffffc0029c9e90, PC: 0xffffffc00217204a <kernel_riscv64> _ZL22stack_trace_trampolinePv + 16
FP: 0xffffffc0029c9ec0, PC: 0xffffffc002230774 <kernel_riscv64> arch_debug_call_with_fault_handler + 58
FP: 0xffffffc0029c9f10, PC: 0xffffffc002174636 <kernel_riscv64> debug_call_with_fault_handler.localalias + 118
FP: 0xffffffc0029c9fa0, PC: 0xffffffc0021722d2 <kernel_riscv64> _ZL20kernel_debugger_loopPKcS0_Pvi + 638
FP: 0xffffffc0029c9fe0, PC: 0xffffffc0021727a6 <kernel_riscv64> _ZL24kernel_debugger_internalPKcS0_Pvi + 144
FP: 0xffffffc0029ca020, PC: 0xffffffc002174c00 <kernel_riscv64> panic + 104
FP: 0xffffffc0029ca140, PC: 0xffffffc002231cd2 <kernel_riscv64> _ZL10SendSignal20debug_exception_typejimi + 302
FP: 0xffffffc0029ca2e0, PC: 0xffffffc00223220e <kernel_riscv64> STrap + 460
FP: 0xffffffc0029ca400, PC: 0xffffffc00222ee44 <kernel_riscv64> SVec + 100
STrap(exception loadAccessFault)
sstatus: (ie: {}, pie: {s}, spp: s, fs: dirty, xs: off, sum: 0, mxr: 0, uxl: 0, sd: 1)
stval: 0xffffffc006000000
ra: 0xffffffc0025383b6 t6: 0x00000000fbf4fd11
sp: 0xffffffc0029ca400 gp: 0x0000000000000000
tp: 0xffffffc0032bb580 t0: 0x80000000000fbf24
t1: 0xffffffc002532acc t2: 0x0000000000000000
t5: 0x0000000000000000 s1: 0xffffffc00253c8b8
a0: 0xffffffc006000000 a1: 0x0000000000000000
a2: 0x0000000000000000 a3: 0x0000000000000000
a4: 0x0000000010000000 a5: 0xffffffc016000000
a6: 0xffffffc0029ca1c0 a7: 0x0000000000000000
s2: 0x0000000000000000 s3: 0x0000000000000000
s4: 0x0000000000000000 s5: 0x0000000000000008
s6: 0x0000000000000020 s7: 0x000000000000ffff
s8: 0x0000000000000000 s9: 0x0000000002000000
s10: 0x0000000003000000 s11: 0xffffffc00253baa8
t3: 0xffffffc00253791e t4: 0xffffffffffffffc8
fp: 0xffffffc00253c8b8 epc: 0xffffffc0025383b8
FP: 0xffffffc00253c8b8, PC: 0xffffffc0025383b8 <pci> _ZN17ArchPCIController9AllocRegsEv + 96
FP: 0xffffffc0022971e0, PC: 0xffffffc0033720a0 <slab area> 0x3720a0
FP: 0xffffffc00217b6a6, PC: 0x1 <commpage> 0x1
FP: 0x853e4781807ff0ef, PC: 0x80826105644260e2 0x80826105644260e2
kdebug>
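As a sanity check on the interrupt-map in the log above: the 16 entries look like the conventional PCI INTx swizzle, where each device number rotates its four pins across the parent IRQs. The sketch below reproduces the table under that assumption. The base IRQ of 32 is simply the lowest parent IRQ in the log, and `swizzled_parent_irq` is a hypothetical helper, not a Haiku function.

```python
# Parent IRQs as printed in the interrupt-map above, indexed [dev][childIrq - 1].
LOGGED = [
    [32, 33, 34, 35],  # dev 0
    [33, 34, 35, 32],  # dev 1
    [34, 35, 32, 33],  # dev 2
    [35, 32, 33, 34],  # dev 3
]

BASE_IRQ = 32  # assumption: lowest parent IRQ seen in the log
NUM_PINS = 4   # INTA..INTD, numbered 1..4 as childIrq


def swizzled_parent_irq(dev: int, child_irq: int) -> int:
    """Conventional INTx swizzle: rotate the pin by the device number."""
    return BASE_IRQ + (dev + child_irq - 1) % NUM_PINS


# Collect every (dev, pin) whose logged value disagrees with the swizzle rule.
mismatches = [
    (dev, pin)
    for dev in range(len(LOGGED))
    for pin in range(1, NUM_PINS + 1)
    if swizzled_parent_irq(dev, pin) != LOGGED[dev][pin - 1]
]
print("mismatches:", mismatches)
```

No mismatches here would suggest the device tree interrupt-map was parsed sanely, and that the fault in AllocRegs() is unrelated to interrupt routing.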
comment:3 by , 3 years ago
ahci: failed to get pci x86 module
PCI: pci_module_init
pci_controller_init()
sizeof(PciDbi): 0x1000
hostCtrlType: ecam
reg[0]: (0x30000000, 0x10000000)
configRegs: (0x30000000, 0x10000000)
interrupt-map:
bus: 0, dev: 0, fn: 0, childIrq: 1, parentIrq: (3, 32)
bus: 0, dev: 0, fn: 0, childIrq: 2, parentIrq: (3, 33)
bus: 0, dev: 0, fn: 0, childIrq: 3, parentIrq: (3, 34)
bus: 0, dev: 0, fn: 0, childIrq: 4, parentIrq: (3, 35)
bus: 0, dev: 1, fn: 0, childIrq: 1, parentIrq: (3, 33)
bus: 0, dev: 1, fn: 0, childIrq: 2, parentIrq: (3, 34)
bus: 0, dev: 1, fn: 0, childIrq: 3, parentIrq: (3, 35)
bus: 0, dev: 1, fn: 0, childIrq: 4, parentIrq: (3, 32)
bus: 0, dev: 2, fn: 0, childIrq: 1, parentIrq: (3, 34)
bus: 0, dev: 2, fn: 0, childIrq: 2, parentIrq: (3, 35)
bus: 0, dev: 2, fn: 0, childIrq: 3, parentIrq: (3, 32)
bus: 0, dev: 2, fn: 0, childIrq: 4, parentIrq: (3, 33)
bus: 0, dev: 3, fn: 0, childIrq: 1, parentIrq: (3, 35)
bus: 0, dev: 3, fn: 0, childIrq: 2, parentIrq: (3, 32)
bus: 0, dev: 3, fn: 0, childIrq: 3, parentIrq: (3, 33)
bus: 0, dev: 3, fn: 0, childIrq: 4, parentIrq: (3, 34)
ranges:
IOPORT (0x01000000): child: 00000000, parent: 03000000, len: 10000
MMIO32 (0x02000000): child: 40000000, parent: 40000000, len: 40000000
MMIO64 (0x03000000): child: 400000000, parent: 400000000, len: 400000000
AllocRegs()
j: 0
i: 0
readConfig address: 0xFFFFFFC006000000 bus: 0 device: 0 func: 0 offset: 0 size: 2
PANIC: Unexpected exception occurred in kernel mode!
Welcome to Kernel Debugging Land...
Thread 14 "main2" running on CPU 0
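To put the readConfig line in context: with an ECAM host controller, the config space address of a function is the mapped base plus a fixed bit layout of bus, device, function and register offset. A minimal sketch, assuming the standard ECAM layout and taking the virtual base from the readConfig line above (the function name `ecam_address` is hypothetical):

```python
ECAM_VIRT_BASE = 0xFFFFFFC006000000  # readConfig address for bus 0, dev 0, func 0, offset 0
ECAM_SIZE = 0x10000000               # size of configRegs in the log


def ecam_address(base: int, bus: int, device: int, function: int, offset: int) -> int:
    """Standard ECAM layout: bus[27:20] | device[19:15] | function[14:12] | offset[11:0]."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8 and 0 <= offset < 4096):
        raise ValueError("bus/device/function/offset out of range")
    return base + ((bus << 20) | (device << 15) | (function << 12) | offset)


first_read = ecam_address(ECAM_VIRT_BASE, bus=0, device=0, function=0, offset=0)
buses_covered = ECAM_SIZE >> 20  # each bus takes 1 MiB of config space

print(f"first config read: {first_read:#x}")
print(f"buses covered by configRegs: {buses_covered}")
```

The first config read lands exactly on the mapped base, which matches both stval (0xffffffc006000000) and a0 in the earlier register dump. So the very first access to the mapped ECAM window faults, before any device is actually probed.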
comment:4 by , 3 years ago
To follow up on this one: we have narrowed things down a bit. There are two known major issues with the riscv64 port.
1) Only seen in qemu (this issue)
Adding a single dprintf in src/system/boot/loader/loader.cpp makes this crash go away:
status_t status = elf_load_image(modules, name);
dprintf("DJ QUACKY QUACK\n");
The cause is unknown. Speculation has included some kind of memory alignment issue or some other MMU issue.
2) Userspace hang #17468, seen in qemu and on the Unmatched board.
This is seen both on the Unmatched hardware and in qemu once problem 1 above is worked around. x512 has mentioned that the issue is triggered by the ICU67 package (downgrading to the previous ICU 5x solves the problem).
The most likely root cause is the GCC 11.2 upgrade, which happened at roughly the same time as the ICU version bump. Building ICU70 might solve this problem, but a lingering bug in binaries generated by gcc 11.2 is more likely.
comment:5 by , 3 years ago
Milestone: | Unscheduled → R1/beta4 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
- 1 above was solved by a u-boot binary upgrade. It must have been some bug in u-boot's allocations.
- 2 was worked around by downgrading to icu57 compiled with gcc 8.x.
I'm calling this one resolved since the scope is too large now for a single ticket.
I started doing a debug build of our kernel, and the -O0 it sets seems to solve the early boot issue above that we see on qemu after the gcc 11 upgrade.
I'm rolling with -O0 for now to look at the other issues, but we definitely have multiple problems.