Opened 15 years ago
Closed 15 years ago
#4115 closed bug (fixed)
Failed to relocate error when attempting to boot PPC kernel.
Reported by: | kallisti5 | Owned by: | mmu_man |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System/Kernel | Version: | |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | PowerPC | | |
Description
When booting the Haiku PPC kernel, the error
boot_arch_elf_relocate_rela(): Failed to relocate entry index
is shown for a split second before control returns to the PPC bootloader.
The "Failed to relocate" error comes from kernel/arch/ppc/arch_elf.cpp.
I am working on getting the *exact* message now by inserting a temporary while(1) after the error condition.
Attachments (4)
Change History (28)
comment:1 by , 15 years ago
follow-up: 3 comment:2 by , 15 years ago
err... think I found where the error is coming from...
src/tests/system/boot/loader/platform_misc.cpp:
    extern "C" status_t boot_arch_elf_relocate_rel(struct preloaded_image *image,
        struct Elf32_Rel *rel, int rel_len)
    {
        return B_ERROR;
    }

    extern "C" status_t boot_arch_elf_relocate_rela(struct preloaded_image *image,
        struct Elf32_Rela *rel, int rel_len)
    {
        return B_ERROR;
    }
This looks incomplete... right?
comment:3 by , 15 years ago
Please ignore the above comment. Nothing wrong there.
Turning on CHATTY in src/system/kernel/arch/ppc/arch_elf.cpp and adding some additional CHATTY text.
comment:4 by , 15 years ago
After some trial and error I figured out the following:
On my PPC PowerBook Lombard: kernel/arch/ppc/arch_elf.cpp is not getting used. kernel/arch/m68k/arch_elf.cpp is getting used. This is where the error is coming from.
I am working on enabling chatty in the m68k code.. is this right? Should kernel/arch/m68k be used? If the m68k directory is correct.. what is the PPC directory in the source tree for?
--Alex
comment:5 by , 15 years ago
Owner: | changed from | to
---|
That is most definitely incorrect, and implies an error in the Jamfiles somewhere. M68K is, as its name implies, code specific to the Motorola 680x0 CPUs (specifically Atari Falcon), and as such is fundamentally incompatible with the PPC port.
comment:6 by , 15 years ago
It seems there is a bug in the Jamfile: changing a source file and rebuilding without cleaning does not include the changes in the new haiku-image. We are in fact running kernel/arch/ppc/, not m68k. Please ignore my previous comment. (Man, I wish there was an edit option in Trac for users.)
I am attaching a patch which lets the boot continue further and provides some context around the issue. (the patch ppcarch_elf_forcerelocate.patch is NOT a fix, but simply shows the issue better)
I will post the results here shortly.
by , 15 years ago
Attachment: | ppcarch_elf_forcerelocate.patch added |
---|
temporary solution to show the larger problem.
comment:7 by , 15 years ago
ok, here is what happens. When the failure to relocate entry happens, vlErr is set to -2147478780 each time.
Here are the last of the "Failed to relocate entry index X type X" messages:

    Entry  Type
    1346   21
    1347   1
    1348   21
    1349   1
    1350   21
    7002   1
    7003   21
    7004   1
    7005   21
    7006   1
    7007   21
    7008   1
    7009   21
    7010   1
    7011   21
    7012   1
    7013   21
    9087   21

    Kernel entry at: 0x8007b7c8
    Kernel stack top: 0x80004000
    <kernel startup locks here>
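The vlErr value is easier to read in hexadecimal. The sketch below only reinterprets the signed 32-bit value and measures its distance from INT32_MIN; the helper names are made up for illustration. If the error constants are laid out the way I remember them from Haiku's Errors.h (an assumption, not checked against this revision), offset 0x1304 falls in the image error range and would be B_MISSING_SYMBOL, i.e. a symbol lookup failure rather than a bad relocation type.

```cpp
#include <cstdint>

// Hypothetical helper: view a signed 32-bit status code as raw bits.
constexpr uint32_t status_bits(int32_t status)
{
    return static_cast<uint32_t>(status);
}

// Hypothetical helper: distance of the code from INT32_MIN (0x80000000),
// which the error constants are assumed to be based on.
constexpr uint32_t status_offset(int32_t status)
{
    return status_bits(status) - 0x80000000u;
}

// The value reported for vlErr on every failed relocation.
constexpr int32_t kObservedError = -2147478780;

static_assert(status_bits(kObservedError) == 0x80001304u, "raw bits");
static_assert(status_offset(kObservedError) == 0x1304u, "offset from base");
```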
comment:9 by , 15 years ago
The boot process is getting hung up right after the "Kernel stack top" message. Looking through the sources this is where we jump to the kernel which has now been copied into memory.
mmu_man, Where are you getting the Symbol not found message from?
comment:10 by , 15 years ago
Quick note... Tried the same image on a G4 Quicksilver PPC and got the exact same error and result.
by , 15 years ago
Attachment: | synccleanup.patch added |
---|
comment:11 by , 15 years ago
OK, using readelf I was able to extract a listing of the missing relocations matching the offsets in the errors. Magic fingers!

    8017b3b0  000e3201 R_PPC_ADDR32      00000000   _Z18_user_atomic_set64 + 0
    8018d208  000e3215 R_PPC_JMP_SLOT    00000000   _Z18_user_atomic_set64 + 0
    8017b3b8  0009ef01 R_PPC_ADDR32      00000000   _Z27_user_atomic_test_ + 0
    8018c340  0009ef15 R_PPC_JMP_SLOT    00000000   _Z27_user_atomic_test_ + 0
    8017b3c0  000ae201 R_PPC_ADDR32      00000000   _Z18_user_atomic_add64 + 0
    8018c6f0  000ae215 R_PPC_JMP_SLOT    00000000   _Z18_user_atomic_add64 + 0
    8017b3c8  0000d301 R_PPC_ADDR32      00000000   _Z18_user_atomic_and64 + 0
    8018a428  0000d315 R_PPC_JMP_SLOT    00000000   _Z18_user_atomic_and64 + 0
    8017b3d0  0008cc01 R_PPC_ADDR32      00000000   _Z17_user_atomic_or64P + 0
    8018bf08  0008cc15 R_PPC_JMP_SLOT    00000000   _Z17_user_atomic_or64P + 0
    8017b3d8  00063501 R_PPC_ADDR32      00000000   _Z18_user_atomic_get64 + 0
    8018b640  00063515 R_PPC_JMP_SLOT    00000000   _Z18_user_atomic_get64 + 0
    8018d1c0  000e0d15 R_PPC_JMP_SLOT    00000000   arch_cpu_init_percpu + 0
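The second column of the listing is the relocation's info word, and it explains the "type 1" and "type 21" values in the boot messages. A small sketch of the split, equivalent to the standard ELF32_R_SYM and ELF32_R_TYPE macros (the function and constant names here are my own): the low 8 bits are the relocation type, the rest is the symbol table index. For 000e3201 that gives symbol index 0xe32 (3634 decimal, the number readelf -s prints for the set64 symbol) and type 1.

```cpp
#include <cstdint>

// Same split the ELF32_R_SYM / ELF32_R_TYPE macros perform.
constexpr uint32_t rel_sym(uint32_t info) { return info >> 8; }
constexpr uint32_t rel_type(uint32_t info) { return info & 0xff; }

// Relocation type numbers from the PowerPC ELF ABI.
constexpr uint32_t kR_PPC_ADDR32 = 1;
constexpr uint32_t kR_PPC_JMP_SLOT = 21;

// Info words copied from the readelf listing.
static_assert(rel_sym(0x000e3201) == 0xe32, "symbol index of set64 entry");
static_assert(rel_type(0x000e3201) == kR_PPC_ADDR32, "type 1");
static_assert(rel_type(0x000e3215) == kR_PPC_JMP_SLOT, "type 21");
static_assert(rel_sym(0x000e0d15) == 0xe0d, "arch_cpu_init_percpu entry");
```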
comment:12 by , 15 years ago
arch_cpu_init_percpu is defined in the following files
    src/system/kernel/arch/m68k/arch_cpu.cpp:arch_cpu_init_percpu(kernel_args *args, int curr_cpu)
    src/system/kernel/arch/x86/arch_cpu.cpp:arch_cpu_init_percpu(kernel_args *args, int cpu)
oh look! function not defined for ppc! For now a function which returns 0 should be enough until multi cpu support is added for PPC
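A minimal sketch of such a stub, assuming the PPC version takes the same parameters as the m68k and x86 ones listed above and returns a status_t. The typedef, B_OK constant and forward declaration are stand-ins so the snippet compiles on its own; in the tree they come from the kernel headers.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins so the sketch is self-contained (assumed to match the real types).
typedef int32_t status_t;
static const status_t B_OK = 0;
struct kernel_args;

// Assumed signature, mirroring the m68k and x86 definitions.
status_t
arch_cpu_init_percpu(kernel_args* args, int curr_cpu)
{
    // Nothing per-CPU to set up until multi-CPU support exists on PPC.
    (void)args;
    (void)curr_cpu;
    return B_OK;
}
```

With the symbol defined, the R_PPC_JMP_SLOT entry for arch_cpu_init_percpu has something to resolve against.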
comment:13 by , 15 years ago
adding patch to add arch_cpu_init_percpu to ppc code.
This patch also cleans up my previous isync patch that fixes the PPC bootloader not starting.
attachment: isyncmissingcpuinit.diff
by , 15 years ago
Attachment: | isyncmissingcpuinit.diff added |
---|
comment:14 by , 15 years ago
Just a quick note: after the patch above we now get one fewer "Failed to relocate entry" error (the arch_cpu_init_percpu one), and the kernel drops back to the Open Firmware prompt.
I think the atomic_set,test,etc errors are because the Kernel is having trouble accessing the functions in libroot/os/arch/ppc/atomic.S
comment:15 by , 15 years ago
    alex@linux:~/develop/haiku/generated/objects/haiku/ppc/release/system/kernel$ readelf -s kernel_ppc | grep "UND"
         0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
       211: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_and64PVx
      1589: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_get64PVx
      2252: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z17_user_atomic_or64PVxx
      2543: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z27_user_atomic_test_and
      2786: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_add64PVx
      3634: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_set64PVx
         0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
      2042: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_and64PVx
      3420: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_get64PVx
      4083: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z17_user_atomic_or64PVxx
      4374: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z27_user_atomic_test_and
      4617: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_add64PVx
      5465: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z18_user_atomic_set64PVx
I think these are coming from src/system/kernel/arch/ppc/arch_atomic.c; arch_atomic defines int64's on 32-bit platforms. Looking into commenting them out for now.
comment:16 by , 15 years ago
Ingo pointed out that this was due to a missing extern "C" statement in headers/private/kernel/user_atomic.h
This fixes the undefined user_atomic errors!
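For anyone hitting the same thing: the _Z18... names in the readelf output are C++-mangled, which is what a C++ caller references when the prototype is not wrapped in extern "C". A definition built with C linkage exports the plain name instead, so the two never match. Below is a sketch of the usual header guard. demo_atomic_set64 is a hypothetical stand-in, not the real prototype from user_atomic.h, and its body is deliberately a plain, non-atomic store.

```cpp
#include <cassert>
#include <cstdint>

// Header part: the guard makes C++ callers use the unmangled C symbol name.
#ifdef __cplusplus
extern "C" {
#endif

// Hypothetical prototype shaped like the 64-bit atomic helpers.
int64_t demo_atomic_set64(volatile int64_t* value, int64_t newValue);

#ifdef __cplusplus
}
#endif

// Definition; it keeps the C linkage given by the declaration above.
// NOT atomic: it only mimics a "store and return the old value" shape.
int64_t
demo_atomic_set64(volatile int64_t* value, int64_t newValue)
{
    int64_t oldValue = *value;
    *value = newValue;
    return oldValue;
}
```

Without the guard, a C++ translation unit would reference a mangled name beginning with _Z17demo_atomic_set64 instead of demo_atomic_set64.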
The *final* attached patch contains lots of good PPC fixes and gets rid of all of the relocation errors. We still get choked when jumping to the kernel, but at least we get the kernel in memory properly now.
After checking in the attached ppc-isync-relocation-final.diff patch this ticket should be resolved.
comment:18 by , 15 years ago
I vote to leave it in.
What happens is that on the first missing relocation we return an error instantly, which throws us back to the bootloader before the end user can see the error message.
With the _BOOT_MODE def in there we will try to push on if it is the kernel booting, showing all the relocation errors, and then freeze up when trying to jump into the kernel.
Missing relocation errors still may occur in the future and this is a handy way to fish them out.
comment:19 by , 15 years ago
Also, a quick note on those sync instructions: they always have to be called prior to, and sometimes after, those context changes. Since you always have to do it, we might as well avoid bugs and do them in the assembly rather than the C. If we call isync/sync multiple times by accident somewhere, no harm will come of it.
comment:20 by , 15 years ago
Well, it should panic when getting an error from relocate_foo(). Skipping errors and hoping for the kernel to crash (which might not obviously happen right at boot, depending on which symbol is missing) is not really clean.
comment:21 by , 15 years ago
Good point. A panic would be ideal.. or maybe set some kind of $panic = true thing while relocating then after everything is relocated trigger the panic if something wasn't right?
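Roughly what I mean by the flag idea, as a hosted sketch rather than loader code. DemoReloc and demo_relocate_all are hypothetical names, and the symbolFound field stands in for the real symbol lookup. Every entry is visited and each failure is printed, and the caller panics afterwards if the returned count is non-zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical, simplified relocation record for illustration only.
struct DemoReloc {
    uint32_t offset;
    uint32_t info;       // (symbol index << 8) | type, as in Elf32_Rela
    bool symbolFound;    // stands in for the result of the symbol lookup
};

// Walk every entry, report each failure, and return how many failed.
// The caller decides afterwards whether to panic.
int
demo_relocate_all(const DemoReloc* relocs, int count)
{
    assert(relocs != nullptr || count == 0);

    int failed = 0;
    for (int i = 0; i < count; i++) {
        if (!relocs[i].symbolFound) {
            std::printf("Failed to relocate entry index %d, rel type %u, "
                "offset 0x%x\n", i, (unsigned)(relocs[i].info & 0xff),
                (unsigned)relocs[i].offset);
            failed++;
            continue;
        }
        // A real implementation would patch the target address here.
    }
    return failed;
}
```

That keeps the "show me everything that is missing" behaviour without ever jumping into a kernel that was only partially relocated.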
For now feel free to remove the _BOOT_MODE bit, as it's not required and not the most important part of this fix.
Thanks!
-- Alex
comment:23 by , 15 years ago
this is working great... the system locks up when jumping to the kernel, but all the relocations are now proper.
safe to close.
comment:24 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | in-progress → closed |
boot_arch_elf_relocate_rela(): Failed to relocate entry index 1339, rel type 1, offset 0x8017a568, sym 0xe30, addend 0x0