Opened 5 months ago
Closed 3 weeks ago
#19021 closed bug (fixed)
Debugger can't resolve some function names when using lld
Reported by: | Zardshard | Owned by: | anevilyak |
---|---|---|---|
Priority: | normal | Milestone: | R1/beta6 |
Component: | Applications/Debugger | Version: | R1/Development |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description
I've encountered this bug a lot when working with HaikuWebKit.
Sometimes, when using lld for linking, Debugger fails to decode the function name in a backtrace and instead displays something like /Storage/haikuwebkit/WebKitBuild/Debug/lib/libWebKit.so.1.9.11 + 0x249414d. However, running addr2line -f -e /Storage/haikuwebkit/WebKitBuild/Debug/lib/libWebKit.so.1.9.11 0x249414d does show the symbol name.
Reproducing
Take a simple C++ file, such as

    #include <OS.h>

    int main()
    {
        debugger("");
    }

and compile it with g++ -fuse-ld=lld -g <file>.
Run it, open Debugger, and instead of seeing _main in the backtrace, you should see /boot/home/Desktop/test/main + 0x18ae.
System information
Haiku x86_64 hrev57966
gcc version 13.3.0_2023_08_10-1
llvm17_lld version 17.0.6-3
Change History (11)
comment:1 by , 3 weeks ago
comment:2 by , 3 weeks ago
Ah, actually, that lookup is happening right before we insert the 18b5 section, due to the check for duplicates inside _ParseFrameSection. So that's not the problem here, and we do read this section.
comment:3 by , 3 weeks ago
So, the problem is that ImageDebugInfo::FunctionAtAddress doesn't find the function. In the file I built, it's searching for 32e768c8 (in the stack trace this is IP 0xf632e768c8), but "main" is actually between 32e75125 and 32e7513f. The stack trace in Debugger reads "a.out + 0x18c8".
comment:4 by , 3 weeks ago
Or rather, I should say, the FunctionInstance's start and start+size are in that range. I guess in this case the FunctionInstance is more likely to be wrong than the stack trace. addr2line at least does indicate that 0x18c8 in a.out is inside "main".
comment:5 by , 3 weeks ago
The addresses here are coming from SymbolTableBasedImage::NextSymbol. The one for "main" has a st_value of 0x18b5. I guess the fLoadDelta must somehow be incorrect.
This file has 4 LOAD sections, as opposed to the usual 2 (as binutils ld currently generates by default):
    LOAD off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**12
         filesz 0x0000000000000784 memsz 0x0000000000000784 flags r--
    LOAD off    0x0000000000000790 vaddr 0x0000000000001790 paddr 0x0000000000001790 align 2**12
         filesz 0x0000000000000240 memsz 0x0000000000000240 flags r-x
    LOAD off    0x00000000000009d0 vaddr 0x00000000000029d0 paddr 0x00000000000029d0 align 2**12
         filesz 0x00000000000001b8 memsz 0x0000000000000630 flags rw-
    LOAD off    0x0000000000000b88 vaddr 0x0000000000003b88 paddr 0x0000000000003b88 align 2**12
         filesz 0x0000000000000068 memsz 0x00000000000000c8 flags rw-
The r-x LOAD section has a vaddr of 0x1790. From the addresses I posted in comment:3, 32e768c8 - 32e75125 = 17A3, and 17A3 == 1790 + 13; that's within "main". So it looks like something isn't handling the second LOAD section properly.
comment:6 by , 3 weeks ago
Adding + sectionHeader->sh_addr in SymbolTableBasedImage does fix the problem for this test program, but seems to break resolving all other symbols. So I must still be missing something here.
comment:7 by , 3 weeks ago
I'm not sure I've ever actually had to touch that class since Ingo wrote it, so I'm afraid that I'm going to be of minimal use here. Sounds like you're on the right track, though. I do wonder if the non-executable sections are confusing things here, since those clearly won't contain code. Perhaps it's worth checking whether the loader is skipping those properly.
comment:8 by , 3 weeks ago
They don't contain code, but they do contain the read-only data and thus symbols. This is a new strategy employed by LLD by default, but I think binutils ld supports it too: put the read-only ("static const", etc.) data in sections/segments separate from the executable code. So the first LOAD segment does have symbols in it and we need to handle resolving those.
PulkoMandy pointed out what I'm getting wrong here: I'm mixing up symbols and segments.
<PulkoMandy> there are 3 "address spaces" here: file offsets, vaddr/paddr of the segments, and the actual vaddr after loading (that is different from the one in the segment header because aslr)
So, we need to translate file offset (st_value) -> vaddr -> actual vaddr. Right now the middle step is not done because we presume text offset == 0, but if we have a LOAD segment with symbols in it that's not at offset 0, then we must do that step or else we get this problem.
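The two translation steps between the three address spaces can be sketched as follows. This is an illustration only, not the change from the linked review. The segment table is the one quoted in comment:5, while LoadSegment, FileOffsetToVaddr, and VaddrToRuntime are hypothetical names. The input file offset 0x8b5 is an assumed value, chosen because it maps to the vaddr 0x18b5 reported for "main". The load delta 0xf632e75000 is inferred from the IP 0xf632e768c8 and the "a.out + 0x18c8" frame in comment:3.

```cpp
#include <cstdint>

// One LOAD entry: where its bytes sit in the file and where the linker placed them.
struct LoadSegment {
    std::uint64_t fileOffset;
    std::uint64_t vaddr;
    std::uint64_t fileSize;
};

// The four LOAD entries quoted in comment:5 (off, vaddr, filesz).
constexpr LoadSegment kSegments[] = {
    {0x0,   0x0,    0x784},
    {0x790, 0x1790, 0x240},
    {0x9d0, 0x29d0, 0x1b8},
    {0xb88, 0x3b88, 0x68},
};

// Step 1: file offset -> link-time vaddr.
// Returns false if no LOAD segment covers the offset.
constexpr bool FileOffsetToVaddr(std::uint64_t offset, std::uint64_t& vaddr)
{
    for (const LoadSegment& segment : kSegments) {
        if (offset >= segment.fileOffset
            && offset - segment.fileOffset < segment.fileSize) {
            vaddr = segment.vaddr + (offset - segment.fileOffset);
            return true;
        }
    }
    return false;
}

// Step 2: link-time vaddr -> address after loading (shifted by ASLR).
// loadDelta is an assumption here: runtime base minus the lowest segment vaddr.
constexpr std::uint64_t VaddrToRuntime(std::uint64_t vaddr, std::uint64_t loadDelta)
{
    return vaddr + loadDelta;
}
```

With these inputs, file offset 0x8b5 falls in the r-x segment and maps to vaddr 0x18b5; adding the inferred load delta gives 0xf632e768b5, which is 0x13 below the IP from comment:3. If step 1 is skipped and the segment is assumed to start at offset 0, the result is off by the segment's vaddr, which is the 0x1790 discrepancy seen in comment:5.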
comment:9 by , 3 weeks ago
(It does seem like Debugger could potentially be "lazier" here and avoid just looking up all the symbols immediately, but instead only load what it actually needs, but that's a problem for another time.)
comment:10 by , 3 weeks ago
Fixed by https://review.haiku-os.org/c/haiku/+/8819. It seems this doesn't fix #19022 however.
comment:11 by , 3 weeks ago
Milestone: | Unscheduled → R1/beta6 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Fixed in hrev58524.
This doesn't work without -g either.
.eh_frame:
I added some tracing in Debugger's _GetContainingFDEInfo, to dump the previous, current, and next FDE info ranges when the searched address isn't found. Here's what we get:
i.e. we only get one, so it must be the last item in the list. So, where's the second entry that readelf shows here, that should contain the info we need?