Attachments (3)
Change History (12)
by , 5 years ago
comment:1 by , 5 years ago
Please include a debug report directly. This may be a yab issue and the platform difference could boil down to differences in the memory layout. Please run the application from a terminal with the guarded heap and attach a debug report if it generates a fault on x86_gcc2 as well:
LD_PRELOAD=libroot_debug.so MALLOC_DEBUG=grs25 program...
comment:2 by , 5 years ago
Description: | modified (diff) |
---|---|
Summary: | segment violation on x86_s4 with bbitmap → Segment violation on x86_64 in bbitmap |
by , 5 years ago
Attachment: | msize-12350-debug-02-07-2020-13-41-00.report added |
---|
LD_PRELOAD=libroot_debug.so MALLOC_DEBUG=grs25 msize
comment:4 by , 5 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
This isn't a Haiku bug but one in yab. As the stack trace suggests, it ends up in YabInterface::BitmapSave() where it shouldn't be at all as it is only supposed to load an image. The reason is a missing return statement in yi_BitmapLoad which leads to undefined behaviour. In this specific case, the version built for x86_64 misses the final stack pop and ret, so after the call to YabInterface::BitmapLoad() it simply continues running. The next function happens to be YabInterface::BitmapSave() which is then run with a bogus stack and crashes.
The compiler does point the issue out as a warning (even by default) but, as the function called may or may not actually return, it can't simply treat it as an error outright. In almost all cases you really do want to heed those warnings though and one should IMHO always strive for "-Wall -Wextra -Werror" to have such mistakes break the build instead of getting subtly broken binaries. The codebase has more of these return value and other issues (like calling "sprintf(stderr, ...)" which probably should have been "fprintf(stderr, ...)" instead) that would've been caught this way.
comment:5 by , 5 years ago
This does not explain the reason that the same code worked on previous versions of haiku and is the same code is used and works on x86_gcc2. As for our BitmapLoad, it completes prtoperly and there is no way it should flow into the two functions that follow it in the source. (BitmapSave follows BitmapGet)
comment:6 by , 5 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
I re-compiled the same yab source code on haiku-r1beta1-x86_64 and it works just fine. This is not a yab problem.
comment:7 by , 5 years ago
Fact that Yab is working in previous Haiku versions do not always means that it is Haiku fault. Yab can cause undefined behavior that worked fine before by accident.
comment:8 by , 5 years ago
Resolution: | → invalid |
---|---|
Status: | reopened → closed |
mmlr pointed out above exactly what the problem is:
The reason is a missing return statement in yi_BitmapLoad which leads to undefined behaviour.
R1/beta1 may work due to its use of a different compiler. That does not change the fact that the code is wrong.
comment:9 by , 5 years ago
I've spent a fair amount of time on this and even pointed out where exactly the problem in this specific instance is. I understand that this is getting somewhat low-level, but I tried to explain that and how yab is buggy in this case and pointed out an easy way to find and fix this and some similar issues. Just dismissing that and continuing to blame the operating system isn't going to make this go away, only fixing the code will.
I don't have enough personal investment in yab to fix these issues myself, but I know that it is a used by others and consider it a nice thing to have. I'll therefore try to explain the problem some more in the hope that it leads to fixes in yab that benefit the broader community.
The fundamental problem is this: Having control flow reach the end of a non-void function without a return statement results in undefined behaviour. In such cases pretty much anything is allowed to happen, including for the compiler to not produce a function epilogue at all. The epilogue is the part that restores registers and the stack to where it needs to be and actually returns the control flow to the calling function. When it's missing, the function will quite literally not return, i.e. it will continue to run code adjacent to its code. The layout of the source code is not relevant, only the layout of the produced binary is and that one can vary depending on the toolchain, compiler flags, used libraries, etc and should be considered unpredictable.
The yab package currently available for x86_64 as well as one I built locally (using gcc-8.3.0_2019_05_24-7-x86_64.hpkg
for reference) shows exactly this missing epilogue. This is what a disassembly of the relevant part looks like (produced by objdump --demangle --disassemble-all /boot/system/lib/libyab_1.7.8.so
):
0000000000061ac4 <yi_BitmapLoad>: 61ac4: 55 push %rbp 61ac5: 48 89 e5 mov %rsp,%rbp 61ac8: 48 89 f8 mov %rdi,%rax 61acb: 48 89 d7 mov %rdx,%rdi 61ace: 48 89 f2 mov %rsi,%rdx 61ad1: 48 89 c6 mov %rax,%rsi 61ad4: e8 87 03 fe ff callq 41e60 <YabInterface::BitmapLoad(char const*, char const*)@plt> 61ad9: 90 nop 0000000000061ada <YabInterface::BitmapSave(char const*, char const*, char const*)>: 61ada: 55 push %rbp 61adb: 48 89 e5 mov %rsp,%rbp 61ade: 41 57 push %r15 61ae0: 41 56 push %r14 ...
It shows that no epilogue was produced, i.e. no final stack pop and no ret instruction. Therefore the program will simply continue to run and in this case it happens to have YabInterface::BitmapSave
directly following which it then executes. That function has different arguments and therefore expects a different stack layout and register values and then crashes.
Compare this with a similar function that isn't missing the return statement:
0000000000061cec <yi_BitmapSave>: 61cec: 55 push %rbp 61ced: 48 89 e5 mov %rsp,%rbp 61cf0: 48 89 f8 mov %rdi,%rax 61cf3: 48 89 cf mov %rcx,%rdi 61cf6: 48 89 d1 mov %rdx,%rcx 61cf9: 48 89 f2 mov %rsi,%rdx 61cfc: 48 89 c6 mov %rax,%rsi 61cff: e8 1c d8 fd ff callq 3f520 <YabInterface::BitmapSave(char const*, char const*, char const*)@plt> 61d04: 5d pop %rbp 61d05: c3 retq
As I already said above, the compiler warns about such errors. It only doesn't fail compilation because it cannot know for sure that undefined behaviour will actually occur (the inner function could for example always exit by throwing an exception or actually be meant to execute an infinite loop). But in almost all cases they are indeed just errors and they will lead to hard to understand bugs. So ignoring these compilation warnings because "they are just warnings" is not an option.
Recompiling in an even slightly different environment may alter what happens. It is possible that the layout of the binary changes so that a function happens to follow that wouldn't crash with the given stack. Obviously it's always problematic to run functions that weren't intended to run. Even if it doesn't crash, it will possibly have dangerous side effects (what if it was a function to delete some files for example?) and it most certainly will mangle the return value so the rest of the program will run with bogus data.
So my advice to build yab with -Wall -Wextra -Werror
and fixing anything it points out stands. It will fix some bugs that may have been hidden by pure chance so far (or lead to so far unexplained strange behaviour) and it will increase code quality. You can always disable some of these warnings if you explicitly don't want to change the code (-Wno-unused-variable
or -Wno-unused-parameter
for example if you don't want to remove them). But even the "misleading indent" ones are sometimes right and even if they aren't, adjusting the code style doesn't really hurt. A quick glance showed at least one instance of a misleading indent warning that indeed looked wrong.
yab script using bitmap