Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#9789 closed bug (invalid)

BDirectWindow Problems

Reported by: scanty Owned by: nobody
Priority: normal Milestone: R1
Component: - General Version: R1/alpha4.1
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

Hello. I am wondering if there have been any changes to BDirectWindow code from Alpha 4.1 to the current hrevs. I have code that works fine in A4.1 but crashes nearly immediately in hrev45704. It's been difficult for me to track down the problem because the graphical debugger does not provide any source path. The best I can do is provide information from MALLOC_DEBUG, hopefully it will be of help to someone.

KERN: vm_soft_fault: va 0x152000 not covered by area in address space KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x152c00, ip 0x455acf, write 0, user 1, thread 0xa507 KERN: vm_page_fault: thread "pretendo_thread" (42247) in team "pretendo" (42237) tried to read address 0x152c00, ip 0x455acf ("pretendo_seg0ro" +0x82acf) KERN: debug_server: Thread 42247 entered the debugger: Segment violation KERN: stack trace, current PC 0x455acf (/pretendo/pretendo/pretendo + 0): KERN: (0x729414e0) 0x450b07 _ZN14PretendoWindow10DrawDirectEv + 0x107 KERN: (0x72941550) 0x450ce8 _ZN14PretendoWindow10BlitScreenEv + 0x78 KERN: (0x729415b0) 0x450e20 _ZN14PretendoWindow9end_frameEv + 0x30 KERN: (0x729415d0) 0x45103e _ZN14PretendoWindow16emulation_threadEPv + 0x3e KERN: (0x72941600) 0x16702e7 thread_entry + 0x1c USER: application/x-vnd.scanty-Pretendo killed for a problem in DirectConnected(): Operation timed out KERN: debug_server: Killing team 42237 (./pretendo) KERN: debug_server: TeamDebugHandler::Init(): Failed to get info for team 42237: Operation on invalid team KERN: debug_server: KillTeam(): Error getting info for team 42237: Operation on invalid team KERN: debug_server: Killing team 42237 ()

I'm not sure why the code works fine in A4.1, but fails in later revisions. Maybe someone can shed some light on this issue. Thank you.

Attachments (4)

PretendoWindow.cc (25.1 KB ) - added by scanty 12 years ago.
source file with BDirectWindow code in it
blitters.asm (4.8 KB ) - added by scanty 12 years ago.
mmx optimised blitters
pretendo-40529-debug-25-05-2013-03-41-45.report (11.8 KB ) - added by scanty 12 years ago.
debug report
pretendo (5.0 MB ) - added by scanty 12 years ago.
pretendo binary

Change History (17)

by scanty, 12 years ago

Attachment: PretendoWindow.cc added

source file with BDirectWindow code in it

comment:1 by phoudoin, 12 years ago

Maybe it's the realloc'ed() fClipInfo.clip_list:

http://dev.haiku-os.org/attachment/ticket/9789/PretendoWindow.cc#L205

Could you try with a free()/malloc() pair instead? Maybe realloc(), which is trying to keep the previous content and expand or shrink its area map, is broken somewhere.

comment:2 by scanty, 12 years ago

Well, it wasn't the realloc(). According to gdb, the problem lies in blit_windowed_dirty_mmx, which for some reason is crashing, but works fine in doublesize mode (just a different function). I don't know how many ASM nuts there are out there, but maybe I'm doing something wrong with respect to how Haiku is using registers -- I have no idea. I can't get anything meaniningful out of gdb, or the graphical debugger, which doesn't seem to want to touch my code with a ten foot pole. I wrote these blitters a few years ago, and they have worked fine in Haiku up until recently. If I can pin down the hrev where things go south, maybe that would be helpful. I tried a straight memcpy() instead, and even that is crashing, so I don't really know exactly what the problem could be. Whoever can fix it gets a free beer. Thanks guys, keep up the hard work.

comment:3 by scanty, 12 years ago

Last edited 12 years ago by scanty (previous) (diff)

comment:4 by anevilyak, 12 years ago

Would you be able to provide the executable that the graphical debugger is having trouble with? Also, can I assume you've tried compiling your code with gcc -g?

by scanty, 12 years ago

Attachment: blitters.asm added

mmx optimised blitters

comment:5 by scanty, 12 years ago

On a tip from hamishm on IRC, here are the relevant blitter addresses:

dest: 0xcf1105fc source: 0x24dc00 dirty: 0x430c00 size: 1028

I've also attached a debug report from the graphical debugger if it is of any help. I also rolled back to rev 45516 and the blitter works fine there. Seems like something strange is going on. Additionally I attached the executable -- it's a little big. You will need some NES ROMs, and set the renderer to BDIrectWindow from the app's menu.

by scanty, 12 years ago

debug report

by scanty, 12 years ago

Attachment: pretendo added

pretendo binary

comment:6 by mmadia, 12 years ago

Owner: changed from nobody to pdziepak
Status: newassigned

As hrev45516 works and hrev45704 doesn't, I'm guessing ASLR/DEP.

in reply to:  4 ; comment:7 by scanty, 12 years ago

Replying to anevilyak:

Would you be able to provide the executable that the graphical debugger is having trouble with? Also, can I assume you've tried compiling your code with gcc -g?

Yes. I am compiling with gcc -g

in reply to:  7 comment:8 by anevilyak, 12 years ago

Replying to scanty:

Yes. I am compiling with gcc -g

Please try using -gdwarf-2 -gstrict-dwarf . The problem is that gcc 4.7 is emitting some new draft extensions to the debug info format that we don't yet handle, which is why the graphical debugger isn't able to do anything with it. I've filed ticket #9799 for that part if you want to track it, though I can't promise how soon that will be done. In the meantime, however, the above -g flags should prevent it from using those, so that should output a debug binary we can actually handle.

comment:9 by pdziepak, 12 years ago

Owner: changed from pdziepak to nobody

Actually, this problem doesn't look like related to ASLR.

In PretendoWindow::DrawDirect line 818 the variable y has value -1 what makes blit_windowed_dirty_mmx attempt to read from an address before the start of the actual buffer, hence page fault.

I do not know whether it is a Haiku fault or a bug in the application, because I am not familiar with this part of Haiku and, unfortunately, at the moment I don't have enough spare time to get familiar.

comment:10 by pdziepak, 12 years ago

To be exact, this bug (either in Haiku or in this particular application) might have been there earlier but ASLR made it result in a page fault. Before hrev45518 there was no gaps between areas (unless one of them was deleted) thus an incorrect address before the start of the buffer was still valid from the kernel point of view (since there was another area before the one we wanted to access). Now, there are random gaps between areas and such incorrect addresses almost always result in a page fault.

comment:11 by scanty, 12 years ago

Well sh*t on me! I added a "+ 1" to the y calculation, and the blitter came to life! No more segfault. pdziepak, thank you for finding what could have continued to be a bug that halted development on later revisions of haiku. Good pickup! The fact is, I wrote these blitters about 10 years ago(!) and the bug was never caught, most likely because there was no ASLR, as you say. I'm glad the bug is fixed and I can move on to later revisions. I will try the latest revision of haiku and report back here if we should close this bug or not. As they say, it's never the programmer's fault. Thank you again.

Last edited 12 years ago by scanty (previous) (diff)

comment:12 by scanty, 12 years ago

I just tried the code on hrev45717 and it's working fine. Looks like we can shut the door on this bug. My apologies for driving everyone up the wall.

comment:13 by anevilyak, 12 years ago

Resolution: invalid
Status: assignedclosed
Note: See TracTickets for help on using tickets.