#3001 closed bug (fixed)

OpenGL demos segfault in MesaSoftwareRenderer::SwapBuffers() when moving window

Reported by:	aldeck	Owned by:	korli
Priority:	high	Milestone:	R1
Component:	Kits/OpenGL Kit	Version:	R1/pre-alpha1
Keywords:		Cc:
Blocked By:		Blocking:
Platform:	All

Description

I can crash glteapot/gldirectmode just by moving the window, here on real hardware (2 CPUs, hrev28231, VESA). It segfaults in MesaSoftwareRenderer::SwapBuffers().

I can't reproduce it under VMware with 1 CPU. I'm not sure it's a new bug, as I didn't test on this hardware until recently. Korli, do you think it might be due to the last update to Mesa?

We might want to test this with non-BDirectWindow GL demos.

Setting Kits/OpenGL component for now, but it might be a BDirectWindow problem.

Attachments (1)

Glteapot_gdb_r28231.txt (2.1 KB ) - added by aldeck 16 years ago.

Change History (12)

by aldeck, 16 years ago

Attachment:	Glteapot_gdb_r28231.txt added

comment:1 by mmlr, 16 years ago

I can easily reproduce this here too with GLTeapot when more than one core is activated (quad core CPU). If I disable all but one core (through ProcessController) the GLTeapot works just fine and is movable without crashing.

comment:2 by korli, 16 years ago

I can't reproduce when using a BWindow instead of a BDirectWindow.

comment:3 by korli, 16 years ago

Component:	Kits/OpenGL Kit → System/Kernel
Owner:	changed from korli to axeld

It should be related to hrev28223 as it doesn't crash before this revision. Reassigning to kernel, Ingo or Axel might have an idea.

follow-up: 5 comment:4 by bonefish, 16 years ago

Component:	System/Kernel → Kits/OpenGL Kit
Owner:	changed from axeld to korli

Actually this looks very much like a race condition in userland. Since hrev28223 when a thread wakes up another one (e.g. by releasing a semaphore) the woken up thread is scheduled immediately on another CPU, if one is idle. Before that revision the new thread wouldn't be scheduled before the current thread's quantum or the quantum of the idle thread on another CPU was used up. I.e. if the first thread has a race condition with the second one, it is just way more likely that it shows now.

GDB's stack trace isn't particularly beautiful in this case. At any rate the innermost function (0xffff012c) is memcpy(). The syslog says:

vm_page_fault: thread "Simon" (182) in team "GLTeapot" (177) tried to read address 0x5ffffb5c, ip 0xffff012c ("???" +0xffff012c)

The fault address is in esi, i.e. the source of the memcpy(), and lies just before the start of a bitmap area:

kdebug> areas 177
addr          id  base        size        protect  lock  name
...
0x810bb8c0  1003  0x60000000  0x0004f000  2033     0     server_memory
...

Looking at MesaSoftwareRenderer::SwapBuffers() I'd say clip->top - fInfo->window_bounds.top becomes negative for whatever reason and the memory before the start of the bitmap's Bits() is addressed.
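
To illustrate the arithmetic described above, here is a minimal sketch, not the actual MesaSoftwareRenderer code. `Rect`, `FirstRowOffset()` and `ClipInsideBounds()` are hypothetical names made up for this example; only the expression `clip.top - windowBounds.top` mirrors the one quoted in the comment. It shows how a clip rectangle that is out of sync with the window bounds yields a negative byte offset, i.e. an address before the start of the bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a clipping rectangle; not the real clipping_rect.
struct Rect {
    int32_t left, top, right, bottom;
};

// Byte offset of the first row to copy, relative to the start of the
// bitmap, using the same subtraction as quoted in the comment.
std::ptrdiff_t FirstRowOffset(const Rect& clip, const Rect& windowBounds,
    int32_t bytesPerRow)
{
    assert(bytesPerRow > 0);
    return static_cast<std::ptrdiff_t>(clip.top - windowBounds.top)
        * bytesPerRow;
}

// True if the clip rectangle lies entirely inside the window bounds, so
// that a row offset derived from it cannot be negative.
bool ClipInsideBounds(const Rect& clip, const Rect& windowBounds)
{
    return clip.left >= windowBounds.left && clip.top >= windowBounds.top
        && clip.right <= windowBounds.right
        && clip.bottom <= windowBounds.bottom;
}
```

If the clip rectangle still has `top == 100` while the window bounds have already moved to `top == 101`, `FirstRowOffset()` returns minus one row and `ClipInsideBounds()` returns false, so a caller could skip the copy instead of reading before the buffer. This would be consistent with the fault address above lying 0x4a4 bytes below the area base 0x60000000, assuming the bitmap data starts at or near that base.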

Bouncing the ticket back to the OpenGL Kit. It might still be a BDirectWindow or app server problem, though.

in reply to: 4 comment:5 by korli, 16 years ago

Resolution:	→ fixed
Status:	new → closed

Replying to bonefish:

Thanks for your thoughts. It also explains why it only happens on SMP machines. In hrev28430, fInfo is now a copy of direct_buffer_info to avoid changes while buffer swapping. I don't know if it's meant to be copied, though, as the copy can become out of date.
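
The copy-instead-of-pointer idea can be sketched roughly as follows. This is not the code from hrev28430: `BufferInfo` is a hypothetical, heavily reduced stand-in for direct_buffer_info, `Renderer` and its methods are invented names, and the use of `std::mutex` is an assumption of this sketch (the ticket does not say how the real code synchronizes).

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical, reduced stand-in for direct_buffer_info.
struct BufferInfo {
    int32_t windowTop;
    int32_t clipTop;
    int32_t bytesPerRow;
};

class Renderer {
public:
    // Called from the window thread; stands in for DirectConnected().
    // Stores a copy rather than keeping a pointer to the caller's struct.
    void UpdateInfo(const BufferInfo& info)
    {
        std::lock_guard<std::mutex> lock(fInfoLock);
        fInfo = info;
    }

    // Called from the rendering thread; stands in for SwapBuffers().
    // Works on a private snapshot, so all fields used in the computation
    // come from the same update.
    int64_t FirstRowOffset()
    {
        BufferInfo snapshot;
        {
            std::lock_guard<std::mutex> lock(fInfoLock);
            snapshot = fInfo;
        }
        return static_cast<int64_t>(snapshot.clipTop - snapshot.windowTop)
            * snapshot.bytesPerRow;
    }

private:
    std::mutex fInfoLock;
    BufferInfo fInfo{};
};
```

The snapshot keeps the fields consistent with each other during one swap, but, as noted above, it can still be out of date by the time it is used.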

follow-up: 7 comment:6 by stippi, 16 years ago

I think it makes much more sense like it is now. If I understand correctly, the DirectConnected() hook cannot change it before the rendering thread is done with it. This is how this is meant to work. It probably also fixes the bug where BDirectWindows draw over other windows that are moved in front of their windows, when app_server was running in single buffer mode.

in reply to: 6 comment:7 by anevilyak, 16 years ago

Replying to stippi:

I think it makes much more sense like it is now. If I understand correctly, the DirectConnected() hook cannot change it before the rendering thread is done with it. This is how this is meant to work. It probably also fixes the bug where BDirectWindows draw over other windows that are moved in front of their windows, when app_server was running in single buffer mode.

Are you sure? His fix looked specific to the OpenGL kit, not BDirectWindow as a whole.

comment:8 by stippi, 16 years ago

Yes, that's true. But I am not sure if I saw that problem with other BDirectWindows or only with OpenGL demos.

comment:9 by aldeck, 16 years ago

Resolution:	fixed
Status:	closed → reopened

Reopening: it crashes in the same place when moving a window over a BDirectWindow GL app (single CPU, VirtualBox, hrev28452).

comment:10 by korli, 16 years ago

Resolution:	→ fixed
Status:	reopened → closed

Should be fixed with hrev28504. Please check.

comment:11 by aldeck, 16 years ago

Fixed here on real hardware (2 CPUs) and in VirtualBox. Thanks.
