Opened 9 years ago

Closed 4 years ago

#6400 closed bug (fixed)

BLocker won't unlock from a different thread

Reported by: darkwyrm Owned by: axeld
Priority: normal Milestone: R1
Component: Kits/Support Kit Version: R1/alpha2
Keywords: Cc:
Blocked By: Blocking: #11655, #11860, #11984
Has a Patch: no Platform: All

Description

Once again my Paladin development efforts have uncovered another system-level bug.

It seems that if a BLocker is locked from one thread and unlocked from another, the Unlock fails. I've written a quick test app which locks a BLocker from within the main window's constructor and unlocks it in QuitRequested. The BLocker's status is then tested in the App class' destructor. When built and run under Zeta, the lock is unlocked. Under Haiku, it is still locked. I am attaching the test program momentarily.

Attachments (1)

HaikuThreadBugTest.zip (4.0 KB ) - added by darkwyrm 9 years ago.
Test app

Download all attachments as: .zip

Change History (20)

by darkwyrm, 9 years ago

Attachment: HaikuThreadBugTest.zip added

Test app

comment:1 by anevilyak, 9 years ago

That's actually very bad practice generally, usually the thread acquiring a lock should also be the one that releases it when done, mix and matching is almost invariably asking for trouble. BLocker guards against that situation intentionally for that reason, c.f. http://dev.haiku-os.org/browser/haiku/trunk/src/kits/support/Locker.cpp#L152

comment:2 by anevilyak, 9 years ago

Component: System/KernelKits/Support Kit

comment:3 by anevilyak, 9 years ago

That having been said, http://www.haiku-os.org/legacy-docs/bebook/BLocker.html#BLocker_Lock seems to indicate that Be considered that allowed behavior so we may have to change that for compat reasons.

comment:4 by pulkomandy, 5 years ago

Completely ignoring the unlock without even a warning is not really helpful either. I'm changing that to a debugger() call so we can identify apps trying to rely on this. Then we can make a decision on allowing it or not, depending on the issues we find (how many apps are affected, can they be fixed, etc). We could also turn this into a warning message on the console or syslogs (like the "resizing mode and flags swapped" in BWindow).

In hrev48542 I added a debugger() call in that case. Let's see if we can trigger it with some apps and make a decision from that.

comment:5 by anevilyak, 5 years ago

Blocking: 11655 added

(In #11655) Duplicate of #6400.

comment:6 by miqlas, 5 years ago

Homeworld_SDL crashing at start, maybe hrev48542 made this:

Active Threads:
	thread 2620: SDL 
	thread 2625: SDL 
	thread 2626: _BMediaRoster_ 
	thread 2627: SDL Audio control 
	thread 2628: SDL 
	thread 2629: w>Untitled 
	thread 2630: team 2619 debug task 
	thread 2619: homeworld_sdl (main)
		state: Call (Trying to unlock from the wrong thread (#6400))

Whole crashlog here: http://vault.extrowerk.com/homeworld-sdl-haiku/issue/1/crash-at-startup

comment:7 by waddlesplash, 5 years ago

Yes, that's the debugger call that Adrien added in that commit. Does Homeworld use more than one thread? If not, then this is a bug in our SDL port (it may also be a be a bug in our SDL port anyway that we don't handle the application calling SDL from different threads...)

comment:8 by pulkomandy, 5 years ago

@waddlesplash: if you had read the linked bugreport on homeworld side, you would have noticed that Philippe Houdoin already tracked where the problem is. The issue is in Mesa (the legacy version).

It uses a BLock in an unusual way, which worked on BeOS but resulted on the lock being always held in Haiku (since before I adde dthis check, the Unlock call would simply have been ignored). In the Mesa code there is a mention of adding this specifically to get Quake 2 working. So now we have to find/build a Quake 2 port and make sure it doesn't need this anymore. Or, we can comment the debugger() call or convert it to an stderr warning.

comment:9 by waddlesplash, 5 years ago

Ah, OK.

I recall @kallisti5 saying he'd gotten a Quake (2?) port working just this past weekend I think, so we might want to contact him to see if that was GCC2 or GCC4. And what kind of patches it needs.

in reply to:  8 comment:10 by miqlas, 5 years ago

Replying to pulkomandy:

@waddlesplash: if you had read the linked bugreport on homeworld side, you would have noticed that Philippe Houdoin already tracked where the problem is. The issue is in Mesa (the legacy version).

Homeworld depend on mesa_x86_swrast. Is it the legacy version? (I thought, the gcc2 version the legacy, or am I wrong?)

comment:11 by pulkomandy, 5 years ago

Sorry, I'm wrong here. It is indeed in the gcc4 version, which means we don't need to care about BeOS compatibility. So the hack in Mesa needs to be removed (and Quake 2 fixed or ported again using an SDL version).

comment:12 by pulkomandy, 5 years ago

Blocking: 11860 added

(In #11860) It's actually a problem with the use of BLocker in our mesa port, which uses a quirk that worked in BeOS and never worked in Haiku (previously it was ignored, now it's entering debugger, see #6400).

The crash is continuable: click the "debug" button, then quit debugger and click "resume". The app should then run (unless there are other bugs). Anything using SDL and OpenGL will hit this problem (some other games like f1spirit, for example).

comment:13 by miqlas, 5 years ago

Can we delete the problematic section from our Mesa port? If no, why not? It seems, that it makes more trouble as good.

http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/drivers/haiku/swrast/SoftwareRast.cpp#n180

	// some stupid applications (Quake2) don't even think about calling LockGL()
	// before using glGetString and its glGet*() friends...
	// so make sure there is at least a valid context.

	if (!_mesa_get_current_context()) {
		LockGL();
		// not needed, we don't have a looper yet: UnlockLooper();
	}

comment:14 by pulkomandy, 5 years ago

I think removing this should be ok (it's a no-op anyway in Haiku) but I'd prefer if someone more knowledgeable about Mesa would have a look; and if we could make sure there is still a way to get the Quake 2 engine to run (not necessarily the original one, but a "modern" open source version).

comment:15 by diver, 4 years ago

GLTeapot seems to be affected by it too.

comment:16 by pulkomandy, 4 years ago

Yes, all x86 OpenGL apps will have this problem in the current situation.

comment:17 by luroh, 4 years ago

Blocking: 11984 added

comment:19 by pulkomandy, 4 years ago

Resolution: fixed
Status: newclosed

We are now behaving the same as BeOS (with an extra warning message), which fixes the initial problem, but that makes some other apps (such as OpenGL) unhappy.

So yes, let's say this is fixed, OpenGL problems are tracked in #11860.

Note: See TracTickets for help on using tickets.