Context Navigation

#14983 closed bug (fixed)

[app_server] crashes after double clicking Audio Mixer in Cortex

Reported by:	diver	Owned by:	axeld
Priority:	critical	Milestone:	R1/beta2
Component:	Servers/app_server	Version:	R1/Development
Keywords:		Cc:
Blocked By:		Blocking:	#15597
Platform:	All

Description (last modified by diver)

hrev53001 x86_64.

Double clicking Audio Mixer (Haiku) found in Media add-ons window in Cortex crashes app_server. Reproducible.

Attachments (1)

app_server-1682-debug-26-03-2019-07-55-56.report (75.8 KB ) - added by diver 6 years ago.

Download all attachments as: .zip

Change History (21)

by diver, 6 years ago

Attachment:	app_server-1682-debug-26-03-2019-07-55-56.report added

comment:1 by diver, 6 years ago

Description:	modified (diff)
Summary:	[app_server] crashes after double clicking Audio Mixer → [app_server] crashes after double clicking Audio Mixer in Cortex

comment:2 by waddlesplash, 6 years ago

Milestone:	Unscheduled → R1/beta2

comment:3 by nephele, 5 years ago

Is app_server not autorestarting here a seperate bug?

comment:4 by diver, 5 years ago

Yes, see #8292.

comment:5 by diver, 5 years ago

Blocking:	15597 added

comment:6 by waddlesplash, 5 years ago

Priority:	normal → critical

        state: Call (_numBlocks > 0)

This looks like a finally reproducible case of the "classic" double-free that has lurked in app_server for years. We should try and fix this before beta2.

comment:7 by pulkomandy, 5 years ago

Is it possible to run app_server with libroot_debug? This crash happens when allocating memory, so it's hard to see when the double-free happened.

comment:8 by X512, 5 years ago

Is it reproducible with test_app_server?

comment:9 by pulkomandy, 5 years ago

apps need to be built specifically for test_app_server, and I'm not sure how easy it is to get Cortex running there.

comment:10 by pulkomandy, 5 years ago

I reproduced the crash 2 or 3 times and I see that it's always in HWInterface::_AdoptDragBitmap(ServerBitmap const*, BPoint const&).

I see that BListView "resets" fTrack->drag_start on double click. I guess the idea is that this should cancel the drag operation, but I don't see how it would achieve that. Quite the opposite, it makes sure the distance computation in the dragging thread will overflow and give an unexpected result...

Also, this code will spawn a new mouse tracking thread on every mouse click, which... well, I think is not the way to go about this. The threads are snoozing for a while, and if you time the clicks appropriately you can accumulate as many of them as you want. And then they will all want to initiate a drag (well, at least they do lock the looper before doing that).

I think we should rework the code in BListView to not use MouseDownThread, and just run in the normal message loop instead. This will make this a lot simpler to trace and debug, if it doesn't solve it.

comment:11 by X512, 5 years ago

BListView code should not crash app_server, no matter how it wrong.

follow-up: 13 comment:12 by pulkomandy, 5 years ago

I think calling BView::DragMessage() from outside the view own thread should not be allowed. At least it should probably call _CheckOwnerLockAndSwitchCurrent() instead of just _CheckOwnerLock() ?

app_server can try to do some validation, but if we send completely bogus data, it's difficult to catch it all there.

in reply to: 12 comment:13 by X512, 5 years ago

Replying to pulkomandy:

app_server can try to do some validation, but if we send completely bogus data, it's difficult to catch it all there.

app_server must validate all data that come from another processes. Otherwise specially crafted program can directly send data to server port link to crash app_server or install exploit.

comment:14 by pulkomandy, 5 years ago

Yes, indeed. But that's not an easy thing to do because the protocol wasn't necessarily designed with that in mind.

I never said no change is needed in app_server here, just that the current way the code is done for BListView makes it hard to track what's going on, and fixing that on the app side will likely make the bug not reproducible. If we want to fix the app_server we probably need to write a testing tool to feed it commands more directly, and maybe do some fuzzing on it.

follow-up: 16 comment:15 by pulkomandy, 5 years ago

Resolution:	→ fixed
Status:	new → closed

Fixed in hrev54022.

in reply to: 15 comment:16 by X512, 5 years ago

Replying to pulkomandy:

Fixed in hrev54022.

This change don't fix actual problem. Other software can still crash app_server.

comment:17 by pulkomandy, 5 years ago

Yes. But it was impossible to debug from this case (too many threads and too many timing dependant factors). We need to do some fuzzing on the app_server link and I guess it won't be long until we find many problems there. This particular issue is fixed, even if the root cause isn't.

comment:18 by X512, 5 years ago

Is there existing ticket for app_server crash on improper link communication or new ticket should be created?

comment:19 by waddlesplash, 5 years ago

Resolution:	fixed
Status:	closed → reopened

Did we even attempt to use libroot_debug?

I think we should at least do that before we give up and say we can't debug this.

comment:20 by pulkomandy, 5 years ago

Resolution:	→ fixed
Status:	reopened → closed

The issues found so far:

BView::DragMessage does only _CheckOwnerLock() instead of _CheckOwnerLockAndSwitchCurrent()
I don't know if calling AS_VIEW_DRAG_IMAGE from multiple threads, and also while dragging is happening, is a good idea (quite possibly it will replace the bitmap while the mouse handling is drawing it or something)
There could be missing error checks too (for example there is no InitCheck() for the BBitmap passed as parameter in DragMessage, so we can end up sending the invalid -1 server token to app_server)

In general, very few things are checked at the app_server_link level, and app_server is written assuming well-behaved BView is talking to it and nothing else. If you want to improve that, great, but it's a separate ticket with a deep design change of how the app_server_link works, not a simple bug report like here.

Created #15847 to track that.

Note: See TracTickets for help on using tickets.

Download in other formats: