Opened 3 years ago

Closed 3 years ago

#17163 closed bug (fixed)

Mac mini now requires fail-safe graphics driver

Reported by: beaglejoe
Owned by: rudolfc
Priority: normal
Milestone: R1/beta4
Component: Drivers/Graphics/intel_extreme/9xx
Version: R1/beta3
Keywords:
Cc: x512, mmu_man
Blocked By:
Blocking:
Platform: x86

Description

Starting with hrev55253, my 2006 Mac mini requires the fail-safe graphics driver and no longer supports 1920 x 1080.

Attachments (3)

listdev.txt (2.8 KB) - added by beaglejoe 3 years ago.
syslog (206.4 KB) - added by beaglejoe 3 years ago.
reverted-55253-plus-status.patch (1.5 KB) - added by beaglejoe 3 years ago.

Change History (32)

comment:1 by Coldfirex, 3 years ago

Please provide listdev and syslog.

by beaglejoe, 3 years ago

Attachment: listdev.txt added

in reply to:  1 comment:2 by beaglejoe, 3 years ago

Replying to Coldfirex:

Please provide listdev and syslog.

These are from successful boot after choosing fail-safe graphics.

listdev syslog

I don't know how to obtain syslog from failed boot.

The issue can be fixed by reverting the change from

https://review.haiku-os.org/c/haiku/+/4208 (master)

https://review.haiku-os.org/c/haiku/+/4184 (beta3)

This will of course undo the fix for #17078

So further investigation is needed.

comment:3 by pulkomandy, 3 years ago

Component: - General → Drivers/Graphics/intel_extreme/9xx
Owner: changed from nobody to rudolfc
Status: new → assigned

comment:4 by rudolfc, 3 years ago

Hi, can you please boot normally (without VESA mode)? Once the system appears to be idle, reboot by briefly pressing the power button, or by briefly pressing Ctrl-Alt-Del, waiting a second or two, and pressing that combination again, this time for 5 seconds or more, until you see the system rebooting.

On the second boot, use vesa, and when booted, upload the previous syslog here, maybe also the normal (or old syslog) to be sure we have everything. I can delete the ones that don't contain the driver's attempt to init your card later on.

Thank you, and thanks for the hint at DPMS :-)

comment:5 by diver, 3 years ago

Cc: x512 mmu_man added

in reply to:  4 comment:6 by beaglejoe, 3 years ago

Replying to rudolfc:

Hi, can you please boot normally (without VESA mode)? Once the system appears to be idle, reboot by briefly pressing the power button, or by briefly pressing Ctrl-Alt-Del, waiting a second or two, and pressing that combination again, this time for 5 seconds or more, until you see the system rebooting.

Neither of these methods will reboot the machine. I can only shut down by holding the power button for several seconds.

On the second boot, use vesa, and when booted, upload the previous syslog here, maybe also the normal (or old syslog) to be sure we have everything. I can delete the ones that don't contain the driver's attempt to init your card later on.

After rebooting, I do not find a previous syslog. Only syslog, which seems the same as the attached one.

I tried enabling on screen debug output. This only shows one page (of PCI stuff). Press key to continue does not work. At this point I have to power down.

comment:7 by rudolfc, 3 years ago

So it looks like your keyboard is not responding, at least partly, during booting. Then there's still the option to use serial port debugging, but you would need a second PC to grab the results. Would that work for you?

in reply to:  7 comment:8 by beaglejoe, 3 years ago

Replying to rudolfc:

So it looks like your keyboard is not responding, at least partly, during booting.

It is odd. I can get to the boot menu and scroll up and down in the current boot log, but once I leave the menu (continue booting) the keyboard seems to get lost. I tried a USB keyboard with the same results.

Then there's still the option to use serial port debugging, but you would need a second PC to grab the results. Would that work for you?

No serial port.

Maybe a clue: when the machine fails to boot, all icons are lit up, the screen goes black for a second or two, then returns to the logo with all icons lit up.

comment:9 by diver, 3 years ago

Enable on screen debug output *and* disable paging.

comment:10 by rudolfc, 3 years ago

That's a nice hint, I never realized that was possible (disable paging). Another option: would serial debugging work through a USB to serial adapter? I don't actually know.

comment:11 by pulkomandy, 3 years ago

would serial debugging work through a USB to serial adapter? I don't actually know.

No, you need a 16C550 compatible UART. Either on the motherboard, or on a PCI or ExpressCard card.

But in this case where the system boots anyway (it's just the screen that doesn't work), probably the easiest way is to connect to the machine with ssh to get the logs?

in reply to:  9 comment:12 by beaglejoe, 3 years ago

Replying to diver:

Enable on screen debug output *and* disable paging.

Indeed, that seems to work. The last item was something about packfs, but I could not get a picture before wifi spam covered the screen.

in reply to:  11 comment:13 by beaglejoe, 3 years ago

Replying to pulkomandy:

would serial debugging work through a USB to serial adapter? I don't actually know.

No, you need a 16C550 compatible UART. Either on the motherboard, or on a PCI or ExpressCard card.

But in this case where the system boots anyway (it's just the screen that doesn't work), probably the easiest way is to connect to the machine with ssh to get the logs?

I believe you are right, the machine is fully booted. I can see it attached to my router and I can ping it.

comment:14 by beaglejoe, 3 years ago

FYI, it constantly reports

/dev/net/atheroswifi/0: media change

I should probably disable that?

by beaglejoe, 3 years ago

Attachment: syslog added

comment:15 by beaglejoe, 3 years ago

Replaced syslog

comment:16 by rudolfc, 3 years ago

Thanks. Unfortunately the intel_extreme accelerant does not get loaded here. Your card should indeed be supported (ID 27A2, third gen). Furthermore, I did not change anything for this generation, if all is right. Can you retry booting with the intel_extreme accelerant enabled (so you don't get a screen in the end), and refetch the syslog via the network connection?

Thanks!

in reply to:  16 comment:17 by beaglejoe, 3 years ago

Replying to rudolfc:

Thanks. Unfortunately the intel_extreme accelerant does not get loaded here. Your card should indeed be supported (ID 27A2, third gen). Furthermore, I did not change anything for this generation, if all is right. Can you retry booting with the intel_extreme accelerant enabled (so you don't get a screen in the end), and refetch the syslog via the network connection?

Thanks!

Interesting. The syslog is not being created at this point. I am currently in the state:
Screen is showing lit boot icons.
Connected from Linux via ftp
There is no syslog (I moved it when I enabled ssh and ftp), then rebooted.

So the logs I've already attached are from previous boots and can be deleted.

comment:18 by beaglejoe, 3 years ago

top shows app_server running and using 98% CPU.

comment:19 by rudolfc, 3 years ago

Ah OK! Now it's beginning to make sense to me. I now suspect app_server issues B_DPMS_ON -before- it has set a mode, which may never happen.

Apparently the driver is going to need a sanity check for this, as I think it may now be waiting for a Vblank that never happens, or the card even 'crashes' because PLLs get enabled that were not programmed before by a mode set command.
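
A minimal sketch of the kind of sanity check meant here, assuming a driver that remembers whether a mode has been programmed. FakeDriver, its members and the kDpms* constants are hypothetical stand-ins, not the intel_extreme code, and returning an error (rather than silently ignoring the request) is just one possible choice.

```cpp
#include <cstdint>

// Stand-ins for Haiku's status codes; the values are illustrative only.
typedef int32_t status_t;
const status_t B_OK = 0;
const status_t B_ERROR = -1;

// Hypothetical DPMS states for this sketch.
enum dpms_state { kDpmsOff, kDpmsOn };

struct FakeDriver {
	bool modeSet = false;
	dpms_state dpms = kDpmsOff;

	// Pretend to program the PLLs and pipe for the requested mode.
	status_t SetDisplayMode(int width, int height)
	{
		if (width <= 0 || height <= 0)
			return B_ERROR;
		modeSet = true;
		return B_OK;
	}

	status_t SetDpmsMode(dpms_state state)
	{
		// Sanity check: refuse to power up the display before any mode
		// was programmed, since nothing sensible could be enabled yet.
		if (state == kDpmsOn && !modeSet)
			return B_ERROR;
		dpms = state;
		return B_OK;
	}
};
```

With this guard, an early "on" request is rejected and leaves the state untouched, while the same request after a successful SetDisplayMode() goes through.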

comment:20 by beaglejoe, 3 years ago

This code is with hrev55253 reverted:

/src/servers/app/VirtualScreen.cpp approx line 112

status_t
VirtualScreen::AddScreen(Screen* screen, ScreenConfigurations& configurations)
{
	screen_item* item = new(std::nothrow) screen_item;
	if (item == NULL)
		return B_NO_MEMORY;

	item->screen = screen;

	status_t status = B_ERROR;
	display_mode mode;
	if (_GetMode(screen, configurations, mode) == B_OK) {
		// we found settings for this screen, and try to apply them now
		status = screen->SetMode(mode);
	}
	if (status != B_OK) {
		status_t status = screen->SetPreferredMode();
		if (status != B_OK)
			status = screen->SetBestMode(1024, 768, B_RGB32, 60.f);
		if (status != B_OK)
			status = screen->SetBestMode(800, 600, B_RGB32, 60.f, false);
		if (status != B_OK) {
			debug_printf("app_server: Failed to set mode: %s\n",
				strerror(status));
		}
	}

	// Turn on screen if this is not yet done by BIOS
	if (status == B_OK)
		screen->HWInterface()->SetDPMSMode(B_DPMS_ON);

	// TODO: this works only for single screen configurations
	fDrawingEngine = screen->GetDrawingEngine();
	fHWInterface = screen->HWInterface();
	fFrame = screen->Frame();
	item->frame = fFrame;

	fScreenList.AddItem(item);

	return B_OK;
}

The call to SetDPMSMode(B_DPMS_ON) was moved to screen.cpp in that commit.

But I'm thinking that the real problem is that this code is using the wrong 'status'

	// Turn on screen if this is not yet done by BIOS
	if (status == B_OK)
		screen->HWInterface()->SetDPMSMode(B_DPMS_ON);

If any of the 'SetPreferredMode()' or 'SetBestMode()' calls succeeds, the result is stored in a block-local 'status' variable, not in the outer one that the DPMS check reads:

status_t status = screen->SetPreferredMode();

I'm not familiar with the original problem, but this looks wrong to me.
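
To illustrate the point in isolation, here is a minimal sketch with stand-in types. FakeScreen, AddScreenShadowed() and AddScreenFixed() are hypothetical names for this example only, and the status values are placeholders; it assumes a screen whose stored mode fails to apply but whose preferred mode works.

```cpp
#include <cstdio>

// Stand-ins for Haiku's status codes; the values are illustrative only.
typedef int status_t;
const status_t B_OK = 0;
const status_t B_ERROR = -1;

// Hypothetical screen: the configured mode fails, the preferred mode works.
struct FakeScreen {
	bool dpmsOn = false;
	status_t SetMode() { return B_ERROR; }
	status_t SetPreferredMode() { return B_OK; }
	void SetDPMSOn() { dpmsOn = true; }
};

// Mirrors the reverted code: the inner declaration shadows the outer 'status'.
bool
AddScreenShadowed(FakeScreen& screen)
{
	status_t status = screen.SetMode();
	if (status != B_OK) {
		status_t status = screen.SetPreferredMode();
			// new block-local variable; the outer one stays B_ERROR
		if (status != B_OK)
			printf("failed to set mode\n");
	}
	if (status == B_OK)
		screen.SetDPMSOn();
	return screen.dpmsOn;
}

// Same flow, but assigning to the existing variable.
bool
AddScreenFixed(FakeScreen& screen)
{
	status_t status = screen.SetMode();
	if (status != B_OK) {
		status = screen.SetPreferredMode();
		if (status != B_OK)
			printf("failed to set mode\n");
	}
	if (status == B_OK)
		screen.SetDPMSOn();
	return screen.dpmsOn;
}
```

In the shadowed version the fallback succeeds but DPMS is never switched on; in the fixed version it is. Building with -Wshadow (GCC/Clang) flags the inner declaration.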

comment:21 by rudolfc, 3 years ago

I agree. The reverted code was indeed wrong. It should have used the same 'status' variable instead of a new, very local one, as the new status is not relayed back to the original one, so to speak. Hence DPMS was never turned on in case one of the last two setmodes succeeded.

I am also not aware of the original problem, so I am hoping @X512 is reading along here. I also seem to remember that someone else worked on DPMS in a later commit?

comment:22 by beaglejoe, 3 years ago

I added some debug statements and it looks to me like @X512's first comment at:
https://review.haiku-os.org/c/haiku/+/4184

Does it means that screen is turned on before setting mode? That may be not correct.

is correct.

At least for the Intel driver. Tracing through the reverted code, the mode is set before DPMS is turned on and I get 1920 x 1080. It in fact works without the call to SetDPMSMode(B_DPMS_ON).
But with the code as it is, (SetDPMS() before setting the mode) app_server hangs using more than 90% CPU.

comment:23 by rudolfc, 3 years ago

I saw the discussion, thanks. I disagree with their conclusions. DPMS should not be called before setting a mode, since the driver/hardware is not yet in a fully configured state. SetMode is part of the configuration of the gfx hardware (and the screen).

There are more things not working correctly btw with app_server when it comes to setting modes, though I can't pinpoint this fully yet. Anyhow, I have a third gen gfx system here, I'll see if I can reproduce your findings with booting and see if I can block the driver from executing that DPMS call effectively.

I wouldn't mind though if app_server would be updated again to -not- use DPMS before the first SetMode is executed successfully(!).

comment:24 by pulkomandy, 3 years ago

I wouldn't mind though if app_server would be updated again to -not- use DPMS before the first SetMode is executed successfully(!).

The first patch was merged a bit early before beta3, and then mmu_man tried to quickly fix it as beta3 was nearing. If the taken approach is incorrect, let's just remove these changes and fix things in a more correct way.

I would say it's up to each driver to turn DPMS on as needed whenever there is a mode change?

comment:25 by rudolfc, 3 years ago

I think reverting it to the original code with a patch for the double declaration for status_t is correct. That might well fix the problem x512 was seeing on risc-v as well.

DPMS should be turned ON by app_server -after- a successful setmode, and off -before-. Though the driver may do this itself as well. The Be example driver would probably be a good reference to check how it should be done.

DPMS can also lower gfx card power consumption, so it does not only have an effect on a connected screen.
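
One reading of that ordering, as a minimal sketch: FakeHardware and ChangeMode() are hypothetical names used only here, and the call log exists just to make the sequence visible. It assumes the display is left off when the mode set fails.

```cpp
#include <string>
#include <vector>

// Hypothetical hardware stand-in that records the order of calls.
struct FakeHardware {
	std::vector<std::string> calls;
	bool failSetMode = false;

	void SetDpms(bool on) { calls.push_back(on ? "dpms_on" : "dpms_off"); }

	bool SetMode()
	{
		calls.push_back("set_mode");
		return !failSetMode;
	}
};

// DPMS off before the mode set, DPMS on only after it succeeded.
bool
ChangeMode(FakeHardware& hw)
{
	hw.SetDpms(false);
	if (!hw.SetMode())
		return false;
			// leave the display off: nothing valid is programmed
	hw.SetDpms(true);
	return true;
}
```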

by beaglejoe, 3 years ago

Attachment: reverted-55253-plus-status.patch added

in reply to:  25 comment:26 by beaglejoe, 3 years ago

Replying to rudolfc:

I think reverting it to the original code with a patch for the double declaration for status_t is correct. That might well fix the problem x512 was seeing on risc-v as well.

Attached patch as described, if anyone would like to test.
reverted-55253-plus-status.patch

comment:27 by rudolfc, 3 years ago

With a little help I just pushed your patch to gerrit: https://review.haiku-os.org/c/haiku/+/4362

Thanks for the patch!

comment:28 by beaglejoe, 3 years ago

I tested Patchset 2 at https://review.haiku-os.org/c/haiku/+/4362

It does fix the problem.

comment:29 by pulkomandy, 3 years ago

Milestone: Unscheduled → R1/beta4
Resolution: → fixed
Status: assigned → closed

Fixed in hrev55344.
