Opened 8 years ago

Last modified 2 years ago

#7787 assigned bug

VESA regression: 640x480 no longer works at boot (but works in Screen prefs!)

Reported by: ttcoder
Owned by: nobody
Priority: low
Milestone: Unscheduled
Component: System/Boot Loader
Version: R1/alpha3
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: x86

Description

At first bootup after R1A3 installation, selecting "use failsafe video mode" in bootloader, the stage 1 bootloader exits into a turned-off monitor, and the monitor stays off ad infinitum (i.e. even after app_server starts).

VESA worked out-of-the-box in hrev41843

Also see #7655 for background on this.

Details coming up.

Attachments (3)

Change History (19)

comment:1 Changed 8 years ago by ttcoder

Details:

Behavior on first bootup(s) after installing R1A3 (those are the scary ones! :-)

  • "Use fail-safe video mode" does not work: the monitor turns off as soon as stage-2 bootloader exits. The HDD keeps spinning and spinning, and when it stops the monitor is still turned off. Game over.

Behavior after finally understanding that I have to pick 800x600 VESA resolution (this is the only one that works in VESA now in alpha3):

  • "Use fail-safe video mode" works fine
  • you have to keep picking that resolution in the safemode boot menu each time, though.

If in addition I open the Screen preflet and 'Apply',

  • the settings/kernel/drivers/vesa file is created with contents "mode 800 600 32"...
  • which means that from now on:

Current Behavior:

  • "Use fail-safe video mode" works perfectly, even leaving resolution set to "Default" in the boot menu.
  • but if I ever select 640x480 in the boot menu, I return back to situation one, the monitor turns off and stays off even after the app_server starts!
  • Edit To Add n.2: high weirdness: if I boot into 800x600 VESA and then open Screen preferences and switch to 640x480, then this resolution works perfectly. So it seems it's not the VESA driver itself which is at fault, just the boot process.
  • Edit To Add: weirdly, I still need to select 800x600 in the boot menu if I want the nVidia accelerant to work correctly. If I leave "Default" resolution, the screen turns off! syslogs coming up soon.
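For reference, the one-line settings file mentioned above ("mode 800 600 32") has a width, a height and a bit depth. Below is a minimal standalone sketch of reading such a line. It is only an illustration: the struct and function names are made up here, and this is not how the VESA driver itself reads the file.

```cpp
#include <cstdio>

// Hypothetical container for the three values stored in the settings file.
struct VesaModeSetting {
	int width;
	int height;
	int bitsPerPixel;
};

// Returns true if `line` has the form "mode <width> <height> <bpp>"
// with three positive integers; `setting` is only written on success.
bool
parse_vesa_mode_line(const char* line, VesaModeSetting& setting)
{
	int width, height, bitsPerPixel;
	if (std::sscanf(line, "mode %d %d %d", &width, &height,
			&bitsPerPixel) != 3)
		return false;

	if (width <= 0 || height <= 0 || bitsPerPixel <= 0)
		return false;

	setting = {width, height, bitsPerPixel};
	return true;
}
```

A truncated line such as "mode 800 600" is rejected rather than half-applied.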

So in a nutshell, there is a regression somewhere between hrev41848 and Alpha3, which makes the VESA 640x480 resolution not work. I can't fault Haiku for picking that one instead of 800x600, it makes sense to pick the most conservative res. at first and let the user improve.. But I don't understand why 640x480 no longer works now.

As an important note, this also affects the nVidia driver ( http://dev.haiku-os.org/ticket/7655#comment:11 ). In other words, the VESA boot menu settings affect the proper functioning of the nVidia driver, a native accelerant which has nothing to do with VESA (from the user's point of view)! Probably a bad interaction with the bootloader passing the resolution to.. create_display_modes() which.... breaks nvidia.accelerant..??

I'm picking "bootloader" for component now as I'm clueless about who does what exactly, but I suspect there are more guys involved (or maybe the other guys read some data gathered at boot time by the bootloader, and that data has regressed ?)

Creating only one ticket for now, as I have a feeling this is one regression with two consequences.

Last edited 8 years ago by ttcoder

comment:2 Changed 8 years ago by ttcoder

Summary: changed from "VESA regression: 640x480 no longer works" to "VESA regression: 640x480 no longer works at boot (but works in Screen prefs!)"

comment:3 Changed 8 years ago by ttcoder

Collected the syslogs and ran a diff on them but there's almost zero actionable info it seems: lots of differing lines due to different RAM address/pointers, and the one line that says "640 x 480" instead of "800 x 600", but no hint at why the monitor fails to sync in 640x480 mode (no indication of pixelclock or refresh rate or anything). Attaching syslogs anyway..

Changed 8 years ago by ttcoder

comment:4 Changed 8 years ago by axeld

GTF support implemented in hrev42420. That won't help much with this ticket, it will just ease it for someone looking into it :-)

comment:5 Changed 8 years ago by ttcoder

A bit of feedback:

  • Found a gtf.cpp file in apps/preferences/Screen, looks like it had already been done a couple years back.. Wished I had looked before, it's a bit late to warn you now :-x
  • Examining the Screen preflet more, I think I understand the sequence to be like thus: Screen calls ->BScreen::GetModes(), which -> gets to the server side (app_server) Screen class, which -> calls AccelerantHWInterface's accelerant hook ... which might (or might not, in the case of older drivers) call create_display_modes() (in accelerants/common).
  • So in short, the Screen preflet relies on the accelerant to call (or not) create_display_modes(), which is needed to get the best possible listing of modes (from all sources: EDID Vesa modes, EDID Std modes, EDID DetailedMonitor mode, the hardcoded base modes list, the whole shebang).
  • Edit:
  • since hrev42420 create_display_modes() (called by the VESA driver and others) ensures that the Generalized Timing Formula (GTF) function compute_display_timing() is called on at least one mode, as a fall-back option.
  • since hrev42421 the VESA driver additionally calls GTF-compute_display_timing() itself, as well, on its hardcoded list of modes.
  • So next I have to dig in the bootloader's video.cpp file to see if it does the same (call create_display_modes()) or not..
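To make the "all sources" idea from the list above concrete, here is a rough standalone sketch of merging several mode sources into one list without duplicates. It is not the actual create_display_modes() code: the Mode struct, the function names and the priority order are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified mode record (not Haiku's display_mode).
struct Mode {
	uint16_t width;
	uint16_t height;
	uint16_t refresh;	// in Hz

	bool operator==(const Mode& other) const
	{
		return width == other.width && height == other.height
			&& refresh == other.refresh;
	}
};

// Appends every mode from `source` that is not already in `list`.
static void
merge_modes(std::vector<Mode>& list, const std::vector<Mode>& source)
{
	for (const Mode& mode : source) {
		if (std::find(list.begin(), list.end(), mode) == list.end())
			list.push_back(mode);
	}
}

// Builds one list from all four sources. The order used here (detailed,
// established, standard, base) is an assumption of this sketch.
std::vector<Mode>
build_mode_list(const std::vector<Mode>& edidEstablished,
	const std::vector<Mode>& edidStandard,
	const std::vector<Mode>& edidDetailed,
	const std::vector<Mode>& baseModes)
{
	std::vector<Mode> list;
	merge_modes(list, edidDetailed);
	merge_modes(list, edidEstablished);
	merge_modes(list, edidStandard);
	merge_modes(list, baseModes);
	return list;
}
```

An accelerant that skips this kind of merge and reports only one source would explain why some resolutions never show up in the Screen preflet.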

Ok off I go to try the latest nightly first.

Last edited 8 years ago by ttcoder

comment:6 Changed 8 years ago by ttcoder

Just booted into the hrev42421 a few times to test different configurations, and indeed the situation is mostly not fixed -- but there is one improvement. What is unchanged is this: the home/config/settings/kernel/drivers/vesa file is still necessary to obtain a successful bootup in VESA mode, otherwise the screen stays black (for VESA, and nVidia). However, the nVidia accelerant is now satisfied with having only this kernel/drivers/vesa file: once it is created, no more need to press "Shift" at bootup to enter safemode menu and select 800x600 override.

But this does not necessarily invalidate the experimenting I had done yesterday, with the R1A3 Haiku:

I had made some progress on the bootloader front, trying to replace the haiku_loader file with older revisions, and indeed clobbering the alpha3 bootloader with an older bootloader "fixes" the Haiku boot process, allowing me to boot into nVidia accelerant without tweaking anything in the safemode menu (aaah the joys of not having to press the Shift key at boot :-) . The file names should tell the story:

~> ls -l /boot/system/
(..)
-rwxrwxrwx 1 user root  197424 Jul 10 20:35 haiku_loader
-rwxrwxrwx 1 user root  210928 May 16 17:48 haiku_loader.41539 (ok, like 41843 would)
-rwxr-xr-x 1 user root  210864 Jul  2 19:34 haiku_loader.42211.R1A3 (bad)
-rwxrwxrwx 1 user root  197424 Jul 10 20:35 haiku_loader.42403 (good but with two icon rows ??)

Both experiments (new haiku build, and file-clobbering hacks) still hint at a "side effect" of some sort IMHO... But I gotta find out for sure. Will keep looking.

comment:7 Changed 8 years ago by axeld

Although I even committed the patch, I did not remember, thanks for the reminder! At least we now correctly advertise it in AboutSystem :-)

I see Gerald has taken the work to simplify the function, and also adapt it to our coding style. I guess those should be merged again in the future (Screen should just reuse the code in accelerants/common/ then).

The boot loader does only work with VESA, and is therefore not interested in the display timings.

comment:8 Changed 8 years ago by ttcoder

Update: ignore this comment below; see next comment after.. The below comment is bollocks, turns out it's caused simply by http://dev.haiku-os.org/browser/haiku/trunk/src/system/boot/platform/bios_ia32/video.cpp#L226 not by memory corruption

@devs:
It still looks like a memory corruption. Been reviewing video.cpp side-by-side with the syslog, and I understand better the code flow: it turns out that the bootloader does select 800x600 when initializing, it is smart enough to pick that one over the smaller resolutions. But that choice gets lost along the way to kernel-land, AND the other choices get corrupted; explanation:

  • platform_switch_to_text_mode() is ok
  • platform_init_video() is ok (it does select 800x600 in vesa_init())
  • platform_switch_to_logo() oddly dprintf()s that gKernelArgs.frame_buffer is 640x480 instead of 800x600 !

You can check the 'syslog-failure' file I have posted, it does log what I describe: the loader TRACE()ing 0x115 (that is, 800x600), and then asking the kernel to use 640x480x32 (dprintf())..

So it seems that 'gKernelArgs' gets corrupted (part of it gets clobbered) in between the user menu and kernel loading. And thus when the kernel kicks in, it is passed the wrong arguments; additionally, the parameters for 640x480 maybe get corrupted too (since there is also the bug that this resolution does not work since alpha3).. Additionally, it also corrupts the nVidia driver depending on the situation?

The last few days I was thinking about building the loader from source.. Maybe I will still do that: as experimented before, reverting to an older revision of 'haiku_loader' fixes the corruption, so this hints at the memory corruption occurring inside the loader, not in the kernel.

NOTE on video.cpp:50 :

static video_mode *sMode, *sDefaultMode;

'sMode' is not explicitly initialized to NULL, and only gets assigned at the end of platform_init_video(), though that's probably not the source of my problems.

Last edited 8 years ago by ttcoder

comment:9 Changed 8 years ago by axeld

I find it highly unlikely that a memory corruption will replace a previously chosen 800x600 with 640x480. But adding more debug output will certainly help you to understand the situation :-)

comment:10 Changed 8 years ago by ttcoder

Ok so I'm back on track, and working on a first 'real' patch. It will address only a tiny parcel of my problem(s), but I believe it's a big enough one that it has to be attacked with a cartesian strategy :-]..

Here's the first problem I want to address:

It should not. Forcing the DETAILED_MONITOR_DESC on the user is deprecated in Haiku, as per other discussions in this ticket and others. Such code flow is present also in the nVidia driver, which will also need a patch later on, so that it calls create_display_modes() (that one being a nice guy who uses all 3 EDID sections) but let's focus on the bootloader for now, and this ticket here.

Instead I propose that...

  • find_edid_mode() applies the "prefer higher resolutions" policy to both DETAILED_MONITOR_DESC and STD_TIMING, merged somehow.

Later on I'll propose a patch to also look in the third VESA_Desc section (which is the one that has 1024x768 for me.. Yummy!) but again, let's attack these problems one at a time :-)

Feedback? Shall I start writing this (small) change ?

comment:11 Changed 8 years ago by axeld

Well, no. The detailed timing is the one to use in 99% of the cases. What you experience is due to a pretty much broken EDID report. Your monitor is at fault; punishing everyone else is not the solution.

No matter what modes are present, the system will not always default to use the highest one, as that one is not always the best supported one; it may just kind of work. The common use of the detailed timing is to specify what the monitor does best, and the boot loader as well as the app_server will choose that mode when it's present. create_display_modes() will only create the mode list, it's not responsible for choosing the default mode.

Now your monitor obviously doesn't use the detailed timing this way. So what we need here is a work-around for your particular problem, rather than to weaken an otherwise perfectly working implementation (AFAIK). I would suggest some heuristics to apply, ie. if the detailed timing is too low or otherwise unbelievable, ignore it as the default mode, and fall back to the other way to determine the default mode. If the problem is not otherwise solvable, you could also make the code choose the right mode if it encounters a particular monitor; however, this shouldn't be needed here.
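A minimal sketch of the kind of heuristic suggested above: use the detailed timing as the default unless it looks implausible, otherwise use whatever mode the fallback logic came up with. The VideoMode struct, the function names and the 640x480 threshold are all hypothetical placeholders, not values taken from the boot loader.

```cpp
#include <cstdint>

// Hypothetical, simplified mode record.
struct VideoMode {
	uint16_t width;
	uint16_t height;
};

// Placeholder threshold: a detailed timing smaller than this is treated
// as "unbelievable". The real cut-off would need to be chosen carefully.
static const uint16_t kMinPlausibleWidth = 640;
static const uint16_t kMinPlausibleHeight = 480;

static bool
is_plausible(const VideoMode& mode)
{
	return mode.width >= kMinPlausibleWidth
		&& mode.height >= kMinPlausibleHeight;
}

// Returns the detailed timing if it is present and plausible, otherwise
// the caller-provided fallback (however that one was determined).
VideoMode
choose_default_mode(const VideoMode* detailed, const VideoMode& fallback)
{
	if (detailed != nullptr && is_plausible(*detailed))
		return *detailed;

	return fallback;
}
```

The point of this shape is that a sane EDID still wins in the common case; only an absent or implausible detailed timing takes the fallback path.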

comment:12 in reply to: 11 Changed 8 years ago by ttcoder

Tried the 'debug' vesa accelerant posted in the other ticket, from different partitions and safemode parameters, but it always results (when app_server starts) in a white blank screen and hanged PC -- the PC even refuses to drop into KDL. I've restored the 'plain' vesa accelerant for now.

---

Replying to axeld:

No matter what modes are present, the system will not always default to use the highest one, as that one is not always the best supported one;

Ok I was confusing two unrelated things indeed: trusting EDID-Detailed makes sense when looking for an optimal screen mode at bootup, since even if not optimal then the user can still change the resolution later on anyway to a higher one (at his own risk); whereas the issue of some accelerants "filtering out" some resolutions (and making them completely inaccessible to the user) despite their being listed in EDID should be fixed by improving their EDID interpretation, but that's a different thing altogether.

I'm up to speed now (hopefully)..

If I may suggest so, the syslog would be improved if this

KERN: Using mode 0x115

was expressed instead as

KERN: Using mode 0x115 as fallback in case EDID is absent

And if

TRACE(("Using EDID mode %u x %u x %u\n", sDefaultMode->width, sDefaultMode->height, sDefaultMode->bits_per_pixel));

was inserted.

Anyway that's my 2 cents feedback as an outsider without intimate knowledge of the source who tried to debug his problem by reading wayy too many syslogs of late :-)

--

Will go stealth now and try to report back here only when finding out exactly what happens when this gets passed a 640x480 mode.

Last edited 8 years ago by ttcoder

comment:13 in reply to:  12 Changed 8 years ago by siarzhuk

Replying to ttcoder:

Tried the 'debug' vesa accelerant posted in the other ticket, from different partitions and safemode parameters, but it always results (when app_server starts) in a white blank screen and hanged PC -- the PC even refuses to drop into KDL.

That is app_server crashing while trying to access the NULL initialModes pointer:

http://www.freelists.org/post/haiku-commits/r42427-haikutrunksrcaddonsaccelerantsvesa,1

comment:14 Changed 5 years ago by ttcoder

Priority: changed from normal to low

(low: I no longer use this nvidia-based 'puter, and nobody else reported this bug as still present)

comment:15 Changed 4 years ago by ttcoder

Milestone: changed from R1 to Unscheduled

comment:16 Changed 2 years ago by axeld

Owner: changed from axeld to nobody
Status: changed from new to assigned