Ticket #1950 (new bug)

Opened 2 months ago

Last modified 3 weeks ago

Garbled screen contents with true colour mode on nvidia gf2mx

Reported by: jopadan Assigned to: axeld
Priority: normal Milestone: R1
Component: - General Version: R1 development
Cc: Platform: x86

Description

System is a single Opteron 240 on a Gigabyte K8NNXP-940 mainboard (nForce3 150 chipset).
Graphics Cards:
Nvidia GeForce 2 MX 32MB AGP
2x Voodoo2 12MB SLI

Haiku version is current SVN trunk.
Initial screen at first boot is garbled too.
I don't know whether the nvidia accelerant or the vesa driver is in use; I just use the default configuration.
Setting 8 and 16 bit modes works fine.
24 and 32 bit modes are garbled, and it is very hard to make out the screen contents.

Attachments

MTRR_serial_r24559.txt (80.6 kB) - added by jonas.kirilla on 03/24/08 14:56:22.
syslog (107.3 kB) - added by jopadan on 03/26/08 07:59:19.

Change History

03/21/08 20:30:37 changed by jonas.kirilla

I see this too with an nVidia FX5500 card. I think r24494 is where it starts happening, so I suppose it has to do with MTRR. The screen looks like the accelerant isn't working: no block fills and no block moves, and everything shown leaves traces. (Even the Shutdown window leaves its mark, already at bootup!) I can provide serial output and screenshots tomorrow if desired.

03/23/08 14:16:10 changed by jonas.kirilla

03/24/08 14:56:22 changed by jonas.kirilla

  • attachment MTRR_serial_r24559.txt added.

03/24/08 15:00:51 changed by jonas.kirilla

Serial output attached. Part of it:
...
allocate MTRR slot 0, base = 0, length = 20000000, type=0x6
kernel debugger extension "debugger/hangman/v1": loaded
kernel debugger extension "debugger/invalidate_on_exit/v1": loaded
allocate MTRR slot 1, base = f0000000, length = 100000, type=0x1
...
loaded driver /boot/beos/system/add-ons/kernel/drivers/dev/graphics/vesa
allocate MTRR failed, it overlaps an existing MTRR slot
allocate MTRR slot 2, base = f0000000, length = 8000000, type=0x1

Same base? (f0000000)
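
The question above can be checked with simple range arithmetic. The following is a minimal sketch, not the kernel's actual check: the `Range` type and both helper functions are hypothetical names introduced here, and only the base and length values are taken from the serial output. The alignment rule in `is_valid_variable_mtrr` is the general x86 constraint for variable-range MTRRs (power-of-two length, base aligned to the length), not something stated in this ticket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A physical address range as logged: base and length in bytes."""
    base: int
    length: int

    @property
    def end(self) -> int:
        # Exclusive end address of the range.
        return self.base + self.length


def overlaps(a: Range, b: Range) -> bool:
    """True if the two half-open ranges share at least one byte."""
    return a.base < b.end and b.base < a.end


def is_valid_variable_mtrr(r: Range) -> bool:
    """Power-of-two length and base aligned to that length."""
    power_of_two = r.length > 0 and (r.length & (r.length - 1)) == 0
    return power_of_two and r.base % r.length == 0


# Values copied from the "allocate MTRR slot" lines in the serial output.
slot0 = Range(0x00000000, 0x20000000)  # type=0x6
slot1 = Range(0xF0000000, 0x00100000)  # type=0x1
slot2 = Range(0xF0000000, 0x08000000)  # type=0x1

print(overlaps(slot1, slot2))  # True: slot 1 lies inside slot 2
print(overlaps(slot0, slot2))  # False
```

So the ranges logged for slot 1 and slot 2 do overlap: both start at f0000000, and the 1 MB range sits at the start of the 128 MB one. The log does not say which request produced the "overlaps an existing MTRR slot" message, so this sketch does not try to attribute it.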

03/25/08 10:40:50 changed by korli

I would need additional information, if possible: on Linux, you should find something about "BIOS-provided physical RAM map" in /var/log/messages, especially lines beginning with "BIOS-e820". Please also provide the output of "cat /proc/mtrr" on Linux. Thanks.

03/25/08 13:33:06 changed by korli

Hmm, I didn't notice the two graphics cards; I don't know if this is a problem.
I also noticed that the second slot (slot 1) seems to be allocated by the function frame_buffer_console_init_post_modules() in src/system/kernel/debug/frame_buffer_console.cpp.
How should this case be handled?

(follow-up: ↓ 8 ) 03/25/08 13:41:26 changed by jopadan

Here it is, without the Voodoo MTRR (I think). Keep in mind that this is x86_64:

reg00: base=0x00000000 ( 0MB), size=1024MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size= 128MB: write-combining, count=1

BIOS-provided physical RAM map:

BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
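
To compare the two listings above mechanically, a small parser is enough. This is a rough sketch: the regular expressions are written only against the line shapes pasted in this ticket, and the function names `parse_mtrr` and `parse_e820` are made up for illustration. It checks whether the write-back MTRR covers all RAM that the BIOS reports as usable.

```python
import re

MB = 1 << 20

# Patterns matched against the line formats pasted in this ticket only.
MTRR_RE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)MB: ([\w-]+), count=\d+"
)
E820_RE = re.compile(r"BIOS-e820: ([0-9a-fA-F]+) - ([0-9a-fA-F]+) \((.+)\)")


def parse_mtrr(text):
    """Return (base, size_in_bytes, type) for each /proc/mtrr line."""
    return [(int(b, 16), int(s) * MB, t) for _, b, s, t in MTRR_RE.findall(text)]


def parse_e820(text):
    """Return (start, end, kind) for each BIOS-e820 line; end is exclusive."""
    return [(int(a, 16), int(b, 16), k) for a, b, k in E820_RE.findall(text)]


mtrr_text = """\
reg00: base=0x00000000 ( 0MB), size=1024MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size= 128MB: write-combining, count=1
"""

e820_text = """\
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
"""

mtrrs = parse_mtrr(mtrr_text)
usable_top = max(end for _, end, kind in parse_e820(e820_text) if kind == "usable")

# Does some write-back range start at 0 and reach past the last usable byte?
covered = any(
    base == 0 and base + size >= usable_top
    for base, size, kind in mtrrs
    if kind == "write-back"
)
print(hex(usable_top), covered)
```

On this data the usable RAM ends at 0x3fff0000 and the 1024 MB write-back range covers it; the write-combining range at 0xe0000000 lies well above RAM.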

03/25/08 14:28:55 changed by rudolfc

Hi,

From the pictures it's clear that the acceleration engine crashes. After a few timeouts the driver simply drops all accelerated drawing commands, which is why you see what you see here, rather than the system hanging completely (gfx-wise).

Maybe try:
PCI mode versus AGP mode (nvidia.settings or disable AGP busmanager).
If PCI mode works then the driver/card might dislike PCI->AGP switching after it was engaged.
Fix: hmmm, don't know at this time. Is the splash icons screen being drawn accelerated? Maybe not doing that would fix it, but you could call that a workaround. AFAIK it's impossible (== not known) to do a full software-initiated hard reset of the cards: sometimes, if the acc engine hangs, a reboot is necessary to solve that.


MTRR is indeed used in the driver as well. A temporary recompile of the kernel driver with MTRR support disabled could be tried (it sits in multiple places!).
If it works without MTRR, then the MTRR change is probably the problem.

Regards,

Rudolf.

(in reply to: ↑ 6 ; follow-up: ↓ 11 ) 03/25/08 15:10:20 changed by korli

Replying to jopadan:

Here it is, without the Voodoo MTRR (I think). Keep in mind that this is x86_64:

Any chance of getting a serial log or syslog?

03/25/08 15:57:11 changed by korli

Could you check with r24582?

03/25/08 16:05:27 changed by jonas.kirilla

/proc/mtrr

reg00: base=0x00000000 (   0MB), size= 512MB: write-back, count=1
reg01: base=0xf8000000 (3968MB), size=  64MB: write-combining, count=1

/var/log/messages

 ...
Mar 25 22:02:23 kirilla kernel: [    0.000000] BIOS-provided physical RAM map:
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 0000000000100000 - 000000001ffec000 (usable)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 000000001ffec000 - 000000001ffef000 (ACPI data)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 000000001ffef000 - 000000001ffff000 (reserved)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 000000001ffff000 - 0000000020000000 (ACPI NVS)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
Mar 25 22:02:23 kirilla kernel: [    0.000000]  BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
Mar 25 22:02:23 kirilla kernel: [    0.000000] 0MB HIGHMEM available.
Mar 25 22:02:23 kirilla kernel: [    0.000000] 511MB LOWMEM available.
 ...
Mar 25 22:02:23 kirilla kernel: [   13.550042] Console: colour VGA+ 80x25
 ...
Mar 25 22:02:23 kirilla kernel: [   30.342591] Linux agpgart interface v0.102 (c) Dave Jones
Mar 25 22:02:23 kirilla kernel: [   30.355408] agpgart: Detected an Intel 845G Chipset.
Mar 25 22:02:23 kirilla kernel: [   30.359285] agpgart: AGP aperture is 64M @ 0xf8000000
 ...
Mar 25 22:02:23 kirilla kernel: [   32.062146] NVRM: loading NVIDIA UNIX x86 Kernel Module  100.14.19  Wed Sep 12 14:12:24 PDT 2007
 ...
Mar 25 22:02:27 kirilla kernel: [   41.341550] agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
Mar 25 22:02:27 kirilla kernel: [   41.341578] agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
Mar 25 22:02:27 kirilla kernel: [   41.341602] agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
 ...

(BTW, just to be extra clear: jonas.kirilla != jopadan. Two separate sets of hardware.)
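
The Linux write-combining entry above can be lined up against the AGP aperture line and against the MTRR that Haiku allocated on the same machine. This is only an arithmetic sketch: the regular expressions are ad hoc, and the Haiku values are copied from the "allocate MTRR slot 2" line of the serial output posted earlier in this ticket.

```python
import re

MB = 1 << 20

# Lines copied from the Linux output in this comment.
aperture_line = "agpgart: AGP aperture is 64M @ 0xf8000000"
mtrr_line = "reg01: base=0xf8000000 (3968MB), size=  64MB: write-combining, count=1"

m = re.search(r"AGP aperture is (\d+)M @ 0x([0-9a-fA-F]+)", aperture_line)
aperture = (int(m.group(2), 16), int(m.group(1)) * MB)  # (base, size)

m = re.search(r"base=0x([0-9a-fA-F]+).*size=\s*(\d+)MB", mtrr_line)
linux_wc = (int(m.group(1), 16), int(m.group(2)) * MB)  # (base, size)

# From the Haiku serial log: slot 2, base = f0000000, length = 8000000.
haiku_slot2 = (0xF0000000, 0x08000000)
haiku_end = haiku_slot2[0] + haiku_slot2[1]  # exclusive end address

print(aperture == linux_wc)          # Linux WC entry is exactly the aperture
print(hex(haiku_end), hex(aperture[0]))
```

Under Linux the write-combining MTRR is exactly the 64 MB AGP aperture at 0xf8000000, while the range Haiku marked (0xf0000000, 128 MB) ends at the address where that aperture begins. Whether this difference matters for the garbled output is not established here.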

(in reply to: ↑ 8 ) 03/26/08 07:57:55 changed by jopadan

Replying to korli:

Replying to jopadan:

Here it is, without the Voodoo MTRR (I think). Keep in mind that this is x86_64:


Any chance of getting a serial log or syslog?

I've attached a syslog but I don't see anything MTRR related in it.
Also, the Tracker seems to crash every time I try to shut down.
Since I am not yet able to access any ext3 partitions, I have been unable to install the developer tools. Maybe there is BFS write support in Linux.

03/26/08 07:59:19 changed by jopadan

  • attachment syslog added.

(in reply to: ↑ description ) 03/28/08 17:23:06 changed by peat

Same here (GeForce 4 Mx 440).
Safemode GFX Setting (1280 x 1024 x 32 with Vesa Driver) works fine.

03/29/08 14:29:50 changed by aliensoldier

I have that bug also, with a GeForce 2 MX400 I think (0x0110).

The other day I got that bug in R5 too, but only once. It was a once-in-a-lifetime kind of occurrence, so I'm not able to reproduce it, but I thought it might help point to part of the solution.

I was starting to play a video in VLC when I noticed I had picked the wrong one, so I stopped it just as the display was starting. From then on I had the same behavior as with the Haiku startup bug, and needed a reboot.

03/30/08 16:10:45 changed by jonas.kirilla

Screen still garbled for me with r24682. No visible change. I tried setting force_pci true in nvidia.settings in an earlier revision (prior to r24679) with no improvement.

04/23/08 03:36:44 changed by rudolfc

Hi there!

Is this still a problem today? Or was it solved already?

Anyhow: I tried a Haiku build from around 14 April on a system of mine and looked at the driver a bit. Indeed, it doesn't work there in 32-bit color (this colorspace is accelerated by the driver). 16, 15 and 8 bit work (they are not accelerated, due to app_server).

The acceleration engine doesn't work, BUT it does NOT crash. The hooks are called with (AFAIK) valid lists of drawing commands.

I don't understand yet what's wrong; still investigating. (It might take a week or two since I'm on holiday next week.)

Bye!

Rudolf.

04/23/08 03:57:04 changed by stippi

The MTRR regions might still be a problem. IIRC, Korli disabled the overlap check, but it might still point to a problem somewhere with regard to MTRR setup. Nice to hear you want to have a look, Rudolf! Much appreciated!