Opened 3 years ago

Last modified 17 months ago

#17664 new bug

radeon_hd "Cedar": app_server crashes writing to 0MB tiny framebuffer

Reported by: thebuck
Owned by: kallisti5
Priority: high
Milestone: Unscheduled
Component: Drivers/Graphics/radeon_hd
Version: R1/Development
Keywords: Cedar
Cc:
Blocked By:
Blocking:
Platform: x86-64

Description (last modified by thebuck)

Booting with defaults and a Radeon HD 5450 "Cedar". After all icons have lit up, I get a colorful, horizontally bricked, garbled display (hard to work with, but no crash).

With VESA it works.

Monitor info (Model, 1600x1200 at DVI) from syslog seems adequate.

NOTE: Originally this bug described an app_server crash due to accessing a too tiny framebuffer. That part has been fixed in hrev56333.

Please update title, it seems I cannot.

Attachments (6)

previous_syslog.redacted.txt (87.5 KB) - added by thebuck 3 years ago.
    Syslog hrev55954 x86_64
radeon_hd_bios_1002_68e0_0.bin (128.0 KB) - added by thebuck 3 years ago.
    AtomBIOS dump Radeon HD 5450 Cedar
dev-misc-mem-0xc0000.bin (128.0 KB) - added by thebuck 3 years ago.
    dd if=/dev/misc/mem of=/usbstick/dev-misc-mem-0xc0000.bin bs=64K skip=12 count=2
syslog.hrev56332plus5526set2.redacted.txt (233.8 KB) - added by thebuck 2 years ago.
    With https://haiku.movingborders.es/testbuild/I52b9691cd6a04b58b70e905bc29e803f06936789/2/hrev56332/x86_64/haiku-nightly-anyboot.iso the driver works, but app_server paints with artifacts.
screenshot3.png (371.3 KB) - added by thebuck 2 years ago.
    Each redraw has a different noise seed (this would be a good redraw debugging technique if it were less intense!). The turbulence somehow snaps to a screen-fixed grid of 32x1 pixel cells. 50% grey is always drawn correctly. I changed the appearance settings. Various resolutions and color depths work, but show the same artifacts.
syslog.hrev56334.redacted.txt (122.5 KB) - added by thebuck 2 years ago.
    No change in behaviour.

Change History (29)

by thebuck, 3 years ago

Attachment: previous_syslog.redacted.txt added

Syslog hrev55954 x86_64

by thebuck, 3 years ago

Attachment: radeon_hd_bios_1002_68e0_0.bin added

AtomBIOS dump Radeon HD 5450 Cedar

by thebuck, 3 years ago

Attachment: dev-misc-mem-0xc0000.bin added

dd if=/dev/misc/mem of=/usbstick/dev-misc-mem-0xc0000.bin bs=64K skip=12 count=2

comment:1 by thebuck, 3 years ago

Tried to retrieve second AtomBIOS half manually:
dd if=/dev/misc/mem of=/usbstick/dev-misc-mem-0xfebc0000.bin bs=64K skip=65212 count=1
Hangs machine, needed to hard reset.
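
The skip and bs operands in both dd invocations follow from the physical address being read: dd starts copying at skip × bs bytes and copies count × bs bytes. A minimal sketch of that arithmetic; the helper names are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only (not part of dd or Haiku).
constexpr std::uint64_t kBlockSize = 64 * 1024;  // dd bs=64K

// Byte offset at which dd starts reading: skip * bs.
constexpr std::uint64_t
dd_start_offset(std::uint64_t skipBlocks)
{
    return skipBlocks * kBlockSize;
}

// Number of bytes dd copies: count * bs.
constexpr std::uint64_t
dd_length(std::uint64_t countBlocks)
{
    return countBlocks * kBlockSize;
}
```

With these, dd_start_offset(12) is 0xc0000 and dd_length(2) is 128 KiB, which matches the size of the attached dump, while dd_start_offset(65212) is 0xfebc0000, the address used in the second file name.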

comment:2 by thebuck, 3 years ago

The attached binaries differ slightly (first 64KiB).

comment:4 by korli, 2 years ago

This can't really help in this case, as reg 1 is empty. But the patch looks good.

base reg 0: host d0000000, pci d0000000, size 10000000, flags 0c
base reg 1: host 00000000, pci 00000000, size 00000000, flags 00

comment:5 by kallisti5, 2 years ago

For a more complete picture.

PCI_BAR_FB is 0, so you're right, it's the first pair (32-bit + 64-bit).

PCI:   base reg 0: host d0000000, pci d0000000, size 10000000, flags 0c
PCI:   base reg 1: host 00000000, pci 00000000, size 00000000, flags 00
PCI:   base reg 2: host febc0000, pci febc0000, size 00020000, flags 04
PCI:   base reg 3: host 00000000, pci 00000000, size 00000000, flags 00
PCI:   base reg 4: host 0000e000, pci 0000e000, size 00000100, flags 01
PCI:   base reg 5: host 00000000, pci 00000000, size 00000000, flags 00

Looks like base reg 0 is indeed 64-bit (0x0c being 0x04 plus 0x08). I'm actually not 100% sure how we even got to 0 given the current code.
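
For reference, the flags column can be decoded from the low bits of a PCI base address register: bit 0 marks I/O space, bits 2:1 give the memory type (10b meaning 64-bit), and bit 3 marks prefetchable memory. The sketch below assumes that standard layout; the constant, struct and function names are illustrative and are not taken from Haiku's headers.

```cpp
#include <cstdint>

// Low bits of a PCI base address register (standard PCI layout).
// Names are illustrative, not Haiku's.
constexpr std::uint8_t kBarIoSpace = 0x01;       // bit 0: I/O space
constexpr std::uint8_t kBarTypeMask = 0x06;      // bits 2:1: memory type
constexpr std::uint8_t kBarType64 = 0x04;        // type 10b: 64-bit
constexpr std::uint8_t kBarPrefetchable = 0x08;  // bit 3: prefetchable

struct BarFlags {
    bool io;
    bool is64Bit;
    bool prefetchable;
};

// Decode the flags byte as printed in the "base reg" syslog lines.
inline BarFlags
decode_bar_flags(std::uint8_t flags)
{
    const bool io = (flags & kBarIoSpace) != 0;
    return BarFlags{
        io,
        !io && (flags & kBarTypeMask) == kBarType64,
        !io && (flags & kBarPrefetchable) != 0
    };
}
```

Under that reading, flags 0c is a 64-bit prefetchable memory BAR, flags 04 is a 64-bit non-prefetchable memory BAR, and flags 01 is an I/O BAR. A 64-bit BAR uses the following register for its upper 32 bits, which is consistent with base reg 1 and base reg 3 showing up as empty.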

I could see us misreading the BAR size due to this bug and shrinking a reasonable framebuffer size to 0 because of it. However, the logs never show the trace message "shrinking frame buffer to PCI bar." that would indicate this. One of the two "adjustment" conditions should be firing, but neither is present in the logs.

Either way, give https://haiku.movingborders.es/testbuild/I510dba971ca5f1ed8d2b96094cc2e6b367e95dc3/2/hrev56326/x86_64/haiku-nightly-anyboot.iso a try to see if it helps address the issue for you @thebuck

comment:6 by korli, 2 years ago

info.shared_info->graphics_memory_size might be between 0 and 1024, printed as 0MB.
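
To illustrate the point: assuming the size is stored in KiB and the log line prints whole megabytes using integer division, any value below 1024 shows up as 0MB. The function name below is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: whole megabytes as an integer-division log line
// would show them, assuming the stored size is in KiB.
constexpr std::uint32_t
size_in_mb_for_log(std::uint32_t sizeKiB)
{
    return sizeKiB / 1024;  // truncates toward zero
}
```

So size_in_mb_for_log(1) and size_in_mb_for_log(1023) both give 0, and only from 1024 KiB upwards does the log show 1MB or more.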

comment:7 by korli, 2 years ago

This means the computed size is probably 1, because Cedar IGP sizes are assumed to be in bytes. But they are only in bytes for Palm, Sumo and Sumo2: https://github.com/torvalds/linux/blob/b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1/drivers/gpu/drm/radeon/evergreen.c#L3753

More correct would be:

diff --git a/src/add-ons/kernel/drivers/graphics/radeon_hd/radeon_hd.cpp b/src/add-ons/kernel/drivers/graphics/radeon_hd/radeon_hd.cpp
index 14c1090722..c1e2a11703 100644
--- a/src/add-ons/kernel/drivers/graphics/radeon_hd/radeon_hd.cpp
+++ b/src/add-ons/kernel/drivers/graphics/radeon_hd/radeon_hd.cpp
@@ -706,15 +706,19 @@ radeon_hd_init(radeon_info &info)
 
        // *** Populate frame buffer information
        if (info.chipsetID >= RADEON_CEDAR) {
-               if ((info.chipsetFlags & CHIP_APU) != 0
-                       || (info.chipsetFlags & CHIP_IGP) != 0) {
-                       // Evergreen+ fusion in bytes
-                       info.shared_info->graphics_memory_size
-                               = read32(info.registers + CONFIG_MEMSIZE) / 1024;
-               } else {
-                       // Evergreen+ has memory stored in MB
-                       info.shared_info->graphics_memory_size
-                               = read32(info.registers + CONFIG_MEMSIZE) * 1024;
+               switch (info.chipsetID) {
+                       default:
+                               // Evergreen+ has memory stored in MB
+                               info.shared_info->graphics_memory_size
+                                       = read32(info.registers + CONFIG_MEMSIZE) * 1024;
+                               break;
+                       case RADEON_PALM:
+                       case RADEON_SUMO:
+                       case RADEON_SUMO2:
+                               // Fusion in bytes
+                               info.shared_info->graphics_memory_size
+                                       = read32(info.registers + CONFIG_MEMSIZE) / 1024;
+                               break;
                }
        } else if (info.chipsetID >= RADEON_R600) {
                // R600-R700 has memory stored in bytes
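
The effect of the patch can be sketched in isolation. The example assumes that CONFIG_MEMSIZE reads 1024 (MB) on this Cedar card, in line with the "1024MB video ram" reported in later logs, and that the old flag check took the "bytes" branch for it. The enum and function names are stand-ins, not the driver's own.

```cpp
#include <cstdint>

// Illustrative stand-in for the driver's chipset IDs.
enum class Chipset { Cedar, Palm, Sumo, Sumo2 };

// Old logic: an APU/IGP flag means the register is treated as bytes.
// Result is in KiB.
constexpr std::uint32_t
old_size_kib(std::uint32_t configMemsize, bool apuOrIgp)
{
    return apuOrIgp ? configMemsize / 1024 : configMemsize * 1024;
}

// Patched logic: only Palm, Sumo and Sumo2 report bytes; everything
// else from Cedar onwards reports MB. Result is in KiB.
constexpr std::uint32_t
new_size_kib(std::uint32_t configMemsize, Chipset chipset)
{
    return (chipset == Chipset::Palm || chipset == Chipset::Sumo
            || chipset == Chipset::Sumo2)
        ? configMemsize / 1024 : configMemsize * 1024;
}
```

Under these assumptions old_size_kib(1024, true) yields 1 KiB, which is the "0MB" framebuffer, while new_size_kib(1024, Chipset::Cedar) yields 1048576 KiB, i.e. 1024 MB.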

comment:8 by kallisti5, 2 years ago

Groan. Yeah, that makes sense for the core issue. The whole Palm and Sumo chipset era of these cards had so many patches and workarounds.

Early on I tied a lot of logic to the DCE version and chipset flags over the individual card generations since the chipset names + generations were a lot squishier.

Over time, though, the DCE version stopped being as relevant (it is completely omitted on recent cards; anything over 13 is just made up by us at this point). Radeon is a lot better than Intel in terms of tech debt in silicon design, but the ATI marketing team was historically pretty bad about injecting older-generation cards into the middle of a next-gen card series with confusing naming.

comment:11 by thebuck, 2 years ago

KERN: radeon_hd: radeon_hd_init: shrinking frame buffer to PCI bar...
KERN: radeon_hd: radeon_hd_init: mapping a frame buffer of 256MB out of 1024MB video ram
KERN: radeon_hd: framebuffer paddr: 0xd0000000
KERN: set MTRRs to:
KERN:   mtrr:  0: base: 0xbffb0000, size:    0x10000, type: 0
KERN:   mtrr:  1: base: 0xbffc0000, size:    0x40000, type: 0
KERN:   mtrr:  2: base: 0xc0000000, size: 0x40000000, type: 0
KERN: radeon_hd: frambuffer vaddr: 0xffffffff9d000000
KERN: radeon_hd: frambuffer size: 0x1000000

EDIT: include the shrinking message like in comment:17

by thebuck, 2 years ago

Attachment: screenshot3.png added

Each redraw has a different noise seed (this would be a good redraw debugging technique if it were less intense!). The turbulence somehow snaps to a screen-fixed grid of 32x1 pixel cells. 50% grey is always drawn correctly. I changed the appearance settings. Various resolutions and color depths work, but show the same artifacts.

comment:12 by thebuck, 2 years ago

Color channels seem mapped correctly.

comment:13 by thebuck, 2 years ago

Hm, 0x1000000 = 16MiB, not 256MiB. The former is enough for a strided framebuffer.
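
The sizes involved can be checked with a short calculation. This is only a sketch that assumes a linear framebuffer with no row padding; the function name is hypothetical.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = 1024 * 1024;

// Bytes needed for one linear framebuffer with no row padding.
constexpr std::uint64_t
framebuffer_bytes(std::uint32_t width, std::uint32_t height,
    std::uint32_t bitsPerPixel)
{
    return std::uint64_t(width) * height * (bitsPerPixel / 8);
}
```

For 1600x1200 at 32bpp this gives 7,680,000 bytes, roughly 7.3 MiB, so it fits comfortably in the 0x1000000 (16 MiB) mapping; 256 MiB would be 0x10000000.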

comment:14 by thebuck, 2 years ago

At 24bpp, the picture is squeezed into the left 3/4 of the screen, with colors away from 50% grey completely garbled. The right 1/4 of the screen contains the old 32bpp picture.

In this case the screenshooter utility creates a non-squeezed 1600x1200 image without the 32x1 bricking, but it is noisy as well.

comment:15 by thebuck, 2 years ago

At more than 60Hz, the monitor complains that 60Hz should be chosen, but it works.

comment:16 by korli, 2 years ago

Please attach a syslog after hrev56334.

by thebuck, 2 years ago

Attachment: syslog.hrev56334.redacted.txt added

No change in behaviour.

comment:17 by thebuck, 2 years ago

KERN: radeon_hd: radeon_hd_init: shrinking frame buffer to PCI bar...
KERN: radeon_hd: radeon_hd_init: mapping a frame buffer of 256MB out of 1024MB video ram
KERN: radeon_hd: framebuffer paddr: 0xd0000000
KERN: set MTRRs to:
KERN:   mtrr:  0: base: 0xbffb0000, size:    0x10000, type: 0
KERN:   mtrr:  1: base: 0xbffc0000, size:    0x40000, type: 0
KERN:   mtrr:  2: base: 0xc0000000, size: 0x40000000, type: 0
KERN: radeon_hd: frambuffer vaddr: 0xffffffff9d800000
KERN: radeon_hd: frambuffer size: 0x10000000

comment:18 by thebuck, 2 years ago

I tried KDL: It also has the painting artifacts (since hrev56332plus5526set2).

comment:19 by kallisti5, 2 years ago

Hm. The framebuffer allocation issue definitely seems solved.

Does this resolution sound reasonable for the laptop?

KERN: radeon_hd: display_crtc_fb_set: fb: 1600x1200 (32 bpp)

I noted a todo for IGP connector probing.

I have a few radeon_hd fixes which could help this one in Gerrit... let me review them and see if I can find a relevant one.

in reply to:  19 comment:20 by thebuck, 2 years ago

> Hm. The framebuffer allocation issue definitely seems solved.

Yes, but why waste 256MiB of VRAM on 1600x1200x32bpp ≈ 7MiB? I cannot imagine a stride that big.

> Does this resolution sound reasonable for the laptop?

Not a laptop. Yes, it is the native resolution of the attached monitor. The display is stable over time (there is no refresh-dependent noise or flicker).

Summary:

  • Only the writes to the framebuffer are mangled somehow; mangling involves:
    • values from horizontal neighbor pixels up to 32 pixels away (in the 32bpp case) (→ brick effect), and
    • either the old value of the pixel (maybe also neighbors), or
    • random/uninitialized values (→ areas get different noise each time they are repainted)
  • If you mentally filter out the above mangling, color channels are not mixed up. It does not look like for example RGB being misread as BGR.
  • 50%-grey always looks original, not even noisy.
  • The Haiku screenshooter correctly records what I see onscreen.
    • Only at 24bpp the screenshooter is lying, reality is described in comment:14 (I guess the card is still configured for 32bpp while writes to the FB are using 24bpp format, which would have a stride 1600 bytes bigger, to retain the right 1/4 old 32bpp picture).
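
The 3/4 figure in the 24bpp case is consistent with simple row arithmetic, sketched here under the assumption that rows are unpadded and that scanout stays at 32bpp while pixels are written as 24bpp. The function names are hypothetical.

```cpp
#include <cstdint>

// Bytes in one row of pixels, with no padding.
constexpr std::uint32_t
row_bytes(std::uint32_t width, std::uint32_t bitsPerPixel)
{
    return width * (bitsPerPixel / 8);
}

// How many scanout pixels one written row covers when the scanout
// format differs from the write format.
constexpr std::uint32_t
covered_pixels(std::uint32_t width, std::uint32_t writeBpp,
    std::uint32_t scanoutBpp)
{
    return row_bytes(width, writeBpp) / (scanoutBpp / 8);
}
```

A 1600-pixel row takes 6400 bytes at 32bpp and 4800 bytes at 24bpp, a difference of 1600 bytes. Scanned out as 32bpp, the 4800 written bytes cover covered_pixels(1600, 24, 32) = 1200 pixels, which is the left 3/4 of the row, leaving the remaining 400 pixels untouched.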

comment:21 by thebuck, 2 years ago

Description: modified (diff)

Still there with R1beta4tc0.

comment:22 by thebuck, 2 years ago

Description: modified (diff)

comment:23 by Carl_Miller, 17 months ago

Apparently #18470 (which I reported) is a variant of this.
