Opened 15 years ago
Closed 7 weeks ago
#5353 closed enhancement (fixed)
Improve MTRR setup
Reported by: | PieterPanman | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | R1.1 |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | mtrr | Cc: | |
Blocked By: | Blocking: | ||
Platform: | All |
Description
Just so that this isn't forgotten: on my laptop (using VESA), the current MTRR setup cannot properly cover the framebuffer. Some people will probably be stuck with VESA, and performance differs hugely depending on whether the MTRR setup covers the framebuffer. A better way of achieving this is needed.
See bonefish's comment in #5085 for more information. Right now I just use the workaround from that ticket, which works well.
Attachments (1)
Change History (11)
comment:1 by , 15 years ago
Type: | bug → enhancement |
---|
comment:2 by , 15 years ago
Blocking: | 5383 added |
---|
(In #5383) Here's the interesting part from the latest syslog:
KERN: add_memory_type_range(-1, 0x0, 0x1ffec000, 6)
KERN: set MTRRs to:
KERN: mtrr: 0: base: 0x0, size: 0x20000000, type: 6
KERN: mtrr: 1: base: 0x1fff0000, size: 0x10000, type: 0
KERN: mtrr: 2: base: 0x1ffec000, size: 0x4000, type: 0
That's the RAM range (write-back) at 0 - 0x1ffec000.
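The subtractive setup in this log can be checked with a small model. The sketch below is not Haiku's code. It assumes the overlap rules Intel documents for variable-range MTRRs (UC wins over any other type, WT wins over WB, other mixes are undefined), and the names `mtrr_entry`, `kLoggedSetup`, and `effective_type` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* x86 memory type encodings, as they appear in the "type:" field of the log. */
enum { MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6 };

/* Hypothetical record for one variable-range MTRR. */
typedef struct {
	uint64_t base;
	uint64_t size;
	int type;
} mtrr_entry;

/* The three registers listed in the syslog excerpt. */
static const mtrr_entry kLoggedSetup[] = {
	{ 0x0,        0x20000000, MT_WB },
	{ 0x1fff0000, 0x10000,    MT_UC },
	{ 0x1ffec000, 0x4000,     MT_UC },
};

/*
 * Returns the memory type in effect at addr, default_type if no register
 * matches, or -1 if the matching registers overlap in an undefined way.
 */
static int
effective_type(const mtrr_entry *mtrrs, size_t count, uint64_t addr,
	int default_type)
{
	unsigned seen = 0;	/* bit t is set if a matching register has type t */

	for (size_t i = 0; i < count; i++) {
		if (addr >= mtrrs[i].base && addr - mtrrs[i].base < mtrrs[i].size)
			seen |= 1u << mtrrs[i].type;
	}

	if (seen == 0)
		return default_type;
	if (seen & (1u << MT_UC))
		return MT_UC;	/* UC overrides every other matching type */
	if (seen == ((1u << MT_WT) | (1u << MT_WB)))
		return MT_WT;	/* WT overrides WB */
	for (int t = MT_WC; t <= MT_WB; t++) {
		if (seen == (1u << t))
			return t;	/* all matching registers agree on one type */
	}
	return -1;
}
```

With the logged registers, addresses below 0x1ffec000 resolve to type 6 (write-back), while 0x1ffec000 up to 0x1fffffff resolve to type 0 (uncacheable), because the two small UC registers carve the tail out of the 512 MB write-back register.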
KERN: add_memory_type_range(75, 0xf0000000, 0x300000, 1)
KERN: set MTRRs to:
KERN: mtrr: 0: base: 0x0, size: 0x20000000, type: 6
KERN: mtrr: 1: base: 0x1fff0000, size: 0x10000, type: 0
KERN: mtrr: 2: base: 0x1ffec000, size: 0x4000, type: 0
KERN: mtrr: 3: base: 0xf0000000, size: 0x200000, type: 1
KERN: mtrr: 4: base: 0xf0200000, size: 0x100000, type: 1
That's a 3 MB write-combining memory range at 0xf0000000.
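The 3 MB range needs two registers because each MTRR covers a power-of-two sized, naturally aligned block. A minimal sketch of a purely additive split follows. It is a plain greedy decomposition for illustration, not the algorithm the kernel uses, and `mtrr_block`, `largest_pow2_le`, and `split_range` are hypothetical names. It assumes base and size are already page-aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one power-of-two block. */
typedef struct {
	uint64_t base;
	uint64_t size;
} mtrr_block;

/* Largest power of two that is <= value (value must be >= 1). */
static uint64_t
largest_pow2_le(uint64_t value)
{
	uint64_t p = 1;
	while (p <= value / 2)
		p *= 2;
	return p;
}

/*
 * Splits [base, base + size) into naturally aligned power-of-two blocks.
 * Stores at most max_out blocks in out and returns how many blocks the
 * range needs in total, which may exceed max_out.
 */
static size_t
split_range(uint64_t base, uint64_t size, mtrr_block *out, size_t max_out)
{
	size_t count = 0;

	while (size > 0) {
		uint64_t block = largest_pow2_le(size);
		/* Lowest set bit of base: the largest block base is aligned to. */
		uint64_t alignment = base & (~base + 1);
		if (alignment != 0 && alignment < block)
			block = alignment;

		if (count < max_out) {
			out[count].base = base;
			out[count].size = block;
		}
		count++;
		base += block;
		size -= block;
	}
	return count;
}
```

For (0xf0000000, 0x300000) this yields the same two blocks as registers 3 and 4 in the log. For the RAM range (0x0, 0x1ffec000) the same additive split would need 14 blocks, which shows why a subtractive setup (one large write-back block minus two small uncacheable ones) is attractive.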
KERN: add_memory_type_range(1279, 0x4d00000, 0x100000, 1)
KERN: add_memory_type_range(1279, 0x4d00000, 0x100000, 1): Memory range intersects with existing one (0x0, 0x1ffec000, 6).
A one MB write-combining memory range at 0x4d00000, which intersects with the RAM range, and is therefore ignored.
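The rejection comes down to a half-open interval overlap test. A minimal sketch, with a hypothetical helper name and assuming that base + size does not overflow 64 bits:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [baseA, baseA + sizeA) and [baseB, baseB + sizeB) share any byte. */
static bool
ranges_intersect(uint64_t baseA, uint64_t sizeA, uint64_t baseB,
	uint64_t sizeB)
{
	return baseA < baseB + sizeB && baseB < baseA + sizeA;
}
```

The logged request (0x4d00000, 0x100000) intersects the RAM range (0x0, 0x1ffec000), whereas the earlier write-combining range at 0xf0000000 does not.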
The previous algorithm would simply have used a free MTRR, which would have worked in this case. In general that is not an option when a subtractive MTRR setup is used, though. Theoretically the existing range could be split and the MTRR setup recomputed, but MTRR setups are often so complex that the number of MTRRs is barely sufficient (or not even that), so I don't think this is a reasonable approach. As suggested in #5353, we should probably not even try to use MTRRs, but rather define memory types via the respective PTE bits.
comment:3 by , 15 years ago
Blocking: | 5383 removed |
---|
(In #5383) Replying to rudolfc:
The driver's buffer is used for writing by the CPU only. The GPU reads from this buffer. Using MTRR-WC gives a big acceleration performance increase compared to write-through or uncached (especially in accelerated 3D, I benchmarked this once).
I was a bit surprised that WT is slower than WC, since the specification for WT says that "write-combining is allowed". Setting the frame buffer to WT instead of WC makes the graphics feel tremendously slower, so apparently the "is allowed" part doesn't mean it's actually done.
Anyway, the problem should be fixed for P6 and later in hrev35515, since overlapping ranges are now handled correctly. Please close the ticket, if you can verify this.
comment:4 by , 15 years ago
Pieter, please give hrev35515 a try. Regardless of whether it improves the situation on your laptop, I would appreciate a new syslog from this revision. Thanks!
follow-up: 7 comment:6 by , 15 years ago
I reverted everything back to normal and updated to hrev35519. It now performs smoothly, even playing videos at full screen. Syslog attached. Mouse acts a little jerky, possibly related to the ps/2 outputs in the syslog? Bugworthy?
comment:7 by , 15 years ago
Replying to PieterPanman:
I reverted everything back to normal and updated to hrev35519. It now performs smoothly, even playing videos at full screen. Syslog attached.
Thanks, looks nice!
Mouse acts a little jerky, possibly related to the ps/2 outputs in the syslog? Bugworthy?
Sure.
Leaving the ticket open as general enhancement ticket. The current algorithm is much better at using the MTRRs, but still it can run out of registers. Furthermore the Pentium doesn't have MTRRs (the memory type ranges are predefined by hardware), so not everything that should work does actually work yet. The method of using the PTE flags as proposed in 1 would work fine for Pentium III and later, since there PAT is available, which makes all memory types settable. For Pentium II/Pro some mix of MTRRs and fallback to PTE flags could be used. For Pentium the PTE flags should be used, if a type stricter than predefined by the hardware is required.
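To illustrate the PAT approach: on processors with PAT, the PAT, PCD and PWT bits of a page table entry form a 3-bit index into an 8-entry table held in the IA32_PAT MSR. The sketch below only models that lookup for 4 KB page table entries. The bit positions and the power-on default table follow Intel's documentation as far as I know; `kExamplePat`, which puts write-combining into entry 4, is a made-up choice for illustration and not necessarily what any kernel programs.

```c
#include <stdint.h>

/* Memory type encodings used in PAT entries. */
enum {
	PAT_UC = 0, PAT_WC = 1, PAT_WT = 4, PAT_WP = 5, PAT_WB = 6,
	PAT_UC_MINUS = 7
};

/* Memory type bits in a 4 KB page table entry. */
#define PTE_PWT (1u << 3)
#define PTE_PCD (1u << 4)
#define PTE_PAT (1u << 7)

/* Power-on default contents of the PAT, entries 0 to 7. */
static const uint8_t kDefaultPat[8] = {
	PAT_WB, PAT_WT, PAT_UC_MINUS, PAT_UC,
	PAT_WB, PAT_WT, PAT_UC_MINUS, PAT_UC
};

/* Hypothetical table that replaces entry 4 with write-combining. */
static const uint8_t kExamplePat[8] = {
	PAT_WB, PAT_WT, PAT_UC_MINUS, PAT_UC,
	PAT_WC, PAT_WT, PAT_UC_MINUS, PAT_UC
};

/* Index into the PAT selected by a PTE: PAT is bit 2, PCD bit 1, PWT bit 0. */
static unsigned
pat_index(uint64_t pte)
{
	return ((pte & PTE_PAT) ? 4u : 0u)
		| ((pte & PTE_PCD) ? 2u : 0u)
		| ((pte & PTE_PWT) ? 1u : 0u);
}

/* Memory type that the given PAT table assigns to a PTE. */
static uint8_t
pat_type(const uint8_t pat[8], uint64_t pte)
{
	return pat[pat_index(pte)];
}
```

With the default table none of the eight entries is write-combining, so at least one entry has to be reprogrammed before a frame buffer mapping can request WC through its PTE flags.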
comment:8 by , 8 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:9 by , 4 years ago
Milestone: | R1 → R1.1 |
---|
comment:10 by , 7 weeks ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I think we can close this as fixed at this point. Since bonefish's last comment above, there were further changes to skip uncacheable ranges in MTRRs, which reduces pressure on them; and more recently PAT was implemented, making MTRRs obsolete on basically all hardware from the last two decades.
That's not only VESA related. Even for cards that have driver support the graphics memory needs to be marked "write-combining".
A relatively simple solution would be to mark the whole physical address space WB via one MTRR and use the PCD and PWT bits in the PTEs to define the actual memory type. Not sure if that has any disadvantages, though.
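A minimal sketch of what the PCD and PWT bits select when the covering MTRR is write-back and PAT is not in use. This reflects my reading of Intel's documentation for processors without PAT, so treat it as an assumption rather than a reference; the function name is made up.

```c
#include <stdint.h>

/* Memory type encodings, matching the "type:" values in the syslog. */
enum { MT_UC = 0, MT_WT = 4, MT_WB = 6 };

/* Cache control bits in a page table entry. */
#define PTE_PWT (1u << 3)
#define PTE_PCD (1u << 4)

/* Effective type of a page whose physical range is WB according to the MTRRs. */
static int
type_under_wb_mtrr(uint32_t pte)
{
	if (pte & PTE_PCD)
		return MT_UC;	/* cache disabled, regardless of PWT */
	if (pte & PTE_PWT)
		return MT_WT;	/* write-through */
	return MT_WB;		/* neither bit set: keep the MTRR type */
}
```

Under this model the two bits can only make a write-back range stricter (WT or UC). Write-combining is not among the reachable types, so a frame buffer would still need either a WC MTRR or PAT.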