Opened 16 years ago
Closed 15 years ago
#3772 closed bug (fixed)
Freeze on high memory load when not limiting available memory (reproduceable)
Reported by: | michael.weirauch | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System | Version: | R1/Development |
Keywords: | Cc: | marcusoverhagen, imker@… | |
Blocked By: | Blocking: | ||
Platform: | x86 |
Description
System freezes reproducibly when compiling the Haiku tree. No KDL. No KDL-enter with F12.
Environment: Haiku gcc2 and gcc4 any rev up until now (current hrev30177)
System: Thinkpad T500 (NK13AGE), C2D T9600@2.8Ghz, 4GB DDR3, 320GB SATA (WD Scorpio Black WD3200BEKT 16MB Cache), VESA
listdev: http://dev.haiku-os.org/attachment/ticket/3632/listdev-29784.txt
KDL-ints: http://dev.haiku-os.org/attachment/ticket/3632/kdl-ints-r29784-small.jpg (hda on int11 since some revs after that one)
Evaluated all of the below in different combinations/scenarios with hrev30177; the freeze was reproducible in each:
- gcc2 & gcc4
- new ata bus_manager
- KDEBUG_LEVEL_2 on every item in kernel_debug_config.h (small fix required for gcc4 compile in src/system/kernel/vm/vm_page.cpp)
- INTA-INTH configured in BIOS for distributing the devices on different IRQs instead of all on IRQ 11 as seen in kdl-ints shot from above (figure further below)
- INTA only on IRQ11 (INTB-INTH disabled) (ahci + uhci)
- INTA only on IRQ11 (INTB-INTH disabled) (ahci only) removed busses/usb/[e,o,u]hci
- same as above with all safe mode options enabled
- BIOS-settings tried: SMP disabled, disabled EM64T, Intel VT, IDE DMA
INT-lines/IRQs when configured in BIOS:
INT line | IRQ | Devices |
---|---|---|
INTA | 3 | ahci/uhci |
INTB | 4 | hda/firewire/uhci |
INTC | 5 | uhci |
INTD | 6 | ehci |
INTE | 7 | uhci/ipro1000 |
INTF | 9 | - |
INTG | 10 | uhci |
INTH | 11 | ehci |
Doing intense "time sync" during compilation gives real times averaging around 200ms. One extreme was 5047ms, which blocked AboutWindows uptime display and the console output of the compile IIRC. (At least I remember the uptime display stuttering occasionally before freeze-on-compile during the test scenarios.)
A perhaps related ticket: #3632
Getting more and more clueless on what things I might evaluate on my own. Thanks in advance for any pointers on how to (help) track this issue!
Michael
Attachments (1)
Change History (46)
comment:1 by , 16 years ago
comment:2 by , 16 years ago
Thermal/EIST: I would say that there are no thermal issues with this ThinkPad. It is probably the best cooling system I've heard/seen/felt so far. The system fan never gets noisy and thermals are - subjective impression - medium during longer compilation runs on GNU/Linux or gaming on Windows. Nothing near hot or anything near the thermals of my old Samsung X20, which allowed for cooking a pile of eggs on it.
In BIOS everything is set to "Max Performance". There seems to be no frequency management done by the BIOS itself. No signs of that during runs of Haiku, too. (No slower clock ticks) At least I've never seen something like that except the system actively (as in an EIST driver/mechanism) requesting frequency changes.
hrev30266; new-ata-bm; acpi enabled: No go. System froze after ~ 4m. Will do some more testing when I get back home today. Also will recheck the BIOS settings.
comment:3 by , 16 years ago
hrev30284; new-ata-bm; acpi
PMCPU: CPU Power Management (throttling on inactivity) PMPCI: PCI BUS Power Management
jam -qaj2 sessions:
- BIOS:
EIST,PMCPU,PMPCI- freeze 12m14s
- BIOS:
EIST, PMCPU, PMPCI- freeze 5m30s
- BIOS: EIST, PMCPU, PMPCI
- compile in 29m (finished after 30m30s uptime)
- leaving system idle after compile: freeze after 32m30s uptime
Perhaps the latest changes in bfs (hrev30221) or file_cache (hrev30276) do have some influence on the (greatly) improved uptimes. I wouldn't really say that enabling/disabling EIST, PMCPU or PMPCI - especially as when disabled, freezes appeared earlier - do have an influence. Perhaps just coincidence.
Happy to report some success, though!
comment:4 by , 16 years ago
hrev30284; new-ata-bm; acpi
blender scons sessions:
- BIOS:
EIST,PMCPU,PMPCI- freeze after 8m36s
- BIOS: EIST, PMCPU, PMPCI
- compile in 6m51
- leave idle: freeze after 40s
- BIOS: EIST, PMCPU, PMPCI
- compile in 6m53
- leave idle: freeze after 1m
Perhaps there is some kind of power management going on as the freeze after 40s/1m might indicate a cool down of the proc and the fan spinning down. (Not audible, though.) Haven't tested stressing the system afterwards.
comment:5 by , 16 years ago
hrev30347; new-ata-bm; acpi
I stripped down the system one by one, removing bt, hda, firewire, ipro1000, usb and the eist driver. The freeze is still reproducible, but comes at a later stage. (See further below.) I also tested an installation from a USB hard disk and a compilation there.
The main observation is that the system freezes either during high I/O or shortly afterwards:
- ./configure for SDL-1.2.3 freezes shortly after config.status is written and header dep generation takes place; or shortly after the whole configure is run
- jam -qaj2 on the Haiku tree; ctrl+c'ing the process right in the middle, waiting a bit, and the system freezes some time afterwards on inactivity (not reproducible every time)
- freeze during random_file_actions -hrev100000 -f150000 -d100 -m128000 -v during execution
There are occasions where the whole system just works for quite a long time. Remember two days ago with a full tree compile, browsing the net, downloading and checking out blender trees via svn at the same time...
I am going to test random_file_actions with tracing options enabled as mentioned in 3808. Jamming the tree with these options unfortunately froze last night and I fell asleep waiting for it :)
Can it be that the system might get into a freeze/deadlock due to file system corruption? The storage partition I am jam'ing on has existed for quite some time and has gone through several dozen outages/freezes since 2008-11.
comment:6 by , 16 years ago
Even if the file system already has problems (checkfs should be able to tell you, though), the system should never freeze so hard that you cannot enter KDL anymore.
I would still suspect ACPI related problems. Have you tried to enable APM instead, and see if that makes any difference (also in the kernel settings file)?
comment:7 by , 16 years ago
The freezes are/were reproducible with and without acpi. Only tried acpi later on as that also helped with #3632. Will try with apm enabled when getting back home.
Regarding file system corruption: Yes there is according to checkfs. ;)
comment:8 by , 16 years ago
hrev30464; new-ata-bm; default image:
It seems Marcus' recent changes (hrev30443 and hrev30454) have had some impact.
- Only one freeze on svn co of the haiku tree with acpi disabled.
- Full tree compile with acpi enabled. (Freeze minutes later when changing font prefs in Firefox)
What is observable is that there are seconds of UI freeze (especially noticeable with Deskbar and ActivityMonitor living on the desktop). The mouse and windows are movable, though. Just nothing gets updated. Populating the haiku.image "froze" UI updates for 1m12 seconds.
During heavy IO (svn checkout + deletion of 40k files via Tracker; tree compile) there are noticeable UI update freezes, but the system gets back after up to 10 seconds.
Generally, the "scsi scheduler" kernel thread takes up full CPU periodically. (Only confirmable when UI is not frozen.)
comment:9 by , 16 years ago
Cc: | added |
---|
comment:10 by , 16 years ago
comment:11 by , 16 years ago
I tried last night right after your changes (hrev30475 without vm prefetch and with ATA_STACK 1). System behaves as with hrev30464 observations in my previous comment. Nevertheless this is a big improvement over the weeks/months before your changes.
- system freeze on SDL configure
- system freeze after ~10m of inactivity after having done a "jam -qaj2 @install4 update kernel" which involved a little compiling
When the system is frozen, it is completely frozen. No KDL, nor KDL-enter. As mentioned earlier in this thread, the trackpoint and touchpad are PS2-attached and are not reacting either on complete freeze.
comment:12 by , 16 years ago
- SDL configure on a distcleaned tree -> freeze after ~30s of idle when process finished
- jam -qj2 the haiku tree (mostly present objects) -> freeze after ~30s of idle when process finished
- jam -qaj2 the haiku tree -> freeze after 2m10s of idle when process finished
comment:13 by , 16 years ago
hrev30772; ata-stack;
For reproduction, I went on and wrote a little script which dd's 10 images of 100MB each to disk.
When setting all safe mode options, or just "Disable IDE-DMA" and "Disable SMP", I can run the dd writes several times without a freeze. What is reproducible is that after the images have been written, the "scsi scheduler" kernel thread uses 100% of one CPU core for about 3-5 minutes and then relents. During that high load of the thread a "sync" or system shutdown is not possible. The UI (Deskbar + ActivityMonitor replicants on Desktop) is frozen for a while. The Terminal cursor blinks, and I can move windows around.
With SMP left enabled, I could produce a total freeze on the third run of the dd writes. (Mileage on subsequent tests might vary, I guess.)
Perhaps there is some other fix required as done in hrev30454 as I do have ICH9M running in compat mode? (This rev btw helped a lot regarding uptime until freeze for me.)
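The script itself was not attached to the ticket. A minimal sketch of what such a dd loop might look like, assuming a POSIX shell; the function name `write_images`, the `dd-N.img` file names and the argument order are hypothetical and only chosen for illustration:

```sh
# Hypothetical helper (not the reporter's actual script):
# write COUNT zero-filled images of SIZE_MB megabytes each into DIR.
write_images() {
    dir="$1"
    count="${2:-10}"     # default: 10 images, as described in the comment
    size_mb="${3:-100}"  # default: 100MB each
    i=1
    while [ "$i" -le "$count" ]; do
        # bs=1024k matches the block size used elsewhere in this ticket
        dd if=/dev/zero of="$dir/dd-$i.img" bs=1024k count="$size_mb" 2>/dev/null || return 1
        i=$((i + 1))
    done
    # flush the block cache so the write-back phase starts right away
    sync
}
```

Calling `write_images /path/to/work/partition` would write the ten 100MB images; smaller values for the second and third argument are useful for a dry run.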
comment:14 by , 16 years ago
I just went on testing with hrev30868; gcc4; ata-bm; acpi; on the second sata disk with an installation and work partition and could reproduce the occasional UI freeze during I/O and the total freeze after building the Haiku tree.
But I never dared to test the old ide-bm after Marcus' changes (some or all related for me: hrev30443; hrev30454; hrev30475), which - if I got things right - have also been applied to the ide-bm & co after the separation of the two bus managers.
I tested clean installations of hrev30868; gcc4; ide-bm; acpi; on the second sata disk and on my primary sata disk with full Haiku builds (jam -qaj2), SDL configures + builds and the dd write tests mentioned earlier:
- No occasional UI freezes on I/O (e.g. no Deskbar or ActivityMonitor replicant updates)
- No system freeze during or after compiles ... read on ...
I thought I had nailed it down. While writing this text, there were more or less 5 minutes of no heavy disk I/O after the "tests" performed above, and it suddenly froze again.
At least the UI freezes are not reproducible with the ide-bm.
Attaching "hdparm" info for the two disks in case it is of interest:
Primary:
/dev/sda:
ATA device, with non-removable media
    Model Number: WDC WD3200BEKT-00F3T0
    Serial Number: WD-WXE808PN8192
    Firmware Revision: 11.01A11
    Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5
Standards:
    Supported: 8 7 6 5
    Likely used: 8
Configuration:
    Logical         max     current
    cylinders       16383   16383
    heads           16      16
    sectors/track   63      63
    --
    CHS current addressable sectors: 16514064
    LBA user addressable sectors: 268435455
    LBA48 user addressable sectors: 625142448
    device size with M = 1024*1024: 305245 MBytes
    device size with M = 1000*1000: 320072 MBytes (320 GB)
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, with device specific minimum
    R/W multiple sector transfer: Max = 16 Current = 16
    Advanced power management level: 128
    Recommended acoustic management value: 128, current value: 254
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
         Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
    Enabled Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    Host Protected Area feature set
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    NOP cmd
       *    DOWNLOAD_MICROCODE
       *    Advanced Power Management feature set
            SET_MAX security extension
            Automatic Acoustic Management feature set
       *    48-bit Address feature set
       *    Device Configuration Overlay feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
       *    General Purpose Logging feature set
       *    WRITE_{DMA|MULTIPLE}_FUA_EXT
       *    64-bit World wide name
       *    IDLE_IMMEDIATE with UNLOAD
       *    {READ,WRITE}_DMA_EXT_GPL commands
       *    Segmented DOWNLOAD_MICROCODE
       *    SATA-I signaling speed (1.5Gb/s)
       *    SATA-II signaling speed (3.0Gb/s)
       *    Native Command Queueing (NCQ)
       *    Host-initiated interface power management
       *    Phy event counters
            DMA Setup Auto-Activate optimization
            Device-initiated interface power management
       *    Software settings preservation
       *    SMART Command Transport (SCT) feature set
       *    SCT Long Sector Access (AC1)
       *    SCT LBA Segment Access (AC2)
       *    SCT Error Recovery Control (AC3)
       *    SCT Features Control (AC4)
       *    SCT Data Tables (AC5)
            unknown 206[12] (vendor specific)
            unknown 206[13] (vendor specific)
Security:
    Master password revision code = 65534
            supported
    not     enabled
    not     locked
            frozen
    not     expired: security count
            supported: enhanced erase
    84min for SECURITY ERASE UNIT. 84min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 50014ee22157687
    NAA : 5
    IEEE OUI : 14ee
    Unique ID : 22157687
Checksum: correct
Secondary:
/dev/sdb:
ATA device, with non-removable media
    Model Number: ST9160823AS
    Serial Number: 5NK1DJD1
    Firmware Revision: 3.CME
Standards:
    Supported: 7 6 5 4
    Likely used: 8
Configuration:
    Logical         max     current
    cylinders       16383   16383
    heads           16      16
    sectors/track   63      63
    --
    CHS current addressable sectors: 16514064
    LBA user addressable sectors: 268435455
    LBA48 user addressable sectors: 312581808
    device size with M = 1024*1024: 152627 MBytes
    device size with M = 1000*1000: 160041 MBytes (160 GB)
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, no device specific minimum
    R/W multiple sector transfer: Max = 16 Current = 16
    Advanced power management level: 128
    Recommended acoustic management value: 254, current value: 0
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4
         Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
    Enabled Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    Host Protected Area feature set
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    DOWNLOAD_MICROCODE
       *    Advanced Power Management feature set
            SET_MAX security extension
       *    48-bit Address feature set
       *    Device Configuration Overlay feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
       *    General Purpose Logging feature set
       *    IDLE_IMMEDIATE with UNLOAD
       *    Disable Data Transfer After Error Detection
            Write-Read-Verify feature set
       *    WRITE_UNCORRECTABLE_EXT command
       *    SATA-I signaling speed (1.5Gb/s)
       *    Native Command Queueing (NCQ)
       *    Phy event counters
            Device-initiated interface power management
       *    Software settings preservation
       *    SMART Command Transport (SCT) feature set
Security:
    Master password revision code = 65534
            supported
    not     enabled
    not     locked
            frozen
    not     expired: security count
            supported: enhanced erase
    56min for SECURITY ERASE UNIT. 56min for ENHANCED SECURITY ERASE UNIT.
Checksum: correct
comment:15 by , 16 years ago
Please insert a panic directly after
" if (sVectors[vector].ignored_count > 9900) "
into src/system/kernel/int.c
I'm interested to see if that code path gets executed during the freeze.
comment:16 by , 16 years ago
Nope, that code is not taken, it seems. At least it does not seem to be taken during the occasional UI freezes (which then relent again). Do you think it's worth planting some more dprintf() and/or panic() calls in the interrupt handling code?
Btw, I think I need to retract my statement that the UI freezes don't happen with the ide-bm. Dunno what was different yesterday. Had them again with a fresh hrev30878 and ide-bm.
follow-up: 19 comment:17 by , 16 years ago
Michael, the UI freezes (the ones where the entire system seems to freeze and then resumes working after a few seconds) are not restricted to you. AFAIK, it happens with every single person running Haiku. I get it with a Core 2 Quad Extreme at 3.6 GHz and with 3 GB of memory. It is indeed IO related but I don't think it is the cause of your other problems. If I were you, I would focus on the other ones as this is a known problem (there is a ticket for it somewhere).
comment:18 by , 16 years ago
Hey Bruno, thanks for the info related to the occasional UI freezes. I just mention them as a side note. But as they aren't as dramatic, and as I am apparently not the only one experiencing these, I think I can forgo mentioning these until they disappear ;)
What is indeed more of a pressing issue are the total freezes, for which I haven't yet figured out, with the help from the "others", where they might stem from.
comment:19 by , 16 years ago
Replying to bga:
Michael, the UI freezes (the ones where the entire system seems to freeze and then resumes working after a few seconds) are not restricted to you. AFAIK, it happens with every single person running Haiku.
Haven't seen that problem on my machines yet.
comment:20 by , 16 years ago
Then you are lucky. :) It has been discussed several times already in the mailing lists (Axel even started looking into it once as it was even more visible in his EEE PC if I am not mistaken). It happens all the time when I do, for example, "svn up" in the Haiku tree or when compiling Haiku inside itself. It is easier to notice if you are playing an audio file for example but can also be noticed if you have "Show seconds" enabled in the Deskbar clock and you pay attention to it or if you run ActivityMonitor while doing what I mentioned as you will notice that sometimes it refuses to redraw itself for several seconds when heavy IO is going on.
*BUT*, this is a subject for another ticket I guess.
comment:21 by , 15 years ago
hrev30981; ide-bm; acpi; scheduler-affine;
Following the recent "Scheduler" thread on the mailing list: commenting out the asm("hlt"); in x86/arch_cpu.cpp#arch_cpu_idle(), or replacing it with an asm("nop");, results in the system no longer freezing during or after compilation runs or other I/O-intensive tasks.
The system has been up for more than two hours, including compilations, svn up's and surfing.
comment:22 by , 15 years ago
Cc: | added |
---|
comment:23 by , 15 years ago
For keeping this one updated...
hrev32467-hrev32497;gcc4;ata-bm;acpi
For reproducing freezes I usually do now: dd if=/dev/zero of=dd.img bs=1024k count=4096
Sometimes the system freezes before reaching the 3GB (cache) memory limit, sometimes shortly after or minute(s) after the file has been written and the disk has actually written the file back. (There is heavy disk activity after the file has been created and the dd-process exited.)
On other occasions the dd-write just works fine (also repeatedly) and I can continue to work. But sooner or later, the system will freeze.
comment:24 by , 15 years ago
I am inclined to say that I probably found a way of circumventing the system freezes...
I did experiment with kernel_debug_config.h swap-support and memory-limitation:
memory-limit | swap-support | swap-size | freeze |
---|---|---|---|
- | yes | disabled | yes |
- | no | - | yes |
512 | no | - | no |
512 | yes | 509 | no |
2560 | yes | 509 | no |
2560 | no | - | no |
So it seems to boil down to limiting the maximum available memory. I've done several tests - especially with the last config (though it shouldn't matter if swap is enabled or not) - which reproducibly froze the system before. (Including jam -qaj2, SDL configure and build, dd-tests creating a 4GB image)
No system freeze with limited available memory yesterday during the test session and a quick dd-test this morning.
As a side effect, it seems the system is able to shut down and reboot correctly, where before it mostly just froze right before doing so. (Showing the last "state" of the shutdown alert/dialog.)
comment:26 by , 15 years ago
hrev32893-hrev33027-trunk;gcc4;ata;acpi
3067MB seems to be the last possible RAM-limit-setting on which the system is not freezable with the dd-tests.
memory-limit | freeze |
---|---|
- | yes |
3070 | yes |
3068 | yes |
3067 | no |
3066 | no |
3064 | no |
3056;3040;2944 | no |
In the meantime it turned out chaotic (trac-user) has the same ThinkPad T500 series, just another processor and other hard drive.
He also reported occasional system freezes and could reproduce the dd-test-freeze.
So I built him a hrev32932 hybrid kernel limited to 2944MB RAM for testing on his r1a1 installation and he could not get the system to freeze with the dd-tests. Just the occasional *UI*-freezes. (But these are another story...)
comment:27 by , 15 years ago
Perhaps some memory io range is not usable but is used by Haiku? Is the e820 table ok?
follow-up: 29 comment:28 by , 15 years ago
Marcus, pardon my ignorance, but how can I determine what that "e820" table contains and how can I make it available to you for verification? I have not the slightest idea where that table resides and what it is supposed to represent ;) Thanks!
Btw, hrev33032 looks interesting. Will have to test later on...
follow-up: 30 comment:29 by , 15 years ago
Replying to michael.weirauch:
Btw, hrev33032 looks interesting. Will have to test later on...
Please try hrev33037 instead. The protection in hrev33032 didn't actually work due to overflowing.
comment:30 by , 15 years ago
Replying to mmlr:
Replying to michael.weirauch:
Btw, hrev33032 looks interesting. Will have to test later on...
Please try hrev33037 instead. The protection in hrev33032 didn't actually work due to overflowing.
Just tested with a fresh hrev33040 and it did still freeze with the dd-test.
Let me know how I can help track down whether there is still something wrong with the overflows.
comment:31 by , 15 years ago
hrev33064; freezes still reproducible; sorry to have to report that.
4 runs: (max cache-memory visually approximated from ActivityMonitor output)
- freeze after disk-activity ceased (long after dd exited)
- max cache-memory bypassed, freeze right after
- freeze some megabytes before max cache-memory
- freeze some megabytes before max cache-memory
comment:32 by , 15 years ago
Summary: | Freeze on Haiku tree compilation (reproduceable) → Freeze on high memory load when not limiting available memory (reproduceable) |
---|---|
Version: | R1/pre-alpha1 → R1/Development |
hrev33655; gcc4hybrid;
Still persistent. Some mtrr info. Perhaps this sheds some more light into the issue as it seems memory related. (Limiting memory to 3067MB works)
bott_mtrr_dump.diff from #4399
mtrr: 7 variable ranges
mtrr: default type: 0xc00 (uncacheable, variable enabled, fixed enabled)
mtrr: entry 0: base: 0x13c000000; length: 0x40007ff; type: 0 uncacheable
mtrr: entry 1: base: 0x0; length: 0x800007ff; type: 6 write-back
mtrr: entry 2: base: 0x80000000; length: 0x400007ff; type: 6 write-back
mtrr: entry 3: base: 0x100000000; length: 0x400007ff; type: 6 write-back
mtrr: entry 4: empty
mtrr: entry 5: empty
mtrr: entry 6: empty
/proc/mtrr from openSUSE:
reg00: base=0x13c000000 (5056MB), size= 64MB: uncachable, count=1
reg01: base=0x00000000 ( 0MB), size=2048MB: write-back, count=1
reg02: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg03: base=0x100000000 (4096MB), size=1024MB: write-back, count=1
reg04: base=0xd0000000 (3328MB), size= 256MB: write-combining, count=1
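To compare such dumps at a glance, the range sizes can be totalled per cache type. A minimal sketch in POSIX shell, assuming input in the Linux /proc/mtrr format quoted above; `mtrr_totals` is a hypothetical helper name, not an existing tool:

```sh
# Hypothetical helper: read /proc/mtrr-style lines on stdin and print
# "<cache type> <total size in MB>" for each cache type found.
mtrr_totals() {
    # pull "<type> <size>" out of e.g. "size=2048MB: write-back, count=1"
    sed -n 's/.*size= *\([0-9][0-9]*\)MB: *\([a-z-]*\),.*/\2 \1/p' |
    awk '{ total[$1] += $2 } END { for (t in total) print t, total[t] }' |
    sort
}

# The openSUSE dump from this comment, fed in as a here-document.
mtrr_totals <<'EOF'
reg00: base=0x13c000000 (5056MB), size= 64MB: uncachable, count=1
reg01: base=0x00000000 ( 0MB), size=2048MB: write-back, count=1
reg02: base=0x80000000 (2048MB), size=1024MB: write-back, count=1
reg03: base=0x100000000 (4096MB), size=1024MB: write-back, count=1
reg04: base=0xd0000000 (3328MB), size= 256MB: write-combining, count=1
EOF
```

For the dump above this sums to 64MB uncachable, 4096MB write-back and 256MB write-combining. The totals say nothing about overlaps between ranges, so they are only a rough cross-check.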
comment:33 by , 15 years ago
That's the MTRR setup produced by the BIOS and since we reset all MTRRs during the boot process, this is not related. It would be interesting to see the MTRR setup we produce, though. Enable TRACE_MTRR in src/add-ons/kernel/cpu/x86/generic_x86.cpp to have it printed to the syslog. You might need to increase syslog_buffer_size in your kernel settings to prevent the output from being dropped.
follow-up: 36 comment:35 by , 15 years ago
Replying to bonefish:
hrev34197 reimplements the MTRR handling. Please retest.
hrev34198-gcc4h; fresh install; Unfortunately the freeze still occurs. On 2 of the 2 dd-tests, the ActivityMonitor GUI froze on max block-cache memory usage. Mouse still movable. Then it started drawing again after about 5-10 seconds. And then the system froze completely. Once after the 4GB file was written, and once right after the ActivityMonitor drew again.
Btw, I am not getting any "mtrr:" output, although TRACE_MTRR is defined. If it helps, I haven't seen any early dprintf() of my little acpi_thinkpad in the early "support"-hook either on my work installation.
comment:36 by , 15 years ago
Replying to michael.weirauch:
Btw, I am not getting any "mtrr:" output, although TRACE_MTRR is defined. If it helps, I haven't seen any early dprintf() of my little acpi_thinkpad in the early "support"-hook either on my work installation.
Have you increased syslog_buffer_size as I've suggested? If you still get <TRUNC>/<DROP> the buffer size is still not sufficiently large.
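A quick way to check both points in a saved syslog is to count the "mtrr:" lines and the truncation markers. A minimal sketch in POSIX shell; `syslog_check` is a hypothetical helper, and the log path is passed in as an argument because its location on a given installation is an assumption:

```sh
# Hypothetical helper: report how many "mtrr:" lines and how many
# <TRUNC>/<DROP> markers a syslog file contains.
syslog_check() {
    log="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    mtrr=$(grep -c 'mtrr:' "$log" || true)
    lost=$(grep -c -e '<TRUNC>' -e '<DROP>' "$log" || true)
    echo "mtrr_lines=$mtrr lost_markers=$lost"
}
```

For example, `syslog_check /var/log/syslog` (path assumed). A non-zero `lost_markers` count would suggest the buffer is still too small, and `mtrr_lines=0` that the trace output never reached the log.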
by , 15 years ago
Attachment: | haiku-r34198-trace-mtrr-syslog.txt added |
---|
ThinkPad T500 hrev34198 TRACE_MTRR
comment:37 by , 15 years ago
Hi Ingo, sorry for the trouble! The elevated syslog_buffer_size setting which I usually have enabled when cross compiling and installing got lost on my work installation.
Attached a syslog. Additionally the boot_mttr_dump-patch output is still present at the very beginning. Hope this helps.
follow-up: 39 comment:38 by , 15 years ago
Our MTRR setup looks OK -- considering what it can do with the weird ranges it gets ((base 0x0, size 0xbfac6000) and (base 0xbfdff000, size 0x1000)) at least. The BIOS has a laxer setup, so I'd say your issue is not MTRR related. To be sure you could check whether the MTRR setup under Linux is also not stronger than ours.
PS: The boot_mttr_dump patch is not needed when you enable TRACE_MTRR -- the same info is printed anyway (in fact even more correctly).
follow-up: 40 comment:39 by , 15 years ago
Replying to bonefish:
Our MTRR setup looks OK -- considering what it can do with the weird ranges it gets ((base 0x0, size 0xbfac6000) and (base 0xbfdff000, size 0x1000)) at least. The BIOS has a laxer setup, so I'd say your issue is not MTRR related. To be sure you could check whether the MTRR setup under Linux is also not stronger than ours.
Please see some posts above for the openSUSE (11.1) setup: here
PS: The boot_mttr_dump patch is not needed when you enable TRACE_MTRR -- the same info is printed anyway (in fact even more correctly).
Ok, going to remove it again.
follow-up: 41 comment:40 by , 15 years ago
Replying to michael.weirauch:
Replying to bonefish:
Our MTRR setup looks OK -- considering what it can do with the weird ranges it gets ((base 0x0, size 0xbfac6000) and (base 0xbfdff000, size 0x1000)) at least. The BIOS has a laxer setup, so I'd say your issue is not MTRR related. To be sure you could check whether the MTRR setup under Linux is also not stronger than ours.
Please see some posts above for the openSUSE (11.1) setup: here
Ah, missed that (the ticket is getting rather long :-/). Verified, the setup is laxer than ours too. So this is not MTRR related for sure.
comment:41 by , 15 years ago
Replying to bonefish:
Ah, missed that (the ticket is getting rather long :-/). Verified, the setup is laxer than ours too. So this is not MTRR related for sure.
Thanks for having a look at this Ingo! Before I let this ticket remain idle until it is resolved one day...
I am currently running with 3067MB limit and am experiencing no freezes whatsoever. Everything is fine. The other day I switched (in BIOS) to use the Intel GMA X4500 instead of the Radeon HD3650 in order to be able to use the full screen estate (1680x1050 instead of 1400x1050 as the Radeon VBE doesn't export the full mode list). I removed the intel_extreme driver though, because it seemed to "flicker", so I went with VESA. I was very quickly able to get the system to freeze again during use of the system. (Memory was 3034MB due to RAM shared with the GPU)
The question is, if this is not MTRR related, can this still be "memory management"-related in general as the use of the integrated graphics influences the memory system by using parts of it as video memory and seemed to bring back the freezeability?
follow-up: 43 comment:42 by , 15 years ago
0xbfac6000 seems to be your upper memory limit. Maybe the system tries to use this physical memory even if it seems to be mapped with map_physical_memory(). Which driver tries to map this range?
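For orientation, the two ranges quoted earlier in the ticket, (base 0x0, size 0xbfac6000) and (base 0xbfdff000, size 0x1000), can be converted to megabytes and set next to the memory limits that were tested. A small sketch using POSIX shell arithmetic; the helper names are hypothetical, and it assumes the limits in the tables are in units of 1024*1024 bytes:

```sh
# Hypothetical helpers for illustration; not part of Haiku.
# Print the number of whole MiB contained in a byte address or size.
to_mib() { echo $(( $1 / 1048576 )); }
# Print what is left beyond the last whole MiB, in KiB.
rem_kib() { echo $(( ($1 % 1048576) / 1024 )); }

first_end=0xbfac6000                    # end of (base 0x0, size 0xbfac6000)
second_end=$(( 0xbfdff000 + 0x1000 ))   # end of (base 0xbfdff000, size 0x1000)

echo "first range ends at $(to_mib $first_end) MiB + $(rem_kib $first_end) KiB"
echo "second range ends at $(to_mib $second_end) MiB + $(rem_kib $second_end) KiB"
```

The first range ends at 3066 MiB + 792 KiB, just below the 3067MB limit that did not freeze, and the second ends at exactly 3070 MiB. This is only arithmetic on the numbers in the ticket; it does not by itself show which mapping is at fault.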
comment:43 by , 15 years ago
Replying to korli:
0xbfac6000 seems to be your upper memory limit. Maybe the system tries to use this physical memory even if it seems to be mapped with map_physical_memory(). Which driver tries to map this range?
Could you give a hint on how to provide this info? ;)
comment:44 by , 15 years ago
hrev35736 fixed this longstanding issue without having to limit the available system memory to 3067MB! (System now shows 3069MB due to ignoring the lower 1MB)
None of the dd-tests to provoke the system freeze actually get it to do so anymore. Great work! Bug can be closed.
Maybe your system gets warm during compilation, and the BIOS tries to start the fan, or even lower the chip frequency. This might actually have the same cause as #3632 I would think.
Hopefully enabling ACPI will already do the trick. However, we currently don't do much CPU thermal/frequency management ourselves, so please check that your system is not running hot for some reason.