Opened 3 months ago

Closed 2 months ago

Last modified 2 months ago

#19117 closed bug (fixed)

Haiku only boots with "Ignore memory beyond 4GB" turned on.

Reported by: LSS37040
Owned by: nobody
Priority: normal
Milestone: R1/beta6
Component: System/Kernel
Version: R1/beta5
Keywords:
Cc:
Blocked By: #14659
Blocking:
Platform: x86

Description (last modified by LSS37040)

Motherboard: ASRock X99M Killer/3.1

CPU: Xeon E5-2699v4 (22C, HT disabled in BIOS)

RAM: 128GB DDR4 (32GB x4)

Video Card: Quadro K6000 (12GB)

Recently I've been trying to get Haiku 32-bit booting on this board, but it's not straightforward.

By default it freezes at the Haiku boot logo with none of the icons lighting up. If I turn on the "Ignore memory beyond 4GB" safe mode option, it boots fine.

As of R1/beta5 I can boot, partition the disk, and install Haiku 32-bit on this board without any major blocker, as long as I turn on that particular safe mode option. That's not ideal, though, since:

  • I have to manually invoke the boot menu (by holding down left SHIFT) every time I boot it.
  • Possibly because of the video card I'm using, the system has only about 1.4GB of accessible RAM below the 4GB boundary, which is rather small in comparison. I'm not sure exactly why, however.

On the other hand, I also tried booting R1/beta5 64-bit (via USB and via CSM) on this same board as a reference, but the attempts were not successful even after I toggled "Ignore memory beyond 4GB". It's not important for the scope of this issue, however, as I intend to use the 32-bit version on this board.

The question here is where I should start with investigating why the board can't boot with the default setting (without "Ignore memory beyond 4GB"), as well as how to make the safe mode option toggle persistent without having to manually invoke the boot menu every time.

By the way, I did try enabling debug output, but it's not that simple: some parts of the boot process (most likely the paging-related ones) do not tolerate being delayed by debug output, and will KDL if blocked for too long.

Attachments (15)

haiku-x64-selfbuild-newkdl-20241011.txt (76.1 KB ) - added by LSS37040 2 months ago.
Boot log until KDL from self-built Haiku x64 image. (hrev58219)
haiku-x86-selfbuild-newkdl-20241012.txt (67.6 KB ) - added by LSS37040 2 months ago.
Boot log until KDL from self-built Haiku x86 image. (hrev58220)
haiku-hrev58228-8448-2-x86-kdl.txt (9.0 KB ) - added by LSS37040 2 months ago.
x86 KDL on hrev58228-8448-2
haiku-hrev58228-8448-2-x64-booted.txt (136.2 KB ) - added by LSS37040 2 months ago.
x64 booted successfully on hrev58228-8448-2
haiku-hrev58239-8448-4-x64-kdl.txt (10.8 KB ) - added by LSS37040 2 months ago.
x64 KDL on hrev58239-8448-4
haiku-hrev58239-8448-4-x86gcc2h-boot-error.txt (121.3 KB ) - added by LSS37040 2 months ago.
x86 boots, but black screen on hrev58239-8448-4, first try
haiku-hrev58239-8448-4-x86gcc2h-boot-error-2.txt (120.7 KB ) - added by LSS37040 2 months ago.
x86 boots, but black screen on hrev58239-8448-4, second try
haiku-hrev58239-8448-4-x86gcc2h-oldboot-error.txt (120.6 KB ) - added by LSS37040 2 months ago.
x86 boots, still not working even with R1 beta5 bootloader, hrev58239-8448-4
haiku-hrev58240-8448-5-x64-boot.txt (136.1 KB ) - added by LSS37040 2 months ago.
x64 booted successfully on hrev58240-8448-5
haiku-hrev58240-8448-5-x86-error.txt (120.7 KB ) - added by LSS37040 2 months ago.
x86 boots, but still black screen on hrev58240-8448-5
haiku-hrev58245-x86_64-boot.txt (135.9 KB ) - added by LSS37040 2 months ago.
x64 booted successfully on hrev58245
haiku-hrev58245-x86gcc2h-boot-error.txt (121.0 KB ) - added by LSS37040 2 months ago.
x86 still running out of resource space on hrev58245
haiku-hrev58265-x86_64-boot.txt (136.3 KB ) - added by LSS37040 2 months ago.
x64 booted successfully on hrev58265
haiku-hrev58265-x86gcc2h-boot.txt (125.8 KB ) - added by LSS37040 2 months ago.
x86 booted successfully on hrev58265 (issues with ahci)
haiku-hrev58265-ide-no-boot.txt (4.5 KB ) - added by LSS37040 2 months ago.
Neither x86 nor x64 boots when SATA set to IDE mode. Bootloader output (same on both)


Change History (51)

comment:1 by LSS37040, 3 months ago

Description: modified (diff)

comment:2 by LSS37040, 3 months ago

Just tried enabling on-screen debug output with paging disabled. When "Ignore memory beyond 4GB" isn't turned on (that is, booting normally), nothing shows up once the boot logo appears, and the system apparently hangs (no icons light up even after waiting long enough). In some cases (not always) I can make the system reboot via CTRL-ALT-DEL, however.

By the way, how much memory does Haiku 32-bit support at most (including PAE)? AFAIK PAE normally allows 36-bit physical addresses (up to 64GB). I wonder if Haiku has an option to limit the maximum available memory more flexibly (e.g. to 16GB), so I can investigate whether Haiku has trouble handling this much RAM on this board (128GB in this case)...
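
For reference, the 64GB figure follows directly from the address width. The snippet below is only a sketch of that arithmetic; the function name is made up for illustration, and it says nothing about what Haiku's kernel actually supports.

```python
GIB = 1024 ** 3  # bytes in one GiB


def addressable_gib(address_bits: int) -> int:
    """Return how many GiB can be addressed with the given number of
    physical address bits (hypothetical helper, for illustration only)."""
    return (1 << address_bits) // GIB


print(addressable_gib(32))  # plain 32-bit addressing: 4
print(addressable_gib(36))  # 36-bit addressing as mentioned for PAE: 64
```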

comment:3 by madmax, 3 months ago

as well as how to make the safe mode option toggle persistent

I've never tried it myself with that specific option, but you'd edit /boot/home/config/settings/kernel/drivers/kernel and add a line with 4gb_memory_limit true (or just uncomment the one that's already there).
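
For illustration, the enabled setting would be a single line in that file. This is only a sketch: the comment line assumes "#" is the file's comment marker, and only the option name and value come from the comment above.

```
# assumed comment syntax; the stock file may word this differently
4gb_memory_limit true
```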

in reply to:  3 comment:4 by LSS37040, 3 months ago

I've never tried it myself with that specific option, but you'd edit /boot/home/config/settings/kernel/drivers/kernel and add a line with 4gb_memory_limit true (or just uncomment the one that's already there).

Thank you very much. I've found the config file and uncommented the line in question. Now Haiku can boot without having to invoke the boot menu every time.

Just checked the memory info from About screen:

  • It shows the total amount of installed RAM regardless (in my case, 130992 MiB).
  • The 4gb_memory_limit option did indeed disable all RAM mapped past the 4GB boundary, which would otherwise be accessed via PAE. The percentage next to the amount of memory in use reflects the amount of memory actually available (about 1.42GiB in my case).

If it's somehow possible to just set a RAM limit (e.g. 8GB or 16GB) without disabling PAE altogether, it might help with investigating why it's not booting correctly with default settings...

comment:5 by waddlesplash, 3 months ago

Odds are that this is just another instance of #19009. You will need a serial capture to determine that though; those messages aren't displayed in the onscreen debug output.

comment:6 by LSS37040, 3 months ago

Just hooked up a serial cable to that system and got the following output:

PANIC: error allocating early page!

Welcome to Kernel Debugging Land...
Thread 0 "" running on CPU 0
stack trace for thread 0 ""
    kernel stack: 0x00000000 to 0x00000000
frame       caller     <image>:function + offset
 0 81004da4 (+  32) 8014100f
 1 81004db0 (+  12) 8012f33a
 2 81004de0 (+  48) 800a60a0
 3 81004e30 (+  80) 800a74a0
 4 81004e70 (+  64) 800a782f
 5 81004e90 (+  32) 800a7b3f
 6 81004ee0 (+  80) 8011329a
 7 81004f40 (+  96) 8011f041
 8 81004fd0 (+ 144) 80117e01
 9 81004ff0 (+  32) 8005f088

kdebug>

That's all of it. So it apparently KDL'd very early.

The cause of the KDL looks really similar to #19009, though.

comment:7 by waddlesplash, 3 months ago

Blocked By: 19009 added
Keywords: memory removed

comment:8 by waddlesplash, 2 months ago

Blocked By: 14659 added; 19009 removed

comment:9 by waddlesplash, 2 months ago

Please retest after hrev58212.

by LSS37040, 2 months ago

Boot log until KDL from self-built Haiku x64 image. (hrev58219)

by LSS37040, 2 months ago

Boot log until KDL from self-built Haiku x86 image. (hrev58220)

comment:10 by LSS37040, 2 months ago

I don't know when there will be an official post-hrev58212 nightly build available, so I attempted to build the latest source myself. It wasn't easy.

Right now I can only build the x64 image from a Linux host. I tried setting up an x86 hybrid build environment on the same Linux host but wasn't successful, so I had to rely on this active Haiku system to build an x86gcc2 image. At least I managed to obtain bootable artifacts for both x64 and x86gcc2.

Sadly, neither build artifact was able to finish booting without a safe mode option, although the boot process now gets a bit further before hitting other KDLs.

I'll be uploading serial debug logs produced from my self-built x86gcc2 and x64 artifacts. Since the builds are not official, I'm not sure how helpful those logs will be, and I'll need to retest once up-to-date official nightly builds come out.

Summary of the KDL logs:

x86: KDL'd during USB init due to "vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x38, ip 0x81571b29, write 0, user 0, exec 0, thread 0x23"

x64: Ran out of memory while detecting disks (ahci).

In both boot logs I'm seeing occurrences of "vm_page_allocate_page_run()" failures, but they do not necessarily trigger the KDL. I don't know what those errors mean for #14659...

comment:11 by korli, 2 months ago

Looks like there isn't enough physical memory under 4GB in the PMAP. Please try again with https://review.haiku-os.org/c/haiku/+/8446 and https://review.haiku-os.org/c/haiku/+/8447

comment:12 by waddlesplash, 2 months ago

USB KDL should be fixed in hrev58224.

I think those other changes are the wrong solution here. I'm working on an alternative patch.

comment:13 by waddlesplash, 2 months ago

Alternative to korli's fixes: https://review.haiku-os.org/c/haiku/+/8448

comment:14 by waddlesplash, 2 months ago

(Note that if you want to test that patch by building it yourself, you'll need to update to the latest git version first, as the VM changes in hrev58224 are necessary for Gerrit change 8448 to work properly.)

by LSS37040, 2 months ago

x86 KDL on hrev58228-8448-2

by LSS37040, 2 months ago

x64 booted successfully on hrev58228-8448-2

comment:16 by LSS37040, 2 months ago

Just tested. It seems x86 is still broken with this fix while x64 is now fine.

The x86 KDL suggests a failure in switching to PAE.

While the system has 128GB of RAM in total, only 1.4GB of it is below 4GB.

comment:17 by waddlesplash, 2 months ago

Indeed, it seems my change breaks PAE. I'll have to do something slightly different here, I suppose.

comment:18 by waddlesplash, 2 months ago

New patchset uploaded to https://review.haiku-os.org/c/haiku/+/8448; test builds hopefully will appear in the near future.

comment:20 by LSS37040, 2 months ago

Just tested. Not good...

x64 doesn't boot, getting an early KDL due to hitting an assert.

x86 can finish booting, but I'm getting a black screen. The debug log shows a lot of messages suggesting the system has run out of memory for some reason...

I'm uploading boot logs.

by LSS37040, 2 months ago

x64 KDL on hrev58239-8448-4

by LSS37040, 2 months ago

x86 boots, but black screen on hrev58239-8448-4, first try

by LSS37040, 2 months ago

x86 boots, but black screen on hrev58239-8448-4, second try

comment:21 by waddlesplash, 2 months ago

Can you try with an older bootloader (hrev58213 or before) with the test version?

in reply to:  21 comment:22 by LSS37040, 2 months ago

Can you try with an older bootloader (hrev58213 or before) with the test version?

Do you mean booting the same test image with the bootloader of an older install?

Tried using boot menu from my existing install (R1 beta5) to boot the test image, but it doesn't work (be it on USB or on a DVD-RW disc). It's only showing the Haiku volume on which the boot loader resides, and "Rescan Volumes" does nothing.

Or are you referring to rebasing this patch against an older revision?

Last edited 2 months ago by LSS37040 (previous) (diff)

comment:23 by waddlesplash, 2 months ago

Do you mean booting the same test image with the bootloader of an older install?

Yes.

The BIOS loader may only show volumes on the same disk it's located on. In that case, you can mount the newly flashed image's BFS partition and replace its "haiku_loader...hpkg" with the one from your existing install. (Just overwrite the file, renaming your existing package to the filename of the one on the partition.)

by LSS37040, 2 months ago

x86 boots, still not working even with R1 beta5 bootloader, hrev58239-8448-4

comment:24 by LSS37040, 2 months ago

x86 is still not working here. The system still runs out of resource address space when the boot process is about to finish.

Note these lines, which appear before things start to crash:

low resource address space: note -> critical

comment:25 by waddlesplash, 2 months ago

I think the page bookkeeping structures will be somewhere between 650MB-1GB on 32-bit for 64GB RAM. That may mean we really are exhausting kernel address space, I suppose.

I can reproduce the fault with x64 here now, or at least I can with the conditions in #18140. Strange that the earlier test build worked but the new one doesn't...

comment:26 by waddlesplash, 2 months ago

OK, I did the math: 64GB is 0x1000000 pages, and on 32-bit vm_page is 52 bytes, so that's 832 MB of address space in just the page array.

Reasonably, the kernel should be able to cope with the remaining 1.2 GB, but perhaps something else is going wrong here.
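
The arithmetic above can be reproduced with a short sketch. The 4 KiB page size is an assumption (it is the value that makes 64GB equal 0x1000000 pages), the 2 GiB kernel address space figure is the one stated later in comment:29, and the function names are hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

PAGE_SIZE = 4096                 # assumption: 4 KiB pages
VM_PAGE_STRUCT_SIZE = 52         # bytes per vm_page on 32-bit, per the comment above
KERNEL_ADDRESS_SPACE = 2 * GIB   # kernel virtual address space, per comment:29


def page_count(ram_bytes: int) -> int:
    """Number of physical pages needed to cover ram_bytes."""
    return ram_bytes // PAGE_SIZE


def page_array_bytes(ram_bytes: int) -> int:
    """Address space taken by one vm_page structure per physical page."""
    return page_count(ram_bytes) * VM_PAGE_STRUCT_SIZE


ram = 64 * GIB
print(hex(page_count(ram)))                                    # 0x1000000
print(page_array_bytes(ram) // MIB)                            # 832 (MiB)
print((KERNEL_ADDRESS_SPACE - page_array_bytes(ram)) // MIB)   # 1216 (MiB), about 1.2 GB
```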

by LSS37040, 2 months ago

x64 booted successfully on hrev58240-8448-5

by LSS37040, 2 months ago

x86 boots, but still black screen on hrev58240-8448-5

comment:28 by LSS37040, 2 months ago

The issue with x64 has been fixed.

As for x86, no difference. The system is still running out of resource space.

OK, I did the math: 64GB is 0x1000000 pages, and on 32-bit vm_page is 52 bytes, so that's 832 MB of address space in just the page array.

The system has only about 1.4GB memory below 4GB range. Taking out 832MB would leave only about 600MB of memory usable in that region for other things.

Maybe that amount is not enough for everything. On Haiku x64 builds that I could successfully boot, the initial memory usage is about 2820MB according to the About dialog. Don't know if it's possible to get an estimate on how much memory the system would require on 32-bit from that amount...

comment:29 by waddlesplash, 2 months ago

The system has only about 1.4GB memory below 4GB range.

That doesn't matter here. "low resource address space" refers to the kernel virtual address space, of which there's 2GB; the physical memory allocated for the virtual address space does not matter.

by LSS37040, 2 months ago

x64 booted successfully on hrev58245

by LSS37040, 2 months ago

x86 still running out of resource space on hrev58245

comment:30 by waddlesplash, 2 months ago

I see we hit "low resource address space: warning -> critical" before we even mount the boot partition. This means we have less than 32 MB of kernel address space remaining at that point.

I wonder how this happens. Even if we use ~800MB for the page array, the rest of the kernel shouldn't be using 1.2GB even before we mount system packagefs. Something else must be taking up a significant amount of memory.

I guess you can't easily drop to KDL since the keyboard hasn't fully initialized at this point...

comment:31 by waddlesplash, 2 months ago

Please retest after hrev58262.

by LSS37040, 2 months ago

x64 booted successfully on hrev58265

by LSS37040, 2 months ago

x86 booted successfully on hrev58265 (issues with ahci)

comment:32 by LSS37040, 2 months ago

Tested hrev58265 and it seems both x86 and x64 can boot successfully. On x86 system reports 62896MB of memory with an initial usage of about 1010MB, which looks fine to me.

However, there seems to be another issue on x86, with ahci. I'm not sure whether it is related to this issue or whether it's an ahci bug that shows up from this point on.

With x86, partitions on AHCI SATA drives cannot be properly detected, with the first one showing some garbled names, and they all appear as "raw". NVMe and USB drives are okay.

With x64 everything's okay. All partitions can be correctly detected.

by LSS37040, 2 months ago

Neither x86 nor x64 boots when SATA set to IDE mode. Bootloader output (same on both)

comment:33 by LSS37040, 2 months ago

I changed SATA mode to IDE just in case and the situation was worse. Neither x86 nor x64 boots this way.

There's only a black screen, with just a few lines of bootloader output coming out of the serial port. At this point I can either CTRL-ALT-DEL to reboot or power the system off with the power button.

After switching SATA mode back to AHCI, the system boots again as usual.

Anyway, I think this issue can be considered fixed for now. As for the issue with SATA I'll be opening a separate issue as I'm not sure if it's still related to this one.

EDIT: Additionally tested with the "Ignore memory beyond 4GB" option on x86. AHCI SATA disks are properly detected when that option is enabled, so maybe that driver has issues with PAE...

Last edited 2 months ago by LSS37040 (previous) (diff)

comment:34 by LSS37040, 2 months ago

Blocking: 19191 added

comment:35 by waddlesplash, 2 months ago

Milestone: Unscheduled → R1/beta6
Resolution: → fixed
Status: new → closed

Thanks for testing!

comment:36 by waddlesplash, 2 months ago

Blocking: 19191 removed