Opened 4 years ago

Closed 4 years ago

#15587 closed bug (fixed)

Regression finding Haiku partition after GNU-EFI removal

Reported by: tqh Owned by: jessicah
Priority: blocker Milestone: R1/beta2
Component: System/Boot Loader/EFI Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

A live USB stick with hrev53644 finds the Haiku partition. hrev53658 does not find it; however, on my old laptop, which has an old Haiku partition, it found that one.

Change History (48)

comment:1 by tqh, 4 years ago

hrev53649 does not find the partition either, so we need to see whether devices.cpp or one of the header values is different.

comment:2 by waddlesplash, 4 years ago

Component: - General → System/Boot Loader
Milestone: Unscheduled → R1/beta2
Owner: changed from nobody to kallisti5
Priority: normal → blocker
Status: new → assigned

Same behavior here: partitions on USB drives do not appear now. Elevating to blocker.

comment:3 by pulkomandy, 4 years ago

Can you try reverting just https://git.haiku-os.org/haiku/commit/?id=51429f04179453622530b5fc1a2bb62f8c4deb75 and see if that helps?

If that's the case, we probably need to try both paths (with and without the set_path_end) or otherwise more cleanly fix this.

See the discussion on https://review.haiku-os.org/c/haiku/+/2040 .

comment:4 by tqh, 4 years ago

I don't have a dev environment at the moment, but hopefully I'll be able to look into it soon.

comment:5 by tqh, 4 years ago

Some partition info:

  • Very old partition, that is found:
    sudo /sbin/blkid -p /dev/sdb1
    /dev/sdb1: 
    LABEL="Haiku"
    VERSION="little-endian"
    UUID="2fc9c21566e50935"
    TYPE="befs"
    USAGE="filesystem"
    PART_ENTRY_SCHEME="gpt"
    PART_ENTRY_NAME="Haiku"
    PART_ENTRY_UUID="70186fa8-d6af-fc42-a893-e79c112a3942"
    PART_ENTRY_TYPE="42465331-3ba3-10f1-802a-4861696b7521"
    PART_ENTRY_NUMBER="1"
    PART_ENTRY_OFFSET="40"
    PART_ENTRY_SIZE="46903296"
    PART_ENTRY_DISK="8:16"
    
  • Partition on stick:
    sudo /sbin/blkid -p /dev/sdc1
    /dev/sdc1:
    LABEL="Haiku"
    VERSION="little-endian"
    UUID="2d4cec077f247bea"
    TYPE="befs"
    USAGE="filesystem"
    PART_ENTRY_SCHEME="dos"
    PART_ENTRY_TYPE="0xeb"
    PART_ENTRY_FLAGS="0x80"
    PART_ENTRY_NUMBER="1"
    PART_ENTRY_OFFSET="12288"
    PART_ENTRY_SIZE="1228800"
    PART_ENTRY_DISK="8:32"
    

comment:6 by pulkomandy, 4 years ago

The second one is an MBR partition, while the first is GPT. That seems like the first thing to check.
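
The difference shows up in the PART_ENTRY_SCHEME field of the blkid output in comment:5. As a minimal sketch, that check could be done mechanically like this; the helper name partition_scheme is made up for illustration, and the sample strings are trimmed copies of the output above:

```python
import shlex


def partition_scheme(blkid_output: str) -> str:
    """Return "GPT", "MBR" or "unknown" from `blkid -p` style KEY="value" pairs."""
    fields = {}
    # shlex handles the quoted values and ignores line breaks between pairs.
    for token in shlex.split(blkid_output):
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
    scheme = fields.get("PART_ENTRY_SCHEME", "")
    # blkid reports an MBR partition table as "dos".
    return {"gpt": "GPT", "dos": "MBR"}.get(scheme, "unknown")


# Trimmed samples taken from comment:5 (hypothetical variable names).
old_disk = '/dev/sdb1: LABEL="Haiku" TYPE="befs" PART_ENTRY_SCHEME="gpt" PART_ENTRY_NUMBER="1"'
usb_stick = '/dev/sdc1: LABEL="Haiku" TYPE="befs" PART_ENTRY_SCHEME="dos" PART_ENTRY_TYPE="0xeb"'

print(partition_scheme(old_disk))
print(partition_scheme(usb_stick))
```

With these samples the old disk is reported as GPT and the stick as MBR, matching the observation above.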

comment:7 by kallisti5, 4 years ago

MBR is expected on anyboot.

$ blkid haiku-nightly-anyboot.iso
haiku-nightly-anyboot.iso: UUID="2020-01-02-01-54-45-00" LABEL="haiku-nightly-x86_64" TYPE="iso9660" PTTYPE="dos"

I documented the technical layout in the anyboot tool a while back:

// Haiku Anyboot Image:
//   (MBR Table + Boot Sector)
//   ISO (Small Haiku ISO9660)
//   First Partition (Haiku OS Image, BFS)
//   Second Partition (EFI Loader, FAT)
//   Third Partition (EFI Mac, HFS) <not implemented>

It's possible you had an old / invalid backup GPT sector at the end of the disk?

comment:8 by kallisti5, 4 years ago

Actually, I just re-read this one. You're saying that the latest EFI loader booted from an anyboot is failing to identify your MBR partition on disk from an existing install?

Please clarify the following:

  • Are you booting the Anyboot as BIOS or EFI?
  • Booting the Anyboot image as EFI, does it locate the GPT partition?
  • Booting the Anyboot image as BIOS, does it locate the GPT partition?

comment:9 by tqh, 4 years ago

The Anyboot image on a USB stick, booted through UEFI, can't find its own Haiku partition, but it finds an old Haiku partition on my hard drive. I don't have any BIOS machines, and I don't think there is any problem there.

in reply to:  9 comment:10 by tqh, 4 years ago

Replying to tqh:

The Anyboot image on a USB stick, booted through UEFI, can't find its own Haiku partition, but it finds an old Haiku partition on my hard drive. I don't have any BIOS machines, and I don't think there is any problem there.

comment:11 by kallisti5, 4 years ago

Here's a test build minus the change Pulkomandy pointed out. Please give it a try and let us know: https://keybase.pub/kallisti5/testbuilds/haiku-x86_64-efi-wohrev53645.iso.xz

comment:12 by kallisti5, 4 years ago

@tqh were you able to test this?

comment:13 by X512, 4 years ago

I have one x86_64 EFI-only PC and one x86_64 BIOS/EFI PC. I can perform tests if needed.

in reply to:  11 comment:14 by X512, 4 years ago

Replying to kallisti5:

https://keybase.pub/kallisti5/testbuilds/haiku-x86_64-efi-wohrev53645.iso.xz

On the x86_64 EFI-only PC it boots fine. On the x86_64 BIOS/EFI PC it says something like "can't find boot partition, scanning for all partitions", and no continue-booting option is available until the boot partition is explicitly selected. After the partition is selected, Haiku boots normally. The behavior hasn't changed compared to an older (about 1 year old) EFI loader. The x86_64 BIOS/EFI PC has a SATA HDD with 32-bit Haiku installed.

comment:15 by tqh, 4 years ago

I've been away, but I am going to check and also set up my dev environment to work on it.

X512, I think you are only adding confusion. The important thing is whether it can find the partition on the USB anyboot disk under UEFI, not any Haiku image. If you:

  • download the anyboot-image
  • put it on a usb-stick
  • uefi-boot it on a machine without any Haiku partitions
  • it should find and boot Haiku from USB
  • But it does not find the partition at the moment :(

comment:16 by X512, 4 years ago

For me, it successfully finds the partition and boots with https://keybase.pub/kallisti5/testbuilds/haiku-x86_64-efi-wohrev53645.iso.xz on the x86_64 EFI-only PC. This PC doesn't have Haiku installed (it has an MMC SSD that is not recognized by Haiku).

comment:17 by tqh, 4 years ago

Are you selecting to boot from the USB with UEFI?

If you are using an already installed bootloader or using the BIOS bootloader it doesn't add any value to this bug.

in reply to:  17 comment:18 by X512, 4 years ago

Replying to tqh:

Are you selecting to boot from the USB with UEFI?

I selected the USB drive from the EFI menu (F12 key). Otherwise Windows would boot. Does changing the boot priority make a difference, or is this bug only about booting from USB when no other boot devices are available?

comment:19 by tqh, 4 years ago

No, then it probably works. Can you verify that an ordinary nightly image doesn't?

comment:20 by kallisti5, 4 years ago

I've been testing in VirtualBox under waddlesplash's guidance. Nothing ever goes well there for me (most of the time VirtualBox's EFI doesn't see Haiku even though it should and I can boot it manually from the EFI shell, the rest of the time I get a completely unrelated panic.)

EFI works reliably every time under qemu for me. It also works on real hardware.

I'm definitely not discounting that there may be regressions under VirtualBox... but given how random VirtualBox has been, it's hard to get a baseline. A VirtualBox upgrade could also have changed the behavior.

If you're testing this, please be sure:

  • EFI is enabled under System > Motherboard
  • You're attaching Haiku's boot media identically.
  • You're using valid media.
    • Don't just rename the .iso to .hdd or something. VirtualBox wants you to convert it to a VDI for the "iso" to be a "hard disk": vboxmanage convertfromraw haiku-anyboot.iso haiku-disk.vdi

in reply to:  19 comment:21 by X512, 4 years ago

Replying to tqh:

No, then it probably works. Can you verify that an ordinary nightly image doesn't?

I tried to run haiku-master-hrev53741-x86_64-anyboot and it didn't find the boot partition. The boot partition list was empty.

comment:22 by tqh, 4 years ago

That's very good, it means that the big UEFI changes work and only the device path end commit broke things. I will test in a few hours as well.

On another note, I refuse to use VirtualBox. It never behaves in a way that makes it suitable for development.

in reply to:  22 comment:23 by X512, 4 years ago

Replying to tqh:

only the device path end commit broke things.

Which commit (hrev number or gerrit ticket number)?

comment:25 by tqh, 4 years ago

Confirming that Kallisti's special build does work.

comment:26 by kallisti5, 4 years ago

That's just plain odd. hrev53645 looks like a solid change. Assigning to Jessica to see if she has the bandwidth to investigate.

comment:27 by kallisti5, 4 years ago

Owner: changed from kallisti5 to jessicah

comment:28 by pulkomandy, 4 years ago

See my comments in the change request:

https://review.haiku-os.org/c/haiku/+/2040

This "drop device path end" was used to find the parent of the partition device (that is, the whole disk we are booting from). Removing this means we are now always referring to the partition, which may work in some EFI implementations, but not all. So, we should either:

  • Restore the code to find the disk from the partition, but in a way that doesn't crash qemu, or,
  • Try to use the partition, and if that doesn't work, restore the code to attempt to boot from the disk
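
The second option above amounts to a two-step lookup. A minimal sketch of that control flow, assuming two hypothetical callbacks (scan, which returns a Haiku volume found on a device or None, and parent_disk_of, which maps a partition device to its whole disk); neither name comes from the boot loader sources:

```python
from typing import Callable, Optional


def locate_boot_volume(
    partition: str,
    parent_disk_of: Callable[[str], Optional[str]],
    scan: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Try the partition device first, then fall back to the whole disk."""
    # First attempt: hand the partition device itself to the scanner.
    found = scan(partition)
    if found is not None:
        return found
    # Fallback: derive the whole disk and let the scanner parse its partition table.
    disk = parent_disk_of(partition)
    if disk is None:
        return None
    return scan(disk)


# Toy firmware model (all names invented): only the whole disk exposes the Haiku volume.
volumes = {"usb0": "Haiku"}
parents = {"usb0/efi": "usb0"}

print(locate_boot_volume("usb0/efi", parents.get, volumes.get))
```

In this toy model the scan of the EFI partition finds nothing, so the lookup falls back to the parent disk and returns the volume found there.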

comment:29 by tqh, 4 years ago

I think I have a simpler way of doing things. So maybe right now we should just revert the commit in question.

comment:30 by kallisti5, 4 years ago

-1 on reverting at this moment in time.

I have a pretty extensive rework of EFI in-flight and booting in qemu is a requirement. I'd rather have a slightly broken VirtualBox EFI environment instead of a broken qemu environment since qemu is a lot more reliable for testing EFI images.

comment:31 by tqh, 4 years ago

Right now anyboot images don't boot in UEFI mode. And I boot in QEMU all the time, so I think probably you want a better QEMU setup.

comment:32 by jessicah, 4 years ago

Anyboot images do work with QEMU (old OVMF and new OVMF). I've also just tested both USB HDD and AHCI HDD with the latest anyboot image in VirtualBox 6.1.0 (although needed to specifically choose the EFI loader from shell for the latter), and it also works just fine.

About the only regression I've seen is that Haiku shows up twice in the boot menu. Pretty sure this is why I had the device path end code there to avoid that happening.

The fix addressed newer versions of OVMF; older versions, such as linked from https://jessicah.github.io/working-on-uefi booted with and without the fix.

comment:33 by jessicah, 4 years ago

I suspect the reason that the anyboot didn't work as AHCI HDD is that it didn't find a GPT-based partition table. EFI Shell only showed the disk found as MBR, which suggests we need a protective MBR with GPT, which the anyboot currently doesn't do.

Although, that doesn't explain why USB HDD worked without dropping into EFI shell... the anyboot image really is a bit of a monster :-/

Perhaps we should drop the anyboot tool and solely use xorriso? I'm still trying to grok their documentation (really is terrible), but it sounds like it should be able to produce a USB bootable ISO with both MBR and UEFI support...

comment:34 by kallisti5, 4 years ago

Perhaps we should drop the anyboot tool and solely use xorriso? I'm still trying to grok their documentation (really is terrible), but it sounds like it should be able to produce a USB bootable ISO with both MBR and UEFI support...

That's not really a thing. Linux distros use a tool called isohybrid to make ISOs bootable as HDD images (i.e. flash drives). Our anyboot works a lot like isohybrid (but with Haiku in mind and a lot more compact).

comment:35 by pulkomandy, 4 years ago

I will need to rework the anyboot tool for sparc (which uses a completely different partitioning format). I'll look into what can be done and I think we can fit a GPT partition table without problems.

comment:36 by kallisti5, 4 years ago

For QEMU testing, let's standardize on the tools in the Haiku repo. (You don't have to run the actual script, just ensure you're testing with the same commands.)

https://git.haiku-os.org/haiku/tree/src/tests/qemu-boot-test

Waddlesplash has confirmed to me on IRC that on Intel hardware he sees the following:

  • USB UEFI loader starts fine
  • USB UEFI loader doesn't see the USB partition
  • USB UEFI loader sees the SATA disks / partitions.

I booted the image on my AMD Ryzen hardware:

  • USB UEFI loader starts fine
  • USB UEFI loader sees the USB partition and boots.

We have run into this issue in the past on USB disks (either them not showing up, or Haiku not "automatically booting them" and having to manually select the USB partition).

I definitely agree that if hrev53645 has caused clear regressions booting Haiku under UEFI on real hardware *as well as* QEMU, it should be reverted. I see a bunch of conflicting information above, which is why "reverting" isn't a quick and easy call.

From the reports in this ticket, UEFI + USB boots on some hardware / emulators, it doesn't boot on other hardware / emulators.

hrev53645 seems to have "scrambled" who sees what working... which means pre-hrev53645 and post-hrev53645 are both "incorrect".

comment:37 by pulkomandy, 4 years ago

hrev53645 seems to have "scrambled" who sees what working... which means pre-hrev53645 and post-hrev53645 are both "incorrect".

Yes, so as suggested earlier:

Try to use the partition directly, and if that doesn't work, restore the code to attempt to boot from the whole disk instead

This increases our chances that we will somehow manage to find and boot the partition in all cases.

in reply to:  37 comment:38 by jessicah, 4 years ago

Replying to pulkomandy:

hrev53645 seems to have "scrambled" who sees what working... which means pre-hrev53645 and post-hrev53645 are both "incorrect".

Yes, so as suggested earlier:

Try to use the partition directly, and if that doesn't work, restore the code to attempt to boot from the whole disk instead

This increases our chances that we will somehow manage to find and boot the partition in all cases.

Probably could, the only issue would be doing it in a way that doesn't result in the firmware completely hanging.

comment:39 by tqh, 4 years ago

I already have an alternate way that doesn't need to mess with partitions, so if we are reluctant to fix the regression give me some time to clean that up.

comment:40 by pulkomandy, 4 years ago

I confirm that reverting that commit is needed for the bootloader to find partitions on Preetpal's laptop. I have added a bootloader log buffer so it's possible to see the bootloader log in that situation (from the debug options menu), hope that helps.

comment:41 by kallisti5, 4 years ago

@tqh you mentioned a fix 12 days ago. Is that work still ongoing?

comment:42 by tqh, 4 years ago

Yes. I just haven't had much time yet.

comment:43 by pulkomandy, 4 years ago

Component: System/Boot Loader → System/Boot Loader/EFI

comment:44 by oco, 4 years ago

During the coding sprint, I spent some time on the UEFI stuff, mostly to understand things and maybe to help as a side effect.

I have already made some UEFI applications with... Free Pascal! But not much beyond a hello world!

After adding a lot of tracing (and thanks to pulkomandy's trace log, which he fixed on Monday), I was able to somewhat figure out how it works.

If I have understood things correctly, we search for the device path of the partition from which the UEFI bootloader was launched. This device path consists of different nodes identifying the links from the PCI bus to the partition.

With the current source, the device path that is found ends at the EFI partition (FAT32). Haiku uses its own partition parsing, so it needs a BlockIO over the full disk, not over the partition where the UEFI boot loader is located.

The code removed in https://git.haiku-os.org/haiku/tag/?h=hrev53645 was probably hacking the device path to stop it at the drive level (with special values to end the path node chain).

Maybe, in QEMU, that was done directly on an internal structure that is no longer accessible?

I have used a UEFI protocol (efi_device_path_utilities_protocol) that contains a function to duplicate a full device path (with all its nodes), and then edited the last node to end the chain before the partition part. Finally, I use this new device path to ask for a BlockIO.

It looks like it works on real hardware and in QEMU.

My patch still needs some work (it is really quick and dirty). It needs a more repeatable algorithm to end the device path at the right place (it is currently hard-coded for my laptop).

I have also used the UEFI PathToText service to get more human-readable representations of device paths during my investigation, like this: PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0).

We should maybe use them more and get them into our log file. That still needs a function to convert char16_t (the native string format in UEFI) to ASCII char, though. I ended up using a UEFI function to write them to the UEFI console (with a wait on keystroke to be able to see them).
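
The conversion itself is small. A rough Python sketch of the idea, assuming the input is a NUL-terminated buffer of 16-bit little-endian characters and that anything outside ASCII may be replaced by "?"; the function name char16_to_ascii is invented for this example:

```python
def char16_to_ascii(buffer: bytes) -> str:
    """Convert a NUL-terminated UCS-2/UTF-16LE buffer to a plain ASCII string."""
    text = buffer.decode("utf-16-le")
    # Stop at the first NUL terminator, as a C string would.
    text = text.split("\x00", 1)[0]
    # Characters without an ASCII equivalent become "?".
    return text.encode("ascii", errors="replace").decode("ascii")


sample = "PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)".encode("utf-16-le") + b"\x00\x00"
print(char16_to_ascii(sample))
```

In the boot loader this would be a loop over char16_t values writing into a char buffer, but the rule is the same: copy code units below 128, substitute the rest.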

comment:45 by tqh, 4 years ago

We should not use device paths to find disks. Device paths are not only for block devices.

My code iterates over block devices. UEFI lists one per disk plus one for each partition. It skips all partition block devices and then lets the general Haiku boot platform code read and find Haiku partitions. It is very simple, and we can also revert an API change we made for boot devices, making the API simpler for all boot platforms.

There is a special case for a fixed block device with one partition, where it only lists one block device. I don't think we ever plan or can boot on that anyway though.
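
To illustrate the filtering step only, here is a small Python model. It assumes the distinction between disk and partition handles is made with the LogicalPartition flag that UEFI exposes in EFI_BLOCK_IO_MEDIA; the ticket does not say which test the actual change uses, and the BlockDevice class and device names are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockDevice:
    """Simplified stand-in for one UEFI Block I/O handle."""
    name: str
    media_present: bool
    logical_partition: bool  # models EFI_BLOCK_IO_MEDIA.LogicalPartition


def whole_disks(devices: List[BlockDevice]) -> List[BlockDevice]:
    """Keep only handles that represent a full disk with media inserted."""
    return [d for d in devices if d.media_present and not d.logical_partition]


# One USB stick with two partitions and an empty card reader (made-up data).
handles = [
    BlockDevice("usb0", True, False),
    BlockDevice("usb0p1", True, True),
    BlockDevice("usb0p2", True, True),
    BlockDevice("cardreader", False, False),
]

print([d.name for d in whole_disks(handles)])
```

Only the whole-disk handle remains, and Haiku's own partitioning code can then scan it regardless of whether the table is MBR or GPT.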

I already have it compiling and booting on my machines, but it needs cleanup before uploading to Gerrit.

comment:46 by oco, 4 years ago

Well, what I have described was only aimed at fixing add_boot_device_for_image (which is the path used on my laptop). If there is a better general solution in the works, that is fine with me.

comment:47 by tqh, 4 years ago

Here is how I think our EFI devices code should work after reading the specs: https://review.haiku-os.org/c/haiku/+/2232

comment:48 by waddlesplash, 4 years ago

Resolution: fixed
Status: assigned → closed

Fix merged in hrev53848.
