Opened 4 years ago

Closed 3 years ago

#12324 closed bug (fixed)

VirtualBox 5.0.2: boot hangs on rocket if ahci is used

Reported by: taos Owned by: nobody
Priority: normal Milestone: Unscheduled
Component: Drivers/Disk Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

Trying to boot hrev49564 64bit under VirtualBox 5.0.2 under ArchLinux with SATA AHCI storage controller stops at rocket icon with:

ahci: device signature 0x00000101 (ATA)
ahci: ExecuteAtaRequest port 0: device timeout
ahci: sata_request::abort called for command 0x25
file_cache: read pages failed: Device timeout
packagefs [1836492912:   187] ReadFileData(5972574, 0xffffffff86e20000, 14409) failed to read data: Device timeout
file_cache: read pages failed: Device timeout
runtime_loader: /boot/system/lib/libdebug.so: Troubles reading ELF header
ahci: AHCIPort::ResetPort port 0
ahci: AHCIPort::ResetPort port 0, deviceBusy 0, forceDeviceReset 0
ahci: AHCIPort::PostReset port 0
ahci: device signature 0x00000101 (ATA)
ahci: ExecuteAtaRequest port 0: device timeout
ahci: sata_request::abort called for command 0x25
ahci: AHCIPort::ResetPort port 0
ahci: AHCIPort::ResetPort port 0, deviceBusy 0, forceDeviceReset 0
ahci: AHCIPort::PostReset port 0
ahci: device signature 0x00000101 (ATA)
ahci: ExecuteAtaRequest port 0: device timeout
ahci: sata_request::abort called for command 0x25

Complete serial log is attached.

From .vbox file, the affected storage controller entry:

    <StorageControllers>
      <StorageController name="SATA" type="AHCI" PortCount="2" useHostIOCache="false" Bootable="true" IDE0MasterEmulationPort="0" IDE0SlaveEmulationPort="1" IDE1M
asterEmulationPort="2" IDE1SlaveEmulationPort="3">
        <AttachedDevice type="HardDisk" hotpluggable="false" port="0" device="0">
          <Image uuid="{8f3c1696-353c-4610-9efd-0ba785f17099}"/>
        </AttachedDevice>
        <AttachedDevice type="HardDisk" hotpluggable="false" port="1" device="0">
          <Image uuid="{0464f263-f017-4ab6-81d9-3067c404a384}"/>
        </AttachedDevice>
      </StorageController>
    </StorageControllers>

The attached vmdk images are raw disks: the boot disk is a hard disk partition, the other is a microsd card.

Until last week, this configuration worked. I'm not sure if one of the Haiku updates caused the problems (don't think so because if I try to boot older states from Haiku boot menu I'm still stuck at the rocket) or the VirtualBox update from version 5.0.0 to 5.0.2.

If I attach both raw vmdk disks with a virtual IDE controller I can still boot to the desktop.

Attachments (18)

hrev49564.serial.log (79.0 KB ) - added by taos 4 years ago.
Serial log of unsuccessful start of hrev49564.
Haiku 64bit.vbox (8.3 KB ) - added by taos 4 years ago.
VirtualBox configuration file.
Boot_bt_hrev49666.png (29.0 KB ) - added by taos 4 years ago.
Initial KDL after disk icon (hrev49666).
Boot_co_1_hrev49666.png (29.0 KB ) - added by taos 4 years ago.
First additional KDL after continuing (hrev49666).
Boot_co_2_hrev49666.png (29.0 KB ) - added by taos 4 years ago.
Second additional KDL after continuing (hrev49666).
Boot_co_3_hrev49666.png (29.2 KB ) - added by taos 4 years ago.
Third additional KDL after continuing (hrev49666).
Boot_co_4_hrev49666.png (35.9 KB ) - added by taos 4 years ago.
Booting stops at rocket icon after continuing (hrev49666).
hrev49666.serial.log.txt (63.2 KB ) - added by taos 4 years ago.
Serial log of unsuccessful start of hrev49666.
vmware_serial_7.1.2_ahci_slow_boot (178.1 KB ) - added by diver 4 years ago.
vmware_serial_7.1.2_pre_ahci_changes_hrev49292 (132.3 KB ) - added by diver 4 years ago.
vmware_serial_hrev49683_slow_boot (177.5 KB ) - added by diver 4 years ago.
ATA.png (137.9 KB ) - added by kallisti5 4 years ago.
kallisti5's config
hrev49684.serial.log (56.9 KB ) - added by taos 4 years ago.
Serial log of unsuccessful start of hrev49684.
VirtualBox_5.0.6_ATA.png (47.2 KB ) - added by taos 4 years ago.
Storage configuration of VirtualBox 5.0.6 under ArchLinux.
hrev49705_warm_boot.serial.log.txt (165.3 KB ) - added by taos 4 years ago.
Syslog of warm boot: hrev49648 -> hrev49705.
Haiku.vbox (19.6 KB ) - added by diver 4 years ago.
.vbox file from VirtualBox 5.0.8
hrev49721_good.serial.log (71.5 KB ) - added by taos 4 years ago.
Serial log of successful cold boot of hrev49721.
hrev49721_bad.serial.log (66.4 KB ) - added by taos 4 years ago.
Serial log of failed cold boot of hrev49721.

Download all attachments as: .zip

Change History (44)

by taos, 4 years ago

Attachment: hrev49564.serial.log added

Serial log of unsuccessful start of hrev49564.

by taos, 4 years ago

Attachment: Haiku 64bit.vbox added

VirtualBox configuration file.

comment:1 by diver, 4 years ago

Component: - GeneralDrivers/Disk

comment:2 by taos, 4 years ago

BTW, it works with SATA if "Use Host I/O Cache" is enabled (hrev49602 x86_64).

comment:3 by waddlesplash, 4 years ago

So it's still busted on hrev49602 without "Use Host I/O Cache"?

in reply to:  3 comment:4 by taos, 4 years ago

Replying to waddlesplash:

So it's still busted on hrev49602 without "Use Host I/O Cache"?

Yes. I think hrev49602 solved boot issues described in #12338 where booting stopped at the disk icon (in my case I always reach the stage of the lit up rocket icon). It seems that - apart from involvement of ahci - VirtualBox's issue seems different here and it's still possible that an update of VirtualBox caused this problem. With "Use Host I/O Cache" enabled, hrev49564 boots, too - so no change after hrev49602.

comment:5 by taos, 4 years ago

Works again with "Use Host I/O Cache" disabled (checked with hrev49664 x86_64, VirtualBox 5.04 on Arch).

comment:6 by diver, 4 years ago

It looks like ahci is broken in VMware Fusion 7.1.2, but works in 8.0. It used to work in 7.1.2 as well. Need to recheck.

comment:7 by taos, 4 years ago

Nice, with hrev49666 x86_64, booting stops again (eventually) at the rocket icon if "Use Host I/O Cache" is disabled. However, this time there's a little more information:

After the disk icon lights up, there's a visit to KDL (screenshot after "bt" command): Initial KDL after disk icon (hrev49666).

We can continue ("co" command) and get another KDL (incl. bt): First additional KDL after continuing (hrev49666).

We can continue (again) and get a new KDL (incl. bt): Second additional KDL after continuing (hrev49666).

And again: Third additional KDL after continuing (hrev49666).

After entering "co", booting continues until we're finally stuck at the rocket icon: Booting stops at rocket icon after continuing (hrev49666).

A serial log is attached as hrev49666.serial.log.txt.

Last edited 4 years ago by taos (previous) (diff)

by taos, 4 years ago

Attachment: Boot_bt_hrev49666.png added

Initial KDL after disk icon (hrev49666).

by taos, 4 years ago

Attachment: Boot_co_1_hrev49666.png added

First additional KDL after continuing (hrev49666).

by taos, 4 years ago

Attachment: Boot_co_2_hrev49666.png added

Second additional KDL after continuing (hrev49666).

by taos, 4 years ago

Attachment: Boot_co_3_hrev49666.png added

Third additional KDL after continuing (hrev49666).

by taos, 4 years ago

Attachment: Boot_co_4_hrev49666.png added

Booting stops at rocket icon after continuing (hrev49666).

by taos, 4 years ago

Attachment: hrev49666.serial.log.txt added

Serial log of unsuccessful start of hrev49666.

in reply to:  7 comment:8 by mmlr, 4 years ago

Replying to taos:

Nice, with hrev49666 x86_64, booting stops again (eventually) at the rocket icon if "Use Host I/O Cache" is disabled. However, this time there's a little more information:

This is a bit different from the original problem and likely caused by hrev49665. It's apparently a failure of establishing device communication as the syslog shows that both AHCI port 0 and 1 time out after the initial port reset and detecting device presence.

Looking at the old syslog, one can see that the eventual error there also seem to be timeouts, but only on the actual ATA requests. The initial communication seems to work as does querying the block size and verifying the boot volume.

Continuing through KDLs indicating the failure to find/mount/use the boot volume isn't really meaningful. In such a case the kernel (which was already loaded from disk with the help of the BIOS/firmware) will just boot to completion (i.e. the rocket icon). Without having the boot volume available there's then no way to load and start the userland after that so it will just halt there.

comment:9 by diver, 4 years ago

I've just tested an earlier build I had - hrev49292 - and VMware Fusion is able to boot it in ahci mode. But now it gets stuck in hrev49665.

comment:10 by kallisti5, 4 years ago

The backtraces aren't really useful, that's just the kernel freaking out about losing the disk.

Which version of VMWare Fusion are you doing these latest tests on? During my testing I used qemu, VirtualBox 5.0, real IvyBridge Intel machines, and AMD SB6xx.

Syslogs with the AHCI drive configuration + emulator information are the most helpful thing.

comment:11 by vidrep, 4 years ago

I attached a syslog to ticket #12415 which "may" be of some help.

comment:12 by diver, 4 years ago

Weird, I've tested hrev49681 again and it now boots in VmWare 7.1.2 but very slowly with 28 AHCI ports. It was very fast in hrev49292. Syslog attached.

comment:13 by kallisti5, 4 years ago

The slow boot is a known issue with large numbers of disconnected ports. We iterate over each port attempting to power it up (and the powerup is delayed by the AHCI timeouts). I have a small patch in mind to resolve that and can reproduce it in VirtualBox.

So it looks like this one isn't really an issue anymore? (and we can open a ticket for the slow boot and re-open #12415?)

comment:14 by diver, 4 years ago

I've just tried your changes done in hrev49683 and they didn't change anything wrt vmware at least. BTW, #12415 is still open.

comment:15 by taos, 4 years ago

Booting hrev49684 x86_64, I still get the same KDL with VirtualBox 5.0.6 and "Use Host I/O Cache" disabled as described in comment:7.

comment:16 by kallisti5, 4 years ago

Really strange... I just can't reproduce that issue here in VirtualBox 5.0.0. I've been booting the new code all morning with Host I/O cache disabled.

by kallisti5, 4 years ago

Attachment: ATA.png added

kallisti5's config

comment:17 by kallisti5, 4 years ago

taos: Could you grab a syslog from that machine in vbox 5.0.6?

Since it's virtualbox you can configure the first serial port to write to a file on disk.

Settings -> serial ports -> enable serial port -> raw file -> /tmp/syslog.out

in reply to:  17 comment:18 by taos, 4 years ago

Replying to kallisti5:

taos: Could you grab a syslog from that machine in vbox 5.0.6?

Here it is: hrev49684.serial.log.

BTW, it now doesn't work with "Use Host I/O Cache" enabled either.

by taos, 4 years ago

Attachment: hrev49684.serial.log added

Serial log of unsuccessful start of hrev49684.

by taos, 4 years ago

Attachment: VirtualBox_5.0.6_ATA.png added

Storage configuration of VirtualBox 5.0.6 under ArchLinux.

comment:19 by taos, 4 years ago

Update: I can't cold boot hrev49705 x86_64 at all with ahci under VirtualBox 5.0.6 (it doesn't matter if "Use Host I/O Cache" is enabled or not). The only way to get it working is booting an older revision from the boot menu (hrev49648 in my case) and then restart with shutdown -r from Terminal or by choosing "Shutdown..." -> "Restart system" from the leaf menu. Syslog of warm boot is attached.

by taos, 4 years ago

Syslog of warm boot: hrev49648 -> hrev49705.

comment:20 by diver, 4 years ago

I've just tried VirtualBox 5.0.6 under OS X 10.9.5 and it works just fine with or without "Use Host I/O Cache" setting.

comment:21 by taos, 4 years ago

uname -a gives me Linux archacer 4.2.3-1-ARCH #1 SMP PREEMPT Sat Oct 3 18:52:50 CEST 2015 x86_64 GNU/Linux on my system.

@diver: Could you maybe attach your .vbox configuration file for comparison?

Last edited 4 years ago by taos (previous) (diff)

by diver, 4 years ago

Attachment: Haiku.vbox added

.vbox file from VirtualBox 5.0.8

comment:22 by vidrep, 4 years ago

Can somebody remove me from the mailing list for this ticket? Thanks!

comment:23 by taos, 4 years ago

Hrrrmmphh...

Okay, so I tried to change my configuration to better match diver's (apart from the operating system of course) by adjusting chipset and pointing device, and still couldn't cold boot haiku. After that I updated VirtualBox to version 5.0.8.

Some statistics for "Use Host I/O Cache" enabled:

  • Cold booting hrev49705 still doesn't work and the workaround (booting an older version then warm booting hrev40705) seems to have stopped working. I've updated haiku via pkgman from the older version.
  • Cold booting hrev49719 worked 4 times out of 4, warm booting (from hrev49719) worked 1 time out of 2.
  • Cold booting hrev49721 worked 3 times out of 8 (serial log attached as hrev49721_good.serial.log), warm booting from hrev49719 never, from hrev49721 1 time out of 4 and from hrev49648 never (tried 2 times) (see hrev49721_bad.serial.log for an unsuccessful cold boot).

With "Use Host I/O Cache" disabled, I've never succeeded.

Last edited 4 years ago by taos (previous) (diff)

by taos, 4 years ago

Attachment: hrev49721_good.serial.log added

Serial log of successful cold boot of hrev49721.

by taos, 4 years ago

Attachment: hrev49721_bad.serial.log added

Serial log of failed cold boot of hrev49721.

comment:24 by taos, 4 years ago

With VirtualBox 5.0.10 or 5.0.12 (I think I haven't updated to 5.0.14 yet), it doesn't matter if "Use Host I/O Cache" is enabled or not, booting does no longer stop at the rocket icon. However, the "get_boot_partitions failed" at the lit disk icon stage still happens (roughly during every second boot or so).

comment:25 by taos, 4 years ago

Since the initial problem is gone after updating haiku and VirtualBox, shouldn't we close this ticket? Additionally, I haven't seen the "get_boot_partitions failed" KDL for a while...

comment:26 by kallisti5, 3 years ago

Resolution: fixed
Status: newclosed

seems fixed with recent ahci reworks.

Note: See TracTickets for help on using tickets.