Opened 16 months ago
Closed 14 months ago
#18541 closed bug (duplicate)
System hangs on rocket icon after upgrade from hrev57193 to hrev57199
Reported by: | Starcrasher | Owned by: | nobody |
---|---|---|---|
Priority: | high | Milestone: | R1/beta5 |
Component: | Drivers/Network/ipro1000 | Version: | R1/beta4 |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | #18593 |
Platform: | All | | |
Description
After updating my 64-bit nightly VM (QEMU+KVM) from hrev57193 to hrev57199, the system hangs after ipro1000 init. Here are the last lines in the console:
pci_reserve_device(0, 3, 0, ipro1000)
if_initname(0xffffffff8986f800, em, 24)
[ipro1000] ipro1000: /dev/net/ipro1000/0
[ipro1000] (em) attach_pre capping queues at 1
[ipro1000] (em) bus_alloc_resource(3, [16], 0x0, 0xffffffffffffffff, 0x1,0x2)
set MTRRs to:
mtrr: 0: base: 0x7ffe0000, size: 0x20000, type: 0
mtrr: 1: base: 0xf8000000, size: 0x8000000, type: 0
mtrr: 2: base: 0x80000000, size: 0x80000000, type: 1
[ipro1000] (em) bus_alloc_resource(4, [20], 0x0, 0xffffffffffffffff, 0x1,0x2)
[ipro1000] (em) Using 1024 TX descriptors and 1024 RX descriptors
[ipro1000] (em) allocated for 1 tx_queues
[ipro1000] (em) allocated for 1 rx_queues
[ipro1000] (em) bus_alloc_resource(1, [0], 0x0, 0xffffffffffffffff, 0x1,0x6)
if_attach 0xffffffff8945fd20
Attachments (3)
Change History (29)
by , 16 months ago
Attachment: | syslog-57193.txt added |
---|
comment:1 by , 16 months ago
It seems that ipro1000 doesn't even fully initialize. These are the same lines from a boot where it works. You can see two more lines about ipro1000 before the package daemon output starts.
pci_reserve_device(0, 3, 0, ipro1000)
if_initname(0xffffffff897be000, em, 24)
[ipro1000] ipro1000: /dev/net/ipro1000/0
[ipro1000] (em) attach_pre capping queues at 1
[ipro1000] (em) bus_alloc_resource(3, [16], 0x0, 0xffffffffffffffff, 0x1,0x2)
set MTRRs to:
mtrr: 0: base: 0x7ffe0000, size: 0x20000, type: 0
mtrr: 1: base: 0xf8000000, size: 0x8000000, type: 0
mtrr: 2: base: 0x80000000, size: 0x80000000, type: 1
[ipro1000] (em) bus_alloc_resource(4, [20], 0x0, 0xffffffffffffffff, 0x1,0x2)
[ipro1000] (em) Using 1024 TX descriptors and 1024 RX descriptors
[ipro1000] (em) allocated for 1 tx_queues
[ipro1000] (em) allocated for 1 rx_queues
[ipro1000] (em) bus_alloc_resource(1, [0], 0x0, 0xffffffffffffffff, 0x1,0x6)
if_attach 0xffffffff898c2920
ipro1000: init_driver(0xffffffff81c39a70)
loaded driver /boot/system/add-ons/kernel/drivers/dev/net/ipro1000
package_daemon: [3323706: 315] 2023-08-06 15:54:25 KERN: latest volume state: ...
Will try on my USB key to see if it also happens on real hardware.
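The comparison made in this comment can be automated. What follows is a minimal Python sketch, assuming both serial logs have been saved with one message per line. The helper names `normalize` and `missing_lines` are hypothetical, and pointer values are masked because they differ from boot to boot.

```python
import re

# Kernel addresses differ between boots, so mask long hex values before comparing.
POINTER = re.compile(r"0x[0-9a-fA-F]{8,}")


def normalize(line):
    """Strip whitespace and replace pointer-sized hex values with a placeholder."""
    return POINTER.sub("0xADDR", line.strip())


def missing_lines(working, hanging):
    """Return normalized lines of the working log that never appear in the hanging one."""
    seen = {normalize(line) for line in hanging}
    return [n for n in map(normalize, working) if n and n not in seen]


# Tails of the two logs quoted in this ticket (shortened).
hanging_tail = [
    "[ipro1000] (em) allocated for 1 rx_queues",
    "if_attach 0xffffffff8945fd20",
]
working_tail = [
    "[ipro1000] (em) allocated for 1 rx_queues",
    "if_attach 0xffffffff898c2920",
    "ipro1000: init_driver(0xffffffff81c39a70)",
    "loaded driver /boot/system/add-ons/kernel/drivers/dev/net/ipro1000",
]

for line in missing_lines(working_tail, hanging_tail):
    print(line)
```

With the shortened tails above, only the `init_driver` and `loaded driver` messages are reported, which matches the two extra lines seen on a working boot.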
comment:3 by , 16 months ago
I booted on VMware with ipro1000 on builds up through hrev57197 during testing, so I guess the GCC 13 upgrade is the most likely culprit to have broken something.
comment:4 by , 16 months ago
I updated my USB key to hrev57199 and everything worked fine on real hardware, but ipro1000 is not in use there. Both the 32-bit and 64-bit installs were a bit slow to start, but that is probably due to package installation scripts, as I hadn't updated them for a while.
Following your comment, I tested the hrev57197 nightly image in a VM and, indeed, it was still working at that stage.
comment:5 by , 16 months ago
Component: | - General → Drivers/Network/ipro1000 |
---|
So, I can't reproduce this problem when using ipro1000 in either VMware or QEMU.
What QEMU version are you running? What's the exact device specified in the QEMU command line? Can you try and get a KDL backtrace? Anything different in QEMU without KVM?
comment:6 by , 16 months ago
I tested on bare metal. ipro1000 works fine, downloaded a bunch of stuff.
comment:7 by , 16 months ago
QEMU emulator version 5.2.0 (qemu-5.2.0-4.mga8)
<interface type="network">
  <mac address="52:54:00:5e:79:a4"/>
  <source network="Haiku-Net"/>
  <model type="e1000"/>
  <address type="pci" domain="0x0000" bus="0x00" slot="0x03" function="0x0"/>
</interface>
It doesn't crash; it just hangs forever without printing any further output.
Well, the advantage is that with KVM you can use the virtual machine manager GUI https://virt-manager.org/ so there's no command line to deal with. The only thing that I had to set up manually was Shorewall config otherwise traffic is blocked and you can't access the net.
If it's a problem on my side then I will try to install a new VM with a recent nightly image. I have no important stuff on that VM. It allows me, from time to time, to try an app or to check the exact location of a sentence when I translate things on polyglot.
comment:8 by , 16 months ago
I'm running QEMU 7.0 here.
Can you test via the command line, and without KVM, and see if anything's different?
I may be able to glean more information by walking you through some kernel debugging steps, but that would have to happen over IRC/Matrix or something like that.
comment:9 by , 16 months ago
I'm running Haiku on a type 1 Hyper-V machine. I have screenshotted the backtrace below.
comment:10 by , 16 months ago
This is clearly a different problem. Please capture a full syslog using serial out on the VM, open a new ticket, and attach it there.
comment:11 by , 16 months ago
Well, the advantage is that with KVM you can use the virtual machine manager GUI https://virt-manager.org/ so there's no command line to deal with.
That is convenient to you, but to reproduce the issue, we need the exact configuration used, and the simplest way to get that is a command line we can copy and paste.
It is likely that virt-manager is including some command line arguments that end up causing compatibility problems, whereas qemu default settings don't.
So, do you have a way to extract the qemu command line and share that?
comment:12 by , 16 months ago
Milestone: | Unscheduled → R1/beta5 |
---|---|
Priority: | normal → high |
comment:13 by , 16 months ago
I can confirm this happens here as well. 57197 is fine, 57199 is not.
When booting under QEMU with KVM enabled:
extra data[0]: 0x0000000000000001
extra data[1]: 0x700f66026e0f660f
extra data[2]: 0x31de4db70f44e0c8
extra data[3]: 0x0000000000000031
extra data[4]: 0x0000000000000000
extra data[5]: 0x0000000000000000
extra data[6]: 0x0000000000000000
extra data[7]: 0x0000000000000000
emulation failure 64888
RAX=0000000000000e00 RBX=ffffffff81c7c000 RCX=ffffffff81c7e598 RDX=ffffffff81ddbe00
RSI=ffffffff81c7b84c RDI=0000000000000032 RBP=ffffffff81c7b930 RSP=ffffffff81c7b900
R8 =0000000000000008 R9 =0000000000000140 R10=0000000000000008 R11=000000000000007d
R12=0000000000000000 R13=ffffffff82179738 R14=ffffffff82179738 R15=0000000000000030
RIP=ffffffff81d82afc RFL=00010286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00c00000
CS =0008 0000000000000000 ffffffff 00a09900 DPL=0 CS64 [--A]
SS =0010 0000000000000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0000 0000000000000000 ffffffff 00c00000
FS =0000 00007f73678f7000 ffffffff 00c00000
GS =0000 ffffffff82780400 ffffffff 00c00000
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =00f0 ffffffff801b5a00 00000068 00008b00 DPL=0 TSS64-busy
GDT= ffffffff80210ce0 0000062f
IDT= ffffffff8020fce0 00000fff
CR0=80010031 CR2=000000b25795afb0 CR3=0000000010444000 CR4=003406e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=00 48 8b 0b 83 39 01 48 8b 51 08 0f 84 4b 01 00 00 48 01 c2 <66> 0f 6e 02 66 0f 70 c8 e0 44 0f b7 4d de 31 c0 31 f6 66 0f d6 8b 1c 01 00 00 bf ff 00 00
When booting without KVM, it starts up but doesn't get to the desktop. The serial log shows that the same ipro1000/em driver was the last thing initializing.
The version of QEMU doesn't matter. The behavior is exactly the same on:
QEMU emulator version 7.2.1
QEMU emulator version 7.2.5
QEMU emulator version 8.0.92 (v8.1.0-rc2-80-g0450cf0897)
This happens without any virt-manager involved. The command line that triggers it every time is:
qemu-system-x86_64 -usbdevice tablet -smp 12 -m 2G -net user,hostfwd=tcp::25723-:22 -net nic -drive file=haiku64.raw,format=raw -boot menu=on -vga vmware -enable-kvm -cpu host,host-phys-bits
I usually add -serial mon:stdio to capture the debug output from the serial, but it's not exciting. Everything is normal until it crashes or hangs.
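In QEMU's register dump, the `Code=` line wraps the byte located at RIP in angle brackets, so the faulting instruction bytes can be pulled out mechanically. A small Python sketch follows, assuming the `Code=` line has been copied out of the dump as a single string. `bytes_from_rip` is a hypothetical helper name, not part of any QEMU tooling.

```python
# Code= line copied from the register dump in this comment.
CODE_LINE = (
    "Code=00 48 8b 0b 83 39 01 48 8b 51 08 0f 84 4b 01 00 00 48 01 c2 "
    "<66> 0f 6e 02 66 0f 70 c8 e0 44 0f b7 4d de 31 c0 31 f6 66 0f d6 "
    "8b 1c 01 00 00 bf ff 00 00"
)


def bytes_from_rip(code_line):
    """Return the opcode bytes starting at the <xx> marker, i.e. at RIP."""
    tokens = code_line.split("=", 1)[1].split()
    # The token wrapped in angle brackets is the byte RIP points at.
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


faulting = bytes_from_rip(CODE_LINE)
print(faulting[:9].hex(" "))  # 66 0f 6e 02 66 0f 70 c8 e0
```

The leading bytes `66 0f 6e` appear to be the SSE2 `movd` encoding and `66 0f 70` the `pshufd` encoding, but a disassembler is the authoritative check.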
comment:14 by , 16 months ago
I just noticed that virtio networking is supported.
I switched from the qemu default network interface (which is e1000 or e1000e) to virtio (-net nic,model=virtio-net-pci) and it works fine. So the problem is that driver, for sure.
Just in case you need it, here are the last parts of the debug log when using the e1000 NIC, right when the ipro1000 driver crashes. I'm not sure whether it is helpful.
pci_reserve_device(0, 3, 0, ipro1000)
if_initname(0xffffffff889c9000, em, 16)
[ipro1000] ipro1000: /dev/net/ipro1000/0
[ipro1000] (em) attach_pre capping queues at 2
[ipro1000] (em) bus_alloc_resource(3, [16], 0x0, 0xffffffffffffffff, 0x1,0x2)
set MTRRs to:
mtrr: 0: base: 0x7ffe0000, size: 0x20000, type: 0
mtrr: 1: base: 0xfe000000, size: 0x2000000, type: 0
mtrr: 2: base: 0x80000000, size: 0x80000000, type: 1
[ipro1000] (em) EM_NVM_PCIE_CTRL = 0x460b
[ipro1000] (em) EEPROM V2.1-0
[ipro1000] (em) Using 1024 TX descriptors and 1024 RX descriptors
[ipro1000] (em) msix_init qsets capped at 2
[ipro1000] (em) bus_alloc_resource(3, [28], 0x0, 0xffffffffffffffff, 0x1,0x2)
set MTRRs to:
mtrr: 0: base: 0x7ffe0000, size: 0x20000, type: 0
mtrr: 1: base: 0xfe000000, size: 0x2000000, type: 0
mtrr: 2: base: 0x80000000, size: 0x80000000, type: 1
[ipro1000] (em) queue equality override not set, capping rx_queues at 1 and tx_queues at 1
[ipro1000] (em) Using 1 RX queues 1 TX queues
set MTRRs to:
mtrr: 0: base: 0x7ffe0000, size: 0x20000, type: 0
mtrr: 1: base: 0xfe000000, size: 0x2000000, type: 0
mtrr: 2: base: 0x80000000, size: 0x80000000, type: 1
allocate_io_interrupt_vectors: allocated 2 vectors starting from 24
msi_allocate_vectors: allocated 2 vectors starting from 24
msix configured for 2 vectors
[ipro1000] (em) Using MSI-X interrupts with 2 vectors
[ipro1000] (em) allocated for 1 tx_queues
[ipro1000] (em) allocated for 1 rx_queues
[ipro1000] (em) bus_alloc_resource(1, [1], 0x0, 0xffffffffffffffff, 0x1,0x2)
msi-x enabled: 0x8004
[ipro1000] (em) bus_alloc_resource(1, [2], 0x0, 0xffffffffffffffff, 0x1,0x2)
msi-x enabled: 0x8004
if_attach 0xffffffff89954b20
KVM internal error. Suberror: 1
comment:15 by , 16 months ago
I switched from e1000 to rtl8139 and it also works. So it's definitely in ipro1000 init.
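To make this kind of A/B test repeatable, the command lines can be generated so that only the NIC model (and optionally KVM) changes between runs. The Python sketch below uses a trimmed subset of the options from comment:13. The `qemu_command` helper is hypothetical, and the image name `haiku64.raw` is simply the one used in that comment.

```python
import shlex

# Base options taken from the command line in comment:13 (trimmed); adjust the
# disk image name to match your local image.
BASE = [
    "qemu-system-x86_64", "-smp", "12", "-m", "2G",
    "-drive", "file=haiku64.raw,format=raw",
    "-vga", "vmware", "-serial", "mon:stdio",
]

# NIC models mentioned in this ticket: e1000 hangs, the other two boot.
MODELS = ["e1000", "virtio-net-pci", "rtl8139"]


def qemu_command(model, kvm=True):
    """Build one QEMU invocation that differs only in NIC model and KVM use."""
    args = BASE + ["-net", "user", "-net", f"nic,model={model}"]
    if kvm:
        # "-cpu host" is only valid together with KVM, so add both or neither.
        args += ["-enable-kvm", "-cpu", "host,host-phys-bits"]
    return args


for model in MODELS:
    print(shlex.join(qemu_command(model)))
```

Each printed line can be pasted into a shell. Passing `kvm=False` gives the plain emulation variant that was asked for earlier in the ticket.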
comment:17 by , 16 months ago
There should be a nightly image built with GCC 13.2. Could you retest?
by , 16 months ago
Attachment: | Config-xml-Qemu-network.txt added |
---|
XML config file of the network in qemu GUI
comment:18 by , 16 months ago
After updating the VM to hrev57214, ipro1000 still hangs right after the if_attach message. I also tried installing a new VM from the nightly ISO of the same hrev, with the same result.
comment:19 by , 16 months ago
I can't find anything online for "emulation failure 64888".
Is there any way we can get the faulting instruction from the QEMU debugger?
comment:20 by , 15 months ago
Yes, it appears so: using the command x/16i <address> at the compat monitor. Can whoever can reproduce this please do that, using the address of RIP from the registers dump (e.g. as seen in comment:13) and paste the output here?
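For anyone unsure how to turn the dump into that command, here is a minimal Python sketch that pulls RIP out of a pasted register dump and formats the monitor command. The `monitor_command` helper is hypothetical, and the excerpt is taken from the dump in comment:13.

```python
import re

# Excerpt of the register dump from comment:13.
DUMP = """\
R12=0000000000000000 R13=ffffffff82179738 R14=ffffffff82179738 R15=0000000000000030
RIP=ffffffff81d82afc RFL=00010286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
"""


def monitor_command(dump, count=16):
    """Build the 'x/<count>i <address>' monitor command from the RIP value in a dump."""
    match = re.search(r"\bRIP=([0-9a-fA-F]{16})\b", dump)
    if match is None:
        raise ValueError("no RIP register found in dump")
    return f"x/{count}i 0x{match.group(1)}"


print(monitor_command(DUMP))  # x/16i 0xffffffff81d82afc
```

The printed line is what would be typed at the monitor prompt while the guest is stopped at the failure.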
comment:21 by , 14 months ago
Blocking: | 18593 added |
---|
comment:24 by , 14 months ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:25 by , 14 months ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:26 by , 14 months ago
Resolution: | → duplicate |
---|---|
Status: | reopened → closed |
Closing as "duplicate" instead of "fixed" as the problem has merely been mitigated.
Syslog when it's working (hrev57193)