Opened 2 years ago
Last modified 16 months ago
#17852 new bug
VirtualBox KDL "AT Keyboard 1 watcher" on boot
Reported by: | avanspector | Owned by: | nobody |
---|---|---|---|
Priority: | high | Milestone: | Unscheduled |
Component: | Drivers/Input/PS2/Keyboard | Version: | R1/Development |
Keywords: | Cc: | pulkomandy | |
Blocked By: | Blocking: | #17106, #17873, #18524 | |
Platform: | x86-64 |
Description
Both livecd and installed systems.
Attachments (2)
Change History (24)
by , 2 years ago
Attachment: | Снимок экрана 2022-07-27 181008.png added |
---|
follow-ups: 2 3 comment:1 by , 2 years ago
Component: | - General → Drivers/Input/PS2/Keyboard |
---|
comment:2 by , 2 years ago
comment:3 by , 2 years ago
Replying to waddlesplash:
What hrev, please?
Turns out it was crashing, because i gave the VM only 1 core. If give it 2 cores, works fine. Btw, is it necessary to be that way? Thought system is pretty light on things like this
comment:4 by , 2 years ago
It definitely shouldn't crash with only one core. In fact if anything I would expect a crash like this to happen with 2+ cores and not just one. That's extremely strange...
comment:5 by , 2 years ago
Blocking: | 17873 added |
---|
comment:6 by , 2 years ago
Cc: | added |
---|
There is now another report of this issue. CC'ing PulkoMandy who recently modified the PS/2 keyboard code, perhaps something was "shaken loose" there?
comment:7 by , 2 years ago
One thing I'm confused about.
The error says it is at address 0x864c6e8f in kernel space.
By looking at the register dump I see a register set to 0xffffffff864c6e8f (that would be a pointer to something allocated on the stack I think).
Was that pointer accidentally cast to 32 bits somewhere? Is that just a KDL display problem/quirk or is that really the root cause of this panic?
Can we get a disassembly of the code that's being run? (type "dis" at the KDL prompt).
comment:8 by , 2 years ago
Anyone who can reproduce this issue reliably, can you reply to PulkoMandy's request above (i.e. run "dis" at the KDL prompt and upload a new KDL log?)
by , 22 months ago
Attachment: | VirtualBox_Haiku_25_01_2023_18_21_22.png added |
---|
Running in Virtualbox 6.1, Windows 10 64, MSI laptop. Unable to perform dis, VM is frozen immediately after banner screen. Can only restart or shut down.
comment:9 by , 22 months ago
That makes sense, since the keyboard driver isn't working. Can VirtualBox emulate a serial port and allow sending the commands there?
comment:10 by , 22 months ago
it looks like the crash happens at https://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/bus_managers/ps2/ps2_dev.cpp#n434
r13
contains 0x100000000 which theorically shouldn't be a problem because the low 32-bit is 0 (the first output byte index), but on x86_64 the generated code is:
movzbl (%rax,%r13,1),%edi
so it uses the 64-bit value leading to an address like 0x864c6e8f.
The question is why r13
comes to the value of 0x100000000
. it is being set to 0xffffffffffffffff
, then add one.
This could simply be another VirtualBox bug. The exact VirtualBox version where this is reproduced would be interessant, while also trying with the last 6.1.x release. I couldn't find something telling in the changelog.
comment:11 by , 22 months ago
One case that would cause this and seems not unlikely to happen by accident. The setting of r13
uses a mov with sign extension:
From https://www.felixcloutier.com/x86/mov:
C7 /0 id MOV r/m32, imm32 MI Valid Valid Move imm32 to r/m32. REX.W + C7 /0 id MOV r/m64, imm32 MI Valid N.E. Move imm32 sign extended to 64-bits to r/m64.
As you can see, the normal version of C7 is a simple MOV. But the 64bit version with the REX.W prefix should sign-extend. This seems easy to overlook so my guess is Virtualbox somehow missed it? However I don't know how their internal works and couldn't find where they implement this instruction.
comment:12 by , 22 months ago
The problem in my case seems to be due to using a USB keyboard, since three keys of my MSI laptop are dead and can only be simulated using ALT-codes. I started VirtualBox one more time and it made it to the install screen (with the USB keys still plugged in). After the installation the error came back. I unplugged the USB keyboard and Haiku is starting up properly. I will have to replace the MSI keybed which I was avoiding due to the trouble. The laptop os out of warranty so sending it to CA is out of the question. Thanks for you attention. I hope this information helps someone.
comment:13 by , 22 months ago
For completeness, this is discussed on https://discuss.haiku-os.org/t/error-at-start-of-virtualbox-install-keyboard-related/13029 and it happened there with Virtualbox 6.1.30.
comment:14 by , 22 months ago
The host CPU would also be interessant to compare. Interessant is that clang tricks and computes everything from 0 instead of -1. We could also adjust the code to have the same behavior with gcc:
diff --git a/src/add-ons/kernel/bus_managers/ps2/ps2_dev.cpp b/src/add-ons/kernel/bus_managers/ps2/ps2_dev.cpp index 7637644427..df895c89e9 100644 --- a/src/add-ons/kernel/bus_managers/ps2/ps2_dev.cpp +++ b/src/add-ons/kernel/bus_managers/ps2/ps2_dev.cpp @@ -380,11 +380,10 @@ standard_command_timeout(ps2_dev* dev, uint8 cmd, const uint8* out, bigtime_t start; #endif int32 sem_count; - int i; TRACE("ps2: ps2_dev_command cmd 0x%02x, out-count %d, in-count %d, " "dev %s\n", cmd, out_count, in_count, dev->name); - for (i = 0; i < out_count; i++) + for (int i = 0; i < out_count; i++) TRACE("ps2: ps2_dev_command tx: 0x%02x\n", out[i]); res = get_sem_count(dev->result_sem, &sem_count); @@ -402,7 +401,7 @@ standard_command_timeout(ps2_dev* dev, uint8 cmd, const uint8* out, dev->result_buf = in; res = B_OK; - for (i = -1; res == B_OK && i < out_count; i++) { + for (int i = 0; res == B_OK && i < out_count + 1; i++) { atomic_and(&dev->flags, ~(PS2_FLAG_ACK | PS2_FLAG_NACK | PS2_FLAG_GETID | PS2_FLAG_RESEND)); @@ -422,7 +421,7 @@ standard_command_timeout(ps2_dev* dev, uint8 cmd, const uint8* out, res = ps2_wait_write(); if (res == B_OK) { - if (i == -1) { + if (i == 0) { if (cmd == PS2_CMD_GET_DEVICE_ID) atomic_or(&dev->flags, PS2_FLAG_CMD | PS2_FLAG_GETID); else if (cmd == PS2_CMD_RESEND) @@ -431,7 +430,7 @@ standard_command_timeout(ps2_dev* dev, uint8 cmd, const uint8* out, atomic_or(&dev->flags, PS2_FLAG_CMD); ps2_write_data(cmd); } else { - ps2_write_data(out[i]); + ps2_write_data(out[i - 1]); } } @@ -485,7 +484,7 @@ standard_command_timeout(ps2_dev* dev, uint8 cmd, const uint8* out, #ifdef TRACE_PS2_DEV TRACE("ps2: ps2_dev_command wait for input res 0x%08" B_PRIx32 ", " "wait-time %" B_PRId64 "\n", res, system_time() - start); - for (i = 0; i < in_count; i++) + for (int i = 0; i < in_count; i++) TRACE("ps2: ps2_dev_command rx: 0x%02x\n", in[i]); #endif }
comment:15 by , 22 months ago
I asked on Mastodon and someone confirmed they can reproduce the problem in VirtualBox 1.6.30 but not in the latest 7.x release.
https://mastodon.tetaneutral.net/@jookia@social.tchncs.de/109757725578214293
So, yes, we can workaround it, but it's likely that this version of Virtualbox will fail in other places as well?
comment:16 by , 22 months ago
Looks like Virtualbox fixes alike bugs lately, ie https://www.virtualbox.org/changeset/96860/vbox/trunk/
I suppose the bug isn't trivially in other places, because normally this would run on the cpu, but in this case, it looks like the code is interpreted by virtualbox.
comment:17 by , 22 months ago
Yes, reading the virtualbox manual (for version 6) it says that they will revert to software emulation for example when interrupts are disabled, and a lot more for kernel code than for user code.
I was looking at the code for the MOV instruction initializing R13 but did not find anything too strange. It is implemented here: https://www.virtualbox.org/browser/vbox/trunk/src/VBox/VMM/VMMAll/IEMAllInstructionsOneByte.cpp.h?rev=96860#L6473
The access to the operand is done using IEM_OPCODE_GET_NEXT_S32_SX_U64 which can be implemented in various different ways depending on both compile time and runtime factors. All it takes is one of these variants being incorrect. I checked some of them and they look ok, relying on the C compiler to do the sign extension by doing some casts between different types.
In VBox 1.6.30 this macro was defined in IEMAll.cpp but it is not there anymore. It ended up calling functions like https://www.virtualbox.org/browser/vbox/trunk/src/VBox/VMM/VMMAll/IEMAll.cpp?rev=96860#L1419 (or the non-slow variant if possible).
comment:18 by , 22 months ago
unofficial mirror if this helps: https://github.com/mdaniel/virtualbox-org-svn-vbox-trunk
comment:19 by , 22 months ago
Based on the stack trace (and the panic message confirms this), we should not be running in "interrupts disabled" mode here. So, are we in fact doing so? That sounds wrong. Alternatively, is there some other reason VirtualBox has chosen to use emulation instead of the recompiler? Maybe we can use this to identify why Haiku is so slow in VirtualBox.
comment:20 by , 22 months ago
Here is the full page of documentation about software virtualization. Interrupts are just one of many things that can trigger it: https://docs.oracle.com/en/virtualization/virtualbox/6.0/admin/swvirt-details.html
Note that this seems to be gone in version 7 of Virtualbox, I think the usage of hardware technologies largely replaced it.
comment:21 by , 16 months ago
Blocking: | 17106 added |
---|
comment:22 by , 16 months ago
Blocking: | 18524 added |
---|
What hrev, please?