Opened 3 years ago
Closed 2 years ago
#17595 closed bug (fixed)
Intel 11th gen slow performance
Reported by: | lkernan | Owned by: | tqh |
---|---|---|---|
Priority: | blocker | Milestone: | R1/beta4 |
Component: | Drivers/ACPI | Version: | R1/Development |
Keywords: | i5-1135G7 i7-1165G7 i7-1185G7 | Cc: | |
Blocked By: | Blocking: | #17594 | |
Platform: | x86-64 |
Description
Tonight I decided to do an install of the latest nightly on my Intel NUC 11th generation. It's the i3 model with an NVME boot drive that I installed Haiku onto.
From the moment it started the system fan went to full speed and even just moving the mouse or clicking menus was incredibly slow. Menus need multiple clicks before they respond. The task list menu showed the acpi_task item at 100%.
I saved a copy of syslog to a USB and it contains the following lines repeated continuously.
KERN: set MTRRs to:
KERN: mtrr: 0: base: 0x40000000, size: 0x1000000, type: 0
KERN: mtrr: 1: base: 0x60000000, size: 0x20000000, type: 0
KERN: mtrr: 2: base: 0x80000000, size: 0x80000000, type: 0
KERN: mtrr: 3: base: 0x603d180000, size: 0x20000, type: 0
KERN: mtrr: 4: base: 0x4000000000, size: 0x10000000, type: 1
Attachments (17)
Change History (66)
by , 3 years ago
Attachment: | syslog.txt added |
---|
by , 3 years ago
Attachment: | listdev.txt added |
---|
by , 3 years ago
Attachment: | listusb.txt added |
---|
comment:1 by , 3 years ago
Component: | - General → Drivers/ACPI |
---|---|
Owner: | changed from | to
Version: | R1/beta3 → R1/Development |
comment:2 by , 3 years ago
Sounds like another machine that does locking between firmware and hardware differently, which means the embedded controller needs new lock handling. Surprised that Intel would do that, so maybe there is a newer standard.
Anyone interested in working on the embedded controller in ACPI should take a look. Our embedded controller code is similar to FreeBSD's but hasn't been synced in ages. As my test machines work fine, and Lenovo does things in their own non-standard way, it is up for grabs for someone who wants to work on edge cases and a lot of strange hardware.
by , 3 years ago
Attachment: | syslog-noacpi.txt added |
---|
comment:3 by , 3 years ago
I've added a syslog with ACPI disabled, hopefully that helps. That certainly stops the CPU usage, but the sluggish mouse input is still going on.
comment:4 by , 3 years ago
Interestingly, my new Framework laptop has the same issue. In my case the mouse and system move *really* slowly. When I move the cursor, it slowly creeps in the direction I moved it.
Same result with a USB attached mouse and the internal trackpad. I checked the unhandled interrupt counts and it seems normal.
My system is based on the 11th gen i7-1165G7
comment:5 by , 3 years ago
Framework i7-1165G7:
- Same issue seen booting from USB and internal NVMe
- Booting via UEFI
- Disabling ACPI in the bootloader doesn't fix the issue
- Disabling SMAP and SMEP doesn't fix the issue
- Disabling SMP doesn't fix the issue
- Disabling IO-APIC doesn't fix the issue
- Disabling local APIC breaks boot (stalled on card icon)
- Disabling X2APIC doesn't fix the issue
comment:6 by , 3 years ago
Blocking: | 17594 added |
---|
comment:7 by , 3 years ago
Keywords: | i5-1135G7 i7-1165G7 i7-1185G7 added |
---|
Looks like we're starting to get a flood of these issues. The "sluggish mouse" issue seems to impact all Intel 11th Gen hardware from multiple vendors.
comment:8 by , 3 years ago
Milestone: | Unscheduled → R1/beta4 |
---|---|
Platform: | All → x86-64 |
Priority: | normal → blocker |
comment:9 by , 3 years ago
Summary: | Intel NUC 11th gen slow performance → Intel 11th gen slow performance |
---|
comment:10 by , 3 years ago
Witnessed in #17594 is a CPU speed of "45.59 Ghz" in Pulse. That suggests this one is system timer related. We saw a similar issue on riscv64 when the system time speed was miscalculated.
I just confirmed on my framework laptop that pulse shows "41.68 Ghz"
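For illustration only, here is a minimal sketch of why a bad time reference would show up as an absurd CPU speed. It assumes the frequency is estimated by dividing a cycle-counter delta by the time elapsed according to some reference timer. `estimate_hz` and all numbers below are hypothetical and are not taken from Haiku's bootloader.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not Haiku code): estimate a clock frequency in Hz
// from a cycle-counter delta and the elapsed time in microseconds as
// reported by a reference timer.
uint64_t estimate_hz(uint64_t cycle_delta, uint64_t elapsed_us)
{
    assert(elapsed_us > 0);  // a zero interval cannot be scaled
    return cycle_delta * 1000000ULL / elapsed_us;
}

// Illustrative numbers: a 2.8 GHz counter running for 15000 us
// advances by 42,000,000 cycles.
constexpr uint64_t kCyclesIn15Ms = 42000000ULL;
```

With these made-up numbers, `estimate_hz(kCyclesIn15Ms, 15000)` gives 2.8 GHz, while a reference timer that wrongly reports only 1000 µs for the same interval gives 42 GHz, the same order of magnitude as the values Pulse shows here.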
comment:12 by , 3 years ago
I disabled the intel_pstates and intel_cstates power management add-ons. No change in behaviour (technically it's worse... no mouse movement at all)
Feels like the base system timer calculation is failing. The cstates / pstates stuff might be messing with the system speed making it "better".
comment:13 by , 3 years ago
I enabled tracing for the timer components in tree. syslog shows a long loop of:
arch_timer: arch_timer_set_hardware_timer: timeout 80295
arch_timer: arch_timer_set_hardware_timer: timeout 2000
apic: arch_smp_set_apic_timer: config 251, timeout 24571, tics/sec 570200000, tics 1140400
arch_timer: arch_timer_set_hardware_timer: timeout 80295
arch_timer: arch_timer_set_hardware_timer: timeout 2000
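As a sanity check on these numbers, the sketch below converts a timeout in microseconds into timer ticks at a given tick rate. This is plain arithmetic under the assumption that the conversion is a simple scaling. `timeout_to_ticks` is a hypothetical name, not the kernel's function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the kernel's code): scale a timeout in
// microseconds to a tick count for a timer running at ticks_per_sec.
uint64_t timeout_to_ticks(uint64_t timeout_us, uint64_t ticks_per_sec)
{
    assert(ticks_per_sec > 0);  // a stopped timer has no meaningful count
    return timeout_us * ticks_per_sec / 1000000ULL;
}
```

`timeout_to_ticks(2000, 570200000)` yields 1140400, which matches the "tics" value in the apic line. That is consistent with the 2000 µs timeout being programmed at the reported 570200000 tics/sec, so the suspicious input appears to be the tick rate rather than the scaling.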
comment:14 by , 3 years ago
Raising my tracing up to kernel timer.cpp...
pages and pages of...
timer_Interrupt: calling hook 0xffffffff8009db90 for event 0xffffffff801b8888
add_timer: event 0xffffffff82e24668
cancel_timer: event 0xffffffff801b8888
add_timer: event 0xffffffff801b8888
cancel_timer: event 0xffffffff801b8d88
cancel_timer: event 0xffffffff801b8888
. .
timer_interrupt: time 14662330, cpu 5
timer_interrupt: timer 14662435, cpu 1
. .
comment:15 by , 3 years ago
Are we talking about the CPU frequency measurements in https://git.haiku-os.org/haiku/tree/src/system/boot/arch/x86/arch_cpu.cpp or does this CPU freq value come from something else?
Since these CPUs are possibly the first to go past the 2^32 Hz (4GHz) barrier, it wouldn't be surprising if there's simply an overflow somewhere?
comment:16 by , 3 years ago
4Ghz is really nothing special.
- 4,000,000,000 is 4Ghz in Hz.
- 2,147,483,647 is the maximum 32-bit unsigned int
- My Ryzen 9 5950X boosts to 4.9Ghz and works just fine with our code today.
- The i7 in question has a base clock of 2.80 GHz, and boosts to 4.7Ghz.
Maybe the gTimeConversionFactor is over 32-bits, but that doesn't make a lot of sense. https://git.haiku-os.org/haiku/tree/src/system/boot/arch/x86/arch_cpu.cpp#n233
comment:17 by , 3 years ago
I guess I really need the value from: https://git.haiku-os.org/haiku/tree/src/system/boot/arch/x86/arch_cpu.cpp#n296
comment:18 by , 3 years ago
2,147,483,647 is the maximum 32-bit unsigned int
That's signed.
The max value for unsigned is 4294967295 (a bit more than 4GHz). And yes, it could be Intel-specific.
There are various places where we deal with frequencies. For example the CPUID instruction also returns one: Processor Frequency Information Leaf in CPUID (see https://wukl.net/asm/x86/instr/CPUID), dynamic frequency measurements, ...
I'm trying to understand which one of these is wrong first.
(I don't have the hardware yet, I ordered a new laptop but it was not delivered today as expected...)
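To make the 32-bit limits discussed above concrete, here is a small sketch that checks whether a frequency in Hz fits in an unsigned 32-bit integer and what is left after truncation. The helper names are hypothetical, and nothing here claims that Haiku actually stores the frequency in 32 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Limits of the two 32-bit integer types, widened to 64 bits for comparison.
constexpr uint64_t kInt32Max = std::numeric_limits<int32_t>::max();
constexpr uint64_t kUint32Max = std::numeric_limits<uint32_t>::max();

// Hypothetical helper: true if a frequency in Hz fits in a uint32_t.
constexpr bool fits_in_uint32(uint64_t hz)
{
    return hz <= kUint32Max;
}

// Hypothetical helper: the value that survives if the frequency is
// stored in a uint32_t anyway (the upper 32 bits are dropped).
constexpr uint32_t truncate_to_uint32(uint64_t hz)
{
    return static_cast<uint32_t>(hz);
}
```

A 4.0 GHz value still fits in an unsigned 32-bit integer, while the 4.7 GHz boost clock mentioned earlier does not and would truncate to 405032704 Hz. An overflow of this kind would make the reported speed too low, not ten times too high.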
comment:19 by , 3 years ago
Confirmed that Pulse gets its value from get_cpu_info, which gets it from the bootloader. So it is indeed the arch_cpu code that is not working as expected.
follow-up: 21 comment:20 by , 3 years ago
I have received my new laptop. Fujitsu U7311 with Core i7 1165G7 CPU.
No slowness problem as far as I can see, the machine is working fine. I did not run compilation jobs yet, but played a bit with Mandelbrot to stress the CPU. It's working well.
So, this is not specific to the CPU, there's something else involved.
comment:21 by , 3 years ago
Replying to pulkomandy:
No slowness problem as far as I can see, the machine is working fine. I did not run
with x86_64 and EFI I suppose?
comment:24 by , 3 years ago
There are various options in the bios, including for switching between performance and powersaving modes. I will make some experiments after I have finished installing the machine and transferring my data from the previous one.
comment:28 by , 3 years ago
@pulkomandy Are you able to enable CSM on your machine? The two laptops I tested were class 3 only
comment:29 by , 3 years ago
@pulkomandy Are you able to enable CSM on your machine? The two laptops I tested were class 3 only
I have switched "Fast Boot" off, it makes no difference.
follow-up: 31 comment:30 by , 3 years ago
I've done some more testing on my NUC with the latest nightly image.
Firstly, About System shows it as an "Intel Gen Intel® Core™ i3 1115G4 39.42 Ghz"
It's also one of the ones where Intel hasn't included CSM support in the EFI anymore. It's not possible to turn on BIOS compatibility.
One thing that I have tried was to turn off "High Precision Event Timers" in the UEFI setup. That didn't totally fix the mouse sluggishness, but it made it way closer to normal. The mouse moves better with that off, but I still have to hold the button for about 3 seconds for it to register a click in a menu.
comment:31 by , 3 years ago
comment:32 by , 3 years ago
korli: It seems the same behaviour is there on hrev55969 (had to manually compile)
by , 3 years ago
Attachment: | syslog-huawei-55969.txt added |
---|
comment:33 by , 3 years ago
This issue still exists as of hrev56087 on my Framework laptop. The trackpad is having timing / movement issues, and the CPU is reported as 41.16 Ghz.
follow-up: 35 comment:34 by , 3 years ago
Please check with https://haiku.movingborders.es/testbuild/Ifae8f2cea5aadc46b7591c4debc2dd247a787fb1/1/hrev56134/x86_64/ whether it helps.
comment:35 by , 2 years ago
Replying to korli:
Please check with https://haiku.movingborders.es/testbuild/Ifae8f2cea5aadc46b7591c4debc2dd247a787fb1/1/hrev56134/x86_64/ whether it helps.
Sadly the same behavior persists as in the first post (no change). Thank you nevertheless
follow-up: 38 comment:36 by , 2 years ago
Thanks. Please try again with this image: https://haiku.movingborders.es/testbuild/Ifae8f2cea5aadc46b7591c4debc2dd247a787fb1/2/hrev56145/x86_64/ The EFI bootloader smp logs are enabled. Hopefully they'll end up in the syslog.
comment:37 by , 2 years ago
(Note that you will need to test with the full anyboot image in order to get the new logs from the bootloader.)
comment:38 by , 2 years ago
Replying to korli:
Thanks. Please try again with this image: https://haiku.movingborders.es/testbuild/Ifae8f2cea5aadc46b7591c4debc2dd247a787fb1/2/hrev56145/x86_64/ The EFI bootloader smp logs are enabled. Hopefully they'll end up in the syslog.
The same behavior as previous builds, I will upload the syslogs shortly
by , 2 years ago
Attachment: | syslog_huawei_30522.txt added |
---|
by , 2 years ago
Attachment: | screenshot30522.png added |
---|
by , 2 years ago
Attachment: | syslog30522afterconnectingmouse.txt added |
---|
follow-up: 40 comment:39 by , 2 years ago
Thanks. I enabled the wrong logs. Please try again with this image: https://haiku.movingborders.es/testbuild/Ifae8f2cea5aadc46b7591c4debc2dd247a787fb1/3/hrev56145/x86_64/
comment:40 by , 2 years ago
Replying to korli:
Thanks. I enabled the wrong logs. Please try again with this image: https://haiku.movingborders.es/testbuild/Ifae8f2cea5aadc46b7591c4debc2dd247a787fb1/3/hrev56145/x86_64/
No problem, new syslog should be attached shortly.
by , 2 years ago
Attachment: | syslog-31522.txt added |
---|
comment:41 by , 2 years ago
Thanks. I disabled gpt logs, it seems the cpu logs come first: https://haiku.movingborders.es/testbuild/Ifae8f2cea5aadc46b7591c4debc2dd247a787fb1/5/hrev56146/x86_64/
by , 2 years ago
Attachment: | hrev56253 Intel i7 16gb ram,png.png added |
---|
by , 2 years ago
Attachment: | hrev56253 syslog added |
---|
by , 2 years ago
Attachment: | listdev hrev56253 Intel i7 16gb ram.txt added |
---|
by , 2 years ago
Attachment: | listusb hrev56253 Intel i7 16gb ram.txt added |
---|
comment:42 by , 2 years ago
Hello, sorry for the late reply, it seems the builds have been erased, however I compiled the latest git with the patch 0001-kernel-libroot-apply-a-shift-on-rdtsc-on-higher-TSC-.patch
Syslog should be attached soon
by , 2 years ago
Attachment: | syslog 14922.txt added |
---|
comment:43 by , 2 years ago
KERN: calculating apic timer conversion factor
KERN: APIC ticks/sec = 463028571
comment:44 by , 2 years ago
It seems for modern CPUs, Linux does not do this calculation anymore and they instead get the frequency from CPUID, see the various messages in this thread: https://lore.kernel.org/lkml/tip-2420a0b1798d7a78d1f9b395f09f3c80d92cc588@git.kernel.org/
The detected frequency of 463MHz seems strange. Shouldn't this be the same as the CPU clock? Wouldn't it be useful to log the t1, t2 and count values in this function?
comment:45 by , 2 years ago
According to the Intel docs, the APIC timer should not be that fast: it should be around 20-25MHz. They also document that the frequency can be read from the CPUID instruction when it is available there.
Source: section 10.5.4 of Intel architecture software developer manual, volume 3A (part 1).
So, should we try to get the frequency from there instead of measuring it?
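A rough sketch of what reading the frequency from CPUID could look like, assuming a GCC or Clang toolchain with `<cpuid.h>` and a CPU that enumerates leaf 0x15 (EAX = ratio denominator, EBX = ratio numerator, ECX = core crystal clock in Hz). The `Leaf15` struct and both function names are hypothetical. This is not the fix that was applied to Haiku.

```cpp
#include <cassert>
#include <cstdint>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>  // GCC/Clang helper for the CPUID instruction
#define HAVE_CPUID_H 1
#endif

// Raw contents of CPUID leaf 0x15 (TSC / core crystal clock information).
struct Leaf15 {
    uint32_t denominator;  // EAX
    uint32_t numerator;    // EBX
    uint32_t crystal_hz;   // ECX, 0 if the CPU does not report it
};

// TSC frequency = crystal_hz * numerator / denominator.
// Returns 0 if any field is not enumerated, so callers can fall back
// to measuring the frequency instead.
uint64_t tsc_hz_from_leaf15(const Leaf15& leaf)
{
    if (leaf.denominator == 0 || leaf.numerator == 0 || leaf.crystal_hz == 0)
        return 0;
    return static_cast<uint64_t>(leaf.crystal_hz) * leaf.numerator
        / leaf.denominator;
}

// Reads leaf 0x15 from the running CPU. Returns false when the leaf or
// the CPUID helper is unavailable.
bool read_leaf15(Leaf15& out)
{
#ifdef HAVE_CPUID_H
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(0x15, 0, &eax, &ebx, &ecx, &edx))
        return false;
    out = Leaf15{eax, ebx, ecx};
    return true;
#else
    (void)out;
    return false;
#endif
}
```

With illustrative values (not read from the hardware in this ticket) of a 38.4 MHz crystal and a 146/2 ratio, `tsc_hz_from_leaf15` returns 2803200000 Hz. When the leaf is enumerated, the crystal frequency in ECX is also the clock the APIC timer counts at before the divide configuration is applied, which is why a measured 463 MHz looks implausible.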
comment:47 by , 2 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:48 by , 2 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
comment:49 by , 2 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Are you able to boot with ACPI disabled in the bootloader? Possibly a dupe of #14784.