#10784 closed bug (no change required)
My laptop turns off because of GPU overheating
| Reported by: | Premislaus | Owned by: | kallisti5 |
|---|---|---|---|
| Priority: | normal | Milestone: | R1 |
| Component: | Drivers/Graphics/radeon_hd | Version: | R1/Development |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Platform: | All | | |
Description
My laptop turns off because the GPU overheats. I also have this problem on Linux with kernels before 3.13.
I ran in VESA mode because of #9894.
I have two graphics cards in my laptop: a Radeon HD 7520G and a 7670M. My laptop is a Samsung NP355V5C-S05PL with an A6-4400M APU.
Attachments (3)
Change History (19)
by , 11 years ago
Attachment: | syslog.old added |
---|
comment:2 by , 11 years ago
On Linux with kernel 3.12 I get temperatures 20 degrees Celsius higher than in fuckin' Windows. :/
comment:3 by , 11 years ago
No problems on Ubuntu 14.04 or Windows 8.1. Ubuntu even runs cooler than Windows.
On Haiku the cooling runs at full power, but the laptop still gets hot and turns off after a few minutes.
comment:4 by , 11 years ago
It probably works "fine" in VESA mode. I have to blacklist the radeon_hd accelerant and use the fail-safe video mode.
After some time I get a KDL:
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x71162ff0, ip 0x115b196, write 1, user 1, thread 0x153d
KERN: vm_page_fault: thread "w:846:offscreen" (5437) in team "app_server" (520) tried to write address 0x71162ff0, ip 0x115b196 ("app_server_seg0ro" +0x82196)
KERN: debug_server: Thread 5437 entered the debugger: Segment violation
KERN: stack trace, current PC 0x115b196 HasClipping__C9DrawState + 0x6:
KERN: (0x71163008) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163038) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163068) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163098) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711630c8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711630f8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163128) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163158) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163188) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711631b8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711631e8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163218) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163248) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163278) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711632a8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711632d8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163308) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163338) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163368) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163398) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711633c8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711633f8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163428) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163458) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163488) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711634b8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711634e8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163518) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163548) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163578) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711635a8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711635d8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163608) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163638) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163668) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163698) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711636c8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711636f8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163728) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163758) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163788) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711637b8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711637e8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163818) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163848) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163878) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711638a8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x711638d8) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163908) 0x115b1d3 HasClipping__C9DrawState + 0x43
KERN: (0x71163938) 0x115b1d3 HasClipping__C9DrawState + 0x43
comment:6 by , 11 years ago
Hm. We don't touch GPU power management for this very reason (we let the AtomBIOS manage power on its own). I'll reach out to my AMD contacts to see if it is a known issue.
It could be that the laptop vendor kept the voltages a bit too high to boost fps and didn't adjust the ASIC settings to match this adjustment. A lot of vendors only test their machines running "Windows and the stock drivers", thus missing these kinds of bugs. Linux likely doesn't see the issue as it takes over power management and likely has a wider safety threshold than the Windows drivers.
The only solution in this case would be to take over the GPU power management, but that requires writing a lot of code and a *LOT* of testing, as we would be more likely to overheat a wider range of systems.
comment:7 by , 10 years ago
Something feels off about your laptop given the error you saw booting with VESA. Could you run memtest on your laptop just to rule out memory corruption? You can download most Linux ISOs and they will have a memtest boot option (Ubuntu, for example, has one).
follow-up: 11 comment:9 by , 10 years ago
I'm not sure I've seen an overheating GPU cause a machine to shut down - usually you just get visual artifacts, and occasionally a hardware hang.
Thermal shutdown is usually a CPU feature, however, and the thermal shutdown temperature varies by CPU model. I've seen some set at 75C, while many of Intel's shut down at 90C (I've had this happen, btw, when I disabled the thermal protection in the BIOS on a machine that didn't have the heatsink/fan properly seated on the CPU).
I would guess it could also happen if the northbridge chipset overheats - or maybe your laptop has some additional thermal protection built in that cuts power when it hits some specific case temp.
I don't suppose you have any way of tracking the various CPU/motherboard, etc. temps when you're approaching a shutdown event? Can you duplicate the behavior on say Linux with a heavy load applied to the machine?
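One way to capture temperatures on Linux in the run-up to a shutdown is to poll the kernel's sysfs thermal zones and sync every sample to disk, so the last readings survive a sudden power cut. This is only a minimal sketch: it assumes the machine exposes sensors under `/sys/class/thermal/thermal_zone*` (values in millidegrees Celsius), and the function names, log file name, and polling interval are made up for illustration.

```python
import glob
import os
import time


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"thermal_zoneN:type": degrees Celsius} for every readable zone."""
    readings = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                name = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                millidegrees = int(f.read().strip())
        except (OSError, ValueError):
            continue  # zone is unreadable or returned garbage; skip it
        readings[f"{os.path.basename(zone)}:{name}"] = millidegrees / 1000.0
    return readings


def log_temperatures(logfile, interval=2.0, samples=None,
                     base="/sys/class/thermal"):
    """Append one timestamped line per sample.

    Each line is flushed and fsync'ed so it is on disk before the next
    sample, which matters if the machine powers off without warning.
    With samples=None the loop runs until interrupted.
    """
    count = 0
    with open(logfile, "a") as out:
        while samples is None or count < samples:
            readings = read_thermal_zones(base)
            fields = " ".join(f"{k}={v:.1f}C" for k, v in readings.items())
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            out.write(f"{stamp} {fields}\n")
            out.flush()
            os.fsync(out.fileno())
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)
```

Calling `log_temperatures("temps.log")` while a heavy load runs and then reading the tail of `temps.log` after the next boot would show which zone, if any, was climbing before the power cut. Which zone corresponds to the CPU, GPU, or chipset depends on the machine.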
comment:10 by , 8 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
This one seemed strange when reported. Since we don't touch the Radeon HD power management, I'm going to attribute the issue to a quirk in the implementation. Linux had the same issue, but since I didn't see any Linux quirks documented and GPU thermal management is left to the GPU, I'm going to close this one as not an issue with our driver but with the hardware.
comment:11 by , 8 years ago
Replying to kallisti5:
This one seemed strange when reported. Since we don't touch the Radeon HD power management, I'm going to attribute the issue to a quirk in the implementation. Linux had the same issue, but since I didn't see any Linux quirks documented and GPU thermal management is left to the GPU, I'm going to close this one as not an issue with our driver but with the hardware.
This ticket is still valid. My laptop turns off from time to time.
On Linux you have DPM (dynamic power management) for Radeon and power saving for the CPU.
Replying to umccullough:
I don't suppose you have any way of tracking the various CPU/motherboard, etc. temps when you're approaching a shutdown event? Can you duplicate the behavior on say Linux with a heavy load applied to the machine?
It is slightly hotter on Linux than on Windows, but I haven't had these problems since they introduced DPM.
https://wiki.archlinux.org/index.php/ATI#Dynamic_power_management
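To check whether DPM is actually active on a given Linux install, the radeon driver's sysfs attributes can be read directly. The sketch below assumes the GPU is `card0` and that the driver exposes `power_dpm_state` and `power_dpm_force_performance_level` under `/sys/class/drm/<card>/device/`, as described on the wiki page above. The function name is made up for illustration, and a value of `None` only means the file could not be read (for example, DPM is disabled or a different driver is in use).

```python
import os


def read_radeon_dpm(card="card0", base="/sys/class/drm"):
    """Return the radeon DPM sysfs attributes for one card.

    Missing or unreadable attributes are reported as None rather than
    raising, since older kernels and other drivers do not expose them.
    """
    device = os.path.join(base, card, "device")
    result = {}
    for attr in ("power_dpm_state", "power_dpm_force_performance_level"):
        try:
            with open(os.path.join(device, attr)) as f:
                result[attr] = f.read().strip()
        except OSError:
            result[attr] = None
    return result
```

On a dual-GPU laptop like this one the second GPU would normally show up as another card (for example `card1`), so it may be worth calling the function once per card.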
comment:12 by , 8 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
comment:13 by , 8 years ago
I think this ticket should finally be closed. I cleaned the dust out of my laptop and checked the memory with a proprietary memtest. Haiku has been fine for several days.
Haiku needs proper power management. Under Haiku my laptop runs a lot hotter than on Linux or Windows. Even when idle, the air from the fan is hot. This is why my laptop shuts down from time to time: insane temps. But that belongs in another Haiku ticket.
comment:14 by , 8 years ago
Resolution: | → no change required |
---|---|
Status: | reopened → closed |
follow-up: 16 comment:15 by , 8 years ago
Have you checked if there are any firmware updates for your computer? It sounds like your firmware should not do that if it follows the specs.
comment:16 by , 8 years ago
Replying to tqh:
Have you checked if there are any firmware updates for your computer? It sounds like your firmware should not do that if it follows the specs.
I have the latest EFI, and there are no more updates from Samsung for this particular laptop.
I think my CPU is constantly at 2.7 GHz, not reclocking.
Years ago this commit helped me on my old desktop: http://cgit.haiku-os.org/haiku/commit/?id=cc586f1655b94c248be58ba1752b42bc39fbaf03
Max 10 minutes and it turned off.
No problems on Windows 8.1.