Opened 10 years ago

Closed 10 years ago

#4152 closed bug (fixed)

acpi_battery: throw ACPI exceptions about GlobalLock acquisition/release failures

Reported by: phoudoin Owned by: czeidler
Priority: normal Milestone: R1
Component: Drivers Version: R1/pre-alpha1
Keywords: Cc: fredrik.holmqvist@…, bergep@…
Blocked By: Blocking: #3708
Has a Patch: no Platform: All

Description

While acpi_battery is loaded and watched (by PowerStatus, for example), several ACPI exceptions about GlobalLock access failures are traced in syslog, without other visible consequence.

Could be sign of a deeper (or evil?! ;-) ) problem in ACPI, though.

Attachments (4)

ACPI lock errors.txt (5.0 KB ) - added by phoudoin 10 years ago.
syslog excerpt showing ACPI exceptions
oshaiku.c_4152_fix.patch (425 bytes ) - added by phoudoin 10 years ago.
Patch to actually support ACPI Global locking strategy, which require to support ACPI_DO_NOT_WAIT (0) as required.
acpi.patch (50.3 KB ) - added by tqh 10 years ago.
acpi_dpc_loading.patch (2.3 KB ) - added by tqh 10 years ago.
move dpc loading earlier

Download all attachments as: .zip

Change History (40)

comment:1 by phoudoin, 10 years ago

That's under a fresh hrev31835 gcc2 alpha image, on a Samsung NC10 netbook.

I will attach a syslog dump as soon as:

  1. the Marvell Yukon 88E8040 PHY link is correctly handled to send it over network (ticket creation really soon now), or
  2. I will successfully install Atheros proof-of-concept wifi driver to send it over network, or
  3. I could save it to a fat-formatted usb key without be thrown into KDL by an assert on cluster != 0 in fat.c (see #3690)

by phoudoin, 10 years ago

Attachment: ACPI lock errors.txt added

syslog excerpt showing ACPI exceptions

in reply to:  1 ; comment:2 by korli, 10 years ago

Replying to phoudoin:

  1. the Marvell Yukon 88E8040 PHY link is correctly handled to send it over network (ticket creation really soon now), or

Another option is using a supported usb ethernet key, if there is an available usb slot. I do this myself since my internal e1000 network card doesn't work anymore in Haiku.

in reply to:  2 comment:3 by phoudoin, 10 years ago

Another option is using a supported usb ethernet key, if there is an available usb slot. I do this myself since my internal e1000 network card doesn't work anymore in Haiku.

Unfortunatly, I've managed to bought the D-Link USB ethernet adapter which, bad luck, is not based on Pegasus chip :-\

Instead, I found a 4th solution: save to a bfs usb key, and mount it on my desktop Haiku machine on which network works fine.

I should check if the fat KDL is part of R1 alpha blockers. Otherwise, any joe testing alpha will crash it as soon as he'll start to save some stuff on his usb stick - fat formatted.

comment:4 by czeidler, 10 years ago

Sadly I have no acpi global lock on my notbook and don't see this message. Maybe you can look into it?

comment:5 by tqh, 10 years ago

Cc: fredrik.holmqvist@… added

Also seeing this. Trying to see if I can find out anything. My ACPI seems to give garbage to Powerstatus at times also.

comment:6 by tqh, 10 years ago

AcpiOsWaitSemaphore with a timeout of 0 should return AE_TIME on B_WOULD_BLOCK. That makes ACPI hang at boot here though.

comment:7 by tqh, 10 years ago

From ACPI docs:

Implementation notes:
 1.  The implementation of this interface must support timeout values of zero. This is frequently
     used to determine if a call to the interface with an actual timeout value would block. In this
     case, AcpiOsWaitSemaphore must return either an E_OK if the units were obtained
     immediately, or an AE_TIME to indicate that the requested units are not available. Single
     threaded OSL implementations should always return AE_OK for this interface.

by phoudoin, 10 years ago

Attachment: oshaiku.c_4152_fix.patch added

Patch to actually support ACPI Global locking strategy, which require to support ACPI_DO_NOT_WAIT (0) as required.

comment:8 by phoudoin, 10 years ago

Sounds like 1) we don't actually honor ACPI global hardware lock and 2) when we do it, it's crashing on machine with ACPI global lock. How nice is that!

Anyway, for 1) the patch attached should do the job. I'm at office and can't test it, though.

For 2), maybe we should star looking at some ACPI runtime settings, like forcing methods serialization (see acglobal.h).

comment:9 by tqh, 10 years ago

I tested that and then it hung before the rocket icon. I'll look at the code and test more tonight though.

in reply to:  9 comment:10 by phoudoin, 10 years ago

Replying to tqh:

I tested that and then it hung before the rocket icon. I'll look at the code and test more tonight though.

So at modules/drivers loading times. I guess in safe mode it will works, but without any ACPI driver loaded, no PowerStatus...

comment:11 by tqh, 10 years ago

Yes, and the ACPI-tasks was waiting on semaphores. So might be locking issues still. Couldn't figure out if/what the boot-thread was up to though (how do I find it?).

comment:12 by tqh, 10 years ago

The boot thread (main2?) is also waiting on an acpi-sem so I need to see what's going on. I also added signaling to return error on errors, but no difference.

Reading api-spec at: http://www.acpica.org/documentation/

comment:13 by tqh, 10 years ago

Currently I have this issue, which I suspect is why it hangs on startup: KERN: module: Search for generic/dpc/v1 failed. KERN: failed to get dpc module KERN: driver bus_managers/acpi/root/driver_v1 init failed: General system error

comment:14 by tqh, 10 years ago

The issue:

KERN: module: Search for generic/dpc/v1 failed.
KERN: failed to get dpc module
KERN: driver bus_managers/acpi/root/driver_v1 init failed: General system error

(I hate this editor btw)

comment:15 by axeld, 10 years ago

The "dpc" module must be one of the boot modules, if ACPI is, I would guess.

comment:16 by tqh, 10 years ago

Adding it to boot modules removes the syslog error, but main thread is still waiting on a semaphore. So I'll see what else I can find.

comment:17 by tqh, 10 years ago

More info, we didn't call AcpiInitializeObjects, which is described as 'must be called'. Rebuilding so havn't tested yet:

This function completes initialization of the ACPICA Subsystem by initializing all ACPI Devices,
Operation Regions, Buffer Fields, Buffers, and Packages. It must be called and it should only be
called after a call to AcpiEnableSubsystem. The object cache is purged after these objects are
initialized, in case an overly large number of cached objects were created during initialization
(versus the size of the caches at runtime.)

in reply to:  17 comment:18 by phoudoin, 10 years ago

Replying to tqh:

More info, we didn't call AcpiInitializeObjects, which is described as 'must be called'.

I wonder how any attempt to use ACPI Global Lock could work if the FACS table is not loaded and _GL_ mutex object is not initialized. ..

comment:19 by tqh, 10 years ago

I'm working on it, I have a partly rewritten stack that is as far as the one in our repo. InitializeObjects still hangs but I think it's because the initializing order is wrong and some steps are even missing. If i disable acpi_embedded_controller and skip InitializeObjects I can boot my stack and it shows battery info view, although it doesn't update or show how much juice is left. Although I feel I'm getting out of scope for this bug. Btw why acpi_embedded_controller separate from busmanager, (installing the int handlers needs to be done in busmanager before we InitializeObjects)?

comment:20 by tqh, 10 years ago

Also I think the mutex type binary_sem as is default maybe wrong. Seems they are inited to 0 by ACPI while the win example impl creates them with 1. So I implemented them according to example. Working on int handlers and init order, which will probably do wonders...

in reply to:  15 comment:21 by phoudoin, 10 years ago

Replying to axeld:

The "dpc" module must be one of the boot modules, if ACPI is, I would guess.

Which means that ACPI doesn't gracefully handle DPC module not being available, which is another issue we must fix, I guess.

comment:22 by tqh, 10 years ago

There are a lot of issues to fix, but I think I'm close to having found most. For instance the signal semaphore code must use B_DO_NOT_RESCHEDULE as it's called from interrupt handlers. Atm I also suspect our use of dpc is wrong as well, as some deferred tasks may block, and as dpc serializes task the task that wakes the first one will never execute.

I have done a lot, but think it's outside the scope of this bug so not sure how to sync and report progress.

in reply to:  22 comment:23 by phoudoin, 10 years ago

Replying to tqh:

There are a lot of issues to fix, but I think I'm close to having found most. For instance the signal semaphore code must use B_DO_NOT_RESCHEDULE as it's called from interrupt handlers.

Indeed.

Atm I also suspect our use of dpc is wrong as well, as some deferred tasks may block, and as dpc serializes task the task that wakes the first one will never execute.

Argh. Okay, maybe we could write a generic acpi_tack_dpc_wrapper which simply fire & forget via spawn_kernel_thread() those tasks, intead.

I have done a lot, but think it's outside the scope of this bug so not sure how to sync and report progress.

Create a branch, if you have commit access. If not, post a patch file here against an indentified revision, and I'll create a branch dedicated to fixing ACPI.

comment:24 by mmlr, 10 years ago

Blocking: 3708 added

(In #3708) Then ACPI most probably causes interrupts to be lost. The thing is that ACPI is quite broken for many cases (tracked in #4152), therefore it's not enabled by default. I'm going to close this bug as a duplicate of #4152 as that one should improve the situation on that front, even though it is not actually a duplicate.

comment:25 by tqh, 10 years ago

Sorry for the late answer. I don't have commit access and I've used a later version ACPI as it is more portable. So I can't really provide a patch. A branch for ACPI would be best IMO. You can email me on my gmail, "fredrik holmqvist" with a dot thrown in.

comment:26 by thetick, 10 years ago

Cc: bergep@… added

by tqh, 10 years ago

Attachment: acpi.patch added

comment:27 by tqh, 10 years ago

Same patch as I sent to haiku-dev mailing list. I'd suggest commiting it, even though I'll work on cleaning it up more.

comment:28 by tqh, 10 years ago

Please try hrev33236 or later. Not sure if this alone fixes it as my laptop still other issues.

in reply to:  28 comment:29 by phoudoin, 10 years ago

Replying to tqh:

Please try hrev33236 or later. Not sure if this alone fixes it as my laptop still other issues.

Michael Weirauch's KDL screenshot seems to confirm that the issue now is that our ACPI more complete initialization process needs DPC queue being set up earlier: http://www.m-phasis.de/tmp/haiku/haiku-r33252-acpi-wo-ab-wo-aec-bt.jpg

  1. Device manager load ACPI root module
  2. acpi_module_std_opts(): load B_ACPI_MODULE *before* ACPI root module is inited, and DPC loaded and a queue created. gDPC value is NULL.
  3. acpi_std_ops(): B_ACPI_MODULE is inited
  4. AcpiInitializeObjects(): last ACPI init step, trigger by enabling GPE an interrupt, leading to AcpiOsExecute().
  5. AcpiOsExecute() call gDPC->queue_dpc()
  6. gDPC->queue_dpc is... tada, NULL + 0x14 offset. Hence the segfault at 0x14.

I love when KDL allow you to easily see struct offset :-)

gDPC & gDPCHandle should be set up earlier, in acpi_std_ops(), not at acpi_module_init() time. DPC should be a module dependency (it's not an optional one), and gDPCHandle queue created in acpi_std_ops(B_MODULE_INIT), *before* first Acpi init steps.

by tqh, 10 years ago

Attachment: acpi_dpc_loading.patch added

move dpc loading earlier

comment:30 by tqh, 10 years ago

Review and commit if ok. Works here at least, but no expert on module handling.

comment:31 by phoudoin, 10 years ago

Sorry, I've missed your patch. But I've commit a very similar change in hrev33264. Michael Weirauch reported that it fixed KDLs during boot:

https://lists.berlios.de/pipermail/haiku-commits/2009-September/020852.html

For the last issue (No handler for region), I add default address space handlers installation to the startup process in hrev33273. Duno if it works yet, but that the only extra startup steps done by FreeBSD...

in reply to:  19 comment:32 by phoudoin, 10 years ago

Replying to tqh:

Btw why acpi_embedded_controller separate from busmanager, (installing the int handlers needs to be done in busmanager before we InitializeObjects)?

I guess that as multiple embedded controllers could exists in the ACPI namespace, and each one needs to be *driven* separately. Via device manager, that what our acpi_embedded_controller does for every EC node thrown at him by ACPI bus manager.

Or, at least, that my guess.

in reply to:  31 comment:33 by tqh, 10 years ago

Replying to phoudoin:

For the last issue (No handler for region), I add default address space handlers installation to the startup process in hrev33273. Duno if it works yet, but that the only extra startup steps done by FreeBSD...

Just make sure it isn't done already by the ACPI initialization calls in acpi_busman.c.

in reply to:  31 comment:34 by michael.weirauch, 10 years ago

Replying to phoudoin:

For the last issue (No handler for region), I add default address space handlers installation to the startup process in hrev33273. Duno if it works yet, but that the only extra startup steps done by FreeBSD...

Still getting it. hrev33468-gcc4h2

2009-10-06 18:05:13 KERN: ACPI: SSDT 0xbfad2000 00274 (v01  PmRef  Cpu0Tst 00003000 INTL 20050624)
2009-10-06 18:05:13 KERN: ACPI: SSDT 0xbfad1000 00242 (v01  PmRef    ApTst 00003000 INTL 20050624)
2009-10-06 18:05:13 KERN: ACPI Error: No handler for Region [ECOR] (0x812797f8) [EmbeddedControl] (20090903/evregion-430)
2009-10-06 18:05:13 KERN: ACPI Error: Region EmbeddedControl(3) has no handler (20090903/exfldio-383)
2009-10-06 18:05:13 KERN: ACPI Error (psparse-0633): Method parse/execution failed [\_SB_.PCI0.LPC_.EC__.AC__._PSR] (Node 0x8127e708), AE_NOT_EXIST
2009-10-06 18:05:13 KERN: ACPI Error (psparse-0633): Method parse/execution failed [\_SB_._INI] (Node 0x8127a988), AE_NOT_EXIST
2009-10-06 18:05:13 KERN: ACPI Error: No handler for Region [ECOR] (0x812797f8) [EmbeddedControl] (20090903/evregion-430)
2009-10-06 18:05:13 KERN: ACPI Error: Region EmbeddedControl(3) has no handler (20090903/exfldio-383)
2009-10-06 18:05:13 KERN: ACPI Error (psparse-0633): Method parse/execution failed [\_SB_.PCI0.LPC_.EC__._INI] (Node 0x8127cf14), AE_NOT_EXIST
2009-10-06 18:05:13 KERN: publish device: node 0x812756e0, path acpi/namespace, module bus_managers/acpi/namespace/device_v1

comment:35 by jackburton, 10 years ago

Should this be closed, after hrev33781 ?

comment:36 by phoudoin, 10 years ago

Resolution: fixed
Status: newclosed

I'm closing it, but I guess Michael issue is not fixed yet, but as it's not related to ACPI global locking but address range handlers, it deserve its own ticket. Michael, feel free to open one. Or I'll, eventually.

Note: See TracTickets for help on using tickets.