Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#12549 closed bug (fixed)

KDL booting hrev49947 x86_gcc2

Reported by: kim1963 Owned by: waddlesplash
Priority: high Milestone: R1/beta1
Component: Drivers/ACPI Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: x86

Description (last modified by waddlesplash)

Welcome to Kernel Debugging Land...
Thread 18 "main2" running on CPU 2
stack trace for thread 18 "main2"
    kernel stack: 0x81c2f000 to 0x81c33000
frame               caller     <image>:function + offset
 0 81c326d4 (+  32) 80148712   <kernel_x86> arch_debug_stack_trace() + 0x12
 1 81c326f4 (+  16) 800a7ef7   <kernel_x86> stack_trace_trampoline__FPv() + 0x0b
 2 81c32704 (+  12) 8013a49e   <kernel_x86> arch_debug_call_with_fault_handler() + 0x1b
 3 81c32710 (+  48) 800a9a1a   <kernel_x86> debug_call_with_fault_handler() + 0x5a
 4 81c32740 (+  64) 800a8113   <kernel_x86> kernel_debugger_loop__FPCcT0Pcl() + 0x217
 5 81c32780 (+  48) 800a848f   <kernel_x86> kernel_debugger_internal__FPCcT0Pcl() + 0x53
 6 81c327b0 (+  48) 800a9da6   <kernel_x86> panic() + 0x3a
 7 81c327e0 (+ 144) 8011ef7d   <kernel_x86> vm_page_fault() + 0x145
 8 81c32870 (+  80) 80149f43   <kernel_x86> x86_page_fault_exception() + 0x177
 9 81c328c0 (+  12) 8013ce5c   <kernel_x86> int_bottom() + 0x3c
kernel iframe at 0x81c328cc (end = 0x81c3291c)
 eax 0x8189b720    ebx 0x81899f6c     ecx 0x82a9a500  edx 0x73
 esi 0x656d5f6c    edi 0xa            ebp 0x81c32944  esp 0x81c32900
 eip 0x8186d9ad eflags 0x10206   
 vector: 0xe, error code: 0x0
10 81c328cc (+ 120) 8186d9ad   <acpi> AcpiNsBuildNormalizedPath() + 0x51
11 81c32944 (+  64) 8186d908   <acpi> AcpiNsHandleToPathname() + 0x44
12 81c32984 (+  48) 81870445   <acpi> AcpiGetName() + 0x65
13 81c329b4 (+  64) 81867033   <acpi> get_next_entry() + 0x9b
14 81c329f4 (+ 544) 818679f7   <acpi> acpi_enumerate_child_devices__FP11device_nodePCc() + 0x237
15 81c32c14 (+ 144) 81867b17   <acpi> acpi_module_register_child_devices__FPv() + 0x103
16 81c32ca4 (+  48) 800bc14d   <kernel_x86> Register__11device_nodeP11device_node() + 0x8d
17 81c32cd4 (+  64) 800baa62   <kernel_x86> register_node__FP11device_nodePCcPC11device_attrPC11io_resourcePP11device_node() + 0xfe
18 81c32d14 (+ 112) 818677b5   <acpi> acpi_module_register_device__FP11device_node() + 0x49
19 81c32d84 (+  64) 800bc8a9   <kernel_x86> _RegisterPath__11device_nodePCc() + 0x4d
20 81c32dc4 (+  96) 800bca9f   <kernel_x86> _RegisterDynamic__11device_nodeP11device_node() + 0x147
21 81c32e24 (+  48) 800bc177   <kernel_x86> Register__11device_nodeP11device_node() + 0xb7
22 81c32e54 (+  64) 800baa62   <kernel_x86> register_node__FP11device_nodePCcPC11device_attrPC11io_resourcePP11device_node() + 0xfe
23 81c32e94 (+ 192) 800bd519   <kernel_x86> init_node_tree__Fv() + 0x41
24 81c32f54 (+  16) 800bd6f1   <kernel_x86> device_manager_init() + 0xfd
25 81c32f64 (+  80) 80066942   <kernel_x86> main2__FPv() + 0x76
26 81c32fb4 (+  48) 80087563   <kernel_x86> common_thread_entry__FPv() + 0x3b
kdebug> 

Attachments (5)

previous_syslog (54.3 KB ) - added by kim1963 4 years ago.
previous_syslog.2 (54.3 KB ) - added by kim1963 4 years ago.
hrev49944 x86_gcc4 hybrid syslog (221.1 KB ) - added by HAL 4 years ago.
syslog successful boot of hrev49944 x86_gcc4 hybrid
previous49949_syslog (88.9 KB ) - added by kim1963 4 years ago.
previous49956_syslog (68.7 KB ) - added by kim1963 4 years ago.

Change History (18)

by kim1963, 4 years ago

Attachment: previous_syslog added

by kim1963, 4 years ago

Attachment: previous_syslog.2 added

comment:1 by waddlesplash, 4 years ago

Component: System/Kernel → Drivers/ACPI
Description: modified (diff)
Owner: changed from axeld to waddlesplash
Priority: blocker → high
Status: new → assigned

Not exactly sure what's going on here. All the functions here look normal; I guess I should look at the diffs to see what changed...

comment:2 by waddlesplash, 4 years ago

Can you retry with x86_gcc4 (hybrid OK, if you want), and see if the same thing happens? Thanks.

comment:3 by jua, 4 years ago

Same here: since upgrading, it crashes during boot with the same KDL as above.

I propose we revert the recent ACPI code update and go back to the version previously used (which is, as pointed out by korli, also the one used by FreeBSD).

comment:4 by rudolfc, 4 years ago

Hi,

A shame we have trouble with this version :-/ Anyhow, if we revert to the previous version, I would strongly suggest adding one patch to it, namely:

https://github.com/acpica/acpica/commit/a3267967c8bd3938e9aa98e8a267315cd609c477#diff-6c6887495e1022722758f5a9104b668d

This patch fixes ACPI not working on older systems (e.g. my laptop :-)

Bye!

Rudolf

comment:5 by HAL, 4 years ago

I get the same KDL booting hrev49950 on both x86_64 and x86_gcc2.

in reply to:  2 comment:6 by HAL, 4 years ago

Replying to waddlesplash:

Can you retry with x86_gcc4 (hybrid OK, if you want), and see if the same thing happens? Thanks.

I tried with hrev49948 x86_gcc4 hybrid and got a very similar KDL at boot.

comment:7 by waddlesplash, 4 years ago

Can you upload a syslog from a successful boot (on an earlier rev)? Thanks. EDIT: Scratch that, don't need it.

by HAL, 4 years ago

syslog successful boot of hrev49944 x86_gcc4 hybrid

in reply to:  7 comment:8 by HAL, 4 years ago

Replying to waddlesplash:

Can you upload a syslog from a successful boot (on an earlier rev)? Thanks. EDIT: Scratch that, don't need it.

I did that before seeing your edit.

by kim1963, 4 years ago

Attachment: previous49949_syslog added

comment:9 by waddlesplash, 4 years ago

Can you try again with hrev49953 or later?

comment:10 by kim1963, 4 years ago

hrev49956 - KDL booting

by kim1963, 4 years ago

Attachment: previous49956_syslog added

comment:11 by waddlesplash, 4 years ago

A friend of mine who knows more about assembly than I do actually spent some time looking at this. Here's what he sent me:

Here is what I discovered to be true, looking at the backtrace and the disassembly:

  1. In AcpiNsBuildNormalizedPath(), %esi is used for the variable NextNode.
  2. In the stack trace, %esi=0x656d5f6c looks suspiciously like a pointer loaded after a buffer overrun. Read as ASCII, its bytes (in little-endian memory order) spell 'l_me'.
  3. It is unlikely that the parameter, Node, is bad, because the caller used a handle validation routine.
  4. When walking the tree, NextNode is loaded from each .Parent in turn. This means that one of the .Parents is bad.
  5. %edi is used for the variable Length. At the time of the stack trace, it is 0xa. I think that means it walked up at least two nodes before it hit the bad one.
  6. Inserting a little debugging code right there might help provide a clue to track down the problem. If (NextNode==0x656d5f6c), dump what is at FullPath so far, and then panic. Better yet, start at Node and walk forward, dumping each block [Node,sizeof(*Node)].
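
For illustration, the check described in the list above could be sketched in plain C roughly as follows. This is only a sketch under stated assumptions: `struct demo_node`, `decode_ascii32()` and `find_suspicious_parent()` are hypothetical names invented for this example, not ACPICA's real namespace node type or API, and the "pointer whose bytes are all printable ASCII" test is a heuristic rather than a proof of corruption.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a namespace node; NOT ACPICA's real structure. */
struct demo_node {
    char name[5];               /* 4-character name plus terminating NUL */
    struct demo_node *parent;   /* NULL at the root */
};

/*
 * Write the four bytes of `value`, lowest byte first (the order they occupy
 * in memory on little-endian x86), into `out` as a NUL-terminated string.
 * Returns 1 if every byte is printable ASCII, 0 otherwise.
 */
static int decode_ascii32(uint32_t value, char out[5])
{
    int printable = 1;
    for (int i = 0; i < 4; i++) {
        unsigned char byte = (unsigned char)((value >> (8 * i)) & 0xff);
        out[i] = (char)byte;
        if (!isprint(byte))
            printable = 0;
    }
    out[4] = '\0';
    return printable;
}

/*
 * Walk the parent chain starting at `node`. Return the first node whose
 * parent pointer fits in 32 bits and decodes as printable ASCII (a hint
 * that string data overwrote the pointer). The suspicious pointer is never
 * dereferenced. Returns NULL if the chain looks sane up to the root.
 */
static const struct demo_node *find_suspicious_parent(const struct demo_node *node)
{
    char text[5];
    while (node != NULL && node->parent != NULL) {
        uintptr_t raw = (uintptr_t)node->parent;
        if (raw <= UINT32_MAX && decode_ascii32((uint32_t)raw, text)) {
            /* In kernel code this is where one would dump state and panic. */
            printf("node '%s': suspicious parent 0x%08lx ('%s')\n",
                   node->name, (unsigned long)raw, text);
            return node;
        }
        node = node->parent;
    }
    return NULL;
}
```

Calling `decode_ascii32(0x656d5f6c, buf)` fills `buf` with the four characters `l_me`, matching the observation about %esi. In the real driver the walk would use ACPICA's own node type and the kernel's debug output instead of `printf()`.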

comment:12 by waddlesplash, 4 years ago

Resolution: fixed
Status: assigned → closed

Reverted in hrev49963.

comment:13 by tqh, 4 years ago

This was a bug in ACPICA. I could reproduce the problem on my laptop with a tree from before the revert. ACPICA has since released a fixed version, and upgrading to that version made the problem go away:

Fixed a regression introduced in version 20151218 concerning the 
execution of so-called module-level ASL/AML code. Namespace objects 
created under a module-level If() construct were not properly/fully 
entered into the namespace and could cause an interpreter fault when 
accessed.

So do you want to re-revert? :) Your changes look good; you just got unlucky (and maybe wait a month or two after a release is out next time).

https://acpica.org/sites/acpica/files/changes_29.txt
