Opened 17 months ago
Closed 6 months ago
#18454 closed bug (fixed)
ASSERT FAILED: address.minimum + address_length - 1 == address.maximum
Reported by: | Mas4hmad | Owned by: | nobody |
---|---|---|---|
Priority: | blocker | Milestone: | R1/beta5 |
Component: | Drivers | Version: | R1/Development |
Keywords: | boot-failure | Cc: | PulkoMandy, X512 |
Blocked By: | Blocking: | #18536, #18751, #18764, #18856 | |
Platform: | All |
Description (last modified by )
As suggested by korli, I am creating a new ticket, similar to #17369.
I downloaded a Haiku nightly image and tried to boot my device, but it gets stuck at the 'atom' image on the boot screen (without debug output yet).
When I open the debug screen, it says PackageVolumeInfo::_InitStates(): failed to parse activated-packages: No such file or directory in the middle of the log, and PackageVolumeInfo::LoadOldStates(): failed to open administrative directory: No such file or directory at the end of the log.
Another debug log is shown in KDL. It says PANIC: ASSERT FAILED and PANIC: did not find any boot partition.
Device info:
Machine: Lenovo IdeaPad Flex 10
CPU: Intel Bay Trail, 64-bit (UEFI compatible)
RAM: 2 GB
Storage: 240 GB SSD
Boot media: Haiku nightly anyboot, hrev57085, x86 gcc2 (32-bit)
Attachments (29)
Change History (81)
comment:1 by , 17 months ago
comment:2 by , 17 months ago
If I check 'enable on screen debug output' and choose 'continue booting', it shows a normal boot and stays stuck before the 'atom' logo is colored, and I can't do anything further.
So I used 'display recent debug log' instead to grab the log.
comment:3 by , 17 months ago
Those error messages are harmless and not a problem here.
Please try booting with "enable on-screen debug output" and "disable on-screen paging." Then take a picture when the logs stop (or seem to stall.)
comment:4 by , 17 months ago
It seems my device is forced to use the graphical boot and somehow can't show the on-screen debug output.
Even though I followed the instructions above, I only got the graphical boot screen without any indication of progress.
I will try again when I have a serial cable.
Maybe that is the only way to get a useful debug log (I think).
by , 17 months ago
by , 17 months ago
Attachment: | 1_ints.jpg added |
---|
by , 17 months ago
by , 17 months ago
by , 17 months ago
Attachment: | 4_ints.jpg added |
---|
comment:5 by , 17 months ago
Somehow the screen shows KDL after about half a minute (in 0_kdl.jpg), and I tried as best as I could. It seems to be an "assert failed".
After that, another KDL shows a 'panic' (in 2_co.jpg).
comment:6 by , 17 months ago
Description: | modified (diff) |
---|
comment:7 by , 17 months ago
Component: | - General → Drivers |
---|---|
Keywords: | boot-failure added |
Platform: | x86 → All |
The first panic is strange; it comes from the ACPI probing logic for PCI. You may want to try booting with ACPI disabled in the bootloader options.
The second panic is a more standard one and could have any number of causes; it may be related to the first one, though.
comment:8 by , 17 months ago
The debug log I gathered is from booting the device with Ventoy (which is known to boot, though it does not find any media partition for OSes other than Linux and Windows).
I will try booting with the Rufus flasher (dd image) and look further.
follow-up: 10 comment:9 by , 17 months ago
I think it's known that Haiku does not boot correctly with Ventoy. I see you have now booted successfully; can you replicate that success also with ACPI?
comment:10 by , 17 months ago
Replying to waddlesplash:
Can you replicate that success also with ACPI?
Yes. It could boot (with the assert), and (maybe) that will be a different issue.
I thought that Haiku was stuck at booting, until I realized that several Linux distros (e.g. Alpine) behave the same on x86.
Alpine Linux takes 20-ish minutes to boot on x86.
comment:11 by , 17 months ago
Cc: | added |
---|---|
Summary: | Nightly Haiku OS boot stuck → ASSERT FAILED: address.minimum + address_length - 1 == address.maximum |
The assertion is a real issue. PulkoMandy was the one who edited this code and added some asserts, let's see if he or X512 has any ideas.
comment:12 by , 17 months ago
Apparently, when minAddress_fixed or maxAddress_fixed is equal to ACPI_ADDRESS_FIXED, the assert does not apply, see here
comment:13 by , 17 months ago
The commit that added the assert: https://cgit.haiku-os.org/haiku/commit/src/add-ons/kernel?id=8be0a59e7780e1fc30d405702f3c071944cd7db9
I see FreeBSD also checks for length = 0, but we need to add these two other checks.
Basically, if we have 2 of the 3 variables (min, max, length), we can compute the third:
max = min + length - 1
length = max - min + 1
min = max - length + 1
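Using the inclusive convention of the assert in the ticket title (minimum + length - 1 == maximum), the three derivations can be sketched as below. This is only an illustration, not the Haiku code: the function names are hypothetical, and the caller is assumed to pass values that describe a valid, non-empty range.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of Haiku or ACPICA.
// All ranges are inclusive: [minimum, maximum] contains `length` addresses.
// Preconditions (not checked here): length > 0, maximum >= minimum,
// and the arithmetic does not overflow 64 bits.

// Last address of the range, from its start and length.
uint64_t
range_maximum(uint64_t minimum, uint64_t length)
{
	return minimum + length - 1;
}

// Number of addresses in the range, from its two ends.
uint64_t
range_length(uint64_t minimum, uint64_t maximum)
{
	return maximum - minimum + 1;
}

// First address of the range, from its end and length.
uint64_t
range_minimum(uint64_t maximum, uint64_t length)
{
	return maximum - length + 1;
}
```

For the well-formed I/O range `[0, 6f]` with length `70` that appears in the syslog, `range_maximum(0x0, 0x70)` gives `0x6f`, which is what the assert expects.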
comment:14 by , 14 months ago
Same error on an HP ProBook 4510s. I'm able to boot (although sometimes it hangs) by disabling ACPI.
comment:15 by , 14 months ago
Milestone: | Unscheduled → R1/beta5 |
---|
follow-up: 17 comment:16 by , 12 months ago
Can you test with https://review.haiku-os.org/c/haiku/+/7111 and provide an updated syslog from that version?
comment:17 by , 12 months ago
Replying to pulkomandy:
Can you test with https://review.haiku-os.org/c/haiku/+/7111 and provide an updated syslog from that version?
Unfortunately I can't build a full haiku image at the moment
comment:18 by , 12 months ago
You don't need to build it yourself; you can find test build ISOs here: https://haiku.movingborders.es/testbuild/
comment:19 by , 12 months ago
I tried this one:
https://haiku.movingborders.es/testbuild/release/master/hrev57382
but for some reason I can't see the output I'm supposed to see: the syslog ends here:
"initialize PCI controller from ACPI"
and then kernel debugger.
N.B: I look at the syslog from inside the kernel debugger.
follow-up: 21 comment:20 by , 12 months ago
You need to get the build from the specific changeset, not from master, since it is not merged: https://haiku.movingborders.es/testbuild/I23d87da32779d22324f944b5b359390f523ec7a7/1/hrev57382/
This link is provided in the changeset comments by the build review so you can find it easily for each change
comment:21 by , 12 months ago
Replying to pulkomandy:
You need to get the build from the specific changeset, not from master, since it is not merged: https://haiku.movingborders.es/testbuild/I23d87da32779d22324f944b5b359390f523ec7a7/1/hrev57382/
Thanks. Done!
comment:22 by , 12 months ago
Does this PCI range need to be compared with another system, e.g. Linux?
If so, I can help obtain it from Linux and upload it here for comparison with Haiku.
comment:23 by , 12 months ago
So, the last range in that log is:
fff00000 to fbffffff
This doesn't make sense: the minimum is greater than the maximum. Also, that would be a length of 0x3f00001, but that is not at all what we get; instead we have fc100001.
I think at this point, all we can do is turn this ASSERT into an error log and ignore the range, since nothing about this range makes sense.
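A minimal sketch of that idea follows, assuming a simplified range structure. `PciRange`, `is_range_usable` and `filter_usable_ranges` are hypothetical names for illustration, not the actual Haiku code, and the real fix may treat such ranges differently.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for one decoded ACPI address range (inclusive ends).
struct PciRange {
	uint64_t minimum;
	uint64_t maximum;
	uint64_t length;
};

// True only if the three values describe the same non-empty range,
// i.e. minimum + length - 1 == maximum, written so it cannot overflow.
bool
is_range_usable(const PciRange& range)
{
	if (range.length == 0)
		return false;
	if (range.maximum < range.minimum)
		return false;
	return range.maximum - range.minimum == range.length - 1;
}

// Copy the usable ranges to `out` and log the rest instead of panicking.
// Returns the number of ranges copied; `out` must hold at least `count`.
size_t
filter_usable_ranges(const PciRange* in, size_t count, PciRange* out)
{
	size_t kept = 0;
	for (size_t i = 0; i < count; i++) {
		if (is_range_usable(in[i])) {
			out[kept++] = in[i];
			continue;
		}
		fprintf(stderr, "PCI: ignoring inconsistent range [%" PRIx64 ",%"
			PRIx64 "] with length %" PRIx64 "\n", in[i].minimum,
			in[i].maximum, in[i].length);
	}
	return kept;
}
```

With the range discussed above (`fff00000` to `fbffffff`, length `fc100001`), `is_range_usable` returns false, so the entry would be logged and skipped rather than stopping the boot.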
comment:24 by , 12 months ago
That's the last range in the list. Is it possible that somehow we read past the end of an array incorrectly?
comment:25 by , 12 months ago
It's the last printed one because we hit the assert and the enumeration stops, I think, not because it is the last in the list. But it's possible that we're not making enough checks on what type of entry it is (we just check IO vs memory but maybe there are other things to look for)
Also, we don't directly "loop" over the resources ourselves, we just call ACPI "walk_resources" on the _CRS table. I guess a dump of that table would be useful to see how it looks? Using a tool like acpidump (on Linux) or something like it. So it seems unlikely that the end of the table detection is a problem.
comment:27 by , 12 months ago
The Linux PCI ranges will be already processed from the ACPI ranges (you can see that all the adjoining ranges have been merged already.) So can you try to get a dump of the ACPI table directly?
comment:28 by , 12 months ago
Will try. At the moment I only have access to a system rescue CD live distro which doesn't have acpidump. Will try to get a full distro into this PC.
follow-up: 30 comment:29 by , 12 months ago
Oh great: the laptop display just died, so I can't try anything.
comment:30 by , 12 months ago
Replying to jackburton:
Oh great: the laptop display just died, so I can't try anything.
Do you have an external display at hand?
comment:31 by , 12 months ago
Just tried: it seems there's something else, since it doesn't work with an external display, either. Maybe Mas4hmad could give this info, since he seems to have a similar laptop?
by , 12 months ago
Attachment: | linux_dmesgpci added |
---|
by , 12 months ago
Attachment: | acpidump_lenoflex10.zip added |
---|
comment:32 by , 12 months ago
I don't have the unusual case where the minimum address is greater than the maximum address. jackburton has many more zeroed PCI ranges than I do, and also has a negative range.
We likely need to look at the data table from Linux, as waddlesplash said, so we can compare the data.
I will try to dump ACPI from Haiku now (once it boots to the desktop successfully; that needs around a quarter of an hour T.T), but I'm afraid I can't: AFAIR the issue at https://github.com/haikuports/haikuports/issues/4005 has not been closed yet.
comment:33 by , 12 months ago
I don't need the ACPI dump to be done from Haiku, it would be the same as from Linux.
However it would be useful if you can test the Haiku build from here: https://haiku.movingborders.es/testbuild/I23d87da32779d22324f944b5b359390f523ec7a7/2/hrev57390/ and take a picture of the end of the syslog after it hits the assert (I know you already attached pictures, but this one has extra logs)
comment:34 by , 12 months ago
Actually, I see you already posted that (syslog.2).
So, here is what we get in Haiku:
KERN: PCI: range from ACPI [0(1),6f(1)] with length 70
KERN: PCI: range from ACPI [78(1),cf7(1)] with length c80
KERN: PCI: range from ACPI [d00(1),ffff(1)] with length f300
KERN: PCI: range from ACPI [a0000(1),bffff(1)] with length 20000
KERN: PCI: range from ACPI [c0000(1),dffff(1)] with length 20000
KERN: PCI: range from ACPI [e0000(1),fffff(1)] with length 20000
KERN: PCI: range from ACPI [0(1),0(1)] with length 0
KERN: PCI: range from ACPI [ffffffff(1),fffffffe(1)] with length 0
KERN: PCI: range from ACPI [80000000(1),907ffffe(1)] with length 10800000
This is what I see in dsdt.dsl in the acpidump:
WordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
    0x0000,             // Granularity
    0x0000,             // Range Minimum
    0x006F,             // Range Maximum
    0x0000,             // Translation Offset
    0x0070,             // Length
    ,, , TypeStatic, DenseTranslation)
WordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
    0x0000,             // Granularity
    0x0078,             // Range Minimum
    0x0CF7,             // Range Maximum
    0x0000,             // Translation Offset
    0x0C80,             // Length
    ,, , TypeStatic, DenseTranslation)
WordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
    0x0000,             // Granularity
    0x0D00,             // Range Minimum
    0xFFFF,             // Range Maximum
    0x0000,             // Translation Offset
    0xF300,             // Length
    ,, , TypeStatic, DenseTranslation)
DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x00000000,         // Granularity
    0x000A0000,         // Range Minimum
    0x000BFFFF,         // Range Maximum
    0x00000000,         // Translation Offset
    0x00020000,         // Length
    ,, , AddressRangeMemory, TypeStatic)
DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x00000000,         // Granularity
    0x000C0000,         // Range Minimum
    0x000DFFFF,         // Range Maximum
    0x00000000,         // Translation Offset
    0x00020000,         // Length
    ,, , AddressRangeMemory, TypeStatic)
DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x00000000,         // Granularity
    0x000E0000,         // Range Minimum
    0x000FFFFF,         // Range Maximum
    0x00000000,         // Translation Offset
    0x00020000,         // Length
    ,, , AddressRangeMemory, TypeStatic)
DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x00000000,         // Granularity
    0x7A000000,         // Range Minimum
    0x7A3FFFFF,         // Range Maximum
    0x00000000,         // Translation Offset
    0x00400000,         // Length
    ,, _Y00, AddressRangeMemory, TypeStatic)
DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x00000000,         // Granularity
    0x7C000000,         // Range Minimum
    0x7FFFFFFF,         // Range Maximum
    0x00000000,         // Translation Offset
    0x04000000,         // Length
    ,, _Y02, AddressRangeMemory, TypeStatic)
DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, Cacheable, ReadWrite,
    0x00000000,         // Granularity
    0x80000000,         // Range Minimum
    0xDFFFFFFF,         // Range Maximum
    0x00000000,         // Translation Offset
    0x60000000,         // Length
    ,, _Y01, AddressRangeMemory, TypeStatic)
The first 6 entries match up, but then it's all wrong. It seems that maybe a buffer is only large enough for 8 entries, and when the table is too big, things just get confused?
comment:35 by , 12 months ago
comment:36 by , 12 months ago
Hmm, that's just for storing ranges though, not reading them, and the way it's accessed it looks like if there's more than 6 entries we will just overwrite one or more of them?
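To illustrate the concern about a fixed-size store, here is a sketch of a table that refuses extra entries and counts them, rather than overwriting earlier ones. `RangeTable` and `StoredRange` are hypothetical names, and the capacity of 6 used in the usage note is only taken from the number mentioned in this comment; this is not how the Haiku PCI code is actually structured.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stored range; not the structure used by the Haiku PCI code.
struct StoredRange {
	uint64_t base;
	uint64_t size;
};

// Fixed-capacity table that never writes past its storage and never
// silently replaces an existing entry.
template<size_t Capacity>
class RangeTable {
public:
	// Returns false (and counts the drop) when the table is already full.
	bool Add(const StoredRange& range)
	{
		if (fCount >= Capacity) {
			fDropped++;
			return false;
		}
		fRanges[fCount++] = range;
		return true;
	}

	size_t Count() const { return fCount; }
	size_t Dropped() const { return fDropped; }
	const StoredRange& At(size_t index) const { return fRanges[index]; }

private:
	StoredRange fRanges[Capacity] = {};
	size_t fCount = 0;
	size_t fDropped = 0;
};
```

Adding nine ranges to a `RangeTable<6>` keeps the first six and reports three dropped entries through `Dropped()`, which would make an undersized store visible in the log instead of corrupting earlier entries.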
by , 12 months ago
the log after grabbing latest iso nightly build from pulkomandy
by , 11 months ago
by , 11 months ago
syslog after rebasing from https://review.haiku-os.org/c/haiku/+/7111
follow-up: 38 comment:37 by , 11 months ago
In the ACPI dump, I see that the non-working ranges have _Y00, _Y01 and _Y02 in their flags. What are these? Could they prevent decoding with our code?
by , 11 months ago
syslog after rebasing from https://review.haiku-os.org/c/haiku/+/7111
by , 11 months ago
syslog after rebasing from https://review.haiku-os.org/c/haiku/+/7111, obtained from lenovo thinkpad x1 carbon 2014 for comparison.
comment:38 by , 11 months ago
As far as I understand, it would be the 14th parameter from the PCI resource table (the IdeaPad Flex has RES0 and the ThinkPad X1 Carbon has _CRS).
Replying to pulkomandy:
In the ACPI dump, I see that the non-working ranges have _Y00, _Y01 and _Y02 in their flags. What are these? Could they prevent decoding with our code?
Yes, it could be. In syslog.7 (from the Lenovo ThinkPad X1 Carbon), we can see that line 456 has the _Y17 parameter and a length of 0, line 457 has _Y18 with the same length, and so on until line 467, which has the _Y22 parameter.
453 KERN: PCI: range from ACPI [0(1),cf7(1)] with length cf8
454 KERN: PCI: range from ACPI [d00(1),ffff(1)] with length f300
455 KERN: PCI: range from ACPI [a0000(1),bffff(1)] with length 20000
456 KERN: PCI: range from ACPI [c0000(1),c3fff(1)] with length 0
457 KERN: PCI: range from ACPI [c4000(1),c7fff(1)] with length 0
458 KERN: PCI: range from ACPI [c8000(1),cbfff(1)] with length 0
459 KERN: PCI: range from ACPI [cc000(1),cffff(1)] with length 0
460 KERN: PCI: range from ACPI [d0000(1),d3fff(1)] with length 0
461 KERN: PCI: range from ACPI [d4000(1),d7fff(1)] with length 0
462 KERN: PCI: range from ACPI [d8000(1),dbfff(1)] with length 0
463 KERN: PCI: range from ACPI [dc000(1),dffff(1)] with length 0
464 KERN: PCI: range from ACPI [e0000(1),e3fff(1)] with length 0
465 KERN: PCI: range from ACPI [e4000(1),e7fff(1)] with length 0
466 KERN: PCI: range from ACPI [e8000(1),ebfff(1)] with length 0
467 KERN: PCI: range from ACPI [ec000(1),effff(1)] with length 0
468 KERN: PCI: range from ACPI [fff00000(1),febfffff(1)] with length fed00000
469 KERN: PCI: range from ACPI [fed40000(1),fed4bfff(1)] with length c000
Line 468 has the _Y23 parameter, but it jumped to the correct maximum address (with a wrong range minimum and length).
It is odd, though, that line 456 has the right PCI range and address even though it has the _Y24 parameter.
Could PCI ranges with that parameter be treated separately? (As we can see from the Linux dmesgpci, the PCI ranges with that parameter do not show up there the way ranges without it do. I mean, Linux generally does not treat PCI ranges with that parameter like the other ones.)
The ThinkPad X1 Carbon does not hit the ASSERT, although it seems it should, as the IdeaPad Flex 10 does.
comment:39 by , 10 months ago
Blocking: | 18536 added |
---|
comment:40 by , 10 months ago
Please retest after hrev57511; this merges that patch, but in the meantime there were some fixes to ACPI and an upgrade of ACPICA so perhaps something might differ.
comment:41 by , 10 months ago
Hmm, there appears to be a "producer/consumer" flag here that FreeBSD checks, possibly ignoring the value depending on it: https://xref.landonf.org/source/xref/freebsd-current/sys/dev/acpica/acpi_resource.c#387
Do we need to do the same?
comment:42 by , 10 months ago
Ah: https://docs.kernel.org/PCI/acpi-info.html
ACPI defines a Consumer/Producer bit to distinguish the bridge registers ("Consumer") from the bridge apertures ("Producer") [4, 5], but early BIOSes didn't use that bit correctly. The result is that the current ACPI spec defines Consumer/Producer only for the Extended Address Space descriptors
follow-up: 47 comment:43 by , 10 months ago
Presumably the range in question:
KERN: PCI: range from ACPI [80000000(1),907ffffe(1)] with length 10800000
The maximum here differs from the minimum+length by 0x2. That seems suspicious?
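The arithmetic behind that observation can be sketched as follows. `maximum_shortfall` is a hypothetical helper written for this comment, assuming the inclusive convention of the assert (minimum + length - 1 == maximum) and a maximum that is not above the expected end.

```cpp
#include <cstdint>

// Hypothetical helper: how far the reported maximum falls short of the
// inclusive end implied by minimum and length. A result of 0 means the
// assert (minimum + length - 1 == maximum) would hold.
int64_t
maximum_shortfall(uint64_t minimum, uint64_t maximum, uint64_t length)
{
	uint64_t expectedMaximum = minimum + length - 1;
	return (int64_t)(expectedMaximum - maximum);
}
```

For the range above, minimum + length is 0x90800000, which is 0x2 above the reported maximum 0x907ffffe; the inclusive end would be 0x907fffff, so `maximum_shortfall(0x80000000, 0x907ffffe, 0x10800000)` is 1 and the assert fails.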
comment:44 by , 10 months ago
Blocking: | 18751 added |
---|
comment:45 by , 10 months ago
Blocking: | 18764 added |
---|
comment:46 by , 10 months ago
Priority: | normal → blocker |
---|
More instances of this have shown up. Increasing priority as it's a boot-failure regression.
comment:47 by , 10 months ago
Replying to waddlesplash:
The maximum here differs from the minimum+length by 0x2. That seems suspicious?
from https://linux-hardware.org/?probe=ac4be1ce4d&log=dmesg the range looks legit:
[ 0.556923] pci_bus 0000:00: root bus resource [mem 0x80000000-0x907ffffe window]
comment:48 by , 8 months ago
Right, but the length doesn't seem to match it. So what's going on there?
comment:49 by , 8 months ago
Blocking: | 18856 added |
---|
comment:50 by , 7 months ago
Hello,
I have made a new fix, removing the assert and trying to handle anything the ACPI tables may have:
https://review.haiku-os.org/c/haiku/+/7581
Let me know if that works on the different machines.
by , 6 months ago
Considered to be fixed by a recent commit (hrev57717). Thanks for the fix.
I fear that those end lines are only consequences of what happens before and won't help much to understand the problem. Try to play with boot options to get a complete syslog. One way to do that is to activate 'Enable on screen debug output' and take pictures of each page on screen. Then add the pics as attachments.