#5506 closed bug (fixed)
Kernel panic "heap configuration invalid - max bin count reached"
| Reported by: | drcouzelis | Owned by: | mmlr |
|---|---|---|---|
| Priority: | normal | Milestone: | R1 |
| Component: | System/Kernel | Version: | R1/Development |
| Keywords: | | Cc: | |
| Blocked By: | | Blocking: | |
| Platform: | x86 | | |
Description
This development build failed to boot from a live CD on my computer:
haiku-nightly-hrev35650-x86gcc4hybrid-cd.zip 11:28PM 27th February, 2010
I attached a "screenshot" of the output from the "bt" command.
Here is information about all of my computer hardware: hardware
Other notes: I decided to try the latest nightly build since the alpha 1 build has a graphical glitch on my computer (the colors are messed up). In other words, the alpha 1 build boots and runs on my computer. I have not done any "install" yet; I have only run from a live CD. I haven't reported the graphical glitch from the alpha 1 build because I want to see if it still exists in the latest nightly build.
Please let me know if I can provide any further information. Thank you.
Attachments (3)
Change History (16)
by , 15 years ago
comment:1 by , 15 years ago
Component: | - General → System/Kernel |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:2 by , 15 years ago
follow-up: 4 comment:3 by , 15 years ago
Thank you very much for looking into it. Yes, it is always reproducible. I also get the same error on today's nightly "hrev35693".
Since alpha 1 boots, how about I download old nightly images and find out when it stopped being able to boot on my computer? In theory, that should allow you to pinpoint what the change was that causes the bug. Shall I do that?
Is there a test I can do to see if I have memory corruption? I use Arch Linux as my primary OS, by the way.
comment:4 by , 15 years ago
Replying to drcouzelis:
Since alpha 1 boots, how about I download old nightly images and find out when it stopped being able to boot on my computer? In theory, that should allow you to pinpoint what the change was that causes the bug. Shall I do that?
If you want to invest that amount of time then that'd certainly be very much appreciated.
Is there a test I can do to see if I have memory corruption? I use Arch Linux as my primary OS, by the way.
I meant memory corruption due to Haiku bugs rather than bad hardware. Of course bad hardware could theoretically be the cause as well, though since the data in question really is static const and should be loaded and stay at a single location, it'd be strange for it to work at first and then be gone later. A memory corruption due to some bug in the boot process is more likely.
What you can do is dump the memory that supposedly contains the configuration to see what it ends up being. That way we might get an idea as to who's overwriting it. To do that, execute these commands in the kernel debugger:
symbol sHeapClasses
dw _ 8
That should give some pointer to a string (you can dump it with string <pointer> if you're curious, it should read "small"), followed by 0x32 -> 50, the initial percentage (unused in this case); 0x200 -> B_PAGE_SIZE / 8, the max allocation size; 0x1000 -> B_PAGE_SIZE, the heap page size; 0x8, the min bin size; 0x4, the alignment; 0x8, the min count per page; and 0x10, the max waste per page. If any of that doesn't match then something overwrote the config. If so, please execute dw _ 64 to get a bit more context, take another screenshot and attach it here.
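The comparison described above can be sketched in C. This is only an illustration: the field labels and the helper names (`first_mismatch`, `field_name`) are hypothetical, the expected values are the ones listed in the comment, and the string pointer in the first word is left out because its value varies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_X86 0x1000u /* B_PAGE_SIZE, per the comment above */
#define CONFIG_WORDS 7        /* words following the name pointer */

/* Hypothetical labels for the seven words after the name pointer. */
static const char *const kFieldNames[CONFIG_WORDS] = {
    "initial percentage", "max allocation size", "heap page size",
    "min bin size", "alignment", "min count per page", "max waste per page"
};

/* Expected values for the "small" configuration, as listed in the comment. */
static const uint32_t kExpected[CONFIG_WORDS] = {
    50, PAGE_SIZE_X86 / 8, PAGE_SIZE_X86, 8, 4, 8, 16
};

/* Returns the index of the first word that differs, or -1 if all match. */
int first_mismatch(const uint32_t words[CONFIG_WORDS])
{
    assert(words != NULL);
    for (int i = 0; i < CONFIG_WORDS; i++) {
        if (words[i] != kExpected[i])
            return i;
    }
    return -1;
}

/* Maps an index returned by first_mismatch() to a readable label. */
const char *field_name(int index)
{
    return (index >= 0 && index < CONFIG_WORDS) ? kFieldNames[index] : "none";
}
```

Feeding it the seven words read off the screenshot (skipping the leading pointer) gives -1 for an intact configuration, or the position of the first overwritten field otherwise.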
follow-up: 6 comment:5 by , 15 years ago
by , 15 years ago
Attachment: | r34934-symbol.jpg added |
---|
follow-up: 7 comment:6 by , 15 years ago
Replying to drcouzelis:
I've narrowed it down to somewhere between hrev34837 (1 Jan) and hrev34934 (7 Jan). There were four builds in between them. I will download them while I sleep and check them out tomorrow.
Cool, thanks for that!
I will attach a screenshot from the "dw" command. Everything seems to match up with what you said it should be.
Ok that's interesting. It indeed looks just as it should. Then I can only imagine that the class passed to the function would be incorrect, as the calculations are really all fixed and don't change from machine to machine.
Can you please run the following command:
call 12 -5
This should give the arguments passed to heap_create_allocator(). To make sure I'm not completely off, please run the first argument through the string command. It should output "grow". The second argument is the allocation base, which can vary, then the size, which should be 0x100000 (1MB), then the heap_class pointer, which is supposed to be equal to the pointer you get from symbol sHeapClasses, and then 0 (false) for not allocating the allocator structure on the heap.
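As a rough sketch of the checks listed above, the five arguments could be recorded and validated as follows. The struct, the flag names and `check_args` are hypothetical helpers for illustration and are not part of the Haiku sources. The size is deliberately not checked here, and the name is assumed to have already been decoded with the string command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of the five arguments read from the debugger output. */
struct allocator_args {
    const char *name;          /* decoded string, expected "grow" */
    uint32_t base;             /* allocation base, varies between boots */
    uint32_t size;             /* heap size, not validated in this sketch */
    uint32_t heap_class;       /* pointer value of the heap_class argument */
    uint32_t allocate_on_heap; /* expected 0 (false) */
};

/* Bit flags describing which expectation failed. */
enum {
    BAD_NAME = 1,  /* name is not "grow" */
    BAD_CLASS = 2, /* heap_class differs from the sHeapClasses address */
    BAD_FLAG = 4   /* allocate-on-heap flag is not 0 */
};

/* Returns 0 if all checks pass, otherwise a combination of BAD_* flags. */
unsigned check_args(const struct allocator_args *args,
    uint32_t heap_classes_address)
{
    unsigned failures = 0;
    assert(args != NULL);
    if (args->name == NULL || strcmp(args->name, "grow") != 0)
        failures |= BAD_NAME;
    if (args->heap_class != heap_classes_address)
        failures |= BAD_CLASS;
    if (args->allocate_on_heap != 0)
        failures |= BAD_FLAG;
    return failures;
}
```

A result of 0 would mean the call looks as expected, which shifts suspicion away from the arguments and toward the memory the allocator was created in.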
comment:7 by , 15 years ago
Replying to mmlr:
then the size which should be 0x100000 (1MB)
At that point the base and size have already been changed though, so you will likely get these values off by 80 (0x50). Can you please also get a dump of the created allocator by running:
dw <second argument found above minus the 0x50 offset> 64
In case the area somehow got unreadable/writeable memory (device memory for example) it'd show here.
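The offset arithmetic can be written out as a small C sketch. It assumes the base has been advanced by 0x50 and the size reduced by the same amount by the time the arguments are read; the macro and function names are hypothetical, and the example address in the note below is made up.

```c
#include <assert.h>
#include <stdint.h>

#define ALLOCATOR_OFFSET 0x50u  /* 80 bytes, as stated in the comment */
#define GROW_HEAP_SIZE 0x100000u /* 1MB, the nominal size argument */

/* Address to pass to dw: the observed base minus the 0x50 offset. */
uint32_t allocator_address(uint32_t observed_base)
{
    assert(observed_base >= ALLOCATOR_OFFSET);
    return observed_base - ALLOCATOR_OFFSET;
}

/* Size expected in the trace if it shrank by the same offset (assumption). */
uint32_t expected_observed_size(void)
{
    return GROW_HEAP_SIZE - ALLOCATOR_OFFSET;
}
```

For example, an observed base of 0x80001050 would give 0x80001000 as the address to dump, and the size in the trace would then be expected to read 0xFFFB0 rather than 0x100000.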
comment:8 by , 15 years ago
I've narrowed it down: hrev34909 (5 Jan) will boot but hrev34934 (7 Jan) fails to boot and has a kernel panic. There was no release on 6 Jan.
I will attach a screenshot with the output from the commands you requested.
Also, I recently read "Welcome to Kernel Debugging Land", so I can be ready to help more. :)
by , 15 years ago
Attachment: | r35718-call.jpg added |
---|
comment:9 by , 15 years ago
comment:10 by , 15 years ago
hrev35726 might have solved it then, which would point to a buggy BIOS.
comment:11 by , 15 years ago
comment:12 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:13 by , 15 years ago
Thank you very much! I now have Haiku installed on my computer. I will continue using it and try to submit helpful bug reports.
That's an extremely curious one! The allocator being created here is the grow heap, which uses a fixed configuration (the small heap one). It even uses a fixed size, which doesn't actually matter though as the size can't influence the configuration. The configuration is kept in the static sHeapClasses and is never modified. It had to be good at one point as heap_init already used it to set up the small heap before that point. I can only guess this to be memory corruption. Is this always reproducible? FWIW this exact image boots fine on QEMU here.