Opened 3 years ago
Last modified 4 months ago
#17031 reopened bug
Haiku doesn't boot unless SD slot empty — at Version 37
Reported by: | tojoko | Owned by: | pulkomandy |
---|---|---|---|
Priority: | high | Milestone: | Unscheduled |
Component: | Drivers/Disk/MMC | Version: | |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | x86 |
Description (last modified by )
Somewhere between hrev54725 (20 Nov 2020) and 18 Apr 2021, something broke Haiku support for my old netbook. It did work before, but it now freezes on bootup.
I will try one more update to the latest nightly, but the bug seems quite consistent to me right now.
Kind regards, Tony
Change History (60)
by , 3 years ago
Attachment: | previous_syslog added |
---|
by , 3 years ago
Attachment: | syslog.old added |
---|
comment:1 by , 3 years ago
Description: | modified (diff) |
---|---|
Keywords: | hrev54725 removed |
Milestone: | Unscheduled → R1/beta3 |
Priority: | normal → blocker |
comment:2 by , 3 years ago
follow-ups: 5 8 comment:3 by , 3 years ago
In your logs I see:
- Lots of errors from USB devices. If possible, try to boot with removing USB devices
- You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?
follow-up: 10 comment:4 by , 3 years ago
Please try to boot while block listing the intel_cstates module.
by , 3 years ago
Attachment: | previous_syslog.2 added |
---|
by , 3 years ago
Attachment: | syslog.2.old added |
---|
follow-up: 6 comment:5 by , 3 years ago
Replying to pulkomandy:
In your logs I see:
- Lots of errors from USB devices. If possible, try to boot with removing USB devices
- You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?
Last known working configuration was hrev54725
It hangs after the third boot icon.
comment:6 by , 3 years ago
Replying to tojoko:
Replying to pulkomandy:
In your logs I see:
- Lots of errors from USB devices. If possible, try to boot with removing USB devices
- You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?
Last known working configuration was hrev54725
The first one known not to work is hrev55193
It hangs after the third boot icon.
No, the 4th! - The one with the leaf.
comment:8 by , 3 years ago
Replying to pulkomandy:
In your logs I see:
- Lots of errors from USB devices. If possible, try to boot with removing USB devices
- You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?
You were right - it boots fine without the SD card. Am I the only one who considers that weird?
comment:9 by , 3 years ago
Component: | - General → Drivers/Disk/MMC |
---|---|
Owner: | changed from | to
comment:10 by , 3 years ago
Replying to korli:
Please try to boot while block listing the intel_cstates module.
Hi and thanks - but where am I supposed to find that option? (I thought I knew about the advanced boot options.) Sincerely
follow-up: 12 comment:11 by , 3 years ago
Proposed fix: https://review.haiku-os.org/c/haiku/+/4121
My understanding is that the intel_cstates module is not involved in this case, it's just the SD/MMC driver deadlocking.
The build bot should soon offer a test build at review.haiku-os.org that you can use to confirm the problem is fixed for you. Then this can be merged and integrated in beta3.
Thanks for testing!
comment:12 by , 3 years ago
Replying to pulkomandy:
Proposed fix: https://review.haiku-os.org/c/haiku/+/4121
My understanding is that the intel_cstates module is not involved in this case, it's just the SD/MMC driver deadlocking.
The build bot should soon offer a test build at review.haiku-os.org that you can use to confirm the problem is fixed for you. Then this can be merged and integrated in beta3.
Thanks for testing!
Thank you - but no improvement yet.
comment:13 by , 3 years ago
In that case, can you provide an updated syslog? Either with hrev55202 or later, or with the build from Gerrit.
comment:16 by , 3 years ago
Resolution: | → not reproducible |
---|---|
Status: | new → closed |
comment:17 by , 3 years ago
Milestone: | R1/beta3 → R1/beta4 |
---|---|
Priority: | blocker → high |
Resolution: | not reproducible |
Status: | closed → reopened |
Pretty clearly the original reporter does not agree that this is closed. Please don't close tickets this way.
Let's make it non-blocker for beta3 and hope we can soon hear back from tojoko with an updated syslog.
comment:18 by , 3 years ago
Apologies, I seem to have gotten confused and closed it rather than deprioritize it.
comment:19 by , 3 years ago
Sorry, I still own the computer, but the power supply got damaged by a flood. I'll keep you posted.
comment:20 by , 3 years ago
It still doesn't boot unless I remove the SD card - then it boots fine.
I do have three syslogs (I would expect two: one with the failure, one with a full boot).
Kind regards,
Tony
by , 3 years ago
Attachment: | previous_syslog.3 added |
---|
by , 3 years ago
Attachment: | syslog.3.old added |
---|
by , 3 years ago
Attachment: | es1370.log added |
---|
by , 3 years ago
After the latest update it doesn't boot anything anymore, with or without sd-card.
comment:21 by , 3 years ago
Still the same - I can boot (and update) Haiku if, and only if, I remove the SD card beforehand. No other OS has this problem on this hardware (Windows 10 and Lubuntu both boot just fine) :(
comment:22 by , 3 years ago
Summary: | Haiku doesn't boot (anymore!) → Haiku doesn't boot unless SD slot empty |
---|
comment:24 by , 3 years ago
Unfortunately no improvement yet. Still the same: it hangs on bootup with the SD card plugged in and boots fine when it is removed. It could still be that the SD card is somehow broken or badly formatted. But it shouldn't end like this either way, I guess.
comment:25 by , 3 years ago
Hello, The syslog you uploaded this time contains, as far as I can see, 3 boot attempts with sd card disconnected. I see it from this log:
KERN: sdhci_pci: Card not inserted, not powering on for now
In previous cases you uploaded the previous_syslog file. That one seems more useful, as previous_syslog, previous_syslog.2 and previous_syslog.3 all contain some info about an SD card failing to work.
In previous_syslog.3:
1313 sdhci_pci: supports_device(vid:8086 pid:811c)
1314 sdhci_pci: SDHCI Device found! Subtype: 0x0005, type: 0x0008
1315 sdhci_pci: CALLED status_t init_device(device_node *, void **)
1316 sdhci_pci: CALLED status_t register_child_devices(void *)
1317 sdhci_pci: CALLED status_t init_bus(device_node *, void **)
1318 sdhci_pci: Register SD bus at slot 1, using bar 0
1319 sdhci_pci: interrupts count: 0
1320 sdhci_pci: irq interrupt line: 22
1321 sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
1322 mmc_bus: CALLED status_t mmc_bus_added_device(device_node *)
1323 module: Search for bus_managers/mmc_bus/device/v1 failed.
1324 sdhci_pci: No vendor or device id attribute
1629 sdhci_pci: supports_device(vid:8086 pid:811c)
1630 sdhci_pci: SDHCI Device found! Subtype: 0x0005, type: 0x0008
1631 sdhci_pci: CALLED status_t init_device(device_node *, void **)
1632 sdhci_pci: CALLED status_t register_child_devices(void *)
1633 sdhci_pci: CALLED status_t init_bus(device_node *, void **)
1634 sdhci_pci: Register SD bus at slot 1, using bar 0
1635 sdhci_pci: interrupts count: 0
1636 set MTRRs to:
1637 mtrr: 0: base: 0x3f6b0000, size: 0x10000, type: 0
1638 mtrr: 1: base: 0x3f6c0000, size: 0x40000, type: 0
1639 mtrr: 2: base: 0x80000000, size: 0x80000000, type: 0
1640 mtrr: 3: base: 0x3f800000, size: 0x800000, type: 1
1641 sdhci_pci: irq interrupt line: 22
1642 sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
1643 mmc_bus: CALLED status_t mmc_bus_added_device(device_node *)
1644 module: Search for bus_managers/mmc_bus/device/v1 failed.
1645 set MTRRs to:
1646 mtrr: 0: base: 0x3f6b0000, size: 0x10000, type: 0
1647 mtrr: 1: base: 0x3f6c0000, size: 0x40000, type: 0
1648 mtrr: 2: base: 0x80000000, size: 0x80000000, type: 0
1649 mtrr: 3: base: 0x3f800000, size: 0x800000, type: 1
1650 sdhci_pci: No vendor or device id attribute
This one seems to have the mmc_bus driver disabled.
So the only info I have is from the older previous_syslog and previous_syslog.2:
659 sdhci_pci: supports_device(vid:8086 pid:811c)
660 sdhci_pci: SDHCI Device found! Subtype: 0x0005, type: 0x0008
661 sdhci_pci: CALLED status_t init_device(device_node *, void **)
662 sdhci_pci: CALLED status_t register_child_devices(void *)
663 sdhci_pci: CALLED status_t init_bus(device_node *, void **)
664 sdhci_pci: Register SD bus at slot 1, using bar 0
665 sdhci_pci: interrupts count: 0
666 sdhci_pci: irq interrupt line: 22
667 sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
668 mmc_bus: CALLED status_t mmc_bus_added_device(device_node *)
669 mmc_bus: CALLED status_t mmc_bus_init(device_node *, void **)
670 mmc_bus: CALLED MMCBus::MMCBus(device_node *)
671 sdhci_pci: CALLED void set_scan_semaphore(void *, long int)
672 mmc_bus: MMC bus object created
673 mmc_bus: Reset the bus...
674 sdhci_pci: ExecuteCommand(0, 0)
675 mmc_disk: CALLED float mmc_disk_supports_device(device_node *)
676 sdhci_pci: interrupt function called 1
677 sdhci_pci: Wait for command complete...sdhci_pci: Command complete interrupt handled
678 mmc_disk: Could not get device type
679 sdhci_pci: Command response available
680 sdhci_pci: Command execution 0 complete
681 Highpoint-IDE: supports_device()
682 mmc_bus: CMD0 result: No error
683 Highpoint-IDE: supports_device()
684 KDiskDeviceManager::_Scan(/dev/disk/ata)
685 KDiskDeviceManager::_Scan(/dev/disk/ata/0)
686 KDiskDeviceManager::_Scan(/dev/disk/ata/0/master)
687 KDiskDeviceManager::_Scan(/dev/disk/ata/0/master/raw)
688 found device: /dev/disk/ata/0/master/raw
689 DMAResource@0x8280a880: low/high 0/100000000, max segment count 512, align 2, boundary 65536, max transfer 33553920, max segment size 33554432
690 slab memory manager: created area 0xde001000 (598)
691 slab memory manager: created area 0xdf001000 (599)
692 mmc_bus: Scanning the bus
693 sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
694 sdhci_pci: CALLED void set_bus_width(void *, int)
695 sdhci_pci: ExecuteCommand(8, 1aa)
696 sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
697 sdhci_pci: interrupt function called 1
698 sdhci_pci: Command complete interrupt handled
699 sdhci_pci: real status = 0 command line busy: 0
700 sdhci_pci: Command response available
701 sdhci_pci: Command execution 8 complete
702 sdhci_pci: ExecuteCommand(55, 0)
703 sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
704 sdhci_pci: interrupt function called 1
705 sdhci_pci: Command complete interrupt handled
706 sdhci_pci: real status = 0 command line busy: 0
707 sdhci_pci: Command response available
708 sdhci_pci: Command execution 55 complete
709 sdhci_pci: ExecuteCommand(41, 40ff8000)
710 sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
711 sdhci_pci: interrupt function called 1
712 sdhci_pci: Command complete interrupt handled
713 sdhci_pci: real status = 0 command line busy: 0
714 sdhci_pci: Command response available
715 sdhci_pci: Command execution 41 complete
716 mmc_bus: Card is busy
So far this looks normal.
Later on the card itself finishes initializing:
1155 sdhci_pci: ExecuteCommand(55, 0)
1156 sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1157 sdhci_pci: interrupt function called 1
1158 sdhci_pci: Command complete interrupt handled
1159 sdhci_pci: real status = 0 command line busy: 0
1160 sdhci_pci: Command response available
1161 sdhci_pci: Command execution 55 complete
1162 sdhci_pci: ExecuteCommand(41, 40ff8000)
1163 sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1164 sdhci_pci: interrupt function called 1
1165 sdhci_pci: Command complete interrupt handled
1166 sdhci_pci: real status = 0 command line busy: 0
1167 sdhci_pci: Command response available
1168 sdhci_pci: Command execution 41 complete
1169 mmc_bus: Voltage range: ff8000
We then finish initializing it:
1170 sdhci_pci: ExecuteCommand(2, 0)
1171 sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1172 sdhci_pci: real status = 0 command line busy: 1
1173 sdhci_pci: interrupt function called 1
1174 sdhci_pci: Command complete interrupt handled
1175 sdhci_pci: Command response available
1176 sdhci_pci: Command execution 2 complete
1177 sdhci_pci: ExecuteCommand(3, 0)
1178 sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1179 sdhci_pci: interrupt function called 1
1180 sdhci_pci: Command complete interrupt handled
1181 sdhci_pci: real status = 0 command line busy: 0
1182 sdhci_pci: Command response available
1183 sdhci_pci: Command execution 3 complete
1184 mmc_bus: RCA: b368 Status: 520
1185 mmc_disk: CALLED float mmc_disk_supports_device(device_node *)
1186 mmc_disk: SDHC card found, parent: 0x82b96328
1187 mmc_disk: CALLED status_t mmc_disk_register_device(device_node *)
1188 mmc_disk: CALLED status_t mmc_disk_init_driver(device_node *, void **)
1189 mmc_disk: MMC bus handle: 0x814749e0 bus_managers/mmc/driver_v1
1190 DMAResource@0x8280a780: low/high 0/100000000, max segment count 1, align 512, boundary 524288, max transfer 18446744073709551615, max segment size 523776
1191 btrfs [2479786: 16] invalid superblock!
1192 mmc_disk: MMC card device initialized for RCA b368
1193 mmc_disk: CALLED status_t mmc_disk_register_child_devices(void *)
1194 publish device: node 0x82b962d8, path disk/mmc/0/raw, module drivers/disk/mmc/mmc_disk/device_v1
So far so good...
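For readers following the excerpts above, here is a minimal sketch that maps the command numbers in the `ExecuteCommand(...)` lines to their names in the SD card identification sequence. The function and table names are made up for illustration and are not part of the Haiku driver; the only assumption about the log is the `ExecuteCommand(index, argument)` format seen in this ticket.

```python
import re

# Standard SD command names for the indexes that appear in this ticket's logs.
SD_COMMANDS = {
    0: "GO_IDLE_STATE",
    2: "ALL_SEND_CID",
    3: "SEND_RELATIVE_ADDR",
    8: "SEND_IF_COND",
    55: "APP_CMD",
}
# Application-specific commands are only valid right after CMD55.
SD_APP_COMMANDS = {41: "SD_SEND_OP_COND"}

EXECUTE = re.compile(r"ExecuteCommand\((\d+), ([0-9a-f]+)\)")


def annotate(lines):
    """Return a readable name for every ExecuteCommand line, in order."""
    names = []
    app_next = False  # True when the previous command was CMD55
    for line in lines:
        match = EXECUTE.search(line)
        if not match:
            continue
        index = int(match.group(1))
        if app_next and index in SD_APP_COMMANDS:
            names.append(f"ACMD{index} {SD_APP_COMMANDS[index]}")
        else:
            names.append(f"CMD{index} {SD_COMMANDS.get(index, 'unknown')}")
        app_next = index == 55
    return names


# Command sequence taken from the excerpts in this comment.
SEQUENCE = [
    "sdhci_pci: ExecuteCommand(0, 0)",
    "sdhci_pci: ExecuteCommand(8, 1aa)",
    "sdhci_pci: ExecuteCommand(55, 0)",
    "sdhci_pci: ExecuteCommand(41, 40ff8000)",
    "sdhci_pci: ExecuteCommand(2, 0)",
    "sdhci_pci: ExecuteCommand(3, 0)",
]
for name in annotate(SEQUENCE):
    print(name)
```

Read this way, the log shows a reset, an interface check, the operating-condition negotiation (repeated while the card reports busy), and then the card being identified and given its relative address (RCA).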
Then we send command 2 (CMD2) a second time to check whether any other SD card is present (it's possible, but not common, to have multiple cards connected):
1195 sdhci_pci: ExecuteCommand(2, 0)
1196 returned: -1
1197 sdhci_pci: Wait for command complete... trying: file_systems/exfat/v1
1198 sdhci_pci: command complete sem acquired, status: 0
1199 sdhci_pci: real status = 18000 command line busy: 1
This command seems to confuse the card or the controller: we see "command line busy", which means the command did not complete as expected.
I see that we get "command complete sem acquired" without any message from the interrupt handler (normally there would be an "interrupt function called 1" log just before). In older versions of the driver there were problems with the handling of this synchronization. That could still be the case, but I want to make sure I am looking at logs matching the current code of the driver.
Can you capture a log again, making sure the drivers (sdhci_pci, mmc_bus and mmc_disk) are enabled and a card is inserted during the boot that the log covers?
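The check described above (a command that completes without any interrupt handler message, or with a non-zero "real status") can be automated when going through a long syslog. The following is only a sketch: the helper names are hypothetical, the message formats are the ones visible in this ticket, and it assumes the status value is printed in hexadecimal.

```python
import re

# Colour escape residue such as "[33m" / "[0m" left in pasted syslogs.
ANSI_RESIDUE = re.compile(r"\x1b?\[\d+m")
EXECUTE = re.compile(r"ExecuteCommand\((\d+), ([0-9a-f]+)\)")
REAL_STATUS = re.compile(r"real status = ([0-9a-f]+) command line busy: (\d)")


def summarize_commands(lines):
    """Group log lines by ExecuteCommand and record what followed each one."""
    commands = []
    current = None
    for raw in lines:
        line = ANSI_RESIDUE.sub("", raw)
        match = EXECUTE.search(line)
        if match:
            current = {
                "cmd": int(match.group(1)),
                "interrupt_seen": False,
                "real_status": None,
                "line_busy": None,
            }
            commands.append(current)
            continue
        if current is None:
            continue
        if "interrupt function called" in line:
            current["interrupt_seen"] = True
        status = REAL_STATUS.search(line)
        if status:
            # Assumption: the driver prints this value in hexadecimal.
            current["real_status"] = int(status.group(1), 16)
            current["line_busy"] = status.group(2) == "1"
    return commands


def suspicious(commands):
    """Commands with no interrupt message or a non-zero real status."""
    return [
        c for c in commands
        if not c["interrupt_seen"] or c["real_status"] not in (None, 0)
    ]


# The two CMD2 attempts quoted in this comment (abridged).
SAMPLE = [
    "1170 [33msdhci_pci:[0m ExecuteCommand(2, 0)",
    "1171 [33msdhci_pci:[0m Wait for command complete...[33msdhci_pci:[0m command complete sem acquired, status: 0",
    "1172 [33msdhci_pci:[0m real status = 0 command line busy: 1",
    "1173 [33msdhci_pci:[0m interrupt function called 1",
    "1195 [33msdhci_pci:[0m ExecuteCommand(2, 0)",
    "1196 returned: -1",
    "1197 [33msdhci_pci:[0m Wait for command complete... trying: file_systems/exfat/v1",
    "1198 [33msdhci_pci:[0m command complete sem acquired, status: 0",
    "1199 [33msdhci_pci:[0m real status = 18000 command line busy: 1",
]

for entry in suspicious(summarize_commands(SAMPLE)):
    print(f"CMD{entry['cmd']}: interrupt_seen={entry['interrupt_seen']} "
          f"real_status={entry['real_status']:#x}")
```

On the sample, only the second CMD2 is reported, which matches the reading given in this comment. Because kernel log lines from different threads interleave, the grouping is approximate and should be treated as a hint about where to look, not as proof.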
comment:26 by , 3 years ago
Well, sorry, I tried to go by the time stamps.
And thanks for the clarifications; I would like to do some more research on my own.
However, it boots fine until the center icon, the one I like to think of as the Canadian flag. I don't end up in KDL; it just never finishes until I hit Ctrl+Alt+Del.
by , 3 years ago
Attachment: | previous_syslog.5 added |
---|
The very latest proof of boot failure, if i'm not mistaken.
follow-up: 29 comment:27 by , 3 years ago
Hello,
Ok, so the situation did not change and there are indeed some strange decisions (from me) in the way the interrupts are handled.
Here is a change to put the interrupt handling in a more logical order: https://review.haiku-os.org/c/haiku/+/4985
Let me know if that helps with the problem or not (you can build it yourself or wait for madmax's bot to build the change and provide a test haiku image with it on Gerrit).
comment:29 by , 3 years ago
Replying to pulkomandy:
Hello,
Ok, so the situation did not change and there are indeed some strange decisions (from me) in the way the interrupts are handled.
Here is a change to put the interrupt handling in a more logical order: https://review.haiku-os.org/c/haiku/+/4985
Let me know if that helps with the problem or not (you can build it yourself or wait for madmax's bot to build the change and provide a test haiku image with it on Gerrit).
It took me a while to download the image and write it to a USB stick (with only one available, that was tricky).
I thought the solution would be to keep track of the loops and, as soon as #numberOfDevicesFound != #numberOfLoops, to break out of the loop, whether or not all SD cards have been found.
However, I had hoped to have better news - with your solution the problem is the same.
@bruno - I remove the SD card for the bug reports.
Cheers, Tony
follow-up: 31 comment:30 by , 3 years ago
Indeed, this didn't change a lot but it's a bit easier to check the execution flow.
So, something is not working right with the second CMD2. It should get no answer from the card, but it should time out in that case. On your controller, it seems it doesn't, and we get stuck there forever.
If we can't figure out a way to make this controller time out properly, we could also add a "quirk" to recognize this specific controller and limit it to a single SD card (which will be good enough in most cases, since setups with multiple SD cards on the same controller are very uncommon).
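To illustrate the two options discussed here (a bounded wait on the second CMD2, and a per-controller "quirk" that limits the scan to one card), below is a small simulation. It is not Haiku driver code: `FakeController`, `scan_bus` and the quirk table are hypothetical, and only the PCI IDs (8086:811c) come from the logs in this ticket.

```python
import threading

# Hypothetical quirk table: controllers limited to a single card.
# The vendor/device pair is the one reported in this ticket's syslog.
SINGLE_CARD_QUIRKS = {(0x8086, 0x811C)}


class FakeController:
    """Stand-in for a host controller; only models CMD2 completion."""

    def __init__(self, cards_present, vendor, device):
        self.vendor = vendor
        self.device = device
        self._remaining = cards_present
        self._done = threading.Semaphore(0)

    def send_cmd2(self):
        # A card answering CMD2 signals completion. With no card left,
        # nothing ever releases the semaphore, like the hang described here.
        if self._remaining > 0:
            self._remaining -= 1
            self._done.release()

    def wait_complete(self, timeout):
        # Bounded wait: returns False instead of blocking forever.
        return self._done.acquire(timeout=timeout)


def scan_bus(controller, timeout=0.05, max_cards=8):
    """Count cards, stopping on a timeout or on the quirk limit."""
    if (controller.vendor, controller.device) in SINGLE_CARD_QUIRKS:
        max_cards = 1  # never send the second CMD2 on this controller
    found = 0
    while found < max_cards:
        controller.send_cmd2()
        if not controller.wait_complete(timeout):
            break  # no answer: assume there are no more cards
        found += 1
    return found


print(scan_bus(FakeController(1, 0x8086, 0x811C)))  # quirk path
print(scan_bus(FakeController(1, 0x1234, 0x5678)))  # timeout path
```

Both paths end the scan after the first card. The quirk avoids sending the problematic second CMD2 at all, while the timeout keeps multi-card setups working on controllers that behave as expected.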
comment:31 by , 3 years ago
Replying to pulkomandy:
Indeed, this didn't change a lot but it's a bit easier to check the execution flow.
So, something is not working right with the second CMD2. It should get no answer from the card, but it should time out in that case. On your controller, it seems it doesn't, and we get stuck there forever.
If we can't figure out a way to make this controller time out properly, we could also add a "quirk" to recognize this specific controller and limit it to a single SD card (which will be good enough in most cases, since setups with multiple SD cards on the same controller are very uncommon).
But it worked once, at least in hrev51109, which makes me wonder whether I should just downgrade or switch to the last stable release / beta3!?
But I'm not sure whether I can still reproduce the error if I remove the SD card permanently or format it.
It might be that my controller is very uncommon - but maybe it's just very uncommon to leave an SD card in the notebook (normally they are add-ons for smartphones, I would guess).
Sincerely, Tony
by , 3 years ago
Attachment: | previous_syslog.7 added |
---|
syslog of an old hrev51109 booting without any complications
follow-up: 33 comment:32 by , 3 years ago
But I'm not sure whether I can still reproduce the error if I remove the SD card permanently or format it.
I don't think so, we don't get to the point where we try to read anything from the SD card in this case.
It might be that my controller is very uncommon
Maybe, but the same driver could be useful later for the ARM port. The difficulty here is that there is a standard spec, but many different controllers implementing it. They can behave a bit different from each other, so depending on the exact controller used, some may be more forgiving to errors, and some may fail to follow the spec on some details. So it's hard to write code that will work everywhere. But as we fix problems with the first few controllers, I hope we will find out that the differences are always in the same areas, and the driver will easily support a lot more systems.
comment:33 by , 3 years ago
Replying to pulkomandy:
But I'm not sure whether I can still reproduce the error if I remove the SD card permanently or format it.
I don't think so, we don't get to the point where we try to read anything from the SD card in this case.
Right, I tried again and I can boot, but I can't access my SD card on that old version of Haiku.
The problem is that the USB mouse troubles the old version and makes me end up in KDL or some sort of endless loop. That _isn't_ a problem in the latest nightly, as long as I don't leave the SD card plugged in during boot. (So I don't see any sense in filing a bug report for it anymore.)
But I have to admit, I haven't tried the latest release yet - although I'm somewhat concerned that I will find one of those two problems there again.
It might be that my controller is very uncommon
Maybe, but the same driver could be useful later for the ARM port. The difficulty here is that there is a standard spec, but many different controllers implementing it. They can behave a bit different from each other, so depending on the exact controller used, some may be more forgiving to errors, and some may fail to follow the spec on some details. So it's hard to write code that will work everywhere. But as we fix problems with the first few controllers, I hope we will find out that the differences are always in the same areas, and the driver will easily support a lot more systems.
ack
comment:34 by , 3 years ago
Hi, I think your syslog from hrev51109 predates the introduction of the SD/MMC drivers, I guess that explains why it works. It is not of much help for debugging, however.
I have made a new attempt at improving the situation by enabling more interrupts in https://review.haiku-os.org/c/haiku/+/5057 , let me know if that changes something.
comment:35 by , 3 years ago
Is your SD card corrupted? Does it work on another system? Mine is corrupt, and if I insert it Haiku will crash, on both 32-bit and 64-bit.
comment:36 by , 3 years ago
Is your SD card corrupted? Does it work on another system? Mine is corrupt, and if I insert it Haiku will crash, on both 32-bit and 64-bit.
Can you please stop mixing issues together? I have already mentioned to both of you that your bugs are not related. I am aware of the other bug you already reported, and if they were the same, I would have closed one as a duplicate of the other already.
In this case we don't get to the point where we even try to read anything from the filesystem on the card. So it doesn't matter what's on the card here.
by , 3 years ago
Now it doesn't boot at all anymore. However, I'm not sure if I'm using the right version.
comment:37 by , 2 years ago
Description: | modified (diff) |
---|
Sorry, it doesn't seem to be easy, if possible at all, for me to get a useful syslog.
How long is the boot-up supposed to be?
Can you tell us which is the last known working revision and which one is the first known broken?
Where does the boot stop exactly? On the boot screen? If so, which icons are lit up?
Also, which version was used to create the syslogs you have attached?