Opened 3 years ago

Last modified 18 months ago

#17031 reopened bug

Haiku doesn't boot unless SD slot empty

Reported by: tojoko
Owned by: pulkomandy
Priority: high
Milestone: Unscheduled
Component: Drivers/Disk/MMC
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Platform: x86

Description (last modified by pulkomandy)

Somewhere between hrev54725 (20 Nov. 2020) and 18 Apr. 2021, something broke Haiku support for my old netbook (it did work before!). It now freezes on bootup.

I will try one more update to the latest nightly, but the bug seems quite consistent to me right now.

Kind regards, Tony

Attachments (28)

previous_syslog (75.1 KB ) - added by tojoko 3 years ago.
syslog.old (512.0 KB ) - added by tojoko 3 years ago.
previous_syslog.2 (64.8 KB ) - added by tojoko 3 years ago.
syslog.2 (495.1 KB ) - added by tojoko 3 years ago.
syslog.2.old (512.0 KB ) - added by tojoko 3 years ago.
syslog (151.1 KB ) - added by tojoko 3 years ago.
previous_syslog.3 (76.9 KB ) - added by tojoko 3 years ago.
syslog.3.old (512.1 KB ) - added by tojoko 3 years ago.
syslog.3 (69.1 KB ) - added by tojoko 3 years ago.
es1370.log (109 bytes ) - added by tojoko 3 years ago.
syslog.4 (388.9 KB ) - added by tojoko 3 years ago.
After the latest update it doesn't boot anything anymore, with or without sd-card.
syslog.4.old (512.0 KB ) - added by tojoko 3 years ago.
And once more.
previous_syslog.4 (225.4 KB ) - added by tojoko 3 years ago.
last syslog of failed boot after last update
syslog.5.old (512.0 KB ) - added by tojoko 2 years ago.
still the same
syslog.5 (478.7 KB ) - added by tojoko 2 years ago.
the latest one (hrev55877)
previous_syslog.5 (48.1 KB ) - added by tojoko 2 years ago.
The very latest proof of boot failure, if i'm not mistaken.
previous_syslog.6 (48.8 KB ) - added by tojoko 2 years ago.
The very last and 'fixed' suggestion.
previous_syslog.7 (159.3 KB ) - added by tojoko 2 years ago.
syslog of an old hrev51109 booting without any complications
syslog.6 (265.4 KB ) - added by tojoko 2 years ago.
Now it doesn't boot at all anymore. However, I'm not sure if I'm using the right version.
syslog.7 (106.6 KB ) - added by tojoko 2 years ago.
No improvement yet.
syslog.6.old (512.0 KB ) - added by tojoko 2 years ago.
just another syslog
previous_syslog.8 (82.3 KB ) - added by tojoko 20 months ago.
hrev56364I
syslog.8 (294.6 KB ) - added by tojoko 20 months ago.
hrev56364II
syslog.9 (158.6 KB ) - added by tojoko 20 months ago.
hrev56306
syslog.10 (87.2 KB ) - added by tojoko 20 months ago.
One more try, thanks to @madmax
syslog.11 (305.4 KB ) - added by tojoko 20 months ago.
No, sorry, doesn't work.
previous_syslog.9 (68.3 KB ) - added by tojoko 20 months ago.
it just doesn't work
syslog.12 (91.5 KB ) - added by tojoko 20 months ago.
one more try

Change History (74)

by tojoko, 3 years ago

Attachment: previous_syslog added

by tojoko, 3 years ago

Attachment: syslog.old added

comment:1 by pulkomandy, 3 years ago

Description: modified (diff)
Keywords: hrev54725 removed
Milestone: Unscheduled → R1/beta3
Priority: normal → blocker

comment:2 by pulkomandy, 3 years ago

Can you tell us which is the last known working revision and which one is the first known broken?

Where does the boot stop exactly? On the boot screen? If so, which icons are lit up?

Also, which version was used to create the syslogs you have attached?

comment:3 by pulkomandy, 3 years ago

In your logs I see:

  • Lots of errors from USB devices. If possible, try booting with the USB devices removed.
  • You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?

comment:4 by korli, 3 years ago

Please try to boot while block listing the intel_cstates module.

by tojoko, 3 years ago

Attachment: previous_syslog.2 added

by tojoko, 3 years ago

Attachment: syslog.2 added

by tojoko, 3 years ago

Attachment: syslog.2.old added

in reply to:  3 ; comment:5 by tojoko, 3 years ago

Replying to pulkomandy:

In your logs I see:

  • Lots of errors from USB devices. If possible, try booting with the USB devices removed.
  • You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?

Last known working configuration was hrev54725

It hangs after the third boot icon.

in reply to:  5 comment:6 by tojoko, 3 years ago

Replying to tojoko:

Replying to pulkomandy:

In your logs I see:

  • Lots of errors from USB devices. If possible, try booting with the USB devices removed.
  • You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?

Last known working configuration was hrev54725

The first one known not to work was / is hrev55193

It hangs after the third boot icon.

No, the 4th! - The one with the leaf.

Last edited 3 years ago by tojoko

by tojoko, 3 years ago

Attachment: syslog added

comment:7 by korli, 3 years ago

KERN: no valid cpufreq module found
KERN: no valid cpuidle module found

in reply to:  3 comment:8 by tojoko, 3 years ago

Replying to pulkomandy:

In your logs I see:

  • Lots of errors from USB devices. If possible, try booting with the USB devices removed.
  • You have an SD/MMC reader and the MMC driver is trying to access it, but I see no response when trying to get the SD card capacity. Is there an SD card inserted? Can you try removing it? Can you try disabling the MMC driver from the boot menu?

You were right: it boots fine without the SD card. Am I the only one who considers that weird?

comment:9 by pulkomandy, 3 years ago

Component: - General → Drivers/Disk/MMC
Owner: changed from nobody to pulkomandy

in reply to:  4 comment:10 by tojoko, 3 years ago

Replying to korli:

Please try to boot while block listing the intel_cstates module.

Hi and thanks, but where am I supposed to find that option? (I thought I knew about the advanced boot options.) Sincerely

comment:11 by pulkomandy, 3 years ago

Proposed fix: https://review.haiku-os.org/c/haiku/+/4121

My understanding is that the intel_cstates module is not involved in this case, it's just the SD/MMC driver deadlocking.

The build bot should soon offer a test build at review.haiku-os.org that you can use to confirm the problem is fixed for you. Then this can be merged and integrated in beta3.

Thanks for testing!

in reply to:  11 comment:12 by tojoko, 3 years ago

Replying to pulkomandy:

Proposed fix: https://review.haiku-os.org/c/haiku/+/4121

My understanding is that the intel_cstates module is not involved in this case, it's just the SD/MMC driver deadlocking.

The build bot should soon offer a test build at review.haiku-os.org that you can use to confirm the problem is fixed for you. Then this can be merged and integrated in beta3.

Thanks for testing!

Thank you, but no improvement yet.

comment:13 by pulkomandy, 3 years ago

In that case, can you provide an updated syslog? Either with hrev55202 or later, or with the build from Gerrit.

comment:14 by nielx, 3 years ago

Is this issue still a blocker for R1 beta 3?

comment:15 by Coldfirex, 3 years ago

It's been a month since tojoko replied so I would say no.

comment:16 by nielx, 3 years ago

Resolution: not reproducible
Status: new → closed

comment:17 by pulkomandy, 3 years ago

Milestone: R1/beta3 → R1/beta4
Priority: blocker → high
Resolution: not reproducible
Status: closed → reopened

Pretty clearly the original reporter does not agree that this is closed. Please don't close tickets this way.

Let's make it non-blocker for beta3 and hope we can soon hear back from tojoko with an updated syslog.

comment:18 by nielx, 3 years ago

Apologies, I seem to have gotten confused and closed it rather than deprioritize it.

comment:19 by tojoko, 3 years ago

Sorry, I still own the computer, but the power supply was damaged by a flood. I'll keep you posted.

comment:20 by tojoko, 3 years ago

It still doesn't boot, unless i remove the sd-card - then it boots fine.

I do have three syslogs (I would expect two: one with the failure, one with a full boot).

kind regards

tony

by tojoko, 3 years ago

Attachment: previous_syslog.3 added

by tojoko, 3 years ago

Attachment: syslog.3.old added

by tojoko, 3 years ago

Attachment: syslog.3 added

by tojoko, 3 years ago

Attachment: es1370.log added

by tojoko, 3 years ago

Attachment: syslog.4 added

After the latest update it doesn't boot anything anymore, with or without sd-card.

by tojoko, 3 years ago

Attachment: syslog.4.old added

And once more.

comment:21 by tojoko, 3 years ago

Still the same: I can boot (and update) Haiku if, and only if, I remove the SD card beforehand. No other OS has this problem on this hardware (Windows 10 and Lubuntu both boot just fine) :(

by tojoko, 3 years ago

Attachment: previous_syslog.4 added

last syslog of failed boot after last update

by tojoko, 2 years ago

Attachment: syslog.5.old added

still the same

comment:22 by waddlesplash, 2 years ago

Summary: Haiku doesn't boot (anymore!) → Haiku doesn't boot unless SD slot empty

comment:23 by korli, 2 years ago

Could you update and check? Thanks

comment:24 by tojoko, 2 years ago

Unfortunately no improvement yet. Still the same: it hangs on bootup with the SD card plugged in and boots fine when it is removed. It could still be that the SD card is somehow broken or badly formatted, but it shouldn't end like this either way, I guess.

by tojoko, 2 years ago

Attachment: syslog.5 added

the latest one (hrev55877)

comment:25 by pulkomandy, 2 years ago

Hello, The syslog you uploaded this time contains, as far as I can see, 3 boot attempts with sd card disconnected. I see it from this log:

	KERN: sdhci_pci: Card not inserted, not powering on for now

In previous cases you had uploaded the previous_syslog file, this seems more useful as previous_syslog, previous_syslog.2 and previous_syslog.3 all contain some info about an SD card failing to work.
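Whether a given syslog covers a boot with or without a card can be checked mechanically. The following is a minimal sketch, assuming the two marker strings seen in the excerpts of this ticket ("Card not inserted" and "SDHC card found") are enough to tell the cases apart; the function name is hypothetical and other card types may log different messages.

```python
import re

# Removes colour codes such as "[33m" and "[0m"; the ESC byte is optional
# because some copies of the log have lost it.
ANSI_RE = re.compile(r"\x1b?\[\d+m")


def card_states(lines):
    """Return "absent" or "present" for every SD card marker found, in order.

    Assumption: these two messages, taken from the excerpts in this ticket,
    are sufficient markers. This is not an exhaustive list of driver output.
    """
    states = []
    for raw in lines:
        line = ANSI_RE.sub("", raw)
        if "sdhci_pci: Card not inserted" in line:
            states.append("absent")
        elif "mmc_disk: SDHC card found" in line:
            states.append("present")
    return states
```

Running it over each attachment, for example `card_states(open("syslog.5", errors="replace"))`, gives one entry per detection, which makes it easier to pick the file that actually shows the failing case.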

In previous_syslog.3:

1313	sdhci_pci: supports_device(vid:8086 pid:811c)
1314	sdhci_pci: SDHCI Device found! Subtype: 0x0005, type: 0x0008
1315	sdhci_pci: CALLED status_t init_device(device_node *, void **)
1316	sdhci_pci: CALLED status_t register_child_devices(void *)
1317	sdhci_pci: CALLED status_t init_bus(device_node *, void **)
1318	sdhci_pci: Register SD bus at slot 1, using bar 0
1319	sdhci_pci: interrupts count: 0
1320	sdhci_pci: irq interrupt line: 22
1321	sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
1322	mmc_bus: CALLED status_t mmc_bus_added_device(device_node *)
1323	module: Search for bus_managers/mmc_bus/device/v1 failed.
1324	sdhci_pci: No vendor or device id attribute

1629	sdhci_pci: supports_device(vid:8086 pid:811c)
1630	sdhci_pci: SDHCI Device found! Subtype: 0x0005, type: 0x0008
1631	sdhci_pci: CALLED status_t init_device(device_node *, void **)
1632	sdhci_pci: CALLED status_t register_child_devices(void *)
1633	sdhci_pci: CALLED status_t init_bus(device_node *, void **)
1634	sdhci_pci: Register SD bus at slot 1, using bar 0
1635	sdhci_pci: interrupts count: 0
1636	set MTRRs to:
1637	  mtrr:  0: base: 0x3f6b0000, size:    0x10000, type: 0
1638	  mtrr:  1: base: 0x3f6c0000, size:    0x40000, type: 0
1639	  mtrr:  2: base: 0x80000000, size: 0x80000000, type: 0
1640	  mtrr:  3: base: 0x3f800000, size:   0x800000, type: 1
1641	sdhci_pci: irq interrupt line: 22
1642	sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
1643	mmc_bus: CALLED status_t mmc_bus_added_device(device_node *)
1644	module: Search for bus_managers/mmc_bus/device/v1 failed.
1645	set MTRRs to:
1646	  mtrr:  0: base: 0x3f6b0000, size:    0x10000, type: 0
1647	  mtrr:  1: base: 0x3f6c0000, size:    0x40000, type: 0
1648	  mtrr:  2: base: 0x80000000, size: 0x80000000, type: 0
1649	  mtrr:  3: base: 0x3f800000, size:   0x800000, type: 1
1650	sdhci_pci: No vendor or device id attribute

This one seems to have the mmc_bus driver disabled.

So the only info I have is from the older previous_syslog and previous_syslog.2:

659	sdhci_pci: supports_device(vid:8086 pid:811c)
660	sdhci_pci: SDHCI Device found! Subtype: 0x0005, type: 0x0008
661	sdhci_pci: CALLED status_t init_device(device_node *, void **)
662	sdhci_pci: CALLED status_t register_child_devices(void *)
663	sdhci_pci: CALLED status_t init_bus(device_node *, void **)
664	sdhci_pci: Register SD bus at slot 1, using bar 0
665	sdhci_pci: interrupts count: 0
666	sdhci_pci: irq interrupt line: 22
667	sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
668	mmc_bus: CALLED status_t mmc_bus_added_device(device_node *)
669	mmc_bus: CALLED status_t mmc_bus_init(device_node *, void **)
670	mmc_bus: CALLED MMCBus::MMCBus(device_node *)
671	sdhci_pci: CALLED void set_scan_semaphore(void *, long int)
672	mmc_bus: MMC bus object created
673	mmc_bus: Reset the bus...
674	sdhci_pci: ExecuteCommand(0, 0)
675	mmc_disk: CALLED float mmc_disk_supports_device(device_node *)
676	sdhci_pci: interrupt function called 1
677	sdhci_pci: Wait for command complete...sdhci_pci: Command complete interrupt handled
678	mmc_disk: Could not get device type
679	sdhci_pci: Command response available
680	sdhci_pci: Command execution 0 complete
681	Highpoint-IDE: supports_device()
682	mmc_bus: CMD0 result: No error
683	Highpoint-IDE: supports_device()
684	KDiskDeviceManager::_Scan(/dev/disk/ata)
685	KDiskDeviceManager::_Scan(/dev/disk/ata/0)
686	KDiskDeviceManager::_Scan(/dev/disk/ata/0/master)
687	KDiskDeviceManager::_Scan(/dev/disk/ata/0/master/raw)
688	  found device: /dev/disk/ata/0/master/raw
689	DMAResource@0x8280a880: low/high 0/100000000, max segment count 512, align 2, boundary 65536, max transfer 33553920, max segment size 33554432
690	slab memory manager: created area 0xde001000 (598)
691	slab memory manager: created area 0xdf001000 (599)
692	mmc_bus: Scanning the bus
693	sdhci_pci: SDCLK frequency: 48MHz / 128 = 375kHz
694	sdhci_pci: CALLED void set_bus_width(void *, int)
695	sdhci_pci: ExecuteCommand(8, 1aa)
696	sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
697	sdhci_pci: interrupt function called 1
698	sdhci_pci: Command complete interrupt handled
699	sdhci_pci: real status = 0 command line busy: 0
700	sdhci_pci: Command response available
701	sdhci_pci: Command execution 8 complete
702	sdhci_pci: ExecuteCommand(55, 0)
703	sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
704	sdhci_pci: interrupt function called 1
705	sdhci_pci: Command complete interrupt handled
706	sdhci_pci: real status = 0 command line busy: 0
707	sdhci_pci: Command response available
708	sdhci_pci: Command execution 55 complete
709	sdhci_pci: ExecuteCommand(41, 40ff8000)
710	sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
711	sdhci_pci: interrupt function called 1
712	sdhci_pci: Command complete interrupt handled
713	sdhci_pci: real status = 0 command line busy: 0
714	sdhci_pci: Command response available
715	sdhci_pci: Command execution 41 complete
716	mmc_bus: Card is busy

So far this looks normal.

Later on the card itself finishes initializing:

1155	sdhci_pci: ExecuteCommand(55, 0)
1156	sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1157	sdhci_pci: interrupt function called 1
1158	sdhci_pci: Command complete interrupt handled
1159	sdhci_pci: real status = 0 command line busy: 0
1160	sdhci_pci: Command response available
1161	sdhci_pci: Command execution 55 complete
1162	sdhci_pci: ExecuteCommand(41, 40ff8000)
1163	sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1164	sdhci_pci: interrupt function called 1
1165	sdhci_pci: Command complete interrupt handled
1166	sdhci_pci: real status = 0 command line busy: 0
1167	sdhci_pci: Command response available
1168	sdhci_pci: Command execution 41 complete
1169	mmc_bus: Voltage range: ff8000

We then finish initializing it:

1170	sdhci_pci: ExecuteCommand(2, 0)
1171	sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1172	sdhci_pci: real status = 0 command line busy: 1
1173	sdhci_pci: interrupt function called 1
1174	sdhci_pci: Command complete interrupt handled
1175	sdhci_pci: Command response available
1176	sdhci_pci: Command execution 2 complete
1177	sdhci_pci: ExecuteCommand(3, 0)
1178	sdhci_pci: Wait for command complete...sdhci_pci: command complete sem acquired, status: 0
1179	sdhci_pci: interrupt function called 1
1180	sdhci_pci: Command complete interrupt handled
1181	sdhci_pci: real status = 0 command line busy: 0
1182	sdhci_pci: Command response available
1183	sdhci_pci: Command execution 3 complete
1184	mmc_bus: RCA: b368 Status: 520
1185	mmc_disk: CALLED float mmc_disk_supports_device(device_node *)
1186	mmc_disk: SDHC card found, parent: 0x82b96328
1187	mmc_disk: CALLED status_t mmc_disk_register_device(device_node *)
1188	mmc_disk: CALLED status_t mmc_disk_init_driver(device_node *, void **)
1189	mmc_disk: MMC bus handle: 0x814749e0 bus_managers/mmc/driver_v1
1190	DMAResource@0x8280a780: low/high 0/100000000, max segment count 1, align 512, boundary 524288, max transfer 18446744073709551615, max segment size 523776
1191	btrfs [2479786:    16] invalid superblock!
1192	mmc_disk: MMC card device initialized for RCA b368
1193	mmc_disk: CALLED status_t mmc_disk_register_child_devices(void *)
1194	publish device: node 0x82b962d8, path disk/mmc/0/raw, module drivers/disk/mmc/mmc_disk/device_v1

So far so good...
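The command sequence shown in the excerpts above (reset, interface check, repeated operating-condition query, then identification) can be pulled out of a syslog mechanically. This is a minimal sketch, assuming the "ExecuteCommand(index, argument)" format seen here, with a decimal command index and a hexadecimal argument. The command names come from the SD specification; the function name is hypothetical.

```python
import re

# Matches e.g. "ExecuteCommand(41, 40ff8000)".
CMD_RE = re.compile(r"ExecuteCommand\((\d+), ([0-9a-fA-F]+)\)")

# Standard SD command names for the indexes that appear in this ticket.
SD_COMMAND_NAMES = {
    0: "GO_IDLE_STATE",
    2: "ALL_SEND_CID",
    3: "SEND_RELATIVE_ADDR",
    8: "SEND_IF_COND",
    9: "SEND_CSD",
    41: "SD_SEND_OP_COND (ACMD41)",
    55: "APP_CMD",
}


def command_trace(lines):
    """Return (index, name, argument) for every command found in the log."""
    trace = []
    for line in lines:
        match = CMD_RE.search(line)
        if match:
            index = int(match.group(1))
            argument = int(match.group(2), 16)  # assumed to be hexadecimal
            name = SD_COMMAND_NAMES.get(index, "unknown")
            trace.append((index, name, argument))
    return trace
```

Applied to previous_syslog, the trace would be expected to end with a second ALL_SEND_CID entry, which is the command discussed next.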

Then we send command 2 (CMD2) once more to check whether there is another SD card (it's possible, but not common, to have multiple cards connected):

1195	sdhci_pci: ExecuteCommand(2, 0)
1196	  returned: -1
1197	sdhci_pci: Wait for command complete...  trying: file_systems/exfat/v1
1198	sdhci_pci: command complete sem acquired, status: 0
1199	sdhci_pci: real status = 18000 command line busy: 1

This command seems to confuse the card or the controller: we see "command line busy", which means the command did not complete as expected.

I see that we get "command complete sem acquired" without any message from the interrupt handler (normally there would be an "interrupt function called 1" log just before). In older versions of the driver there were problems with the handling of this synchronization. That could still be the case, but I want to make sure I am looking at logs matching the current code of the driver.

Can you capture a log again, making sure that the drivers (sdhci_pci, mmc_bus and mmc_disk) are enabled and that a card is inserted for the boot that the log covers?
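The missing interrupt message described above can be searched for automatically. The sketch below is one possible check, assuming the message texts quoted in this ticket; the function name is hypothetical. Because the two messages can appear in either order in the log, it only tests whether an interrupt message exists anywhere between one ExecuteCommand line and the next.

```python
def commands_without_interrupt(lines):
    """Return ExecuteCommand lines whose semaphore was acquired although
    no "interrupt function called" message was logged for that command.

    Note: a command at the very end of a truncated log is also reported.
    """
    suspicious = []
    current, saw_irq, saw_sem = None, False, False

    for line in lines:
        if "ExecuteCommand(" in line:
            # A new command starts: judge the previous one first.
            if current is not None and saw_sem and not saw_irq:
                suspicious.append(current)
            current, saw_irq, saw_sem = line.strip(), False, False
            continue
        if "interrupt function called" in line:
            saw_irq = True
        if "command complete sem acquired" in line:
            saw_sem = True

    # Judge the last command in the log as well.
    if current is not None and saw_sem and not saw_irq:
        suspicious.append(current)
    return suspicious
```

On the excerpt from previous_syslog quoted earlier, only the second CMD2 would be reported.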

comment:26 by tojoko, 2 years ago

Well, sorry, I tried to go by the timestamps.

And thanks for the clarifications; I would like to do some more research on my own.

However, it boots fine until the center icon, the one I like to think of as the Canadian flag. I don't end up in KDL; it just never finishes until I hit Ctrl+Alt+Del.

by tojoko, 2 years ago

Attachment: previous_syslog.5 added

The very latest proof of boot failure, if i'm not mistaken.

comment:27 by pulkomandy, 2 years ago

Hello,

Ok, so the situation did not change, and there are indeed some strange decisions (from me) in the way the interrupts are handled.

Here is a change to put the interrupt handling in a more logical order: https://review.haiku-os.org/c/haiku/+/4985

Let me know if that helps with the problem or not (you can build it yourself or wait for madmax's bot to build the change and provide a test haiku image with it on Gerrit).

comment:28 by bruno, 2 years ago

Is your SD-Card slot empty?

in reply to:  27 comment:29 by tojoko, 2 years ago

Replying to pulkomandy:

Hello,

Ok, so the situation did not change, and there are indeed some strange decisions (from me) in the way the interrupts are handled.

Here is a change to put the interrupt handling in a more logical order: https://review.haiku-os.org/c/haiku/+/4985

Let me know if that helps with the problem or not (you can build it yourself or wait for madmax's bot to build the change and provide a test haiku image with it on Gerrit).

It took me a while to download the image and write it to a USB stick (with only one stick available, that was tricky).

I thought the solution would be to keep track of the loop iterations and to break out of the loop as soon as #numberOfDevicesFound != #numberOfLoops, whether or not all SD cards have been found.

However, I had hoped to have better news: with your fix the problem is the same.

@bruno - I remove the SD card for the bug reports.

cheers tony

by tojoko, 2 years ago

Attachment: previous_syslog.6 added

The very last and 'fixed' suggestion.

comment:30 by pulkomandy, 2 years ago

Indeed, this didn't change a lot but it's a bit easier to check the execution flow.

So, something is not working right with the second CMD2. It should get no answer from the card, and in that case it should time out. On your controller, it seems it doesn't, and we get stuck there forever.

If we can't figure out a way to make this controller time out properly, we could also add a "quirk" to recognize this specific controller and limit it to a single SD card (which will be good enough in most cases, since setups with multiple SD cards on the same controller are very uncommon).
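The two ideas above (ending the scan on a timeout, and a quirk that stops after the first card) can be illustrated with a small simulation. This is only a sketch of the control flow, not the driver's actual code: `scan_bus`, `send_cmd2`, `CommandTimeout`, `single_card_quirk` and `max_cards` are all hypothetical names introduced for illustration.

```python
class CommandTimeout(Exception):
    """Raised by the hypothetical controller when a command gets no answer."""


def scan_bus(send_cmd2, single_card_quirk=False, max_cards=8):
    """Enumerate cards by repeating CMD2 until one attempt times out.

    send_cmd2 is a hypothetical callable that returns a card ID or raises
    CommandTimeout. max_cards bounds the loop in case the controller keeps
    answering, so the scan always terminates.
    """
    cards = []
    while len(cards) < max_cards:
        try:
            cards.append(send_cmd2())
        except CommandTimeout:
            break  # nobody answered: normal end of the scan
        if single_card_quirk:
            break  # affected controller: never send the second CMD2
    return cards
```

With the quirk enabled the second CMD2 is never issued, so a controller that fails to report the timeout cannot stall the scan, at the cost of supporting only one card on that controller.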

in reply to:  30 comment:31 by tojoko, 2 years ago

Replying to pulkomandy:

Indeed, this didn't change a lot but it's a bit easier to check the execution flow.

So, something is not working right with the second CMD2. It should get no answer from the card, and in that case it should time out. On your controller, it seems it doesn't, and we get stuck there forever.

If we can't figure out a way to make this controller time out properly, we could also add a "quirk" to recognize this specific controller and limit it to a single SD card (which will be good enough in most cases, since setups with multiple SD cards on the same controller are very uncommon).

But it worked once, at least in hrev51109, which makes me wonder whether I should just downgrade or switch to the last stable release (beta3)!?

But I'm not sure I can still reproduce the error if I remove the SD card permanently or format it.

It might be that my controller is very uncommon, but maybe it's also very uncommon to leave an SD card in the notebook (normally they are add-ons for smartphones, I would guess).

sincerely Tony

by tojoko, 2 years ago

Attachment: previous_syslog.7 added

syslog of an old hrev51109 booting without any complications

comment:32 by pulkomandy, 2 years ago

But I'm not sure I can still reproduce the error if I remove the SD card permanently or format it.

I don't think so, we don't get to the point where we try to read anything from the SD card in this case.

It might be that my controller is very uncommon

Maybe, but the same driver could be useful later for the ARM port. The difficulty here is that there is a standard spec, but many different controllers implementing it. They can behave a bit differently from each other, so depending on the exact controller used, some may be more forgiving of errors, and some may fail to follow the spec on some details. So it's hard to write code that will work everywhere. But as we fix problems with the first few controllers, I hope we will find out that the differences are always in the same areas, and the driver will easily support a lot more systems.

in reply to:  32 comment:33 by tojoko, 2 years ago

Replying to pulkomandy:

But I'm not sure I can still reproduce the error if I remove the SD card permanently or format it.

I don't think so, we don't get to the point where we try to read anything from the SD card in this case.

Right, I tried again: I can boot but can't access my SD card on that old version of Haiku.

The problem is that the USB mouse causes trouble on the old version and lands me in KDL or some sort of endless loop. That _isn't_ a problem in the latest nightly, as long as I don't leave the SD card plugged in during boot. (So I don't see any sense in filing a bug report anymore.)

But I have to admit I haven't tried the latest release yet, although I'm somewhat concerned that I would find one of those two problems there again.

It might be that my controller is very uncommon

Maybe, but the same driver could be useful later for the ARM port. The difficulty here is that there is a standard spec, but many different controllers implementing it. They can behave a bit differently from each other, so depending on the exact controller used, some may be more forgiving of errors, and some may fail to follow the spec on some details. So it's hard to write code that will work everywhere. But as we fix problems with the first few controllers, I hope we will find out that the differences are always in the same areas, and the driver will easily support a lot more systems.

ack

comment:34 by pulkomandy, 2 years ago

Hi, I think your syslog from hrev51109 predates the introduction of the SD/MMC drivers, I guess that explains why it works. It is not of much help for debugging, however.

I have made a new attempt at improving the situation by enabling more interrupts in https://review.haiku-os.org/c/haiku/+/5057 , let me know if that changes something.

comment:35 by bruno, 2 years ago

Is your SD card corrupted? Does it work on another system? Mine is corrupt, and if I insert it Haiku crashes, on both 32-bit and 64-bit.

comment:36 by pulkomandy, 2 years ago

Is your SD card corrupted? Does it work on another system? Mine is corrupt, and if I insert it Haiku crashes, on both 32-bit and 64-bit.

Can you please stop mixing issues together? I have already mentioned to both of you that your bugs are not related. I am aware of the other bug you already reported, and if they were the same, I would have closed one as a duplicate of the other already.

In this case we don't get to the point where we even try to read anything from the filesystem on the card. So it doesn't matter what's on the card here.

by tojoko, 2 years ago

Attachment: syslog.6 added

Now it doesn't boot at all anymore. However, I'm not sure if I'm using the right version.

by tojoko, 2 years ago

Attachment: syslog.7 added

No improvement yet.

by tojoko, 2 years ago

Attachment: syslog.6.old added

just another syslog

by tojoko, 20 months ago

Attachment: previous_syslog.8 added

hrev56364I

by tojoko, 20 months ago

Attachment: syslog.8 added

hrev56364II

comment:37 by tojoko, 20 months ago

Description: modified (diff)

Sorry, it doesn't seem to be easy, if possible at all, for me to get a useful syslog.

How long is the boot-up supposed to be?

comment:38 by pulkomandy, 20 months ago

Description: modified (diff)

The last syslog is, again, the other bug you had that was fixed a few weeks ago in hrev56348:

4697	KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x0, ip 0x609bd134, write 1, user 1, exec 0, thread 0x1a4
4698	KERN: debug_server: Thread 420 entered the debugger: Segment violation
4699	KERN: stack trace, current PC 0x609bd134  commpage_memcpy + 0x1c:
4700	KERN:   (0x7133a1cc)  0x113f43e  _CopyBackToFront__11HWInterfaceR7BRegion + 0xba
4701	KERN:   (0x7133a23c)  0x111e384  _CopyBackToFront__21AccelerantHWInterfaceR7BRegion + 0xdc
4702	KERN:   (0x7133a28c)  0x113f2b5  CopyBackToFront__11HWInterfaceRC5BRect + 0x28d
4703	KERN:   (0x7133a34c)  0x113f01c  Invalidate__11HWInterfaceRC5BRect + 0x58
4704	KERN:   (0x7133a37c)  0x113efab  InvalidateRegion__11HWInterfaceRC7BRegion + 0x5b
4705	KERN:   (0x7133a3c8)  0x1136e3d  FillRegion__13DrawingEngineR7BRegionRC9rgb_color + 0x1f5
4706	KERN:   (0x7133a458)  0x10ac684  _SetBackground__7DesktopR7BRegion + 0xe4
4707	KERN:   (0x7133a4b8)  0x10a44e3  Init__7Desktop + 0x40f
4708	KERN:   (0x7133a548)  0x109ac8c  _CreateDesktop__9AppServerUiPCc + 0xc8
4709	KERN:   (0x7133a598)  0x109aae4  MessageReceived__9AppServerP8BMessage + 0xc8
4710	KERN:   (0x7133a618)  0x1ac53ff  DispatchMessage__7BLooperP8BMessageP8BHandler + 0x5b
4711	KERN:   (0x7133a648)  0x1aba841  DispatchMessage__12BApplicationP8BMessageP8BHandler + 0x4e9
4712	KERN:   (0x7133a848)  0x1ac7169  task_looper__7BLooper + 0x205
4713	KERN:   (0x7133a888)  0x1ac5cc1  Loop__7BLooper + 0x65
4714	KERN:   (0x7133a8c8)  0x1ab91ba  Run__12BApplication + 0x2a
4715	KERN:   (0x7133a8f8)  0x109b074  main + 0x8c
4716	KERN:   (0x7133a938)  0x1099d5f  _start + 0x5b
4717	KERN:   (0x7133a968)  0x2235270  runtime_loader + 0x134

Your syslog.8 says you are running "Haiku revision: hrev55914"

All your other syslogs are from this or even older versions.

This means you are testing code in the state it was in back in February.

Please make sure you are up to date to the latest version, otherwise the logs are not very helpful to me.
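Before attaching a syslog, the revision it was produced with can be checked against the revision that contains a fix. This is a minimal sketch, assuming the "Haiku revision: hrevNNNNN" text quoted above appears once per boot in the file; the function names are hypothetical.

```python
import re

REV_RE = re.compile(r"Haiku revision: hrev(\d+)")


def revisions_in_syslog(lines):
    """Return every hrev number found in the log, in order of appearance."""
    revisions = []
    for line in lines:
        match = REV_RE.search(line)
        if match:
            revisions.append(int(match.group(1)))
    return revisions


def is_stale(lines, minimum_revision):
    """True if the most recent boot in the log predates minimum_revision."""
    found = revisions_in_syslog(lines)
    return bool(found) and found[-1] < minimum_revision
```

For the case in this comment, `is_stale(lines, 56348)` on a log that reports hrev55914 would be true, meaning the log predates the fix mentioned above.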

by tojoko, 20 months ago

Attachment: syslog.9 added

comment:39 by pulkomandy, 20 months ago

I see that my change https://review.haiku-os.org/c/haiku/+/5057 was not merged.

Are you set up to build Haiku yourself with this change and try the resulting image? Or should I provide you with a pre-built test image?

comment:41 by waddlesplash, 20 months ago

No, there was, but it's aged out and the build results have been deleted.

by tojoko, 20 months ago

Attachment: syslog.10 added

One more try, thanks to @madmax

comment:43 by pulkomandy, 20 months ago

Thanks, we're making some progress!

The command that was previously never finishing now gets a timeout:

719	KERN: sdhci_pci: ExecuteCommand(2, 0)
720	KERN: sdhci_pci: Wait for command complete...sdhci_pci: interrupt function called 18000
721	KERN: sdhci_pci: Command complete interrupt handled
722	KERN: sdhci_pci: command complete sem acquired, status: 18000
723	KERN: sdhci_pci: real status = 0 command line busy: 1
724	KERN: sdhci_pci: Command response available
725	KERN: sdhci_pci: Command execution failed 18000
726	KERN: sdhci_pci: SDCLK frequency: 48MHz / 1 = 48000kHz

Which means the code can continue to the next command:

1059	KERN: sdhci_pci: ExecuteCommand(9, b3680000)

but that fails because the timeout left the bus in an incorrect state:

1060	KERN: PANIC: Command execution impossible, command inhibit

In this log you had an SD card inserted. Can you also do a log with this version and without a card inserted?
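The status value 18000 in the log above can be decoded bit by bit. The sketch below assumes the driver prints the combined 32-bit interrupt status in hexadecimal, with the normal interrupt bits in the lower half and the error interrupt bits in the upper half, as laid out in the SD Host Controller specification. Only a subset of the bits is listed, and the function name is hypothetical.

```python
# Subset of the SDHCI interrupt status bits (normal status in bits 0-15,
# error status in bits 16-31 when both registers are read as one word).
STATUS_BITS = {
    0: "Command Complete",
    1: "Transfer Complete",
    15: "Error Interrupt",
    16: "Command Timeout Error",
    17: "Command CRC Error",
    18: "Command End Bit Error",
    19: "Command Index Error",
    20: "Data Timeout Error",
}


def decode_interrupt_status(hex_text):
    """Return the names of the known bits set in a hex status string."""
    value = int(hex_text, 16)  # assumption: the log prints hexadecimal
    return [name for bit, name in sorted(STATUS_BITS.items())
            if value & (1 << bit)]
```

Under these assumptions, `decode_interrupt_status("18000")` yields the error summary bit plus the command timeout bit, which matches the reading that the second CMD2 now times out, while the value 1 seen for successful commands is a plain "Command Complete".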

in reply to:  43 comment:44 by tojoko, 20 months ago

In this log you had an SD card inserted. Can you also do a log with this version and without a card inserted?

Sure. But IIRC I actually didn't have an SD card inserted until the end. I still just can't boot with an SD card plugged in, and I don't get a log for that, probably because Haiku searches for and finds the SD card before the boot volume (which contains the log files) is mounted.

I would now call that the actual bug, because I see no sense in looking for an SD card and holding up the mount of the boot volume, since it is unlikely that the SD card contains the boot volume (except on a Raspberry Pi).

So, could you please ensure that the volume holding the log file is mounted during startup before any SD card is located? Otherwise we'll never get a proper syslog of the bug, I'm afraid.

cheers, tojoko

by tojoko, 20 months ago

Attachment: syslog.11 added

No, sorry, doesn't work.

by tojoko, 20 months ago

Attachment: previous_syslog.9 added

it just doesn't work

by tojoko, 20 months ago

Attachment: syslog.12 added

one more try

comment:45 by pulkomandy, 20 months ago

But IIRC I actually didn't have an SD card inserted until the end

Well your log manages to run several commands on the SD card and then it breaks after checking if there is a second one. But it could as well be the controller not really following the standard.

I would now call that the actual bug, because I see no sense in looking for an SD card and holding up the mount of the boot volume

The code just scans all the possible disks to see what it can find. There is no reason to do this differently. I think it should have a timeout at some higher level if the driver fails (as it does now), but I don't know that area of the code very well so I will let someone else consider that. Until then I can try to fix the driver so it does the timeouts on its own, and more importantly, not make it go into the kernel debugger as it does now.

Otherwise we'll never get a proper syslog of the bug, I'm afraid.

I have several ones to work from already. Another option is to boot with on-screen debug and take pictures but that will be more annoying to do if we need many pages of it for each attempt.

That's how it is with hardware and drivers, a lot of trial and error until we get everything exactly right. It's a lot easier to do when I have the hardware myself and I can test it directly, but in this case I can't just buy laptops with all possible sd card controllers in them.

comment:46 by waddlesplash, 18 months ago

Milestone: R1/beta4 → Unscheduled

Only one device apparently affected: moving out of beta milestone.
