Opened 15 years ago

Closed 6 years ago

Last modified 6 years ago

#3802 closed bug (not reproducible)

KDL at boot (buffer underrun message in serial output)

Reported by: stpere
Owned by: mmlr
Priority: normal
Milestone: R1
Component: Drivers/Disk
Version: R1/pre-alpha1
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

I'm running (or rather trying to run :) hrev30316 on real hardware. I get a KDL at boot at the fourth icon, and I get a PANIC message saying that it didn't find any boot partition.

I'm using the new ATA bus_manager, because the legacy one didn't work on my hardware when I first tried a few days ago.

It was working a few revisions ago (I could binary search if needed) on the same hardware, with the ATA bus_manager.

I wired myself a NULL modem to get serial output, I attached the result.

I spotted "buffer underrun" messages, which seem significant. There seems to be something wrong with DMA in that situation, but I won't try to diagnose any further :)

Attachments (2)

serial output (26.1 KB ) - added by stpere 15 years ago.
serial output 2 (26.3 KB ) - added by stpere 15 years ago.
Serial output with continue commented


Change History (12)

comment:1 by anevilyak, 15 years ago

Component: - General → Drivers/Disk
Owner: changed from axeld to mmlr

by stpere, 15 years ago

Attachment: serial output added

in reply to:  description ; comment:2 by mmlr, 15 years ago

Status: new → assigned

Replying to stpere:

It was working a few revisions ago (I could binary search if needed) on the same hardware, with the ATA bus_manager.

It was hrev30286. If it worked before that, it means the diagnostic code is one more thing we can't trust...

I wired myself a NULL modem to get serial output, I attached the result.

It seems that the output is a bit garbled. Is it possible that you are running this on an SMP machine? The text is cut off in some places and intermixed with different text in others. If you see something like this, try running with SMP disabled so the output gets cleaner.

I spotted "buffer underrun" messages, which seem significant. There seems to be something wrong with DMA in that situation, but I won't try to diagnose any further :)

Nah, that's just because a DMA transfer that was already set up was aborted due to an error when sending the request itself. The real question is why it fails to send the request. The device may be in an inconsistent state because an earlier command apparently timed out. That timeout may in turn have been caused by a race condition in the current interrupt handling in the ide_adapter. Fixing that requires breaking the old IDE bus_manager though, so we can only do it after switching to ATA.
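To illustrate the reasoning above, here is a minimal sketch of how a failed request can surface as a "buffer underrun" even though DMA itself is not at fault. It is not the actual ATAChannel.cpp code: `FakeDMA`, `ExecuteRead` and the log strings are hypothetical names invented for this example, and the real driver may structure this path quite differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a DMA engine: it only tracks how many bytes
// were expected and how many actually arrived.
struct FakeDMA {
	size_t expected = 0;
	size_t transferred = 0;

	void Prepare(size_t bytes)
	{
		expected = bytes;
		transferred = 0;
	}

	// Tear the transfer down; returns true if fewer bytes arrived
	// than were expected.
	bool FinishReportsUnderrun() const
	{
		return transferred < expected;
	}
};

// Hypothetical request path: the DMA transfer is set up first, then the
// request is sent. If sending fails, the already prepared transfer is
// aborted and reports an underrun as a side effect.
std::vector<std::string>
ExecuteRead(FakeDMA& dma, size_t bytes, bool sendSucceeds)
{
	std::vector<std::string> log;
	dma.Prepare(bytes);

	if (!sendSucceeds) {
		log.push_back("send request failed");	// the primary error
		if (dma.FinishReportsUnderrun())
			log.push_back("buffer underrun");	// secondary consequence
		return log;
	}

	dma.transferred = bytes;	// pretend the device delivered everything
	if (dma.FinishReportsUnderrun())
		log.push_back("buffer underrun");
	return log;
}
```

In this model the underrun line always follows the send failure, so the line worth investigating in a log is the one before it.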

Can you please remove the "continue" at line 348 in "src/add-ons/kernel/bus_managers/ata/ATAChannel.cpp" and see if this gets it working again? Please also try to capture a new serial output afterwards (preferably without missing parts).

in reply to:  2 comment:3 by mmlr, 15 years ago

Replying to mmlr:

It seems that the output is a bit garbled. Is it possible that you are running this on an SMP machine? The text is cut off in some places and intermixed with different text in others.

I take that back, after reloading the attachment it looks all fine now - strange, sorry for the noise.

by stpere, 15 years ago

Attachment: serial output 2 added

Serial output with continue commented

comment:4 by stpere, 15 years ago

Sorry for the garbled output file: my serial port was set to read at 115200 7N1, which was wrong, of course. DeadYak noticed it was clipped, so I made a new capture and hoped to upload it before you noticed, but it was a "race condition" :)
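As a rough illustration of why a 7N1 receiver produces a damaged capture, the sketch below encodes a byte as an 8N1 frame and decodes it as 7N1. It assumes the sending side was using 8N1, which the ticket does not state, and it models only a single frame with a very simplified receiver. The function and type names are made up for this example.

```cpp
#include <cstdint>
#include <vector>

// Encode one byte as an 8N1 frame: start bit (0), eight data bits sent
// least significant bit first, then one stop bit (1).
std::vector<int>
EncodeFrame8N1(uint8_t byte)
{
	std::vector<int> bits;
	bits.push_back(0);
	for (int i = 0; i < 8; i++)
		bits.push_back((byte >> i) & 1);
	bits.push_back(1);
	return bits;
}

struct Decoded {
	uint8_t value;
	bool framingError;
};

// Decode the beginning of a bit stream as 7N1: start bit, seven data
// bits, stop bit. Assumes bits[0] is the start bit and that at least
// nine bits are present.
Decoded
DecodeFrame7N1(const std::vector<int>& bits)
{
	Decoded result{0, false};
	for (int i = 0; i < 7; i++)
		result.value |= static_cast<uint8_t>(bits[1 + i] << i);

	// The receiver expects a stop bit (1) where the sender actually
	// placed the eighth data bit.
	result.framingError = bits[8] != 1;
	return result;
}
```

In this model, plain ASCII characters have a zero in the eighth data bit, so the receiver sees a 0 where it expects the stop bit and flags a framing error for each of them. How a terminal program then handles those bytes (drop, replace, or pass through) varies, which may explain the clipped text.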

OK, I commented out the continue at the line you specified. It didn't boot any further, but the serial output is a few lines longer. Is there anything else I can do?

comment:5 by mmlr, 15 years ago

Can you please use the "ints" command after the panic and attach that output as well? What you are seeing is an interrupt gone missing. It's exactly the symptom you get when interrupts never arrive, which can have a few different causes. The ints output should tell whether it's possibly a shared interrupt problem.

comment:6 by scottmc, 13 years ago

Can you recheck this with a recent Haiku build? It may have been fixed recently.

comment:7 by stpere, 13 years ago

Hi, unfortunately I don't have access to that machine anymore, so I can neither confirm nor deny! Should it be closed?

comment:8 by scottmc, 13 years ago

Blocking: 7665 added

comment:9 by waddlesplash, 6 years ago

Resolution: not reproducible
Status: in-progress → closed

comment:10 by waddlesplash, 6 years ago

Blocking: 7665 removed