Opened 2 years ago

Last modified 15 months ago

#17522 new bug

KDL on scsi_disk when opening DriveSetup (installer)

Reported by: victroniko
Owned by: nobody
Priority: normal
Milestone: Unscheduled
Component: Drivers/Disk/SCSI
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

As described in https://discuss.haiku-os.org/t/strange-crash-in-drivesetup/11689 - I'll copy/paste the main text here:

Starting DriveSetup from either the installer or the live environment insta-crashes the system, landing at KDL. Same behaviour in Beta3 and various random nightlies as far back as hrev55100 (ran out of patience there).

System is a Xeon E5440 on a P5G41T-M mainboard. In BIOS, leaving the SATA ports in Compatible-mode (IDE) instead of AHCI avoids the error, but then I don't see any drives connected anymore. As things are, installing Haiku on this system is not possible.

Apologies for the blurry syslog pic, can't leave KDL as the error loops forever and doesn't seem to make it to the log file either.

Attachments (3)

IMG_20211218_135716.jpg (584.4 KB) - added by victroniko 2 years ago.
IMG_20221228_235802_306.jpg (2.1 MB) - added by victroniko 16 months ago.
syslog.old (512.0 KB) - added by victroniko 15 months ago.

Change History (10)

by victroniko, 2 years ago

Attachment: IMG_20211218_135716.jpg added

comment:1 by waddlesplash, 2 years ago

Component: Drivers/Disk → Drivers/Disk/SCSI

comment:2 by victroniko, 16 months ago

(Tested with current R4 beta)

Reporting again from a totally different system (AMD mobo/CPU/GPU) but similar drive layout:

  • New KingBank 240 GB SSD on SATA1, empty
  • Same WD 500 GB HDD on SATA2, single NTFS partition

With this HDD connected, DriveSetup crashes consistently. The error is exactly the same. This still makes Haiku awkward to install for me, as I have to physically disconnect this drive for the installer to proceed and reconnect it when done (and I can't ever open DriveSetup afterwards either). This HDD is the local data storage I keep between upgrades and I can't do without it. It tested good with common diagnostic tools.

I strongly feel Haiku "gets dizzy" with something in the partition layout on this drive, but I can't figure out what or why. It's simply a correctly formatted NTFS partition on an MBR-style drive that every other OS recognizes fine (even KolibriOS can mount it r/o).

As before, syslog fails to catch the error so I'm forced to attach another fresh pic of KDL. Please advise on how to debug and I'll do my best.

by victroniko, 16 months ago

Attachment: IMG_20221228_235802_306.jpg added

comment:3 by diver, 16 months ago

Platform: x86-64 → All

comment:4 by korli, 16 months ago

You could check the syslog in KDL (command 'syslog') to see whether anything wrong is shown before the crash, and provide screenshots accordingly.


comment:5 by victroniko, 15 months ago

Attaching a full syslog.old - problem is, I don't know if messages from a crashed session appear here at all. I looked for clues in this log and in a live nightly (hrev56691) before forcing a crash, but couldn't pinpoint a particular line/section or find anything wrong.

Things change a bit when tinkering with the bootloader. The "Display current log" option shows this:

KERN: options = 1
KERN: No APM available.
KERN: smp: using ACPI to detect MP configuration
KERN: smp: local apic address is 0xfee00000
KERN: smp: found local APIC with id 16
KERN: smp: found local APIC with id 17
KERN: smp: found local APIC with id 18
KERN: smp: found local APIC with id 19
KERN: smp: found local APIC with id 20
KERN: smp: found local APIC with id 21
KERN: smp: found io APIC with id 0 and address 0xfec00000
KERN: VESA version = 3.0, capabilities 1
KERN: OEM string: AMD ATOMBIOS

/
/ snip... too long, irrelevant VESA stuff
/ 

KERN: Welcome to the Haiku boot loader!
KERN: Haiku revision: hrev56578+59
KERN: number of drives: 3
KERN: add_partitions_for(0x00105400, mountFS = no)
KERN: add_partitions_for(fd = 0, mountFS = no)
KERN: 0x001056e8 Partition::Partition
KERN: 0x001056e8 Partition::Scan()
KERN: check for partitioning_system: GUID Partition Map
KERN: check for partitioning_system: Intel Partition Map
KERN:   priority: 810
KERN: check for partitioning_system: Intel Extended Partition
KERN: 0x00105868 Partition::Partition
KERN: 0x001056e8 Partition::AddChild 0x00105868
KERN: 0x00105868 Partition::SetParent 0x001056e8
KERN: new child partition!
KERN: 0x00105930 Partition::Partition
KERN: 0x001056e8 Partition::AddChild 0x00105930
KERN: 0x00105930 Partition::SetParent 0x001056e8
KERN: new child partition!
KERN: 0x001056e8 Partition::Scan(): scan child 0x00105868 (start = 6291456, size = 1468006400, parent = 0x001056e8)!
KERN: 0x00105868 Partition::Scan()
KERN: check for partitioning_system: GUID Partition Map
KERN: check for partitioning_system: Intel Partition Map
KERN: check for partitioning_system: Intel Extended Partition
KERN: 0x001056e8 Partition::Scan(): scan child 0x00105930 (start = 1474297856, size = 2949120, parent = 0x001056e8)!
KERN: 0x00105930 Partition::Scan()
KERN: check for partitioning_system: GUID Partition Map
KERN: check for partitioning_system: Intel Partition Map
KERN: check for partitioning_system: Intel Extended Partition
KERN: 0x001056e8 Partition::~Partition
KERN: 0x00105868 Partition::SetParent 0x00000000
KERN: 0x00105930 Partition::SetParent 0x00000000
KERN: boot partition offset: 6291456
KERN: 0x00105868 Partition::_Mount check for file_system: BFS Filesystem
KERN: PackageVolumeInfo::SetTo()
KERN: PackageVolumeInfo::_InitState(): failed to parse activated-packages: No such file or directory
KERN: add_partitions_for(0x001054c0, mountFS = yes)
KERN: add_partitions_for(fd = 3, mountFS = yes)
KERN: 0x00105f78 Partition::Partition
KERN: 0x00105f78 Partition::Scan()
KERN: check for partitioning_system: GUID Partition Map
KERN: EFI header: EFI PART
KERN: EFI revision: 10000
KERN: header size: 92
KERN: header CRC: ea7c03e4
KERN: absolute block: 1
KERN: alternate block: 234441647
KERN: first usable block: 34
KERN: last usable block: 234441614
KERN: disk GUID: 2193ee0a-5dda-674f-8d92-d11d54792f5d
KERN: entries block: 2
KERN: entry size:  128
KERN: entry count: 128
KERN: entries CRC: ab54d286
KERN: EFI header: EFI PART
KERN: EFI revision: 10000
KERN: header size: 92
KERN: header CRC: 3174e4fa
KERN: absolute block: 234441647
KERN: alternate block: 1
KERN: first usable block: 34
KERN: last usable block: 234441614
KERN: disk GUID: 2193ee0a-5dda-674f-8d92-d11d54792f5d
KERN: entries block: 234441615
KERN: entry size:  128
KERN: entry count: 128
KERN: entries CRC: ab54d286
KERN:   priority: 959
KERN: check for partitioning_system: Intel Partition Map
KERN: intel: Found GPT signature, ignoring.
KERN: check for partitioning_system: Intel Extended Partition
KERN: efi_gpt_scan_partition(cookie = 0x0010b190)
KERN: 0x00105f78 Partition::~Partition
KERN: add_partitions_for(0x00105580, mountFS = yes)
KERN: add_partitions_for(fd = 3, mountFS = yes)
KERN: 0x00105f78 Partition::Partition
KERN: 0x00105f78 Partition::Scan()
KERN: check for partitioning_system: GUID Partition Map
KERN: check for partitioning_system: Intel Partition Map
KERN: could not find parent partition.
KERN:   priority: 810
KERN: check for partitioning_system: Intel Extended Partition
KERN: creating partition failed: could not find partition.
KERN: Partitioning module `Intel Partition Map' recognized the partition, but failed to scan it
KERN: 0x00105f78 Partition::~Partition

The last section is the only thing I find suspicious. Anyway, the full log is attached below.
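For anyone repeating this kind of triage on a long boot log, here is a minimal sketch of pulling out only the lines that report a failure. It assumes the log has been saved as plain text; the keyword list and the function name `suspicious_lines` are assumptions made for illustration, not anything Haiku defines.

```python
import re

# Keywords chosen to match the excerpt above (an assumption, not a Haiku convention).
FAILURE_PATTERN = re.compile(r"fail|could not|error", re.IGNORECASE)


def suspicious_lines(log_text):
    """Return (line_number, line) pairs for lines that mention a failure."""
    return [
        (number, line.strip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if FAILURE_PATTERN.search(line)
    ]


# A few lines copied from the last section of the boot loader log.
sample = """KERN: check for partitioning_system: Intel Partition Map
KERN: could not find parent partition.
KERN:   priority: 810
KERN: creating partition failed: could not find partition.
KERN: 0x00105f78 Partition::~Partition"""

for number, line in suspicious_lines(sample):
    print(number, line)
```

A filter like this also flags harmless messages (for example the "failed to parse activated-packages" line earlier in the log), so it only narrows down what to read by hand.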

by victroniko, 15 months ago

Attachment: syslog.old added

comment:6 by victroniko, 15 months ago

By accident I found a solution to this problem. It's ridiculously simple, but I still consider this a bug. It had nothing to do with the mentioned HDD at all. My machine has an LS-120 drive attached to the PATA0 bus as master, used mainly for old floppy data exchange/backup (think Macintosh, Amiga etc.) because of its 2x speed vs a regular drive.

  • With a floppy in the drive, not a single problem.
  • With no floppy, kernel panic.

This drive is ATAPI compliant like a removable CD/DVD drive, and should be treated as such. Am I right to assume this is not the case?
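To illustrate what "treated as such" usually means for removable media: a SCSI/ATAPI device with no medium loaded normally answers with sense key NOT READY (0x02) and additional sense code 0x3A (MEDIUM NOT PRESENT), and the host is expected to report "no media" rather than continue. The sketch below decodes fixed-format sense data to detect that case. It is not Haiku's scsi_disk code, the function name and sample bytes are made up for illustration, and whether an LS-120 reports exactly this sense data is an assumption.

```python
# Fixed-format SCSI sense data (response codes 0x70/0x71):
#   byte 2, low nibble -> sense key
#   byte 12            -> additional sense code (ASC)
#   byte 13            -> additional sense code qualifier (ASCQ)
SENSE_KEY_NOT_READY = 0x02
ASC_MEDIUM_NOT_PRESENT = 0x3A


def is_medium_not_present(sense):
    """Return True if the sense data reports an empty removable drive."""
    if len(sense) < 14:
        return False  # too short to carry an ASC
    response_code = sense[0] & 0x7F
    if response_code not in (0x70, 0x71):
        return False  # not fixed-format sense data
    sense_key = sense[2] & 0x0F
    asc = sense[12]
    return sense_key == SENSE_KEY_NOT_READY and asc == ASC_MEDIUM_NOT_PRESENT


# Hypothetical 18-byte sense buffers, not captured from the LS-120.
no_media = bytes([0x70, 0, 0x02, 0, 0, 0, 0, 0x0A, 0, 0, 0, 0, 0x3A, 0x00, 0, 0, 0, 0])
no_sense = bytes(18)

print(is_medium_not_present(no_media))
print(is_medium_not_present(no_sense))
```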

comment:7 by pulkomandy, 15 months ago

So here are the actually relevant parts of the syslog (searching for "ata"):

KERN: PCI-ATA: Controller in legacy mode: cmd 0x1f0, ctrl 0x3f6, irq 14
KERN: PCI-ATA: init channel...
KERN: PCI-ATA: channel index 0
KERN: PCI-ATA: bus master base 0xf000
KERN: PCI-ATA: init channel done
KERN: ata 0: _DevicePresent: device 0, presence 1
KERN: ata 0: _DevicePresent: device 1, presence 0
KERN: ata 0: deviceMask 1
KERN: ata 0: probing device 0
KERN: ata 0: signature of device 0: 0xeb14
KERN: atapi 0-0: model number: LS-120 VER5   00              UHD Floppy
KERN: atapi 0-0: serial number: 0717W9A01587
KERN: atapi 0-0: firmware rev.: F527M5AE
KERN: atapi 0-0: using DMA mode 0x01
KERN: ata 0: identified ATAPI device 0
KERN: ata 0: ignoring device 1
KERN: publish device: node 0xffffffff8084d970, path disk/ata/0/master/raw, module drivers/disk/scsi/scsi_disk/device_v1
KERN: atapi 0-0 error: invalid target lun 1
KERN: atapi 0-0 error: invalid target lun 2
KERN: atapi 0-0 error: invalid target lun 3
KERN: atapi 0-0 error: invalid target lun 4
KERN: atapi 0-0 error: invalid target lun 5
KERN: atapi 0-0 error: invalid target lun 6
KERN: atapi 0-0 error: invalid target lun 7
KERN: ata 0 error: target device not present
KERN: ata 0 error: invalid target device
KERN: Last message repeated 12 times.
KERN: PCI-ATA: Controller in legacy mode: cmd 0x170, ctrl 0x376, irq 15
KERN: PCI-ATA: init channel...
KERN: PCI-ATA: channel index 1
KERN: PCI-ATA: bus master base 0xf008
KERN: PCI-ATA: init channel done
KERN: ata 1: _DevicePresent: device selection failed for device 0
KERN: ata 1: _DevicePresent: device 1, presence 0
KERN: ata 1: deviceMask 0
KERN: ata 1: ignoring device 0
KERN: ata 1: ignoring device 1
KERN: ata 1 error: target device not present
Last message repeated 1 time
KERN: ata 1 error: invalid target device

It's indeed detected as an ATAPI device and then SCSI commands (encapsulated into ATAPI) are used for it. Apparently the SCSI stack has a bug and ends up doing a NULL pointer dereference.
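As a small aid for reading the "signature of device 0: 0xeb14" line: after a reset, a packet (ATAPI) device leaves 0x14 in the LBA mid register and 0xEB in the LBA high register, while a plain ATA disk leaves both at 0x00. The sketch below assumes the log value is composed as (LBA high << 8) | LBA mid, which matches the 0xeb14 shown; the table and function name are illustrative, not taken from the Haiku ATA bus manager.

```python
# Signature as (LBA high << 8) | LBA mid; byte order assumed from the log value.
SIGNATURES = {
    0x0000: "ATA",
    0xEB14: "ATAPI",
}


def classify_signature(signature):
    """Map a 16-bit post-reset device signature to a device class."""
    return SIGNATURES.get(signature & 0xFFFF, "unknown")


print(classify_signature(0xEB14))  # prints: ATAPI
```

This only confirms why the LS-120 is routed through the ATAPI/SCSI path; it says nothing about where the NULL pointer dereference happens.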
