Opened 12 years ago

Closed 12 years ago

#1434 closed bug (fixed)

device manager loads driver too often

Reported by: marcusoverhagen Owned by: axeld
Priority: normal Milestone: R1/alpha1
Component: System/Kernel Version: R1/pre-alpha1
Keywords: Cc:
Blocked By: Blocking:
Has a Patch: no Platform: All

Description

[23:05] <dr_evil> hi _stippi_
[23:05] <_stippi_> hi dr_evil
[23:06] <_stippi_> I hope you are making good progress now?
[23:06] <_stippi_> :-)
[23:06] <dr_evil> _stippi_ no, there is a strange bug in the device manager, concerning loading the driver
[23:07] * hUMUNGUs has joined #haiku
[23:07] <_stippi_> dr_evil: still?
[23:08] <_stippi_> I thought it already loaded the driver. :-(
[23:08] <dr_evil> yes, it is loaded 4 times for the same PCI device
[23:08] <_stippi_> grrr
[23:09] <_stippi_> but when I scanned the tickets today, I saw you already had some PCI manager tickets assigned to you... :-)
[23:09] <dr_evil> the PCI manager is working perfectly well, at the R5 level :)
[23:10] <dr_evil> let me show you something fyi
[23:10] <cps1966> maybe it has multi functions
[23:10] <CIA-22> marcusoverhagen * hrev22099 /haiku/trunk/src/add-ons/kernel/busses/scsi/ahci/ (ahci_controller.cpp ahci_controller.h ahci_sim.cpp): added a workaround to prevent loading the driver multiple times for the same device
[23:11] <_stippi_> oh wait I'm in Haiku
[23:11] <dr_evil> no, wait
[23:11] <_stippi_> do you have a URL for the diff? I mean berlios?
[23:12] <dr_evil> I wanted to show you something different: http://overhagen.de/temp/ahci-serial.txt
[23:12] <dr_evil> it starts with
[23:12] <dr_evil> ahci: controller found! vendor 0x8086, device 0x7111
[23:12] * stargater has joined #haiku
[23:12] <stargater> hi
[23:13] <dr_evil> the driver is loaded for the first time:
[23:13] <dr_evil> ahci: controller found! vendor 0x8086, device 0x7111
[23:13] * DeadYak has quit IRC ("using sirc version 2.211+KSIRC/1.3.12")
[23:13] <dr_evil> ahci: AHCIController::Init 0:7:1 vendor 8086, device 7111
[23:13] <dr_evil> and then unloaded
[23:13] <dr_evil> ahci: AHCIController::Uninit
[23:13] <dr_evil> then it's loaded again:
[23:13] <dr_evil> AHCIController::Init 0:7:1 vendor 8086, device 7111
[23:13] * PulkoMandy has quit IRC (Remote closed the connection)
[23:14] <stargater> dr_evil: that's not so good ?
[23:14] <dr_evil> but this time, the scsi stack doesn't try to scan devices!?!
[23:14] <dr_evil> and later, we get this:
[23:14] <dr_evil> ahci: AHCIController::Init 0:7:1 vendor 8086, device 7111
[23:14] <dr_evil> AHCIController::Init ERROR: an instance for object 0:7:1 already exists
[23:14] <dr_evil> init driver failed (node 0x90ac6100, busses/scsi/ahci/sim/v1): General system error
[23:14] <dr_evil> init driver failed (node 0x90ac6180, bus_managers/scsi/bus/v1): General system error
[23:14] <dr_evil> ahci: ahci_sim_init_bus, userCookie 0x90ac8000
[23:14] <dr_evil> AHCIController::Init ERROR: getting PCI info failed!
[23:14] <dr_evil> init driver failed (node 0x90ac6100, busses/scsi/ahci/sim/v1): General system error
[23:14] <dr_evil> init driver failed (node 0x90ac6180, bus_managers/scsi/bus/v1): General system error
[23:14] <dr_evil> init driver failed (node 0x90ac6200, bus_managers/scsi/bus/raw): General system error
[23:15] <dr_evil> the driver wasn't unloaded, but gets loaded again, which fails because I added a workaround
[23:15] <dr_evil> then it's loaded again, which seems to fail because the PCI manager pointer is invalid
[23:15] <_stippi_> I am afraid I can't help much
[23:16] <_stippi_> I have absolutely no knowledge of the code, didn't even read any of it
[23:16] <dr_evil> yes ok, I'll file a bug report :/
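
The workaround mentioned for hrev22099 boils down to remembering which PCI locations already have a controller instance and refusing a second one. The following is only a minimal sketch of that idea, not the code from the ahci driver: PciAddress, ControllerRegistry and the integer status codes are hypothetical names made up for illustration.

```cpp
#include <set>
#include <tuple>

// Hypothetical stand-in for a PCI location (bus:device:function), e.g. 0:7:1.
struct PciAddress {
	int bus;
	int device;
	int function;

	bool operator<(const PciAddress& other) const
	{
		return std::tie(bus, device, function)
			< std::tie(other.bus, other.device, other.function);
	}
};

// Illustrative status codes; the real driver uses Haiku's status_t values.
const int kOk = 0;
const int kAlreadyExists = -1;

// Tracks which PCI locations currently have a controller instance.
class ControllerRegistry {
public:
	// Returns kAlreadyExists if a controller is registered for this address.
	int Register(const PciAddress& address)
	{
		if (!fActive.insert(address).second)
			return kAlreadyExists;
		return kOk;
	}

	// Called on uninit, so that a later init of the same device succeeds.
	void Unregister(const PciAddress& address)
	{
		fActive.erase(address);
	}

private:
	std::set<PciAddress> fActive;
};
```

With such a guard, the Init / Uninit / Init sequence from the log succeeds, while a further Init without a preceding Uninit is rejected, which would match the "already exists" error quoted above.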

Attachments (3)

ahci-serial.txt (38.2 KB ) - added by marcusoverhagen 12 years ago.
serial-port-2.txt (38.2 KB ) - added by marcusoverhagen 12 years ago.
devicemanager.txt (99.9 KB ) - added by marcusoverhagen 12 years ago.


Change History (11)

comment:1 by marcusoverhagen, 12 years ago

I'm not sure whether this is related to the driver getting loaded for the standard IDE (non-AHCI) controller when doing a test build for VMware (#if 1 in ahci.c changed to 0).

I don't even know why ahci is loaded at all when the ide_isa device claims to support it better.

module: busses/scsi/ahci/device_v1, support: 0.5
module: busses/ide/generic_ide_pci/device_v1, support: 0.3
module: busses/ide/legacy_sata/device_v1, support: 0
module: busses/ide/silicon_image_3112/device_v1, support: 0
[...]
module: busses/ide/ide_isa/device_v1, support: 0.6

comment:2 by axeld, 12 years ago

The problem is obviously caused by rescanning the busses in device_manager_rescan_bus(). We could just disable that for the time being - this would disable all non-boot new-style drivers, though.

comment:3 by marcusoverhagen, 12 years ago

Hi Axel,

Thank you for the information. I'll try disabling the rescanning and see how it works out for me, although I don't know which non-boot new-style drivers we have.

I also noticed that it's correct that both ahci and ide_isa are loaded. They are connected to different busses (PCI and IDE), so the device manager is doing the right thing.

One thing still appears to be a bug in the device manager, though.

Somewhere in http://overhagen.de/temp/ahci-serial.txt you will find: AHCIController::Init ERROR: getting PCI info failed!

At that point, ahci_sim_init_bus hasn't managed to call ahci_init_driver, but still appears to continue initializing the AHCIController. I think that gDeviceManager->init_driver might have failed to initialize the parent, but returned B_OK.

	TRACE("ahci_sim_init_bus, userCookie %p\n", userCookie);

	// initialize parent (the bus) to get the PCI interface and device
	parent = gDeviceManager->get_parent(node);
	status = gDeviceManager->init_driver(parent, &pciDevice, NULL, NULL);
	gDeviceManager->put_device_node(parent);
	if (status != B_OK)
		return status;

	controller = new(std::nothrow) AHCIController(node, pciDevice);
	if (!controller)
		return B_NO_MEMORY;
	status = controller->Init();

comment:4 by axeld, 12 years ago

Indeed, if the driver is initialized already, it will just fill in the cookie and interface pointers, but will ignore the userCookie and return B_OK.

Looks like that userCookie stuff wasn't thought through that much (or was never intended to actually return values) - either way, that's just another reason to get rid of it.
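
To illustrate why this bites ahci_sim_init_bus, here is a heavily simplified model of the behavior described in this comment. It is a sketch under assumptions, not the device manager's code: Node, InitDriver and kOk are hypothetical names, and an int stands in for the PCI device handed out through userCookie.

```cpp
#include <cstddef>

// Illustrative status code; the real code returns B_OK.
const int kOk = 0;

// Hypothetical, heavily simplified stand-in for a device node.
struct Node {
	bool initialized = false;
	void* driverCookie = nullptr;
	int pciDevice = 42;	// pretend payload handed out through userCookie
};

// Models the described behavior: on an already initialized node the call
// returns kOk without ever touching userCookie.
int InitDriver(Node& node, void* userCookie, void** cookie)
{
	if (node.initialized) {
		if (cookie != nullptr)
			*cookie = node.driverCookie;
		return kOk;	// userCookie is ignored on this path
	}

	node.initialized = true;
	node.driverCookie = &node;
	if (userCookie != nullptr)
		*static_cast<int**>(userCookie) = &node.pciDevice;
	if (cookie != nullptr)
		*cookie = node.driverCookie;
	return kOk;
}
```

In this model, a caller that passes the address of an uninitialized pointer as userCookie gets a success status on the second call while the pointer still holds whatever was there before. Setting the pointer to NULL before the call and checking it afterwards would at least turn that into a detectable error.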

by marcusoverhagen, 12 years ago

Attachment: ahci-serial.txt added

by marcusoverhagen, 12 years ago

Attachment: serial-port-2.txt added

comment:5 by marcusoverhagen, 12 years ago

I removed the rescanning from device_manager_init_post_modules and put in the "device_manager_init_post_modules: NOT rescanning" dprintf output.

Obviously, AHCIController::Init can be seen 4 times before that line, so it's not a bug that happens when rescanning.

See serial-port-2.txt.
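
A check like this can be repeated on any serial log by counting matching lines that appear before the marker line. This is a small sketch for that purpose only; CountBeforeMarker is a hypothetical helper, not part of any Haiku tool.

```cpp
#include <sstream>
#include <string>

// Counts lines containing `needle` that appear before the first line
// containing `marker`. If the marker never appears, all lines are counted.
int CountBeforeMarker(const std::string& log, const std::string& needle,
	const std::string& marker)
{
	std::istringstream stream(log);
	std::string line;
	int count = 0;
	while (std::getline(stream, line)) {
		if (line.find(marker) != std::string::npos)
			break;
		if (line.find(needle) != std::string::npos)
			count++;
	}
	return count;
}
```

Called with the log contents, "AHCIController::Init" as the needle and "NOT rescanning" as the marker, it returns the number of Init calls that happened before the rescan would have started.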

by marcusoverhagen, 12 years ago

Attachment: devicemanager.txt added

comment:6 by marcusoverhagen, 12 years ago

This is turning into a serious problem now. As can be seen in the devicemanager.txt attachment, the board has two ahci controller devices, "0:31:2" and "5:0:0".

The device manager loads the driver multiple times (1st error), and since the userCookie isn't always set correctly, the pciDevice pointer is sometimes invalid, which sometimes leads to a crash.

The pointer 0x90926300 in this crash isn't correct.

ahci: ahci_sim_init_bus: pciDevice 0x90926300
ahci: AHCIController::Init 0:0:0 vendor 8086, device 277c
vm_soft_fault: kernel thread accessing invalid user memory!
vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x31, ip 0x802c12c1, write 0, user 0, thread 0x9
CPU 1 halted!
PANIC: vm_page_fault: unhandled page fault in kernel space at 0x31, ip 0x802c12c1

comment:7 by marcusoverhagen, 12 years ago

Milestone: R1 → R1/alpha

comment:8 by korli, 12 years ago

Resolution: fixed
Status: new → closed

Hopefully fixed in hrev22726.
