Opened 8 years ago

Closed 7 years ago

#7898 closed bug (fixed)

Executing install-wifi-firmwares.sh results in page fault

Reported by: taos
Owned by: mmlr
Priority: normal
Milestone: R1
Component: Drivers/Network/ipw2100
Version: R1/Development
Keywords: ipw2100
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

Using hrev42569 (gcc2hybrid).

When executing the install-wifi-firmwares.sh script and choosing "I agree to the licenses. Install firmwares." in the alert window, Haiku enters KDL (PANIC: page fault, but interrupts were disabled...; for more information see KDL.jpg).

This is the first time I tried to use this script to install the firmware files, so I can't really be sure in which revision this problem occurred for the first time.

Attachments (5)

KDL.jpg (112.4 KB) - added by taos 8 years ago.
KDL_ints.jpg (122.6 KB) - added by taos 8 years ago.
Ints_before.jpg (109.5 KB) - added by taos 8 years ago.
Ints_after.jpg (72.7 KB) - added by taos 8 years ago.
KDL+Terminal.jpg (121.6 KB) - added by taos 8 years ago.

Change History (22)

by taos, 8 years ago

Attachment: KDL.jpg added

comment:1 by anevilyak, 8 years ago

Component: - General → System/Kernel
Owner: changed from nobody to axeld

comment:2 by anevilyak, 8 years ago

Oddly enough, the actual KDL looks completely unrelated to the wifi firmware script; the registrar is just asking for information on a team.

comment:3 by bonefish, 8 years ago

The _user_get_team_info() call is unrelated. As can be seen in the stack trace, it just happens to be preempted by a hardware interrupt (vector 42). The fault address is the eip, which is most likely caused by a jump or return to unmapped or non-executable memory. The return case would likely be due to stack corruption. The jump case could be a bad function pointer or a virtual function call on an invalid/deleted object. I don't know what the script does, but if it causes a driver to be unloaded and that driver doesn't unregister its hardware interrupt callback, a call to already unmapped code would happen.

I suppose the output of ints would be interesting. Furthermore, if this is a call through a bad/stale function pointer (the likely case IMO), the innermost function (i.e. the one doing the call) is not shown in the stack trace, since the called function is responsible for creating the stack frame that refers back to the caller. The return address will be on the stack, though. It should be right after the topmost iframe: where the stack trace says "kernel iframe at ... (end = ...)", the end address should be the stack address at which the return address is stored (in this case 0x81709d1c). The return address can be read via expr *0x81709d1c. Passing the return address as an argument to ls will print the function it belongs to.

comment:4 by taos, 8 years ago

I've attached the new KDL including output of ints, expr *....., and ls ... (see KDL_ints.jpg).

Here are the last lines:

kdebug> expr *0x81709f20
2147840315 (0x8005713b)
kdebug> ls 0x8005713b
0x8005713b = int_io_interrupt_handler + 0x6b (kernel_x86)

by taos, 8 years ago

Attachment: KDL_ints.jpg added

comment:5 by bonefish, 8 years ago

That does indeed look like a left-over I/O interrupt handler function of an already unloaded driver/kernel module. Comparing the ints output from before and after the crash might help to find the culprit. Though it could be a driver/module that had just been loaded and unloaded again. In that case it might be necessary to enable the debug output for the kernel module code.

comment:6 by taos, 8 years ago

After comparing the ints output before (see Ints_before.jpg) and after (see Ints_after.jpg) the crash, the problem seems to be related to the ipw2100 driver. The script is at the stage of installing the firmware for ipw2100 when the crash occurs (see KDL+Terminal.jpg).

The following excerpt of install-wifi-firmwares.sh shows the more interesting parts relevant to the installation of the ipw2100 firmware:

[...]

function UnlinkDriver()
{
	# remove the driver's symlink
	rm -f "${driversDir}/dev/net/${driver}"
}

function SymlinkDriver()
{
	# restore the driver's symlink
	cd "${driversDir}/dev/net/"
	ln -sf "../../bin/${driver}" "${driver}"
}

[...]

function SetFirmwarePermissions()
{
	cd ${firmwareDir}/${driver}/
	for file in * ; do
		if [ "$file" != "$driver" ] ; then
			chmod a=r $file
		fi
	done
}

[...]

function PreFirmwareInstallation()
{
	echo "Installing firmware for ${driver} ..."
	mkdir -p "${firmwareDir}/${driver}"
	UnlinkDriver
}

function PostFirmwareInstallation()
{
	SetFirmwarePermissions
	SymlinkDriver
	CleanTemporaryFiles
	echo "... firmware for ${driver} has been installed."
}

function InstallIpw2100()
{
	driver='ipw2100'
	PreFirmwareInstallation

	# Extract contents.
	local file='ipw2100-fw-1.3.tgz'

	# In case the file doesn't exist.
	if [ ! -e ${firmwareDir}/${driver}/$file ] ; then
		url='http://ipw2100.sourceforge.net/firmware.php?fid=4'
		OpenIntelFirmwareWebpage $url $file
	fi
	# TODO: handle when $file hasn't been saved in ${firmwareDir}/${driver}/

	# Install the firmware & license file by extracting in place.
	cd "${firmwareDir}/${driver}"
	gunzip < "$file" | tar xf -

	PostFirmwareInstallation
}

As I can't find any extracted firmware files I suppose the crash happened during the first step "PreFirmwareInstallation".

by taos, 8 years ago

Attachment: Ints_before.jpg added

by taos, 8 years ago

Attachment: Ints_after.jpg added

by taos, 8 years ago

Attachment: KDL+Terminal.jpg added

comment:7 by bonefish, 8 years ago (in reply to comment:6)

Component: System/Kernel → Drivers/Network
Owner: changed from axeld to mmlr
Status: new → assigned

Replying to taos:

As I can't find any extracted firmware files I suppose the crash happened during the first step "PreFirmwareInstallation".

Since the system crashes, the files might simply not have been written to disk yet, so I don't think their absence indicates where the crash happened.

The I/O interrupt handler is registered in the IPW2100 constructor and would be unregistered in the destructor. In ipw2100_open(), however, the IPW2100 object is leaked when IPW2100::Open() fails, so the destructor never runs and the handler stays registered. At first glance, IPW2100::Open() also loads the firmware.
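The leak pattern described above can be illustrated with a minimal, self-contained C++ sketch. This is not the actual ipw2100 source: `Device`, `OpenLeaky()`, `OpenFixed()`, and the handler table with its `InstallHandler()`/`RemoveHandler()` helpers are hypothetical stand-ins for the IPW2100 class, ipw2100_open(), and the kernel's interrupt handler registration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the kernel's interrupt handler table.
typedef int (*Handler)(void* cookie);

struct Entry {
	Handler	handler;
	void*	cookie;
};

const int kVectorCount = 256;
Entry gTable[kVectorCount];	// zero-initialized: no handlers installed

void InstallHandler(int vector, Handler handler, void* cookie)
{
	gTable[vector].handler = handler;
	gTable[vector].cookie = cookie;
}

void RemoveHandler(int vector)
{
	gTable[vector].handler = nullptr;
	gTable[vector].cookie = nullptr;
}

bool HasHandler(int vector)
{
	return gTable[vector].handler != nullptr;
}

// Hypothetical device class: the handler is registered in the constructor
// and unregistered in the destructor, as described for IPW2100.
class Device {
public:
	explicit Device(int vector) : fVector(vector)
	{
		InstallHandler(fVector, &Device::Interrupt, this);
	}

	~Device()
	{
		RemoveHandler(fVector);
	}

	// Assumption: opening fails when the firmware cannot be loaded.
	bool Open(bool firmwareAvailable) { return firmwareAvailable; }

private:
	static int Interrupt(void*) { return 0; }

	int fVector;
};

// Leaky variant: on failure the object is dropped without being deleted,
// so the destructor never runs and the handler stays in the table.
Device* OpenLeaky(int vector, bool firmwareAvailable)
{
	Device* device = new Device(vector);
	if (!device->Open(firmwareAvailable))
		return nullptr;	// leak: handler for 'vector' is still installed
	return device;
}

// Fixed variant: the object is destroyed on failure, removing the handler.
Device* OpenFixed(int vector, bool firmwareAvailable)
{
	Device* device = new Device(vector);
	if (!device->Open(firmwareAvailable)) {
		delete device;
		return nullptr;
	}
	return device;
}
```

In this model, a failed `OpenLeaky(42, false)` leaves `HasHandler(42)` true, while a failed `OpenFixed()` leaves no handler behind. In a real kernel, such a left-over entry would point into driver code that is unmapped once the driver is unloaded, which would match the fault described in comment:3.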

comment:8 by taos, 8 years ago

When playing around with install-wifi-firmwares.sh I noticed that the crash is only prevented when both UnlinkDriver (line 163) in PreFirmwareInstallation() and SymlinkDriver (line 170) in PostFirmwareInstallation() are commented out.

Last edited 8 years ago by taos

comment:9 by anevilyak, 8 years ago

That's to be expected: those two lines are what cause the driver to be reloaded (as it needs to be in order to make use of the firmware).

comment:10 by taos, 8 years ago

So a workaround without solving the underlying problem would be to execute the script only if the affected driver isn't loaded in the first place - e.g. when booting in safe mode?

Does this unloading/reloading issue only affect the ipw2100 driver, or will executing the script also result in a visit to KDL on systems with different hardware that need firmware installed (e.g. iprowifixxxx, ralinkwifi, etc)? Some of the other firmware files must be first downloaded (difficult if you don't have a working network connection after booting in safe mode). IIRC, ipw2100 is one of the few native non-BSD network drivers, could this explain a different behaviour when unloading/reloading (assuming the other drivers aren't affected)?

BTW, I was never able to actually see any wireless networks even after installing the firmware for ipw2100 (/dev/net/ipw2100/0 only shows a generic IP address - 169.254.0.x). If others are able to connect to a network, maybe the described problem is limited to my hardware - no one else ever reported a bug AFAIK.

comment:11 by anevilyak, 8 years ago (in reply to comment:10)

Replying to taos:

Does this unloading/reloading issue only affect the ipw2100 driver, or will executing the script also result in a visit to KDL on systems with different hardware that need firmware installed (e.g. iprowifixxxx, ralinkwifi, etc)? Some of the other firmware files must be first downloaded (difficult if you don't have a working network connection after booting in safe mode). IIRC, ipw2100 is one of the few native non-BSD network drivers, could this explain a different behaviour when unloading/reloading (assuming the other drivers aren't affected)?

Without more information it's difficult to ascertain if this is a bug in a specific driver, multiple drivers, or in the kernel's driver reloading functionality, so that one can't really be answered at this point. Furthermore, it'd only impact systems where the driver finds any supported hardware at all, since it otherwise wouldn't get as far as installing an interrupt handler.

comment:12 by bonefish, 8 years ago

As written in comment:7 there definitely is a bug in ipw2100_open() (leaking the IPW2100 object when the Open() on it fails), which could result in exactly the reported problem. Whether that actually causes the issue in this case I cannot stringently deduce from the given information (it's certainly possible that there's another leak or other bug as well, though this is the prime suspect). At any rate this needs to be fixed anyway, so we can just see whether the issue goes away afterwards.

comment:13 by taos, 8 years ago

Keywords: ipw2100 added

Today, I had the chance to experiment a little with a laptop with an Intel PRO/Wireless 2915ABG chip that needs the iprowifi2200 firmware: no KDL or other problems when executing the script, so unloading/reloading a driver by removing/creating a symlink doesn't seem to be a problem in itself. So ipw2100 seems to be the culprit, as bonefish already wrote.

comment:14 by diver, 8 years ago

Component: Drivers/Network → Drivers/Network/ipw2100

comment:15 by mmlr, 8 years ago

As mentioned in #7938 just now, the same applies here: don't bother investigating this. The native driver has the mentioned issues, but even if it didn't, it doesn't use the FreeBSD compatibility layer. Since the current userland wireless interface uses the FreeBSD ioctls to communicate with the drivers, this would have to be "ported" into the native driver to make it work. Since there is also a presumably better tested FreeBSD driver for this device already, the native one can simply be dumped. I'll do that as soon as I get around to it.

comment:16 by mmlr, 8 years ago

As an update: I've enabled the iprowifi2100 driver locally and tested it with the machine I used when developing the ipw2100 driver. While the driver seems to basically run, it unfortunately doesn't react to any connection attempts and is therefore currently unusable. It is possible that the driver simply needs to be updated to a more current version, or there might be some subtle bug hiding. I'll need to look into this further.

comment:17 by modeenf, 7 years ago

Resolution: fixed
Status: assigned → closed

Should be fixed in hrev44218.
