Opened 6 years ago

Closed 4 years ago

#14790 closed bug (fixed)

Random KDL on Acer Aspire One ZG5

Reported by: AlienSoldier
Owned by: waddlesplash
Priority: high
Milestone: Unscheduled
Component: Drivers/Network/atheroswifi
Version: R1/Development
Keywords: KDL
Cc:
Blocked By:
Blocking:
Platform: x86

Description

As I am currently testing this computer in daily use, I will be reporting a bunch of problems with it in the future, so get used to that :)

I get a random KDL 2-3 times a day. Attached is a screenshot.

I don't know if this is network related; so far I have only connected it to the internet over Wi-Fi. I will have to test for some days with an Ethernet cable to see if this also happens.

It is impossible to exit or continue from that KDL.

hrev52698+1

Attachments (7)

Aspire One KDL.bmp (943.3 KB ) - added by AlienSoldier 6 years ago.
wpa_supplicant-497-debug-09-01-2019-01-59-21.report (18.3 KB ) - added by AlienSoldier 6 years ago.
wpa supplicant crash
wpa_supplicant-966-debug-09-01-2019-16-44-52.report (10.5 KB ) - added by humdinger 6 years ago.
similar but slightly different(?)
wpa_supplicant-1003-debug-09-01-2019-16-45-27.report (9.6 KB ) - added by humdinger 6 years ago.
Unable to retrieve disassembly
Aspire One KDL strike back.jpg (1.4 MB ) - added by AlienSoldier 6 years ago.
atheroswifi.7z (2.1 MB ) - added by waddlesplash 6 years ago.
IMG_0440.JPG (1.6 MB ) - added by AlienSoldier 6 years ago.
picture with the more verbose atheroswifi file

Change History (44)

by AlienSoldier, 6 years ago

Attachment: Aspire One KDL.bmp added

comment:1 by waddlesplash, 6 years ago

Component: - General → Drivers/Network/atheroswifi
Owner: changed from nobody to waddlesplash

comment:2 by waddlesplash, 6 years ago

It's a bug in the FreeBSD WiFi driver, indeed.

comment:3 by waddlesplash, 6 years ago

Please retest after hrev52726.

comment:4 by AlienSoldier, 6 years ago

The driver no longer starts now. Attached is the crash report. No KDL has happened so far, just a crash of wpa_supplicant.

by AlienSoldier, 6 years ago

wpa supplicant crash

comment:5 by waddlesplash, 6 years ago

The printfs in question are here: https://github.com/haiku/wpa_supplicant/blob/haiku/src/drivers/driver_bsd.c#L1569

I'm a little stumped as to how this could cause a crash on one invocation but not another; either it should always crash or never crash (and WPA/WPA2 works just fine here.)

comment:6 by AlienSoldier, 6 years ago

And yet it does :) I see a bunch of printfs, but I don't know what "vprintf" does.

Could it be something behaving differently on 32 and 64 bit? Something to do with the amount of RAM or the response speed of the specific hardware?

I updated through the software updater instead of from a clean image file, if that could make a difference.

Currently that laptop is set to French. I guess this has no effect, but as some words can be longer, perhaps it can cause a buffer/array overflow since a printf is involved (I am really digging deep here, but when stumped, anything might be it).

comment:7 by AlienSoldier, 6 years ago

Also, could it be that the key or stored password structure/settings file changed with the new driver, and that a file needs to be deleted and recreated by the user?

comment:8 by waddlesplash, 6 years ago

No, this is in wpa_supplicant which did not change at all with the new drivers.

comment:9 by humdinger, 6 years ago

I see the same crashing wpa_supplicant after updating from hrev52714 to hrev52731.
I tried deleting ~/config/settings/system/keystore/keystore_database with no success. Besides the same crash report, I also got two different ones that I'll attach.

Unable to retrieve disassembly for IP 0x480: address not contained in any valid image.

probably not helpful and

			Frame memory:
				[0x7a569eb4]  ....   18 0f a8 00
		0x7a569ed8	0x9e604b	pthread_testcancel + 0x13 
		0x7a569ef8	0xa4e911	write + 0x31 
		0x7a569f28	0x9fec38	_IO_new_file_write + 0x34 
		0x7a569f58	0x9fe1ff	new_do_write + 0x7f 
		0x7a569f88	0x9fe163	_IO_new_do_write + 0x27 
		0x7a569fb8	0x9fe469	_IO_new_file_overflow + 0xbd 
		0x7a569ff8	0x9fed72	_IO_file_xsputn + 0xfe 
		0x7a56b708	0xa19375	vfprintf + 0x1ad 
		0x7a56b738	0xa14c3b	printf + 0x27 
		0x7a56b778	0x11ddea4	wpa_printf + 0x40 
		0x7a56b7d8	0x120a73f	wpa_driver_bsd_capa + 0x73 
		00000000	0x000480	? 

by humdinger, 6 years ago

similar but slightly different(?)

by humdinger, 6 years ago

Unable to retrieve disassembly

comment:10 by humdinger, 6 years ago

Sorry, I just now saw the title of the ticket... This should probably be a new ticket? I have an iprowifi4965, 16 GiB of RAM, i7 running 32bit Haiku.

comment:11 by waddlesplash, 6 years ago

Yes, please make a new ticket for the wpa_supplicant crash, as this seems to be entirely unrelated.

comment:13 by waddlesplash, 6 years ago

wpa_supplicant crash fixed in hrev52736; so please retest and see if the KDL is fixed or not.

comment:14 by AlienSoldier, 6 years ago

wpa_supplicant is indeed fixed, very nice. I will test for 1-2 days to see if the KDL is also gone or if its frequency changed.

comment:15 by AlienSoldier, 6 years ago

I have not seen a KDL in two days; I think the problem is gone with the updated driver.

comment:16 by waddlesplash, 6 years ago

Resolution: fixed
Status: new → closed

Hooray!

comment:17 by AlienSoldier, 6 years ago

Sadly this needs to be reopened; it happened twice today. It is almost the same KDL screen, but I am still adding a screenshot in case the added line can help:

by AlienSoldier, 6 years ago

comment:18 by waddlesplash, 6 years ago

Resolution: fixed
Status: closed → reopened

comment:19 by adrian, 6 years ago

Hi! So I'd like to fix it, but I need to know the line number of the code where sta_input is dying. Can you work with some HaikuOS devs to figure out how to extract the line number when this fails next? That way I can go see what's going on. Thanks!

comment:20 by waddlesplash, 6 years ago

s/HaikuOS/Haiku/. :-p

Yes, I can make a debug build of the atheroswifi driver for AlienSoldier to test.

comment:21 by AlienSoldier, 6 years ago

I will be glad to test anything :)

by waddlesplash, 6 years ago

Attachment: atheroswifi.7z added

comment:22 by waddlesplash, 6 years ago

Here's the driver, its sha256sum (extracted) should be 0d9ccc480dfeb9e57a1404ac7dd17d38d5619c5cd733edfcd0b6e6cd9b734cb7.

I think you know the drill: blacklist old driver, reboot, put this in non-packaged/add-ons/kernel/drivers/bin, use WiFi.
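For anyone who wants to script the checksum comparison, here is a minimal sketch in Python. It assumes Python is available on the test machine; the helper names `sha256_of_file` and `matches_expected` are illustrative and not part of any Haiku tooling.

```python
import hashlib

# Checksum quoted in comment:22 for the extracted debug driver.
EXPECTED_SHA256 = "0d9ccc480dfeb9e57a1404ac7dd17d38d5619c5cd733edfcd0b6e6cd9b734cb7"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals `expected` (case-insensitive)."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("atheroswifi")` from the directory holding the extracted file (the path here is a placeholder) should return `True` only if the download is intact.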

comment:23 by AlienSoldier, 6 years ago

Installed, seems fine so far. Will that just spew more detail about the crash when it KDLs, or will it only write to a file?

comment:24 by waddlesplash, 6 years ago

The backtrace will be slightly different, so just take a new picture. And as this binary has debugging info, we'll be able to determine what line the crash is occurring on from the fault address.

by AlienSoldier, 6 years ago

Attachment: IMG_0440.JPG added

picture with the more verbose atheroswifi file

comment:25 by AlienSoldier, 6 years ago

Unless I am missing something, it looks very similar. I used this info to blacklist the normal driver: https://www.haiku-os.org/blog/barrett/2013-12-15_how_permanently_blacklist_package_file/

Before that occurrence it crashed once, but the person who took the photo forgot to turn off the flash, making it unreadable.

comment:26 by waddlesplash, 6 years ago

$ addr2line -e atheroswifi 0x179C77
src/libs/compat/freebsd_wlan/net80211/ieee80211_sta.c:779

ieee80211_sta.c:779.
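
The lookup above can also be scripted. This is a minimal sketch, assuming binutils `addr2line` is on the PATH and the driver binary carries debug info; the helper names `addr2line_command` and `resolve` are hypothetical.

```python
import subprocess


def addr2line_command(binary, address):
    """Build the addr2line invocation for one address inside `binary`."""
    # Accept either an int or a hex string such as "0x179C77".
    if isinstance(address, int):
        address = hex(address)
    return ["addr2line", "-e", binary, address]


def resolve(binary, address):
    """Run addr2line and return its output as a 'file:line' string."""
    result = subprocess.run(
        addr2line_command(binary, address),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

`resolve("atheroswifi", "0x179C77")` runs the same command as shown in this comment; it only gives a meaningful answer against the debug build attached to this ticket.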

comment:27 by waddlesplash, 6 years ago

This may be fixed in hrev53135: there was a logic error in crypto_decap which caused a memory leak on FreeBSD, but due to the asserts in our mbuf implementation, may have also caused this bug. Please retest after that.

comment:28 by AlienSoldier, 6 years ago

I am currently running a test over a few weeks to see if it happens again; I will report back in 2-3 weeks.

comment:29 by diver, 5 years ago

So how did the test go? :)

comment:30 by korli, 4 years ago

can this be closed as fixed?

comment:31 by AlienSoldier, 4 years ago

I don't think it was fixed, especially as no obvious cause was ever found. We could close it and I will reopen if needed. I have not updated it lately because I don't want to have the no-icon-at-boot problem I have with my other laptop.

Got one KDL today, but it was in the input server, so other KDLs might happen before this one does (I can never get 24 hours without a crash on that laptop).

So closing, probably for now; sadly it is probably not fixed, as I can't get an uptime long enough to tell.

comment:32 by waddlesplash, 4 years ago

Is the input_server crash reported anywhere?

comment:33 by AlienSoldier, 4 years ago

Not yet. Like I said, it currently has an older version from around last summer. I don't want to create tickets for things that may already be solved.

comment:34 by AlienSoldier, 4 years ago

As #16670 seems fixed, I updated the laptop; I will finally be able to launch a good week of testing again.

comment:35 by korli, 4 years ago

can this be closed as fixed?

comment:36 by AlienSoldier, 4 years ago

Yes, close it. I will reopen this winter when I have more time if a commit brings this back.

comment:37 by korli, 4 years ago

Resolution: fixed
Status: reopened → closed