#17458 closed bug (fixed)
no wifi anymore after latest update
Reported by: | tojoko | Owned by: | waddlesplash |
---|---|---|---|
Priority: | high | Milestone: | Unscheduled |
Component: | Network & Internet/Wireless | Version: | R1/Development |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description (last modified by )
I don't know if that is any help, but Alex said it probably would be. hrev55697 still works like a charm (x86).
Attachments (3)
Change History (26)
by , 3 years ago
Attachment: | syslog.old added |
---|
by , 3 years ago
Attachment: | previous_syslog added |
---|
comment:1 by , 3 years ago
Description: | modified (diff) |
---|
comment:2 by , 3 years ago
It appears no DHCP ever runs because ieee80211_notify_node_join never is triggered?
threedeyes says several Russian users on Telegram are seeing this after updating, on both x86_gcc2 and x86_64.
comment:3 by , 3 years ago
Milestone: | Unscheduled → R1/beta4 |
---|---|
Platform: | x86 → All |
Priority: | normal → critical |
Version: | → R1/Development |
Probably a GCC 11 regression, I'd wager.
comment:4 by , 3 years ago
comment:5 by , 3 years ago
kallisti5 confirms by testing that adding SubDirCcFlags -O0 ; to the net80211 Jamfile fixes this. So, it seems we have some kind of miscompilation somehow.
comment:6 by , 3 years ago
diff --git a/src/libs/compat/freebsd_wlan/net80211/Jamfile b/src/libs/compat/freebsd_wlan/net80211/Jamfile
index a5086aed36..aab04c0445 100644
--- a/src/libs/compat/freebsd_wlan/net80211/Jamfile
+++ b/src/libs/compat/freebsd_wlan/net80211/Jamfile
@@ -12,7 +12,7 @@ Includes [ FGristFiles kernel_c++_structs.h ]
 	: <src!system!kernel>kernel_c++_struct_sizes.h ;
 
 SubDirCcFlags [ FDefines _KERNEL=1 FBSD_DRIVER=1 ]
-	-Wno-format -Wno-unused -Wno-uninitialized ;
+	-Wno-format -Wno-unused -Wno-uninitialized -O0 ;
 SubDirC++Flags [ FDefines _KERNEL=1 FBSD_DRIVER=1 ] ;
 
 SEARCH_SOURCE += [ FDirName $(SUBDIR) .. crypto rijndael ] ;
solves the issue for me. -O1 doesn't work... only -O0
We need to dig into why.
comment:7 by , 3 years ago
I am having the same problem with iprowifi4965.
Is it possible to commit the workaround for now? I understand that it is important to find out why this fails, but working on a Haiku laptop without Wifi is a bit hard.
comment:8 by , 3 years ago
I committed the workaround in hrev55711.
The FreeBSD developers haven't run into anything similar (they use Clang for building now by default however, and it may not have whatever optimization or bug this is yet.) So we are on our own to debug this, it seems.
I'll try to take a look before too long, I imagine -O0 makes quite a performance difference...
comment:9 by , 3 years ago
Priority: | critical → blocker |
---|
comment:10 by , 3 years ago
Seems fixed to me; at least it works fine for me again now. Please confirm.
comment:12 by , 3 years ago
For anyone who gets "stuck" between hrev55706 and hrev55711 in the nightly images:
- Reboot, enter bootloader menu
- Choose the Haiku volume, select a previous package activation state
- Reboot, confirm wifi once again working
- Upgrade to hrev55711 or later.
This one is going to remain open as a reminder for us to investigate why we have to disable all optimization on the net80211 freebsd stack after upgrading to GCC11.
comment:13 by , 3 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:14 by , 3 years ago
> It appears no DHCP ever runs because ieee80211_notify_node_join never is triggered?
I think the difference in the logs starts much earlier than that.
They are similar until this line:
3035 KERN: [net/atheroswifi/0] sta_newstate: INIT -> SCAN (0)
5863 KERN: [net/atheroswifi/0] sta_newstate: INIT -> SCAN (0)
In the log before the update this is followed by a lot of media_change. Then eventually we get to
3888 KERN: ieee80211_notify_scan_done
But in the new log this notify_scan_done happens immediately, and nothing happens after it. It looks like the scan for networks never actually starts, and we never get back to the INIT state or move to any other state.
by , 3 years ago
Attachment: | wlan_scan_verbose.txt added |
---|
Syslog with extra tracing for scan enabled
comment:15 by , 3 years ago
Attached a syslog including extra traces (added IEEE80211_MSG_SCAN to iv_debug). The scan appears to complete normally, but nothing is triggered after that.
The tracing isn't great: there are a lot of "KERN: " headers, often for just one or a few characters. Shouldn't there be some line buffering on this?
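The line buffering asked about above could look roughly like the following. This is only a sketch: `trace_buf`, `trace_flush`, `trace_write` and `TRACE_LINE_MAX` are hypothetical names invented for illustration, and none of this is taken from Haiku's syslog code or from net80211. The idea is to collect fragments until a newline arrives (or the buffer fills) and only then emit a single prefixed line.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical buffer size for this sketch. */
#define TRACE_LINE_MAX 256

/* Hypothetical per-source line buffer; not a Haiku or net80211 type. */
struct trace_buf {
	char   data[TRACE_LINE_MAX];
	size_t len;
};

/* Write the buffered text as one prefixed line; returns 1 if a line was written. */
static int trace_flush(struct trace_buf *tb, FILE *out)
{
	if (tb->len == 0)
		return 0;
	fprintf(out, "KERN: %.*s\n", (int)tb->len, tb->data);
	tb->len = 0;
	return 1;
}

/*
 * Append a fragment to the buffer. A newline, or a full buffer, flushes
 * the pending text as a single line. Returns the number of lines written.
 */
static int trace_write(struct trace_buf *tb, FILE *out, const char *frag)
{
	int lines = 0;

	assert(tb != NULL && out != NULL && frag != NULL);
	for (; *frag != '\0'; frag++) {
		if (*frag == '\n') {
			lines += trace_flush(tb, out);
			continue;
		}
		tb->data[tb->len++] = *frag;
		if (tb->len == TRACE_LINE_MAX)
			lines += trace_flush(tb, out);
	}
	return lines;
}
```

With this, writing "scan" and then "_done" emits nothing, and the later newline produces one "KERN: scan_done" line instead of several short ones.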
comment:16 by , 3 years ago
I tracked down the one file that is the real culprit here and restricted the -O0 to just that file in hrev55920.
comment:17 by , 3 years ago
The function in question appears to be:
static void scan_end(struct ieee80211_scan_state *ss, int scandone)
Compiling just it with -O0, via
#pragma GCC optimize ("-O0")
...
#pragma GCC reset_options
produces working scan results again.
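For reference, a self-contained sketch of the same per-function technique. It is not the net80211 code: `sum_first_n_unoptimized` and `sum_first_n` are hypothetical stand-ins for the affected function, and the sketch uses `#pragma GCC push_options` / `#pragma GCC pop_options` as an alternative to `reset_options`, so that the previous settings are restored rather than the command-line defaults. Compilers other than GCC may ignore these pragmas.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the function that misbehaves when optimized.
 * Everything between push_options and pop_options is compiled at -O0,
 * regardless of the -O level given on the command line.
 */
#pragma GCC push_options
#pragma GCC optimize ("O0")
static int sum_first_n_unoptimized(int n)
{
	int total = 0;

	assert(n >= 0);
	for (int i = 1; i <= n; i++)
		total += i;
	return total;
}
#pragma GCC pop_options

/* Same logic, compiled with whatever -O level the rest of the file uses. */
static int sum_first_n(int n)
{
	int total = 0;

	assert(n >= 0);
	for (int i = 1; i <= n; i++)
		total += i;
	return total;
}
```

Both functions should return the same value for the same input; comparing the assembly generated for the two (for example with `gcc -S`) is one way to see what the optimizer changed for a single function.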
comment:18 by , 3 years ago
Even just between -O0 and -O1 and disabling a number of optimizations, the assembly differences are so large that it is hard to make out exactly what changed to cause this problem...
comment:19 by , 3 years ago
That's kind of expected in this case? It would probably be optimizations removing code after determining that it makes no sense due to undefined behavior?
GCC documents the specific optimizations enabled at -O1: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html so the next step would be figuring out which combination of these is involved here. Hopefully we then get a simpler diff when we change just one optimization setting between working and non-working.
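One way to change a single optimization setting for a single function is GCC's `optimize` function attribute, which the GCC manual recommends for debugging only. The sketch below assumes that approach; `BISECT_OPTS`, `BISECT_ATTR` and `checksum` are hypothetical names, and `no-tree-dce` is just an example flag, not one known to be involved in this ticket. The flags actually in effect at a given level can be listed with `gcc -Q -O1 --help=optimizers`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical bisection knob. Override per build, for example:
 *   gcc -O1 -DBISECT_OPTS='"O1","no-tree-dce"' file.c
 * and change one flag at a time between a working and a failing build.
 */
#ifndef BISECT_OPTS
#define BISECT_OPTS "O1", "no-tree-dce"
#endif

/* The optimize attribute is GCC-specific, so it is left out elsewhere. */
#if defined(__GNUC__) && !defined(__clang__)
#define BISECT_ATTR __attribute__((optimize(BISECT_OPTS)))
#else
#define BISECT_ATTR
#endif

/* Hypothetical function under test; only this one gets the custom flags. */
BISECT_ATTR
static int checksum(const unsigned char *buf, int len)
{
	int sum = 0;

	assert(buf != NULL && len >= 0);
	for (int i = 0; i < len; i++)
		sum = (sum + buf[i]) & 0xffff;
	return sum;
}
```

Keeping the rest of the file at its normal optimization level limits the assembly diff to the one function being examined.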
comment:20 by , 3 years ago
That list of optimizations is actually incomplete, as I've discovered when bisecting such regressions before. Instead, there is an option in GCC that displays all optimization passes and which ones are enabled or disabled.
I already ran this code when compiled with -O0 -fsanitize=undefined, and got nothing of interest from it (just some signed integer underflows in other parts of the code.)
At some point it is just not worth spending more time on this. The -O0 is now reduced to a single file instead of all of them, and GCC 11 has quite a lot of acknowledged bugs. FreeBSD builds with Clang now, so they don't care about this kind of thing.
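As background for the undefined-behavior theory and the sanitizer run mentioned above, here is a textbook illustration of how signed overflow can make code behave differently once optimized. It is not the cause of this bug, which was never identified, and all names in it are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/*
 * Relies on signed overflow, which is undefined behavior in C. An
 * optimizing compiler may assume x + 1 never wraps and reduce this to 0.
 * A -fsanitize=undefined build reports the overflow at run time when
 * x == INT_MAX.
 */
static int next_would_wrap_ub(int x)
{
	return x + 1 < x;
}

/* Well-defined version: compares against the limit before adding. */
static int next_would_wrap(int x)
{
	return x == INT_MAX;
}

/* Increment that refuses to overflow instead of relying on wraparound. */
static int checked_increment(int x)
{
	assert(!next_would_wrap(x));
	return x + 1;
}
```

The first function typically behaves as its author intended at -O0 and may stop doing so at higher levels, which is the general pattern suspected in this ticket.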
comment:21 by , 3 years ago
Milestone: | R1/beta4 → Unscheduled |
---|---|
Priority: | blocker → high |
comment:22 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
I think I am just going to call this one "fixed" and leave the workaround in place rather than trying to investigate further.
comment:23 by , 3 years ago
Works now; tested on both 32-bit and 64-bit. It still sometimes loses the connection and then reconnects.
hrev55697
hrev55706