Opened 10 years ago

Closed 10 years ago

Last modified 10 years ago

#12156 closed bug (fixed)

netstack: can't configure network anymore

Reported by: jessicah Owned by: axeld
Priority: high Milestone: R1/beta1
Component: Network & Internet/Stack Version: R1/Development
Keywords: Cc: vidrep, X512, 3dEyes
Blocked By: Blocking: #12161, #12164, #12169, #12179
Platform: All

Description

As of hrev49293, my PC can no longer configure the network. Enabling tracing in add-ons/kernel/network/stack/interfaces.cpp, it appears that at the very least, configuring of the local interface is somewhat iffy, and for some reason, ethernet device never gets configured at all.

syslog before changes:

KERN: net: Interface 0xd7dd2680: new "/dev/net/ipro1000/0", device interface 0xd8576c88
KERN: net: InterfaceAddress 0xd85617c0: init interface 0xd7dd2688, domain 0xd7e52b00
KERN: net: Interface 0xd7dd2680: create domain datalink for domain 0xd7e52b00
KERN: net: InterfaceAddress 0xd8534104: init interface 0x00000000, domain 0x00000000
KERN: net: Interface 0xd7dd2680: flags changed from 0 to 1
KERN: net: Interface 0xd7dd2480: new "loop", device interface 0x82b7a860
KERN: net: InterfaceAddress 0xd89cd6c0: init interface 0xd7dd2488, domain 0xd7e52b00
KERN: net: Interface 0xd7dd2480: create domain datalink for domain 0xd7e52b00
KERN: net: InterfaceAddress 0xd89b1834: init interface 0x00000000, domain 0x00000000
KERN: net: update_interface_address(address 0xd89cd6c0, option 8902, oldAddress <none>, newAddress 127.0.0.1)
KERN: net: Interface 0xd7dd2480: flags changed from 0 to 9
KERN: net: Interface 0xd7dd2480: create domain datalink for domain 0xd8553640
KERN: net: InterfaceAddress 0xd89b1b94: init interface 0x00000000, domain 0x00000000
KERN: net: InterfaceAddress 0xd89cd570: init interface 0xd7dd2488, domain 0xd8553640
KERN: net: update_interface_address(address 0xd89cd570, option 8902, oldAddress <none>, newAddress ::1)
KERN: net: Interface 0xd7dd2680: flags changed from 1 to 4001
-- DHCP Discover
KERN: net:   /dev/net/ipro1000/0 matches
-- DHCP Offer received
-- DHCP Request
KERN: net:   /dev/net/ipro1000/0 matches
-- DHCP Acknowledged
KERN: net: update_interface_address(address 0xd85617c0, option 8902, oldAddress <none>, newAddress 10.20.30.10)
KERN: net: update_interface_address(address 0xd85617c0, option 8915, oldAddress 255.0.0.0, newAddress 255.255.255.0)
KERN: net: Interface 0xd7dd2680: flags changed from 4001 to 2001

syslog after changes:

KERN: net: Interface 0xd7dd1600: new "/dev/net/ipro1000/0", device interface 0x82bc5b78
KERN: net: InterfaceAddress 0xd8558670: init interface 0xd7dd1608, domain 0xd7e5b080
KERN: net: Interface 0xd7dd1600: create domain datalink for domain 0xd7e5b080
KERN: net: InterfaceAddress 0xd8550524: init interface 0x00000000, domain 0x00000000
KERN: net: Interface 0xd7dd1600: flags changed from 0 to 1
KERN: net: Interface 0xd7dd1000: new "loop", device interface 0x82b73998
KERN: net: InterfaceAddress 0xd873c750: init interface 0xd7dd1008, domain 0xd7e5b080
KERN: net: Interface 0xd7dd1000: create domain datalink for domain 0xd7e5b080
KERN: net: InterfaceAddress 0xd8549b2c: init interface 0x00000000, domain 0x00000000
KERN: net: update_interface_address(address 0xd873c750, option 8902, oldAddress <none>, newAddress 127.0.0.1)
KERN: net: Interface 0xd7dd1000: flags changed from 0 to 9
KERN: net: update_interface_address(address 0xd873c750, option 8902, oldAddress 127.0.0.1, newAddress <none>)
KERN: net: address is NULL! option is set addr
KERN: net: Interface 0xd7dd1600: flags changed from 1 to 4001
-- DHCP Discover
KERN: net:   /dev/net/ipro1000/0 matches
-- DHCP Offer received
-- DHCP Requests sent repeatedly, as interface is never updated

In the second syslog, notice that a second datalink domain is not created for IPv6, and the second update ends up unsetting the IPv4 local address. You can also see here the errant "fix" for a NULL address introduced in https://cgit.haiku-os.org/haiku/commit/?id=dd0ea64b. This fix should be reverted, as it is clearly hiding a bug rather than fixing a genuine issue.

The ethernet device never receives any calls to update the interface at all.

Attachments (3)

working.png (44.2 KB ) - added by kallisti5 10 years ago.
working
syslog (194.2 KB ) - added by kallisti5 10 years ago.
syslog of network stack with lots of tracing as of f0b2d
infinite.png (135.8 KB ) - added by kallisti5 10 years ago.

Download all attachments as: .zip

Change History (41)

comment:1 by jessicah, 10 years ago

Blocking: 12161 added

comment:2 by jessicah, 10 years ago

Cc: vidrep added

comment:4 by jessicah, 10 years ago

hrev49303 fixes the loop device; I know have both inet & inet6 addresses configured; but ethernet is still broken :(

comment:5 by pulkomandy, 10 years ago

Blocking: 12164 added

comment:6 by kallisti5, 10 years ago

This issue still exists on hrev49303. I've tested with the ipv6 add-on disabled at boot as well and see the same issue.

comment:7 by pulkomandy, 10 years ago

The IPv6 problem with local address waas a separate one and was solved. The remaining problem is probably not related to IPv6 at all.

It may be the same issue as #11607 and/or #10437, which both predates the netresolv changes. Maybe the changes only made the problem easier to reproduce?

comment:8 by vidrep, 10 years ago

Until the netresolv changes, I have never had a DHCP issue on either of my PC's. Are there any config logs I can upload to compare working/non-working configuration?

comment:9 by pulkomandy, 10 years ago

The DHCP logs are sent to syslog, you will probably get something similar to what's in the ticket already: DHCP discover, then an offer is received from the DHCP server, and then DHCP requests are sent, but there is never a DHCP acknowledge.

Maybe a wireshark or tcpdump capture of the DHCP messages (working vs nonworking) would help.

comment:10 by kallisti5, 10 years ago

I was luckily able to get a tcpdump of it working *and* not working within a few minutes with a virtualbox host-only network.

sudo tcpdump -i vboxnet0 > dump working:

10:32:55.829063 IP 192.168.1.1.mdns > 224.0.0.251.mdns: 0 [2q] PTR (QM)? _ipps._tcp.local. PTR (QM)? _ipp._tcp.local. (45)
10:32:56.818139 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:32:56.818145 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:32:56.818231 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:32:56.818235 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:32:56.819057 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:32:56.819060 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:32:56.819119 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:32:56.819123 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:32:57.697517 IP6 fe80::800:27ff:fe00:0.mdns > ff02::fb.mdns: 0 [2q] PTR (QM)? _ipps._tcp.local. PTR (QM)? _ipp._tcp.local. (45)
10:33:22.326433 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:33:22.326438 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:33:22.326501 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:33:22.326504 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:33:22.327779 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:33:22.327782 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:33:22.327801 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:33:22.327803 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548

sudo tcpdump -i vboxnet0 > dump2 not working

10:34:41.990947 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:34:41.990953 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:34:41.991038 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:34:41.991040 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:34:41.992674 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:34:41.992677 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:34:41.992698 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:34:41.992700 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:36:07.833240 IP 192.168.1.1.mdns > 224.0.0.251.mdns: 0 [2q] PTR (QM)? _ipps._tcp.local. PTR (QM)? _ipp._tcp.local. (45)
10:36:09.700992 IP6 fe80::800:27ff:fe00:0.mdns > ff02::fb.mdns: 0 [2q] PTR (QM)? _ipps._tcp.local. PTR (QM)? _ipp._tcp.local. (45)

sudo tcpdump -i vboxnet0 > dump3 working, cable disconnect, cable connect, not-working

10:40:23.833611 IP 192.168.1.1.mdns > 224.0.0.251.mdns: 0 [2q] PTR (QM)? _ipps._tcp.local. PTR (QM)? _ipp._tcp.local. (45)
10:40:24.250140 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:40:24.250144 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 265
10:40:24.250206 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:40:24.250209 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:40:24.251008 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:40:24.251010 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 08:00:27:08:e3:b3 (oui Unknown), length 277
10:40:24.251052 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:40:24.251055 IP 192.168.1.100.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 548
10:40:25.701901 IP6 fe80::800:27ff:fe00:0.mdns > ff02::fb.mdns: 0 [2q] PTR (QM)? _ipps._tcp.local. PTR (QM)? _ipp._tcp.local. (45)

Not a whole-lot there... but definitely a lot less traffic when it didn't work. Going to try a wireshark next.

comment:11 by kallisti5, 10 years ago

Just did a wireshark. So I know whats going on now...

  • Working: everything normal, see DHCP traffic (attached)
  • Not-working: no packets sent by network card. The interface is silent while haiku is configuring.

by kallisti5, 10 years ago

Attachment: working.png added

working

comment:12 by kallisti5, 10 years ago

Reproducibility Connecting and disconnecting the network cable in virtualbox (connect, wait for autoconfig to complete, disconnect, wait for no link)

  • Works, DHCP
  • Doesn't work, no traffic, 169.* address
  • Works, DHCP
  • Doesn't work, no traffic, 169.* address
  • Works, DHCP
  • Doesn't work, no traffic, 169.* address

Reproduction rate seems even 50/50 split. That likely means something is not set properly that gets set at the end of a call.

comment:13 by klauke, 10 years ago

This helped me

ifconfig /dev/net/iprowifi4965/0 down ifconfig loop down ifconfig loop 127.0.0.1 255.0.0.0 up ifconfig /dev/net/iprowifi4965/0 172.16.1.181 255.255.255.0 up route add /dev/net/iprowifi4965/0 default gw 172.16.1.1

comment:15 by kallisti5, 10 years ago

some old libbind.so references...

$ grep -R "libbind.so" *
kits/network/init.cpp:			if (!strcmp(name, "libbind.so")
tests/add-ons/kernel/network/ppp/ipcp/Jamfile:	libbind.so 

comment:16 by pulkomandy, 10 years ago

There wasn't a libbind.so before my changes, and there still isn't one. All the libbind sources are merged in libnetwork.so. The libbind check in init.cpp is used to detect BONE apps and decide wether to trigger R5 compatible hacks.

The PPP add-on is not even compiled.

Last edited 10 years ago by pulkomandy (previous) (diff)

comment:17 by jessicah, 10 years ago

Cc: X512 added

comment:18 by jessicah, 10 years ago

FWIW, I seem to now be able to set up a static IP from the command line as follows:

~> ipconfig /dev/net/ipro1000/0 inet 10.20.30.10
~> route add /dev/net/ipro1000/0 default gw 10.20.30.250

However, two things:

  1. DHCP still continues trying to acquire the address... not entirely sure why this is
  2. The ping command doesn't work
    ~> ping google.com
    error: unknown protocol icmp: error -2147459069 [No such file or directory]
    

An strace shows it trying to open /boot/system/settings/network/protocols, and I've checked this folder, and these files are indeed missing. This is on hrev49309.

comment:19 by pulkomandy, 10 years ago

> grep icmp /Haiku-TEST/system/settings/network/protocols 
icmp           1 ICMP         # Internet Control Message              [RFC792]
ipv6-icmp     58 IPv6-ICMP    # ICMP for IPv6                         [RFC2460]

This file is added to the image in build/jam/images/definitions/minimum.

comment:20 by axeld, 10 years ago

Ah, I guess this may explain some of the issues; a system update won't put the files there, they will only be there on a new image. Apparently a lot of things of the API will fail if those aren't available. Also, they don't really belong to "settings", rather into "data" (so you could overwrite them by putting files to non-packaged, if you have to, and they would then be part of the .hpkg).

Adrien, could you please reestablish the former mechanism of having the (stripped down) contents of those files built into libnetwork.so as defaults? I didn't implement that just for the fun of it to see if go to waste.

comment:21 by jessicah, 10 years ago

Blocking: 12169 added

comment:22 by jessicah, 10 years ago

Cc: 3dEyes added

comment:23 by pulkomandy, 10 years ago

I didn't notice it was built-in before, that sounds like the right thing to do.

I'm not sure the file should go in data however, in the case there is a built-in one I think the extra file in settings could be used to customize things (add extra protocols, etc). In that sense it would not be "data" anymore.

comment:24 by axeld, 10 years ago

In the launch_daemon, the files that define the services/jobs are all in data. But you can customize them in settings, if you want to. I find that makes the most sense, as you can easily update/complete the data files, and still be able to customize the experience.

I think it's a similar situation for these files.

comment:25 by vidrep, 10 years ago

I did a fresh install of hrev49292, then pkgman full sync to hrev49313. Still no DHCP.

comment:26 by vidrep, 10 years ago

I downloaded and did a fresh install of hrev49313 x86_gcc2. DHCP is still not being configured. All fields are blank.

comment:27 by X512, 10 years ago

hrev49313 x86_gcc2, original anyboot image, boot from USB. DHCP still don't work.

comment:28 by markh, 10 years ago

hrev49318 x86_gcc2, no DHCP either. Setting up a static link from the terminal does work.

by kallisti5, 10 years ago

Attachment: syslog added

syslog of network stack with lots of tracing as of f0b2d

comment:29 by X512, 10 years ago

Sometimes similar problem is present in earlier Haiku revisions such as hrev49248.

comment:30 by kallisti5, 10 years ago

I'm pretty sure getifaddrs is forming an invalid ifaddrs.. (attached)

I'm now getting a seg. violation here: http://cgit.haiku-os.org/haiku/tree/src/kits/network/netresolv/irs/getaddrinfo.c#n1096

    for (ifa = ifaddrs; ifa != NULL; ifa = ifa->ifa_next)

(see contents attached)

Also, I think this is a bug: http://cgit.haiku-os.org/haiku/tree/src/kits/network/netresolv/irs/getaddrinfo.c#n1092

We check for == -1 here, but error is anything < 0

by kallisti5, 10 years ago

Attachment: infinite.png added

comment:31 by kallisti5, 10 years ago

(btw, changing the check mentioned at the bottom of comment:#30 to < 0 didn't fix it :-) )

comment:32 by waddlesplash, 10 years ago

Blocking: 12179 added

comment:33 by axeld, 10 years ago

If it returns anything but -1 on error, it would be broken. Such functions set errno for the real error.

comment:34 by axeld, 10 years ago

Nice catch, kallisti5! I just looked into that function, and it is completely broken. I'm in the process of fixing it.

comment:35 by kallisti5, 10 years ago

Nice fix! Testing hrev49332 now (i'll do DHCP and static network configs). I spent an hour or two looking at that code and it just didn't seem right to me in a lot of ways... however I was lacking too much knowledge on the network stack to fix (and I didn't have the free time to dig into it)

comment:36 by kallisti5, 10 years ago

Confirmed resolved! Tested DHCP, static, and a few disables / enables. IPv6 untested still.

There is an issue with a missing symbol in sshd "innetgr" i'll open another ticket for that.

comment:37 by kallisti5, 10 years ago

Resolution: fixed
Status: newclosed

comment:38 by diver, 10 years ago

Still something is broken regarding DHCP. When I start Haiku DHCP doesn't work:

DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Timeout shift: 8 secs (try 1)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Timeout shift: 16 secs (try 2)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Timeout shift: 32 secs (try 3)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Timeout shift: 64 secs (try 4)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Timeout shift: 64 secs (try 5)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Timeout shift: 4629 secs (try 6)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Timeout shift: 2314 secs (try 7)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST to 192.168.0.1:67

After doing this hey net_server quit ; /system/servers/net_server DHCP suddenly works:

DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_DISCOVER to 255.255.255.255:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Received DHCP_OFFER from 192.168.0.1
DAEMON 'DHCP':   your_address: 192.168.0.6
DAEMON 'DHCP':   server: 192.168.0.1
DAEMON 'DHCP':   lease time: 25200 seconds
DAEMON 'DHCP':   nameserver[0]: 192.168.0.1
DAEMON 'DHCP':   gateway: 192.168.0.1
DAEMON 'DHCP':   subnet: 255.255.255.0
DAEMON 'DHCP':   UNKNOWN OPTION 252 (0xfc)
DAEMON 'DHCP': /dev/net/ipro1000/0: Send DHCP_REQUEST for 192.168.0.6 to 255.255.255.255:67
DAEMON 'DHCP': /dev/net/ipro1000/0: Received DHCP_ACK from 192.168.0.1
DAEMON 'DHCP':   server: 192.168.0.1
DAEMON 'DHCP':   lease time: 25200 seconds
DAEMON 'DHCP':   nameserver[0]: 192.168.0.1
DAEMON 'DHCP':   gateway: 192.168.0.1
DAEMON 'DHCP':   subnet: 255.255.255.0
DAEMON 'DHCP':   UNKNOWN OPTION 252 (0xfc)
DAEMON 'DHCP': /dev/net/ipro1000/0: DHCP status = No error
Note: See TracTickets for help on using tickets.