Opened 13 years ago

Closed 11 years ago

#7040 closed bug (fixed)

[net_server] interface stops working after turning it off and on

Reported by: diver Owned by: axeld
Priority: normal Milestone: R1
Component: Servers/net_server Version: R1/Development
Keywords: Cc: shade, mmlr
Blocked By: Blocking: #9695
Platform: All

Description (last modified by diver)

This is hrev39980, gcc4hybrid in VirtualBox 3.0.12 and 4.0.

After disabling/enabling network interface in Network preflet or deletion/creation it via ifconfig it doesn't obtain correct ip via DHCP anymore. Static mode doesn't work as well. Errors counter in ifconfig continues registering errors.

Default and working DHCP settings from VirtualBox:

~> ifconfig /dev/net/ipro1000/0
/dev/net/ipro1000/0
        Hardware type: Ethernet, Address: 08:00:27:c7:d8:5b
        Media type: 1 GBit, 1000BASE-T
        inet addr: 10.0.2.15, Bcast: 10.255.255.255, Mask: 255.255.255.0
        MTU: 1500, Metric: 0, up broadcast link auto-configured
        Receive: 4 packets, 0 errors, 1300 bytes, 0 mcasts, 0 dropped
        Transmit: 2 packets, 0 errors, 606 bytes, 0 mcasts, 0 dropped
        Collisions: 0

~> ifconfig del /dev/net/ipro1000/0 
~> ifconfig /dev/net/ipro1000/0 auto-config up
~> ifconfig /dev/net/ipro1000/0
/dev/net/ipro1000/0
        Hardware type: Ethernet, Address: 08:00:27:c7:d8:5b
        Media type: 1 GBit, 1000BASE-T
        inet addr: , Bcast: , Mask: 
        MTU: 1500, Metric: 0, up broadcast link
        Receive: 4 packets, 344 errors, 1300 bytes, 0 mcasts, 0 dropped
        Transmit: 2 packets, 0 errors, 606 bytes, 0 mcasts, 0 dropped
        Collisions: 0


After about 1 minute:

~> ifconfig /dev/net/ipro1000/0
        Hardware type: Ethernet, Address: 08:00:27:c7:d8:5b
        Media type: 1 GBit, 1000BASE-T
        inet addr: 169.254.0.107, Bcast: 169.254.255.255, Mask: 255.255.0.0
        MTU: 1500, Metric: 0, up broadcast link auto-configured
        Receive: 4 packets, 25670 errors, 1300 bytes, 0 mcasts, 0 dropped
        Transmit: 11 packets, 0 errors, 3279 bytes, 0 mcasts, 0 dropped
        Collisions: 0


syslog part:

KERN: [net/ipro1000/0] compat_close()
KERN: [net/ipro1000/0] compat_free()
KERN: [net/ipro1000/0] compat_open(0x2)
KERN: printf(`ifmedia_ioctl: switching %s to ',...)
KERN: printf(`Ethernet',...)
KERN: printf(` %s',...)
KERN: printf(`%s
KERN: ',...)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 4 secs (try 0)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 8 secs (try 0)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 2 secs (try 1)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 4 secs (try 1)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 8 secs (try 1)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 2 secs (try 2)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 4 secs (try 2)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 8 secs (try 2)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP for /dev/net/ipro1000/0, status: Operation timed out

Attachments (6)

default_settings.png (15.6 KB ) - added by diver 13 years ago.
bogus_settings.png (16.1 KB ) - added by diver 13 years ago.
private_ip.png (15.8 KB ) - added by diver 13 years ago.
net_device_interface-before-delete.png (7.3 KB ) - added by diver 11 years ago.
net_device_interface-after-delete.png (6.9 KB ) - added by diver 11 years ago.
0001-Fix-7049.patch (1.3 KB ) - added by diver 11 years ago.

Download all attachments as: .zip

Change History (18)

by diver, 13 years ago

Attachment: default_settings.png added

by diver, 13 years ago

Attachment: bogus_settings.png added

by diver, 13 years ago

Attachment: private_ip.png added

comment:1 by diver, 13 years ago

Still here in hrev41752.

comment:2 by diver, 11 years ago

Cc: shade added

Still present in hrev45428. Some notes, though:

after ifconfig del net_server can't bring up the interface:

ifconfig del /dev/net/ipro1000/0 
ifconfig /dev/net/ipro1000/0 up
ifconfig: Couldn't add interface: General system error 

removing both ipv6 modules workarounds this new issue.

For some reason deleting an interface leaves it in net_device_interface list:

ifconfig del /dev/net/ipro1000/0

As can be seen the net_device_interface is ref counted and obviously doesn't reach 0.

That may be the reason why neither DHCP nor static work.

ifconfig /dev/net/ipro1000/0 auto-config up
[...]
DHCP failed miserably!
DAEMON 'DHCP': DHCP for /dev/net/ipro1000/0, status: Operation timed out 
Version 0, edited 11 years ago by diver (next)

comment:3 by diver, 11 years ago

ifconfig up (after del) works until the interface is configured. Maybe hrev43721 is somehow related to it.

Also:

<mmlr_thinkpad> actually the preflet uses ioctls directly while ifconfig simply uses BNetworkInterfaceAddress 
<mmlr_thinkpad> and the latter simply handles the case of not having any address 
<mmlr_thinkpad> so the preflet should be reworked to use the API as well, instead of trying to do ioctls directly

comment:4 by diver, 11 years ago

Description: modified (diff)
Summary: [net_server] DHCP doesn't work after interface deletion[net_server] interface stops working after turning it off and on

comment:5 by diver, 11 years ago

Cc: mmlr added

by diver, 11 years ago

Attachment: 0001-Fix-7049.patch added

comment:6 by diver, 11 years ago

patch: 01

comment:7 by diver, 11 years ago

mmlr_thinkpad: the patch is incomplete it should also put the device interface in the error case a few lines above the device_interface reference leak is real in any case and the above patch, augmented by an additional put in the error case, should fix that alternatively the user_memcpy could be moved above the user_request_get_device_interface() call, so there wouldn't be a need to do another put

mmlr_thinkpad: as I said, it is incomplete in the sense that it doesn't put the reference in the error case or alternatively, the copy could be moved above getting the device_interface so that this is not needed

jshade: mmlr_thinkpad: in case of error "interface" is null and not need to put

mmlr_thinkpad: no it isn't only in case the device_interface cannot be found if copying the request from userland fails, then interface is leaked hence the suggestion of moving copying of the request above getting the device_interface http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/network/stack/link.cpp#n470 that's what I'm talking about

jshade: but there is another similar error in "symmetry" of interface->Acqure/ReleaseReference() use but this happen in case of configuring both ipv4 ipv6 on interface in this case if we delete configured intf we cannot add its back DEVICE_CLOSED affect on data transfet after intf re-add *transfer after intf added the driver is reopened but it return error on "read" as it "closed"

mmlr_thinkpad: jshade: would you say not closing the driver on bringing an interface down would make sense in general? it's how freebsd behaves, but we break that as we close the driver and then bring it up again I've worked around that with compat ioctls that behave similar to freebsd up/down for wireless drivers and wpa_supplicant but if this is a problem for wired drivers as well, then we should make a version of the ethernet module that doesn't close the driver at all and use that one for all freebsd drivers

jshade: on close we set device_closed bit but not unload driver and not unset flag after open device again. see compat_open()/compat_close()

mmlr_thinkpad: jshade: yes, from the driver side not however if we set the interface down then the ethernet add-on in the net stack will close/free the driver I'm wondering if it wouldn't simply be better to only close/free the driver on interface deletion, not on brining it down

comment:8 by phoudoin, 11 years ago

mmlr_thinkpad: I'm wondering if it wouldn't simply be better to only close/free the driver on interface > deletion, not on brining it down

+1 It make clearly no sense to unload a network *device* driver on *interface* up/down semantic, as the fence between device and interface semantic was made to have more control on what and how data is eventually send over a network device.

When the last interface using the device is deleted, then close the driver make sense. But not when an interface is simply turned down.

comment:9 by axeld, 11 years ago

While I originally implemented it this way, I have no recollection why - I assume that it didn't make any sense to keep the driver loaded in this case. If that was the only reason, and since that is not true for WLAN drivers, I have to agree that this should be changed.

comment:10 by diver, 11 years ago

Blocking: 9695 added

comment:11 by axeld, 11 years ago

Missing put_device_interface() calls fixed in hrev45737.

comment:12 by phoudoin, 11 years ago

Resolution: fixed
Status: newclosed

Missing DEVICE_CLOSED flag clearing in compat_open() fixed in hrev46023. While it doesn't redesign ethernet device module regarding itf down == close device, it does fix the issue here.

Note: See TracTickets for help on using tickets.