Opened 3 years ago

Closed 8 months ago

#7040 closed bug (fixed)

[net_server] interface stops working after turning it off and on

Reported by: diver Owned by: axeld
Priority: normal Milestone: R1
Component: Servers/net_server Version: R1/Development
Keywords: Cc: shade, mmlr
Blocked By: Blocking: #9695
Has a Patch: yes Platform: All

Description (last modified by diver)

This is hrev39980, gcc4hybrid in VirtualBox 3.0.12 and 4.0.

After disabling/enabling network interface in Network preflet or deletion/creation it via ifconfig it doesn't obtain correct ip via DHCP anymore. Static mode doesn't work as well. Errors counter in ifconfig continues registering errors.

Default and working DHCP settings from VirtualBox:

~> ifconfig /dev/net/ipro1000/0
/dev/net/ipro1000/0
        Hardware type: Ethernet, Address: 08:00:27:c7:d8:5b
        Media type: 1 GBit, 1000BASE-T
        inet addr: 10.0.2.15, Bcast: 10.255.255.255, Mask: 255.255.255.0
        MTU: 1500, Metric: 0, up broadcast link auto-configured
        Receive: 4 packets, 0 errors, 1300 bytes, 0 mcasts, 0 dropped
        Transmit: 2 packets, 0 errors, 606 bytes, 0 mcasts, 0 dropped
        Collisions: 0


~> ifconfig del /dev/net/ipro1000/0 
~> ifconfig /dev/net/ipro1000/0 auto-config up
~> ifconfig /dev/net/ipro1000/0
/dev/net/ipro1000/0
        Hardware type: Ethernet, Address: 08:00:27:c7:d8:5b
        Media type: 1 GBit, 1000BASE-T
        inet addr: , Bcast: , Mask: 
        MTU: 1500, Metric: 0, up broadcast link
        Receive: 4 packets, 344 errors, 1300 bytes, 0 mcasts, 0 dropped
        Transmit: 2 packets, 0 errors, 606 bytes, 0 mcasts, 0 dropped
        Collisions: 0


After about 1 minute:

~> ifconfig /dev/net/ipro1000/0
        Hardware type: Ethernet, Address: 08:00:27:c7:d8:5b
        Media type: 1 GBit, 1000BASE-T
        inet addr: 169.254.0.107, Bcast: 169.254.255.255, Mask: 255.255.0.0
        MTU: 1500, Metric: 0, up broadcast link auto-configured
        Receive: 4 packets, 25670 errors, 1300 bytes, 0 mcasts, 0 dropped
        Transmit: 11 packets, 0 errors, 3279 bytes, 0 mcasts, 0 dropped
        Collisions: 0


syslog part:

KERN: [net/ipro1000/0] compat_close()
KERN: [net/ipro1000/0] compat_free()
KERN: [net/ipro1000/0] compat_open(0x2)
KERN: printf(`ifmedia_ioctl: switching %s to ',...)
KERN: printf(`Ethernet',...)
KERN: printf(` %s',...)
KERN: printf(`%s
KERN: ',...)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 4 secs (try 0)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 8 secs (try 0)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 2 secs (try 1)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 4 secs (try 1)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 8 secs (try 1)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 2 secs (try 2)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 4 secs (try 2)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP timeout shift for /dev/net/ipro1000/0: 8 secs (try 2)
DAEMON 'DHCP': DHCP send message DHCP_DISCOVER for /dev/net/ipro1000/0
DAEMON 'DHCP': DHCP for /dev/net/ipro1000/0, status: Operation timed out

Attachments (6)

default_settings.png (15.6 KB) - added by diver 3 years ago.
bogus_settings.png (16.1 KB) - added by diver 3 years ago.
private_ip.png (15.8 KB) - added by diver 3 years ago.
net_device_interface-before-delete.png (7.3 KB) - added by diver 13 months ago.
net_device_interface-after-delete.png (6.9 KB) - added by diver 13 months ago.
0001-Fix-7049.patch (1.3 KB) - added by diver 13 months ago.

Download all attachments as: .zip

Change History (18)

Changed 3 years ago by diver

Changed 3 years ago by diver

Changed 3 years ago by diver

comment:1 Changed 3 years ago by diver

Still here in hrev41752.

comment:2 Changed 13 months ago by diver

  • Cc shade added

Still present in hrev45428. Some notes, though:

after ifconfig del net_server can't bring up the interface:

ifconfig del /dev/net/ipro1000/0 
ifconfig /dev/net/ipro1000/0 up
ifconfig: Couldn't add interface: General system error 

removing /system/add-ons/kernel/network/protocols/ipv6 module workarounds this new issue.

For some reason deleting an interface leaves it in net_device_interface list:


ifconfig del /dev/net/ipro1000/0


As can be seen the net_device_interface is ref counted and obviously doesn't reach 0.

That may be the reason why neither DHCP nor static work.

ifconfig /dev/net/ipro1000/0 auto-config up
[...]
DHCP failed miserably!
DAEMON 'DHCP': DHCP for /dev/net/ipro1000/0, status: Operation timed out 
Last edited 13 months ago by diver (previous) (diff)

Changed 13 months ago by diver

Changed 13 months ago by diver

comment:3 Changed 13 months ago by diver

ifconfig up (after del) works until the interface is configured. Maybe hrev43721 is somehow related to it.

Also:

<mmlr_thinkpad> actually the preflet uses ioctls directly while ifconfig simply uses BNetworkInterfaceAddress 
<mmlr_thinkpad> and the latter simply handles the case of not having any address 
<mmlr_thinkpad> so the preflet should be reworked to use the API as well, instead of trying to do ioctls directly

comment:4 Changed 13 months ago by diver

  • Description modified (diff)
  • Summary changed from [net_server] DHCP doesn't work after interface deletion to [net_server] interface stops working after turning it off and on

comment:5 Changed 13 months ago by diver

  • Cc mmlr added

Changed 13 months ago by diver

comment:6 Changed 13 months ago by diver

  • Has a Patch set

comment:7 Changed 13 months ago by diver

mmlr_thinkpad: the patch is incomplete
it should also put the device interface in the error case a few lines above
the device_interface reference leak is real in any case
and the above patch, augmented by an additional put in the error case, should fix that
alternatively the user_memcpy could be moved above the user_request_get_device_interface() call, so there wouldn't be a need to do another put

mmlr_thinkpad: as I said, it is incomplete in the sense that it doesn't put the reference in the error case
or alternatively, the copy could be moved above getting the device_interface so that this is not needed

jshade: mmlr_thinkpad: in case of error "interface" is null and not need to put

mmlr_thinkpad: no it isn't
only in case the device_interface cannot be found
if copying the request from userland fails, then interface is leaked
hence the suggestion of moving copying of the request above getting the device_interface
http://cgit.haiku-os.org/haiku/tree/src/add-ons/kernel/network/stack/link.cpp#n470
that's what I'm talking about

jshade: but there is another similar error
in "symmetry" of interface->Acqure/ReleaseReference() use
but this happen in case of configuring both ipv4 ipv6 on interface
in this case if we delete configured intf we cannot add its back
DEVICE_CLOSED affect on data transfet after intf re-add
*transfer
after intf added the driver is reopened but it return error on "read" as it "closed"

mmlr_thinkpad:
jshade: would you say not closing the driver on bringing an interface down would make sense in general?
it's how freebsd behaves, but we break that as we close the driver and then bring it up again
I've worked around that with compat ioctls that behave similar to freebsd up/down for wireless drivers and wpa_supplicant
but if this is a problem for wired drivers as well, then we should make a version of the ethernet module that doesn't close the driver at all
and use that one for all freebsd drivers

jshade: on close we set device_closed bit but not unload driver and not unset flag after open device again. see compat_open()/compat_close()

mmlr_thinkpad: jshade: yes, from the driver side not
however if we set the interface down then the ethernet add-on in the net stack will close/free the driver
I'm wondering if it wouldn't simply be better to only close/free the driver on interface deletion, not on brining it down

comment:8 Changed 13 months ago by phoudoin

mmlr_thinkpad:
I'm wondering if it wouldn't simply be better to only close/free the driver on interface > deletion, not on brining it down

+1
It make clearly no sense to unload a network *device* driver on *interface* up/down semantic, as the fence between device and interface semantic was made to have more control on what and how data is eventually send over a network device.

When the last interface using the device is deleted, then close the driver make sense.
But not when an interface is simply turned down.

comment:9 Changed 13 months ago by axeld

While I originally implemented it this way, I have no recollection why - I assume that it didn't make any sense to keep the driver loaded in this case. If that was the only reason, and since that is not true for WLAN drivers, I have to agree that this should be changed.

comment:10 Changed 12 months ago by diver

  • Blocking 9695 added

comment:11 Changed 11 months ago by axeld

Missing put_device_interface() calls fixed in hrev45737.

comment:12 Changed 8 months ago by phoudoin

  • Resolution set to fixed
  • Status changed from new to closed

Missing DEVICE_CLOSED flag clearing in compat_open() fixed in hrev46023.
While it doesn't redesign ethernet device module regarding itf down == close device, it does fix the issue here.

Note: See TracTickets for help on using tickets.