Opened 13 years ago

Closed 8 years ago

#8293 closed bug (fixed)

BNetworkAddress needs to check if there is an available IPv6 connection.

Reported by: kallisti5 Owned by: axeld
Priority: high Milestone: R1/beta1
Component: Network & Internet/Stack Version: R1/Development
Keywords: Cc: stippi, phoudoin
Blocked By: Blocking: #9269, #10033, #12173, #12174, #12186, #12188, #12233
Platform: All

Description

BNetworkAddress needs to check whether an IPv6 connection is available before choosing an AAAA DNS record over an A record.

Attachments (3)

libnetwork.diff (2.7 KB) - added by donn 13 years ago.
libnetwork patch
Network.diff (5.8 KB) - added by donn 13 years ago.
Network pref patch
getaddrinfo.cpp (586 bytes) - added by axeld 8 years ago.
Test program to demonstrate the remaining problem.


Change History (41)

comment:1 by kallisti5, 13 years ago

post hrev43681:

~> /boot/system/servers/mail_daemon
Connect
SocketConnection to server ssl.unixzen.com:993
Server resolves to [2001:470:1f10:b1::2]:993
Connect: Connect Error: General system error
account name kallisti5@unixzen.com, id 1327204053, in 0x18056d40, out 0x18026dd0

No IPv6 addresses:

~> ifconfig -a
loop    Hardware type: Local Loopback, Address: none
        inet addr: 127.0.0.1, Mask: 255.0.0.0
        MTU: 16384, Metric: 0, up loopback link
        Receive: 0 packets, 0 errors, 0 bytes, 0 mcasts, 0 dropped
        Transmit: 0 packets, 0 errors, 0 bytes, 0 mcasts, 0 dropped
        Collisions: 0

/dev/net/broadcom570x/0
        Hardware type: Ethernet, Address: 00:12:3f:c4:90:5f
        Media type: 1 GBit, 1000BASE-T
        inet addr: 10.10.10.169, Bcast: 10.10.10.255, Mask: 255.255.255.0
        MTU: 1500, Metric: 0, up broadcast link auto-configured
        Receive: 3271 packets, 0 errors, 2539957 bytes, 0 mcasts, 0 dropped
        Transmit: 1616 packets, 0 errors, 145359 bytes, 0 mcasts, 0 dropped
        Collisions: 0
~> route list inet6
~> 

However, ssl.unixzen.com has an AAAA and an A record.

comment:2 by kallisti5, 13 years ago


comment:4 by axeld, 13 years ago

The BNetworkAddressResolver class just uses libbind's getaddrinfo(), and that is the function that would need to change. In the meantime, we could reapply a patch that prevents returning IPv6 addresses altogether.

comment:5 by kallisti5, 13 years ago

Hmm, from the looks of things we *should* already be checking the availability of IPv6 in getaddrinfo...

Called: http://cgit.haiku-os.org/haiku/tree/src/kits/network/libbind/irs/getaddrinfo.c#n549

called via: http://cgit.haiku-os.org/haiku/tree/src/kits/network/libnetapi/NetworkAddressResolver.cpp#n150

AI_ADDRCONFIG set if SetTo flag B_UNCONFIGURED_ADDRESS_FAMILIES not set.

As flags == 0, it looks like libbind + the network kit is doing everything right there.

The big question, then, is whether the following works properly in libbind:

!addrconfig(pai->ai_family))

http://cgit.haiku-os.org/haiku/tree/src/kits/network/libbind/irs/getaddrinfo.c#n1085

comment:6 by axeld, 13 years ago

addrconfig() only tests if you can create a socket for the specified family, not if there is an address configured. It will succeed on Haiku if the IPv6 module is installed.

Not sure if that's supposed to fail on other platforms, but such a socket is also used to configure an address in the first place (same as at least the BSDs, IIRC).
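
To illustrate the distinction made in this comment, here is a minimal sketch that contrasts "can a socket be created for this family" with "does any interface carry an address of this family". It is not the libbind code; the function names are hypothetical, and it assumes a POSIX-like system that provides getifaddrs().

```cpp
#include <ifaddrs.h>
#include <net/if.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Roughly the kind of check described above: it succeeds whenever the
// protocol family is available, even if no interface has an address in it.
bool
CanCreateSocket(int family)
{
	int fd = socket(family, SOCK_DGRAM, 0);
	if (fd < 0)
		return false;
	close(fd);
	return true;
}

// Hypothetical stricter check: is there at least one interface that is up,
// is not the loopback, and carries an address of the given family?
bool
HasConfiguredAddress(int family)
{
	struct ifaddrs* list = nullptr;
	if (getifaddrs(&list) != 0)
		return false;

	bool found = false;
	for (struct ifaddrs* i = list; i != nullptr; i = i->ifa_next) {
		if (i->ifa_addr == nullptr || i->ifa_addr->sa_family != family)
			continue;
		if ((i->ifa_flags & IFF_UP) == 0
			|| (i->ifa_flags & IFF_LOOPBACK) != 0)
			continue;
		found = true;
		break;
	}
	freeifaddrs(list);
	return found;
}
```

On a machine like the one in comment:1, the first check would presumably succeed for AF_INET6 while the second would not.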

comment:7 by donn, 13 years ago

If this isn't the same problem, it seems closely related. WebPositive, as of hrev43668, spends about 15 seconds on futile IPv6 name resolution attempts, when no IPv6 is configured (as shown in ifconfig / Network pref.)

I fixed the same problem in an application that uses libcurl, with a curl option that specifies IPv4 only. I'm not saying, though, that WebPositive ought to be fixed the same way. I guess it should continue to try to support IPv6 as appropriate, but the resolver library should know better than to send IPv6 queries when it isn't configured, so it should return right away, not after a 5 or 10 second select timeout.

That might be through settings/network/irs.conf - I don't have one, and my attempt to create one only broke name resolution altogether.

comment:8 by stippi, 13 years ago

Cc: stippi added

comment:9 by donn, 13 years ago

Here's an idea, expressed as a patch. resolv.conf already has an "inet6" option that more or less means "IPv6 only." Let's add an "inet4" option, also, and interpret it to mean, specifically, that there shall be no IPv6 lookups, in getaddrinfo(), when the family is PF_UNSPEC (i.e., caller didn't specifically ask for PF_INET6.)

The current behavior (PF_UNSPEC causes IPv6 lookups) remains the default. Specify IPv4 only via "options inet4" in resolv.conf.
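
The proposed rule can be sketched as a small decision function. This is only an illustration of the idea, not the code in libnetwork.diff; the function name, the boolean parameter, and the order of the default lookups are assumptions.

```cpp
#include <sys/socket.h>
#include <vector>

// Hypothetical helper: which address families a resolver would query for a
// given request, with and without an "IPv4 only" resolver option.
std::vector<int>
FamiliesToQuery(int requestedFamily, bool ipv4OnlyOption)
{
	// An explicit request (for example PF_INET6) is always honored.
	if (requestedFamily != AF_UNSPEC)
		return { requestedFamily };

	// With the option set, the AAAA lookup is skipped entirely.
	if (ipv4OnlyOption)
		return { AF_INET };

	// Default behavior: an unspecified family triggers both lookups.
	return { AF_INET6, AF_INET };
}
```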

It could be overkill - I don't know if anyone is actually using IPv6 successfully, if not then it's overkill for sure right now. And of course it means you have to know to add that "inet4" option - would need to add that to network pref. It does solve the problem with WebPositive.

comment:10 by donn, 13 years ago

patch: 0 → 1

by donn, 13 years ago

Attachment: libnetwork.diff added

libnetwork patch

by donn, 13 years ago

Attachment: Network.diff added

Network pref patch

comment:11 by axeld, 13 years ago

I've only really looked at the first patch, and that looks good. I'm all for applying it, although I would closely match the 'inet6' option in semantics (I haven't looked at the sources to be able to tell).

However, I don't think this should be the solution to the original problem, though; getaddrinfo() should only return an IPv6 address if there is a configured IPv6 interface (and vice versa), no matter what resolv.conf says.

in reply to:  11 ; comment:12 by kallisti5, 13 years ago

Replying to axeld:

However, I don't think this should be the solution to the original problem, though; getaddrinfo() should only return an IPv6 address if there is a configured IPv6 interface (and vice versa), no matter what resolv.conf says.

This raises some interesting questions.

Do we return the IPv6 address if:

A) we detect a configured IPv6 interface?

B) we detect a configured IPv6 global scope ip on an interface?

C) we detect an IPv6 gateway?

  • A is definitely tripped up when you have link local addresses.
  • C may prevent usage of valid IPv6 addresses if they resolve on the local lan.

Seems the only one that makes sense is B. Axel, do we have address scope?
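
As a rough sketch of what option B would have to distinguish, the following classifies an IPv6 address by scope using the standard IN6_IS_ADDR_* macros. The enum and function names are hypothetical, and everything that is neither loopback nor link-local is lumped together as "other" for simplicity, which is coarser than a real scope check.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

enum Ipv6Scope { kInvalid, kLoopback, kLinkLocal, kOther };

// Parses a textual IPv6 address and reports a coarse scope for it.
Ipv6Scope
ClassifyIpv6(const char* text)
{
	struct in6_addr address;
	if (inet_pton(AF_INET6, text, &address) != 1)
		return kInvalid;
	if (IN6_IS_ADDR_LOOPBACK(&address))
		return kLoopback;
	if (IN6_IS_ADDR_LINKLOCAL(&address))
		return kLinkLocal;
	return kOther;
}

// Option B: only an address that is neither loopback nor link-local
// counts as evidence that IPv6 is usable beyond the local link.
bool
CountsForOptionB(const char* text)
{
	return ClassifyIpv6(text) == kOther;
}
```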

comment:13 by donn, 13 years ago

Re semantics of inet4 - to be precisely symmetric with the implementation of 'inet6', it would ask for IPv6 addresses to be converted somehow to IPv4. You would do that if you have no IPv4 support (and only if - no guarantee this kind of "tunneling" will work well!) You can more loosely interpret "inet6" as "want IPv6 addresses only", where "inet4" means "want IPv4 addresses only"; the lack of symmetry in implementation is natural.

in reply to:  12 comment:14 by axeld, 13 years ago

Replying to kallisti5:

Seems the only one that makes sense is B. Axel, do we have address scope?

I don't know what you mean by that question - IPv6 addresses do have scope, so we have, too. It's only interesting for routing, AFAICT. BTW, I would have suggested a) as the best choice, as link local addresses are fine for the local network, and nobody stops you from having a DNS server just for that one.

Maybe we do need to be a bit more intelligent or dumb with it. The intelligent solution could check if IPv6 can reach the address before returning it; the dumb solution would put the burden on the user to specifically turn on IPv6 when wanted.

However, I would suggest first looking at how it works in other operating systems like Linux, FreeBSD, or even Windows and MacOS X. It's not a problem that only Haiku should have, and there might be smart solutions out there that we didn't think of yet.

Replying to donn:

the lack of symmetry in implementation is natural.

Alright, sounds convincing enough to me :-)

comment:15 by phoudoin, 12 years ago

Cc: phoudoin added

comment:16 by mmadia, 12 years ago

To note, both patches (attachment:libnetwork.diff and attachment:Network.diff) apply cleanly on current master (hrev45330) and compile fine with gcc2 and gcc4. Is either or both OK to commit?

comment:17 by kallisti5, 12 years ago

libnetwork.diff was applied in hrev45503. I didn't apply Network.diff because it didn't write to resolv.conf when I tested it.

comment:18 by pulkomandy, 10 years ago

Blocking: 9269 added

(In #9269) Indeed, that seems to be the case. Marking this ticket as blocked by #8293 then.

comment:19 by pulkomandy, 10 years ago

Blocking: 10033 added

(In #10033) Another manifestation of #8293.

comment:20 by pulkomandy, 10 years ago

As already mentioned above, http://www.ietf.org/rfc/rfc3484.txt describes the algorithm that should be used for this.

Linux implements this and has a way to further tweak the resolution through /etc/gai.conf (http://linux.die.net/man/5/gai.conf)

The "ipv6" option in resolv.conf is for gethostbyname only. getaddrinfo is passed the "service" (port number) so it can use the right address family. So I'm not sure having the "ipv4" option there makes sense, after all.

Also https://www.isc.org/downloads/libbind/ mentions that NetBSD now develops what used to be libbind as netresolv: http://wiki.netbsd.org/individual-software-releases/netresolv/ . We should consider migrating to it so we get the latest fixes and improvements.


comment:21 by donn, 10 years ago

My impression is that RFC 3484 (and hence gai.conf) is about order of address returns, and I don't see any way there to suppress IPV6 queries. There may be a way to use irs.conf.

The intent behind the "ipv4" flag is to suppress AF_INET6 queries when getaddrinfo() is called with AF_UNSPEC. I'm stumped by how a service port would affect that. Empirically, I do get an IPv6 address from getaddrinfo("www.google.com", "http", ...), followed by one or more IPv4 addresses. (In my present situation with someone's guest wifi, IPv6 queries get answers, instead of timing out.) There is of course a way for an application to request only AF_INET addresses - the ai_family field - but my sense is that an application should not be coded to do that. It's for the host to configure IPv4 or IPv6, and the application should work either way.

comment:22 by pulkomandy, 10 years ago

There is no need to completely remove the IPv6 addresses (and if the application asked for AF_UNSPEC, I think it is even wrong to do so). However, we should make sure the IPv4 addresses are returned first when they are routable. The RFC specifies how this should be done, so all systems implementing it behave in the same way.

comment:23 by donn, 10 years ago

I don't think anyone's talking about removing IPv6 addresses.

This ticket conversation has a couple of branches. Ticket sorting is one of them, and as far as I'm concerned that's between you and Korli and whoever else cares.

The parent branch, if I understand right, is about returning IPv6 addresses when the host can't use them.

The branch that led to the "ipv4" patch is about asking for IPv6 addresses, when the DNS service doesn't respond to such requests. That was happening to me, and others, with WebPositive in particular because it apparently used AF_UNSPEC, leading to extremely slow response due to a lot of DNS timeouts.

comment:24 by pulkomandy, 10 years ago

I don't think this is how it works. When you ask for AF_UNSPEC, the DNS server will reply (no timeout here). It will include every address it knows, including IPv6 ones. The timeout happens when Web+ (or other apps) try to connect to the IPv6 addresses and there is no IPv6 interface configured. After some time, the IPv6 connection attempt times out and the connection is tried again with the next address (usually an IPv4 one). You can clearly see the same when using telnet, which prints the addresses it uses ("null" is shown because telnet does not expect IPv6 addresses and doesn't know how to print them):

~> telnet google.fr
Trying (null)...
telnet: connect to address (null): Network is unreachable
Trying 216.58.211.67...

The fix is to implement RFC3484 so in the case when there is no IPv6 network address configured on the system, the DNS replies are sorted so the IPv4 addresses are presented first to applications. So, AF_UNSPEC is still honored (you get all addresses) but if you only take the first one, it is an address you can connect to.
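
The reordering idea can be sketched as follows. RFC 3484 defines a longer list of rules; this only illustrates the single point made here, that candidates of a family without a configured address move to the end and nothing is dropped. The struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <sys/socket.h>
#include <vector>

struct Candidate {
	int family;          // AF_INET or AF_INET6
	std::string text;    // printable address, for illustration only
};

// Moves candidates whose family has no configured address to the end of
// the list, keeping the relative order otherwise. Nothing is removed.
void
PreferUsableFamilies(std::vector<Candidate>& list, bool haveIpv4,
	bool haveIpv6)
{
	std::stable_partition(list.begin(), list.end(),
		[=](const Candidate& candidate) {
			return candidate.family == AF_INET6 ? haveIpv6 : haveIpv4;
		});
}
```

An application that only takes the first entry then gets an address it can connect to, while one that walks the whole list still sees every address.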

comment:25 by donn, 10 years ago

I'm away for a few more days, so I can't present an empirical demonstration until then, but that is indeed what was happening - I could see it in the system call trace, and match it to the timeout in getaddrinfo().

Your DNS service probably doesn't ignore IPV6 queries. Mine does ("mine" meaning, Haiku's choice in resolv.conf.) I don't doubt that it's wrong to do that, there's probably no excuse for ignoring a query like that, but 1) it happens (or did, years ago when this came up) to me and others (cf. discussion on developers list at the time.) The fix is harmless - if I want applications on my host to confine themselves to IPV4, then it isn't the OS's business to guess that I might be wrong, and in that case it would be silly to waste time resolving IPv6 addresses.

And of course this fix was already added to getaddrinfo() etc. a couple of years ago. Is the problem that we're reluctant to incorporate it into a new and hopefully working Network Prefs? If you can allow Network Prefs et al. to co-exist peacefully with manual edits to resolv.conf, then maybe you don't need to worry about it - it can be added to Network Prefs later if there's call for it, as long as the occasional user who needs it can manually add it. (Note that I'm not talking only about Network Prefs per se; wifi DHCP setup also overwrites my resolv.conf, which is another example of something that needs to be more merge and less overwrite, since I may have several interfaces and the DHCP-supplied DNS may not be suitable for the other interfaces, or suitable at all, for that matter.)

comment:26 by pulkomandy, 9 years ago

Resolution: fixed
Status: new → closed

Fixed in hrev49293.

comment:27 by kallisti5, 9 years ago

Milestone: R1 → R1/beta1

comment:28 by pulkomandy, 9 years ago

Resolution: fixed
Status: closed → reopened

netresolv doesn't solve this problem, now that I enabled IPv6 support in it. We will have to implement this ourselves or see if the current code in NetBSD does it.

comment:29 by pulkomandy, 9 years ago

Blocking: 12173 added

comment:30 by pulkomandy, 9 years ago

Blocking: 12174 added

comment:31 by axeld, 9 years ago

Blocking: 12186 added

(In #12186) You cannot browse sites that return IPv6 addresses.

comment:32 by waddlesplash, 9 years ago

Blocking: 12188 added

comment:33 by pulkomandy, 9 years ago

Resolution: fixed
Status: reopened → closed

Fixed in hrev49401.

comment:34 by diver, 9 years ago

Blocking: 12233 added

comment:35 by axeld, 8 years ago

Resolution: fixed
Status: closed → reopened

It's still not completely fixed; the attached test program has the following output:

got: [::1]:80
got: 127.0.0.1:80

There is no IPv6 address for anything but localhost.

by axeld, 8 years ago

Attachment: getaddrinfo.cpp added

Test program to demonstrate the remaining problem.
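
The attached getaddrinfo.cpp is not reproduced in this ticket text. A test along these lines might look like the following sketch, which resolves a host without AI_ADDRCONFIG and prints every result in the format shown above. The function names are hypothetical and this is not the attached program.

```cpp
#include <arpa/inet.h>
#include <cstdio>
#include <netdb.h>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>

// Formats an IPv4 endpoint as "a.b.c.d:port" and an IPv6 one as "[addr]:port".
std::string
FormatEndpoint(const struct sockaddr* address)
{
	char buffer[INET6_ADDRSTRLEN];
	if (address->sa_family == AF_INET) {
		const sockaddr_in* in = reinterpret_cast<const sockaddr_in*>(address);
		if (inet_ntop(AF_INET, &in->sin_addr, buffer, sizeof(buffer))
				== nullptr)
			return "";
		return std::string(buffer) + ":" + std::to_string(ntohs(in->sin_port));
	}
	if (address->sa_family == AF_INET6) {
		const sockaddr_in6* in6
			= reinterpret_cast<const sockaddr_in6*>(address);
		if (inet_ntop(AF_INET6, &in6->sin6_addr, buffer, sizeof(buffer))
				== nullptr)
			return "";
		return "[" + std::string(buffer) + "]:"
			+ std::to_string(ntohs(in6->sin6_port));
	}
	return "";
}

// Resolves host/service with no flags set and prints every address returned.
// Returns 0 on success or the EAI_* error from getaddrinfo().
int
PrintAll(const char* host, const char* service)
{
	struct addrinfo hints = {};
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;

	struct addrinfo* list = nullptr;
	int status = getaddrinfo(host, service, &hints, &list);
	if (status != 0)
		return status;

	for (struct addrinfo* i = list; i != nullptr; i = i->ai_next)
		printf("got: %s\n", FormatEndpoint(i->ai_addr).c_str());

	freeaddrinfo(list);
	return 0;
}
```

Calling PrintAll("localhost", "80") is the kind of lookup that would produce output like the lines quoted in comment:35.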

comment:36 by pulkomandy, 8 years ago

That would be ok for localhost (there is a valid ipv6 address there and it is reachable, so why not use it?). But the test program has the same result if using "google.com" as the host string, which is more annoying.

I had this working when I closed the ticket, so it must be a regression since then, or maybe I had something else different in my network setup.

comment:37 by pulkomandy, 8 years ago

I remembered the missing bit: you need to set the AI_ADDRCONFIG flag for the filtering to be enabled. Otherwise, you get the full reply from the DNS server, without filtering. As specified in POSIX: http://pubs.opengroup.org/onlinepubs/9699919799/functions/freeaddrinfo.html

We do so in BNetworkAddressResolver: http://cgit.haiku-os.org/haiku/tree/src/kits/network/libnetapi/NetworkAddressResolver.cpp#n163

All users of getaddrinfo need to be updated to use the flag when it makes sense, however. That is almost always, unless:

  • They don't plan to actually connect to the address (nslookup type of application)
  • They don't plan to use IPv4 or v6 addresses (looking for an MX record, for example)
  • They handle IPv6/v4 switch in their own way
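
For a caller that does intend to connect, setting the flag might look like this minimal sketch. The helper names are hypothetical; AI_ADDRCONFIG, getaddrinfo() and freeaddrinfo() are the standard POSIX interfaces.

```cpp
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>

// Builds hints for a stream connection that only asks for address families
// that are configured on the local system.
struct addrinfo
MakeConnectHints()
{
	struct addrinfo hints;
	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;      // accept IPv4 and IPv6
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_ADDRCONFIG;   // enable the filtering described above
	return hints;
}

// Returns 0 or an EAI_* error. On success the caller must release *result
// with freeaddrinfo().
int
ResolveForConnect(const char* host, const char* service,
	struct addrinfo** result)
{
	struct addrinfo hints = MakeConnectHints();
	return getaddrinfo(host, service, &hints, result);
}
```

An nslookup-style tool would simply leave ai_flags at 0 to see the unfiltered reply.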

comment:38 by pulkomandy, 8 years ago

Resolution: fixed
Status: reopened → closed

Confirmed working with AI_ADDRCONFIG flag, closing again.
