The other day I wanted to adjust the timeout in resolv.conf, but in the age of Docker and Kubernetes, even if you only use Linux, the resolver is not necessarily glibc's. So I investigated the differences between glibc, musl libc (Alpine), and the Go resolver.
I ran a Docker container for each and observed the queries with tcpdump.
macOS Catalina (10.15.7)
Docker Desktop 3.0.3 (51017)
Container images:
alpine:3.12.3 / musl 1.1.24-r10
debian:buster-20201209 (10.7) / glibc 2.28-10
For Go I used 1.15.6 on the Mac, cross-compiled, and ran the binary on the Alpine image above. Cross-compiling should imply CGO_ENABLED=0 by default, but I set it explicitly when building.
resolv.conf example
nameserver 8.8.8.8
nameserver 8.8.4.4
search default.svc.cluster.local svc.cluster.local cluster.local asia-northeast1-b.c.project-id.internal c.project-id.internal google.internal
options ndots:5 timeout:5 attempts:2
resolv.conf has many options, but the ones supported by all three resolvers are just these three: ndots, timeout, and attempts.
All three also support search as well as nameserver. Older musl libc didn't support search (domain completion), but it has since 1.1.13.
nameserver can be specified multiple times, but only the first three are used; a fourth or later entry is ignored.
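To make the directives concrete, here is a minimal sketch of parsing the settings discussed in this article. The function name parseResolvConf and the struct are my own for illustration, not the libc implementation; the fallback values ndots:1, timeout:5, attempts:2 are glibc's documented defaults.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// resolvConf holds the subset of resolv.conf settings covered in this article.
type resolvConf struct {
	Nameservers []string
	Search      []string
	Ndots       int
	Timeout     int
	Attempts    int
}

// parseResolvConf is an illustrative parser for the directives used here.
func parseResolvConf(text string) resolvConf {
	// glibc's defaults when the options are absent.
	conf := resolvConf{Ndots: 1, Timeout: 5, Attempts: 2}
	for _, line := range strings.Split(text, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		switch fields[0] {
		case "nameserver":
			conf.Nameservers = append(conf.Nameservers, fields[1])
		case "search":
			conf.Search = fields[1:]
		case "options":
			for _, opt := range fields[1:] {
				kv := strings.SplitN(opt, ":", 2)
				if len(kv) != 2 {
					continue
				}
				n, err := strconv.Atoi(kv[1])
				if err != nil {
					continue
				}
				switch kv[0] {
				case "ndots":
					conf.Ndots = n
				case "timeout":
					conf.Timeout = n
				case "attempts":
					conf.Attempts = n
				}
			}
		}
	}
	return conf
}

func main() {
	conf := parseResolvConf(`nameserver 8.8.8.8
nameserver 8.8.4.4
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5 timeout:5 attempts:2`)
	fmt.Println(conf.Ndots, conf.Timeout, conf.Attempts, len(conf.Nameservers))
	// → 5 5 2 2
}
```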
Let's look at how each resolver behaves.
ndots
The first is ndots, which I first encountered when I started using Kubernetes.
It specifies how many dots (.) a queried name must contain to be treated as fully qualified; Kubernetes sets it to 5 (ndots:5) by default.
For example, ping www.google.com has 2 dots. If a name contains at least ndots dots, it is treated as an FQDN and queried as-is, without appending the domains listed in search. If the DNS server then returns NXDOMAIN (no such domain), glibc and Go fall back to trying the search domains. **musl libc, however, just reports that the domain doesn't exist, without ever trying a search completion.**
Long ago musl didn't support domain completion at all, so anyone who has used it since then will always specify FQDNs and never notice; but if you lower ndots without knowing this, some names may fail to resolve.
If the number of dots is less than ndots, the resolver first appends each domain listed in search in order, querying until one succeeds (I forgot to check whether there is a limit), so the order of the search list matters. But the value of ndots matters even more, and 5 is quite large. A Kubernetes service name like service.namespace.svc.cluster.local contains only 4 dots, so even a fully specified service name falls below ndots:5: the resolver tries every completion in the search list first, and only after they all fail does it query service.namespace.svc.cluster.local itself and finally get the answer you wanted. Even if the DNS server has a negative cache, that is a lot of wasted round trips to the DNS server. The GKE defaults list as many as six search domains; EKS lists four. Rather than lowering ndots and creating names that can no longer be resolved, the defaults accept extra useless queries so that everything resolves. ndots seems to have room for tuning.
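The waste above is easy to quantify: with a search list of N domains, a name below the ndots threshold generates N failed completions plus the final as-is query, and each candidate name is looked up for both A and AAAA records. A small sketch of that arithmetic (totalQueries is my own illustrative function):

```go
package main

import "fmt"

// totalQueries models the DNS query count for a name with fewer dots than
// ndots: every search domain is tried and fails before the name itself is
// queried, and each candidate is looked up for both A and AAAA records.
func totalQueries(searchDomains int) (total, wasted int) {
	candidates := searchDomains + 1 // failed completions + the name as-is
	total = candidates * 2          // an A and an AAAA query per candidate
	wasted = searchDomains * 2      // everything except the final A/AAAA pair
	return total, wasted
}

func main() {
	// GKE default: six search domains → 14 queries, 12 of them wasted.
	total, wasted := totalQueries(6)
	fmt.Println(total, wasted) // → 14 12
}
```

This matches the tcpdump trace later in the article: seven A/AAAA pairs, of which only the last pair gets an answer.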
By the way, a trailing dot (.) marks a name as an FQDN, so with ping www.google.com. no resolver will attempt domain completion.
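The completion rules so far can be sketched as a small model. This is only an illustration of the glibc/Go behavior described above (queryOrder is a name I made up, not a real API; musl, as noted, stops after the first query when the dot count reaches ndots):

```go
package main

import (
	"fmt"
	"strings"
)

// queryOrder returns the list of names a glibc/Go-style resolver would try,
// given a name, the search domains, and ndots.
func queryOrder(name string, search []string, ndots int) []string {
	if strings.HasSuffix(name, ".") {
		// Trailing dot: always treated as an FQDN, no completion.
		return []string{name}
	}
	var completions []string
	for _, d := range search {
		completions = append(completions, name+"."+d)
	}
	if strings.Count(name, ".") >= ndots {
		// Enough dots: treated as an FQDN and tried as-is first; glibc and
		// Go fall back to the search completions on NXDOMAIN (musl does not).
		return append([]string{name}, completions...)
	}
	// Fewer dots than ndots: try every search completion first, then as-is.
	return append(completions, name)
}

func main() {
	search := []string{"default.svc.cluster.local", "svc.cluster.local"}
	fmt.Println(queryOrder("www.google.com", search, 5))
	fmt.Println(queryOrder("www.google.com.", search, 5))
}
```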
nameserver
The handling of multiple servers differs greatly between glibc and musl, so I'll explain it first. glibc and Go query the first server, wait timeout seconds, and only query the next server if there is no response. musl libc sends the request to all nameservers at once and uses the first response that comes back. With three name servers, that means three times as many queries.
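musl's strategy is the classic "first response wins" pattern. Here is a conceptual sketch of it; the servers are simulated by plain functions rather than real UDP sockets, and firstResponse is my own illustrative name, not musl's actual code:

```go
package main

import "fmt"

// firstResponse models musl's strategy: send the query to every nameserver
// at once and take whichever answers first. A real resolver would send UDP
// packets; here each "server" is just a function.
func firstResponse(servers []func(query string) string, query string) string {
	ch := make(chan string, len(servers)) // buffered so losers don't block
	for _, s := range servers {
		go func(srv func(string) string) { ch <- srv(query) }(s)
	}
	return <-ch // first reply wins; the others are discarded
}

func main() {
	servers := []func(string) string{
		func(q string) string { return "answer from 8.8.8.8 for " + q },
		func(q string) string { return "answer from 8.8.4.4 for " + q },
	}
	fmt.Println(firstResponse(servers, "www.google.com"))
}
```

The trade-off is exactly what the article observes: lower worst-case latency, at the cost of every nameserver receiving every query.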
timeout and attempts
Since timeout and attempts are related, I'll treat them together. The behavior also depends on the number of nameservers, so I'll group it by that.
One nameserver
glibc and Go send a query, wait timeout seconds, and if there is no response query again, up to attempts times in total.
musl libc divides timeout by attempts and waits that long for each query. In other words, including retries it waits at most timeout seconds in total. With the default timeout:5 attempts:2, it waits 2.5 seconds per query.
Two nameservers
glibc and Go query the first nameserver, wait timeout seconds, then query the second. That counts as one attempt. With the default timeout:5 attempts:2, each nameserver is queried twice, four queries in total, waiting 5 seconds for each.
As mentioned in the nameserver section, musl libc queries all nameservers at the same time, so the behavior is the same as with a single nameserver.
Three nameservers
glibc's timeout behaves slightly differently with three nameservers: it waits the full timeout for the first nameserver, less than timeout for the second, and longer than timeout for the third. With timeout:5, the waits are 5 seconds, 3 seconds, and 6 seconds. This set is repeated attempts times.
Go always waits timeout seconds for each.
musl libc behaves the same as before.
If every query times out, nothing is going to work anyway, so this isn't very important, but the behavior differs, so I'll note it. When timeouts continue, glibc skips the second and subsequent search domains and instead queries the name without completion. Go dutifully tries every search completion and also queries without completion. musl libc queries only the first completion and stops there; it never queries without completion.
The Go program I used for testing resolves each name given on the command line:
package main

import (
	"context"
	"log"
	"net"
	"os"
)

func main() {
	ctx := context.Background()
	resolver := &net.Resolver{}
	for _, v := range os.Args[1:] {
		log.Printf("Resolving %s\n", v)
		names, err := resolver.LookupHost(ctx, v)
		if err != nil {
			log.Fatal(err)
		}
		for _, name := range names {
			log.Printf("%s\n", name)
		}
	}
}
Since this is hard to follow in prose, let's look at the actual queries generated when accessing storage.googleapis.com in a GKE environment. Below is the tcpdump output (trimmed to reduce its width) from running ping storage.googleapis.com in a pod in the default namespace.
It's horrifying to think that this many queries occur every time you access the Cloud Storage API. With microservices that don't use Keep-Alive or the like, the same thing can happen across a large amount of internal traffic. It would be nice to be able to cache DNS results the way the JVM does.
I wonder if we should set something like ndots:2 and make sure FQDNs are always specified.
14:53:34.706909 ▶︎ A? storage.googleapis.com.default.svc.cluster.local. (66)
14:53:34.707130 ▶︎ AAAA? storage.googleapis.com.default.svc.cluster.local. (66)
14:53:34.708881 ◁ NXDomain 0/1/0 (159)
14:53:34.708940 ◁ NXDomain 0/1/0 (159)
14:53:34.709051 ▶︎ A? storage.googleapis.com.svc.cluster.local. (58)
14:53:34.709133 ▶︎ AAAA? storage.googleapis.com.svc.cluster.local. (58)
14:53:34.709615 ◁ NXDomain 0/1/0 (151)
14:53:34.709689 ◁ NXDomain 0/1/0 (151)
14:53:34.709771 ▶︎ A? storage.googleapis.com.cluster.local. (54)
14:53:34.709816 ▶︎ AAAA? storage.googleapis.com.cluster.local. (54)
14:53:34.712211 ◁ NXDomain 0/1/0 (147)
14:53:34.712280 ◁ NXDomain 0/1/0 (147)
14:53:34.712387 ▶︎ A? storage.googleapis.com.asia-northeast1-b.c.my-project-id.internal. (83)
14:53:34.712479 ▶︎ AAAA? storage.googleapis.com.asia-northeast1-b.c.my-project-id.internal. (83)
14:53:34.716561 ◁ NXDomain 0/1/0 (189)
14:53:34.716623 ◁ NXDomain 0/1/0 (189)
14:53:34.716718 ▶︎ A? storage.googleapis.com.c.my-project-id.internal. (65)
14:53:34.716760 ▶︎ AAAA? storage.googleapis.com.c.my-project-id.internal. (65)
14:53:34.719891 ◁ NXDomain 0/1/0 (162)
14:53:34.720191 ◁ NXDomain 0/1/0 (162)
14:53:34.720304 ▶︎ A? storage.googleapis.com.google.internal. (56)
14:53:34.720390 ▶︎ AAAA? storage.googleapis.com.google.internal. (56)
14:53:34.724145 ◁ NXDomain 0/1/0 (145)
14:53:34.724352 ◁ NXDomain 0/1/0 (145)
14:53:34.724458 ▶︎ A? storage.googleapis.com. (40)
14:53:34.724500 ▶︎ AAAA? storage.googleapis.com. (40)
14:53:34.726930 ◁ 4/0/0 AAAA 2404:6800:4004:813::2010, AAAA 2404:6800:4004:81c::2010, AAAA 2404:6800:4004:81d::2010, AAAA 2404:6800:4004:81e::2010 (152)
14:53:34.726957 ◁ 16/0/0 A 172.217.161.80, A 172.217.175.16, A 172.217.175.48, A 172.217.175.80, A 172.217.175.112, A 216.58.197.144, A 172.217.25.208, A 172.217.25.240, A 172.217.26.48, A 172.217.31.176, A 172.217.161.48, A 172.217.174.112, A 172.217.175.240, A 216.58.220.112, A 216.58.197.208, A 216.58.197.240 (296)