[DOCKER] Differences between glibc, musl libc and go resolvers

The other day I wanted to adjust the timeout in resolv.conf, but in the age of Docker and Kubernetes, even if you only use Linux, the resolver is not necessarily glibc's. So I investigated the differences between the glibc, musl libc (Alpine) and Go resolvers.

Environment

I ran each resolver in a Docker container and observed the queries with tcpdump.

macOS Catalina (10.15.7)
docker desktop 3.0.3 (51017)

Container images

alpine:3.12.3 / musl-1.1.24-r10
debian:buster-20201209 (10.7) / glibc 2.28-10

For Go I used 1.15.6 on the Mac, cross-compiled the binary and ran it on the Alpine image above. Because it is cross-compiled, CGO_ENABLED=0 should be the default, but I set it explicitly when building.

Support for resolv.conf options and other settings

resolv.conf example


nameserver 8.8.8.8
nameserver 8.8.4.4
search default.svc.cluster.local svc.cluster.local cluster.local asia-northeast1-b.c.project-id.internal c.project-id.internal google.internal
options ndots:5 timeout:5 attempts:2

resolv.conf has various options, but only three are supported by all three of the resolvers examined here: ndots, timeout and attempts.

All three support search as well as nameserver. Older versions of musl libc did not support search (domain completion), but it has been supported since 1.1.13. nameserver can be specified multiple times, but only the first three are used; a fourth or later entry is ignored.
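As a rough illustration of the settings described above, here is a minimal sketch of a resolv.conf parser that keeps only the parts all three resolvers share. The type resolvConf and the function parseResolvConf are hypothetical names made up for this example, and none of the real resolvers is implemented this way; the defaults (ndots 1, timeout 5, attempts 2) are the usual resolv.conf defaults.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// maxNameservers reflects the limit described in the text: only three are used.
const maxNameservers = 3

// resolvConf is a hypothetical struct holding the settings common to all three resolvers.
type resolvConf struct {
	Nameservers []string
	Search      []string
	Ndots       int
	Timeout     int // seconds
	Attempts    int
}

// parseResolvConf reads nameserver, search and the ndots/timeout/attempts options.
// Anything else is silently ignored in this sketch.
func parseResolvConf(text string) resolvConf {
	conf := resolvConf{Ndots: 1, Timeout: 5, Attempts: 2}
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		switch fields[0] {
		case "nameserver":
			// Entries beyond the third are dropped.
			if len(conf.Nameservers) < maxNameservers {
				conf.Nameservers = append(conf.Nameservers, fields[1])
			}
		case "search":
			conf.Search = fields[1:]
		case "options":
			for _, opt := range fields[1:] {
				kv := strings.SplitN(opt, ":", 2)
				if len(kv) != 2 {
					continue
				}
				n, err := strconv.Atoi(kv[1])
				if err != nil {
					continue
				}
				switch kv[0] {
				case "ndots":
					conf.Ndots = n
				case "timeout":
					conf.Timeout = n
				case "attempts":
					conf.Attempts = n
				}
			}
		}
	}
	return conf
}

func main() {
	conf := parseResolvConf(`nameserver 8.8.8.8
nameserver 8.8.4.4
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5 timeout:5 attempts:2
`)
	fmt.Printf("%+v\n", conf)
}
```

The search list in main is shortened from the example above only to keep the listing narrow.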

Let's look at how they differ in behavior

ndots

The first is ndots, which I first learned about when I started using Kubernetes.

ndots sets a threshold on the number of dots (.) contained in the name being queried. Kubernetes sets it to 5 (ndots:5) by default.

For example, in ping www.google.com the name contains 2 dots.

When the number of dots is ndots or more

If the number of dots is greater than or equal to ndots, the name is treated as an FQDN and the DNS server is queried first without appending any of the domains listed in search. If NXDOMAIN (the domain does not exist) comes back, glibc and Go go on to query with the search domains appended. musl libc, however, makes no such follow-up queries and simply reports that the domain does not exist.

Long ago musl did not support domain completion at all, so people who have used it since then always specify FQDNs and will not run into this. But if you lower ndots to cut down on queries without knowing about this behavior, some names may stop resolving.

If the number of dots is less than ndots

If the number of dots is less than ndots, the resolver first appends each domain listed in search in turn and queries it, repeating until a name is found (I forgot to check whether there is an upper limit), so you might think the order matters. However, the value of ndots matters more than the order, and 5 is quite large. A Kubernetes Service name of the form service.namespace.svc.cluster.local has only 4 dots, so even when you specify it in full, every domain in search is tried first, and only then is service.namespace.svc.cluster.local itself queried and the desired result finally returned. Even with negative caching on the DNS server side, this is wasteful because the client talks to the DNS server that many times.

GKE lists six search domains by default; EKS lists four. The default seems to favor being able to resolve every name, at the cost of extra useless queries, over a smaller ndots that would leave some names unresolvable. ndots looks like it has room for tuning.

By the way, a trailing . marks the name as an FQDN, so with ping www.google.com. and the like, none of the resolvers performs domain completion.
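The ndots rules above can be sketched as a small function that lists the names a resolver would try, in order. This is only a model of the behavior described in this article, not code taken from any of the resolvers; the function name candidates and the retrySearch flag (true for the glibc and Go behavior, false for the musl behavior when the dot count reaches ndots) are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// candidates returns the fully qualified names that would be queried, in order.
// retrySearch models what happens after NXDOMAIN when the name has ndots or more
// dots: true means the search domains are tried afterwards (glibc and Go as
// described above), false means the lookup stops there (musl as described above).
func candidates(name string, search []string, ndots int, retrySearch bool) []string {
	// A trailing dot marks an FQDN: no completion at all.
	if strings.HasSuffix(name, ".") {
		return []string{name}
	}
	absolute := name + "."
	var completed []string
	for _, domain := range search {
		completed = append(completed, name+"."+domain+".")
	}
	if strings.Count(name, ".") >= ndots {
		if !retrySearch {
			return []string{absolute}
		}
		return append([]string{absolute}, completed...)
	}
	// Fewer dots than ndots: search domains first, the bare name last.
	return append(completed, absolute)
}

func main() {
	search := []string{
		"default.svc.cluster.local",
		"svc.cluster.local",
		"cluster.local",
		"asia-northeast1-b.c.project-id.internal",
		"c.project-id.internal",
		"google.internal",
	}
	for _, name := range candidates("storage.googleapis.com", search, 5, true) {
		fmt.Println(name)
	}
}
```

With the six search domains from the resolv.conf example and ndots:5, a name with two dots produces seven candidates, and each of them is typically queried for both A and AAAA records.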

nameserver

The handling of multiple servers differs greatly between glibc and musl, so I will explain it first. glibc and Go query the first server and, if there is no response within timeout seconds, query the next one. musl libc sends the request to all nameservers at the same time and uses the first response that comes back. With three nameservers specified, three times as many queries are sent.

timeout and attempts

timeout and attempts are related, so I cover them together. Their behavior also depends on the number of nameservers, so I group the cases by that number.

If there is one nameserver

glibc and Go send a query and wait timeout seconds, then query again if there is no response, for a total of attempts times.

musl libc waits, for each query, the number of seconds obtained by dividing timeout by attempts. In other words, it waits at most timeout seconds in total, retries included. With the default timeout:5 attempts:2, it waits 2.5 seconds per query.

When there are two nameservers

glibc and Go query the first nameserver, wait timeout seconds, and then query the second nameserver. That counts as one attempt. With the default timeout:5 attempts:2, each nameserver is queried twice, for a total of four queries, with a 5-second wait each time.

As mentioned in the nameserver section, musl libc queries all nameservers at the same time, so the timing is the same as with one nameserver.

When there are three nameservers

glibc's timeout behavior changes slightly when there are three nameservers. It waits the number of seconds specified by timeout for the first nameserver, less than that for the second, and more than that for the third. With timeout:5 the waits are 5 seconds, 3 seconds and 6 seconds. This set is repeated attempts times.

Go always waits for timeout seconds.

musl libc is the same as before.

Domain completion on timeout

If every query times out, the lookup fails anyway, so this matters little, but I am writing it down because the behavior differed.

When timeouts continue, glibc does not try the second and subsequent search domains, but it does query the name without completion. Go dutifully tries every search domain and also queries without completion. musl libc queries the first completion and stops there; it makes no query without completion.

The code used to check the operation of Go

package main

import (
	"context"
	"log"
	"net"
	"os"
)

func main() {
	ctx := context.Background()
	// The zero-value Resolver uses the pure Go resolver when built with CGO_ENABLED=0.
	resolver := &net.Resolver{}
	// Resolve each command-line argument in turn and log the addresses found.
	for _, v := range os.Args[1:] {
		log.Printf("Resolving %s", v)
		addrs, err := resolver.LookupHost(ctx, v)
		if err != nil {
			log.Fatal(err)
		}
		for _, addr := range addrs {
			log.Printf("%s", addr)
		}
	}
}
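If you cannot guarantee a CGO_ENABLED=0 build, one option is to ask for the Go resolver explicitly through the PreferGo field of net.Resolver. The variant below is a sketch of that idea, not the program used for the measurements in this article; the helper name newGoResolver and the 30-second overall deadline are assumptions chosen for the example.

```go
package main

import (
	"context"
	"log"
	"net"
	"os"
	"time"
)

// newGoResolver returns a resolver that prefers Go's built-in DNS client
// on platforms where it is available, even in a cgo-enabled build.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	// Overall deadline for all lookups (an arbitrary value for this sketch).
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	resolver := newGoResolver()
	for _, host := range os.Args[1:] {
		addrs, err := resolver.LookupHost(ctx, host)
		if err != nil {
			// Keep going so that one failing name does not hide the others.
			log.Printf("%s: %v", host, err)
			continue
		}
		log.Printf("%s: %v", host, addrs)
	}
}
```

Setting the environment variable GODEBUG=netdns=go has a similar effect for the whole process, whereas PreferGo is scoped to the one resolver.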

Visualizing the atrocity of ndots:5

It is hard to grasp from prose alone, so let's look at the queries that occur when accessing storage.googleapis.com in a GKE environment. The listing below is the tcpdump output captured while running ping storage.googleapis.com in a pod in the default namespace, trimmed to reduce its width.

It is horrifying to think that this many queries occur every time you access the Cloud Storage API. If your microservices do not use Keep-Alive or the like, the same thing can happen for large volumes of internal traffic. It would be nice to be able to cache DNS results the way the JVM does.

I wonder whether it would be workable to set ndots:2 and make sure FQDNs are always specified.

14:53:34.706909 ▶︎ A? storage.googleapis.com.default.svc.cluster.local. (66)
14:53:34.707130 ▶︎ AAAA? storage.googleapis.com.default.svc.cluster.local. (66)
14:53:34.708881 ◁ NXDomain 0/1/0 (159)
14:53:34.708940 ◁ NXDomain 0/1/0 (159)
14:53:34.709051 ▶︎ A? storage.googleapis.com.svc.cluster.local. (58)
14:53:34.709133 ▶︎ AAAA? storage.googleapis.com.svc.cluster.local. (58)
14:53:34.709615 ◁ NXDomain 0/1/0 (151)
14:53:34.709689 ◁ NXDomain 0/1/0 (151)
14:53:34.709771 ▶︎ A? storage.googleapis.com.cluster.local. (54)
14:53:34.709816 ▶︎ AAAA? storage.googleapis.com.cluster.local. (54)
14:53:34.712211 ◁ NXDomain 0/1/0 (147)
14:53:34.712280 ◁ NXDomain 0/1/0 (147)
14:53:34.712387 ▶︎ A? storage.googleapis.com.asia-northeast1-b.c.my-project-id.internal. (83)
14:53:34.712479 ▶︎ AAAA? storage.googleapis.com.asia-northeast1-b.c.my-project-id.internal. (83)
14:53:34.716561 ◁ NXDomain 0/1/0 (189)
14:53:34.716623 ◁ NXDomain 0/1/0 (189)
14:53:34.716718 ▶︎ A? storage.googleapis.com.c.my-project-id.internal. (65)
14:53:34.716760 ▶︎ AAAA? storage.googleapis.com.c.my-project-id.internal. (65)
14:53:34.719891 ◁ NXDomain 0/1/0 (162)
14:53:34.720191 ◁ NXDomain 0/1/0 (162)
14:53:34.720304 ▶︎ A? storage.googleapis.com.google.internal. (56)
14:53:34.720390 ▶︎ AAAA? storage.googleapis.com.google.internal. (56)
14:53:34.724145 ◁ NXDomain 0/1/0 (145)
14:53:34.724352 ◁ NXDomain 0/1/0 (145)
14:53:34.724458 ▶︎ A? storage.googleapis.com. (40)
14:53:34.724500 ▶︎ AAAA? storage.googleapis.com. (40)
14:53:34.726930 ◁ 4/0/0 AAAA 2404:6800:4004:813::2010, AAAA 2404:6800:4004:81c::2010, AAAA 2404:6800:4004:81d::2010, AAAA 2404:6800:4004:81e::2010 (152)
14:53:34.726957 ◁ 16/0/0 A 172.217.161.80, A 172.217.175.16, A 172.217.175.48, A 172.217.175.80, A 172.217.175.112, A 216.58.197.144, A 172.217.25.208, A 172.217.25.240, A 172.217.26.48, A 172.217.31.176, A 172.217.161.48, A 172.217.174.112, A 172.217.175.240, A 216.58.220.112, A 216.58.197.208, A 216.58.197.240 (296)
