I just started studying Docker on CentOS 8, and I wanted to follow the packet flow between an external host and a container, so I did a lot of digging. I was able to follow the flow of the request, but not the response, so for now I will summarize only the request path.
I am tracing in the following environment.
userland proxy (docker-proxy): I disabled docker-proxy because I wanted to see the basic operation of the Docker network with iptables/nftables alone (hairpin NAT). The following file is in place for verification.
/etc/docker/daemon.json
{
"userland-proxy": false
}
Reference: https://github.com/nigelpoulton/docker/blob/master/docs/userguide/networking/default_network/binding.md
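To check that the setting actually took effect, I find it enough to restart Docker and confirm that no docker-proxy process is spawned for the published ports; with the userland proxy disabled, only the NAT rules shown later handle them.
# systemctl restart docker
# pgrep -a docker-proxy
(no output expected)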
docker network and containers
The verification environment is the one generated by the entry below. The Docker host is 10.254.10.252, built on CentOS 8, and the explanation focuses on radius.
**Docker Compose can create network services in 5 minutes (dhcp / radius / proxy / tftp / syslog)**
The following container will be created.
server | app | address | listen |
---|---|---|---|
proxy | squid | 172.20.0.2 | 8080/tcp |
syslog | rsyslog | 172.20.0.3 | 514/udp |
radius | freeRADIUS | 172.20.0.4 | 1812/udp |
dhcp | ISC-Kea | 172.20.0.5 | 67/udp |
tftp | tftp-server | - | 69/udp |
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b11308767849 infraserv:proxy "/usr/sbin/init" 3 minutes ago Up 3 minutes 0.0.0.0:8080->8080/tcp proxy
33054f8b7d58 infraserv:tftp "/usr/sbin/init" 35 hours ago Up 2 hours tftp
851ea861d04e infraserv:syslog "/usr/sbin/init" 35 hours ago Up 2 hours 0.0.0.0:514->514/udp syslog
dd3a657cfda2 infraserv:dhcp "/usr/sbin/init" 35 hours ago Up 2 hours 0.0.0.0:67->67/udp dhcp
7249b9c4f11d infraserv:radius "/usr/sbin/init" 35 hours ago Up 2 hours 0.0.0.0:1812->1812/udp radius
A network with the following parameters is generated.
key | value |
---|---|
name | infraserv_infranet |
subnet | 172.20.0.0/24 |
interface | docker1 |
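The bridge interface and its gateway address can be cross-checked on the host with iproute2; the output below is trimmed to the relevant line.
# ip -br addr show docker1
docker1          UP             172.20.0.1/24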
Since tftp runs with --net=host, the docker network is in the following state.
# docker network inspect infraserv_infranet
[
    {
        "Name": "infraserv_infranet",
        "Id": "7ed8face2e4fec3110384fa3366512f8c78db6e10be6e7271b3d92452aefd254",
        "Created": "2020-02-15T05:37:59.248249755-05:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.20.0.0/24",
                    "Gateway": "172.20.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": true,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "7249b9c4f11de1f986892965671086d20957a6021269a5f5bc6dd85263bc0d70": {
                "Name": "radius",
                "EndpointID": "03ae6a9b9ff7817eea101955d2d6ff016982beb65c7dd6631c75c7299682c2dd",
                "MacAddress": "02:42:ac:14:00:04",
                "IPv4Address": "172.20.0.4/24",
                "IPv6Address": ""
            },
            "851ea861d04edeb5f5c2498cc60f58532c87a44592db1f6c51280a8ce27940bd": {
                "Name": "syslog",
                "EndpointID": "d18e466d27def913ac74b7555acc9ef79c88c62e62085b50172636546d2e72bb",
                "MacAddress": "02:42:ac:14:00:03",
                "IPv4Address": "172.20.0.3/24",
                "IPv6Address": ""
            },
            "b11308767849c7227fbde53234c1b1816859c8e871fcc98c4fcaacdf7818e89e": {
                "Name": "proxy",
                "EndpointID": "ffa6479b4f28c9c1d106970ffa43bd149461b4728b64290541643eb895a02892",
                "MacAddress": "02:42:ac:14:00:02",
                "IPv4Address": "172.20.0.2/24",
                "IPv6Address": ""
            },
            "dd3a657cfda211c08b7c5c2166f10d189986e4779f1dfea227b3afe284cbafec": {
                "Name": "dhcp",
                "EndpointID": "7371f4cf652d8b1bdbf2dc1e5e8ae97013a9a70b890c2caa36c2a7cc93b165df",
                "MacAddress": "02:42:ac:14:00:05",
                "IPv4Address": "172.20.0.5/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker1"
        },
        "Labels": {
            "com.docker.compose.network": "infranet",
            "com.docker.compose.project": "infraserv",
            "com.docker.compose.version": "1.25.3"
        }
    }
]
For brevity, I focus on IPv4.
As the example, an external terminal (10.254.10.105) sends a RADIUS request to the Docker host (10.254.10.252). Because the packet is forwarded after arriving at the local host, the hooks of interest are prerouting -> forward -> postrouting, so the explanation covers only chains of type filter and nat.
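For reference, a request like this can be generated from the external terminal with radclient (from freeradius-utils); the attributes and the shared secret testing123 below are placeholders, not the values of this environment.
# echo "User-Name=test, User-Password=test" | radclient 10.254.10.252:1812 auth testing123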
The rules below are extracted from nft list ruleset with the irrelevant ones removed; the removed parts are not very informative, so they are summarized in [Supplement](#Check required tables).
Extracting the prerouting hook from nft list ruleset gives the following.
table ip nat {
chain PREROUTING {
(1) type nat hook prerouting priority -100; policy accept;
(2)-> fib daddr type local COUNTER jump DOCKER
}
->(2) chain DOCKER {
↓ meta l4proto udp udp dport 514 COUNTER dnat to 172.20.0.3:514
↓ meta l4proto udp udp dport 67 COUNTER dnat to 172.20.0.5:67
↓ meta l4proto tcp tcp dport 8080 COUNTER dnat to 172.20.0.2:8080
(3) meta l4proto udp udp dport 1812 COUNTER dnat to 172.20.0.4:1812
}
}
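The two chains above can also be listed individually, which is convenient when the full ruleset is noisy.
# nft list chain ip nat PREROUTING
# nft list chain ip nat DOCKER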
The communication at this point is 10.254.10.105:random -> 10.254.10.252:1812.
(1) The chain PREROUTING, which hooks prerouting and performs nat, is selected.
(2) Since DstAddr is a local address, jump to the chain DOCKER.
fib daddr type local matches the addresses of the local host (here, the Docker host); in this case that is lo: 127.0.0.1, ens192: 10.254.10.252, and docker1: 172.20.0.1.
(3) Since DstPort is 1812, **DNAT DstAddr to 172.20.0.4:1812**.
With no further processing, the policy applies -> **accept**.
The communication at this point is 10.254.10.105:random -> 172.20.0.4:1812.
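This DNAT decision is recorded in conntrack together with the reply tuple, so it can be verified with conntrack-tools (the source port 50000 below stands in for random, and the output is abbreviated).
# conntrack -L -p udp --orig-port-dst 1812
udp 17 ... src=10.254.10.105 dst=10.254.10.252 sport=50000 dport=1812 src=172.20.0.4 dst=10.254.10.105 sport=1812 dport=50000 ...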
Since the destination has changed to 172.20.0.4, the routing decision leads to the forward hook. Extracting the forward hook from nft list ruleset gives the following.
table ip filter {
chain FORWARD {
(1) type filter hook forward priority 0; policy drop;
(2)-> COUNTER jump DOCKER-USER
->(3)(4)-> COUNTER jump DOCKER-ISOLATION-STAGE-1
->(5) oifname "docker1" ct state related,established COUNTER accept
(6)-> oifname "docker1" COUNTER jump DOCKER
iifname "docker1" oifname != "docker1" COUNTER accept
iifname "docker1" oifname "docker1" COUNTER accept
}
->(4) chain DOCKER-ISOLATION-STAGE-1 {
(5)-> COUNTER return
}
->(2) chain DOCKER-USER {
(3)-> COUNTER return
}
->(6) chain DOCKER {
↓ iifname != "docker1" oifname "docker1" meta l4proto udp ip daddr 172.20.0.3 udp dport 514 COUNTER accept
↓ iifname != "docker1" oifname "docker1" meta l4proto udp ip daddr 172.20.0.5 udp dport 67 COUNTER accept
↓ iifname != "docker1" oifname "docker1" meta l4proto tcp ip daddr 172.20.0.2 tcp dport 8080 COUNTER accept
(7) iifname != "docker1" oifname "docker1" meta l4proto udp ip daddr 172.20.0.4 udp dport 1812 COUNTER accept
}
}
table inet firewalld {
chain filter_FORWARD {
(8) type filter hook forward priority 10; policy accept;
↓ ct state established,related accept
(9) ct status dnat accept
iifname "lo" accept
jump filter_FORWARD_IN_ZONES
jump filter_FORWARD_OUT_ZONES
ct state invalid drop
reject with icmpx type admin-prohibited
}
chain filter_FORWARD_IN_ZONES {
iifname "ens192" goto filter_FWDI_public
goto filter_FWDI_public
}
chain filter_FORWARD_OUT_ZONES {
oifname "ens192" goto filter_FWDO_public
goto filter_FWDO_public
}
chain filter_FWDI_public { meta l4proto { icmp, ipv6-icmp } accept }
chain filter_FWDO_public { jump filter_FWDO_public_allow }
chain filter_FWDO_public_allow { ct state new,untracked accept }
}
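The FWDI_public / FWDO_public naming suggests that both directions fall into firewalld's public zone, which can be confirmed with firewall-cmd.
# firewall-cmd --get-default-zone
public
# firewall-cmd --list-all --zone=public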
The communication at this point is 10.254.10.105:random -> 172.20.0.4:1812.
(1) Among the forward hooks, the chain FORWARD, which performs filtering, is evaluated first (pri: 0, the lowest value).
(2) Unconditionally jump to DOCKER-USER.
(3) Return without doing anything.
(4) Unconditionally jump to DOCKER-ISOLATION-STAGE-1.
(5) Return without doing anything.
(6) Since the output IF is docker1, jump to DOCKER.
(7) Since the input IF is ens192, the output IF is docker1, and the destination is 172.20.0.4:1812, **accept**.
The regular chain DOCKER is called from the base chain FORWARD; when the packet is accepted in DOCKER, the calling FORWARD is considered evaluated and this chain ends.
(8) The chain filter_FORWARD, which performs filtering, is evaluated next among the forward hooks (pri: 10).
(9) Since the packet has been DNATed, **accept**.
The communication at this point is unchanged: 10.254.10.105:random -> 172.20.0.4:1812.
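Because every rule in the DOCKER chain carries a counter statement, re-listing the chain after a single request shows which rule was hit; the counter values below are illustrative, not captured output.
# nft list chain ip filter DOCKER
...
iifname != "docker1" oifname "docker1" meta l4proto udp ip daddr 172.20.0.4 udp dport 1812 counter packets 1 bytes 98 accept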
Extracting the postrouting hook from nft list ruleset gives the following.
table ip nat {
chain POSTROUTING {
(1) type nat hook postrouting priority 100; policy accept;
↓ oifname "docker1" fib saddr type local COUNTER masquerade
↓ oifname != "docker1" ip saddr 172.20.0.0/24 COUNTER masquerade
↓ meta l4proto udp ip saddr 172.20.0.3 ip daddr 172.20.0.3 udp dport 514 COUNTER masquerade
↓ meta l4proto udp ip saddr 172.20.0.5 ip daddr 172.20.0.5 udp dport 67 COUNTER masquerade
↓ meta l4proto tcp ip saddr 172.20.0.2 ip daddr 172.20.0.2 tcp dport 8080 COUNTER masquerade
↓ meta l4proto udp ip saddr 172.20.0.4 ip daddr 172.20.0.4 udp dport 1812 COUNTER masquerade
}
}
table ip firewalld {
chain nat_POSTROUTING {
(2) type nat hook postrouting priority 110; policy accept;
(3)-> jump nat_POSTROUTING_ZONES
}
->(3) chain nat_POSTROUTING_ZONES {
↓ oifname "ens192" goto nat_POST_public
(4)-> goto nat_POST_public
}
->(4) chain nat_POST_public {
(5)-> jump nat_POST_public_allow
}
->(5) chain nat_POST_public_allow {
(6) oifname != "lo" masquerade
}
}
The communication at this point is 10.254.10.105:random -> 172.20.0.4:1812.
(1) Among the postrouting hooks, the chain POSTROUTING, which performs nat, is evaluated first (pri: 100).
With no rule matching, the policy applies -> **accept**.
(2) The chain nat_POSTROUTING, which performs nat, is evaluated next among the postrouting hooks (pri: 110).
(3) Unconditionally jump to nat_POSTROUTING_ZONES.
(4) Unconditionally goto nat_POST_public.
(5) Unconditionally jump to nat_POST_public_allow.
(6) Since the output IF is docker1 (i.e. not lo), **masquerade**.
Since the chain reached via goto ends here, the policy applies -> **accept**.
The regular chain nat_POST_public_allow is called (jump) from the regular chain nat_POST_public, which in turn is reached by goto from the regular chain nat_POSTROUTING_ZONES. When nat_POST_public, reached by goto, finishes, nat_POSTROUTING_ZONES ends as well; the nat_POSTROUTING chain that called it then ends too, and the policy accept applies.
After masquerade, the final result is 172.20.0.1:random -> 172.20.0.4:1812.
(Because the packet leaves via docker1, masquerade rewrites the source address to docker1's address.)
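The masqueraded source can be confirmed from the host side by capturing on docker1; the output below is what I would expect for one authentication exchange, not captured here.
# tcpdump -ni docker1 udp port 1812
IP 172.20.0.1.50000 > 172.20.0.4.1812: RADIUS, Access-Request
IP 172.20.0.4.1812 > 172.20.0.1.50000: RADIUS, Access-Accept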
Request received by the radius container:
172.20.0.1:random --> 172.20.0.4:1812
The radius server checks whether the authentication succeeds and returns a response to the radius client.
Response sent back by the radius container:
172.20.0.4:1812 --> 172.20.0.1:random
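The same two packets can be observed from inside the container's network namespace without installing tcpdump in the image, by entering the namespace from the host (nsenter from util-linux; eth0 is my assumption for the container-side interface name).
# nsenter -t $(docker inspect -f '{{.State.Pid}}' radius) -n tcpdump -ni eth0 udp port 1812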
This is where I got stuck. When I set up counters with nftables, I saw the following addresses as the packet passed through each chain. Since this was a single authentication exchange, exactly one packet was visible in each chain.
type filter hook prerouting : 172.20.0.4:1812 --> 172.20.0.1:random
type filter hook input : 172.20.0.4:1812 --> 10.254.10.105:random
type filter hook forward : 172.20.0.4:1812 --> 10.254.10.105:random
type filter hook postrouting : 172.20.0.4:1812 --> 10.254.10.105:random
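For reference, a minimal sketch of the observation-only chains I mean (the table and chain names here are my own): each chain hooks one stage at priority -300, before everything shown above, and a counter rule per candidate tuple reveals which address form each hook sees.
# nft add table ip debug
# nft add chain ip debug obs_pre '{ type filter hook prerouting priority -300; }'
# nft add chain ip debug obs_in '{ type filter hook input priority -300; }'
# nft add chain ip debug obs_fwd '{ type filter hook forward priority -300; }'
# nft add chain ip debug obs_post '{ type filter hook postrouting priority -300; }'
# nft add rule ip debug obs_pre udp sport 1812 ip daddr 172.20.0.1 counter
# nft add rule ip debug obs_in udp sport 1812 ip daddr 10.254.10.105 counter
# nft list table ip debug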
The response from the radius container is 172.20.0.4:1812 -> 172.20.0.1:random. When it arrives at the host, it looks like traffic addressed to the local host, so it is understandable that it passes through hook: input.
Does it then pass through a local process and go out via forward? I am not sure about this part.
I do not understand the route of the response packet from radius:
Why does it not pass through any chain of type: nat?
Why does it pass through both hook: input and hook: forward?
Why does it enter type: filter hook: input pri: -200 of table bridge filter, but never type: filter hook: input pri: 0 of table ip filter?
Do the L2 bridge and the L3 IP layer process packets differently?
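One way to dig into these questions is nft's trace facility: set nftrace on the response packet in an early chain, and nft monitor reports every chain it traverses, including the bridge-family ones (this reuses the hypothetical debug table from the sketch above).
# nft add chain ip debug trace_pre '{ type filter hook prerouting priority -350; }'
# nft add rule ip debug trace_pre udp sport 1812 meta nftrace set 1
# nft monitor trace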
https://knowledge.sakura.ad.jp/22636/
https://ja.wikipedia.org/wiki/Iptables
https://ja.wikipedia.org/wiki/Nftables
https://wiki.archlinux.jp/index.php/Nftables
https://wiki.archlinux.jp/index.php/Iptables
https://wiki.nftables.org/wiki-nftables/index.php/Netfilter_hooks
https://www.frozentux.net/iptables-tutorial/iptables-tutorial.html#TRAVERSINGOFTABLES
https://www.codeflow.site/ja/article/a-deep-dive-into-iptables-and-netfilter-architecture