```
$ sudo ip link add veth0 type veth peer name ceth0
```

With this single command, we created a pair of interconnected virtual Ethernet devices. The names veth0 and ceth0 were chosen arbitrarily.
```
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:e3:27:77 brd ff:ff:ff:ff:ff:ff
5: ceth0@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 66:2d:24:e3:49:3f brd ff:ff:ff:ff:ff:ff
6: veth0@ceth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 96:e8:de:1d:22:e0 brd ff:ff:ff:ff:ff:ff
```

Right after creation, both veth0 and ceth0 live on the host's network stack (also called the root network namespace). To connect the netns0 namespace to the root namespace, we keep one device in the root namespace and move the other into netns0:
```
$ sudo ip link set ceth0 netns netns0

# List all the devices; ceth0 has disappeared from the root stack
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:e3:27:77 brd ff:ff:ff:ff:ff:ff
6: veth0@if5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 96:e8:de:1d:22:e0 brd ff:ff:ff:ff:ff:ff link-netns netns0
```

Once the devices are up and have proper IP addresses, any packet sent on one device immediately shows up on its peer, which connects the two namespaces. Start with the root namespace:
```
$ sudo ip link set veth0 up
$ sudo ip addr add 172.18.0.11/16 dev veth0
```

Then continue with netns0:
```
$ sudo nsenter --net=/var/run/netns/netns0
$ ip link set lo up
$ ip link set ceth0 up
$ ip addr add 172.18.0.10/16 dev ceth0
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
5: ceth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 66:2d:24:e3:49:3f brd ff:ff:ff:ff:ff:ff link-netnsid 0
```
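Once the two ends sit in different namespaces, `ip link` shows them as `veth0@if5` and `ceth0@if6`. The number after `@if` is the interface index of the peer in the other namespace: in the first listing ceth0 had index 5 and veth0 had index 6. A minimal sketch of pulling that index out of a device name follows; `peer_ifindex` is a hypothetical helper name, not part of iproute2.

```shell
# peer_ifindex NAME: print the peer's interface index from a name such as
# "veth0@if5"; return non-zero if the name carries no "@ifN" suffix.
# (Hypothetical helper for illustration only.)
peer_ifindex() {
  case $1 in
    *@if[0-9]*) printf '%s\n' "${1##*@if}" ;;
    *) return 1 ;;
  esac
}

peer_ifindex veth0@if5   # prints 5
peer_ifindex ceth0@if6   # prints 6
```

Matching that number against the leading index in the other namespace's `ip link` output identifies the peer device.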
Check the connectivity:
```
# From netns0, ping root's veth0
$ ping -c 2 172.18.0.11
PING 172.18.0.11 (172.18.0.11) 56(84) bytes of data.
64 bytes from 172.18.0.11: icmp_seq=1 ttl=64 time=0.038 ms
64 bytes from 172.18.0.11: icmp_seq=2 ttl=64 time=0.040 ms

--- 172.18.0.11 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 58ms
rtt min/avg/max/mdev = 0.038/0.039/0.040/0.001 ms

# Leave netns0
$ exit

# From the root namespace, ping ceth0
$ ping -c 2 172.18.0.10
PING 172.18.0.10 (172.18.0.10) 56(84) bytes of data.
64 bytes from 172.18.0.10: icmp_seq=1 ttl=64 time=0.073 ms
64 bytes from 172.18.0.10: icmp_seq=2 ttl=64 time=0.046 ms

--- 172.18.0.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 3ms
rtt min/avg/max/mdev = 0.046/0.059/0.073/0.015 ms
```

However, if we try to reach any other address from the netns0 namespace, it fails:
```
# In the root namespace
$ ip addr show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:e3:27:77 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic noprefixroute eth0
       valid_lft 84057sec preferred_lft 84057sec
    inet6 fe80::5054:ff:fee3:2777/64 scope link
       valid_lft forever preferred_lft forever

# Note that the IP here is 10.0.2.15

$ sudo nsenter --net=/var/run/netns/netns0

# Try to ping the host's eth0
$ ping 10.0.2.15
connect: Network is unreachable

# Try to reach the outside world
$ ping 8.8.8.8
connect: Network is unreachable
```

That is easy to explain. The netns0 routing table has no route for such packets. Its only entry says how to reach the 172.18.0.0/16 network:
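```
# In the netns0 namespace:
$ ip route
172.18.0.0/16 dev ceth0 proto kernel scope link src 172.18.0.10
```

Linux has several ways to populate the routing table. One of them is to derive routes from the directly attached network interfaces. Remember that the routing table in netns0 was empty right after the namespace was created. We then added the ceth0 device and assigned it the address 172.18.0.10/16. Because we supplied an address together with a netmask rather than a bare IP address, the network stack could derive routing information from it. Every packet destined for 172.18.0.0/16 goes out through the ceth0 device, and any other packet has no route and is dropped. Similarly, the root namespace gained a new route: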
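```
# In the root namespace:
$ ip route
# ... irrelevant lines omitted ...
172.18.0.0/16 dev veth0 proto kernel scope link src 172.18.0.11
```

At this point, we can consider the first question answered. We now know how to isolate, virtualize, and connect Linux network stacks.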
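The connected routes shown above follow directly from the address and prefix length: masking 172.18.0.10 with a /16 netmask yields 172.18.0.0/16. The sketch below reproduces that arithmetic in plain shell. `connected_route` is a hypothetical helper name; it only illustrates the calculation and does not touch the kernel's routing table.

```shell
# connected_route ADDR/PREFIX: print the network that results from masking
# an IPv4 address with its prefix length. (Hypothetical helper; illustrative only.)
connected_route() {
  addr=${1%/*}
  prefix=${1#*/}

  # Split the dotted quad into four positional parameters.
  old_ifs=$IFS
  IFS=.
  set -- $addr
  IFS=$old_ifs

  ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  net=$(( ip & mask ))

  printf '%d.%d.%d.%d/%d\n' \
    $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
    $(( (net >> 8) & 255 )) $(( net & 255 )) "$prefix"
}

connected_route 172.18.0.10/16   # prints 172.18.0.0/16
connected_route 172.18.0.11/16   # prints 172.18.0.0/16
```

Both ends of the veth pair map to the same network, which is why each namespace ends up with a route for 172.18.0.0/16 pointing at its own device.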
Now set up a second container, netns1, in the same way:

```
# From the root namespace
$ sudo ip netns add netns1
$ sudo ip link add veth1 type veth peer name ceth1
$ sudo ip link set ceth1 netns netns1
$ sudo ip link set veth1 up
$ sudo ip addr add 172.18.0.21/16 dev veth1

$ sudo nsenter --net=/var/run/netns/netns1
$ ip link set lo up
$ ip link set ceth1 up
$ ip addr add 172.18.0.20/16 dev ceth1
```

Check the connectivity:
```
# From netns1 we cannot reach the root namespace!
$ ping -c 2 172.18.0.21
PING 172.18.0.21 (172.18.0.21) 56(84) bytes of data.
From 172.18.0.20 icmp_seq=1 Destination Host Unreachable
From 172.18.0.20 icmp_seq=2 Destination Host Unreachable

--- 172.18.0.21 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 55ms
pipe 2

# But the route exists!
$ ip route
172.18.0.0/16 dev ceth1 proto kernel scope link src 172.18.0.20

# Leave `netns1`
$ exit

# From the root namespace we cannot reach `netns1`
$ ping -c 2 172.18.0.20
PING 172.18.0.20 (172.18.0.20) 56(84) bytes of data.
From 172.18.0.11 icmp_seq=1 Destination Host Unreachable
From 172.18.0.11 icmp_seq=2 Destination Host Unreachable

--- 172.18.0.20 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 23ms
pipe 2

# From netns0 we CAN reach `veth1`
$ sudo nsenter --net=/var/run/netns/netns0
$ ping -c 2 172.18.0.21
PING 172.18.0.21 (172.18.0.21) 56(84) bytes of data.
64 bytes from 172.18.0.21: icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from 172.18.0.21: icmp_seq=2 ttl=64 time=0.046 ms

--- 172.18.0.21 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 33ms
rtt min/avg/max/mdev = 0.037/0.041/0.046/0.007 ms

# But we still cannot reach netns1
$ ping -c 2 172.18.0.20
PING 172.18.0.20 (172.18.0.20) 56(84) bytes of data.
From 172.18.0.10 icmp_seq=1 Destination Host Unreachable
From 172.18.0.10 icmp_seq=2 Destination Host Unreachable

--- 172.18.0.20 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 63ms
pipe 2
```

Whoops! Something went wrong: netns1 is stuck. It cannot reach the root namespace, and the root namespace cannot reach it either. However, because both containers sit in the same IP network, 172.18.0.0/16, the netns0 container can reach the host's veth1.
It took some time to track down the cause, but we are evidently facing a clash of routes. Inspect the routing table in the root namespace:
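```
$ ip route
# ... irrelevant lines omitted ...
172.18.0.0/16 dev veth0 proto kernel scope link src 172.18.0.11
172.18.0.0/16 dev veth1 proto kernel scope link src 172.18.0.21
```

After we added the second veth pair, the root network stack learned a new route, `172.18.0.0/16 dev veth1 proto kernel scope link src 172.18.0.21`, but a route to the same network already existed. When the second container tries to ping veth1, the first route is selected, which breaks connectivity. If we deleted the first route with `sudo ip route delete 172.18.0.0/16 dev veth0 proto kernel scope link src 172.18.0.11` and rechecked connectivity, the situation would simply be mirrored: netns1 would become reachable, but netns0 would not.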
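One way to see which of the two clashing routes is actually used is `ip route get`, which prints the route chosen for a given destination. The sketch below wraps it in a small filter, `route_dev` (a hypothetical helper name, not part of iproute2), that extracts the outgoing device. So that the block runs anywhere, it is fed a sample line in `ip route get` format rather than live output; the sample is illustrative and was not captured from the setup above.

```shell
# route_dev: read `ip route get` output on stdin and print the outgoing device.
# (Hypothetical helper for illustration only.)
route_dev() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# On the host you would run:
#   ip route get 172.18.0.20 | route_dev
# Here a sample line (illustrative, not captured output) stands in for it:
echo '172.18.0.20 dev veth0 src 172.18.0.11 uid 1000' | route_dev
```

In the failing pings from the root namespace, the errors come from 172.18.0.11. That suggests traffic for 172.18.0.20 leaves through veth0, even though netns1 is only reachable through veth1.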
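Clean up the existing setup first:

```
# Deleting a namespace destroys the veth end inside it and, with it, the peer,
# so the `ip link delete` commands below may report "Cannot find device".
# That is harmless.
$ sudo ip netns delete netns0
$ sudo ip netns delete netns1
$ sudo ip link delete veth0
$ sudo ip link delete ceth0
$ sudo ip link delete veth1
$ sudo ip link delete ceth1
```

Quickly recreate the two containers. Note that this time we do not assign any IP address to the new veth0 and veth1 devices: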
```
$ sudo ip netns add netns0
$ sudo ip link add veth0 type veth peer name ceth0
$ sudo ip link set veth0 up
$ sudo ip link set ceth0 netns netns0

$ sudo nsenter --net=/var/run/netns/netns0
$ ip link set lo up
$ ip link set ceth0 up
$ ip addr add 172.18.0.10/16 dev ceth0
$ exit

$ sudo ip netns add netns1
$ sudo ip link add veth1 type veth peer name ceth1
$ sudo ip link set veth1 up
$ sudo ip link set ceth1 netns netns1

$ sudo nsenter --net=/var/run/netns/netns1
$ ip link set lo up
$ ip link set ceth1 up
$ ip addr add 172.18.0.20/16 dev ceth1
$ exit
```

Make sure there are no new routes on the host:
```
$ ip route
default via 10.0.2.2 dev eth0 proto dhcp metric 100
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15 metric 100
```

Finally, create the bridge interface:
```
$ sudo ip link add br0 type bridge
$ sudo ip link set br0 up
```

Attach veth0 and veth1 to the bridge:
```
$ sudo ip link set veth0 master br0
$ sudo ip link set veth1 master br0
```
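The rebuild and bridge steps above repeat the same handful of commands per container, so they can be collected into one helper. The following is a minimal sketch under a few assumptions: `run` and `add_container` are hypothetical names, `ip netns exec` is used in place of the interactive `nsenter` sessions so that each step is a single command, and the bridge is created before the containers so that each veth can be attached right away. With `DRY_RUN=1` the commands are only printed, which makes it safe to inspect before running it for real.

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# add_container NS HOST_IF CONT_IF ADDR/PREFIX BRIDGE
# (Hypothetical helper mirroring the manual steps shown earlier.)
add_container() {
  run sudo ip netns add "$1"
  run sudo ip link add "$2" type veth peer name "$3"
  run sudo ip link set "$2" up
  run sudo ip link set "$3" netns "$1"
  run sudo ip netns exec "$1" ip link set lo up
  run sudo ip netns exec "$1" ip link set "$3" up
  run sudo ip netns exec "$1" ip addr add "$4" dev "$3"
  run sudo ip link set "$2" master "$5"
}

# Printed only, because DRY_RUN=1; unset it to apply the changes.
DRY_RUN=1
run sudo ip link add br0 type bridge
run sudo ip link set br0 up
add_container netns0 veth0 ceth0 172.18.0.10/16 br0
add_container netns1 veth1 ceth1 172.18.0.20/16 br0
```

As in the manual steps, the host-side veth0 and veth1 get no IP address; only the container-side ends are addressed.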
Check the connectivity between the containers:
```
$ sudo nsenter --net=/var/run/netns/netns0
$ ping -c 2 172.18.0.20
PING 172.18.0.20 (172.18.0.20) 56(84) bytes of data.
64 bytes from 172.18.0.20: icmp_seq=1 ttl=64 time=0.259 ms
64 bytes from 172.18.0.20: icmp_seq=2 ttl=64 time=0.051 ms

--- 172.18.0.20 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 2ms
rtt min/avg/max/mdev = 0.051/0.155/0.259/0.104 ms
```

```
$ sudo nsenter --net=/var/run/netns/netns1
$ ping -c 2 172.18.0.10
PING 172.18.0.10 (172.18.0.10) 56(84) bytes of data.
64 bytes from 172.18.0.10: icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from 172.18.0.10: icmp_seq=2 ttl=64 time=0.089 ms

--- 172.18.0.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 36ms
rtt min/avg/max/mdev = 0.037/0.063/0.089/0.026 ms
```

Lovely! Everything works. With this new approach, we did not have to configure veth0 and veth1 at all. The only two IP addresses are the ones assigned to the ceth0 and ceth1 ends. And because both ends are on the same Ethernet segment (remember, we attached them to a virtual switch), they have connectivity at L2:
```
$ sudo nsenter --net=/var/run/netns/netns0
$ ip neigh
172.18.0.20 dev ceth0 lladdr 6e:9c:ae:02:60:de STALE
$ exit

$ sudo nsenter --net=/var/run/netns/netns1
$ ip neigh
172.18.0.10 dev ceth1 lladdr 66:f3:8c:75:09:29 STALE
$ exit
```

Congratulations: we have learned how to turn containers into friendly neighbors that do not interfere with each other yet stay connected.