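As shown above, pods scheduled on the master node cannot resolve the gateway.default.svc.cluster.local domain name, while pods scheduled on node2 can: both curl-6bf6db5c4f-pjld9 and test-post-start1 resolve it successfully with nslookup.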
Errors seen in the affected pod:

    / # nslookup gateway
    nslookup: can't resolve '(null)': Name does not resolve
    nslookup: can't resolve 'gateway': Try again
    / # nslookup gateway.default.svc.cluster.local
    Server: 10.244.0.10
    Address 1: 10.244.0.10
    nslookup: can't resolve 'gateway.default.svc.cluster.local'

Enter the pod on the master node and test resolution directly against the CoreDNS pod IP:

    kubectl exec -it test-post-start2 -- sh
    / # nslookup gateway.default.svc.cluster.local 10.244.0.13
    Server: 10.244.0.13
    Address 1: 10.244.0.13 10-244-0-13.kube-dns.kube-system.svc.cluster.local

    Name: gateway.default.svc.cluster.local
    Address 1: 10.244.106.29 gateway.default.svc.cluster.local

Resolution succeeds when the CoreDNS pod IP is queried directly, which shows that the CoreDNS service itself is working.
Check the ClusterIP of the DNS Service:

    [root@master ~]# kubectl get svc -n kube-system
    NAME       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                  AGE
    kube-dns   ClusterIP   10.244.0.10   <none>        53/UDP,53/TCP,9153/TCP   21m

    # Resolving the name through the ClusterIP fails
    nslookup gateway.default.svc.cluster.local 10.244.0.10

These tests show that the problem lies in the CoreDNS Service (kube-dns).
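
The comparison made so far (the CoreDNS pod IP answers, the kube-dns ClusterIP does not) can be repeated in one pass from inside an affected pod. The following is only a minimal sketch: it assumes a POSIX shell and an nslookup that accepts a server as its second argument, as the one used in this post does, and it relies on nslookup returning a non-zero exit status on failure, which may differ between implementations. The helper name probe is made up for illustration.

```shell
#!/bin/sh
# probe NAME SERVER: report whether SERVER can resolve NAME.
# "probe" is a hypothetical helper, not part of kubectl or busybox.
probe() {
    if nslookup "$1" "$2" >/dev/null 2>&1; then
        echo "OK   $2"
    else
        echo "FAIL $2"
    fi
}

# Example calls (addresses taken from this post):
#   probe gateway.default.svc.cluster.local 10.244.0.13   # CoreDNS pod IP
#   probe gateway.default.svc.cluster.local 10.244.0.10   # kube-dns ClusterIP
```

If the pod IP reports OK and the ClusterIP reports FAIL, the fault is between the pod and the Service address rather than in CoreDNS.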
Export the existing kube-dns Service configuration:

    kubectl get svc -n kube-system kube-dns -o yaml > kube-dns-svc.yaml

Edit kube-dns-svc.yaml. The edited manifest, as shown here, no longer contains a clusterIP field:

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        prometheus.io/port: "9153"
        prometheus.io/scrape: "true"
      labels:
        k8s-app: kube-dns
        kubernetes.io/cluster-service: "true"
        kubernetes.io/name: KubeDNS
      name: kube-dns
      namespace: kube-system
    spec:
      ports:
      - name: dns
        port: 53
        protocol: UDP
        targetPort: 53
      - name: dns-tcp
        port: 53
        protocol: TCP
        targetPort: 53
      - name: metrics
        port: 9153
        protocol: TCP
        targetPort: 9153
      selector:
        k8s-app: kube-dns
      sessionAffinity: None
      type: ClusterIP

    kubectl apply -f kube-dns-svc.yaml

Check the new CoreDNS ClusterIP; it is now 10.244.47.231.

    [root@master ~]# kubectl get svc -n kube-system
    NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
    kube-dns   ClusterIP   10.244.47.231   <none>        53/UDP,53/TCP,9153/TCP   21m

Go into the pod that previously could not resolve the name and test again; the result shows that the new ClusterIP works.

    nslookup gateway.default.svc.cluster.local 10.244.47.231

Update the kubelet clusterDNS setting (the --cluster-dns flag) so that newly created pods get the new CoreDNS ClusterIP as the nameserver in /etc/resolv.conf.
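
    # Edit the kubelet configuration
    vim /var/lib/kubelet/config.yaml
    # Find clusterDNS and set it to the new ClusterIP
    clusterDNS:
    - 10.244.47.231
    # Restart kubelet for the change to take effect.
    # Note: every node in the cluster must be edited and restarted.
    systemctl restart kubelet.service

Finally, check /etc/resolv.conf in a newly created pod. Name resolution now works.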
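
Because the change has to be repeated on every node, it can help to confirm what each node's kubelet configuration actually contains after editing. The sketch below is one possible check, assuming the clusterDNS entry is written as a block list, as in the snippet from this post; the function name cluster_dns is hypothetical.

```shell
#!/bin/sh
# cluster_dns FILE: print every address listed under the top-level
# clusterDNS key of a kubelet config file, one per line.
# Assumes the block-list form shown in this post ("- 10.244.47.231").
cluster_dns() {
    awk '
        /^clusterDNS:/ { in_list = 1; next }
        in_list && /^[[:space:]]*-[[:space:]]*/ {
            sub(/^[[:space:]]*-[[:space:]]*/, "")
            print
            next
        }
        in_list { in_list = 0 }   # first non-list line ends the block
    ' "$1"
}

# Example call on a node:
#   cluster_dns /var/lib/kubelet/config.yaml
```

Running it on each node and comparing the output with the CLUSTER-IP column of the kube-dns Service shows which nodes still point at the old address.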

    / # cat /etc/resolv.conf
    nameserver 10.244.47.231
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5
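
The same check can be scripted inside a pod instead of reading the file by eye. This is a small sketch under the assumption that the first nameserver line is the one that matters; first_nameserver is a made-up helper name.

```shell
#!/bin/sh
# first_nameserver FILE: print the address on the first "nameserver"
# line of a resolv.conf-style file.
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "$1"
}

# Example call inside a new pod (expected address taken from this post):
#   [ "$(first_nameserver /etc/resolv.conf)" = "10.244.47.231" ] && echo "using new ClusterIP"
```

A pod created before kubelet was restarted keeps its old /etc/resolv.conf, so it would still print 10.244.0.10 here until it is recreated.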