Deploying k3s + Rancher

    Tech, 2022-07-13

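    Contents

    1. Deploying k3s
        1.1 Initialize the machines
        1.2 Install Docker
        1.3 Install the K3s server (master node)
        1.4 Install the K3s agent (worker nodes)
        1.5 Set up the NFS server (any node will do; here, the master)
        1.6 Install the NFS client (all worker nodes)
    2. Install Helm 3
        2.1 Download
        2.2 Install
        2.3 Point Helm at the Alibaba Cloud chart repository
    3. Install nginx-ingress with Helm (skipped)
        3.1 Search
        3.2 Install
    4. Install cert-manager with Helm for automated HTTPS (failed, skipped)
        4.1 Install with the regular manifest
        4.2 Install with Helm (the method I used)
        4.3 Verify the installation
        4.4 Create the issuer
    5. Configure Traefik for automated HTTPS
        5.1 Install the CRDs and RBAC
        5.2 Install Traefik
            5.2.1 Configure the dnsChallenge provider
            5.2.2 Get a DNSPod API token
            5.2.3 Create the DNSPod token secret
            5.2.4 Create Traefik
        5.3 Create the IngressRoute
        5.4 Open the Traefik web UI
    6. Rancher
        6.1 Install Rancher
        6.2 Add the k3s cluster
        6.3 Deploy services
            6.3.1 Deploy MySQL
            6.3.2 Deploy Redis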

    1. Deploying k3s

    The plan is one master and four worker nodes. My Kubernetes version is 1.17.3. Since Kubernetes 1.16, Deployment is no longer served under extensions/v1beta1, so manifests must use apps/v1.

    1.1 Initialize the machines

    # usage: curl http://pigx.vip/os7init.sh | sh -s <hostname>
    # run the matching line on each machine
    curl http://pigx.vip/os7init.sh | sh -s master
    curl http://pigx.vip/os7init.sh | sh -s node1
    curl http://pigx.vip/os7init.sh | sh -s node2
    curl http://pigx.vip/os7init.sh | sh -s node3
    curl http://pigx.vip/os7init.sh | sh -s node4
    # then reboot
    reboot

    http://pigx.vip/os7init.sh is an initialization script for 64-bit CentOS 7.

    1.2 Install Docker

    curl http://pigx.vip/docker_install.sh | sh

    http://pigx.vip/docker_install.sh is a Docker installation script.

    1.3 Install the K3s server (master node)

    export INSTALL_K3S_VERSION=v1.17.3-k3s1
    export INSTALL_K3S_EXEC="--docker --kube-apiserver-arg service-node-port-range=1-65000 --no-deploy traefik --write-kubeconfig ~/.kube/config --write-kubeconfig-mode 666"
    curl -sfL http://rancher-mirror.cnrancher.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -

    Installation complete:

    [root@master master]# kubectl get nodes
    NAME     STATUS   ROLES    AGE     VERSION
    master   Ready    master   3m25s   v1.17.3+k3s1

    1.4 Install the K3s agent (worker nodes)

    # print the server token (run on the master)
    cat /var/lib/rancher/k3s/server/node-token

    # run the rest on each worker node
    export INSTALL_K3S_VERSION=v1.17.3-k3s1
    # paste the token printed above
    export K3S_TOKEN=XXXXXX
    # server address
    export K3S_URL=https://10.0.0.28:6443
    # agent arguments; the server-only flags from 1.3 (--kube-apiserver-arg, --no-deploy,
    # --write-kubeconfig, --write-kubeconfig-mode) are left out because `k3s agent` does not define them
    export INSTALL_K3S_EXEC="--docker"
    # run the install script
    curl -sfL http://rancher-mirror.cnrancher.com/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -

    If you delete a node and later join a node with the same name, the join fails and the master logs this error:

    Nov 13 11:57:57 master k3s[9284]: time="2020-11-13T11:57:57.350894560+08:00" level=error msg="Node password validation failed for 'node2', using passwd file '/var/lib/rancher/k3s/server/cred/node-passwd'"

    To fix it, remove the old node's entry from node-passwd on the master (vi /var/lib/rancher/k3s/server/cred/node-passwd) and delete the node from the master.
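
    The cleanup of node-passwd can be scripted. The following is a minimal sketch, assuming each line of the file is comma-separated and that the node name appears as one whole field (the exact column layout may differ between k3s versions, so inspect the file first). The function name remove_node_passwd_entry is hypothetical; it keeps a .bak copy before rewriting the file.

```shell
#!/bin/sh
# Remove every line of a comma-separated passwd file in which some field
# equals the given node name. A backup is written to "<file>.bak" first.
# Assumption: the node name appears as a whole comma-separated field.
remove_node_passwd_entry() {
    file=$1
    node=$2
    cp "$file" "$file.bak" || return 1
    awk -F, -v node="$node" '
        {
            keep = 1
            for (i = 1; i <= NF; i++) if ($i == node) keep = 0
            if (keep) print
        }
    ' "$file.bak" > "$file"
}

# Example (run on the master as root, then delete the node object too):
#   remove_node_passwd_entry /var/lib/rancher/k3s/server/cred/node-passwd node2
#   kubectl delete node node2
```

    Matching on a whole field rather than a substring means that removing node2 leaves an entry such as node22 untouched.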

    Installation complete:

    [root@master master]# kubectl get nodes
    NAME     STATUS   ROLES    AGE   VERSION
    master   Ready    master   11m   v1.17.3+k3s1
    node4    Ready    <none>   62s   v1.17.3+k3s1
    node2    Ready    <none>   59s   v1.17.3+k3s1
    node1    Ready    <none>   54s   v1.17.3+k3s1
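
    To check the result without reading the table by eye, the STATUS column can be filtered. This is a minimal sketch; the helper name not_ready_nodes is hypothetical, and it assumes the default column layout of kubectl get nodes shown above (NAME in column 1, STATUS in column 2).

```shell
#!/bin/sh
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose STATUS column is anything other than exactly "Ready"
# (so "NotReady" and "Ready,SchedulingDisabled" are both reported).
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Example:
#   kubectl get nodes | not_ready_nodes
# No output means every listed node is Ready.
```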

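    1.5 Set up the NFS server (any node will do; here it is installed on the master)

    yum -y install nfs-utils rpcbind
    # create the export directory and set its permissions
    mkdir /nfsdata && chmod 666 /nfsdata && chown nfsnobody /nfsdata
    # configure the export: open /etc/exports and add the line shown below
    vim /etc/exports
    /nfsdata *(rw,no_root_squash,no_all_squash,sync)
    # start and enable the services
    systemctl start rpcbind.service
    systemctl enable rpcbind.service
    systemctl start nfs.service
    systemctl enable nfs.service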
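
    The manual edit of /etc/exports can also be done non-interactively. The following is a small sketch of an idempotent append, so that re-running the setup does not duplicate the entry. The function name add_export is hypothetical.

```shell
#!/bin/sh
# Append an export line to an exports file unless the identical line is
# already present, so re-running the setup does not duplicate entries.
add_export() {
    exports_file=$1
    line=$2
    grep -qxF -- "$line" "$exports_file" 2>/dev/null ||
        printf '%s\n' "$line" >> "$exports_file"
}

# Example (as root on the NFS server), followed by a re-export:
#   add_export /etc/exports '/nfsdata *(rw,no_root_squash,no_all_squash,sync)'
#   exportfs -ra
```

    grep is called with -F and -x so that the line is compared as a literal whole line; the * and parentheses in the export options are not treated as a pattern.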

    1.6 Install the NFS client (all worker nodes)

    Without the client, the pod of the nfs-client automatic provisioner used by the StorageClass (the Provisioner) stays in ContainerCreating on whichever node it is scheduled to:

    [root@master nfs-client]# kubectl get pods -o wide
    NAME                                      READY   STATUS              RESTARTS   AGE     IP       NODE    NOMINATED NODE   READINESS GATES
    nfs-client-provisioner-798cfd7476-zrndd   0/1     ContainerCreating   0          3m53s   <none>   node1   <none>           <none>

    Install:

    yum -y install nfs-utils rpcbind
    systemctl start rpcbind.service
    systemctl enable rpcbind.service
    systemctl start nfs.service
    systemctl enable nfs.service

    Then check that the export is visible from the node:

    [root@node1 ~]# showmount -e 10.0.0.28
    Export list for 10.0.0.28:
    /nfsdata *

    2. Install Helm 3

    2.1 Download

    Releases: https://github.com/helm/helm/releases

    [root@master helm]# wget https://get.helm.sh/helm-v3.2.4-linux-amd64.tar.gz

    The download is slow from the server. You can download the archive elsewhere (through a proxy if needed) and upload it to the server with lrzsz; for installing lrzsz see https://blog.csdn.net/qq_22356995/article/details/104071562

    Extract:

    [root@master helm]# ls
    helm-v3.2.4-linux-amd64.tar.gz
    [root@master helm]# tar -xvf helm-v3.2.4-linux-amd64.tar.gz
    linux-amd64/
    linux-amd64/helm
    linux-amd64/README.md
    linux-amd64/LICENSE

    2.2 Install

    Move helm into /usr/local/bin:

    [root@master helm]# ls
    helm-v3.2.4-linux-amd64.tar.gz  linux-amd64
    [root@master helm]# mv linux-amd64/helm /usr/local/bin
    [root@master helm]# helm version
    version.BuildInfo{Version:"v3.2.4", GitCommit:"0ad800ef43d3b826f31a5ad8dfbb4fe05d143688", GitTreeState:"clean", GoVersion:"go1.13.12"}

    2.3 Point Helm at the Alibaba Cloud chart repository (a mirror inside China)

    [root@master ~]# helm repo add apphub https://apphub.aliyuncs.com
    "apphub" has been added to your repositories
    [root@master ~]# helm repo list
    NAME    URL
    apphub  https://apphub.aliyuncs.com

    3. Install nginx-ingress with Helm (skipped)

    3.1 Search

    [root@master helm]# helm search repo nginx-ingress
    NAME                             CHART VERSION  APP VERSION  DESCRIPTION
    apphub/nginx-ingress             1.30.3         0.28.0       An nginx Ingress controller that uses ConfigMap...
    apphub/nginx-ingress-controller  5.3.4          0.29.0       Chart for the nginx Ingress controller
    apphub/nginx-lego                0.3.1                       Chart for nginx-ingress-controller and kube-lego

    Because the chart comes from Alibaba Cloud, its images can be pulled without a proxy, so it can be installed directly.

    3.2 Install

    Write a custom values file:

    [root@master nginx-ingress]# cat <<EOF> my-values.yaml
    controller:
      hostNetwork: true
      daemonset:
        useHostPort: false
        hostPorts:
          http: 80
          https: 443
      service:
        type: ClusterIP
      tolerations:
        - operator: "Exists"
      nodeSelector:
        kubernetes.io/hostname: master
    defaultBackend:
      tolerations:
        - operator: "Exists"
      nodeSelector:
        kubernetes.io/hostname: master
    EOF

    # install
    helm install apphub/nginx-ingress --namespace kube-system --name-template nginx-ingress -f my-values.yaml

    Installation complete:

    [root@master nginx-ingress]# kubectl get pods -n kube-system | grep nginx-ingress
    nginx-ingress-default-backend-bbff5894d-dt4fq   1/1   Running   0   28s
    nginx-ingress-controller-754d99f4c4-ftxmh       1/1   Running   0   28s

    4. Install cert-manager with Helm for automated HTTPS (failed, skipped)

    4.1 Install with the regular manifest

    Install the CustomResourceDefinitions and cert-manager itself:

    [root@master ~]# mkdir -p ~/i/master/helm/cert-manager/ && cd ~/i/master/helm/cert-manager/
    [root@master cert-manager]# wget https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager.yaml
    # Kubernetes 1.15+
    [root@master cert-manager]# kubectl apply --validate=false -f cert-manager.yaml

    This is very slow.

    4.2 Install with Helm (the method I used)

    kubectl create namespace cert-manager
    helm repo add jetstack https://charts.jetstack.io
    helm repo update

    # Helm v3+, cert-manager v0.15.1
    helm install \
      cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --set cainjector.image.repository=registry.cn-shanghai.aliyuncs.com/wanfei/cert-manager-cainjector \
      --set image.repository=registry.cn-shanghai.aliyuncs.com/wanfei/cert-manager-controller \
      --set webhook.image.repository=registry.cn-shanghai.aliyuncs.com/wanfei/cert-manager-webhook \
      --version v0.15.1 \
      --set installCRDs=true \
      --set ingressShim.defaultIssuerName=letsencrypt-prod \
      --set ingressShim.defaultIssuerKind=ClusterIssuer

    # alternatively, cert-manager v0.15.2
    helm install \
      cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --version v0.15.2 \
      --set installCRDs=true

    4.3 Verify the installation

    [root@master cert-manager]# kubectl get pods -n cert-manager
    NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-cainjector-8545fdf87c-54mbs   1/1     Running   0          4m36s
    cert-manager-webhook-8c5db9fb6-2t5qk       1/1     Running   0          4m36s
    cert-manager-9b8969d86-n54n2               1/1     Running   0          4m36s

    4.4 Create the issuer

    [root@master cert-manager]# cat <<EOF> production-issuer.yaml
    apiVersion: cert-manager.io/v1alpha2
    kind: ClusterIssuer
    metadata:
      name: letsencrypt-prod
    spec:
      acme:
        # The ACME server URL
        server: https://acme-v02.api.letsencrypt.org/directory
        # Email address used for ACME registration
        email: www19930327@126.com
        # Name of a secret used to store the ACME account private key
        privateKeySecretRef:
          name: letsencrypt-prod
        # Enable the HTTP-01 challenge provider
        solvers:
        - http01:
            ingress:
              class: nginx
    EOF

    # apply
    [root@master cert-manager]# kubectl create -f production-issuer.yaml

    It failed with:

    Error from server (InternalError): error when creating "production-issuer.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: dial tcp 10.43.226.158:443: connect: connection timed out

    This may be a version incompatibility; I never found a fix for this error. I later switched to Traefik 2.2, which gave me automated HTTPS.

    5. Configure Traefik for automated HTTPS

    Official documentation: https://docs.traefik.io/user-guides/crd-acme/

    5.1 Install the CRDs and RBAC

    [root@master traefik]# cat <<EOF> crd_rbac.yaml
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: ingressroutes.traefik.containo.us
    spec:
      group: traefik.containo.us
      version: v1alpha1
      names:
        kind: IngressRoute
        plural: ingressroutes
        singular: ingressroute
      scope: Namespaced
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: middlewares.traefik.containo.us
    spec:
      group: traefik.containo.us
      version: v1alpha1
      names:
        kind: Middleware
        plural: middlewares
        singular: middleware
      scope: Namespaced
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: ingressroutetcps.traefik.containo.us
    spec:
      group: traefik.containo.us
      version: v1alpha1
      names:
        kind: IngressRouteTCP
        plural: ingressroutetcps
        singular: ingressroutetcp
      scope: Namespaced
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: ingressrouteudps.traefik.containo.us
    spec:
      group: traefik.containo.us
      version: v1alpha1
      names:
        kind: IngressRouteUDP
        plural: ingressrouteudps
        singular: ingressrouteudp
      scope: Namespaced
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: tlsoptions.traefik.containo.us
    spec:
      group: traefik.containo.us
      version: v1alpha1
      names:
        kind: TLSOption
        plural: tlsoptions
        singular: tlsoption
      scope: Namespaced
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: tlsstores.traefik.containo.us
    spec:
      group: traefik.containo.us
      version: v1alpha1
      names:
        kind: TLSStore
        plural: tlsstores
        singular: tlsstore
      scope: Namespaced
    ---
    apiVersion: apiextensions.k8s.io/v1beta1
    kind: CustomResourceDefinition
    metadata:
      name: traefikservices.traefik.containo.us
    spec:
      group: traefik.containo.us
      version: v1alpha1
      names:
        kind: TraefikService
        plural: traefikservices
        singular: traefikservice
      scope: Namespaced
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: traefik-ingress-controller
      namespace: kube-system
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: traefik-ingress-controller
    rules:
      - apiGroups:
          - ""
        resources:
          - services
          - endpoints
          - secrets
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - extensions
        resources:
          - ingresses
        verbs:
          - get
          - list
          - watch
      - apiGroups:
          - extensions
        resources:
          - ingresses/status
        verbs:
          - update
      - apiGroups:
          - traefik.containo.us
        resources:
          - middlewares
          - ingressroutes
          - traefikservices
          - ingressroutetcps
          - ingressrouteudps
          - tlsoptions
          - tlsstores
        verbs:
          - get
          - list
          - watch
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1beta1
    metadata:
      name: traefik-ingress-controller
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: traefik-ingress-controller
    subjects:
      - kind: ServiceAccount
        name: traefik-ingress-controller
        namespace: kube-system
    EOF

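    Apply:

    kubectl apply -f crd_rbac.yaml

    5.2 Install Traefik

    To get automated HTTPS through Let's Encrypt, ACME has to be enabled first; see the official documentation https://docs.traefik.io/https/acme/

    In my tests, tlsChallenge and httpChallenge require that Let's Encrypt can reach Traefik on ports 443 and 80 respectively. I could not get either of them to work and only succeeded with dnsChallenge.

    5.2.1 Configure the dnsChallenge provider

    The provider setting depends on who hosts the DNS for your domain. For a domain bought on Tencent Cloud the DNS service is DNSPod, so the provider is dnspod.

    5.2.2 Get a DNSPod API token

    The DNS service is DNSPod, https://www.dnspod.cn/. Log in and create a token.

    5.2.3 Create the DNSPod token secret

    Below, the argument --certificatesresolvers.myresolver.acme.dnschallenge.provider=dnspod selects Tencent Cloud (DNSPod) DNS validation. It also requires one environment variable: DNSPOD_API_KEY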

    kubectl create secret generic traefik-dnspod-secret \
      --from-literal=DNSPOD_API_KEY=1xxxxx6,6exxxxxxxxxxxxxxxxxxc3e9 \
      --from-literal=DNSPOD_HTTP_TIMEOUT=30 \
      -n kube-system

    DNSPOD_API_KEY is the API token, formed by joining the ID and the Token with a comma (required).
    DNSPOD_HTTP_TIMEOUT is the API request timeout. The first time I did not set it and, although the token was correct, I still got the timeout error acme: error presenting token: API call failed: Post \"https://dnsapi.cn/Domain.List\": context deadline exceeded (Client.Timeout exceeded while awaiting headers). So set a longer timeout here; with 30 it succeeded on the first try. See https://github.com/go-acme/lego/issues/1096
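
    The "ID,Token" form of DNSPOD_API_KEY is easy to get wrong when pasting by hand, so it can help to build it from two separate values. This is a minimal sketch; the function name dnspod_api_key and the variables DNSPOD_ID and DNSPOD_TOKEN are hypothetical names, not part of Traefik or lego.

```shell
#!/bin/sh
# Join a DNSPod token ID and token value into the "ID,Token" form that
# DNSPOD_API_KEY expects. Fails if either part is empty.
dnspod_api_key() {
    id=$1
    token=$2
    [ -n "$id" ] && [ -n "$token" ] || return 1
    printf '%s,%s' "$id" "$token"
}

# Example with placeholder variables (set them to your own ID and token):
#   kubectl create secret generic traefik-dnspod-secret \
#     --from-literal=DNSPOD_API_KEY="$(dnspod_api_key "$DNSPOD_ID" "$DNSPOD_TOKEN")" \
#     --from-literal=DNSPOD_HTTP_TIMEOUT=30 \
#     -n kube-system
```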

    5.2.4 Create Traefik

    cat <<EOF> traefik.yaml
    kind: Deployment
    apiVersion: apps/v1
    metadata:
      namespace: kube-system
      name: traefik
      labels:
        app: traefik
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: traefik
      template:
        metadata:
          labels:
            app: traefik
        spec:
          serviceAccountName: traefik-ingress-controller
          terminationGracePeriodSeconds: 60
          tolerations:
          - operator: "Exists"
          nodeSelector:
            kubernetes.io/hostname: master
          volumes:
          - name: acme
            hostPath:
              # where certificates are stored; /nfsdata is the NFS root directory
              path: /nfsdata/traefik/acme
          containers:
          - name: traefik
            image: traefik:v2.2
            args:
            - --api.insecure=true
            - --accesslog
            - --entrypoints.web.Address=:80
            - --entrypoints.websecure.Address=:443
            - --providers.kubernetescrd
            - --api
            - --api.dashboard=true
            # use DNS validation
            - --certificatesresolvers.myresolver.acme.dnschallenge.provider=dnspod
            - --certificatesresolvers.myresolver.acme.dnschallenge.delaybeforecheck=0
            - --certificatesresolvers.myresolver.acme.email=www19930327@126.com
            - --certificatesresolvers.myresolver.acme.storage=/etc/acme/acme.json
            # staging CA server for testing; remove this argument once certificates are issued successfully
            #- --certificatesresolvers.myresolver.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
            # the DNSPod token is read from the secret as environment variables
            envFrom:
            - secretRef:
                name: traefik-dnspod-secret
            # mount the volume into the container
            volumeMounts:
            - name: acme
              mountPath: /etc/acme
            ports:
            - name: web
              containerPort: 80
              hostPort: 80
            - name: websecure
              containerPort: 443
              hostPort: 443
            - name: admin
              containerPort: 8080
    ---
    kind: Service
    apiVersion: v1
    metadata:
      name: traefik
      namespace: kube-system
    spec:
      selector:
        app: traefik
      ports:
      - protocol: TCP
        port: 8080
        name: admin
    EOF

    Apply:

    kubectl apply -f traefik.yaml

    [root@master traefik]# kubectl get pods -n kube-system | grep traefik
    traefik-cd86f5748-wbsd9   1/1   Running   0   43m

    5.3 Create the IngressRoute

    vi ingressRoute.yaml

    apiVersion: traefik.containo.us/v1alpha1
    kind: IngressRoute
    metadata:
      name: traefik-webui-tls
      namespace: kube-system
    spec:
      entryPoints:
      - websecure
      routes:
      - match: Host(`traefik.wanfei.wang`)
        kind: Rule
        services:
        - name: traefik
          port: 8080
      tls:
        certResolver: myresolver

    # create
    kubectl apply -f ingressRoute.yaml

    5.4 Open the Traefik web UI

    HTTPS now works and the certificate is reported as valid.

    Complete example, including a middleware that redirects HTTP to HTTPS:

    apiVersion: traefik.containo.us/v1alpha1
    kind: Middleware
    metadata:
      name: redirect-https
      namespace: kube-system
    spec:
      redirectScheme:
        scheme: https
    ---
    apiVersion: traefik.containo.us/v1alpha1
    kind: IngressRoute
    metadata:
      name: traefik-webui
      namespace: kube-system
    spec:
      entryPoints:
      - web
      routes:
      - match: Host(`traefik.wanfei.wang`)
        kind: Rule
        services:
        - name: traefik
          port: 8080
        middlewares:
        - name: redirect-https
    ---
    apiVersion: traefik.containo.us/v1alpha1
    kind: IngressRoute
    metadata:
      name: traefik-webui-tls
      namespace: kube-system
    spec:
      entryPoints:
      - websecure
      routes:
      - match: Host(`traefik.wanfei.wang`)
        kind: Rule
        services:
        - name: traefik
          port: 8080
      tls:
        certResolver: myresolver

    Reference: https://www.jianshu.com/p/8facbaa9d6e0

    6. Rancher

    6.1 Install Rancher

    Official documentation: https://rancher2.docs.rancher.cn/docs/installation/other-installation-methods/single-node-docker/_index/

    Rancher 2.4.5 installed with Helm could not use Traefik 2's automated HTTPS, so Rancher is installed with Docker below.

    [root@master rancher]# cat start.sh
    #!/bin/bash
    docker stop rancher
    docker rm rancher
    docker run -d --restart=unless-stopped \
      --name rancher \
      -v $PWD/rancher:/var/lib/rancher/ \
      -p 8080:80 -p 8443:443 \
      rancher/rancher:latest

    Run it:

    [root@master rancher]# sh start.sh

    Open https://rancher.wanfei.wang:8443/. On first access you have to set a password.

    6.2 Add the k3s cluster

    Switch the UI language to Chinese, click Add Cluster, choose Import, then pick the command that skips certificate verification and run it on the master node:

    [root@master rancher]# curl --insecure -sfL https://rancher.wanfei.wang:8443/v3/import/gjcwdkf4fxzl8xgxl6zq528tdx59qb79p9f7jxt9llschlgngmsr66.yaml | kubectl apply -f -
    clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver created
    clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master created
    namespace/cattle-system created
    serviceaccount/cattle created
    clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
    secret/cattle-credentials-2cfe787 created
    clusterrole.rbac.authorization.k8s.io/cattle-admin created
    deployment.apps/cattle-cluster-agent created
    daemonset.apps/cattle-node-agent created

    Then click the host count.

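    6.3 Deploy services

    6.3.1 Deploy MySQL

    Image (prepared in advance): registry.cn-shanghai.aliyuncs.com/wanfei/pigx-mysql:8.0.20
    Environment variable for the database password: MYSQL_ROOT_PASSWORD = root
    Data volume: set the storageClass used for the mount; mine is nfs-data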
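
    The same three settings (image, password variable, storageClass) can be written as a manifest instead of being entered in the Rancher UI. The following is a sketch of an equivalent, not the configuration Rancher generates. The function name write_mysql_manifest, the resource names mysql and mysql-data, the 5Gi request, the default namespace and the /var/lib/mysql mount path are all assumptions.

```shell
#!/bin/sh
# Write a PVC + Deployment manifest for MySQL to the path given as $1.
# Assumptions (not from the original setup): resource names "mysql" and
# "mysql-data", a 5Gi volume request, the default namespace, and the
# data directory /var/lib/mysql inside the image.
write_mysql_manifest() {
    cat > "$1" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  storageClassName: nfs-data
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: registry.cn-shanghai.aliyuncs.com/wanfei/pigx-mysql:8.0.20
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: root
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: mysql-data
EOF
}

# Example:
#   write_mysql_manifest mysql.yaml && kubectl apply -f mysql.yaml
```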

    6.3.2 Deploy Redis

    I have not worked out how to set a password here.
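
    One common way to give Redis a password is to start redis-server with --requirepass and take the value from a Secret. This is offered as a sketch, not something the original setup used. It assumes the official redis image (redis:6 here) and hypothetical resource names redis and redis-auth; the function name write_redis_manifest is also hypothetical.

```shell
#!/bin/sh
# Write a Deployment manifest to the path given as $1 that starts Redis
# with a password read from a Secret.
# Assumptions: the official "redis" image, and a Secret named "redis-auth"
# with a key "password" that has been created beforehand.
write_redis_manifest() {
    # The heredoc delimiter is quoted so the shell leaves $(REDIS_PASSWORD)
    # untouched; Kubernetes expands it from the container's env at start.
    cat > "$1" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:6
        command: ["redis-server"]
        args: ["--requirepass", "$(REDIS_PASSWORD)"]
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: redis-auth
              key: password
        ports:
        - containerPort: 6379
EOF
}

# Example (replace the placeholder password with your own):
#   kubectl create secret generic redis-auth --from-literal=password=change-me
#   write_redis_manifest redis.yaml && kubectl apply -f redis.yaml
```

    In the Rancher UI the equivalent would be to set the container command and an environment variable on the workload; the manifest form is used here only because it is easier to show in text.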
