• Offline (air-gapped) high-availability deployment of Rancher on k3s


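    1. Overall architecture


    1.1 Node planning

    Node IP         | Installed services        | Role                                                        | Notes
    ----------------+---------------------------+-------------------------------------------------------------+-----------------------------------------
    192.168.100.31  | Harbor, Nginx, NTP, MySQL | Image registry, load balancer, time sync, k3s database      | Nginx and MySQL run in Docker
    192.168.100.32  | K3S                       | Rancher Server node                                         | Connects to the external MySQL database
    192.168.100.146 | K3S                       | Rancher Server node                                         | Connects to the external MySQL database
    192.168.100.33  | Docker                    | Downstream cluster, Rancher Agent node                      | ETCD, Control, Worker
    192.168.100.34  | Docker                    | Downstream cluster, Rancher Agent node                      | ETCD, Worker
    192.168.100.35  | Docker                    | Downstream cluster, Rancher Agent node                      | Worker
    192.168.100.147 | Docker                    | Downstream cluster, Rancher Agent node                      | ETCD, Worker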

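    1.2 Architecture design

    • Rancher Server (local cluster)

      Following the official high-availability deployment guide, a Rancher Server deployed on k3s needs at least two nodes for high availability; here 192.168.100.32 and 192.168.100.146 form the highly available pair.

      MySQL is deployed on 192.168.100.31 as the external k3s datastore, which the high-availability setup relies on.

      nginx is deployed on 192.168.100.31; its configuration file provides high availability and load balancing for Rancher Server.

    • Rancher Agent (downstream cluster)

      The downstream cluster is deployed as a Rancher custom cluster. For high availability it needs at least three etcd nodes, so that the cluster stays available when a single etcd node fails.

    • Domain name planning

      High availability depends on accessing Rancher Server through a domain name, so this deployment uses xxyf.rancher.com.

    • Offline package download

      All required rpm packages are downloaded with repotrack, which fetches a package together with all of its dependencies so that installation succeeds in the offline environment.

      # On an Internet-connected machine, install repotrack
      yum -y install yum-utils
      # Example: download nginx and its dependencies into /home/nginx-rpms
      repotrack -p /home/nginx-rpms nginx
      # Pack all dependencies
      tar -zcvf nginx-rpms.tar.gz /home/nginx-rpms/
      
    • Other

      Because this is an intranet environment, an NTP time server is required to keep the clocks of all cluster nodes consistent.

      Network security policy does not allow the firewall to be disabled, so firewall rules must be configured with as few open ports as possible.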
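
    The three-etcd-node requirement for the downstream cluster follows from etcd's majority quorum. The sketch below is illustrative arithmetic only; the function names are hypothetical and not part of any Rancher or etcd tooling. It prints the quorum size and the number of member failures a cluster of N etcd nodes can survive.

```shell
#!/usr/bin/env bash
# Quorum arithmetic for an etcd cluster with N members (illustrative helpers).
# quorum = floor(N / 2) + 1; tolerated failures = N - quorum.

etcd_quorum() {
  local members=$1
  echo $(( members / 2 + 1 ))
}

etcd_fault_tolerance() {
  local members=$1
  echo $(( members - (members / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  printf 'members=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(etcd_quorum "$n")" "$(etcd_fault_tolerance "$n")"
done
```

    With the node plan in section 1.1 (etcd on 192.168.100.33, .34 and .147), the cluster tolerates one etcd failure; a fourth etcd member would not raise that number, which is why odd member counts are the usual choice.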


    2. Node preparation


    2.1 NTP time synchronization service

    2.1.1 Downloading the NTP packages
    • Download on an Internet-connected machine

      yum -y install ntp --downloadonly --downloaddir /root/ntp-rpms
      
    • Pack the files into ntp-rpms.tar.gz and copy the archive to every intranet node

      cd /root/
      tar -zcvf ntp-rpms.tar.gz  ntp-rpms/
      
    2.1.2 Installing the NTP server
    • Set the host time

      timedatectl set-timezone Asia/Shanghai
      date -s "2022-07-29 10:13:00"
      hwclock --set --date "2022-07-29 10:13:00"
      hwclock --hctosys
      hwclock -w
      init 6
      
    • Extract and install the NTP packages

      tar -zxvf ntp-rpms.tar.gz
      rpm -Uvh --nodeps --force ntp-rpms/*.rpm
      
    • Edit the configuration file

      vi /etc/ntp.conf
      
      # Add
      restrict -4 default kod notrap nomodify
      restrict -6 default kod notrap nomodify
      
      # Comment out
      #server 0.centos.pool.ntp.org iburst
      #server 1.centos.pool.ntp.org iburst
      #server 2.centos.pool.ntp.org iburst
      #server 3.centos.pool.ntp.org iburst
      
      # Add (use the local clock of this host as the time source)
      server 127.127.1.0
      fudge 127.127.1.0 stratum 8
      
    • Start the service

      systemctl restart ntpd
      systemctl disable chronyd.service
      systemctl enable ntpd.service
      
    • Check the status

      ntpstat
      netstat -tunlp | grep 123
      
    • Open the firewall port

      firewall-cmd --permanent --add-port 123/udp
      firewall-cmd --reload
      
    2.1.3 Installing the NTP client
    • Extract and install the NTP packages

      tar -zxvf ntp-rpms.tar.gz
      rpm -Uvh --nodeps --force ntp-rpms/*.rpm
      
    • Edit the configuration file

      vi /etc/ntp.conf
      
      # Add
      restrict -4 default kod notrap nomodify
      restrict -6 default kod notrap nomodify
      
      # Comment out
      #server 0.centos.pool.ntp.org iburst
      #server 1.centos.pool.ntp.org iburst
      #server 2.centos.pool.ntp.org iburst
      #server 3.centos.pool.ntp.org iburst
      
      # Add (the IP address is that of the NTP server)
      server 192.168.100.31
      # fudge applies only to local reference clocks (127.127.x.x), so no fudge line is needed on clients
      
    • Start the service

      systemctl restart ntpd
      systemctl disable chronyd.service
      systemctl enable ntpd.service
      
    • Check the status

      ntpstat
      
    • Synchronize the time from the server

      ntpdate -u 192.168.100.31
      

    2.2 System environment configuration


    • Disable SELinux

      sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
      grep "SELINUX=disabled" /etc/selinux/config
      setenforce 0
      
    • Disable swap

      swapoff -a
      sed -i 's$/dev/mapper/centos-swap$#/dev/mapper/centos-swap$g' /etc/fstab
      echo "vm.swappiness=0" >> /etc/sysctl.conf
      sysctl -p /etc/sysctl.conf
      
    • Set kernel parameters

      echo """
      net.bridge.bridge-nf-call-ip6tables=1
      net.bridge.bridge-nf-call-iptables=1
      net.ipv4.ip_forward=1
      """ >> /etc/sysctl.conf
      
      modprobe br_netfilter
      sysctl -p
      
    • Configure DNS

      This setting has no practical effect in the offline environment, but installing k3s fails with an error if it is missing

      echo "nameserver 114.114.114.114" > /etc/resolv.conf
      systemctl daemon-reload
      systemctl restart network
      
    • Configure hosts

      Resolve the domain xxyf.rancher.com to the server running nginx, which performs the load balancing

      cat << EOF >> /etc/hosts
      192.168.100.31  xxyf.rancher.com
      192.168.100.32  server032
      192.168.100.33  server033
      192.168.100.34  server034
      192.168.100.35  server035
      192.168.100.146 server146
      192.168.100.147 server147
      EOF
      
    • Firewall configuration

      The ports listed here cover most Rancher deployment types and can be trimmed later according to actual use

      firewall-cmd --permanent --add-port 22/tcp
      firewall-cmd --permanent --add-port 80/tcp
      firewall-cmd --permanent --add-port 443/tcp
      firewall-cmd --permanent --add-port 2379/tcp
      firewall-cmd --permanent --add-port 2380/tcp
      firewall-cmd --permanent --add-port 6443/tcp
      firewall-cmd --permanent --add-port 6444/tcp
      firewall-cmd --permanent --add-port 8472/udp
      firewall-cmd --permanent --add-port 9345/tcp
      firewall-cmd --permanent --add-port 10249/tcp
      firewall-cmd --permanent --add-port 10250/tcp
      firewall-cmd --permanent --add-port 10256/tcp
      firewall-cmd --permanent --add-port 30935/tcp
      firewall-cmd --permanent --add-port 31477/tcp
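      # The next two lines are essential (10.42.0.0/16 and 10.43.0.0/16 are the default k3s pod and service CIDRs); without them Rancher nodes cannot communicate and later steps fail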
      firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16 
      firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16
      firewall-cmd --reload
      
    • Install k3s-selinux (only on the two k3s nodes)

      tar -zxvf container-selinux-rpms.tar.gz
      rpm -Uvh --force --nodeps container-selinux-rpms/*.rpm
      
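
    Since the firewall port list is expected to shrink once the deployment settles, it can help to drive the firewall-cmd calls from a single list. The following is a minimal dry-run sketch: the function name and the example port subset are assumptions for illustration, not a verified minimal set. It only prints the commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Print (dry run) the firewall-cmd calls for a list of port/protocol pairs.
# Nothing is executed; review the output, then pipe it to `sh` if it looks right.

firewall_rules() {
  local spec
  for spec in "$@"; do
    printf 'firewall-cmd --permanent --add-port=%s\n' "$spec"
  done
  printf 'firewall-cmd --reload\n'
}

# Hypothetical subset for a k3s server node; adjust to the actual topology.
K3S_SERVER_PORTS=(22/tcp 80/tcp 443/tcp 6443/tcp 8472/udp 10250/tcp)

firewall_rules "${K3S_SERVER_PORTS[@]}"
```

    Keeping the list in one array makes later tightening a one-line change per port instead of an edit to a long command sequence.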

    3. k3s installation


    3.1 Preparing the database

    Deploy MySQL 5.7 on 192.168.100.31 as the external datastore of the k3s cluster, which the high-availability setup relies on.

    docker run -d -p 13306:3306 --restart=unless-stopped --name mysql-k3s \
    -v /home/mysql-k3s:/var/lib/mysql \
    -e MYSQL_ROOT_PASSWORD=xxx 192.168.100.31:18888/library/mysql:5.7.38
    

    For security, do not use the root account; create a dedicated k3s user instead. The connection details are:

    IP: 192.168.100.31
    PORT: 13306
    DB: k3s
    USERNAME: k3s
    PASSWORD: K3s_12345AA
    
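
    The text above asks for a dedicated k3s account but does not show how to create it. One possible way, sketched under assumptions: the helper name k3s_db_sql is hypothetical, and the grant scope ('%' as the allowed host) is a choice to tighten as needed. The function only emits SQL; the commented line shows how it could be fed to the mysql-k3s container started earlier.

```shell
#!/usr/bin/env bash
# Emit the SQL that creates a dedicated database and account for k3s.
# Arguments: database name, user name, password.

k3s_db_sql() {
  local db=$1 user=$2 password=$3
  cat <<SQL
CREATE DATABASE IF NOT EXISTS \`${db}\`;
CREATE USER IF NOT EXISTS '${user}'@'%' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON \`${db}\`.* TO '${user}'@'%';
FLUSH PRIVILEGES;
SQL
}

# Print the statements for the values listed above.
k3s_db_sql k3s k3s 'K3s_12345AA'

# To apply them (assumes the container name and root password from the docker run command):
#   k3s_db_sql k3s k3s 'K3s_12345AA' | docker exec -i mysql-k3s mysql -uroot -p'xxx'
```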

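    3.2 Downloading k3s

    Following the official documentation, download these files from GitHub or a domestic mirror:

    • k3s (v1.18.20+k3s1)
    • k3s-airgap-images-amd64.tar (offline image bundle)
    • install.sh (installation script)

    3.3 Installing k3s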

    mkdir -p /var/lib/rancher/k3s/agent/images/
    cp ./k3s-airgap-images-amd64.tar /var/lib/rancher/k3s/agent/images/
    cp ./k3s /usr/local/bin/
    chmod a+x /usr/local/bin/k3s
    chmod +x ./install.sh
    
    # Install on the server (master) nodes
    K3S_DATASTORE_ENDPOINT='mysql://k3s:K3s_12345AA@tcp(192.168.100.31:13306)/k3s' INSTALL_K3S_SKIP_DOWNLOAD=true ./install.sh
    
    # Install on agent (node) nodes (do NOT run: this deployment uses two k3s servers for high availability and no agent nodes)
    K3S_URL=https://xxyf.rancher.com:6443 INSTALL_K3S_SKIP_DOWNLOAD=true K3S_TOKEN=xxx ./install.sh
    # K3S_TOKEN is obtained by running the following command on a server node
    cat /var/lib/rancher/k3s/server/node-token
    
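
    The datastore endpoint in the server install command has the form mysql://user:password@tcp(host:port)/database, and both k3s server nodes (192.168.100.32 and 192.168.100.146) need the same value so that they share one datastore. A small sketch that assembles the string from the values in section 3.1; the helper name is hypothetical, and it only prints the install command rather than running it.

```shell
#!/usr/bin/env bash
# Build the K3S_DATASTORE_ENDPOINT value for a MySQL datastore.
# Arguments: user, password, host, port, database.

k3s_mysql_endpoint() {
  local user=$1 password=$2 host=$3 port=$4 db=$5
  printf 'mysql://%s:%s@tcp(%s:%s)/%s\n' "$user" "$password" "$host" "$port" "$db"
}

endpoint=$(k3s_mysql_endpoint k3s 'K3s_12345AA' 192.168.100.31 13306 k3s)

# Print the command to run on each k3s server node.
echo "K3S_DATASTORE_ENDPOINT='${endpoint}' INSTALL_K3S_SKIP_DOWNLOAD=true ./install.sh"
```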

    3.4 Using kubectl

    Install the kubectl tool

    # Download the kubectl binary as described in the official documentation, copy it to the k3s server node, and install it
    install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
    # The k3s installation generates the file /etc/rancher/k3s/k3s.yaml
    mkdir -p ~/.kube/config/
    cp /etc/rancher/k3s/k3s.yaml ~/.kube/config/
    vi ~/.kube/config/k3s.yaml
    # Change server to the domain name; keep port 6443, do not drop it
    server: https://xxyf.rancher.com:6443
    
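
    The manual vi edit of the server line can also be scripted. This is a sketch assuming GNU sed (as shipped with CentOS) and a hypothetical helper name; it rewrites every server: line in the given kubeconfig and is demonstrated here on a throwaway sample file rather than on the real k3s.yaml.

```shell
#!/usr/bin/env bash
# Rewrite the `server:` URL in a kubeconfig file, preserving indentation.
# Arguments: kubeconfig path, new server URL (must not contain '#').

set_kubeconfig_server() {
  local file=$1 url=$2
  sed -i "s#^\([[:space:]]*server:\).*#\1 ${url}#" "$file"
}

# Demo on a temporary file shaped like the k3s-generated kubeconfig.
demo=$(mktemp)
cat > "$demo" <<'EOF'
apiVersion: v1
clusters:
- cluster:
    server: https://127.0.0.1:6443
  name: default
EOF

set_kubeconfig_server "$demo" "https://xxyf.rancher.com:6443"
grep 'server:' "$demo"
rm -f "$demo"
```

    For the real file the call would be set_kubeconfig_server ~/.kube/config/k3s.yaml https://xxyf.rancher.com:6443.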

    Check the cluster status

    export KUBECONFIG=~/.kube/config/k3s.yaml
    kubectl get pods --all-namespaces
    kubectl get nodes
    

    Fixing a node whose ROLES column shows <none> (this happens on k3s agent nodes; it has not been observed on server nodes)

    [root@server032 ~]# kubectl get nodes
    NAME           STATUS   ROLES    AGE    VERSION
    server032   Ready    master   24m    v1.18.20+k3s1
    server033   Ready    <none>   106s   v1.18.20+k3s1
    [root@server032 ~]# 
    [root@server032 ~]# kubectl label node server033 node-role.kubernetes.io/worker=worker
    node/server033 labeled
    [root@server032 ~]# 
    [root@server032 ~]# kubectl get nodes
    NAME           STATUS   ROLES    AGE     VERSION
    server032   Ready    master   24m     v1.18.20+k3s1
    server033   Ready    worker   2m13s   v1.18.20+k3s1
    
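
    When several nodes show <none>, the label command can be generated from the kubectl get nodes output. A minimal sketch, assuming the default column layout shown above (name in column 1, roles in column 3); the helper name is hypothetical, and the demo parses the sample output instead of querying a cluster.

```shell
#!/usr/bin/env bash
# Read `kubectl get nodes` output on stdin and print the names of nodes
# whose ROLES column is <none>. The header line is skipped.

nodes_without_role() {
  awk 'NR > 1 && $3 == "<none>" { print $1 }'
}

# Demo on the sample output from this section (no cluster required).
sample='NAME        STATUS   ROLES    AGE    VERSION
server032   Ready    master   24m    v1.18.20+k3s1
server033   Ready    <none>   106s   v1.18.20+k3s1'

printf '%s\n' "$sample" | nodes_without_role | while read -r node; do
  # Print the command instead of running it, so it can be reviewed first.
  echo "kubectl label node ${node} node-role.kubernetes.io/worker=worker"
done
```

    Against a live cluster the input would come from kubectl get nodes piped into nodes_without_role.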

    4. Fetching the Rancher chart with Helm (Internet-connected machine)


    • Download the Helm package helm-v3.9.1-linux-amd64.tar.gz

      https://github.com/helm/helm/releases/tag/v3.9.1
      
    • Extract the archive and move the binary into place

      tar -zxvf helm-v3.9.1-linux-amd64.tar.gz
      mv linux-amd64/helm /usr/local/bin/helm
      
    • Fetch the Rancher chart

      helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
      helm fetch rancher-stable/rancher --version=v2.5.14
      
    • Render the Rancher chart; the output is a directory named rancher

      helm template rancher ./rancher-2.5.14.tgz --output-dir . \
          --no-hooks \
          --namespace cattle-system \
          --set hostname=xxyf.rancher.com \
          --set rancherImage=192.168.100.31:18888/rancher/rancher \
          --set ingress.tls.source=secret \
          --set systemDefaultRegistry=192.168.100.31:18888 \
          --set useBundledSystemChart=true \
          --set rancherImageTag=v2.5.14
      

    5. Generating the TLS certificate (Internet-connected machine)


    • One-step certificate generation script create_self-signed-cert.sh

      #!/bin/bash -e
      
      help ()
      {
          echo  ' ================================================================ '
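          echo  ' --ssl-domain: primary domain for the SSL certificate; defaults to www.rancher.local if omitted, and can be ignored when the service is accessed by IP;'
          echo  ' --ssl-trusted-ip: SSL certificates normally trust only requests made by domain name; to access the server by IP, add extension IPs to the certificate, separated by commas;'
          echo  ' --ssl-trusted-domain: to allow access through additional domains, add extension domains (SSL_TRUSTED_DOMAIN), separated by commas;'
          echo  ' --ssl-size: SSL key size in bits, default 2048;'
          echo  ' --ssl-cn: country code (two-letter code), default CN;'
          echo  ' Usage example:'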
          echo  ' ./create_self-signed-cert.sh --ssl-domain=www.test.com --ssl-trusted-domain=www.test2.com \ '
          echo  ' --ssl-trusted-ip=1.1.1.1,2.2.2.2,3.3.3.3 --ssl-size=2048 --ssl-date=3650'
          echo  ' ================================================================'
      }
      
      case "$1" in
          -h|--help) help; exit;;
      esac
      
      if [[ $1 == '' ]];then
          help;
          exit;
      fi
      
      CMDOPTS="$*"
      for OPTS in $CMDOPTS;
      do
          key=$(echo ${OPTS} | awk -F"=" '{print $1}' )
          value=$(echo ${OPTS} | awk -F"=" '{print $2}' )
          case "$key" in
              --ssl-domain) SSL_DOMAIN=$value ;;
              --ssl-trusted-ip) SSL_TRUSTED_IP=$value ;;
              --ssl-trusted-domain) SSL_TRUSTED_DOMAIN=$value ;;
              --ssl-size) SSL_SIZE=$value ;;
              --ssl-date) SSL_DATE=$value ;;
              --ca-date) CA_DATE=$value ;;
              --ssl-cn) CN=$value ;;
          esac
      done
      
      # CA settings
      CA_DATE=${CA_DATE:-3650}
      CA_KEY=${CA_KEY:-cakey.pem}
      CA_CERT=${CA_CERT:-cacerts.pem}
      CA_DOMAIN=cattle-ca
      
      # SSL settings
      SSL_CONFIG=${SSL_CONFIG:-$PWD/openssl.cnf}
      SSL_DOMAIN=${SSL_DOMAIN:-'www.rancher.local'}
      SSL_DATE=${SSL_DATE:-3650}
      SSL_SIZE=${SSL_SIZE:-2048}
      
      ## Country code (two-letter code), default CN;
      CN=${CN:-CN}
      
      SSL_KEY=$SSL_DOMAIN.key
      SSL_CSR=$SSL_DOMAIN.csr
      SSL_CERT=$SSL_DOMAIN.crt
      
      echo -e "\033[32m ---------------------------- \033[0m"
      echo -e "\033[32m     | Generate SSL Cert |     \033[0m"
      echo -e "\033[32m ---------------------------- \033[0m"
      
      if [[ -e ./${CA_KEY} ]]; then
          echo -e "\033[32m ====> 1. Found an existing CA private key; backing up ${CA_KEY} to ${CA_KEY}-bak, then regenerating it \033[0m"
          mv ${CA_KEY} "${CA_KEY}"-bak
          openssl genrsa -out ${CA_KEY} ${SSL_SIZE}
      else
          echo -e "\033[32m ====> 1. Generating a new CA private key ${CA_KEY} \033[0m"
          openssl genrsa -out ${CA_KEY} ${SSL_SIZE}
      fi
      
      if [[ -e ./${CA_CERT} ]]; then
          echo -e "\033[32m ====> 2. Found an existing CA certificate; backing up ${CA_CERT} to ${CA_CERT}-bak, then regenerating it \033[0m"
          mv ${CA_CERT} "${CA_CERT}"-bak
          openssl req -x509 -sha256 -new -nodes -key ${CA_KEY} -days ${CA_DATE} -out ${CA_CERT} -subj "/C=${CN}/CN=${CA_DOMAIN}"
      else
          echo -e "\033[32m ====> 2. Generating a new CA certificate ${CA_CERT} \033[0m"
          openssl req -x509 -sha256 -new -nodes -key ${CA_KEY} -days ${CA_DATE} -out ${CA_CERT} -subj "/C=${CN}/CN=${CA_DOMAIN}"
      fi
      
      echo -e "\033[32m ====> 3. Generating the OpenSSL configuration file ${SSL_CONFIG} \033[0m"
      cat > ${SSL_CONFIG} <<EOM
      [req]
      req_extensions = v3_req
      distinguished_name = req_distinguished_name
      [req_distinguished_name]
      [ v3_req ]
      basicConstraints = CA:FALSE
      keyUsage = nonRepudiation, digitalSignature, keyEncipherment
      extendedKeyUsage = clientAuth, serverAuth
      EOM
      
      if [[ -n ${SSL_TRUSTED_IP} || -n ${SSL_TRUSTED_DOMAIN} || -n ${SSL_DOMAIN} ]]; then
          cat >> ${SSL_CONFIG} <<EOM
      subjectAltName = @alt_names
      [alt_names]
      EOM
          IFS=","
          dns=(${SSL_TRUSTED_DOMAIN})
          dns+=(${SSL_DOMAIN})
          for i in "${!dns[@]}"; do
            echo DNS.$((i+1)) = ${dns[$i]} >> ${SSL_CONFIG}
          done
      
          if [[ -n ${SSL_TRUSTED_IP} ]]; then
              ip=(${SSL_TRUSTED_IP})
              for i in "${!ip[@]}"; do
                echo IP.$((i+1)) = ${ip[$i]} >> ${SSL_CONFIG}
              done
          fi
      fi
      
      echo -e "\033[32m ====> 4. Generating the service SSL key ${SSL_KEY} \033[0m"
      openssl genrsa -out ${SSL_KEY} ${SSL_SIZE}
      
      echo -e "\033[32m ====> 5. Generating the service SSL CSR ${SSL_CSR} \033[0m"
      openssl req -sha256 -new -key ${SSL_KEY} -out ${SSL_CSR} -subj "/C=${CN}/CN=${SSL_DOMAIN}" -config ${SSL_CONFIG}
      
      echo -e "\033[32m ====> 6. Generating the service SSL certificate ${SSL_CERT} \033[0m"
      openssl x509 -sha256 -req -in ${SSL_CSR} -CA ${CA_CERT} \
          -CAkey ${CA_KEY} -CAcreateserial -out ${SSL_CERT} \
          -days ${SSL_DATE} -extensions v3_req \
          -extfile ${SSL_CONFIG}
      
      echo -e "\033[32m ====> 7. Certificate creation complete \033[0m"
      echo
      echo -e "\033[32m ====> 8. Printing the results in YAML format \033[0m"
      echo "----------------------------------------------------------"
      echo "ca_key: |"
      cat $CA_KEY | sed 's/^/  /'
      echo
      echo "ca_cert: |"
      cat $CA_CERT | sed 's/^/  /'
      echo
      echo "ssl_key: |"
      cat $SSL_KEY | sed 's/^/  /'
      echo
      echo "ssl_csr: |"
      cat $SSL_CSR | sed 's/^/  /'
      echo
      echo "ssl_cert: |"
      cat $SSL_CERT | sed 's/^/  /'
      echo
      
      echo -e "\033[32m ====> 9. Appending the CA certificate to the cert file \033[0m"
      cat ${CA_CERT} >> ${SSL_CERT}
      echo "ssl_cert: |"
      cat $SSL_CERT | sed 's/^/  /'
      echo
      
      echo -e "\033[32m ====> 10. Renaming the service certificate \033[0m"
      echo "cp ${SSL_DOMAIN}.key tls.key"
      cp ${SSL_DOMAIN}.key tls.key
      echo "cp ${SSL_DOMAIN}.crt tls.crt"
      cp ${SSL_DOMAIN}.crt tls.crt
      
    • Generate the certificate, producing tls.crt and tls.key

      ./create_self-signed-cert.sh \
      --ssl-domain=xxyf.rancher.com \
      --ssl-trusted-ip=192.168.100.32,192.168.100.146 \
      --ssl-trusted-domain=xxyf.rancher.com \
      --ssl-size=2048 \
      --ssl-date=3650
      

    6. Installing Rancher


    Run all steps on a k3s server node (either node will do)

    • Configure the private image registry for k3s

      mkdir -p /etc/rancher/k3s/
      vi /etc/rancher/k3s/registries.yaml
      
      mirrors:
        docker.io:
          endpoint:
            - "http://192.168.100.31:18888"
        "192.168.100.31:18888":
          endpoint:
            - "http://192.168.100.31:18888"
      configs:
        "192.168.100.31:18888":
          auth:
            username: admin
            password: xxx
      
      # Run on k3s server nodes
      systemctl restart k3s
      # Run on k3s agent nodes (none in this deployment)
      systemctl restart k3s-agent
      
    • Install the TLS certificate

      kubectl create namespace cattle-system
      kubectl -n cattle-system create secret tls tls-rancher-ingress --cert=./tls.crt --key=./tls.key
      
    • Change the number of Rancher replicas

      vi ./rancher/templates/deployment.yaml
      
      kind: Deployment
      apiVersion: apps/v1
      metadata:
        name: rancher
        labels:
          app: rancher
          chart: rancher-2.5.14
          heritage: Helm
          release: rancher
      spec:
        # Changed from the default 3 to 2 (there are only two k3s server nodes)
        replicas: 2
      
    • Install Rancher

      kubectl -n cattle-system  apply -R -f ./rancher
      
    • Firewall (if the network is unreachable, apply this temporarily for testing)

      iptables -P INPUT ACCEPT
      iptables -P FORWARD ACCEPT
      iptables -P OUTPUT ACCEPT
      iptables -F
      
    • View logs

      kubectl get pods -A -o wide
      kubectl logs -n cattle-system helm-operation-xxx
      # crictl takes the place of the docker command
      k3s crictl ps -a
      k3s crictl images
      k3s crictl logs xxx
      
    • Disable telemetry (required in an offline environment)

      [Screenshot: disabling telemetry]


    7. Configuring the downstream cluster


    7.1 Node environment configuration

    • Disable SELinux

      sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
      grep "SELINUX=disabled" /etc/selinux/config
      setenforce 0
      
    • Disable swap

      swapoff -a
      echo "vm.swappiness=0" >> /etc/sysctl.conf
      sysctl -p /etc/sysctl.conf
      sed -i 's$/dev/mapper/centos-swap$#/dev/mapper/centos-swap$g' /etc/fstab
      
    • Set kernel parameters

      echo """
      net.bridge.bridge-nf-call-ip6tables=1
      net.bridge.bridge-nf-call-iptables=1
      net.ipv4.ip_forward=1
      """ >> /etc/sysctl.conf
      
      modprobe br_netfilter
      sysctl -p
      
    • Set the host time zone (do NOT run: time is synchronized through NTP)

      timedatectl set-timezone Asia/Shanghai
      date -s "2022-07-29 11:09:00"
      hwclock --set --date "2022-07-29 11:09:00"
      hwclock --hctosys
      hwclock -w
      init 6
      
    • Configure DNS

      echo "nameserver 114.114.114.114" > /etc/resolv.conf
      systemctl daemon-reload
      systemctl restart network
      
    • Configure hosts

      cat << EOF >> /etc/hosts
      192.168.100.32  server032  xxyf.rancher.com
      192.168.100.33  server033
      192.168.100.34  server034
      192.168.100.35  server035
      192.168.100.146 server146  xxyf.rancher.com
      192.168.100.147 server147
      EOF
      
    • Firewall configuration

      firewall-cmd --permanent --add-port 22/tcp
      firewall-cmd --permanent --add-port 80/tcp
      firewall-cmd --permanent --add-port 443/tcp
      firewall-cmd --permanent --add-port 2376/tcp
      firewall-cmd --permanent --add-port 2379/tcp
      firewall-cmd --permanent --add-port 2380/tcp
      firewall-cmd --permanent --add-port 6443/tcp
      firewall-cmd --permanent --add-port 6444/tcp
      firewall-cmd --permanent --add-port 8472/udp
      firewall-cmd --permanent --add-port 9099/tcp
      firewall-cmd --permanent --add-port 9345/tcp
      firewall-cmd --permanent --add-port 10249/tcp
      firewall-cmd --permanent --add-port 10250/tcp
      firewall-cmd --permanent --add-port 10254/tcp
      firewall-cmd --permanent --add-port 10256/tcp
      firewall-cmd --permanent --add-port 30935/tcp
      firewall-cmd --permanent --add-port 31477/tcp
      firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16 
      firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16 
      firewall-cmd --reload
      
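
    The hosts step in this section appends with cat << EOF >> /etc/hosts, so running it a second time duplicates every entry. A possible idempotent variant is sketched below; the helper name is hypothetical, and the demo writes to a temporary file instead of /etc/hosts.

```shell
#!/usr/bin/env bash
# Append "ip  name" to a hosts-style file only if that exact mapping
# is not already present. Arguments: file, IP address, host name.

add_host_entry() {
  local file=$1 ip=$2 name=$3
  # awk exits 0 when some line maps this IP to this name, 1 otherwise.
  if awk -v ip="$ip" -v name="$name" '
       $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
       END { exit !found }' "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s  %s\n' "$ip" "$name" >> "$file"
}

# Demo on a temporary file; the second call for server033 changes nothing.
hosts_file=$(mktemp)
add_host_entry "$hosts_file" 192.168.100.33 server033
add_host_entry "$hosts_file" 192.168.100.34 server034
add_host_entry "$hosts_file" 192.168.100.33 server033
cat "$hosts_file"
rm -f "$hosts_file"
```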

    7.2 Installing Docker

    • Copy the offline packages and their dependencies to any directory on the host and install them

      tar -zxvf docker-rpms.tar.gz
      rpm -ivh docker-rpms/*.rpm --nodeps --force
      
    • Start Docker and enable it at boot

      systemctl start docker
      systemctl enable docker
      
    • Check that Docker is running

      $ systemctl status docker
      
      ● docker.service - Docker Application Container Engine
         Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
         Active: active (running) since Fri 2022-06-24 01:49:42 EDT; 24min ago
           Docs: https://docs.docker.com
       Main PID: 9949 (dockerd)
         CGroup: /system.slice/docker.service
                 └─9949 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
      
    • Check the Docker version

      $ docker -v
      
      Docker version 20.10.17, build 100c701
      

    7.3 Private image registry configuration

    • Configure the private registry address for Docker

      mkdir -p /etc/docker
      
      tee /etc/docker/daemon.json << EOF
      {
        "insecure-registries": ["192.168.100.31:18888"]
      }
      EOF
      
    • Restart Docker to apply the change

      systemctl daemon-reload
      systemctl restart docker
      systemctl enable docker
      
    • Log in to the private registry

      $ docker login 192.168.100.31:18888
      
      Username: admin
      Password:
      WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
      Configure a credential helper to remove this warning. See
      https://docs.docker.com/engine/reference/commandline/login/#credentials-store
      
      Login Succeeded
      

    7.4 TLS certificate configuration

    • Create the directory

      mkdir -p /etc/kubernetes/ssl/certs
      
    • Copy the previously generated tls.crt certificate file to every cluster node

      cp tls.crt /etc/kubernetes/ssl/certs/
      

    7.5 Creating the custom cluster

    Open https://xxyf.rancher.com to reach the Rancher UI and create a custom cluster.

    7.6 Resolving the cattle-cluster-agent error

    • Error message

      INFO: Arguments: --server https://xxyf.rancher.com --token REDACTED --etcd --controlplane --worker
      INFO: Environment: CATTLE_ADDRESS=192.168.100.34 CATTLE_INTERNAL_ADDRESS= CATTLE_NODE_NAME=server034 CATTLE_ROLE=,etcd,worker,controlplane CATTLE_SERVER=https://xxyf.rancher.com CATTLE_TOKEN=REDACTED
      INFO: Using resolv.conf: nameserver 114.114.114.114
      INFO: https://xxyf.rancher.com/ping is accessible
      INFO: xxyf.rancher.com resolves to 192.168.100.32
      time="2022-07-27T23:34:26Z" level=info msg="Listening on /tmp/log.sock"
      time="2022-07-27T23:34:26Z" level=info msg="Rancher agent version 52a8de7b6-dirty is starting"
      time="2022-07-27T23:34:26Z" level=info msg="Option requestedHostname=server034"
      time="2022-07-27T23:34:26Z" level=info msg="Option customConfig=map[address:192.168.100.34 internalAddress: label:map[] roles:[etcd worker controlplane] taints:[]]"
      time="2022-07-27T23:34:26Z" level=info msg="Option etcd=true"
      time="2022-07-27T23:34:26Z" level=info msg="Option controlPlane=true"
      time="2022-07-27T23:34:26Z" level=info msg="Option worker=true"
      time="2022-07-27T23:34:26Z" level=info msg="Certificate details from https://xxyf.rancher.com"
      time="2022-07-27T23:34:26Z" level=info msg="Certificate #0 (https://xxyf.rancher.com)"
      time="2022-07-27T23:34:26Z" level=info msg="Subject: CN=xxyf.rancher.com,C=CN"
      time="2022-07-27T23:34:26Z" level=info msg="Issuer: CN=cattle-ca,C=CN"
      time="2022-07-27T23:34:26Z" level=info msg="IsCA: false"
      time="2022-07-27T23:34:26Z" level=info msg="DNS Names: [xxyf.rancher.com xxyf.rancher.com]"
      time="2022-07-27T23:34:26Z" level=info msg="IPAddresses: [192.168.100.32 192.168.100.33]"
      time="2022-07-27T23:34:26Z" level=info msg="NotBefore: 2022-07-26 04:52:29 +0000 UTC"
      time="2022-07-27T23:34:26Z" level=info msg="NotAfter: 2032-07-23 04:52:29 +0000 UTC"
      time="2022-07-27T23:34:26Z" level=info msg="SignatureAlgorithm: SHA256-RSA"
      time="2022-07-27T23:34:26Z" level=info msg="PublicKeyAlgorithm: RSA"
      time="2022-07-27T23:34:26Z" level=info msg="Certificate #1 (https://xxyf.rancher.com)"
      time="2022-07-27T23:34:26Z" level=info msg="Subject: CN=cattle-ca,C=CN"
      time="2022-07-27T23:34:26Z" level=info msg="Issuer: CN=cattle-ca,C=CN"
      time="2022-07-27T23:34:26Z" level=info msg="IsCA: true"
      time="2022-07-27T23:34:26Z" level=info msg="DNS Names: "
      time="2022-07-27T23:34:26Z" level=info msg="IPAddresses: "
      time="2022-07-27T23:34:26Z" level=info msg="NotBefore: 2022-07-26 04:52:29 +0000 UTC"
      time="2022-07-27T23:34:26Z" level=info msg="NotAfter: 2032-07-23 04:52:29 +0000 UTC"
      time="2022-07-27T23:34:26Z" level=info msg="SignatureAlgorithm: SHA256-RSA"
      time="2022-07-27T23:34:26Z" level=info msg="PublicKeyAlgorithm: RSA"
      time="2022-07-27T23:34:26Z" level=fatal msg="Certificate chain is not complete, please check if all needed intermediate certificates are included in the server certificate (in the correct order) and if the cacerts setting in Rancher either contains the correct CA certificate (in the case of using self signed certificates) or is empty (in the case of using a certificate signed by a recognized CA). Certificate information is displayed above. error: Get \"https://xxyf.rancher.com\": x509: certificate signed by unknown authority"
      
    • Add a host name resolution entry

    [Screenshot: adding the host name resolution entry]

    • Add a host directory mapping

    [Screenshot: adding the host directory mapping]


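    8. Installing Longhorn


    Longhorn provides a storage class for Rancher; container data can be persisted by creating a PVC against it.

    8.1 Installing iscsi on the host nodes

    • On a machine with network access, download iscsi and its dependency packages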

      repotrack -p iscsi iscsi-initiator-utils
      
    • Pack into a tar archive

      tar zcvf iscsi-rpms.tar.gz iscsi/
      
    • Copy the archive to every Rancher worker node and extract it

      tar -zxvf iscsi-rpms.tar.gz
      
    • Install the packages

      rpm -ivh iscsi/*.rpm --nodeps --force
      
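
    8.2 Installing from the app catalog

    [Screenshot: Longhorn in the app catalog]

    Installing with the default options is sufficient (it is recommended to create a project named Longhorn first and install into it)

    [Screenshot: Longhorn installation options]


    9. Nginx load balancer configuration


    • Pull the nginx image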

      docker pull 192.168.100.31:18888/library/nginx:latest
      
    • Create the nginx.conf configuration file

      vi /etc/nginx.conf
      
      worker_processes 4;
      worker_rlimit_nofile 40000;
      
      events {
      	worker_connections 8192;
      }
      
      stream {
      	upstream rancher_servers_http {
      		least_conn;
      		server 192.168.100.32:80 max_fails=3 fail_timeout=5s;
      		server 192.168.100.146:80 max_fails=3 fail_timeout=5s;
      	}
      	server {
      		listen 80;
      		proxy_pass rancher_servers_http;
      	}
      	upstream rancher_servers_https {
      		least_conn;
      		server 192.168.100.32:443 max_fails=3 fail_timeout=5s;
      		server 192.168.100.146:443 max_fails=3 fail_timeout=5s;
      	}
      	server {
      		listen 443;
      		proxy_pass rancher_servers_https;
      	}
      }
      
    • Start the container with the configuration file mounted

      docker run -d --privileged --restart=unless-stopped \
      -p 80:80 -p 443:443 \
      -v /etc/nginx.conf:/etc/nginx/nginx.conf \
      192.168.100.31:18888/library/nginx:latest
      
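
    The two upstream blocks in nginx.conf differ only in name, port and the list of Rancher Server addresses, so they can be generated when a server node is added or replaced. A minimal sketch with a hypothetical helper name; it prints a block in the same shape as the configuration above and leaves writing the file to the operator.

```shell
#!/usr/bin/env bash
# Print an nginx stream upstream block.
# Arguments: upstream name, backend port, then one or more backend IPs.

nginx_stream_upstream() {
  local name=$1 port=$2 ip
  shift 2
  printf 'upstream %s {\n' "$name"
  printf '    least_conn;\n'
  for ip in "$@"; do
    printf '    server %s:%s max_fails=3 fail_timeout=5s;\n' "$ip" "$port"
  done
  printf '}\n'
}

# Backends taken from the node plan (the two k3s server nodes).
RANCHER_SERVERS=(192.168.100.32 192.168.100.146)

nginx_stream_upstream rancher_servers_http 80 "${RANCHER_SERVERS[@]}"
nginx_stream_upstream rancher_servers_https 443 "${RANCHER_SERVERS[@]}"
```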

    10. Rancher node cleanup script


    
    #!/bin/bash
    
    KUBE_SVC='
    kubelet
    kube-scheduler
    kube-proxy
    kube-controller-manager
    kube-apiserver
    '
    
    for kube_svc in ${KUBE_SVC};
    do
      # Stop the service
      if [[ `systemctl is-active ${kube_svc}` == 'active' ]]; then
        systemctl stop ${kube_svc}
      fi
      # Disable the service at boot
      if [[ `systemctl is-enabled ${kube_svc}` == 'enabled' ]]; then
        systemctl disable ${kube_svc}
      fi
    done
    
    # Stop all containers
    docker stop $(docker ps -aq)
    
    # Remove all containers
    docker rm -f $(docker ps -qa)
    
    # 删除所有容器卷
    docker volume rm $(docker volume ls -q)
    
    # Unmount mounted directories
    for mount in $(mount | grep tmpfs | grep '/var/lib/kubelet' | awk '{ print $3 }') /var/lib/kubelet /var/lib/rancher;
    do
      umount $mount;
    done
    
    # Back up directories
    mv /etc/kubernetes /etc/kubernetes-bak-$(date +"%Y%m%d%H%M")
    mv /var/lib/etcd /var/lib/etcd-bak-$(date +"%Y%m%d%H%M")
    mv /var/lib/rancher /var/lib/rancher-bak-$(date +"%Y%m%d%H%M")
    mv /opt/rke /opt/rke-bak-$(date +"%Y%m%d%H%M")
    
    # Remove leftover paths
    rm -rf /etc/ceph \
        /etc/cni \
        /opt/cni \
        /run/secrets/kubernetes.io \
        /run/calico \
        /run/flannel \
        /var/lib/calico \
        /var/lib/cni \
        /var/lib/kubelet \
        /var/log/containers \
        /var/log/kube-audit \
        /var/log/pods \
        /var/run/calico \
        /usr/libexec/kubernetes
    
    # Clean up network interfaces
    no_del_net_inter='
    lo
    docker0
    eth
    ens
    bond
    '
    
    network_interface=`ls /sys/class/net`
    
    for net_inter in $network_interface;
    do
      if ! echo "${no_del_net_inter}" | grep -qE ${net_inter:0:3}; then
        ip link delete $net_inter
      fi
    done
    
    # Kill leftover processes
    port_list='
    80
    443
    6443
    2376
    2379
    2380
    8472
    9099
    10250
    10254
    '
    
    for port in $port_list;
    do
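      # Match the port only in the local-address column, so that 80 does not also hit 8080 or a PID
      pid=$(netstat -atlnup | awk -v port="$port" '$4 ~ ":" port "$" { for (i = 6; i <= NF; i++) if ($i ~ /^[0-9]+\//) { split($i, a, "/"); print a[1]; break } }' | sort -u)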
      if [[ -n $pid ]]; then
        kill -9 $pid
      fi
    done
    
    kube_pid=`ps -ef | grep -v grep | grep kube | awk '{print $2}'`
    
    if [[ -n $kube_pid ]]; then
      kill -9 $kube_pid
    fi
    
    # Flush the iptables tables
    ## Note: if the node has custom iptables rules, run the following commands with caution
    sudo iptables --flush
    sudo iptables --flush --table nat
    sudo iptables --flush --table filter
    sudo iptables --table nat --delete-chain
    sudo iptables --table filter --delete-chain
    systemctl restart docker
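
    Because `ip link delete` is destructive, it can help to preview which interfaces the cleanup loop would remove before running the script. The sketch below is an illustration under assumptions, not the original logic: it replaces the three-character `grep` test with explicit prefix patterns taken from the `no_del_net_inter` list, and the function names `is_protected_iface` and `list_deletable_ifaces` are hypothetical. Names such as `enp0s3` are not covered by that list and would need an extra pattern.

    ```shell
    #!/bin/bash
    # Hypothetical helper: succeed if the interface name matches a protected prefix.
    # The patterns mirror the no_del_net_inter list in the script above.
    is_protected_iface() {
      case "$1" in
        lo|docker0|eth*|ens*|bond*) return 0 ;;
        *) return 1 ;;
      esac
    }

    # Print the interfaces that would be deleted, without deleting anything.
    # Names are read from the arguments so the function can be tried on sample input.
    list_deletable_ifaces() {
      local name
      for name in "$@"; do
        is_protected_iface "$name" || echo "$name"
      done
    }

    # Usage on a live node (dry run):
    # list_deletable_ifaces $(ls /sys/class/net)
    ```

    If a physical interface shows up in the output, the protected list should be extended before the cleanup script is run on that node.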
    

    11. Error summary


    1. iptables error
    2022/07/29 03:56:17 [INFO] kontainerdriver rancherkubernetesengine stopped
    2022/07/29 03:56:17 [ERROR] error syncing 'c-q8krf': handler cluster-provisioner-controller: [Failed to start [rke-etcd-port-listener] container on host [192.168.100.34]: Error response from daemon: driver failed programming external connectivity on endpoint rke-etcd-port-listener (f50d03f943d38a0b5eb75c08481b16fdf3b7025a953f3242c005fc7461b05c64):  (COMMAND_FAILED: '/usr/sbin/iptables -w10 -t nat -A DOCKER -p tcp -d 0/0 --dport 2380 -j DNAT --to-destination 172.17.0.2:1337 ! -i docker0' failed: Another app is currently holding the xtables lock; still 1s 0us time ahead to have a chance to grab the lock...
    Another app is currently holding the xtables lock. Stopped waiting after 10s.
    )], requeuing
    

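    Fix: restart the firewall first, then restart docker. Restarting firewalld reloads the iptables rules and discards the chains Docker created, so docker has to be restarted afterwards to recreate them.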

    systemctl restart firewalld
    systemctl restart docker
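
    The two restarts can also be wrapped so that docker is only restarted once firewalld is confirmed to be active again. This is a minimal sketch under assumptions: the function name `restart_in_order` and the `SYSTEMCTL` override variable are hypothetical and exist only to allow a dry run.

    ```shell
    #!/bin/bash
    # Hypothetical helper: restart services in the given order and stop at the first failure.
    # SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) for a dry run.
    SYSTEMCTL="${SYSTEMCTL:-systemctl}"

    restart_in_order() {
      local svc
      for svc in "$@"; do
        "$SYSTEMCTL" restart "$svc" || { echo "restart failed: $svc" >&2; return 1; }
        # is-active exits non-zero when the unit did not come back up.
        "$SYSTEMCTL" is-active "$svc" >/dev/null || { echo "not active: $svc" >&2; return 1; }
      done
    }

    # Usage: firewalld first, then docker.
    # restart_in_order firewalld docker
    ```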
    
    2. Certificate error
    2022-07-29 04:04:08.025220 I | embed: rejected connection from "192.168.100.34:42768" (error "tls: failed to verify client's certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kube-ca\")", ServerName "")
    

    Fix: copy the certificate to all nodes, including the downstream cluster nodes; see the steps in section 7.4.
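
    After copying, it is worth confirming that every node holds byte-identical certificate files, since a single stale copy is enough to trigger the "unknown authority" rejection. The following is a minimal sketch, assuming the files have first been collected onto one machine (for example with scp); the function name `same_file_digest` and the example paths are hypothetical and do not come from section 7.4.

    ```shell
    #!/bin/bash
    # Hypothetical helper: succeed only if all given files have the same SHA-256 digest.
    # Returns 2 for an unreadable file and 1 for the first file that differs.
    same_file_digest() {
      local first="" digest f
      for f in "$@"; do
        [ -r "$f" ] || { echo "unreadable: $f" >&2; return 2; }
        digest=$(sha256sum "$f" | awk '{ print $1 }')
        # The first file sets the reference digest.
        [ -n "$first" ] || first="$digest"
        [ "$digest" = "$first" ] || { echo "mismatch: $f" >&2; return 1; }
      done
    }

    # Usage (paths are placeholders, not the real certificate locations):
    # same_file_digest /tmp/cert-from-node32.pem /tmp/cert-from-node33.pem
    ```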

  • Original article: https://blog.csdn.net/qq12547345/article/details/126121601