| Role | IP |
| --- | --- |
| master-140 | 192.168.100.140 |
| node-141 | 192.168.100.141 |
| node-142 | 192.168.100.142 |
- 1. Disable the firewall:
- systemctl stop firewalld
- systemctl disable firewalld
-
- 2. Disable SELinux:
- sed -i 's/enforcing/disabled/' /etc/selinux/config  # permanent, takes effect after a reboot
- setenforce 0  # temporary
-
- 3. Disable swap:
- swapoff -a  # temporary
- vim /etc/fstab  # permanent: comment out the swap line
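- Instead of editing /etc/fstab by hand, a sed one-liner can comment out the swap entry (a sketch, assuming the entry is the only line containing the word "swap"):
- sed -ri '/swap/s/^/#/' /etc/fstab  # prefix every line mentioning swap with #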
-
- 4. Set each hostname according to the plan
- Add hosts entries on the master:
- cat >> /etc/hosts << EOF
- 192.168.100.140 master-140
- 192.168.100.141 node-141
- 192.168.100.142 node-142
- EOF
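- The block above only maps names to IPs; the hostnames themselves can be set with hostnamectl, running the matching command on the matching machine:
- hostnamectl set-hostname master-140   # on 192.168.100.140
- hostnamectl set-hostname node-141     # on 192.168.100.141
- hostnamectl set-hostname node-142     # on 192.168.100.142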
-
- 5. Adjust the Linux kernel parameters to add bridge filtering and IP forwarding
- cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
- overlay
- br_netfilter
- EOF
-
- sudo modprobe overlay
- sudo modprobe br_netfilter
-
- cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
- net.bridge.bridge-nf-call-iptables = 1
- net.bridge.bridge-nf-call-ip6tables = 1
- net.ipv4.ip_forward = 1
- EOF
-
- sudo sysctl --system
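- The loaded modules and the sysctl values can be verified afterwards; both modules should show up in lsmod and all three keys should report 1:
- lsmod | grep -E 'overlay|br_netfilter'
- sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward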
-
- 6. Time synchronization:
- yum install ntpdate -y
- ntpdate cn.pool.ntp.org
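- ntpdate performs a one-off sync; to keep the clocks aligned, a periodic cron entry can be appended (a sketch, assuming root's crontab and the default ntpdate path):
- (crontab -l 2>/dev/null; echo '*/30 * * * * /usr/sbin/ntpdate cn.pool.ntp.org >/dev/null 2>&1') | crontab -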
- 1. Use containerd as the container runtime; download the containerd release bundle
-
- # wget https://github.com/containerd/containerd/releases/download/v1.6.6/cri-containerd-cni-1.6.6-linux-amd64.tar.gz
-
- The extraction directory must be specified as /, because the archive already contains the full directory layout.
- # tar zxvf cri-containerd-cni-1.6.6-linux-amd64.tar.gz -C /
-
- 2. Create the containerd configuration directory
- # mkdir /etc/containerd
-
- 3. Generate the default containerd configuration file
- # containerd config default > /etc/containerd/config.toml
-
- 4. Configure the systemd cgroup driver (SystemdCgroup)
- # vim /etc/containerd/config.toml
-
- [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
- ...
- [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
- SystemdCgroup = true
-
- 5. Change the sandbox (pause) image address
- # vim /etc/containerd/config.toml
-
- [plugins."io.containerd.grpc.v1.cri"]
- sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"
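- The edits in steps 4 and 5 can also be applied non-interactively with sed (a sketch, assuming the generated config still contains the default SystemdCgroup = false and the default sandbox_image line):
- sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
- sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"#' /etc/containerd/config.toml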
-
- 6. Update runc, because the runc binary bundled in cri-containerd-cni-1.6.6-linux-amd64.tar.gz is broken (explained at the end). This step is critical ✰✰✰
- # wget https://github.com/opencontainers/runc/releases/download/v1.1.3/runc.amd64
- # mv runc.amd64 /usr/local/sbin/runc
- mv: overwrite "/usr/local/sbin/runc"? y
- # chmod +x /usr/local/sbin/runc
-
- 7. Start the containerd service and enable it on boot
- # systemctl start containerd
- # systemctl enable containerd
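- A quick check that the runtime is up (crictl is shipped in the cri-containerd-cni bundle; the socket path is containerd's default):
- # systemctl status containerd --no-pager
- # crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock version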
- 1. Add the Aliyun Kubernetes YUM repository
- # cat >/etc/yum.repos.d/kubernetes.repo << EOF
- [kubernetes]
- name=Kubernetes
- baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
- enabled=1
- gpgcheck=0
- repo_gpgcheck=0
- gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
- EOF
-
- 2. Install the packages at a pinned version
- # yum install kubelet-1.24.2 kubeadm-1.24.2 kubectl-1.24.2
-
- 3. Configure the kubelet cgroup driver
- # vim /etc/sysconfig/kubelet   # add the line below
- KUBELET_CGROUP_ARGS="--cgroup-driver=systemd"
-
- 4. Enable kubelet to start on boot
- # systemctl enable kubelet
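- Optionally, the control-plane images can be pre-pulled on the master before running kubeadm init, which speeds up init and surfaces registry problems early:
- # kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=1.24.2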
- [Run this step on the master node only]
- # kubeadm init \
- --apiserver-advertise-address=192.168.100.140 \
- --image-repository=registry.aliyuncs.com/google_containers \
- --kubernetes-version=1.24.2 \
- --pod-network-cidr=10.244.0.0/16 \
- --service-cidr=10.96.0.0/12
- [Log output below]
- ......
- [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
- [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
- [addons] Applied essential addon: CoreDNS
- [addons] Applied essential addon: kube-proxy
-
- Your Kubernetes control-plane has initialized successfully!
-
- To start using your cluster, you need to run the following as a regular user:
-
- mkdir -p $HOME/.kube
- sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
- sudo chown $(id -u):$(id -g) $HOME/.kube/config
-
- Alternatively, if you are the root user, you can run:
-
- export KUBECONFIG=/etc/kubernetes/admin.conf
-
- You should now deploy a pod network to the cluster.
- Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
- https://kubernetes.io/docs/concepts/cluster-administration/addons/
-
- Then you can join any number of worker nodes by running the following on each as root:
-
- kubeadm join 192.168.100.140:6443 --token dirta5.mvlho7gqshh9hw6o \
- --discovery-token-ca-cert-hash sha256:fc2e5cf3feebbdf8fec37ca9ce7656431414ebf816f217b7d1c076dd89e9dadd
Follow the instructions from the output above
- # mkdir -p $HOME/.kube
- # sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
- # sudo chown $(id -u):$(id -g) $HOME/.kube/config
-
- Check the nodes
- # kubectl get nodes
- NAME STATUS ROLES AGE VERSION
- master-140 Ready control-plane 2m16s v1.24.2
- Run this on the worker nodes. To add a new node to the cluster, execute the kubeadm join command printed by kubeadm init.
- # kubeadm join 192.168.100.140:6443 --token dirta5.mvlho7gqshh9hw6o --discovery-token-ca-cert-hash sha256:fc2e5cf3feebbdf8fec37ca9ce7656431414ebf816f217b7d1c076dd89e9dadd
-
- Check the nodes (above I only ran the join on node-141)
- kubectl get nodes
- NAME STATUS ROLES AGE VERSION
- master-140 Ready control-plane 2m16s v1.24.2
- node-141 Ready <none> 54s v1.24.2
The token is valid for 24 hours by default; once it expires, a new one must be created:
- 1. List the tokens
- # kubeadm token list
- TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
- dirta5.mvlho7gqshh9hw6o 23h 2022-06-27T05:01:40Z authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
-
- 2. Create a new token (printing the full join command)
- # kubeadm token create --print-join-command
- kubeadm join 192.168.100.140:6443 --token 81zsrm.jvjhbg0mwlsdzdb7 --discovery-token-ca-cert-hash sha256:fc2e5cf3feebbdf8fec37ca9ce7656431414ebf816f217b7d1c076dd89e9dadd
-
- 3. List the tokens again
- # kubeadm token list
- TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
- 81zsrm.jvjhbg0mwlsdzdb7 23h 2022-06-27T05:11:20Z authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
- dirta5.mvlho7gqshh9hw6o 23h 2022-06-27T05:01:40Z authentication,signing The default bootstrap token generated by 'kubeadm init'. system:bootstrappers:kubeadm:default-node-token
-
- 4. Look up the discovery-token-ca-cert-hash
- # openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
- fc2e5cf3feebbdf8fec37ca9ce7656431414ebf816f217b7d1c076dd89e9dadd
Join node-142 to the cluster with the newly created token
- [Run this on node-142]
- # kubeadm join 192.168.100.140:6443 --token 81zsrm.jvjhbg0mwlsdzdb7 --discovery-token-ca-cert-hash sha256:fc2e5cf3feebbdf8fec37ca9ce7656431414ebf816f217b7d1c076dd89e9dadd
-
- [Check the nodes on the master]
- # kubectl get nodes
- NAME STATUS ROLES AGE VERSION
- master-140 Ready control-plane 16m v1.24.2
- node-141 Ready <none> 15m v1.24.2
- node-142 Ready <none> 35s v1.24.2
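- The worker nodes show ROLES as <none>; this is purely cosmetic, but a role label can be added if you want it displayed (the label key is just a convention):
- # kubectl label node node-141 node-role.kubernetes.io/worker=
- # kubectl label node node-142 node-role.kubernetes.io/worker=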
To enable pod-to-pod communication across hosts, Calico is used as the CNI network plugin.
Reference: Quickstart for Calico on Kubernetes
- 1. Check the pods in the kube-system namespace
- # kubectl get pods -n kube-system
- NAME READY STATUS RESTARTS AGE
- coredns-74586cf9b6-5bfk7 0/1 ContainerCreating 0 22m
- coredns-74586cf9b6-d29mj 0/1 ContainerCreating 0 22m
- ...
-
- The two coredns pods are stuck because no CNI network plugin has been deployed yet.
-
- 2. Download the Calico YAML manifests
- # wget https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml
- # wget https://projectcalico.docs.tigera.io/manifests/custom-resources.yaml
-
- 3. Edit custom-resources.yaml
-
- ipPools:
- - blockSize: 26
- cidr: 10.244.0.0/16 # change this to the pod-network-cidr range used at kubeadm init
- encapsulation: VXLANCrossSubnet
- natOutgoing: Enabled
- nodeSelector: all()
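- If you prefer not to edit the file by hand, the CIDR can be swapped with sed (a sketch, assuming the downloaded manifest still carries Calico's default 192.168.0.0/16):
- # sed -i 's#cidr: 192.168.0.0/16#cidr: 10.244.0.0/16#' custom-resources.yaml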
-
- 4. Install Calico
- # kubectl apply -f tigera-operator.yaml
- # kubectl apply -f custom-resources.yaml
-
- 5. Check the Calico pods
- # kubectl get pods -n calico-system
- NAME READY STATUS RESTARTS AGE
- calico-kube-controllers-86dff98c45-jjflf 1/1 Running 0 2m20s
- calico-node-27zbg 1/1 Running 0 2m20s
- calico-node-kjphd 1/1 Running 0 2m20s
- calico-node-ntw22 1/1 Running 0 2m20s
- calico-typha-6c8778fdb7-bbpnh 1/1 Running 0 2m20s
- calico-typha-6c8778fdb7-lpmdl 1/1 Running 0 2m11s
-
- 6. Check whether coredns is healthy now
- kubectl get pods -n kube-system
- NAME READY STATUS RESTARTS AGE
- coredns-74586cf9b6-5bfk7 1/1 Running 0 28m
- coredns-74586cf9b6-d29mj 1/1 Running 0 28m
- ...
- Both coredns pods are now Running.
- 1. Create a deployment
- # kubectl create deployment deploy-nginx --image=nginx:1.18
-
- 2. The deployment starts with a single pod by default; scale it to 3 replicas
- # kubectl scale deployment deploy-nginx --replicas=3
-
- 3. Expose the deployment as a NodePort service
- # kubectl expose deployment deploy-nginx --name=svc-nginx --port=8081 --target-port=80 --type=NodePort
-
- 4. Check the deployment, pods, and service
- # kubectl get deployment,pods,svc -o wide
- NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
- deployment.apps/deploy-nginx 3/3 3 3 5m5s nginx nginx:1.18 app=deploy-nginx
-
- NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
- pod/deploy-nginx-74565bf758-8dsp7 1/1 Running 0 5m5s 10.244.65.194 node-141 <none> <none>
- pod/deploy-nginx-74565bf758-9kc74 1/1 Running 0 4m12s 10.244.56.3 node-142 <none> <none>
- pod/deploy-nginx-74565bf758-j7gs9 1/1 Running 0 4m12s 10.244.56.4 node-142 <none> <none>
-
- NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
- service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 37m <none>
- service/svc-nginx NodePort 10.101.189.51 <none> 8081:31379/TCP 4s app=deploy-nginx
-
- 5. Access the pod addresses and the service address
- --- pod addresses
- # curl -I 10.244.65.194
- # curl -I 10.244.56.3
- # curl -I 10.244.56.4
- HTTP/1.1 200 OK
- Server: nginx/1.18.0
- Date: Sun, 26 Jun 2022 05:41:47 GMT
- Content-Type: text/html
- Content-Length: 612
- Last-Modified: Tue, 21 Apr 2020 14:09:01 GMT
- Connection: keep-alive
- ETag: "5e9efe7d-264"
- Accept-Ranges: bytes
- --- service address
- # curl -I 10.101.189.51:8081
- HTTP/1.1 200 OK
- Server: nginx/1.18.0
- Date: Sun, 26 Jun 2022 05:42:36 GMT
- Content-Type: text/html
- Content-Length: 612
- Last-Modified: Tue, 21 Apr 2020 14:09:01 GMT
- Connection: keep-alive
- ETag: "5e9efe7d-264"
- Accept-Ranges: bytes
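- Because the service type is NodePort (31379 in the output above), it is also reachable from outside the cluster on any node's IP:
- # curl -I 192.168.100.141:31379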
In Kubernetes a Service has two proxy modes, one based on iptables and one based on ipvs. Of the two, ipvs performs noticeably better, but using it requires loading the ipvs kernel modules manually.
- 1. Install ipset and ipvsadm
- # yum install ipset ipvsadm -y
-
- 2. Write the modules that need to be loaded into a script file
- # cat <<EOF > /etc/sysconfig/modules/ipvs.modules
- #!/bin/bash
- modprobe -- ip_vs
- modprobe -- ip_vs_rr
- modprobe -- ip_vs_wrr
- modprobe -- ip_vs_sh
- modprobe -- nf_conntrack_ipv4
- EOF
-
- 3. Make the script executable
- # chmod +x /etc/sysconfig/modules/ipvs.modules
-
- 4. Run the script
- # /bin/bash /etc/sysconfig/modules/ipvs.modules
-
- 5. Verify that the modules loaded successfully
- # lsmod | grep -e ip_vs -e nf_conntrack_ipv4
Change the kube-proxy mode
- 1. Run this on the master node
- # kubectl edit cm kube-proxy -n kube-system
- ...
- kind: KubeProxyConfiguration
- metricsBindAddress: ""
- mode: "ipvs" # change this to ipvs; the default is empty
- nodePortAddresses: null
-
- ...
-
- 2. Check the current kube-proxy pods
- # kubectl get pods -n kube-system
- NAME READY STATUS RESTARTS AGE
- coredns-74586cf9b6-5bfk7 1/1 Running 0 75m
- coredns-74586cf9b6-d29mj 1/1 Running 0 75m
- etcd-master-140 1/1 Running 0 76m
- kube-apiserver-master-140 1/1 Running 0 76m
- kube-controller-manager-master-140 1/1 Running 0 76m
- kube-proxy-f7rcx 1/1 Running 0 74m
- kube-proxy-ggchx 1/1 Running 0 60m
- kube-proxy-hbt94 1/1 Running 0 75m
- kube-scheduler-master-140 1/1 Running 0 76m
-
- 3. Delete the current kube-proxy pods (the DaemonSet recreates them with the new mode)
- # kubectl delete pod kube-proxy-f7rcx kube-proxy-ggchx kube-proxy-hbt94 -n kube-system
- pod "kube-proxy-f7rcx" deleted
- pod "kube-proxy-ggchx" deleted
- pod "kube-proxy-hbt94" deleted
-
- 4. Check the automatically recreated kube-proxy pods
- # kubectl get pods -n kube-system
- NAME READY STATUS RESTARTS AGE
- coredns-74586cf9b6-5bfk7 1/1 Running 0 77m
- coredns-74586cf9b6-d29mj 1/1 Running 0 77m
- etcd-master-140 1/1 Running 0 78m
- kube-apiserver-master-140 1/1 Running 0 78m
- kube-controller-manager-master-140 1/1 Running 0 78m
- kube-proxy-7859q 1/1 Running 0 44s
- kube-proxy-l4gqx 1/1 Running 0 43s
- kube-proxy-nnjr2 1/1 Running 0 43s
- kube-scheduler-master-140 1/1 Running 0 78m
Verification:
- 1. Check the service created earlier
- # kubectl get svc -o wide
- NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
- kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 80m <none>
- svc-nginx NodePort 10.101.189.51 <none> 8081:31379/TCP 42m app=deploy-nginx
-
- 2. Send a request
- # curl -I 10.101.189.51:8081
- HTTP/1.1 200 OK
- Server: nginx/1.18.0
- Date: Sun, 26 Jun 2022 06:22:14 GMT
- Content-Type: text/html
- Content-Length: 612
- Last-Modified: Tue, 21 Apr 2020 14:09:01 GMT
- Connection: keep-alive
- ETag: "5e9efe7d-264"
- Accept-Ranges: bytes
-
- 3. Check the ipvs rules
- # ipvsadm -ln
- IP Virtual Server version 1.2.1 (size=4096)
- Prot LocalAddress:Port Scheduler Flags
- -> RemoteAddress:Port Forward Weight ActiveConn InActConn
- ...
- TCP 10.101.148.59:443 rr
- -> 10.244.56.2:5443 Masq 1 0 0
- -> 10.244.65.193:5443 Masq 1 0 0
- (the entry below is the ipvs rule set for the svc-nginx service)
- TCP 10.101.189.51:8081 rr
- -> 10.244.56.3:80 Masq 1 0 0
- -> 10.244.56.4:80 Masq 1 0 0
- -> 10.244.65.194:80 Masq 1 0 1
- TCP 10.103.59.95:9094 rr
- -> 10.244.56.1:9094 Masq 1 0 0
- ...
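- The proxy mode can also be confirmed from a kube-proxy pod's log (substitute one of your own pod names from kubectl get pods -n kube-system):
- # kubectl logs -n kube-system kube-proxy-7859q | grep -i ipvs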
The following error came up during cluster initialization. It is caused by the broken runc binary inside the containerd release bundle; downloading a fresh runc from the official releases and replacing it solved the problem. Don't ask why, I don't know either.
- [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
- [kubelet-check] Initial timeout of 40s passed.
-
- Unfortunately, an error has occurred:
- timed out waiting for the condition
-
- This error is likely caused by:
- - The kubelet is not running
- - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
-
- If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- - 'systemctl status kubelet'
- - 'journalctl -xeu kubelet'
-
- Additionally, a control plane component may have crashed or exited when started by the container runtime.
- To troubleshoot, list all containers using your preferred container runtimes CLI.
- Here is one example how you may list all running Kubernetes containers by using crictl:
- - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
- Once you have found the failing container, you can inspect its logs with:
- - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
- error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
- To see the stack trace of this error execute with --v=5 or higher
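- After replacing runc as described in the containerd section, clean up the partially initialized control plane before retrying kubeadm init:
- # runc --version   # should now report v1.1.3
- # kubeadm reset -f  # wipe the failed attempt, then rerun the kubeadm init command above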
If anything above is wrong, corrections are welcome. Thanks!