Practice environment
CentOS-7-x86_64-DVD-1810
Docker 19.03.9
Kubernetes version: v1.20.5
Before starting
One or more Linux machines running a distribution that supports deb or rpm packages
Ensure that each machine has at least 2 GB of memory
Ensure that the machine acting as the control plane node has at least 2 CPU cores
Ensure full network connectivity between all machines in the cluster
Target
- Install a single control plane Kubernetes cluster
- Install a Pod network on the cluster so that Pods can communicate with each other
Installation guide
Install Docker
Installation process
Note that when installing Docker, you need to check the versions supported by Kubernetes (see below). If the installed Docker version is too new, the following warning will be printed:
WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03
Specify the version when installing docker
sudo yum install docker-ce-19.03.9 docker-ce-cli-19.03.9 containerd.io
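If you are not sure which Docker versions the repository provides, you can list them first (this assumes the docker-ce yum repository has already been added to the machine):
# yum list docker-ce --showduplicates | sort -r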
If Docker is not installed, the following errors will be reported when running kubeadm init:
cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH
[preflight] WARNING: Couldn't create the interface used for talking to the container runtime: docker is required for container runtime: exec: "docker": executable file not found in $PATH
Installing kubeadm
If kubeadm is not installed, install it first. If it is already installed, you can update to the latest version of kubeadm with apt-get update && apt-get upgrade or yum update.
Note: during the kubeadm upgrade, kubelet will restart every few seconds; this is normal.
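For reference, a typical CentOS 7 installation of kubeadm, kubelet and kubectl looks roughly like the following. This is only a sketch assuming the Aliyun Kubernetes yum mirror is used; adjust the repository and versions to your own environment:
# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
# yum install -y kubelet-1.20.5 kubeadm-1.20.5 kubectl-1.20.5
# systemctl enable --now kubelet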
Other preparatory operations
Turn off firewall
# systemctl stop firewalld && systemctl disable firewalld
Run the above command to stop and disable the firewall; otherwise the following warning will be reported when running kubeadm init:
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
Modify the /etc/docker/daemon.json file
Edit the /etc/docker/daemon.json file and add the following:
{ "exec-opts":["native.cgroupdriver=systemd"] }
Then execute the systemctl restart docker command to restart docker
If you do not perform the above operation, the following warning will be reported when running kubeadm init:
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
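To confirm the change took effect after restarting Docker, you can query the cgroup driver directly (the same check kubeadm runs, as the warning above shows); it should now print systemd:
# docker info -f '{{.CgroupDriver}}'
systemd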
Install dependent software packages such as socat and conntrack
# yum install socat conntrack-tools
If the above dependency packages are not installed, the following errors will be reported when running kubeadm init:
[WARNING FileExisting-socat]: socat not found in system path
error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR FileExisting-conntrack]: conntrack not found in system path
Set net.ipv4.ip_forward to 1
Set the value of net.ipv4.ip_forward to 1, as follows:
# sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1
Description: if net.ipv4.ip_forward is 0, forwarding of packets is prohibited; if it is 1, forwarding of packets is allowed. If the value of net.ipv4.ip_forward is not 1, the following error will be reported when running kubeadm init:
[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
The above configuration only takes effect temporarily. To keep it from being lost after the machine restarts, also do the following:
# echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
Note: the following method is often recommended online for permanent configuration, but in the author's practice it did not work:
# echo "sysctl -w net.ipv4.ip_forward=1" >> /etc/rc.local
# chmod +x /etc/rc.d/rc.local
Set the value of net.bridge.bridge-nf-call-iptables to 1
For the procedure, refer to the net.ipv4.ip_forward setting above (see the sketch below).
Note: all of the above operations must be performed on every node in the cluster.
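A sketch of the commands, following the same pattern as net.ipv4.ip_forward; note that the br_netfilter kernel module must be loaded first, otherwise the sysctl key does not exist:
# modprobe br_netfilter
# sysctl -w net.bridge.bridge-nf-call-iptables=1
# echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf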
Initialize the control plane node
The machine that runs the control plane components, including etcd (the cluster database) and the API server (which the kubectl command-line tool talks to), is called the control plane node.
1. (Recommended) If you plan to upgrade a single control plane kubeadm cluster to high availability later, specify the --control-plane-endpoint option to kubeadm init to set a shared endpoint for all control plane nodes. The endpoint can be a DNS name or the IP address of a load balancer.
2. Choose a network plugin and check whether it requires options to be passed to kubeadm init; this depends on the plugin. For example, if you use flannel, you must pass the --pod-network-cidr option to kubeadm init.
3. (Optional) Since version 1.14, kubeadm automatically detects the container runtime. If you need a different container runtime, or if more than one is installed, pass the --cri-socket option to kubeadm init.
4. (Optional) Unless specified otherwise, kubeadm uses the network interface associated with the default gateway to set the advertise address of this control plane node's API server. To use another network interface, pass --apiserver-advertise-address=<ip-address> to kubeadm init. When deploying an IPv6 Kubernetes cluster, pass --apiserver-advertise-address with an IPv6 address, for example --apiserver-advertise-address=fd00::101.
5. (Optional) Before running kubeadm init, run kubeadm config images pull to confirm that you can reach the gcr.io container image registry (see the example below).
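For example, to pre-pull from the Aliyun mirror used later in this article instead of gcr.io (a sketch; the flags mirror the kubeadm init call below):
# kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.20.5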
Now run kubeadm init with options, as shown below, to initialize the control plane node. The command performs a series of preflight checks to ensure that the machine meets the requirements for running Kubernetes. If the preflight checks find errors, the program exits; otherwise it continues to download and install the cluster control plane components. This may take several minutes.
# kubeadm init --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version stable --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.20.5
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost.localdomain] and IPs [10.96.0.1 10.118.80.93]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost localhost.localdomain] and IPs [10.118.80.93 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost localhost.localdomain] and IPs [10.118.80.93 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 89.062309 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node localhost.localdomain as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node localhost.localdomain as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 1sh85v.surdstc5dbrmp1s2
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.118.80.93:6443 --token ap4vvq.8xxcc0uea7dxbjlo \
    --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a
As shown above, the command prints "Your Kubernetes control-plane has initialized successfully!" along with other prompts, telling us that the control plane node has been initialized successfully.
Note:
1. If you do not use the --image-repository option to specify the Alibaba Cloud image registry, the following error may be reported:
failed to pull image "k8s.gcr.io/kube-apiserver:v1.20.5": output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers) , error: exit status 1
2. Because the flannel network plugin is used, the --pod-network-cidr option must be specified. Otherwise the coredns-xxxxxxxxxx-xxxxx Pods cannot start and stay in the ContainerCreating state; inspecting the Pod details shows the following error message:
networkPlugin cni failed to set up pod "coredns-7f89b7bc75-9vrrl_kube-system" network: open /run/flannel/subnet.env: no such file or directory
3. The --pod-network-cidr option value, i.e. the Pod network, must not be the same as the host network. Otherwise, after the flannel plugin is installed, duplicate routes appear and tools such as XShell can no longer ssh into the host. For example:
In this practice, the host network is 10.118.80.0/24 on interface ens33, so the following must not be used:
--pod-network-cidr=10.118.80.0/24
4. Also note that the --pod-network-cidr option value must match the net-conf.json.Network key in the kube-flannel.yml file (in this example the key value is 10.244.0.0/16, as shown below, so the --pod-network-cidr option is set to 10.244.0.0/16 when running kubeadm init).
# cat kube-flannel.yml | grep -E "^\s*\"Network"
      "Network": "10.244.0.0/16",
In the initial practice, --pod-network-cidr=10.1.15.0/24 was set and the Network key in kube-flannel.yml was not modified; nodes newly added to the cluster could not automatically obtain a Pod CIDR, as shown below:
# kubectl get pods --all-namespaces
NAMESPACE     NAME                    READY   STATUS             RESTARTS   AGE
kube-system   kube-flannel-ds-psts8   0/1     CrashLoopBackOff   62         15h
...(output omitted)
# kubectl -n kube-system logs kube-flannel-ds-psts8
...(output omitted)
E0325 01:03:08.190986       1 main.go:292] Error registering network: failed to acquire lease: node "k8snode1" pod cidr not assigned
W0325 01:03:08.192875       1 reflector.go:424] github.com/coreos/flannel/subnet/kube/kube.go:300: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
I0325 01:03:08.193782       1 main.go:371] Stopping shutdownHandler...
Later, the author tried changing the net-conf.json.Network key in kube-flannel.yml to 10.1.15.0/24 (download kube-flannel.yml first, modify the configuration, then install the network plugin), but the same error was still reported.
For the above node "xxxxxx" pod cidr not assigned problem, there is also a temporary workaround circulating online (not verified by the author): manually assign a podCIDR to the node with the following command:
kubectl patch node <NODE_NAME> -p '{"spec":{"podCIDR":"<SUBNET>"}}'
5. As instructed by the output, to allow a non-root user to run kubectl, execute the following commands:
# mkdir -p $HOME/.kube
# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, run the following command:
export KUBECONFIG=/etc/kubernetes/admin.conf
Record the kubeadm join command in the kubeadm init output; you will need it later to add nodes to the cluster.
The token is used for mutual authentication between the control plane node and the nodes joining the cluster. Keep it secure, because anyone who has the token can add authenticated nodes to the cluster. Tokens can be listed, created and deleted with the kubeadm token command; for details, refer to the kubeadm reference guide.
Install Pod network plug-in
A Container Network Interface (CNI) based Pod network add-on must be deployed so that Pods can communicate with each other. Cluster DNS (CoreDNS) will not start until a Pod network is installed.
- Note that the Pod network must not overlap with any host network; an overlap causes problems. If the network plugin's preferred Pod network conflicts with some host networks, choose an appropriate CIDR block, pass it via the --pod-network-cidr option when executing kubeadm init, and replace the network configuration in the network plugin's YAML accordingly.
- By default, kubeadm sets up the cluster to enforce RBAC (role-based access control). Ensure that the Pod network plugin and any manifests deployed with it support RBAC.
- If the cluster uses IPv6 networking, either dual-stack or single-stack IPv6, make sure the plugin supports IPv6; CNI v0.6.0 added IPv6 support.
Many projects provide Kubernetes networking via CNI, and some of them also support network policy. See the list of add-ons that implement the Kubernetes network model:
https://kubernetes.io/docs/co...
A Pod network plugin can be installed by executing the following command on the control plane node machine, or on a node machine that has the kubeconfig credentials. The plugin is installed as a DaemonSet, and its configuration file is written to the /etc/cni/net.d directory:
kubectl apply -f <add-on.yaml>
flannel network plug-in installation
Manually deploy flannel (Kubernetes v1.17+):
# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
Reference link: https://github.com/flannel-io...
Only one Pod network can be installed per cluster. After the Pod network is installed, you can check whether the network is working by running kubectl get pods --all-namespaces and verifying that the coredns-xxxxxxxxxx-xxxxx Pods in the output are Running (see the example below).
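For example, the CoreDNS Pods can be checked directly; in a standard kubeadm deployment they carry the k8s-app=kube-dns label (a quick check added here, not from the original article):
# kubectl -n kube-system get pods -l k8s-app=kube-dns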
View flannel subnet environment configuration information
# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
After the flannel network plug-in is installed, two virtual network cards will be automatically added to the host: cni0 and flannel.1
# ifconfig -a
cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.244.0.1  netmask 255.255.255.0  broadcast 10.244.0.255
        inet6 fe80::705d:43ff:fed6:80c9  prefixlen 64  scopeid 0x20<link>
        ether 72:5d:43:d6:80:c9  txqueuelen 1000  (Ethernet)
        RX packets 312325  bytes 37811297 (36.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 356346  bytes 206539626 (196.9 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:e1ff:fec3:8b6a  prefixlen 64  scopeid 0x20<link>
        ether 02:42:e1:c3:8b:6a  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3  bytes 266 (266.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.118.80.93  netmask 255.255.255.0  broadcast 10.118.80.255
        inet6 fe80::6ff9:dbee:6b27:1315  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:d3:3b:ef  txqueuelen 1000  (Ethernet)
        RX packets 2092903  bytes 1103282695 (1.0 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 969483  bytes 253273828 (241.5 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.244.0.0  netmask 255.255.255.255  broadcast 10.244.0.0
        inet6 fe80::a49a:2ff:fe38:3e4b  prefixlen 64  scopeid 0x20<link>
        ether a6:9a:02:38:3e:4b  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 30393748  bytes 5921348235 (5.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 30393748  bytes 5921348235 (5.5 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
Reinitialize the control plane node
In practice, the misconfigured options were only discovered after the network plugin had been installed, so kubeadm init had to be executed again. The specific operations were as follows:
# kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
[reset] Removing info for node "localhost.localdomain" from the ConfigMap "kubeadm-config" in the "kube-system" Namespace
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
# rm -rf /etc/cni/net.d
# rm -f $HOME/.kube/config
After executing the above commands, re-initialize the control plane node and reinstall the network plugin.
Summary of problems encountered
After re-executing the kubeadm init command, run kubectl get pods --all-namespaces to view the Pod status. The coredns-xxxxxxxxxx-xxxxx Pods were found to be stuck in ContainerCreating, as shown below:
# kubectl get pods --all-namespaces
NAMESPACE     NAME                         READY   STATUS              RESTARTS   AGE
kube-system   coredns-7f89b7bc75-pxvdx     0/1     ContainerCreating   0          8m33s
kube-system   coredns-7f89b7bc75-v4p57     0/1     ContainerCreating   0          8m33s
kube-system   etcd-localhost.localdomain   1/1     Running             0          8m49s
...(output omitted)
Run kubectl describe pod coredns-7f89b7bc75-pxvdx -n kube-system to view the details of the Pod. The following error is found:
Warning FailedCreatePodSandBox 98s (x4 over 103s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "04434c63cdf067e698a8a927ba18e5013d2a1a21afa642b3cddedd4ff4592178" network for pod "coredns-7f89b7bc75-pxvdx": networkPlugin cni failed to set up pod "coredns-7f89b7bc75-pxvdx_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.1.15.1/24
Checking the network interfaces, as shown below, reveals that cni0 still holds the IP address assigned by the network plugin last time, which prevents the plugin from setting a new IP on it this time.
# ifconfig -a
cni0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 10.118.80.1  netmask 255.255.255.0  broadcast 10.118.80.255
        inet6 fe80::482d:65ff:fea6:32fd  prefixlen 64  scopeid 0x20<link>
        ether 4a:2d:65:a6:32:fd  txqueuelen 1000  (Ethernet)
        RX packets 267800  bytes 16035849 (15.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 116238  bytes 10285959 (9.8 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
...(output omitted)
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.1.15.0  netmask 255.255.255.255  broadcast 10.1.15.0
        inet6 fe80::a49a:2ff:fe38:3e4b  prefixlen 64  scopeid 0x20<link>
        ether a6:9a:02:38:3e:4b  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8  overruns 0  carrier 0  collisions 0
...(output omitted)
The solution: delete the misconfigured cni0 interface. After it is deleted, it will be recreated automatically with the correct configuration:
$ sudo ifconfig cni0 down
$ sudo ip link delete cni0
Control plane node isolation (optional)
By default, for security reasons, the cluster does not schedule Pods on the control plane node. If you want Pods to be scheduled on the control plane node, for example in a single-machine Kubernetes cluster for development, run the following command:
kubectl taint nodes --all node-role.kubernetes.io/master-   # remove the node-role.kubernetes.io/master taint from any node that has it
Practice is as follows
# kubectl get nodes
NAME                    STATUS   ROLES                  AGE   VERSION
localhost.localdomain   Ready    control-plane,master   63m   v1.20.5
# kubectl taint nodes --all node-role.kubernetes.io/master-
node/localhost.localdomain untainted
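If you later want to stop scheduling ordinary Pods on the control plane node again, the taint can be restored; a sketch using the node name from this practice:
# kubectl taint nodes localhost.localdomain node-role.kubernetes.io/master=:NoSchedule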
Add node to cluster
Modify the hostname of the new node
# hostname
localhost.localdomain
# hostname k8sNode1
Changing the hostname with the above command is only temporary. To keep it across reboots, edit the /etc/hostname file and replace the default localhost.localdomain with the target name (k8sNode1 in this example). If the hostname is not set up properly, you will encounter the following errors in subsequent operations:
[WARNING Hostname]: hostname "k8sNode1" could not be reached
[WARNING Hostname]: hostname "k8sNode1": lookup k8sNode1 on 223.5.5.5:53: read udp 10.118.80.94:33293->223.5.5.5:53: i/o timeout
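On CentOS 7 the hostname can also be changed persistently in one step with hostnamectl, which writes /etc/hostname for you (an alternative to editing the file manually):
# hostnamectl set-hostname k8sNode1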
Modify the /etc/hosts configuration and add a mapping from the node machine's hostname to its IP address (10.118.80.94 in this example), as follows:
# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.118.80.94 k8sNode1
SSH into the target node machine, switch to the root user (if logged in as a non-root user), and run the kubeadm join command that was output by kubeadm init on the control plane machine, in the form:
kubeadm join --token <token> <control-plane-host>:<control-plane-port> --discovery-token-ca-cert-hash sha256:<hash>
You can list existing, unexpired tokens by running the following command on the control plane machine:
# kubeadm token list
If there is no token, you can generate a new one by running the following command on the control plane machine:
# kubeadm token create
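kubeadm can also print a complete join command together with a newly created token, which saves recomputing the CA cert hash:
# kubeadm token create --print-join-command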
Practice is as follows
# kubeadm join 10.118.80.93:6443 --token ap4vvq.8xxcc0uea7dxbjlo --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
On the control plane node machine, i.e. the master, check whether the node has been added:
# kubectl get nodes
NAME                    STATUS     ROLES                  AGE     VERSION
k8snode1                NotReady   <none>                 74s     v1.20.5
localhost.localdomain   Ready      control-plane,master   7h24m   v1.20.5
As shown above, the k8snode1 node has been added.
Summary of problems encountered
Problem 1: when running kubeadm join, an error occurs, as follows:
# kubeadm join 10.118.80.93:6443 --token ap4vvq.8xxcc0uea7dxbjlo --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "ap4vvq"
To see the stack trace of this error execute with --v=5 or higher
Solution:
The token has expired; run the kubeadm token create command to generate a new token.
Problem 2: when running kubeadm join, another error occurs, as follows:
# kubeadm join 10.118.80.93:6443 --token pa0gxw.4vx2wud1e7e0rzbx --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: cluster CA found in cluster-info ConfigMap is invalid: none of the public keys "sha256:8e2f94e2f4f1b66c45d941c0a7f72e328c242346360751b5c1cf88f437ab854f" are pinned
To see the stack trace of this error execute with --v=5 or higher
Solution:
The discovery-token-ca-cert-hash is no longer valid. Run the following command on the control plane machine to retrieve the current hash value:
# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
8e2f94e2f4f1b66c45d941c0a7f72e328c242346360751b5c1cf88f437ab854f
Use the output hash value
--discovery-token-ca-cert-hash sha256:8e2f94e2f4f1b66c45d941c0a7f72e328c242346360751b5c1cf88f437ab854f
Problem 3: cni config uninitialized error
Through the built-in Kubernetes UI, the status of the newly added node shows KubeletNotReady, with the following message:
[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful, runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized, CSINode is not yet initialized, missing node capacity for resources: ephemeral-storage]
Solution: reinstall the CNI network plugins on the node (a virtual machine was used in this practice, and the snapshot restored at the time probably did not contain the CNI plugins), then clean up the node and rejoin it to the cluster.
# CNI_VERSION="v0.8.2"
# mkdir -p /opt/cni/bin
# curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-amd64-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz
Cleanup
If you used disposable servers in the cluster for testing, you can simply shut them down without further cleanup. You can use kubectl config delete-cluster to delete the local reference to the cluster (the author has not tried this).
However, if you want to tear the cluster down more cleanly, you should first drain the node to make sure its data is cleared, and then delete the node.
Remove node
Operations on the control plane node machine
First run the following command on the control plane node machine to drain the node that is going to be removed, forcibly evicting its workloads:
kubectl drain <node name> --delete-emptydir-data --force --ignore-daemonsets
The practice is as follows:
# kubectl get nodes
NAME                    STATUS   ROLES                  AGE   VERSION
k8snode1                Ready    <none>                 82m   v1.20.5
localhost.localdomain   Ready    control-plane,master   24h   v1.20.5
# kubectl drain k8snode1 --delete-emptydir-data --force --ignore-daemonsets
node/k8snode1 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-flannel-ds-4xqcc, kube-system/kube-proxy-c7qzs
evicting pod default/nginx-deployment-64859b8dcc-v5tcl
evicting pod default/nginx-deployment-64859b8dcc-qjrld
evicting pod default/nginx-deployment-64859b8dcc-rcvc8
pod/nginx-deployment-64859b8dcc-rcvc8 evicted
pod/nginx-deployment-64859b8dcc-qjrld evicted
pod/nginx-deployment-64859b8dcc-v5tcl evicted
node/k8snode1 evicted
# kubectl get nodes
NAME                    STATUS   ROLES                  AGE   VERSION
localhost.localdomain   Ready    control-plane,master   24h   v1.20.5
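Note that drain also cordons the node. If you change your mind and want to keep the node in service instead of removing it, it can be uncordoned again:
# kubectl uncordon k8snode1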
Operations on the target node machine
Log in to the target node machine and execute the following command
# kubeadm reset
The above command does not reset or clean the iptables or IPVS tables. If you need to reset iptables, run the following command manually:
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
If you need to reset IPVS, you must run the following command.
ipvsadm -C
Note: do not reset the network without special requirements
Delete the node configuration files
# rm -rf /etc/cni/net.d
# rm -f $HOME/.kube/config
Operations on the control plane node machine
Delete the node by executing the command kubectl delete node <node name>:
### Delete the Pods that were left behind on the node
# kubectl delete pod kube-flannel-ds-4xqcc -n kube-system --force
# kubectl delete pod kube-proxy-c7qzs -n kube-system --force
# kubectl delete node k8snode1
node "k8snode1" deleted
After deletion, if you need to rejoin the node, run kubeadm join again with the appropriate parameters.
Clean up the control plane
You can run the kubeadm reset command on the control plane node machine; see the kubeadm reset command reference for details.
This article is transferred from: https://www.cnblogs.com/shouk...