Creating a Kubernetes cluster with kubeadm

Posted by rtsanderson on Mon, 06 Dec 2021 06:28:40 +0100


Practice environment

CentOS-7-x86_64-DVD-1810

Docker 19.03.9

Kubernetes version: v1.20.5

Before starting

One or more machines running a deb- or rpm-compatible Linux OS

At least 2 GB of memory per machine

At least 2 CPU cores on the control plane node

Full network connectivity between all machines in the cluster

Goals

  • Install a single control-plane Kubernetes cluster
  • Install a Pod network on the cluster so that Pods can communicate with each other

Installation guide

Install Docker

Installation process

Note that when installing Docker, check which versions Kubernetes has validated (see below). If the installed Docker version is too new, kubeadm prints a warning like the following:

[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.5. Latest validated version: 19.03

Specify the version when installing Docker:

sudo yum install docker-ce-19.03.9 docker-ce-cli-19.03.9 containerd.io
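If the docker-ce yum repository has not been configured on the machine yet, the install above cannot find the packages. A minimal sketch of adding the repository first (assuming the Aliyun mirror of the docker-ce repo) and enabling the service after installation:

# yum install -y yum-utils
# yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# systemctl enable --now docker    # run after the yum install above has completed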

If Docker is not installed, the following errors appear when running kubeadm init:

cannot automatically set CgroupDriver when starting the Kubelet: cannot execute 'docker info -f {{.CgroupDriver}}': executable file not found in $PATH

[preflight] WARNING: Couldn't create the interface used for talking to the container runtime: docker is required for container runtime: exec: "docker": executable file not found in $PATH

Installing kubeadm

If kubeadm is not installed, install it first. If it is already installed, you can upgrade to the latest version of kubeadm with apt-get update && apt-get upgrade or yum update.

Note: during the kubeadm upgrade, kubelet restarts every few seconds; this is normal.
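On CentOS 7 the usual way to get kubeadm, kubelet and kubectl is from a Kubernetes yum repository. A minimal sketch, assuming the Aliyun Kubernetes mirror and pinning the v1.20.5 version used in this article:

# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
# yum install -y kubelet-1.20.5 kubeadm-1.20.5 kubectl-1.20.5
# systemctl enable --now kubelet    # kubelet will crash-loop until kubeadm init/join runs; this is expected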

Other preliminary steps

Turn off firewall

# systemctl stop firewalld && systemctl disable firewalld

Run the above command to stop and disable the firewall; otherwise kubeadm init prints the following warning:

[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
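If the firewall has to stay enabled, an alternative (not tested in this practice) is to open just the ports the warning mentions instead of disabling firewalld entirely:

# firewall-cmd --permanent --add-port=6443/tcp --add-port=10250/tcp
# firewall-cmd --reload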

Modify the /etc/docker/daemon.json file

Edit the /etc/docker/daemon.json file and add the following:

{
  "exec-opts": ["native.cgroupdriver=systemd"]
}

Then run systemctl restart docker to restart Docker.

If you skip this step, kubeadm init prints the following warning:

[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/

Install the dependency packages socat and conntrack

# yum install socat conntrack-tools

If these packages are not installed, kubeadm init reports the following:

[WARNING FileExisting-socat]: socat not found in system path
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileExisting-conntrack]: conntrack not found in system path

Set net.ipv4.ip_forward to 1

Set the value of net.ipv4.ip_forward to 1, as follows:

# sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1  

Explanation: if net.ipv4.ip_forward is 0, packet forwarding is disabled; if it is 1, forwarding is enabled. If the value is not 1, kubeadm init reports the following error:

[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1

The above setting only takes effect until the next reboot. To keep it after a restart, persist it as follows:

# echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf

Note: the following approach to permanent configuration is often recommended online, but in the author's testing it did not work:

# echo "sysctl -w net.ipv4.ip_forward=1" >> /etc/rc.local 
# chmod +x /etc/rc.d/rc.local

Set the value of net.bridge.bridge-nf-call-iptables to 1

Follow the same steps as for the net.ipv4.ip_forward setting above.
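A minimal sketch of those commands (the key only exists once the br_netfilter kernel module is loaded):

# modprobe br_netfilter
# sysctl -w net.bridge.bridge-nf-call-iptables=1
# echo "net.bridge.bridge-nf-call-iptables=1" >> /etc/sysctl.conf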

Note: all of the above steps must be performed on every node in the cluster.

Initialize the control plane node

The machine that runs the control plane components, including etcd (the cluster database) and the API server (which the kubectl command line tool talks to), is called the control plane node.

1. (Recommended) If you plan to upgrade this single control plane kubeadm cluster to high availability later, specify the --control-plane-endpoint option to kubeadm init to set a shared endpoint for all control plane nodes. The endpoint can be a DNS name or the IP address of a load balancer.

2. Choose a network plugin and check whether it requires options to be passed to kubeadm init; this depends on the plugin. For example, flannel requires the --pod-network-cidr option to be set.

3. (Optional) Since version 1.14, kubeadm automatically detects the container runtime. If you want to use a different runtime, or more than one runtime is installed, specify the --cri-socket option to kubeadm init.

4. (Optional) Unless told otherwise, kubeadm uses the network interface associated with the default gateway to set the advertise address for this control plane node's API server. To use a different interface, pass --apiserver-advertise-address=<ip-address> to kubeadm init. For an IPv6 cluster, pass an IPv6 address, for example --apiserver-advertise-address=fd00::101.

5. (Optional) Before running kubeadm init, run kubeadm config images pull to verify connectivity to the gcr.io container image registry.

Run kubeadm init with appropriate options, as shown below, to initialize the control plane node. The command first runs a series of preflight checks to make sure the machine meets the requirements for running Kubernetes; if a check finds an error, it exits. Otherwise it continues by downloading and installing the control plane components, which can take several minutes.

# kubeadm init --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version stable  --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.20.5
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost.localdomain] and IPs [10.96.0.1 10.118.80.93]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost localhost.localdomain] and IPs [10.118.80.93 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost localhost.localdomain] and IPs [10.118.80.93 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[apiclient] All control plane components are healthy after 89.062309 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node localhost.localdomain as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node localhost.localdomain as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 1sh85v.surdstc5dbrmp1s2
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.118.80.93:6443 --token ap4vvq.8xxcc0uea7dxbjlo \     
    --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a

As shown above, the command prints "Your Kubernetes control-plane has initialized successfully!" along with follow-up instructions, indicating that the control plane node has been initialized successfully.

Notes:

1. If you do not use the --image-repository option to point at the Aliyun mirror registry, the following error may be reported:

failed to pull image "k8s.gcr.io/kube-apiserver:v1.20.5": output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1

2. Because the flannel network plugin is used, the --pod-network-cidr option must be specified. Otherwise the coredns-xxxxxxxxxx-xxxxx Pods cannot start and stay in the ContainerCreating state; describing the Pod shows the following error:

networkPlugin cni failed to set up pod "coredns-7f89b7bc75-9vrrl_kube-system" network: open /run/flannel/subnet.env: no such file or directory

3. The --pod-network-cidr value, i.e. the Pod network, must not overlap with the host network. Otherwise, after the flannel plugin is installed, duplicate routes are created and tools such as XShell can no longer SSH into the host. For example:

In this practice the host network is 10.118.80.0/24 on interface ens33, so the following value would conflict:

--pod-network-cidr=10.118.80.0/24

4. Also note that the --pod-network-cidr value must match the net-conf.json Network value in the kube-flannel.yml file (in this example the value is 10.244.0.0/16, as shown below, so --pod-network-cidr is set to 10.244.0.0/16 when running kubeadm init).

# cat kube-flannel.yml|grep -E "^\s*\"Network"
      "Network": "10.244.0.0/16",

In an initial attempt, --pod-network-cidr=10.1.15.0/24 was used without modifying the Network value in kube-flannel.yml, and nodes newly added to the cluster could not automatically obtain a Pod CIDR, as shown below:

# kubectl get pods --all-namespaces
NAMESPACE              NAME                                            READY   STATUS             RESTARTS   AGE
kube-system   kube-flannel-ds-psts8                           0/1     CrashLoopBackOff   62         15h
...omitted
# kubectl -n kube-system logs kube-flannel-ds-psts8
...omitted
E0325 01:03:08.190986       1 main.go:292] Error registering network: failed to acquire lease: node "k8snode1" pod cidr not assigned
W0325 01:03:08.192875       1 reflector.go:424] github.com/coreos/flannel/subnet/kube/kube.go:300: watch of *v1.Node ended with: an error on the server ("unable to decode an event from the watch stream: context canceled") has prevented the request from succeeding
I0325 01:03:08.193782       1 main.go:371] Stopping shutdownHandler...

Later, changing the net-conf.json Network value in kube-flannel.yml to 10.1.15.0/24 produced the same error (the workflow being: download kube-flannel.yml first, modify the configuration, then install the network plugin, as sketched below).
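A sketch of that download-modify-apply workflow (the manifest URL is the same one used in the installation section below):

# wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# vi kube-flannel.yml     # change the "Network" value under net-conf.json to match --pod-network-cidr
# kubectl apply -f kube-flannel.yml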

For the node "xxxxxx" pod cidr not assigned problem above, there is also a temporary workaround circulating online (not verified by the author): manually assign a podCIDR to the node with the following command:

kubectl patch node <NODE_NAME> -p '{"spec":{"podCIDR":"<SUBNET>"}}'
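For example, with a hypothetical node name and a per-node subnet carved out of the 10.244.0.0/16 Pod network used in this article:

kubectl patch node k8snode1 -p '{"spec":{"podCIDR":"10.244.1.0/24"}}'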

5. As the output suggests, to let a regular (non-root) user run kubectl, execute the following commands:

# mkdir -p $HOME/.kube
# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, run:

export KUBECONFIG=/etc/kubernetes/admin.conf

Record the kubeadm join command from the kubeadm init output; you will need it later to add nodes to the cluster.

The token is used for mutual authentication between the control plane node and joining nodes. It must be kept secure, because anyone who has it can add authenticated nodes to the cluster. Tokens can be listed, created and deleted with the kubeadm token command; see the kubeadm reference guide for details.
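If the original kubeadm join command was not saved, a fresh token together with the full join command can be printed on the control plane node with:

# kubeadm token create --print-join-command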

Install a Pod network plugin

**You must deploy a Container Network Interface (CNI) based Pod network add-on so that Pods can communicate with each other. Cluster DNS (CoreDNS) will not start until a Pod network is installed.**

  • Note that the Pod network must not overlap with any host network; overlaps will cause problems. (If the network plugin's preferred Pod network conflicts with a host network, choose a suitable CIDR block instead, pass it via the --pod-network-cidr option when running kubeadm init, and replace the network configuration in the plugin's YAML accordingly.)
  • By default, kubeadm sets up the cluster to enforce RBAC (role-based access control). Make sure the Pod network plugin, and any manifests deployed with it, support RBAC.
  • If the cluster uses IPv6, either dual-stack or single-stack IPv6 networking, make sure the plugin supports IPv6 (CNI v0.6.0 added IPv6 support). Many projects use CNI to provide Kubernetes networking, and some of them also support network policies. A list of add-ons that implement the Kubernetes network model is available at:

https://kubernetes.io/docs/concepts/cluster-administration/networking/#how-to-implement-the-kubernetes-networking-model

A Pod network add-on can be installed on the control plane node, or on any machine that has the kubeconfig credentials, by running the following command. The plugin runs as a DaemonSet and writes its configuration to the /etc/cni/net.d directory:

kubectl apply -f <add-on.yaml>

Installing the flannel network plugin

Manually deploy flannel (Kubernetes v1.17+):

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created

Reference: https://github.com/flannel-io/flannel#flannel

Only one Pod network can be installed per cluster. After installing it, you can check that the network is working by running kubectl get pods --all-namespaces and verifying that the coredns-xxxxxxxxxx-xxxxx Pods are in the Running state.
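For example, to watch only the CoreDNS Pods until they become Running (assuming the standard k8s-app=kube-dns label they carry):

# kubectl -n kube-system get pods -l k8s-app=kube-dns -w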

View the flannel subnet environment configuration:

# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.244.0.0/16
FLANNEL_SUBNET=10.244.0.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true

After the flannel plugin is installed, two virtual network interfaces are automatically added on the host: cni0 and flannel.1.

# ifconfig -a
cni0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.244.0.1  netmask 255.255.255.0  broadcast 10.244.0.255
        inet6 fe80::705d:43ff:fed6:80c9  prefixlen 64  scopeid 0x20<link>
        ether 72:5d:43:d6:80:c9  txqueuelen 1000  (Ethernet)
        RX packets 312325  bytes 37811297 (36.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 356346  bytes 206539626 (196.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:e1ff:fec3:8b6a  prefixlen 64  scopeid 0x20<link>
        ether 02:42:e1:c3:8b:6a  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3  bytes 266 (266.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.118.80.93  netmask 255.255.255.0  broadcast 10.118.80.255
        inet6 fe80::6ff9:dbee:6b27:1315  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:d3:3b:ef  txqueuelen 1000  (Ethernet)
        RX packets 2092903  bytes 1103282695 (1.0 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 969483  bytes 253273828 (241.5 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.244.0.0  netmask 255.255.255.255  broadcast 10.244.0.0
        inet6 fe80::a49a:2ff:fe38:3e4b  prefixlen 64  scopeid 0x20<link>
        ether a6:9a:02:38:3e:4b  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 30393748  bytes 5921348235 (5.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 30393748  bytes 5921348235 (5.5 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Reinitialize the control plane node

In practice, an option had been configured incorrectly, and this was only discovered after the network plugin was installed, so kubeadm init had to be run again. The steps taken were as follows:

# kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
[reset] Removing info for node "localhost.localdomain" from the ConfigMap "kubeadm-config" in the "kube-system" Namespace
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
# rm -rf /etc/cni/net.d
# rm -f $HOME/.kube/config
# 

After running the above commands, re-initialize the control plane node and reinstall the network plugin.

Summary of problems encountered

After re-running kubeadm init, kubectl get pods --all-namespaces showed the coredns-xxxxxxxxxx-xxxxx Pods stuck in the ContainerCreating state, as shown below:

# kubectl get pods --all-namespaces
NAMESPACE     NAME                                            READY   STATUS              RESTARTS   AGE
kube-system   coredns-7f89b7bc75-pxvdx                        0/1     ContainerCreating   0          8m33s
kube-system   coredns-7f89b7bc75-v4p57                        0/1     ContainerCreating   0          8m33s
kube-system   etcd-localhost.localdomain                      1/1     Running             0          8m49s
...omitted

Running kubectl describe pod coredns-7f89b7bc75-pxvdx -n kube-system to view the Pod's details revealed the following error:

Warning  FailedCreatePodSandBox  98s (x4 over 103s)    kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "04434c63cdf067e698a8a927ba18e5013d2a1a21afa642b3cddedd4ff4592178" network for pod "coredns-7f89b7bc75-pxvdx": networkPlugin cni failed to set up pod "coredns-7f89b7bc75-pxvdx_kube-system" network: failed to set bridge addr: "cni0" already has an IP address different from 10.1.15.1/24

Checking the network interfaces, as shown below, cni0 still had the IP address assigned by the previous network plugin installation, which prevented the plugin from assigning it a new one this time.

# ifconfig -a
cni0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 10.118.80.1  netmask 255.255.255.0  broadcast 10.118.80.255
        inet6 fe80::482d:65ff:fea6:32fd  prefixlen 64  scopeid 0x20<link>
        ether 4a:2d:65:a6:32:fd  txqueuelen 1000  (Ethernet)
        RX packets 267800  bytes 16035849 (15.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 116238  bytes 10285959 (9.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

...omitted
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet 10.1.15.0  netmask 255.255.255.255  broadcast 10.1.15.0
        inet6 fe80::a49a:2ff:fe38:3e4b  prefixlen 64  scopeid 0x20<link>
        ether a6:9a:02:38:3e:4b  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0
...omitted

The fix is to delete the misconfigured cni0 interface; after it is deleted, it is recreated automatically with the correct configuration:

$ sudo ifconfig cni0 down    
$ sudo ip link delete cni0

Control plane node isolation (optional)

By default, for security reasons, the cluster does not schedule Pods on the control plane node. If you want to schedule Pods there, for example in a single-machine Kubernetes cluster for development, run the following command:

kubectl taint nodes --all node-role.kubernetes.io/master-  # Remove the node-role.kubernetes.io/master taint from any node that has it

In practice:

# kubectl get nodes
NAME                    STATUS   ROLES                  AGE   VERSION
localhost.localdomain   Ready    control-plane,master   63m   v1.20.5
# kubectl taint nodes --all node-role.kubernetes.io/master-
node/localhost.localdomain untainted

Add a node to the cluster

Modify the hostname of the new node

# hostname
localhost.localdomain
# hostname k8sNode1

Changing the hostname with the above command only lasts until the next reboot. To make it persistent, edit the /etc/hostname file and replace the default localhost.localdomain with the target name (k8sNode1 in this example). The hostname also needs to resolve (see the /etc/hosts change below); otherwise the following warnings appear in later steps:

[WARNING Hostname]: hostname "k8sNode1" could not be reached
	[WARNING Hostname]: hostname "k8sNode1": lookup k8sNode1 on 223.5.5.5:53: read udp 10.118.80.94:33293->223.5.5.5:53: i/o timeout

Modify /etc/hosts and add a mapping from the node's hostname to its IP address (10.118.80.94 in this example), as follows:

# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.118.80.94   k8sNode1

SSH into the target node machine, switch to the root user (if logged in as a non-root user), and run the kubeadm join command that was output by kubeadm init on the control plane machine. It has the form:

kubeadm join --token <token> <control-plane-host>:<control-plane-port> --discovery-token-ca-cert-hash sha256:<hash>

You can list existing, unexpired tokens by running the following command on the control plane machine:

# kubeadm token list

If no token is available, generate a new one on the control plane machine with:

# kubeadm token create

In practice:

# kubeadm join 10.118.80.93:6443 --token ap4vvq.8xxcc0uea7dxbjlo     --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

On the control plane node, i.e. the master machine, check that the node has been added:

# kubectl get nodes
NAME                    STATUS     ROLES                  AGE     VERSION
k8snode1                NotReady   <none>                 74s     v1.20.5
localhost.localdomain   Ready      control-plane,master   7h24m   v1.20.5

As shown above, the k8snode1 node has been added.

Summary of problems encountered

Problem 1: an error occurs when running kubeadm join, as follows:

# kubeadm join 10.118.80.93:6443 --token ap4vvq.8xxcc0uea7dxbjlo     --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "ap4vvq"
To see the stack trace of this error execute with --v=5 or higher

Solution:

The token has expired. Run kubeadm token create to generate a new one.

Problem 2: another error occurs when running kubeadm join, as follows:

# kubeadm join 10.118.80.93:6443 --token pa0gxw.4vx2wud1e7e0rzbx  --discovery-token-ca-cert-hash sha256:c4493c04d789463ecd25c97453611a9dfacb36f4d14d5067464832b9e9c5039a
[preflight] Running pre-flight checks
error execution phase preflight: couldn't validate the identity of the API Server: cluster CA found in cluster-info ConfigMap is invalid: none of the public keys "sha256:8e2f94e2f4f1b66c45d941c0a7f72e328c242346360751b5c1cf88f437ab854f" are pinned
To see the stack trace of this error execute with --v=5 or higher

Solution:

The discovery-token-ca-cert-hash value is wrong. Run the following command on the control plane node to retrieve the correct hash:

# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
8e2f94e2f4f1b66c45d941c0a7f72e328c242346360751b5c1cf88f437ab854f

Then use the output hash value:

--discovery-token-ca-cert-hash sha256:8e2f94e2f4f1b66c45d941c0a7f72e328c242346360751b5c1cf88f437ab854f

Problem 3: cni config uninitialized error

In the Kubernetes built-in UI (dashboard), the newly added node shows a status of KubeletNotReady with the following message:

[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful, runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized, CSINode is not yet initialized, missing node capacity for resources: ephemeral-storage]

Solution: reinstall the CNI network plugins (the practice environment is a virtual machine, and the snapshot used probably did not contain the CNI plugins), then clean up the node again and finally rejoin it:

# CNI_VERSION="v0.8.2"
# mkdir -p /opt/cni/bin
# curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/cni-plugins-linux-amd64-${CNI_VERSION}.tgz" | sudo tar -C /opt/cni/bin -xz

Clean up

If you used disposable servers in the cluster for testing, you can simply shut them down without further cleanup. kubectl config delete-cluster can be used to delete the local reference to the cluster (not tried by the author).

However, to tear down the cluster more cleanly, you should first drain the node so its data is cleared, and then delete it.

Remove node

On the control plane node

First run the following command on the control plane node to drain the node that is to be removed, forcibly evicting its workloads:

kubectl drain <node name> --delete-emptydir-data --force --ignore-daemonsets

In practice:

# kubectl get nodes
NAME                    STATUS   ROLES                  AGE   VERSION
k8snode1                Ready    <none>                 82m   v1.20.5
localhost.localdomain   Ready    control-plane,master   24h   v1.20.5
# kubectl drain k8snode1 --delete-emptydir-data --force --ignore-daemonsets
node/k8snode1 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/kube-flannel-ds-4xqcc, kube-system/kube-proxy-c7qzs
evicting pod default/nginx-deployment-64859b8dcc-v5tcl
evicting pod default/nginx-deployment-64859b8dcc-qjrld
evicting pod default/nginx-deployment-64859b8dcc-rcvc8
pod/nginx-deployment-64859b8dcc-rcvc8 evicted
pod/nginx-deployment-64859b8dcc-qjrld evicted
pod/nginx-deployment-64859b8dcc-v5tcl evicted
node/k8snode1 evicted
# kubectl get nodes
NAME                    STATUS   ROLES                  AGE   VERSION
localhost.localdomain   Ready    control-plane,master   24h   v1.20.5

On the target node

Log in to the target node machine and run:

# kubeadm reset

The above command does not reset or clean up iptables or IPVS tables. To reset iptables, you also need to run the following manually:

iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

If you need to reset IPVS, you must run the following command.

ipvsadm -C

Note: do not reset these network tables unless you have a specific need to.

Delete the node's configuration files

# rm -rf /etc/cni/net.d
# rm -f $HOME/.kube/config

On the control plane node

Delete the node by running kubectl delete node <node name>:

### Delete the Pods that were left behind first
# kubectl delete pod kube-flannel-ds-4xqcc -n kube-system --force
# kubectl delete pod kube-proxy-c7qzs -n kube-system --force
# kubectl delete node k8snode1
node "k8snode1" deleted

After deletion, if you need to rejoin the node, run kubeadm join again with the appropriate parameters.

Clean up the control plane

Run kubeadm reset on the control plane node. See the kubeadm reset command reference for details.

This article is reposted from: https://www.cnblogs.com/shouke/p/15318151.html
