Enterprise operations and maintenance practice -- k8s learning notes: k8s scheduling

Posted by jakebrewer1 on Mon, 25 Oct 2021 06:12:18 +0200

1. Introduction to Kubernetes scheduling

The scheduler uses Kubernetes' watch mechanism to discover newly created pods in the cluster that have not yet been assigned to a node, and schedules each unscheduled pod it finds onto a suitable node.

kube-scheduler is the default scheduler for Kubernetes clusters and is part of the cluster control plane. It is designed so that, if you really want or need to, you can write your own scheduling component and use it in place of the default kube-scheduler.

Factors considered in scheduling decisions include: individual and collective resource requests, hardware/software/policy constraints, affinity and anti-affinity requirements, data locality, interference between workloads, and so on.

For the default policies, refer to: https://kubernetes.io/zh/docs/concepts/scheduling/kube-scheduler/

2. nodeName node selection constraint

nodeName is the simplest form of node selection constraint, but it is generally not recommended. If nodeName is specified in the PodSpec, it takes precedence over other node selection methods (a minimal example follows the list below).

Some limitations of using nodeName to select nodes:

If the specified node does not exist, the pod will not run.
If the specified node does not have enough resources to accommodate the pod, the pod fails.
Node names in a cloud environment are not always predictable or stable.
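For reference, a minimal sketch of pinning a pod with nodeName (the node name server2 is only an assumption for illustration):

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  nodeName: server2   # bypasses the scheduler and binds the pod to this node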

3. nodeSelector and affinity

nodeSelector is the simplest recommended form of node selection constraint. Add a label to the chosen nodes, and pods are then scheduled by matching that label.
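For example, the disktype=ssd label used in the manifest below can be added to a node like this (server3 is just the node used in this environment):

kubectl label nodes server3 disktype=ssd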

3.1. Node affinity

Create a directory and write a resource manifest:
vi pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd

Apply the manifest and check the status:

kubectl apply -f pod.yaml 
kubectl get pod -o wide


The pod runs successfully because one of our nodes has the disktype=ssd label:

kubectl get nodes --show-labels 


Node affinity pod example

kubectl delete -f pod.yaml
vi pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:  # hard requirement: must be satisfied
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
              - ssd
              - fc

In the previous experiment the nginx pod was scheduled to server3.
Remove the label from server3, then re-apply and check:

kubectl label nodes server3 disktype-
kubectl apply -f pod.yaml
kubectl get pod -o wide

The pod is now scheduled to server2.
Add the label back, then edit pod.yaml again:

kubectl label nodes server3 disktype=ssd
kubectl delete -f pod.yaml
vi pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
              - ssd
              - fc
      preferredDuringSchedulingIgnoredDuringExecution: # soft requirement: preferred if possible
        - weight: 1
          preference:
            matchExpressions:
            - key: role
              operator: In
              values:
              - prod

After applying, the pod is again scheduled to server3:

kubectl apply -f  pod.yaml 
kubectl get pod -o wide

3.2 Pod affinity example

Keep the previous nginx pod running.
This example uses pod affinity: the mysql pod is scheduled onto the same node as the nginx pod.
Note that the image used is mysql:5.7, which must be available in the registry.

vi pod1.yaml

apiVersion: v1
kind: Pod
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  containers:
  - name: mysql
    image: mysql:5.7
    env:
     - name: "MYSQL_ROOT_PASSWORD"
       value: "westos"
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: env    # the nginx pod's label; check with kubectl get pod --show-labels
            operator: In
            values:
            - test      # ditto
        topologyKey: kubernetes.io/hostname

Apply the manifest pod1.yaml and check the information:

kubectl apply -f pod1.yaml
kubectl get pod -o wide


Pod anti-affinity

vi pod1.yaml

Change podAffinity to podAntiAffinity in the manifest:

 podAntiAffinity:
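For reference, a sketch of the modified pod1.yaml; it is the same as the manifest above except that podAffinity is replaced with podAntiAffinity:

apiVersion: v1
kind: Pod
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  containers:
  - name: mysql
    image: mysql:5.7
    env:
     - name: "MYSQL_ROOT_PASSWORD"
       value: "westos"
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: env
            operator: In
            values:
            - test
        topologyKey: kubernetes.io/hostname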

Apply and check.
We can see that the mysql pod is no longer on the same node as the nginx pod: the service is separated from its data, and mysql is scheduled to server2.

4. Taints

Node affinity (NodeAffinity) is an attribute defined on a pod that lets the pod be scheduled to a node according to our requirements. Taints, by contrast, allow a node to refuse to run a pod or even evict it.

A taint is an attribute of a node. Once a taint is set, Kubernetes will not schedule pods to that node.
To complement this, pods have a tolerations attribute. If a pod tolerates a node's taints, Kubernetes ignores those taints and may (but does not have to) schedule the pod there.

You can use the command kubectl taint to add a taint to the node:

kubectl taint nodes node1 key=value:NoSchedule      # create
kubectl describe nodes server1 | grep Taints        # query
kubectl taint nodes node1 key:NoSchedule-           # delete

Where [effect] can take the value: [NoSchedule | PreferNoSchedule | NoExecute]

NoSchedule: pods will not be scheduled to nodes marked with this taint.
PreferNoSchedule: the soft-policy version of NoSchedule.
NoExecute: once this taint takes effect, any pod already running on the node that does not have a matching toleration is evicted immediately.
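For illustration, a toleration that matches the key=value:NoSchedule taint created in the commands above might look like this in a pod spec (key and value are just the example names from that command):

  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"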

For example, the Kubernetes master node is tainted by default, so it is generally not chosen when pods are scheduled. The master's taint is shown below:

kubectl describe nodes server1 | grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule

4.1 Adding taints

4.1.1 NoSchedule

kubectl taint node server3 k1=v1:NoSchedule
kubectl delete -f pod.yaml 
kubectl apply -f  pod.yaml 
kubectl get pod -o wide

The pod that was previously running on server3 can no longer be scheduled there.

4.1.2 NoExecute

Create a Deployment controller:

vi deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
      # tolerations:
      # - operator: "Exists"

Add a NoExecute taint:
Once this taint takes effect, any pod running on the node without a matching toleration is evicted immediately and must be recreated on other nodes.

kubectl taint nodes server2 key=value:NoExecute

Because the pods have no toleration and there is no other schedulable node, all pods are Pending.
If we now remove the taint from server3, all pods are scheduled to server3:

kubectl taint node server3 k1=v1:NoSchedule-
kubectl get pod -o wide
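As a side note, a pod can be allowed to stay on a NoExecute-tainted node for a limited time by adding tolerationSeconds; a minimal sketch, reusing the example key=value taint from above:

  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoExecute"
    tolerationSeconds: 60   # the pod is evicted 60 seconds after the taint is applied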

4.2 Adding tolerations

Originally, server1 (the master) did not take part in scheduling because of its taint. By adding a toleration we let pods be scheduled there as well.
Let's test:

vi deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 30  # a small replica count would make scheduling to server1 unlikely
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
      tolerations:
      - operator: "Exists"    # tolerates all taints

Delete, re-apply, and check:

kubectl delete -f deployment.yaml 
kubectl apply -f  deployment.yaml 
kubectl get pod -o wide


Pods now appear on server1 as well.

4.3 cordon, drain, delete

Commands that affect pod scheduling include cordon, drain, and delete. In each case, newly created pods will no longer be scheduled to the node, but the commands differ in how disruptive they are (see the summary after this list).
cordon (stop scheduling):
The least disruptive. The node is only marked SchedulingDisabled; new pods will not be scheduled to it, but the pods already on the node are unaffected and continue to serve traffic normally.
drain (evict the node):
First evicts the pods on the node so they are recreated on other nodes, then marks the node SchedulingDisabled.
delete (delete the node):
The most disruptive. The pods on the node are evicted and recreated on other nodes, then the node is removed from the cluster and the master loses control of it. To restore scheduling, log in to the node and restart the kubelet service.
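As a quick reference, these are the commands used in the subsections below (node names are placeholders):

kubectl cordon <node>                      # mark the node SchedulingDisabled; existing pods keep running
kubectl drain <node> --ignore-daemonsets   # evict pods to other nodes, then mark SchedulingDisabled
kubectl delete node <node>                 # remove the node from the cluster entirely
kubectl uncordon <node>                    # restore scheduling after cordon or drain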

vi deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 6
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
      tolerations:
      - operator: "Exists"

Apply and check which nodes the pods are scheduled to:

kubectl apply -f deployment.yaml
kubectl get pod -o wide

4.3.1 cordon

Disable scheduling on server3:

kubectl cordon server3
kubectl get pod -o wide
kubectl get node

server3 now shows SchedulingDisabled, and new pods are only scheduled to server2.
Re-enable scheduling on server3:

kubectl uncordon server3
kubectl get node

4.3.2 drain

kubectl drain server2 --ignore-daemonsets

The pods on server2 are evicted and recreated on other nodes, and server2 is marked SchedulingDisabled.

kubectl get node

kubectl uncordon server2    # restore scheduling on server2

4.3.3 delete

Directly delete node server3
kubectl delete node server3

To rejoin server3 as a schedulable node, restart the kubelet service on server3:

server3:
systemctl restart kubelet
server1:
kubectl get node

5. Add a new cluster node

Because the bootstrap token is only valid for 24 hours, it needs to be recreated:

kubeadm token create
kubeadm token list

The discovery token CA cert hash is also required. This value does not change, so you only need to look it up once:

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
openssl dgst -sha256 -hex | sed 's/^.* //'

Then join the cluster with the following command:

kubeadm join 172.25.0.2:6443 --token ******** --discovery-token-ca-cert-hash sha256: *******************
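Alternatively, kubeadm can create a fresh token and print the complete join command in one step:

kubeadm token create --print-join-command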

Topics: Operation & Maintenance Docker Kubernetes