By default, Kubernetes schedules pods using its built-in scheduler. In some cases, however, we need pods to run only on nodes that carry a particular label, and the default policy is not enough: we have to tell Kubernetes explicitly which nodes a pod should be scheduled to.
The simplest approach is the nodeSelector scheduling strategy. Labels are the standard way to mark resources in Kubernetes, so we can attach a custom label to a node, and nodeSelector will then schedule pods only onto nodes carrying the specified label.
Here's an example:
First, view the labels on the nodes with the following command:
```
$ kubectl get nodes --show-labels
NAME      STATUS    ROLES     AGE       VERSION   LABELS
master    Ready     master    147d      v1.10.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=master,node-role.kubernetes.io/master=
node02    Ready     <none>    67d       v1.10.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,course=k8s,kubernetes.io/hostname=node02
node03    Ready     <none>    127d      v1.10.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,jnlp=haimaxy,kubernetes.io/hostname=node03
```
You can then add a label to the node02 node:
```
$ kubectl label nodes node02 com=yijiadashuju
node "node02" labeled
```
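As a quick aside (reusing the node02 node and the com label from above), you can inspect the labels of a single node, and a label can be removed again with a trailing minus sign:

```shell
# Show the labels of just this node
kubectl get nodes node02 --show-labels

# Delete the com label again; the trailing "-" removes a label
kubectl label nodes node02 com-
```

These commands require a running cluster, so they are shown here only as a sketch.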
You can verify that the label took effect by running the command with the --show-labels parameter again. Once a node is labeled, the label can be used at scheduling time by adding a nodeSelector field to the pod's spec, containing the label of the node we want the pod scheduled to. For example, to force a pod onto node02, we can express it with a nodeSelector: (pod-selector-demo.yaml)
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: busybox-pod
  name: test-busybox
spec:
  containers:
  - command:
    - sleep
    - "3600"
    image: busybox
    imagePullPolicy: Always
    name: test-busybox
  nodeSelector:
    com: yijiadashuju
```
Then, after applying the pod-selector-demo.yaml file, you can check which node the pod is running on with the following command:

```
$ kubectl get pod -o wide -n default
```
You can also use the describe command to see which node the pod was scheduled to:
```
$ kubectl create -f pod-selector-demo.yaml
pod "test-busybox" created
$ kubectl describe pod test-busybox
Name:         test-busybox
Namespace:    default
Node:         node02/10.151.30.63
......
QoS Class:       BestEffort
Node-Selectors:  com=yijiadashuju
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason                 Age   From               Message
  ----    ------                 ----  ----               -------
  Normal  SuccessfulMountVolume  55s   kubelet, node02    MountVolume.SetUp succeeded for volume "default-token-n9w2d"
  Normal  Scheduled              54s   default-scheduler  Successfully assigned test-busybox to node02
  Normal  Pulling                54s   kubelet, node02    pulling image "busybox"
  Normal  Pulled                 40s   kubelet, node02    Successfully pulled image "busybox"
  Normal  Created                40s   kubelet, node02    Created container
  Normal  Started                40s   kubelet, node02    Started container
```
As the output shows, the pod was placed on the node02 node by the default-scheduler. Note, however, that nodeSelector is a hard constraint: if node02 does not have enough resources, the pod will remain in the Pending state. That is how nodeSelector is used.
As the introduction above shows, nodeSelector is very convenient to use, but it has clear shortcomings: it is not flexible, its control granularity is coarse, and it is often inconvenient in practice. Next, let's look at affinity and anti-affinity scheduling.
Affinity and Anti-affinity Scheduling
The default scheduling process in Kubernetes actually goes through two phases: predicates and priorities. With the default process, Kubernetes places pods on nodes with sufficient resources; with nodeSelector, it places pods on nodes carrying a specified label. In a real production environment, however, we often need to schedule a pod onto a group of nodes matching certain labels, and this is where nodeAffinity (node affinity), podAffinity (pod affinity), and podAntiAffinity (pod anti-affinity) come in.
Affinity can be divided into hard and soft affinity:

Soft affinity: if no node satisfies the rule, scheduling proceeds anyway; the rule is honored when possible, but it is not mandatory.
Hard affinity: the rule must be satisfied; if no node satisfies it, the pod will not be scheduled and stays Pending.

These correspond to the following rule names:

Soft policy: preferredDuringSchedulingIgnoredDuringExecution
Hard policy: requiredDuringSchedulingIgnoredDuringExecution
Node affinity is primarily used to control which nodes a pod can be deployed on and which it cannot. It supports simple logical combinations of expressions, not just exact equality matching.
Next, let's look at an example where a Deployment manages three pod replicas and nodeAffinity controls where the pods are scheduled: (node-affinity-demo.yaml)
```yaml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: affinity
  labels:
    app: affinity
spec:
  replicas: 3
  revisionHistoryLimit: 15
  template:
    metadata:
      labels:
        app: affinity
        role: test
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
          name: nginxweb
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:  # hard policy
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: NotIn
                values:
                - node03
          preferredDuringSchedulingIgnoredDuringExecution:  # soft policy
          - weight: 1
            preference:
              matchExpressions:
              - key: com
                operator: In
                values:
                - yijiadashuju
```
When this pod is scheduled, the hard requirement is that it must not run on node03; among the remaining nodes, any node carrying the label com=yijiadashuju is preferred.
Next, look at the node information:
```
$ kubectl get nodes --show-labels
NAME      STATUS    ROLES     AGE       VERSION   LABELS
master    Ready     master    154d      v1.10.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=master,node-role.kubernetes.io/master=
node02    Ready     <none>    74d       v1.10.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,com=yijiadashuju,course=k8s,kubernetes.io/hostname=node02
node03    Ready     <none>    134d      v1.10.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,jnlp=haimaxy,kubernetes.io/hostname=node03
```
You can see that the node02 node carries the label com=yijiadashuju, so it will be preferred as required. Next, create the Deployment and check where the pods were scheduled:
```
$ kubectl create -f node-affinity-demo.yaml
deployment.apps "affinity" created
$ kubectl get pods -l app=affinity -o wide
NAME                        READY     STATUS    RESTARTS   AGE       IP             NODE
affinity-7b4c946854-5gfln   1/1       Running   0          47s       10.244.4.214   node02
affinity-7b4c946854-l8b47   1/1       Running   0          47s       10.244.4.215   node02
affinity-7b4c946854-r86p5   1/1       Running   0          47s       10.244.4.213   node02
```
As the results show, all of the pods were deployed to the node02 node.
Kubernetes currently provides the following operators:

In: the label's value is in the given list
NotIn: the label's value is not in the given list
Gt: the label's value is greater than the given value
Lt: the label's value is less than the given value
Exists: the label exists on the node
DoesNotExist: the label does not exist on the node
If there are multiple entries under nodeSelectorTerms, satisfying any one of them is enough; if a single term contains multiple matchExpressions, all of them must be satisfied for the pod to be scheduled.
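As a sketch of these OR/AND rules (the label keys disktype, cpu-count, and gpu below are hypothetical, not from the example above), the following nodeAffinity fragment matches a node that either has disktype=ssd, or has a cpu-count label with a value greater than 8 and also carries a gpu label:

```yaml
# Hypothetical pod-spec fragment illustrating nodeSelectorTerms semantics.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # Terms are ORed: matching either term is enough.
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
      # Expressions inside one term are ANDed: both must match.
      - matchExpressions:
        - key: cpu-count
          operator: Gt
          values:
          - "8"
        - key: gpu
          operator: Exists
```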
podAffinity (pod affinity)
Pod affinity is mainly used to decide which pods may be deployed together with which other pods in the same topology domain (a group of nodes), while pod anti-affinity decides which pods must not be deployed together; both address placement relationships between pods. Note that inter-pod affinity and anti-affinity require significant processing, which can noticeably slow down scheduling in large clusters; they are not recommended for clusters with more than several hundred nodes. Pod anti-affinity also requires nodes to be labeled consistently: every node in the cluster must carry an appropriate label matching the topologyKey. If some or all nodes lack the specified topologyKey label, unexpected behavior may result.
Here is an example of pod affinity:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: failure-domain.beta.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: failure-domain.beta.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: k8s.gcr.io/pause:2.0
```
podAntiAffinity (pod anti-affinity)
Here is an example of a pod anti-affinity yaml file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache
spec:
  selector:
    matchLabels:
      app: store
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server
        image: redis:3.2-alpine
```
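To show how affinity and anti-affinity combine, here is a sketch of a companion Deployment (the name web-server, the app=web-store label, and the nginx:1.12-alpine image are illustrative, not part of the example above): it uses podAntiAffinity to keep its own replicas on different nodes, and podAffinity to co-locate each replica with one of the redis-cache pods labeled app=store:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  selector:
    matchLabels:
      app: web-store
  replicas: 3
  template:
    metadata:
      labels:
        app: web-store
    spec:
      affinity:
        # Never place two web-store pods on the same node.
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web-store
            topologyKey: "kubernetes.io/hostname"
        # Only place a web-store pod on a node already running a store pod.
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web-app
        image: nginx:1.12-alpine
```

With both rules in place, each node ends up with at most one web-store pod, and only on nodes that also host a redis-cache pod.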