Kubernetes workload controller

Posted by dougmcc1 on Sat, 25 Dec 2021 17:58:24 +0100

1 k8s workload controller

What is the workload controller?

  • A workload controller is a higher-level abstraction in Kubernetes used to deploy and manage Pods.

  • Workloads are applications running on Kubernetes.

  • Whether your workload is a single component or several that work together, you can run it in a set of Pods on Kubernetes. In Kubernetes, a Pod represents a set of containers running on the cluster.

  • Kubernetes Pods have a defined lifecycle. For example, if a Pod is running in your cluster and a fatal error occurs on its node, all Pods on that node fail. Kubernetes treats such a failure as final: even if the node later returns to normal operation, you need to create a new Pod to restore the application.

  • To make life easier, however, you don't need to manage each Pod directly. Instead, you can use workload resources to manage a set of Pods for you. These resources configure controllers that make sure the right number of the right kind of Pods are running, matching the state you specified.

Common workload controllers:

  • Deployment
  • StatefulSet
  • DaemonSet
  • Job
  • CronJob

Functions of a controller:

  • Manages Pod objects
  • Associates with Pods via labels (see the example after this list)
  • Implements operation and maintenance of Pods, such as rolling updates, scaling, replica management, and maintaining Pod status
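Because the association is purely label-based, you can list exactly the Pods a controller manages with a label selector. A quick check, assuming the app: nginx label used by the Deployment example later in this post:

[root@master mainfest]# kubectl get pods -l app=nginx --show-labels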

2 Deployment

  • A Deployment provides declarative update capabilities for Pods and ReplicaSets.

  • You are responsible for describing the target state in the Deployment, and the Deployment controller changes the actual state to the desired state at a controlled rate. You can define a Deployment to create a new ReplicaSet, or delete an existing Deployment and adopt its resources through the new Deployment.

  • Deployment is very well suited to managing stateless applications on your cluster. All Pods in a Deployment are interchangeable and are replaced when necessary.

Functions of Deployment:
Built on ReplicaSet, it provides declarative updates for Pod and ReplicaSet resources, with the following features:

  • Event and status view: you can inspect the detailed progress and status of a Deployment upgrade
  • Rollback: if a problem is found after an upgrade completes, the application can be rolled back to a specified historical revision
  • Version record: every operation on the Deployment object is saved as a revision, so it can be rolled back to later
  • Pause and resume: every upgrade can be paused and resumed at any time
  • Multiple update schemes: Recreate (recreate all Pods) and RollingUpdate (rolling update)

Application scenarios:

  • Websites
  • Microservices
  • APIs

The update strategies of a Deployment are described below (a sample strategy stanza follows the list).

RollingUpdate strategy: the number of Pods managed by the old ReplicaSet decreases while the number managed by the new one increases, governed by two attributes:

  • maxSurge: the maximum number of Pods that can exist above the desired replica count during an upgrade; can be an absolute number or a percentage.
  • maxUnavailable: the maximum number of Pods (old and new versions combined) that can be unavailable below the desired count during an upgrade; can be an absolute number or a percentage.
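A minimal strategy stanza showing both attributes (the values here are illustrative; 25% is also the Kubernetes default for each):

spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%        # up to 5 Pods may exist during the upgrade
      maxUnavailable: 25%  # at least 3 Pods stay available throughout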

Create Deployment

[root@master mainfest]# vim deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy
  labels:
    app: nginx
spec: 
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template: 
    metadata: 
      labels:
        app: nginx
    spec: 
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        
[root@master mainfest]# kubectl apply -f deploy.yaml 
deployment.apps/deploy created
[root@master mainfest]# kubectl get pod
NAME                      READY   STATUS    RESTARTS   AGE
deploy-6799fc88d8-qmw4h   1/1     Running   0          46s
deploy-6799fc88d8-xn6mk   1/1     Running   0          46s
deploy-6799fc88d8-2dmrq   1/1     Running   0          46s

In this example:

  • A Deployment named deploy (indicated by the .metadata.name field) is created.
  • The Deployment creates three replicated Pods (indicated by the replicas field).
  • The selector field defines how the Deployment finds the Pods to manage. Here we select the label defined in the Pod template (app: nginx). More sophisticated selection rules are possible, as long as the Pod template itself satisfies the rule.
  • The template field contains the following subfields:
      ◦ The Pods are labeled app: nginx via the labels field.
      ◦ The Pod template's specification (the .template.spec field) instructs the Pods to run one container, nginx, using the nginx image from Docker Hub (nginx:latest here).
      ◦ The container is created and named nginx via the name field.
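With the Deployment running, the features listed earlier (rolling update, pause/resume, version record, rollback) map directly onto kubectl rollout commands. A sketch, assuming nginx:1.21 as an illustrative target tag:

[root@master mainfest]# kubectl set image deployment/deploy nginx=nginx:1.21  # trigger a rolling update
[root@master mainfest]# kubectl rollout status deployment/deploy             # watch upgrade progress
[root@master mainfest]# kubectl rollout pause deployment/deploy              # pause the upgrade
[root@master mainfest]# kubectl rollout resume deployment/deploy             # resume it
[root@master mainfest]# kubectl rollout history deployment/deploy            # view the version record
[root@master mainfest]# kubectl rollout undo deployment/deploy               # roll back to the previous revision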

3 ReplicaSet

The purpose of a ReplicaSet is to maintain a stable set of replica Pods running at any given time. It is therefore often used to guarantee the availability of a specified number of identical Pods.

How ReplicaSet works

  • A ReplicaSet is defined by a set of fields, including a selector that identifies which Pods it can acquire, a replica count indicating how many Pods it should maintain, and a Pod template specifying the Pods it should create to meet the replica count. Each ReplicaSet uses the provided Pod template whenever it creates a new Pod.

  • A ReplicaSet is linked to its Pods via the Pods' metadata.ownerReferences field, which specifies which resource owns the current object. The Pods acquired by a ReplicaSet carry the owning ReplicaSet's identifying information in their ownerReferences field. It is through this link that the ReplicaSet knows the state of the Pods it maintains and plans its actions accordingly.

  • A ReplicaSet identifies new Pods to acquire via its selector. If a Pod has no OwnerReference, or its OwnerReference is not a controller, and it matches a ReplicaSet's selector, the Pod is immediately acquired by that ReplicaSet. (You can verify the link as shown below.)
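You can check the ownership link on any controller-managed Pod by reading its ownerReferences. For example, for one of the Deployment Pods created earlier (the output shown is what you would expect, not captured):

[root@master mainfest]# kubectl get pod deploy-6799fc88d8-qmw4h -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}'
ReplicaSet/deploy-6799fc88d8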


Differences among ReplicationController, ReplicaSet and Deployment

  • ReplicaSet is the next generation of ReplicationController, and it is recommended in place of ReplicationController for replicating and managing Pods.

  • When you use a Deployment, the actual Pods are created and managed by the Deployment's ReplicaSets, not directly by the Deployment itself.

  • Each time a Deployment rolls out a new version, a ReplicaSet is created and kept as a revision record for rollback.

When to use ReplicaSet

  • A ReplicaSet ensures that a specified number of Pod replicas are running at any given time. However, Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to Pods, along with many other useful features. We therefore recommend using Deployments instead of using ReplicaSets directly, unless you need custom update orchestration or don't need updates at all.

  • This actually means that you may never need to manipulate ReplicaSet objects: use a Deployment instead, and define your application in its spec section.

Example:

[root@master mainfest]# vim replicaset.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: replicaset
  labels:
    app: httpd
    tier: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: httpd
        image: httpd:latest

[root@master mainfest]# kubectl apply -f replicaset.yaml 
replicaset.apps/replicaset created
[root@master mainfest]# kubectl get rs
NAME               DESIRED   CURRENT   READY   AGE
deploy-585449566   3         3         3       94m
replicaset         3         3         2       49s
[root@master mainfest]# kubectl get pod
NAME                      READY   STATUS    RESTARTS   AGE
deploy-6799fc88d8-qmw4h   1/1     Running   0          46s
deploy-6799fc88d8-xn6mk   1/1     Running   0          46s
deploy-6799fc88d8-2dmrq   1/1     Running   0          46s
replicaset-de35r          1/1     Running   0          54s
replicaset-l4kx4          1/1     Running   0          54s
replicaset-f295r          1/1     Running   0          54s
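To watch the ReplicaSet hold its replica count steady, delete one of its Pods and list them again; the controller creates a replacement immediately (the replacement Pod's name below is illustrative):

[root@master mainfest]# kubectl delete pod replicaset-de35r
pod "replicaset-de35r" deleted
[root@master mainfest]# kubectl get pod -l tier=frontend
NAME               READY   STATUS    RESTARTS   AGE
replicaset-l4kx4   1/1     Running   0          3m
replicaset-f295r   1/1     Running   0          3m
replicaset-x7m2q   1/1     Running   0          5s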

4 DaemonSet

The DaemonSet controller ensures that every node in the cluster runs exactly one copy of a particular Pod, which makes it suitable for system-level background tasks. Like other controllers it has a label selector, and you can also restrict it to run a Pod copy only on nodes that satisfy certain conditions, for example monitoring only nodes with SSD storage (see the nodeSelector sketch below).

It is often used to deploy cluster-wide log collection, monitoring, or other system management applications. Typical uses include:

  • Log collection, e.g. fluentd, logstash
  • Node monitoring, e.g. Prometheus Node Exporter, collectd, New Relic agent, Ganglia gmond
  • System programs, e.g. kube-proxy, kube-dns, glusterd, ceph

A simple usage is to start one DaemonSet, covering all nodes, for each type of daemon. A slightly more complex usage is to deploy multiple DaemonSets for the same kind of daemon, each with different flags and with different memory and CPU requests for different hardware types.
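To restrict a DaemonSet to a subset of nodes, as mentioned above, add a nodeSelector (or node affinity) to its Pod template. A minimal sketch, assuming nodes carry an illustrative disktype=ssd label:

spec:
  template:
    spec:
      nodeSelector:
        disktype: ssd   # only nodes labeled disktype=ssd run this daemon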

Create DaemonSet
You can describe a DaemonSet in a YAML file. For example, the daemonset.yaml file below describes a DaemonSet that runs the fluentd-elasticsearch image:

[root@master mainfest]# vim daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      # this toleration is to have the daemonset runnable on master nodes
      # remove it if your masters can't run pods
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

//Create and view
[root@master mainfest]# kubectl create -f daemonset.yaml 
daemonset.apps/fluentd-elasticsearch created
[root@master mainfest]# kubectl get pod -n kube-system
NAME                          READY   STATUS    RESTARTS   AGE
fluentd-elasticsearch-36bv8   1/1     Running   0          23s
fluentd-elasticsearch-b8d73   1/1     Running   0          23s

5 How DaemonSet Pods are scheduled

Scheduling via default scheduler

FEATURE STATE: Kubernetes v1.23 [stable]

A DaemonSet ensures that all eligible nodes run a copy of the Pod. Normally the Kubernetes scheduler selects the node a Pod runs on; DaemonSet Pods, however, are created and scheduled by the DaemonSet controller. This brings the following problems:

  • Inconsistent Pod behavior: normal Pods wait in Pending state to be scheduled after creation, but DaemonSet Pods are never created in Pending state. This confuses users.
  • Pod preemption is handled by the default scheduler. When preemption is enabled, the DaemonSet controller makes scheduling decisions without considering Pod priority and preemption.

ScheduleDaemonSetPods lets you schedule DaemonSets using the default scheduler instead of the DaemonSet controller: it adds a NodeAffinity term to the DaemonSet Pods rather than the spec.nodeName term, and the default scheduler then binds each Pod to its target host. If a DaemonSet Pod already has node affinity configured, it is replaced (the original node affinity is taken into account before the target host is selected). The DaemonSet controller performs these operations only when creating or modifying DaemonSet Pods; no changes are made to the spec.template of the DaemonSet.

nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchFields:
      - key: metadata.name
        operator: In
        values:
        - target-host-name
  • In addition, the system automatically adds the node.kubernetes.io/unschedulable:NoSchedule toleration to DaemonSet Pods, so the default scheduler ignores the unschedulable field of nodes when scheduling them. (The toleration is sketched below.)
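In the Pod spec, that automatically added toleration looks like this (shown only for illustration; you do not need to add it yourself):

tolerations:
- key: node.kubernetes.io/unschedulable
  operator: Exists
  effect: NoSchedule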

6 Jobs

The Job controller uses Pod objects to run one-off tasks. When the process in the container completes normally it is not restarted; instead the Pod object is placed in the "Completed" state. If the process terminates because of an error, the restart policy determines whether to restart it, and a Pod object whose task has not finished is rescheduled if it is terminated unexpectedly, for example because its node fails.

  • A Job creates one or more Pods and keeps retrying them until a specified number terminate successfully. As Pods complete successfully, the Job tracks how many have succeeded; when that number reaches the success threshold, the task (i.e. the Job) is complete. Deleting a Job cleans up all the Pods it created. Suspending a Job deletes all of its active Pods until the Job is resumed.

  • In a simple usage scenario, you create one Job object in order to reliably run a Pod to completion. The Job object starts a new Pod if the first Pod fails or is deleted (for example, because of a node hardware failure or reboot).

  • You can also use a Job to run multiple Pods in parallel, as sketched below.
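Parallelism is controlled by two fields in the Job spec. A minimal sketch (values illustrative): a Job that must see 6 Pods succeed, running at most 2 at a time:

spec:
  completions: 6    # the Job is done after 6 successful Pods
  parallelism: 2    # run at most 2 Pods concurrently

The simple example below runs a single Pod to completion: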

[root@master mainfest]# vim jobs.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

[root@master mainfest]# kubectl apply -f jobs.yaml 
job.batch/pi created

[root@master mainfest]# kubectl describe jobs/pi
Name:           pi
Namespace:      default
Selector:       controller-uid=73b759ad-23db-879b-927c-d73146fd359
Labels:         controller-uid=73b759ad-23db-879b-927c-d73146fd359
                job-name=pi
Annotations:    <none>
Parallelism:    1
Completions:    1
Start Time:     Fri, 24 Dec 2021 23:57:23 -0500
Pods Statuses:  1 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=73b759ad-23db-879b-927c-d73146fd359
           job-name=pi
  Containers:
   pi:
    Image:      perl
    Port:       <none>
    Host Port:  <none>
    Command:
      perl
      -Mbignum=bpi
      -wle
      print bpi(2000)
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  48s   job-controller  Created pod: pi-rm6dx
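Once the Pod reaches Completed, the computed digits of π can be read from its log (pod name pi-rm6dx taken from the event above; output truncated here):

[root@master mainfest]# kubectl logs pi-rm6dx
3.14159265358979323846...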

7 CronJob

FEATURE STATE: Kubernetes v1.21 [stable]

A CronJob creates Jobs on a repeating schedule.

A CronJob object is like a line in a crontab (cron table) file. It is written in Cron format and executes jobs periodically at a given scheduling time.

Take note:

  • All CronJob schedule times are based on the time zone of the kube-controller-manager.

  • If your control plane runs the kube-controller-manager in a Pod or a bare container, the time zone set for that container determines the time zone used by the CronJob controller.

  • When creating the manifest for a CronJob resource, make sure the name you provide is a valid DNS subdomain name of no more than 52 characters. This is because the CronJob controller automatically appends 11 characters to the Job names it creates, and Job names are limited to 63 characters at most.

  • CronJobs perform periodic actions such as backups and report generation. Each task should be configured to recur indefinitely (e.g. daily / weekly / monthly); you define the point within that interval at which the task starts. (The cron schedule syntax is summarized below.)
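The schedule field uses standard five-field cron syntax:

# * * * * *
# | | | | |
# | | | | +-- day of the week (0 - 6, Sunday = 0)
# | | | +---- month (1 - 12)
# | | +------ day of the month (1 - 31)
# | +-------- hour (0 - 23)
# +---------- minute (0 - 59)

So "*/1 * * * *" in the example below fires every minute, and, say, "0 2 * * 0" would fire at 02:00 every Sunday.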

The following sample CronJob manifest prints the current time and a greeting message every minute:

[root@master mainfest]# vim cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: haha
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure


//Run and view
[root@master mainfest]# kubectl apply -f cronjob.yaml 
cronjob.batch/haha created

[root@master mainfest]# kubectl get pods
NAME                    READY   STATUS      RESTARTS   AGE
haha-2736939037-qnvvx   0/1     Completed   0          3m
haha-3674348466-mcmxs   0/1     Completed   0          2m
haha-1890349842-6gqv4   0/1     Completed   0          60s
[root@master mainfest]# kubectl logs haha-1890349842-6gqv4
Fri Dec 24 23:59:27 UTC 2021
Hello from the Kubernetes cluster

Topics: Operation & Maintenance Docker Kubernetes