Operator-SDK: Custom CRD for Node request information collection

Posted by amit on Fri, 11 Feb 2022 01:09:07 +0100

This is a demo of information collection: a custom CRD and controller are written to collect each Node's request information, primarily the CPU and memory requests. Part of the code is based on the source implementation of describe in the kubectl command.

Cluster Information

First, kubectl get node shows the cluster layout: one master node and three slave nodes.

$ kubectl get node
NAME                    STATUS   ROLES                  AGE   VERSION     
10.100.100.130-slave    Ready    <none>                 49d   v1.21.5-hc.1
10.100.100.131-master   Ready    control-plane,master   50d   v1.21.5-hc.1
10.100.100.144-slave    Ready    <none>                 49d   v1.21.5-hc.1
10.100.100.147-slave    Ready    <none>                 49d   v1.21.5-hc.1

Controller Configuration

Create a Noderequest CRD using the Operator SDK.

In the controller file (controllers/memcached_controller.go in the Operator SDK sample layout), the SetupWithManager() function specifies how the controller is built and which CR and other resources owned and managed by the controller it watches.

For(&v1.Node{}) specifies the Node type as the primary resource to watch. For each Add/Update/Delete event on a Node, the reconcile loop receives a reconcile Request (a namespace/name key) for that Node object.

Owns(&v1.Pod{}) specifies the Pod type as a secondary resource to watch. For each Add/Update/Delete event on a Pod, the event handler maps the event to a reconcile Request for the Pod's owner.

Watches(&source.Kind{Type: &v1.Pod{}}, &handler.EnqueueRequestForObject{}) additionally watches Pods directly, enqueuing a Request for the Pod object itself.

// SetupWithManager sets up the controller with the Manager.
func (r *NoderequestReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		//For(&noderequestv1alpha1.Noderequest{}).
		For(&v1.Node{}).
		Owns(&v1.Pod{}).
		Watches(&source.Kind{Type: &v1.Pod{}}, &handler.EnqueueRequestForObject{}).
		WithEventFilter(watchPodChange()).
		Complete(r)
}
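
The snippets throughout this post omit the import block. Assuming the standard controller-runtime scaffold generated by the Operator SDK at the time, the controller file needs roughly the following imports; this is only a sketch, and the project's own API package (aliased noderequestv1alpha1 above) would be added back if the commented-out For line is restored:

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/log"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
	"sigs.k8s.io/controller-runtime/pkg/source"
)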

Using Predicates to filter events with the Operator SDK

Events are generated by the Sources assigned to the resources the controller watches. EventHandlers convert these events into reconcile Requests and pass them to Reconcile(). Predicates allow the controller to filter events before they are handed to the EventHandlers. Filtering is useful because a controller usually only cares about certain kinds of events, and it reduces the number of interactions with the API server, since Reconcile() is only called for events that make it through the EventHandlers.

WithEventFilter(watchPodChange()) registers the custom watchPodChange() function, which filters the Add/Update/Delete events individually:

func watchPodChange() predicate.Predicate {
	return predicate.Funcs{
		UpdateFunc: func(e event.UpdateEvent) bool {
			if e.ObjectOld.GetNamespace() == "" {
				return false
			} else {
				fmt.Println("update: ", e.ObjectOld.GetName())
				return true
			}
		},
		DeleteFunc: func(e event.DeleteEvent) bool {
			// watch delete
			fmt.Println("delete: ", e.Object.GetName())
			return e.DeleteStateUnknown
		},
		CreateFunc: func(e event.CreateEvent) bool {
			// watch create
			//fmt.Println("create: ", e.Object.GetName())
			if e.Object.GetNamespace() == "" {
				return true
			} else {
				return false
			}
		},
	}
}

Update

Because each Node periodically sends heartbeats to the API server for health checking, every Node generates a large number of update events. To react only to Node request changes that are caused by Pod changes, these Node update events need to be filtered out. The if condition inside the update event distinguishes whether a Node or a Pod changed: since the significant difference between the two is that a Node has no Namespace, the check is e.ObjectOld.GetNamespace() == "".

All update events for any watched resource are passed to Funcs.UpdateFunc() and filtered out when the method evaluates to false. If no Predicate method is registered for a particular event type, events of that type are not filtered.

Delete

When a delete event is detected, the object's name is printed and e.DeleteStateUnknown is returned; this field is true when the delete was missed by the watch and the object's final state is unknown, so only such deletes pass the filter. All delete events for any watched resource are passed to Funcs.DeleteFunc() and filtered out when the method evaluates to false; if no Predicate method is registered for a particular event type, events of that type are not filtered.

Create

Since creating a Pod is always accompanied by update events, there is no need to watch Pod creation events separately; they are already handled by the update branch. The create branch therefore only watches Node creation: e.Object.GetNamespace() == "" determines whether the created object is a Node, and if so the request information is recalculated. The behaviour of all three branches is sketched below.
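
To make the filtering rules concrete, here is a minimal standalone sketch (not part of the controller) that calls the predicate directly with hand-built events. It additionally assumes metav1 = k8s.io/apimachinery/pkg/apis/meta/v1, and the Pod name is made up for illustration:

// sketch: exercising watchPodChange outside the cluster
p := watchPodChange()

node := &v1.Node{ObjectMeta: metav1.ObjectMeta{Name: "10.100.100.131-master"}}
pod := &v1.Pod{ObjectMeta: metav1.ObjectMeta{Name: "demo-pod", Namespace: "default"}}

// Node update (e.g. a heartbeat): no Namespace, so it is filtered out
fmt.Println(p.Update(event.UpdateEvent{ObjectOld: node, ObjectNew: node})) // false

// Pod update: has a Namespace, so it reaches Reconcile
fmt.Println(p.Update(event.UpdateEvent{ObjectOld: pod, ObjectNew: pod})) // true

// Node creation passes; Pod creation is ignored (already covered by updates)
fmt.Println(p.Create(event.CreateEvent{Object: node})) // true
fmt.Println(p.Create(event.CreateEvent{Object: pod}))  // false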

Reconcile loop

Reconcile is responsible for driving the actual state of the system toward the state required by the CR. It runs every time a watched CR or resource changes and returns a result depending on whether those states already match.

Declare a variable name to hold the Node's name:

var name client.ObjectKey

An if condition determines whether the current request refers to a Node or a Pod, based on the fact that a Node's Namespace is empty while a Pod's is not.

When the request refers to a Pod, i.e. req.NamespacedName.Namespace != "", pod := &v1.Pod{} is used to fetch the Pod. If the Get fails, an error is printed to the console with fmt.Println("ERROR[GetPod]:", err).

name.Name = pod.Spec.NodeName takes the name of the Node on which the Pod is running.

node := &v1.Node{} then fetches that Node, and pod.Status.Phase is checked for the Running state. The reason is that a newly created Pod is not yet Running when its create event arrives; only a later update event reports it as Running. Without this check the recalculation would run very frequently, so it filters out a large number of Pod requests: the Node's CPU and memory information is only recalculated once the Pod is actually Running.

func (r *NoderequestReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	_ = log.FromContext(ctx)

	// your logic here

	var name client.ObjectKey // get node name

	if req.NamespacedName.Namespace != "" {
		pod := &v1.Pod{}
		err := r.Client.Get(ctx, req.NamespacedName, pod)
		if err != nil {
			fmt.Println("ERROR[GetPod]:", err)
			return ctrl.Result{}, nil
		}
		name.Name = pod.Spec.NodeName // use for.pod
		fmt.Println(name.Name)
		node := &v1.Node{}
		err = r.Client.Get(ctx, name, node)
		if err != nil {
			fmt.Println("ERROR[GetNode]:", err)
			return ctrl.Result{}, nil
		}
		if pod.Status.Phase == "Running" {
			compute(ctx, req, name, r, node)
		}
	} else {
		name.Name = req.NamespacedName.Name // use for.node
		fmt.Println(name.Name)
		node := &v1.Node{}
		err := r.Client.Get(ctx, name, node)
		if err != nil {
			fmt.Println("ERROR[GetNode]:", err)
			return ctrl.Result{}, nil
		}
		compute(ctx, req, name, r, node)
	}

	return ctrl.Result{}, nil
}

When the request refers to a Node, the name of the updated Node is printed and its requests are recalculated.

Compute function

The compute function calculates the Node's requests. The code is based on the source implementation of describe in the kubectl command and uses the name passed in to determine which Node's information needs to be recalculated.

Viewing node information using describe

First, look at the master node with describe so that the calculated results can be compared against it. Some of the output is omitted for readability:

$ kubectl describe node 10.100.100.131-master
Name:               10.100.100.131-master        
Roles:              control-plane,master

...

Addresses:
  InternalIP:  10.100.100.131
  Hostname:    10.100.100.131-master
Capacity:
  cpu:                4
  ephemeral-storage:  51175Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8111548Ki
  pods:               63
Allocatable:
  cpu:                4
  ephemeral-storage:  48294789041
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             7599548Ki
  pods:               63
  
...

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                2632m (65%)   5110m (127%)
  memory             2237Mi (30%)  7223Mi (97%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)

Get all Pods

compute is the core of the calculation, so its implementation is worth walking through in detail.

First, pods := &v1.PodList{} prepares a list that will receive all Pod information.

client.InNamespace("") in the ListOptions restricts the List call to a namespace; when the value is empty, Pods from all namespaces are returned.

opts := []client.ListOption{
	client.InNamespace(""),
}
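
The Pod list is then fetched through the controller's client; the same call appears in the full compute listing below:

err := r.Client.List(ctx, pods, opts...)
if err != nil {
	fmt.Println("ERROR[List]:", err)
}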

Then allocatable := node.Status.Capacity takes the node's capacity as the base for the calculation; if node.Status.Allocatable is populated, it is used instead.
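
The corresponding lines in compute (shown in full later) are:

// prefer Allocatable when the Node reports it, otherwise fall back to Capacity
allocatable := node.Status.Capacity
if len(node.Status.Allocatable) > 0 {
	allocatable = node.Status.Allocatable
}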

Get the CPU and memory requests of all Pods

The following references the source implementation of describe in the kubectl command.

for _, pod := range pods.Items loops over all Pods, but first the if condition decides which Pods are counted: Pods in the Succeeded or Failed phase are excluded, just as the describe implementation in kubectl excludes those two phases when calculating requests.

The name passed in also ensures that only Pods scheduled to this Node are counted; Pods on other Nodes are not included in the calculation. The condition is as follows:

if pod.Status.Phase != v1.PodSucceeded && pod.Status.Phase != v1.PodFailed && pod.Spec.NodeName == name.Name

Taking the CPU request as an example: dividing the CPU requested by the current container by the node's allocatable CPU and converting it to a percentage gives the share of the node that the container requests; the memory calculation works the same way:

fractionCpuReq := float64(container.Resources.Requests.Cpu().MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100
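
As a standalone sketch of the Quantity arithmetic involved (values taken from the per-container lines in the output below, where a container requesting 100m CPU on a 4-CPU node shows up as 2%):

req := resource.MustParse("100m") // container CPU request
alloc := resource.MustParse("4")  // node allocatable CPU: 4 cores = 4000m

fractionCpuReq := float64(req.MilliValue()) / float64(alloc.MilliValue()) * 100
fmt.Printf("%d%%\n", int64(fractionCpuReq)) // 2, since 100/4000*100 = 2.5 is truncated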

To make it easy to verify the calculation, the values for every counted container are printed to the console. For readability, containers whose CPU and memory requests are both zero are skipped, since they do not affect the final result; a line is printed only when at least one of the two is non-zero:

if container.Resources.Requests.Cpu().String() != "0" || container.Resources.Requests.Memory().String() != "0" {
					fmt.Printf("ReqC: %s(%d%%)\tReqM:  %s(%d%%)\tLimC: %s(%d%%)\tLimM:  %s(%d%%)\n",
						container.Resources.Requests.Cpu().String(),
						int64(fractionCpuReq),
						container.Resources.Requests.Memory().String(),
						int64(fractionMemoryReq),
						container.Resources.Limits.Cpu().String(),
						int64(fractionCpuLimits),
						container.Resources.Limits.Memory().String(),
						int64(fractionMemoryLimits),
					)
				}

addResourceList function

The addResourceList function adds every quantity in the new list to the running totals in list. The key name takes the values cpu and memory, which keeps the CPU and memory sums separate:

// addResourceList adds the resources in new to list
func addResourceList(list, new v1.ResourceList) {
	for name, quantity := range new {
		if value, ok := list[name]; !ok {
			list[name] = quantity.DeepCopy()
		} else {
			value.Add(quantity)
			list[name] = value
		}
	}
}
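
A short standalone usage sketch (made-up quantities) shows how repeated calls accumulate per resource name:

total := v1.ResourceList{}
addResourceList(total, v1.ResourceList{
	v1.ResourceCPU:    resource.MustParse("100m"),
	v1.ResourceMemory: resource.MustParse("128Mi"),
})
addResourceList(total, v1.ResourceList{
	v1.ResourceCPU: resource.MustParse("250m"),
})

cpu, mem := total[v1.ResourceCPU], total[v1.ResourceMemory]
fmt.Println(cpu.String(), mem.String()) // 350m 128Mi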

Sum

Loop through the Pod list and accumulate each Pod's requests and limits. A limit of 0 means no limit is set, so pod overhead is only added to limits that are already non-zero; Pods in a terminal (Succeeded/Failed) phase have already been excluded above.

podReqs, podLimits := v1.ResourceList{}, v1.ResourceList{} hold the requests and limits of a single Pod: every container of the Pod is accumulated into them, and they are then merged into the node-wide totals. The summing code is as follows:

// sum: per-Pod totals, declared once per Pod before the container loop
podReqs, podLimits := v1.ResourceList{}, v1.ResourceList{}
for _, container := range pod.Spec.Containers {
	// ... per-container logging shown above ...
	addResourceList(podReqs, container.Resources.Requests)
	addResourceList(podLimits, container.Resources.Limits)
}
// Add overhead for running a pod to the sum of requests and to non-zero limits:
if pod.Spec.Overhead != nil {
	addResourceList(podReqs, pod.Spec.Overhead)
	for name, quantity := range pod.Spec.Overhead {
		if value, ok := podLimits[name]; ok && !value.IsZero() {
			value.Add(quantity)
			podLimits[name] = value
		}
	}
}

Define reqs and limits to store the requests and limits of all Pods included in the calculation:

reqs, limits := map[v1.ResourceName]resource.Quantity{}, map[v1.ResourceName]resource.Quantity{}

Finally, merge each Pod's podReqs and podLimits into reqs and limits. Taking requests as the example, podReqName (cpu or memory) selects which total the value is added to:

for podReqName, podReqValue := range podReqs {
					if value, ok := reqs[podReqName]; !ok {
						reqs[podReqName] = podReqValue.DeepCopy()
					} else {
						value.Add(podReqValue)
						reqs[podReqName] = value
					}
				}
				for podLimitName, podLimitValue := range podLimits {
					if value, ok := limits[podLimitName]; !ok {
						limits[podLimitName] = podLimitValue.DeepCopy()
					} else {
						value.Add(podLimitValue)
						limits[podLimitName] = value
					}
				}

At this point cpuReqs, cpuLimits, memoryReqs and memoryLimits hold the summed CPU and memory requests and the summed CPU and memory limits:

cpuReqs, cpuLimits, memoryReqs, memoryLimits := reqs[v1.ResourceCPU], limits[v1.ResourceCPU], reqs[v1.ResourceMemory], limits[v1.ResourceMemory]

Taking cpuReqs as an example, divide the summed CPU requests by the allocatable CPU value and convert it to a percentage:

fractionCpuReqs = float64(cpuReqs.MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100

The same pattern applies to the remaining values:

if allocatable.Cpu().MilliValue() != 0 {
		fractionCpuReqs = float64(cpuReqs.MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100
		fractionCpuLimits = float64(cpuLimits.MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100
	}
	fractionMemoryReqs := float64(0)
	fractionMemoryLimits := float64(0)
	if allocatable.Memory().Value() != 0 {
		fractionMemoryReqs = float64(memoryReqs.Value()) / float64(allocatable.Memory().Value()) * 100
		fractionMemoryLimits = float64(memoryLimits.Value()) / float64(allocatable.Memory().Value()) * 100
	}
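
Plugging in the master node's numbers from the describe output above: allocatable CPU is 4 cores (4000m) and the summed CPU requests are 2632m, so fractionCpuReqs = 2632 / 4000 * 100 ≈ 65.8, which int64 truncates to the 65% reported by describe.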

The compute source code is as follows:

func compute(ctx context.Context, req ctrl.Request, name client.ObjectKey, r *NoderequestReconciler, node *v1.Node) {
	pods := &v1.PodList{} // get all pods
	opts := []client.ListOption{
		client.InNamespace(""),
	}
	err := r.Client.List(ctx, pods, opts...)
	if err != nil {
		fmt.Println("ERROR[List]:", err)
	}

	allocatable := node.Status.Capacity
	if len(node.Status.Allocatable) > 0 {
		allocatable = node.Status.Allocatable
	}

	reqs, limits := map[v1.ResourceName]resource.Quantity{}, map[v1.ResourceName]resource.Quantity{}

	// get request cpu & mem
	for _, pod := range pods.Items {
		// count only Pods on this Node that are not in a terminal phase,
		// matching the kubectl describe implementation
		if pod.Status.Phase != v1.PodSucceeded && pod.Status.Phase != v1.PodFailed && pod.Spec.NodeName == name.Name {
			// per-Pod totals, merged into reqs/limits once all containers are summed
			podReqs, podLimits := v1.ResourceList{}, v1.ResourceList{}
			for _, container := range pod.Spec.Containers {
				// per-container fractions, used only for the log output
				fractionCpuReq := float64(container.Resources.Requests.Cpu().MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100
				fractionMemoryReq := float64(container.Resources.Requests.Memory().Value()) / float64(allocatable.Memory().Value()) * 100
				fractionCpuLimits := float64(container.Resources.Limits.Cpu().MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100
				fractionMemoryLimits := float64(container.Resources.Limits.Memory().Value()) / float64(allocatable.Memory().Value()) * 100
				if container.Resources.Requests.Cpu().String() != "0" || container.Resources.Requests.Memory().String() != "0" {
					fmt.Printf("ReqC: %s(%d%%)\tReqM:  %s(%d%%)\tLimC: %s(%d%%)\tLimM:  %s(%d%%)\n",
						container.Resources.Requests.Cpu().String(),
						int64(fractionCpuReq),
						container.Resources.Requests.Memory().String(),
						int64(fractionMemoryReq),
						container.Resources.Limits.Cpu().String(),
						int64(fractionCpuLimits),
						container.Resources.Limits.Memory().String(),
						int64(fractionMemoryLimits),
					)
				}
				// sum: accumulate this container into the per-Pod totals
				addResourceList(podReqs, container.Resources.Requests)
				addResourceList(podLimits, container.Resources.Limits)
			}
			// Add overhead for running a pod to the sum of requests and to non-zero limits:
			if pod.Spec.Overhead != nil {
				addResourceList(podReqs, pod.Spec.Overhead)
				for name, quantity := range pod.Spec.Overhead {
					if value, ok := podLimits[name]; ok && !value.IsZero() {
						value.Add(quantity)
						podLimits[name] = value
					}
				}
			}
			// merge the per-Pod totals into the node-wide totals
			for podReqName, podReqValue := range podReqs {
				if value, ok := reqs[podReqName]; !ok {
					reqs[podReqName] = podReqValue.DeepCopy()
				} else {
					value.Add(podReqValue)
					reqs[podReqName] = value
				}
			}
			for podLimitName, podLimitValue := range podLimits {
				if value, ok := limits[podLimitName]; !ok {
					limits[podLimitName] = podLimitValue.DeepCopy()
				} else {
					value.Add(podLimitValue)
					limits[podLimitName] = value
				}
			}
		}
	}
	fmt.Printf("Resource\tRequests\tLimits\n")
	fmt.Printf("--------\t--------\t------\n")

	cpuReqs, cpuLimits, memoryReqs, memoryLimits := reqs[v1.ResourceCPU], limits[v1.ResourceCPU], reqs[v1.ResourceMemory], limits[v1.ResourceMemory]
	fractionCpuReqs := float64(0)
	fractionCpuLimits := float64(0)
	if allocatable.Cpu().MilliValue() != 0 {
		fractionCpuReqs = float64(cpuReqs.MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100
		fractionCpuLimits = float64(cpuLimits.MilliValue()) / float64(allocatable.Cpu().MilliValue()) * 100
	}
	fractionMemoryReqs := float64(0)
	fractionMemoryLimits := float64(0)
	if allocatable.Memory().Value() != 0 {
		fractionMemoryReqs = float64(memoryReqs.Value()) / float64(allocatable.Memory().Value()) * 100
		fractionMemoryLimits = float64(memoryLimits.Value()) / float64(allocatable.Memory().Value()) * 100
	}

	fmt.Printf("%s\t%s (%d%%)\t%s (%d%%)\n", v1.ResourceCPU, cpuReqs.String(), int64(fractionCpuReqs), cpuLimits.String(), int64(fractionCpuLimits))
	fmt.Printf("%s\t%s (%d%%)\t%s (%d%%)\n", v1.ResourceMemory, memoryReqs.String(), int64(fractionMemoryReqs), memoryLimits.String(), int64(fractionMemoryLimits))

	fmt.Println("--------------------------------------------")
}

Final result

You can see that the requests and limits of the three slave nodes and the one master node are output in a format similar to describe, along with a line for every container that participates in the calculation:

10.100.100.130-slave
ReqC: 100m(2%)  ReqM:  128Mi(0%)        LimC: 500m(12%) LimM:  512Mi(3%)
ReqC: 0(0%)     ReqM:  200Mi(1%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 100m(2%)  ReqM:  128Mi(0%)        LimC: 100m(2%)  LimM:  128Mi(0%)
ReqC: 102m(2%)  ReqM:  180Mi(1%)        LimC: 250m(6%)  LimM:  180Mi(1%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  60Mi(0%)
ReqC: 1(25%)    ReqM:  2Gi(13%) LimC: 2(50%)    LimM:  4Gi(26%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 100m(2%)  ReqM:  20Mi(0%) LimC: 100m(2%)  LimM:  30Mi(0%)
ReqC: 100m(2%)  ReqM:  128Mi(0%)        LimC: 100m(2%)  LimM:  128Mi(0%)
ReqC: 250m(6%)  ReqM:  250Mi(1%)        LimC: 0(0%)     LimM:  0(0%)
Resource        Requests        Limits
--------        --------        ------
cpu     2062m (51%)     3370m (84%)
memory  3177Mi (20%)    5209Mi (33%)
--------------------------------------------
10.100.100.131-master
ReqC: 0(0%)     ReqM:  200Mi(2%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 102m(2%)  ReqM:  180Mi(2%)        LimC: 250m(6%)  LimM:  180Mi(2%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  60Mi(0%)
ReqC: 100m(2%)  ReqM:  100Mi(1%)        LimC: 200m(5%)  LimM:  200Mi(2%)
ReqC: 200m(5%)  ReqM:  256Mi(3%)        LimC: 1(25%)    LimM:  2Gi(27%)
ReqC: 100m(2%)  ReqM:  0(0%)    LimC: 0(0%)     LimM:  0(0%)
ReqC: 250m(6%)  ReqM:  250Mi(3%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 500m(12%) ReqM:  128Mi(1%)        LimC: 1(25%)    LimM:  256Mi(3%)
ReqC: 50m(1%)   ReqM:  128Mi(1%)        LimC: 100m(2%)  LimM:  256Mi(3%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  40Mi(0%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  40Mi(0%)
ReqC: 100m(2%)  ReqM:  150Mi(2%)        LimC: 100m(2%)  LimM:  150Mi(2%)
ReqC: 100m(2%)  ReqM:  128Mi(1%)        LimC: 100m(2%)  LimM:  128Mi(1%)
ReqC: 250m(6%)  ReqM:  0(0%)    LimC: 0(0%)     LimM:  0(0%)
ReqC: 100m(2%)  ReqM:  128Mi(1%)        LimC: 200m(5%)  LimM:  256Mi(3%)
ReqC: 200m(5%)  ReqM:  256Mi(3%)        LimC: 1(25%)    LimM:  2Gi(27%)
ReqC: 100m(2%)  ReqM:  128Mi(1%)        LimC: 500m(12%) LimM:  512Mi(6%)
ReqC: 100m(2%)  ReqM:  100Mi(1%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 200m(5%)  ReqM:  0(0%)    LimC: 0(0%)     LimM:  0(0%)
ReqC: 50m(1%)   ReqM:  20Mi(0%) LimC: 500m(12%) LimM:  1Gi(13%)
Resource        Requests        Limits
--------        --------        ------
cpu     2632m (65%)     5110m (127%)
memory  2237Mi (30%)    7223Mi (97%)
--------------------------------------------
10.100.100.144-slave
ReqC: 102m(2%)  ReqM:  180Mi(1%)        LimC: 250m(6%)  LimM:  180Mi(1%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  60Mi(0%)
ReqC: 500m(12%) ReqM:  4Gi(26%) LimC: 2(50%)    LimM:  8Gi(52%)
ReqC: 100m(2%)  ReqM:  128Mi(0%)        LimC: 500m(12%) LimM:  512Mi(3%)
ReqC: 200m(5%)  ReqM:  512Mi(3%)        LimC: 1(25%)    LimM:  1Gi(6%)
ReqC: 100m(2%)  ReqM:  128Mi(0%)        LimC: 100m(2%)  LimM:  128Mi(0%)
ReqC: 1(25%)    ReqM:  2Gi(13%) LimC: 2(50%)    LimM:  4Gi(26%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 100m(2%)  ReqM:  128Mi(0%)        LimC: 200m(5%)  LimM:  256Mi(1%)
ReqC: 250m(6%)  ReqM:  250Mi(1%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 50m(1%)   ReqM:  100Mi(0%)        LimC: 50m(1%)   LimM:  100Mi(0%)
ReqC: 0(0%)     ReqM:  200Mi(1%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 100m(2%)  ReqM:  70Mi(0%) LimC: 0(0%)     LimM:  170Mi(1%)
Resource        Requests        Limits
--------        --------        ------
cpu     2812m (70%)     6420m (160%)
memory  7935Mi (51%)    14793Mi (95%)
--------------------------------------------
10.100.100.147-slave
ReqC: 102m(2%)  ReqM:  180Mi(1%)        LimC: 250m(6%)  LimM:  180Mi(1%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  60Mi(0%)
ReqC: 250m(6%)  ReqM:  250Mi(1%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 100m(2%)  ReqM:  100Mi(0%)        LimC: 200m(5%)  LimM:  200Mi(1%)
ReqC: 100m(2%)  ReqM:  70Mi(0%) LimC: 0(0%)     LimM:  170Mi(1%)
ReqC: 100m(2%)  ReqM:  20Mi(0%) LimC: 100m(2%)  LimM:  30Mi(0%)
ReqC: 200m(5%)  ReqM:  512Mi(3%)        LimC: 1(25%)    LimM:  1Gi(6%)
ReqC: 100m(2%)  ReqM:  128Mi(0%)        LimC: 500m(12%) LimM:  512Mi(3%)
Resource        Requests        Limits
--------        --------        ------
cpu     962m (24%)      2070m (51%)
memory  1280Mi (8%)     2176Mi (14%)
--------------------------------------------

Compared with the earlier output of the describe command, the master node's result is correct:

Resource           Requests      Limits
  --------           --------      ------
  cpu                2632m (65%)   5110m (127%)
  memory             2237Mi (30%)  7223Mi (97%)

There is also a Pod in the cluster that is updated periodically, so the controller detects the update events and recalculates the corresponding Node:

update:  velero-745bf958b4-xpfw8
update:  velero-745bf958b4-xpfw8
10.100.100.131-master
ReqC: 0(0%)     ReqM:  200Mi(2%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 100m(2%)  ReqM:  25Mi(0%) LimC: 100m(2%)  LimM:  25Mi(0%)
ReqC: 102m(2%)  ReqM:  180Mi(2%)        LimC: 250m(6%)  LimM:  180Mi(2%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  60Mi(0%)
ReqC: 200m(5%)  ReqM:  256Mi(3%)        LimC: 1(25%)    LimM:  2Gi(27%)
ReqC: 100m(2%)  ReqM:  100Mi(1%)        LimC: 200m(5%)  LimM:  200Mi(2%)
ReqC: 100m(2%)  ReqM:  0(0%)    LimC: 0(0%)     LimM:  0(0%)
ReqC: 250m(6%)  ReqM:  250Mi(3%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 500m(12%) ReqM:  128Mi(1%)        LimC: 1(25%)    LimM:  256Mi(3%)
ReqC: 50m(1%)   ReqM:  128Mi(1%)        LimC: 100m(2%)  LimM:  256Mi(3%)
ReqC: 100m(2%)  ReqM:  128Mi(1%)        LimC: 100m(2%)  LimM:  128Mi(1%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  40Mi(0%)
ReqC: 10m(0%)   ReqM:  20Mi(0%) LimC: 20m(0%)   LimM:  40Mi(0%)
ReqC: 100m(2%)  ReqM:  150Mi(2%)        LimC: 100m(2%)  LimM:  150Mi(2%)
ReqC: 250m(6%)  ReqM:  0(0%)    LimC: 0(0%)     LimM:  0(0%)
ReqC: 100m(2%)  ReqM:  128Mi(1%)        LimC: 200m(5%)  LimM:  256Mi(3%)
ReqC: 200m(5%)  ReqM:  256Mi(3%)        LimC: 1(25%)    LimM:  2Gi(27%)
ReqC: 100m(2%)  ReqM:  128Mi(1%)        LimC: 500m(12%) LimM:  512Mi(6%)
ReqC: 100m(2%)  ReqM:  100Mi(1%)        LimC: 0(0%)     LimM:  0(0%)
ReqC: 200m(5%)  ReqM:  0(0%)    LimC: 0(0%)     LimM:  0(0%)
ReqC: 50m(1%)   ReqM:  20Mi(0%) LimC: 500m(12%) LimM:  1Gi(13%)
Resource        Requests        Limits
--------        --------        ------
cpu     2632m (65%)     5110m (127%)
memory  2237Mi (30%)    7223Mi (97%)
--------------------------------------------

Topics: Go Kubernetes Back-end Container Cloud Native