Article directory
- Deploy Kube Prometheus
- Modify grafana
- Create directory on 32.94, create pv, pvc
- Create svc of grafana MySQL
- Create cm for grafana.ini
- Modify the configuration in grafana deployment, add the operation of mounting configmap and pvc
- Resubmit grafana deploy
- Creating ingress for grafana
- Create a service for the Kube scheduler
- Create the service of Kube Controller Manager
- There is also proxy monitoring in grafana
- Change the storage of prometheus to local pvc
- Try to modify the resource file of prometheus to change the running parameters for prom
- Change the storage of prom to local pvc
- Change the storage of alertmanager to local pvc
- Create the serviceMonitor of ingress nginx
- Error: prom cannot access resources due to insufficient default RBAC permissions
- spec.ports added in ingress-nginx-svc
- Add dashboard of ingress nginx in grafana
- metrics api in Kube prom
- Test hpav2
- Submit the resources under experimental/custom-metrics-api
- Try a new image on 36.55
- Change image, resubmit
- Reference resources
The Prometheus Operator page points to the Kube Prometheus project, which is built around the Prometheus Operator and defines the cluster-related monitoring resources; it is the project intended for actual deployment.
This article is based on Kube Prometheus and makes the following changes:
- Modify the storage of grafana, Prometheus and alertmanager to local pvc
- Added ingress access
- Added the ServiceMonitor for ingress-nginx (and verified that the added monitoring works)
- Fixed the custom metrics error and implemented HPA v2
Deploy Kube Prometheus
Download related files
# extraction steps omitted
wget https://github.com/coreos/kube-prometheus/archive/v0.3.0.tar.gz
Kube Prometheus project composition
Kube Prometheus is roughly divided into the following parts
- grafana
- kube-state-metrics
- alertmanager
- node-exporter
- prometheus-adapter
- prometheus
- serviceMonitor
It bundles the kube-state-metrics and prometheus-adapter projects; prometheus-adapter will be covered separately later.
Submission of resources
Following the documentation, first apply the resources under setup/.
Submit files in setup
[root@docker-182 manifests]# k apply -f setup/
namespace/monitoring created
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com configured
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com configured
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created

[root@bj-k8s-master-56 ~]# k -n monitoring get all
NAME                                      READY   STATUS    RESTARTS   AGE
pod/prometheus-operator-6685db5c6-fsfsp   1/1     Running   0          80s

NAME                          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/prometheus-operator   ClusterIP   None         <none>        8080/TCP   81s

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-operator   1/1     1            1           81s

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-operator-6685db5c6   1         1         1       81s
Submit files in manifest
[root@docker-182 manifests]# k apply -f .
alertmanager.monitoring.coreos.com/main created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager created
secret/grafana-datasources created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-pods created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-statefulset created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
role.rbac.authorization.k8s.io/kube-state-metrics created
rolebinding.rbac.authorization.k8s.io/kube-state-metrics created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io configured
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader configured
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-operator created
prometheus.monitoring.coreos.com/k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-rules created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created
crd resources created by Kube Prometheus
[root@bj-k8s-master-56 ~]# k get crd -o wide
NAME                                    CREATED AT
alertmanagers.monitoring.coreos.com     2019-11-26T03:48:24Z
podmonitors.monitoring.coreos.com       2020-03-04T07:11:14Z
prometheuses.monitoring.coreos.com      2019-11-26T03:48:24Z
prometheusrules.monitoring.coreos.com   2019-11-26T03:48:24Z
servicemonitors.monitoring.coreos.com   2019-11-26T03:48:24Z
The prometheus resource defines how the prometheus service should run
[root@bj-k8s-master-56 ~]# k -n monitoring get prometheus
NAME   AGE
k8s    36m
Similarly, the alertmanager resource defines how the alertmanager service runs
[root@bj-k8s-master-56 ~]# kubectl -n monitoring get alertmanager
NAME   AGE
main   37m
Both prometheus and alertmanager are managed as StatefulSet controllers
[root@bj-k8s-master-56 ~]# k -n monitoring get statefulset -o wide
NAME                READY   AGE   CONTAINERS                                                        IMAGES
alertmanager-main   3/3     34m   alertmanager,config-reloader                                      quay.io/prometheus/alertmanager:v0.18.0,quay.io/coreos/configmap-reload:v0.0.1
prometheus-k8s      1/2     33m   prometheus,prometheus-config-reloader,rules-configmap-reloader   quay.io/prometheus/prometheus:v2.11.0,quay.io/coreos/prometheus-config-reloader:v0.34.0,quay.io/coreos/configmap-reload:v0.0.1
Modify grafana
The default grafana has no ConfigMap for its configuration file and uses SQLite, with /var/lib/grafana mounted as an emptyDir.
- Create a local-type pvc for grafana and mount it at /var/lib/grafana
- Create a cm for the configuration file and mount it
- Create an ingress resource
Create directory on 32.94, create pv, pvc
[root@bj-k8s-node-84 ~]# mkdir /data/apps/data/pv/monitoring-grafana
[root@bj-k8s-node-84 ~]# chown 65534:65534 /data/apps/data/pv/monitoring-grafana

[root@docker-182 grafana]# k apply -f grafana-local-pv.yml,grafana-local-pvc.yml
persistentvolume/grafana-pv created
persistentvolumeclaim/grafana-pvc created
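The contents of grafana-local-pv.yml and grafana-local-pvc.yml are not shown above. Below is a minimal sketch of what such a pair could look like, assuming a 16Gi local volume pinned to the 32.94 node (the capacity and storageClass name local-storage are taken from the pvc listing later; the hostname is an assumption based on where the directory was created):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana-pv
spec:
  capacity:
    storage: 16Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /data/apps/data/pv/monitoring-grafana     # directory created above
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - bj-k8s-node-84.tmtgeo.com               # assumed hostname of the 32.94 node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 16Gi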
Create database
MariaDB [(none)]> create database k8s_55_grafana default character set utf8;
Query OK, 1 row affected (0.01 sec)

MariaDB [(none)]> grant all on k8s_55_grafana.* to grafana@'%';
Query OK, 0 rows affected (0.05 sec)

MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.03 sec)
Create svc of grafana MySQL
[root@docker-182 grafana]# k apply -f grafana-mysql_endpoint.yaml
service/grafana-mysql created
endpoints/grafana-mysql created
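grafana-mysql_endpoint.yaml itself is not listed. Since the apply output shows both a Service and an Endpoints object, it is presumably a selector-less Service plus manually maintained Endpoints pointing at the external MariaDB; a sketch (the database IP below is a placeholder):

apiVersion: v1
kind: Service
metadata:
  name: grafana-mysql
  namespace: monitoring
spec:
  ports:
  - port: 3306
    targetPort: 3306
---
apiVersion: v1
kind: Endpoints
metadata:
  name: grafana-mysql
  namespace: monitoring
subsets:
- addresses:
  - ip: 10.111.0.10          # placeholder: address of the external MariaDB server
  ports:
  - port: 3306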
Create cm for grafana.ini
[root@docker-182 grafana]# k55 apply -f grafana-config_cm.yaml
configmap/grafana-config created
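grafana-config_cm.yaml essentially wraps a grafana.ini that switches the database from SQLite to the MySQL service created above. A hedged sketch — the [database] section uses the database and user created earlier, while the password is a placeholder:

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-config
  namespace: monitoring
data:
  grafana.ini: |
    [database]
    type = mysql
    host = grafana-mysql:3306       # the selector-less Service created above
    name = k8s_55_grafana
    user = grafana
    password = CHANGE_ME            # placeholder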
Modify the configuration in grafana deployment, add the operation of mounting configmap and pvc
[root@docker-182 grafana]# cp /data/apps/soft/ansible/kubernetes/kube-prometheus-0.3.0/manifests/grafana-deployment.yaml ./
[root@docker-182 grafana]# diff /data/apps/soft/ansible/kubernetes/kube-prometheus-0.3.0/manifests/grafana-deployment.yaml ./grafana-deployment.yaml
35a36,38
>         - mountPath: /etc/grafana/grafana.ini
>           name: grafana-ini
>           subPath: grafana.ini
124c127,129
<       - emptyDir: {}
---
>       #- emptyDir: {}
>       - persistentVolumeClaim:
>           claimName: grafana-pvc
203a209,211
>       - configMap:
>           name: grafana-config
>         name: grafana-ini
Resubmit grafana deploy
[root@docker-182 grafana]# k apply -f grafana-deployment.yaml
deployment.apps/grafana configured
Creating ingress for grafana
[root@docker-182 ingress-nginx]# cat ../kube-prometheus/grafana/grafana_ingress.yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    #nginx.ingress.kubernetes.io/app-root: /
spec:
  rules:
  - http:
      paths:
      - path: /mygrafana(/|$)(.*)
        backend:
          serviceName: grafana
          servicePort: 3000

[root@docker-182 grafana]# k apply -f grafana_ingress.yaml
ingress.networking.k8s.io/grafana-ingress created
Create a service for the Kube scheduler
By default only the ServiceMonitor for the kube-scheduler exists; the Service it targets (kube-system/kube-scheduler) is not actually defined, so one has to be added manually.
[root@bj-k8s-master-56 ~]# k -n monitoring get servicemonitor kube-scheduler
NAME             AGE
kube-scheduler   19h
[root@docker-182 kube-prometheus]# k apply -f prometheus-kubeSchedulerService.yaml
service/kube-scheduler created
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
endpoints/kube-scheduler configured
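prometheus-kubeSchedulerService.yaml is not shown. Because the kube-scheduler ServiceMonitor in kube-prometheus selects the label k8s-app: kube-scheduler on a port named http-metrics, the file is presumably a headless, selector-less Service plus manually maintained Endpoints for the masters; a sketch under those assumptions (the master IP is a placeholder, and the kube-controller-manager Service in the next step is built the same way, just with port 10252):

apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
spec:
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
subsets:
- addresses:
  - ip: 10.111.32.56           # placeholder: master node running kube-scheduler
  ports:
  - name: http-metrics
    port: 10251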
Create the service of Kube Controller Manager
[root@docker-182 kube-prometheus]# k apply -f prometheus-kubeControllerManagerService.yaml
service/kube-controller-manager created
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
endpoints/kube-controller-manager configured
There is also proxy monitoring in grafana
This should be monitoring for kube-proxy, but no resource file defining kube-proxy monitoring can be found, so skip it for now.
Change the storage of prometheus to local pvc
I wanted to find out how to change the way the Prometheus Operator works at the level of the base resources (that is, modify the generated StatefulSet before the resources are submitted, rather than after they are already in the cluster).
It turns out the Kube Prometheus project uses the jsonnet language: according to its documentation, customizations are written in jsonnet and the relevant yaml files are then generated from them.
Of course, the generated yaml files under its manifests directory can also be edited directly, which is the approach used here.
Try to modify the resource file of prometheus to change the running parameters for prom
# newly added under spec:
  containers:
  - name: prometheus
    args:
    - --web.console.templates=/etc/prometheus/consoles
    - --web.console.libraries=/etc/prometheus/console_libraries
    - --config.file=/etc/prometheus/config_out/prometheus.env.yaml
    - --storage.tsdb.path=/prometheus
    - --storage.tsdb.retention.time=360h    # it used to be 24h
    - --web.enable-lifecycle
    - --storage.tsdb.no-lockfile
    - --web.route-prefix=/

[root@docker-182 manifests]# k apply -f prometheus-prometheus.yaml
prometheus.monitoring.coreos.com/k8s configured
This works.
Next, the storage volume of prometheus needs to be changed to a pvc so that the data is not lost when the pod is rebuilt.
Change the storage of prom to local pvc
- Create pv
- Create pvc, prometheus-k8s-db-prometheus-k8s-0 and prometheus-k8s-db-prometheus-k8s-1
- Then reference them in the statefulset (the following is an example of how a StatefulSet references them; in the prometheus resource it is added in a slightly different way)
  volumeClaimTemplates:
  - metadata:
      name: db
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
The pvc name is volume_name + '-' + pod_name. In the existing StatefulSet the data volume is named prometheus-k8s-db,
which is how the names above come about, and the pvcs must use exactly these two names.
A StatefulSet can normally generate pvcs dynamically through a storageClass, but no such storage resource exists here, so the pvcs are created manually and then referenced.
Create directory and pvc
# 1000 and 1001 are the uid and gid of the user running prom on the host
[root@docker-182 kube-prometheus]# ansible 10.111.32.94 -m file -a "path=/data/apps/data/pv/prometheus-k8s-db-prometheus-k8s-0 state=directory owner=1000 group=1001"
[root@docker-182 kube-prometheus]# ansible 10.111.32.178 -m file -a "path=/data/apps/data/pv/prometheus-k8s-db-prometheus-k8s-1 state=directory owner=1000 group=1001"

# Create pv and pvc
[root@docker-182 kube-prometheus]# k55 apply -f prometheus-k8s-db-prometheus-k8s-0_pv.yml
persistentvolume/prometheus-k8s-db-prometheus-k8s-0 created
[root@docker-182 kube-prometheus]# k55 apply -f prometheus-k8s-db-prometheus-k8s-0_pvc.yml
persistentvolumeclaim/prometheus-k8s-db-prometheus-k8s-0 created
[root@docker-182 kube-prometheus]# k55 apply -f prometheus-k8s-db-prometheus-k8s-1_pv.yml
persistentvolume/prometheus-k8s-db-prometheus-k8s-1 created
[root@docker-182 kube-prometheus]# k55 apply -f prometheus-k8s-db-prometheus-k8s-1_pvc.yml
persistentvolumeclaim/prometheus-k8s-db-prometheus-k8s-1 created
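The *_pv.yml / *_pvc.yml files themselves are not listed. A sketch of the pair for prometheus-k8s-0, assuming a 200Gi local volume pinned to the node that owns the directory (name, capacity and storageClass are taken from the pvc listing below; the hostname is an assumption):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-k8s-db-prometheus-k8s-0
spec:
  capacity:
    storage: 200Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /data/apps/data/pv/prometheus-k8s-db-prometheus-k8s-0
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - bj-k8s-node-84.tmtgeo.com      # assumed hostname of the 32.94 node
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-k8s-db-prometheus-k8s-0
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: local-storage
  volumeName: prometheus-k8s-db-prometheus-k8s-0
  resources:
    requests:
      storage: 200Gi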
Submit update
# added under spec in prometheus-prometheus.yaml
  storage:
    volumeClaimTemplate:
      metadata:
        name: prometheus-k8s-db
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 200Gi

[root@docker-182 kube-prometheus]# k apply -f prometheus-prometheus.yaml
Check the pvc status and the volume information in the pod; the change works.
[root@bj-k8s-master-56 ~]# k -n monitoring get pvc
NAME                                 STATUS   VOLUME                               CAPACITY   ACCESS MODES   STORAGECLASS    AGE
grafana-pvc                          Bound    grafana-pv                           16Gi       RWO            local-storage   5d1h
prometheus-k8s-db-prometheus-k8s-0   Bound    prometheus-k8s-db-prometheus-k8s-0   200Gi      RWO            local-storage   3m43s
prometheus-k8s-db-prometheus-k8s-1   Bound    prometheus-k8s-db-prometheus-k8s-1   200Gi      RWO            local-storage   3m33s

# The original volume definition was
      - emptyDir: {}
        name: prometheus-k8s-db

# The original emptyDir is now overridden by
      volumes:
      - name: prometheus-k8s-db
        persistentVolumeClaim:
          claimName: prometheus-k8s-db-prometheus-k8s-0
Change the storage of alertmanager to local pvc
The pvc names should be alertmanager-main-db-alertmanager-main-0,
alertmanager-main-db-alertmanager-main-1 and alertmanager-main-db-alertmanager-main-2.
Create pvc
[root@docker-182 kube-prometheus]# ansible 10.111.32.94 -m file -a "path=/data/apps/data/pv/alertmanager-main-db-alertmanager-main-0 state=directory owner=1000 group=1001"
[root@docker-182 kube-prometheus]# ansible 10.111.32.94 -m file -a "path=/data/apps/data/pv/alertmanager-main-db-alertmanager-main-1 state=directory owner=1000 group=1001"
[root@docker-182 kube-prometheus]# ansible 10.111.32.178 -m file -a "path=/data/apps/data/pv/alertmanager-main-db-alertmanager-main-2 state=directory owner=1000 group=1001"

# Submit pv and pvc resources
[root@docker-182 alertmanager]# ls -1r |while read line; do k apply -f ${line};done
persistentvolume/alertmanager-main-db-alertmanager-main-2 created
persistentvolumeclaim/alertmanager-main-db-alertmanager-main-2 created
persistentvolume/alertmanager-main-db-alertmanager-main-1 created
persistentvolumeclaim/alertmanager-main-db-alertmanager-main-1 created
persistentvolume/alertmanager-main-db-alertmanager-main-0 created
persistentvolumeclaim/alertmanager-main-db-alertmanager-main-0 created
Modify the alertmanager resource file and submit changes
# Add the storage parameter under spec
  storage:
    volumeClaimTemplate:
      metadata:
        name: alertmanager-main-db
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi

# Submit the changes
[root@docker-182 alertmanager]# k apply -f alertmanager-alertmanager.yaml
alertmanager.monitoring.coreos.com/main configured
Verify that everything came up correctly.
[root@bj-k8s-master-56 ~]# k -n monitoring get pvc |grep alertmanager
alertmanager-main-db-alertmanager-main-0   Bound   alertmanager-main-db-alertmanager-main-1   10Gi   RWO   local-storage   4h1m
alertmanager-main-db-alertmanager-main-1   Bound   alertmanager-main-db-alertmanager-main-2   10Gi   RWO   local-storage   4h1m
alertmanager-main-db-alertmanager-main-2   Bound   alertmanager-main-db-alertmanager-main-0   10Gi   RWO   local-storage   4h1m

[root@bj-k8s-master-56 ~]# k -n monitoring get statefulset alertmanager-main -o yaml
...
  volumeClaimTemplates:
  - metadata:
      creationTimestamp: null
      name: alertmanager-main-db
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      volumeMode: Filesystem
    status:
      phase: Pending
...

[root@bj-k8s-master-56 ~]# k -n monitoring get pod -o wide |grep alertmanager
alertmanager-main-0   2/2   Running   0   3m56s   10.20.60.180    bj-k8s-node-84.tmtgeo.com    <none>   <none>
alertmanager-main-1   2/2   Running   0   3m56s   10.20.245.249   bj-k8s-node-178.tmtgeo.com   <none>   <none>
alertmanager-main-2   2/2   Running   0   3m56s   10.20.60.179    bj-k8s-node-84.tmtgeo.com    <none>   <none>
Create the serviceMonitor of ingress nginx
[root@docker-182 ingress-nginx]# cat ingress-serviceMonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: ingress-nginx
  name: ingress-nginx
  namespace: monitoring
spec:
  endpoints:
  - interval: 15s
    port: "10254"
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  jobLabel: app.kubernetes.io/name
  namespaceSelector:
    matchNames:
    - ingress-nginx
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
Error: prom cannot access resources due to insufficient default RBAC permissions
After submitting, prom reported an error
level=error ts=2020-03-10T10:33:21.196Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:263: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"ingress-nginx\""
level=error ts=2020-03-10T10:33:22.197Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:264: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"ingress-nginx\""
level=error ts=2020-03-10T10:33:22.198Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:265: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"ingress-nginx\""
At first glance this is clearly a permissions problem, but then why can resources in the default kube-system namespace still be obtained?
Create a clusterRole and bind it to the prometheus-k8s sa (the original clusterRole could of course also be changed directly).
[root@docker-182 kube-prometheus]# cat my-prometheus-clusterRoleBinding.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: my-prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups: [""]
  resources:
  - configmaps
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: my-prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: my-prometheus
subjects:
- kind: ServiceAccount
  name: prometheus-k8s
  namespace: monitoring

[root@docker-182 kube-prometheus]# k apply -f my-prometheus-clusterRoleBinding.yml
clusterrole.rbac.authorization.k8s.io/my-prometheus created
clusterrolebinding.rbac.authorization.k8s.io/my-prometheus created
After the clusterRoleBinding is configured, the endpoints of ingress-nginx can be discovered, but the default endpoints only contain 80 and 443; the metrics port 10254 is not exposed in the daemonset's service configuration, so it cannot be scraped yet.
[root@bj-k8s-master-56 ~]# k -n ingress-nginx get endpoints -o wide
NAME            ENDPOINTS                                                         AGE
ingress-nginx   10.111.32.178:80,10.111.32.94:80,10.111.32.178:443 + 1 more...   4d17h
spec.ports added in ingress-nginx-svc
  - name: metrics
    port: 10254
    targetPort: 10254
Submit changes
[root@docker-182 ingress-nginx]# k apply -f ingress-nginx-svc.yaml
service/ingress-nginx configured

# 10254 now exists in the endpoints
[root@bj-k8s-master-56 ~]# k -n ingress-nginx get endpoints
NAME            ENDPOINTS                                                           AGE
ingress-nginx   10.111.32.178:80,10.111.32.94:80,10.111.32.178:10254 + 3 more...   4d17h
Modify the endpoints in the ingress-nginx serviceMonitor accordingly
  endpoints:
  - interval: 15s
    port: metrics
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
In this way, the ingress-nginx target is added.
Add dashboard of ingress nginx in grafana
Load the dashboard from https://github.com/kubernetes/ingress-nginx/tree/master/deploy/grafana/dashboards
Many labels in the Request Handling Performance dashboard turn out to be outdated and unusable.
Quite a few panels in the nginx ingress controller dashboard are also not usable.
metrics api in Kube prom
Kube state metrics and Prometheus adapter are included in Kube Prometheus.
The apiservice of Prometheus adapter is v1beta1.metrics.k8s.io.
[root@bj-k8s-master-56 ~]# k get apiservice |grep prome
v1beta1.metrics.k8s.io   monitoring/prometheus-adapter   True   54d
This is not right: the v1beta1.metrics.k8s.io apiservice belongs to the kubernetes metrics-server. With Kube Prom taking it over, applications that depend on this api, such as hpa resources, will run into problems.
[root@bj-k8s-master-56 ~]# kubectl -n kube-system get pod -o wide |grep metrics
metrics-server-7ff49d67b8-mczv8   1/1   Running   2   51d   10.20.245.239   bj-k8s-node-178.tmtgeo.com   <none>   <none>
Test hpav2
hpa v2
[root@docker-182 hpa]# k apply -f .
horizontalpodautoscaler.autoscaling/metrics-app-hpa created
deployment.apps/metrics-app created
service/metrics-app created
servicemonitor.monitoring.coreos.com/metrics-app created
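The metrics-app-hpa manifest itself is not listed. Judging from the http_requests metric and the 800m target that show up in the error and the final hpa listing, it is roughly an autoscaling/v2beta1 HPA built on a Pods metric; a sketch under those assumptions (replica bounds taken from the later output, everything else assumed):

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: metrics-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: metrics-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: http_requests       # served through the custom.metrics.k8s.io API
      targetAverageValue: 800m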
Report errors
Type     Reason                        Age                  From                       Message
----     ------                        ----                 ----                       -------
Warning  FailedComputeMetricsReplicas  18m (x12 over 21m)   horizontal-pod-autoscaler  Invalid metrics (1 invalid out of 1), last error was: failed to get object metric value: unable to get metric http_requests: unable to fetch metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered
Warning  FailedGetPodsMetric           73s (x80 over 21m)   horizontal-pod-autoscaler  unable to get metric http_requests: unable to fetch metrics from custom metrics API: no custom metrics API (custom.metrics.k8s.io) registered
Submit the resources under experimental/custom-metrics-api
First repair the apiservice of metrics-server
[root@docker-182 metrics-server]# k apply -f metrics-apiservice.yaml
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io configured
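metrics-apiservice.yaml here points v1beta1.metrics.k8s.io back at the metrics-server in kube-system (as the apiservices listing below shows). A sketch of what it presumably contains, based on the standard metrics-server APIService definition:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  version: v1beta1
  service:
    name: metrics-server
    namespace: kube-system
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100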
[root@docker-182 custom-metrics-api]# ls *.yaml |while read line; do k apply -f ${line};done
clusterrolebinding.rbac.authorization.k8s.io/custom-metrics-server-resources created
apiservice.apiregistration.k8s.io/v1beta1.custom.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/custom-metrics-server-resources created
configmap/adapter-config configured
clusterrolebinding.rbac.authorization.k8s.io/hpa-controller-custom-metrics created
servicemonitor.monitoring.coreos.com/sample-app created
service/sample-app created
deployment.apps/sample-app created
horizontalpodautoscaler.autoscaling/sample-app created

[root@docker-182 custom-metrics-api]# pwd
/data/apps/soft/ansible/kubernetes/kube-prometheus-0.3.0/experimental/custom-metrics-api
v1beta1.metrics.k8s.io returns to normal, but v1beta1.custom.metrics.k8s.io reports an error
[root@bj-k8s-master-56 ~]# k get apiservices |grep metric
v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   False (FailedDiscoveryCheck)   3m5s
v1beta1.metrics.k8s.io          kube-system/metrics-server      True                           54d
The error message is
Status:
  Conditions:
    Last Transition Time:  2020-03-11T07:45:25Z
    Message:               failing or missing response from https://10.20.60.171:6443/apis/custom.metrics.k8s.io/v1beta1: bad status from https://10.20.60.171:6443/apis/custom.metrics.k8s.io/v1beta1: 404
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>

[root@bj-k8s-master-56 ~]# k -n monitoring get pod -o wide |grep adap
prometheus-adapter-68698bc948-qmpvr   1/1   Running   0   7d   10.20.60.171   bj-k8s-node-84.tmtgeo.com   <none>   <none>
The relevant api really is not being served:
[root@bj-k8s-master-56 ~]# curl -i -k https://10.20.60.171:6443/apis/custom.metrics.k8s.io
HTTP/1.1 404 Not Found
Content-Type: application/json
Date: Wed, 11 Mar 2020 07:56:10 GMT
Content-Length: 229

{
  "paths": [
    "/apis",
    "/apis/metrics.k8s.io",
    "/apis/metrics.k8s.io/v1beta1",
    "/healthz",
    "/healthz/ping",
    "/healthz/poststarthook/generic-apiserver-start-informers",
    "/metrics",
    "/version"
  ]
}
Try a new image on 36.55
No image exists for the latest tag
[root@bj-k8s-node-84 ~]# docker pull quay.io/coreos/k8s-prometheus-adapter-amd64:latest
Error response from daemon: manifest for quay.io/coreos/k8s-prometheus-adapter-amd64:latest not found
- https://quay.io/repository/coreos/k8s-prometheus-adapter-amd64 :
A newer tag, v0.6.0, was found on its page (the image in the deployment could simply be changed and kubelet left to pull it itself, but the network is poor, so the image is pulled manually on all nodes first).
[root@bj-k8s-node-84 ~]# docker pull quay.io/coreos/k8s-prometheus-adapter-amd64:v0.6.0
Change image, resubmit
[root@docker-182 adapter]# grep image: prometheus-adapter-deployment.yaml
        image: quay.io/coreos/k8s-prometheus-adapter-amd64:v0.6.0
[root@docker-182 adapter]# k apply -f prometheus-adapter-deployment.yaml
deployment.apps/prometheus-adapter configured
It's back to normal
[root@bj-k8s-master-56 ~]# k get apiservices |grep custom
v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   True   117m

[root@bj-k8s-master-56 ~]# k -n monitoring get pod -o wide | grep adapter
prometheus-adapter-7b785b6685-z6gfp   1/1   Running   0   91s   10.20.60.183   bj-k8s-node-84.tmtgeo.com   <none>   <none>

[root@bj-k8s-master-56 ~]# curl -i -k https://10.20.60.183:6443/apis/custom.metrics.k8s.io
HTTP/1.1 200 OK
Content-Type: application/json
Date: Wed, 11 Mar 2020 09:41:45 GMT
Content-Length: 303

{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "custom.metrics.k8s.io",
  "versions": [
    {
      "groupVersion": "custom.metrics.k8s.io/v1beta1",
      "version": "v1beta1"
    }
  ],
  "preferredVersion": {
    "groupVersion": "custom.metrics.k8s.io/v1beta1",
    "version": "v1beta1"
  }
}
hpav2 is back to normal
[root@bj-k8s-master-56 ~]# k get hpa
NAME              REFERENCE                TARGETS       MINPODS   MAXPODS   REPLICAS   AGE
metrics-app-hpa   Deployment/metrics-app   36133m/800m   2         10        4          159m
myapp             Deployment/myapp         23%/60%       2         5         2          178m
sample-app        Deployment/sample-app    400m/500m     1         10        1          126m
Thinking back: when Kube Prometheus was first deployed, the Prometheus adapter took over v1beta1.metrics.k8s.io and yet the top command still worked normally. That is because the quay.io/coreos/k8s-prometheus-adapter-amd64:v0.5.0 image only serves /apis/metrics.k8s.io/v1beta1, which is exactly the path shown in the curl output above.
Reference resources
- https://github.com/coreos/prometheus-operator : prometheus-operator github page
- https://github.com/coreos/kube-prometheus : kube-prometheus github page
- https://github.com/coreos/prometheus-operator/blob/master/documentation/api.md : prometheus-operator API documentation
- https://www.cnblogs.com/skyflag/p/11480988.html : kubernetes monitoring ultimate solution - kube-promethues (a deployment example)