Deploying a Kubernetes 1.16.0 high-availability cluster requires only two steps

Posted by greenie__ on Fri, 27 Sep 2019 09:22:08 +0200

Tutorial

wget https://github.com/fanux/sealos/releases/download/v2.0.7/sealos && chmod +x sealos && mv sealos /usr/bin 

sealos init --passwd YOUR_SERVER_PASSWD \
    --master 192.168.0.2  --master 192.168.0.3  --master 192.168.0.4  \
    --node 192.168.0.5 \
    --pkg-url https://sealyun.oss-cn-beijing.aliyuncs.com/cf6bece970f6dab3d8dc8bc5b588cc18-1.16.0/kube1.16.0.tar.gz \
    --version v1.16.0

And then? There is no "and then" — that's all there is to it.

Overview and Design Principles

sealos aims to be a simple, clean, lightweight and stable Kubernetes installation tool with solid support for high-availability installation. Making a tool powerful is not that hard; keeping it simple and flexible is much harder, so these principles have to be followed throughout the implementation.

sealos features and advantages:

  • Offline installation: the tool is separated from the resource bundle (binaries, configuration files, images, yaml files, etc.), so supporting a different version just means swapping in a different offline bundle
  • Extended certificate validity
  • Easy to use
  • Supports custom configuration
  • Kernel-level load balancing, which is extremely stable, and because the design is simple, troubleshooting is simple too

Why not use ansible

Version 1.0 was indeed implemented with ansible, but users still had to install ansible first, and installing ansible requires python and a pile of dependencies. To spare users that trouble I put ansible into a container, but then users who did not want to set up passwordless ssh had to use a username and password with sshpass, and so on. In short, it never satisfied me; it was not the simplicity I wanted.

So I wanted a single binary tool with no dependencies at all: file distribution and remote command execution are done by calling an sdk, so nothing else is required. That finally satisfied this cleanliness freak.

Why not keepalived and haproxy

Running haproxy as a static pod is manageable enough, but most open-source ansible scripts install keepalived with yum or apt, which is very hard to control and has the following drawbacks:

  • Inconsistent package sources can lead to inconsistent versions, and a different version may not behave the same with the same configuration file. I once could not figure out why a health-check script had no effect, and only later learned it was a version problem.
  • Some environments cannot install it at all, because a library it depends on cannot be installed on the system.
  • Many installation scripts found online get the health check and weight adjustment wrong: they only check whether the haproxy process exists, when they should check whether the apiserver's healthz endpoint is healthy. If an apiserver dies while its haproxy stays up, the cluster still misbehaves; that is pseudo high availability (see the sketch after this list).
  • Management is inconvenient. A static pod can be monitored by prometheus along with the rest of the cluster, but something run by systemd has to be monitored separately and restarted separately; it is not as clean and concise as letting kubelet manage everything.
  • We have also seen keepalived max out a CPU.
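To make the third point concrete, a health check should probe the apiserver itself rather than the haproxy process. A minimal sketch of such a keepalived check script might look like the following (the script name, address and port are assumptions for illustration, not sealos code):

#!/bin/bash
# check_apiserver.sh - probe the apiserver's own healthz endpoint instead of
# checking whether a haproxy process exists.
APISERVER="https://127.0.0.1:6443/healthz"

# -k because the endpoint is served over the apiserver's own TLS certificate;
# a healthy apiserver answers with the literal string "ok".
if [ "$(curl -k -s --max-time 3 "$APISERVER")" = "ok" ]; then
    exit 0   # healthy: keepalived keeps the VIP here
else
    exit 1   # unhealthy: keepalived lowers priority so the VIP can move
fi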

To address these problems I moved keepalived into a container as well (the images provided by the community are basically unusable). The conversion also hit plenty of problems along the way, though they were solved in the end.

All in all, I was tired of it, so I asked myself whether haproxy and keepalived could be dropped entirely for a simpler, more reliable solution. And it turns out there is one...

Why not use envoy or nginx for the local load balancer

We solve high availability through local load balancing instead.

By local load balancing I mean running a load balancer on every node, with the three masters as its upstream. The candidates included ipvs, envoy, nginx and others; in the end we went with kernel ipvs.

I did not want envoy or the like because that means running an extra userspace process on every node that consumes more resources. ipvs does add one extra process too, lvscare, but lvscare is only responsible for managing the ipvs rules, much like kube-proxy. The real traffic still flows through the very stable kernel and never has to be copied up into userspace.

There is also an implementation problem that makes envoy and friends awkward: if the load balancer is not up at join time, join blocks and kubelet never starts, so envoy would have to be started first, which means it cannot be managed as a static pod. That is the same chicken-and-egg problem as running keepalived on the host: the static pod and the thing it depends on each wait for the other, a logical deadlock, and in the end nothing comes up.

ipvs is different. I can set up the ipvs rules before joining, then join, and then keep guarding the rules. Once an apiserver becomes unreachable, its ipvs rule is automatically removed on every node and added back when the master recovers.
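To make this concrete, the effect on a node is roughly what these ipvsadm commands would do by hand (a sketch only, using the virtual ip and master addresses from the worked example later in this article; lvscare does the equivalent through the kernel's ipvs interface):

# virtual server = the virtual ip that nodes talk to
ipvsadm -A -t 10.103.97.1:6443 -s rr
# real servers = the three apiservers
ipvsadm -a -t 10.103.97.1:6443 -r 10.103.97.100:6443 -m
ipvsadm -a -t 10.103.97.1:6443 -r 10.103.97.101:6443 -m
ipvsadm -a -t 10.103.97.1:6443 -r 10.103.97.102:6443 -m

# the guard loop in spirit: drop a master whose healthz stops answering,
# add it back once it recovers
curl -k -s --max-time 3 https://10.103.97.100:6443/healthz >/dev/null || \
    ipvsadm -d -t 10.103.97.1:6443 -r 10.103.97.100:6443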

Why customize kubeadm

First of all, kubeadm hard-codes the certificate validity period, so it has to be customized to make the certificates valid for 99 years. Most people can re-sign certificates themselves, but we would rather not depend on yet another tool, so we change it directly in the source code.
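After installation you can check the resulting validity period yourself on a master, using the standard kubeadm certificate path:

# print the notBefore/notAfter dates of the apiserver certificate
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -dates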

Secondly, modifying kubeadm is the most convenient way to implement the local load balancer, because two things need to happen during join: the ipvs rules have to be created before joining, and the lvscare static pod has to be created as well. Without customizing kubeadm, pre-populating the static pod directory makes join report that the directory already has content, and ignoring that error is not elegant. kubeadm also already provides some useful SDKs for implementing this.

With that, the core functionality is built into kubeadm, and sealos becomes a thin tool that distributes files and executes the higher-level commands. When adding nodes, you can even use kubeadm directly.

Usage tutorial

Prerequisites

Installation tutorial

Multi master HA:

sealos init --master 192.168.0.2 \
    --master 192.168.0.3 \
    --master 192.168.0.4 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --version v1.14.1 \
    --pkg-url /root/kube1.14.1.tar.gz     

Or a single master with multiple nodes:

sealos init --master 192.168.0.2 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --version v1.14.1 \
    --pkg-url /root/kube1.14.1.tar.gz 

Using passwordless ssh or a key pair instead of a password:

sealos init --master 172.16.198.83 \
    --node 172.16.198.84 \
    --pkg-url https://YOUR_HTTP_SERVER/kube1.15.0.tar.gz \
    --pk /root/kubernetes.pem \
    --version v1.15.0
--master   list of master server addresses
--node     list of node server addresses
--user     ssh user name for the servers
--passwd   ssh password for the servers
--pkg-url  location of the offline package; either a local path or an http URL, in which case sealos will wget it onto the target machines
--version  kubernetes version
--pk       ssh private key path for passwordless login, defaults to /root/.ssh/id_rsa

Other flags:

--kubeadm-config string   custom kubeadm configuration file (kubeadm-config.yaml)
--vip string              virtual ip (default "10.103.97.2") used by the local load balancer; changing it is not recommended, and it is not reachable from outside the cluster
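If you would rather not put a password on the command line, --pk works with standard passwordless ssh, so distributing a key beforehand is enough (a generic ssh sketch, not sealos-specific; the host list matches the example above):

# generate a key pair on the machine running sealos (skip if one already exists)
ssh-keygen -t rsa -f /root/.ssh/id_rsa -N ""
# copy the public key to every master and node
for host in 192.168.0.2 192.168.0.3 192.168.0.4 192.168.0.5; do
    ssh-copy-id root@$host
done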

Check whether the installation is normal:

[root@iZj6cdqfqw4o4o9tc0q44rZ ~]# kubectl get node
NAME                      STATUS   ROLES    AGE     VERSION
izj6cdqfqw4o4o9tc0q44rz   Ready    master   2m25s   v1.14.1
izj6cdqfqw4o4o9tc0q44sz   Ready    master   119s    v1.14.1
izj6cdqfqw4o4o9tc0q44tz   Ready    master   63s     v1.14.1
izj6cdqfqw4o4o9tc0q44uz   Ready    <none>   38s     v1.14.1
[root@iZj6cdqfqw4o4o9tc0q44rZ ~]# kubectl get pod --all-namespaces
NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5cbcccc885-9n2p8          1/1     Running   0          3m1s
kube-system   calico-node-656zn                                 1/1     Running   0          93s
kube-system   calico-node-bv5hn                                 1/1     Running   0          2m54s
kube-system   calico-node-f2vmd                                 1/1     Running   0          3m1s
kube-system   calico-node-tbd5l                                 1/1     Running   0          118s
kube-system   coredns-fb8b8dccf-8bnkv                           1/1     Running   0          3m1s
kube-system   coredns-fb8b8dccf-spq7r                           1/1     Running   0          3m1s
kube-system   etcd-izj6cdqfqw4o4o9tc0q44rz                      1/1     Running   0          2m25s
kube-system   etcd-izj6cdqfqw4o4o9tc0q44sz                      1/1     Running   0          2m53s
kube-system   etcd-izj6cdqfqw4o4o9tc0q44tz                      1/1     Running   0          118s
kube-system   kube-apiserver-izj6cdqfqw4o4o9tc0q44rz            1/1     Running   0          2m15s
kube-system   kube-apiserver-izj6cdqfqw4o4o9tc0q44sz            1/1     Running   0          2m54s
kube-system   kube-apiserver-izj6cdqfqw4o4o9tc0q44tz            1/1     Running   1          47s
kube-system   kube-controller-manager-izj6cdqfqw4o4o9tc0q44rz   1/1     Running   1          2m43s
kube-system   kube-controller-manager-izj6cdqfqw4o4o9tc0q44sz   1/1     Running   0          2m54s
kube-system   kube-controller-manager-izj6cdqfqw4o4o9tc0q44tz   1/1     Running   0          63s
kube-system   kube-proxy-b9b9z                                  1/1     Running   0          2m54s
kube-system   kube-proxy-nf66n                                  1/1     Running   0          3m1s
kube-system   kube-proxy-q2bqp                                  1/1     Running   0          118s
kube-system   kube-proxy-s5g2k                                  1/1     Running   0          93s
kube-system   kube-scheduler-izj6cdqfqw4o4o9tc0q44rz            1/1     Running   1          2m43s
kube-system   kube-scheduler-izj6cdqfqw4o4o9tc0q44sz            1/1     Running   0          2m54s
kube-system   kube-scheduler-izj6cdqfqw4o4o9tc0q44tz            1/1     Running   0          61s
kube-system   kube-sealyun-lvscare-izj6cdqfqw4o4o9tc0q44uz      1/1     Running   0          86s

Cleaning up

sealos clean \
    --master 192.168.0.2 \
    --master 192.168.0.3 \
    --master 192.168.0.4 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password


Add node

Get the join command by running this on a master:

kubeadm token create --print-join-command

You can then use the customized ("super") kubeadm on the new node, adding the --master parameter to join:

cd kube/shell && sh init.sh
echo "10.103.97.2 apiserver.cluster.local" >> /etc/hosts   # using vip
kubeadm join 10.103.97.2:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --master 10.103.97.100:6443 \
    --master 10.103.97.101:6443 \
    --master 10.103.97.102:6443 \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866

You can also use the sealos join command:

sealos join \
    --master 192.168.0.2 \
    --master 192.168.0.3 \
    --master 192.168.0.4 \
    --vip 10.103.97.2 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --pkg-url /root/kube1.15.0.tar.gz

Using a custom kubeadm configuration file

For example, we need to add sealyun.com to our certificate:

Get the configuration file template first:

sealos config -t kubeadm >>  kubeadm-config.yaml.tmpl

Modify kubeadm-config.yaml.tmpl and add sealyun.com to it. Leave the rest of the file untouched; sealos fills in the template variables automatically:

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: {{.Version}}
controlPlaneEndpoint: "apiserver.cluster.local:6443"
networking:
  podSubnet: 100.64.0.0/10
apiServer:
        certSANs:
        - sealyun.com # this is what I added
        - 127.0.0.1
        - apiserver.cluster.local
        {{range .Masters -}}
        - {{.}}
        {{end -}}
        - {{.VIP}}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
        excludeCIDRs: 
        - "{{.VIP}}/32"

Use --kubeadm-config to specify the configuration file template:

sealos init --kubeadm-config kubeadm-config.yaml.tmpl \
    --master 192.168.0.2 \
    --master 192.168.0.3 \
    --master 192.168.0.4 \
    --node 192.168.0.5 \
    --user root \
    --passwd your-server-password \
    --version v1.14.1 \
    --pkg-url /root/kube1.14.1.tar.gz 
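Once the install finishes you can confirm that the extra SAN actually ended up in the apiserver certificate, for example:

# look for sealyun.com in the certificate's Subject Alternative Name list
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A1 "Subject Alternative Name"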

Version upgrade

This tutorial uses an upgrade from 1.14 to 1.15 as an example. Other versions work on the same principle, so once you understand this one, just follow the official documentation for the rest.

Upgrade process

  1. Upgrade kubeadm and import the new images on all nodes
  2. Upgrade the first control (master) node
  3. Upgrade kubelet on that master
  4. Upgrade the other masters (control nodes)
  5. Upgrade the nodes
  6. Verify the cluster status

Upgrade kubeadm

Copy the offline package to every node and run cd kube/shell && sh init.sh there.
This updates the kubeadm, kubectl and kubelet binaries and imports the newer images.

Upgrade control node

kubeadm upgrade plan
kubeadm upgrade apply v1.15.0

Restart kubelet:

systemctl restart kubelet

The kubelet upgrade itself is quick and dirty: just copy the new kubelet to /usr/bin and restart the kubelet service. If the running binary cannot be overwritten because it is in use, stop kubelet first and copy again. The kubelet binary is in the conf/bin directory of the offline package.
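In shell terms that amounts to roughly the following (a sketch; the unpack location /root/kube is an assumption, the conf/bin path is the one mentioned above):

# stop kubelet first if the running binary cannot be overwritten in place
systemctl stop kubelet
cp /root/kube/conf/bin/kubelet /usr/bin/kubelet
systemctl restart kubelet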

Upgrade other control nodes

kubeadm upgrade apply

Upgrade node

Drain the node (whether to drain is up to you; it also works if you prefer to be blunt and skip it):

kubectl drain $NODE --ignore-daemonsets

Update the kubelet configuration:

kubeadm upgrade node config --kubelet-version v1.15.0

Then upgrade kubelet the same way: replace the binary and restart the kubelet service:

systemctl restart kubelet

Bring the node back (uncordon):

kubectl uncordon $NODE

Verification

kubectl get nodes

If the version information is correct, you are basically okay.

What does kubeadm upgrade apply do

  1. Check whether the cluster can be upgraded
  2. Enforce the version skew policy, i.e. which versions can be upgraded between
  3. Confirm that the images are available
  4. Upgrade the control plane components (the apiserver, controller-manager and scheduler containers) and roll back if the upgrade fails
  5. Upgrade kube-dns and kube-proxy
  6. Create new certificate files, backing up the old ones if they are older than 180 days
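To confirm that step 4 actually happened, you can list the control plane images after the upgrade (plain kubectl; tier=control-plane is the label kubeadm normally puts on its static pods):

kubectl -n kube-system get pods -l tier=control-plane \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'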

Source code compilation

Because the netlink library is used, compiling inside a container is recommended:

docker run --rm -v $GOPATH/src/github.com/fanux/sealos:/go/src/github.com/fanux/sealos -w /go/src/github.com/fanux/sealos -it golang:1.12.7  go build

If you use go mod, compile against the vendor directory:

go build -mod vendor

Principle

Execution process

  • Copy the offline installation package to the target machines (masters and nodes) via sftp or wget
  • Run kubeadm init on master0
  • Run kubeadm join on the other masters and set up their control planes; this starts etcd on each of them, clusters it with master0's etcd, and starts the control plane components (apiserver, controller-manager, etc.)
  • Join the nodes, configuring the ipvs rules, /etc/hosts and so on on each node

    One detail: all access to the apiserver goes through a domain name, because each master connects to itself while the nodes need to reach multiple masters through the virtual ip. The apiserver address used by kubelet and kube-proxy therefore differs per node, yet kubeadm only allows a single address in its configuration file, so we use a domain name and let each node resolve it differently.

Another advantage of using a domain name is that when an IP address changes, only the resolution needs updating.
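Concretely, using the addresses from the worked example later in this article, the resolution differs by role:

# on a master, the domain resolves to that master's own address
grep apiserver.cluster.local /etc/hosts    # 10.103.97.100 apiserver.cluster.local

# on a node, it resolves to the virtual ip served by the local ipvs rules
grep apiserver.cluster.local /etc/hosts    # 10.103.97.1 apiserver.cluster.local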

Local kernel load balancing

In this way, each node reaches the masters through local kernel load balancing:

  +----------+                       +---------------+  virtual server: 127.0.0.1:6443
  | master0  |<----------------------| ipvs nodes    |    real servers:
  +----------+                      |+---------------+            10.103.97.200:6443
                                    |                             10.103.97.201:6443
  +----------+                      |                             10.103.97.202:6443
  | master1  |<---------------------+
  +----------+                      |
                                    |
  +----------+                      |
  | master2  |<---------------------+
  +----------+

An lvscare static pod runs on each node to guard the ipvs rules. Once an apiserver becomes unreachable, its ipvs rule is automatically removed on all nodes and added back when the master recovers.

sealos therefore adds three things to each node, which you can inspect directly:

ls /etc/kubernetes/manifests    # the lvscare static pod added here
ipvsadm -Ln                     # the ipvs rules it created
cat /etc/hosts                  # the resolution added for the virtual ip

Customized kubeadm

The changes to kubeadm are minimal: extending the certificate validity and extending the join command. Here I will focus on the changes to the join command.

First, a --master flag is added to join for specifying the list of master addresses:

flagSet.StringSliceVar(
    &locallb.LVScare.Masters, "master", []string{},
    "A list of ha masters, --master 192.168.0.2:6443  --master 192.168.0.2:6443  --master 192.168.0.2:6443",
)

This is how the master address list becomes available for setting up the ipvs rules.

If this is not a control-plane node and a master list was given, create the ipvs rules; control-plane nodes do not need them, since they can simply talk to their own apiserver:

if data.cfg.ControlPlane == nil {
    fmt.Println("This is not a control plan")
    if len(locallb.LVScare.Masters) != 0 {
        locallb.CreateLocalLB(args[0])
    }
}

Then create the lvscare static pod to guard the ipvs rules:

if len(locallb.LVScare.Masters) != 0 {
    locallb.LVScareStaticPodToDisk("/etc/kubernetes/manifests")
}

So even without sealos, you can install the cluster directly with the customized kubeadm; it is just a bit more tedious.

kubeadm configuration file

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.0
controlPlaneEndpoint: "apiserver.cluster.local:6443" # apiserver DNS name
apiServer:
        certSANs:
        - 127.0.0.1
        - apiserver.cluster.local
        - 172.20.241.205
        - 172.20.241.206
        - 172.20.241.207
        - 172.20.241.208
        - 10.103.97.1          # virtual ip
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
        excludeCIDRs: 
        - "10.103.97.1/32" # Note that not adding this kube-proxy will clean up your rules

On master0 (10.103.97.100):

echo "10.103.97.100 apiserver.cluster.local" >> /etc/hosts # The address of master 0 is resolved
kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs  
mkdir ~/.kube && cp /etc/kubernetes/admin.conf ~/.kube/config
kubectl apply -f https://docs.projectcalico.org/v3.6/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

On master1 (10.103.97.101):

echo "10.103.97.100 apiserver.cluster.local" >> /etc/hosts #It resolves the address of master 0 in order to join in properly.
kubeadm join 10.103.97.100:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866 \
    --experimental-control-plane \
    --certificate-key f8902e114ef118304e561c3ecd4d0b543adc226b7a07f675f56564185ffe0c07 

sed "s/10.103.97.100/10.103.97.101/g" -i /etc/hosts  # Parsing and replacing it with your own address, otherwise you will depend on the pseudo-high availability of master 0

On master2 (10.103.97.102), same as master1:

echo "10.103.97.100 apiserver.cluster.local" >> /etc/hosts
kubeadm join 10.103.97.100:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866 \
    --experimental-control-plane \
    --certificate-key f8902e114ef118304e561c3ecd4d0b543adc226b7a07f675f56564185ffe0c07  

sed "s/10.103.97.100/10.103.97.101/g" -i /etc/hosts

On nodes

When joining, add --master to specify the list of master addresses:

echo "10.103.97.1 apiserver.cluster.local" >> /etc/hosts   # Need to parse into virtual ip
kubeadm join 10.103.97.1:6443 --token 9vr73a.a8uxyaju799qwdjv \
    --master 10.103.97.100:6443 \
    --master 10.103.97.101:6443 \
    --master 10.103.97.102:6443 \
    --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866

Offline package structure

.
├── bin  # Only these three binaries are needed for this version; all other components run in containers.
│   ├── kubeadm
│   ├── kubectl
│   └── kubelet
├── conf
│   ├── 10-kubeadm.conf  # Not used in newer versions; it is generated directly in the shell so the cgroup driver can be detected.
│   ├── dashboard
│   │   ├── dashboard-admin.yaml
│   │   └── kubernetes-dashboard.yaml
│   ├── heapster
│   │   ├── grafana.yaml
│   │   ├── heapster.yaml
│   │   ├── influxdb.yaml
│   │   └── rbac
│   │       └── heapster-rbac.yaml
│   ├── kubeadm.yaml # kubeadm configuration file
│   ├── kubelet.service  # kubelet systemd unit file
│   ├── net
│   │   └── calico.yaml
│   └── promethus
├── images  # all images, packed into a tarball
│   └── images.tar
└── shell
    ├── init.sh  # initialization script
    └── master.sh # script run on the masters

The init.sh script copies the binaries into $PATH, configures systemd, disables swap and the firewall, and imports the images needed by the cluster.
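Roughly, the steps init.sh performs look like this (a sketch of what is described above, not the actual script; exact file names are assumptions):

cp bin/kube* /usr/bin/                          # put kubeadm, kubectl and kubelet on the PATH
cp conf/kubelet.service /etc/systemd/system/    # install the kubelet systemd unit
systemctl enable kubelet
swapoff -a                                      # disable swap
systemctl stop firewalld || true                # stop the firewall if present
docker load -i images/images.tar                # import all images needed by the cluster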

master.sh mainly executes kubeadm init

Under conf are the kubeadm configuration file, the calico yaml file and the other manifests that are needed.

sealos simply calls these two scripts, so most versions can be supported by fine-tuning the scripts and the bundle.

Topics: Linux kubelet Kubernetes ansible ssh