HAProxy algorithm summary

Posted by tamilmani on Sat, 22 Jan 2022 04:36:47 +0100

1. HAProxy scheduling algorithm:

HAProxy selects the scheduling algorithm for back-end servers with the balance keyword, which can be set in a listen or backend section (or in defaults).
HAProxy's scheduling algorithms fall into static and dynamic groups, although some algorithms can act as either one depending on their parameters.
Official documentation: https://cbonte.github.io/haproxy-dconv/2.0/configuration.html#4

1.1 static algorithm

Static algorithms: requests are scheduled in turn according to predefined rules, regardless of the back-end servers' current load, connection count, or response speed. Weights cannot be adjusted at runtime; a change only takes effect after HAProxy is restarted.

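Adjusting server weights at runtime through the stats socket (the socket must be declared with "stats socket" in the global section, at admin level for set weight):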

[root@node1 ~]# yum install -y socat
#socat is a multipurpose network tool for Linux; its name comes from "SOcket CAT". It relays data between two data streams and supports many protocols and address types, such as IP, TCP, UDP, IPv6, and socket files.

[root@node1 ~]# echo "show info" |socat stdio /var/lib/haproxy/haproxy.sock1

#Get server weight
[root@node1 ~]# echo "get weight web_80/web01" |socat stdio /var/lib/haproxy/haproxy.sock1
1 (initial 1)
#Set weight
[root@node1 ~]# echo "set weight web_80/web01 2" |socat stdio /var/lib/haproxy/haproxy.sock1

1.1.1 static-rr

static-rr: weight-based round-robin scheduling. It does not support runtime weight adjustment or slow start for back-end servers, and it places no limit on the number of back-end servers.

[root@node1 ~]# vim /etc/haproxy/haproxy.cfg
listen web_80
        bind 10.10.100.101:80
        mode http
        balance static-rr
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5
        
#test
[root@node1 ~]# while true; do curl -L http://10.10.100.101; sleep 1s; done
web02 10.10.100.103
web01 10.10.100.102
web01 10.10.100.102
web02 10.10.100.103
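
To make the rotation concrete, here is a minimal Python sketch of weighted round robin. It is only an illustration: the function name build_static_rr and the slot-expansion approach are assumptions made for this example, not HAProxy's internal implementation, which may order the slots differently.

```python
from itertools import cycle

def build_static_rr(servers):
    """Expand {name: weight} into a fixed rotation, one slot per weight unit."""
    slots = [name for name, weight in servers.items() for _ in range(weight)]
    return cycle(slots)

# Same two servers and weights as in the listen block above.
rotation = build_static_rr({"web01": 1, "web02": 1})
picks = [next(rotation) for _ in range(4)]
print(picks)  # ['web01', 'web02', 'web01', 'web02']
```

Because the rotation is built once from the configured weights, changing a weight means rebuilding it, which is the static behaviour described above.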

1.1.2 first

first: servers are tried from top to bottom in the order they appear in the server list. A new request goes to the next server only once the current server has reached its connection limit (maxconn), so the servers' weight settings are ignored.

[root@node1 ~]# vim /etc/haproxy/haproxy.cfg
listen web_80
        bind 10.10.100.101:80
        mode http
        balance first
        server web01 10.10.100.102:80 maxconn 2 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5

#test
[root@node1 ~]# while true; do curl http://10.10.100.101; sleep 0.1;done
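
The fill-then-spill behaviour can be sketched in Python as follows. The helper pick_first and the connection bookkeeping are hypothetical, written only for this illustration; the maxconn value of 2 mirrors the web01 line in the configuration above.

```python
def pick_first(servers, active):
    """Return the first server (in list order) that still has a free slot.

    servers: list of (name, maxconn) in configuration order; None = no limit.
    active:  dict mapping server name to its current connection count.
    """
    for name, maxconn in servers:
        if maxconn is None or active.get(name, 0) < maxconn:
            return name
    return None  # every server is full; the request would have to wait

servers = [("web01", 2), ("web02", None)]
active = {}
order = []
for _ in range(4):  # four connections that all stay open
    chosen = pick_first(servers, active)
    active[chosen] = active.get(chosen, 0) + 1
    order.append(chosen)
print(order)  # ['web01', 'web01', 'web02', 'web02']
```

Note that the weights never enter the decision; only list position and maxconn do.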

1.2 dynamic algorithm

Dynamic algorithms: scheduling is adjusted according to the current state of the back-end servers, for example by preferring the server with the lowest current load. Weights can be adjusted while HAProxy is running, without a restart.

1.2.1 roundrobin

roundrobin: weight-based round-robin scheduling, and the default algorithm. Unlike static-rr it is dynamic: it supports runtime adjustment of the real servers' weights, so it is not exactly the same as the rr mode in LVS. It also supports slow start (a newly added server receives a gradually increasing share of requests). Each backend supports at most 4095 active servers.

listen web_80
        bind 10.10.100.101:80
        mode http
        balance roundrobin
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5

#test
[root@node1 ~]# while true; do curl http://10.10.100.101; sleep 1;done
web01 10.10.100.102
web02 10.10.100.103
web01 10.10.100.102
web02 10.10.100.103

#Dynamically adjust weights
[root@node1 ~]# echo "get weight web_80/web01" |socat stdio /var/lib/haproxy/haproxy.sock1
1 (initial 1)

[root@node1 ~]# echo "set weight web_80/web01 3" |socat stdio /var/lib/haproxy/haproxy.sock1

[root@node1 ~]# echo "get weight web_80/web01" |socat stdio /var/lib/haproxy/haproxy.sock1
3 (initial 1)
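
The effect of set weight on the rotation can be illustrated with a small Python simulation. It uses the well-known smooth weighted round-robin technique as a stand-in; HAProxy's own roundrobin implementation works differently internally, and the class name SmoothWeightedRR is made up for this sketch.

```python
class SmoothWeightedRR:
    """Smooth weighted round robin whose weights can change between picks."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.current = {name: 0 for name in weights}

    def set_weight(self, name, weight):
        # Counterpart of "set weight <backend>/<server> <n>" in this toy model.
        self.weights[name] = weight

    def pick(self):
        total = sum(self.weights.values())
        for name, weight in self.weights.items():
            self.current[name] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= total
        return chosen

lb = SmoothWeightedRR({"web01": 1, "web02": 1})
before = [lb.pick() for _ in range(4)]
lb.set_weight("web01", 3)  # no restart needed
after = [lb.pick() for _ in range(8)]
print(before)  # ['web01', 'web02', 'web01', 'web02']
print(after.count("web01"), after.count("web02"))  # 6 2
```

After the weight change web01 receives three of every four requests, without the balancer being rebuilt.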

1.2.2 leastconn

leastconn: weighted least-connection scheduling. It is dynamic and supports runtime weight adjustment and slow start. Each new client connection goes to the back-end server with the fewest current connections, which suits long-lived connections such as MySQL.

listen web_80
        bind 10.10.100.101:80
        mode http
        balance leastconn
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5
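
A minimal Python sketch of the selection rule, assuming the weighting is applied as connections divided by weight (an assumption for illustration; pick_leastconn is a hypothetical helper and the connection counts are invented):

```python
def pick_leastconn(conns, weights):
    """Pick the server with the lowest connections-per-weight ratio.

    Weights are assumed to be positive in this sketch.
    """
    return min(conns, key=lambda name: conns[name] / weights[name])

weights = {"web01": 1, "web02": 1}
conns = {"web01": 5, "web02": 2}  # invented current connection counts
chosen = pick_leastconn(conns, weights)
print(chosen)  # web02
```

With long-lived connections the counts differ noticeably between servers, which is why this rule helps there more than with short HTTP requests.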

1.3 other algorithms

The remaining algorithms can act as either static or dynamic algorithms, depending on the hash-type option.

1.3.1 source

source: the client's source address is hashed, and the request is forwarded to the back-end server selected by that hash, so later requests from the same source address reach the same back-end web server. This suits scenarios such as session persistence and caching services. The algorithm is static by default, but this can be changed with the hash-type option.

With source hashing there are two ways to map a client request to a back-end server: the modulo (map-based) method and consistent hashing.

1.3.1.1 map-based (modulo) method

map-based: the modulo method, which works on a hash array sized by the total weight of the servers. This hash is static: it does not support online weight adjustment or slow start. It spreads requests evenly across the back-end servers. The drawback is that when the total weight changes, for example when a server goes online or offline, the scheduling result changes for many clients.
The modulo operation returns the remainder of a division, for example 10 % 7 = 3 and 7 % 4 = 3. The server is chosen by taking the hash value (at most 2^32 - 1) modulo the total weight, for example hash % (1 + 1 + 2).
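
The drawback of the modulo method is easy to see in a short Python experiment. This is a sketch under stated assumptions: zlib.crc32 stands in for HAProxy's real hash function, the client addresses 192.168.1.x are invented, and web03 is a hypothetical third server with weight 2, so that the total weight becomes 1 + 1 + 2.

```python
import zlib

def map_based_pick(key, servers):
    """servers: list of (name, weight). Pick table[hash % total_weight]."""
    table = [name for name, weight in servers for _ in range(weight)]
    return table[zlib.crc32(key.encode()) % len(table)]

clients = [f"192.168.1.{i}" for i in range(1, 201)]
two = [("web01", 1), ("web02", 1)]
three = two + [("web03", 2)]  # total weight changes from 2 to 4

moved = sum(map_based_pick(c, two) != map_based_pick(c, three) for c in clients)
print(f"{moved} of {len(clients)} clients changed server")
```

Every client whose hash lands in the new part of the table is reassigned, and with a different weight layout clients can also move between the existing servers.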

[Figure: schematic diagram of the modulo method]

Modulo method configuration example:

listen web_80
        bind 10.10.100.101:80
        mode http
        balance source
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5
        
1.3.1.2 consistent hash

Consistent hashing is dynamic: it supports online weight adjustment and slow start. Its advantage is that when the total weight of the servers changes, only a local part of the scheduling result is affected, so there are no large-scale changes.

Mapping between hash objects and back-end servers:

[Figure: consistent hash diagram]

Example:

listen web_80
        bind 10.10.100.101:80
        mode http
        balance source
        #Use consistent hashing instead of the default map-based method
        hash-type consistent
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5
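
The "local impact" property can be checked with a small hash ring in Python. This is a generic consistent-hashing sketch, not HAProxy's implementation: zlib.crc32 is a stand-in hash, the class HashRing, the 100 points per weight unit, the client addresses, and the extra server web03 are all assumptions for the example.

```python
import bisect
import zlib

def h32(text):
    """32-bit stand-in hash (CRC32)."""
    return zlib.crc32(text.encode())

class HashRing:
    def __init__(self, servers, points_per_weight=100):
        # Each server contributes weight * points_per_weight points on the ring.
        self.ring = sorted(
            (h32(f"{name}#{i}"), name)
            for name, weight in servers
            for i in range(weight * points_per_weight)
        )
        self.points = [point for point, _ in self.ring]

    def pick(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self.points, h32(key)) % len(self.ring)
        return self.ring[idx][1]

clients = [f"192.168.1.{i}" for i in range(1, 201)]
before = HashRing([("web01", 1), ("web02", 1)])
after = HashRing([("web01", 1), ("web02", 1), ("web03", 1)])
moved = [c for c in clients if before.pick(c) != after.pick(c)]
print(len(moved), "of", len(clients), "clients moved")
print(all(after.pick(c) == "web03" for c in moved))  # True
```

Clients only ever move to the newly added server; nobody is shuffled between web01 and web02, which is the difference from the modulo method.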

1.3.2 uri

uri: the URI requested by the user is hashed, and the request is forwarded to the back-end server selected by that hash. As with source, hash-type map-based or hash-type consistent selects the modulo method or consistent hashing.

http://example.org/absolute/URI/with/absolute/path/to/resource.txt  #URI/URL
ftp://example.org/resource.txt  #URI/URL
/relative/URI/with/absolute/path/to/resource.txt  #URI

1.3.2.1 uri modulo method configuration example:
listen web_80
        bind 10.10.100.101:80
        mode http
        balance uri
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5

1.3.2.2 uri consistent hash configuration example:
listen web_80
        bind 10.10.100.101:80
        mode http
        balance uri
        hash-type consistent
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5

#Test with different URIs: requests for the same URI are always scheduled to the same server
[root@node1 ~]# while true; do curl http://10.10.100.101; sleep 1;done
web02 10.10.100.103
web02 10.10.100.103
web02 10.10.100.103

[root@node1 ~]# while true; do curl http://10.10.100.101/web/index.html; sleep 1;done
web03 10.10.100.103
web03 10.10.100.103
web03 10.10.100.103
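
A Python sketch of what gets hashed may help here. According to the HAProxy documentation, balance uri hashes the part of the URI before the question mark unless the whole option is given; the sketch models that. zlib.crc32 is a stand-in hash, and the helper names and sample URIs are hypothetical.

```python
import zlib

def uri_hash_key(uri, whole=False):
    """Return the part of the URI that is hashed."""
    return uri if whole else uri.split("?", 1)[0]

def pick_by_uri(uri, servers, whole=False):
    key = uri_hash_key(uri, whole)
    return servers[zlib.crc32(key.encode()) % len(servers)]

servers = ["web01", "web02"]
a = pick_by_uri("/web/index.html?user=1", servers)
b = pick_by_uri("/web/index.html?user=2", servers)
print(a == b)  # True: both requests share the left part /web/index.html
```

This is why uri suits cache servers: the same object is always fetched from the same back end, whatever the query string.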

1.3.3 url_param

url_param: the value of the named parameter in the query string of the requested URL is hashed and divided by the total weight of the servers, and the request is sent to the server selected by the result. It is typically used to track users, so that requests from the same user always reach the same real server.

Assume URL = http://www.cwy.com/foo/bar/index.php?k1=v1&k2=v2

Then:
host = www.cwy.com
url_param = "k1=v1&k2=v2"
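
The same breakdown can be reproduced with Python's urllib.parse, followed by a sketch of hashing one parameter's value. The helper pick_by_url_param and the use of zlib.crc32 are assumptions for illustration. The HAProxy documentation describes a round-robin fallback when the parameter is missing; the sketch signals that case by returning None.

```python
from urllib.parse import urlsplit, parse_qs
import zlib

def pick_by_url_param(url, param, servers):
    """Hash the value of one query-string parameter to choose a server."""
    values = parse_qs(urlsplit(url).query).get(param)
    if not values:
        return None  # parameter missing: the caller would fall back to round robin
    return servers[zlib.crc32(values[0].encode()) % len(servers)]

url = "http://www.cwy.com/foo/bar/index.php?k1=v1&k2=v2"
servers = ["web01", "web02"]
print(urlsplit(url).hostname)  # www.cwy.com
print(urlsplit(url).query)     # k1=v1&k2=v2
print(pick_by_url_param(url, "k3", servers))  # None
```

Two requests carrying the same value for the chosen parameter map to the same server, even if the rest of the URL differs.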

1.3.3.1 url_param modulo method configuration example:
listen web_80
        bind 10.10.100.101:80
        mode http
        #url_param takes a single parameter name
        balance url_param name
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5


1.3.3.2 url_param consistent hash configuration example:
listen web_80
        bind 10.10.100.101:80
        mode http
        #url_param takes a single parameter name
        balance url_param name
        hash-type consistent
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5

##test

1.3.4 hdr

hdr: the HTTP header named in hdr(<name>) is taken from each request and its value is hashed, then divided by the total weight of the servers, and the request is sent to the server selected by the result. If the header is absent or has no valid value, round-robin scheduling is used instead.
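
The hash-or-fall-back logic described above can be sketched in Python. The helper make_hdr_balancer is hypothetical, zlib.crc32 is a stand-in hash, and the header lookup is a plain dictionary access (real HTTP header names are case-insensitive, which this sketch ignores).

```python
import itertools
import zlib

def make_hdr_balancer(header, servers):
    """Return a pick(headers) function that hashes one request header."""
    fallback = itertools.cycle(servers)  # round robin for requests without the header

    def pick(headers):
        value = headers.get(header)
        if not value:
            return next(fallback)
        return servers[zlib.crc32(value.encode()) % len(servers)]

    return pick

pick = make_hdr_balancer("User-Agent", ["web01", "web02"])
same = {pick({"User-Agent": "curl/7.61.1"}) for _ in range(5)}
no_header = [pick({}) for _ in range(4)]
print(len(same))   # 1: the same header value always reaches the same server
print(no_header)   # ['web01', 'web02', 'web01', 'web02']
```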

1.3.4.1 hdr modulo method configuration example
listen web_80
        bind 10.10.100.101:80
        mode http
        #User-Agent: the client's browser type
        balance hdr(User-Agent)
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5

1.3.4.2 hdr consistent hash configuration example
listen web_80
        bind 10.10.100.101:80
        mode http
        balance hdr(User-Agent)
        hash-type consistent
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5


test

1.3.5 rdp-cookie

rdp-cookie: load balancing for Windows remote desktop (RDP) sessions. The RDP cookie is hashed so that the same session keeps reaching the same server.

1.3.5.1 rdp-cookie modulo method configuration example
listen RDP
        bind 10.10.100.101:3389
        mode tcp
        balance rdp-cookie
        server rdp01 10.10.100.06:3389 weight 1 check inter 3000 fall 3 rise 5

1.3.5.2 rdp-cookie consistent hash configuration example
listen RDP
        bind 10.10.100.101:3389
        mode tcp
        balance rdp-cookie
        hash-type consistent
        server rdp01 10.10.100.06:3389 weight 1 check inter 3000 fall 3 rise 5

1.3.6 random

random: added in version 1.9, this algorithm uses a random number as the key for the consistent hash function. It is useful for large server farms, or when servers are added and removed frequently, because it avoids the hammering effect that roundrobin or leastconn could cause in those situations.

1.3.6.1 random configuration example
listen web_80
        bind 10.10.100.101:80
        mode http
        balance random
        server web01 10.10.100.102:80 weight 1 check inter 3000 fall 3 rise 5
        server web02 10.10.100.103:80 weight 1 check inter 3000 fall 3 rise 5
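
As a rough Python illustration, random selection can be modelled as a weighted random draw. This is a simplification: HAProxy feeds the random number into its consistent hash function, whereas the sketch draws directly with random.choices. The helper pick_random and the fixed seed are assumptions made so the example is repeatable.

```python
import random
from collections import Counter

def pick_random(servers, rng):
    """servers: list of (name, weight). Draw one server at random by weight."""
    names = [name for name, _ in servers]
    weights = [weight for _, weight in servers]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed only to make the run repeatable
servers = [("web01", 1), ("web02", 1)]
counts = Counter(pick_random(servers, rng) for _ in range(1000))
print(counts)
```

Because each pick is independent of earlier ones, a server that has just been added is not flooded with requests the way it could be when a rotation or connection count starts from zero.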

1.4 algorithm summary

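static-rr   -------> tcp/http   static
first       -------> tcp/http   static

roundrobin  -------> tcp/http   dynamic
leastconn   -------> tcp/http   dynamic
random      -------> tcp/http   dynamic

source      -------> tcp/http   static or dynamic, depending on whether hash-type is consistent
uri         -------> http       static or dynamic, depending on whether hash-type is consistent
url_param   -------> http       static or dynamic, depending on whether hash-type is consistent
hdr         -------> http       static or dynamic, depending on whether hash-type is consistent
rdp-cookie  -------> tcp        static or dynamic, depending on whether hash-type is consistent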

1.5 usage scenarios of each algorithm

first        #rarely used

static-rr    #web clusters that already share sessions
roundrobin
random

leastconn    #databases
source       #session persistence based on the client's public IP

uri         -------> http   #cache servers and CDN providers (Lanxun, Baidu, alicloud, Tencent)
url_param   -------> http

hdr          #further processing based on the headers of the client request

rdp-cookie   #rarely used

Topics: Linux Load Balance haproxy