Weighted round-robin (wrr): an interview question that comes up surprisingly often

Posted by greeneel on Wed, 05 Jan 2022 16:19:21 +0100

Near the end of the year, recruiters and job seekers are both busy.

This year is different from the past: complaining about how bad things are has become the dominant mood, which only adds to employees' anxiety. Many people who have worked for more than ten years without touching an algorithm now find themselves back in the study hall, so to speak, clutching thick books and grinding through them.

Alas.

Recently I talked to many friends who have been interviewing and found that one algorithm question comes up rather often. Unlike the usual linked-list, tree and dynamic-programming questions, the weighted round-robin algorithm involves quite a few small tricks and is widely used in practice. The smooth variant used by Nginx in particular is very hard to come up with if you have never seen it.

The weighted polling algorithm is called Weighted Round Robin in English, or wrr for short. When you assign weights to the servers of an Nginx upstream, the default scheduling you get is wrr.

upstream backend {
   # no ip_hash here: weighted round-robin is the default method
   server 192.168.1.232 weight=4;
   server 192.168.1.233 weight=3;
   server 192.168.1.234 weight=1;
}

1. Core data structure

To make the code easier to write, we abstract each scheduled unit into a class called Element. Here peer is the concrete resource being scheduled, such as an IP address, and weight is the weight of that resource.

public class Element {
    protected String peer;
    protected int weight;

    public Element(String peer, int weight){
        this.peer = peer;
        this.weight = weight;
    }
}

The scheduling interface then simply returns the peer of the chosen element.

public interface IWrr {
    String next();
}

We test scheduling directly against the IWrr interface. For example, the test code for three resources with weights 7, 2 and 1 looks like this.

Element[] elements = new Element[]{
 new Element("A", 7),
 new Element("B", 2),
 new Element("C", 1),
};
int count = 10;
IWrr wrr = new WrrSecurityLoopTreeMap(elements);
for (int i = 0; i < count; i++) {
    System.out.print(wrr.next() + ",");
}
System.out.println();

The code above calls the interface 10 times. We want the implementation to schedule the resources in the proportion 7:2:1.

2. Random number version

The simplest way is to use random numbers. The distribution only approaches 7:2:1 when the number of requests is large, but that is usually not a problem. The Ribbon component of Spring Cloud, for example, offers a random strategy.

We first compute the sum of all weights and store it as total. On each call we draw a random number in [0, total) and walk through the weights in order, subtracting each one until the number drops below zero. The element that takes it below zero is the one selected.

The time complexity of the next method is O(n) in the worst case.

The call sequence produced by random scheduling is itself random. That suits scenarios such as polling microservice nodes. For services with very few calls, however, some nodes may starve, since the picks are random after all.

import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Random;

public class WrrRnd implements IWrr {
    final int total;
    final Element[] elements;
    final Random random = new SecureRandom();

    public WrrRnd(Element[] elements) {
        // Sum of all weights; random hits are drawn from [0, total).
        this.total = Arrays.stream(elements)
                .mapToInt(ele -> ele.weight)
                .sum();

        this.elements = elements;
    }

    @Override
    public String next() {
        int hit = random.nextInt(total);

        // Subtract weights in order; the element that pushes hit below zero owns it.
        for (Element ele : elements) {
            hit -= ele.weight;
            if (hit < 0) {
                return ele.peer;
            }
        }

        // Not reached while total equals the sum of the weights.
        return elements[elements.length - 1].peer;
    }
}
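To see how closely random picks follow the weights, a quick tally helps. The sketch below is self-contained, so it repeats the weighted pick inline instead of reusing WrrRnd. The class name WrrRndTally and the methods pick and tally are made up for this illustration, and a seeded java.util.Random stands in for SecureRandom so that runs are repeatable.

```java
import java.util.Random;

// Hypothetical demo class, not part of the original code.
class WrrRndTally {
    // Picks an index with probability weights[i] / total, the same idea as WrrRnd.next().
    static int pick(int[] weights, int total, Random random) {
        int hit = random.nextInt(total);
        for (int i = 0; i < weights.length; i++) {
            hit -= weights[i];
            if (hit < 0) {
                return i;
            }
        }
        return weights.length - 1; // not reached when total is the sum of weights
    }

    // Calls pick() `calls` times and counts how often each index was chosen.
    static int[] tally(int[] weights, int calls, long seed) {
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        Random random = new Random(seed); // seeded only to make the demo repeatable
        int[] counts = new int[weights.length];
        for (int c = 0; c < calls; c++) {
            counts[pick(weights, total, random)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = tally(new int[]{7, 2, 1}, 100_000, 42L);
        System.out.println("A=" + counts[0] + " B=" + counts[1] + " C=" + counts[2]);
    }
}
```

With 100,000 calls the counts should land close to 70,000, 20,000 and 10,000. With only 10 calls, a node with weight 1 can easily be skipped entirely, which is the starvation mentioned above.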

3. Incremental version

Random numbers are fine in most cases, but sometimes we do need exact scheduling results. In that case it is common to keep the number of calls made so far in an atomically incremented counter.

The logic is then straightforward, and we can implement the counter directly with an atomic class.

The code is the same as above, except that the hit variable is taken from the incrementing counter instead of the random number generator.

// before
int hit = random.nextInt(total);

// now
int hit = count.getAndIncrement() % total;

Of course, this has a small problem of its own: the int value eventually runs out, and once the counter overflows to a negative number the remainder is negative too. The code in the next section fixes this.
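Put together as a whole class, the incremental version might look like the sketch below. The class name WrrIncrementSketch, its constructor and the start parameter are assumptions made for this illustration. It also uses Math.floorMod to stay in range after overflow, which is a different fix from the CAS approach the article uses later, and is swapped in here only to keep the example short.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the incremental version; all names are illustrative.
class WrrIncrementSketch {
    private final String[] peers;
    private final int[] weights;
    private final int total;
    private final AtomicInteger count;

    // `start` is the initial counter value; it exists only so overflow can be demonstrated.
    WrrIncrementSketch(String[] peers, int[] weights, int start) {
        this.peers = peers;
        this.weights = weights;
        int sum = 0;
        for (int w : weights) {
            sum += w;
        }
        this.total = sum;
        this.count = new AtomicInteger(start);
    }

    String next() {
        // floorMod keeps the result in [0, total) even after the counter
        // wraps to a negative value; the plain % operator would go negative.
        int hit = Math.floorMod(count.getAndIncrement(), total);
        for (int i = 0; i < weights.length; i++) {
            hit -= weights[i];
            if (hit < 0) {
                return peers[i];
            }
        }
        return peers[peers.length - 1]; // not reached when total is the sum of weights
    }

    public static void main(String[] args) {
        WrrIncrementSketch wrr = new WrrIncrementSketch(
                new String[]{"A", "B", "C"}, new int[]{7, 2, 1}, 0);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            sb.append(wrr.next()).append(',');
        }
        System.out.println(sb); // A,A,A,A,A,A,A,B,B,C,
    }
}
```

One trade-off of this shortcut: the index stays valid at wrap-around, but the cycle position jumps once when the counter overflows, so the proportion is slightly off for that single cycle.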

4. Red black tree version

Whether we use random numbers or sequential polling, the time complexity is relatively high, because every call walks through the configured items until it reaches the one we need. To make it faster, we can trade space for time with the help of Java's TreeMap.

The following is a thread-safe implementation that spends memory to save time. TreeMap is backed by a red-black tree and keeps its keys sorted, so a lookup takes O(log n) time.

We convert the logic of the code above into TreeMap storage: each element is stored under the last hit value it owns, and the ceilingEntry method then finds the scheduling unit for any hit.

For concurrency we use the CAS primitive directly. Instead of incrementing forever, we keep the counter strictly below total and resolve conflicts by spinning.

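import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

public class WrrSecurityLoopTreeMap implements IWrr {
    final int total;
    final AtomicInteger count = new AtomicInteger();
    // Key: the last hit value owned by the element (cumulative weight - 1).
    final TreeMap<Integer, Element> pool = new TreeMap<>();

    public WrrSecurityLoopTreeMap(Element[] elements) {
        int total = 0;
        for (Element ele : elements) {
            total += ele.weight;
            pool.put(total - 1, ele);
        }
        this.total = total;
    }

    @Override
    public String next() {
        final int modulo = total;
        for (; ; ) {
            int hit = count.get();
            // Wrap around before reaching total, so the counter never overflows.
            int next = (hit + 1) % modulo;
            // If another thread advanced the counter first, the CAS fails and we retry.
            if (count.compareAndSet(hit, next) && hit < modulo) {
                // The smallest key >= hit belongs to the element that owns this hit.
                return pool.ceilingEntry(hit).getValue().peer;
            }
        }
    }
}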
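The key layout is easier to follow with concrete numbers. The sketch below builds the same kind of cumulative-weight map for the weights 7, 2 and 1 and prints which peer each hit resolves to. The class name CeilingEntryDemo and the build method are made up for this illustration, and peers are stored as plain strings to keep it self-contained.

```java
import java.util.TreeMap;

// Hypothetical demo class showing how ceilingEntry maps a hit to a peer.
class CeilingEntryDemo {
    // Stores each peer under the last hit value it owns (cumulative weight - 1).
    static TreeMap<Integer, String> build(String[] peers, int[] weights) {
        TreeMap<Integer, String> pool = new TreeMap<>();
        int total = 0;
        for (int i = 0; i < peers.length; i++) {
            total += weights[i];
            pool.put(total - 1, peers[i]);
        }
        return pool;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> pool =
                build(new String[]{"A", "B", "C"}, new int[]{7, 2, 1});
        System.out.println(pool); // {6=A, 8=B, 9=C}
        for (int hit = 0; hit < 10; hit++) {
            // ceilingEntry returns the entry with the smallest key >= hit.
            System.out.println(hit + " -> " + pool.ceilingEntry(hit).getValue());
        }
    }
}
```

Hits 0 to 6 resolve to A, hits 7 and 8 to B, and hit 9 to C, so the proportion is exact. Note that an element with weight 0 would share a key with its predecessor and overwrite it, so zero weights would need to be filtered out first.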

5. LVS version

In the versions above (except the random one), one of the biggest problems is that scheduling is unbalanced. With a ratio of 7, 2 and 1, the scheduling result is A,A,A,A,A,A,A,B,B,C.

We would like the scheduling to be smoother rather than piling onto node A. The following algorithm comes from the LVS code and uses the greatest common divisor to drive the polling. It cannot achieve very smooth polling, but it is at least much better than the incrementing code above.

The code has two parts: computing the greatest common divisor gcd, and the polling algorithm itself.

For the weights 7, 2 and 1, its scheduling result is A,A,A,A,A,A,B,A,B,C. Compared with sequential polling this is some improvement. When the weights of the nodes are close to each other, the LVS version balances the load better.

We first compute the greatest common divisor gcd in the constructor. The polling algorithm then works from this value.

The algorithm is described at the address below, and it is easy to write the code from that description.

http://kb.linuxvirtualserver.org/wiki/Weighted_Round-Robin_Scheduling

The following is the specific code.

public class WrrGcd implements IWrr {
    final int gcd;
    final int max;
    final Element[] elements;

    public WrrGcd(Element[] elements) {
        Integer gcd = null;
        int max = 0;
        for (Element ele : elements) {
            gcd = gcd == null ? ele.weight : gcd(gcd, ele.weight);
            max = Math.max(max, ele.weight);
        }
        this.gcd = gcd;
        this.max = max;
        this.elements = elements;
    }

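    // Scheduling state. These fields are not synchronized, so next() is not thread safe.
    int i = -1;   // index of the element examined last
    int cw = 0;   // current weight threshold
    @Override
    public String next() {
        for (; ; ) {
            final int n = elements.length;
            i = (i + 1) % n;
            if (i == 0) {
                // Completed a full pass: lower the threshold by gcd.
                cw = cw - gcd;
                if (cw <= 0) {
                    // Threshold used up: start over from the largest weight.
                    cw = max;
                    if (cw == 0) {
                        // All weights are zero, nothing can be scheduled.
                        return null;
                    }
                }
            }
            // Only elements whose weight reaches the threshold are eligible.
            if(elements[i].weight >= cw){
                return elements[i].peer;
            }
        }
    }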

    private int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }
}
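To check the sequences by hand, it helps to run the same loop in isolation. The sketch below is a self-contained re-implementation of the threshold loop that returns the schedule as a string. The class name LvsWrrTrace and the schedule method are made up for this illustration and are not part of LVS or of the code above.

```java
// Hypothetical trace helper for the gcd-based polling loop.
class LvsWrrTrace {
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Runs `calls` scheduling rounds and joins the chosen peers with commas.
    static String schedule(String[] peers, int[] weights, int calls) {
        int g = 0;
        int max = 0;
        for (int w : weights) {
            g = gcd(g, w);          // gcd(0, w) == w, so 0 is a safe start value
            max = Math.max(max, w);
        }
        if (max == 0) {
            return "";              // all weights are zero, nothing to schedule
        }
        int i = -1;                 // index of the element examined last
        int cw = 0;                 // current weight threshold
        StringBuilder out = new StringBuilder();
        for (int c = 0; c < calls; c++) {
            while (true) {
                i = (i + 1) % peers.length;
                if (i == 0) {
                    cw -= g;        // one full pass done, lower the threshold
                    if (cw <= 0) {
                        cw = max;   // threshold used up, restart from the top
                    }
                }
                if (weights[i] >= cw) {
                    if (c > 0) {
                        out.append(',');
                    }
                    out.append(peers[i]);
                    break;
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] peers = {"A", "B", "C"};
        System.out.println(schedule(peers, new int[]{7, 2, 1}, 10)); // A,A,A,A,A,A,B,A,B,C
        System.out.println(schedule(peers, new int[]{4, 3, 2}, 9));  // A,A,B,A,B,C,A,B,C
    }
}
```

The second call shows the point about similar weights: with 4, 3 and 2 the picks interleave far more evenly than with 7, 2 and 1.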

6. Nginx version

The Nginx version goes a step further and produces the sequence A,A,B,A,A,C,A,A,B,A. While keeping the weights exact, it spreads the calls out as much as possible.

The algorithm is ingenious, a real stroke of talent. If you have never come across it, you will not be able to write it.

Although the algorithm itself is fairly simple, proving that it is correct is not easy. For the proof, see the following link.

https://tenfy.cn/2018/11/12/smooth-weighted-round-robin/

Now look at our code. It wraps each element in a class called Wrr, which adds a current weight value, current, on top of the original weight. The value of current changes on every call.

On each call, every node's weight is added to its current, and the node with the largest current is selected as the scheduling node for that round.

The sum of all weights, total, is then subtracted from the selected node's current, and the next round proceeds the same way. The only problem is that the time complexity is always O(n), so efficiency suffers when there are many nodes.

import java.util.Arrays;

public class WrrSmooth implements IWrr {
    // Pairs an element with its running "current" weight.
    static class Wrr {
        Element ele;
        int current = 0;
        Wrr(Element ele){
            this.ele = ele;
        }
    }

    final Wrr[] cachedWeights;

    public WrrSmooth(Element[] elements) {
        this.cachedWeights = Arrays.stream(elements)
                .map(Wrr::new)
                .toArray(Wrr[]::new);
    }

    // Mutates shared state without synchronization, so it is not thread safe.
    @Override
    public String next() {
        int total = 0;
        Wrr shed = cachedWeights[0];

        for(Wrr item : cachedWeights){
            int weight = item.ele.weight;
            total +=  weight;

            // Every node gains its own weight each round.
            item.current += weight;
            // Strict > keeps the earliest node on ties.
            if(item.current > shed.current){
                shed = item;
            }
        }
        // The selected node pays back the sum of all weights.
        shed.current -= total;
        return shed.ele.peer;
    }
}
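Watching the current values round by round makes the algorithm much less mysterious. The sketch below is a self-contained re-implementation of the same loop that also prints the state after each round. The class name SmoothWrrTrace and the schedule method are made up for this illustration.

```java
import java.util.Arrays;

// Hypothetical trace helper for the smooth weighted round-robin loop.
class SmoothWrrTrace {
    // Runs `calls` rounds, optionally printing state, and joins the picks with commas.
    static String schedule(String[] peers, int[] weights, int calls, boolean verbose) {
        int[] current = new int[weights.length];
        StringBuilder out = new StringBuilder();
        for (int c = 0; c < calls; c++) {
            int total = 0;
            int best = 0;
            for (int i = 0; i < weights.length; i++) {
                total += weights[i];
                current[i] += weights[i];        // every node gains its weight
                if (current[i] > current[best]) {
                    best = i;                    // strict >, earliest node wins ties
                }
            }
            current[best] -= total;              // the winner pays back the total
            if (c > 0) {
                out.append(',');
            }
            out.append(peers[best]);
            if (verbose) {
                System.out.println("round " + (c + 1) + ": pick " + peers[best]
                        + ", current = " + Arrays.toString(current));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String result = schedule(new String[]{"A", "B", "C"}, new int[]{7, 2, 1}, 10, true);
        System.out.println(result); // A,A,B,A,A,C,A,A,B,A
    }
}
```

After 10 rounds with weights 7, 2 and 1, all current values are back at 0, so the same 10-call pattern repeats from there.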

The Nginx version is very short. I recommend understanding it well, and mastering how to write both the red-black tree version and the Nginx version.

End

Interviews generally focus on the random and incremental versions, and the red-black tree version may come up as well. As for the LVS and Nginx versions, if you have never met them before, you probably cannot write them unless you are a genius.

But if you are a genius, do you still need to sit through such a mundane interview?
