Caching, degradation, and rate limiting are the three powerful tools for protecting highly concurrent systems.
What is rate limiting? To put it bluntly, rate limiting is just limiting the flow of traffic.
We all know that a server's processing capacity has an upper limit. If requests keep being let in beyond that limit, the consequences can be uncontrollable.
With rate limiting, once the number of requests exceeds a threshold, the system queues them or even refuses service. When the system cannot bear the high concurrency, it degrades to partial service rather than no service at all.
For example, masks are in short supply everywhere. To ease the situation where citizens cannot buy masks, the Wuhan government launched a reservation service: only those with a reservation can buy a small number of masks at designated drugstores.
This is rate limiting in everyday life. I hope you all protect yourselves and take precautions during this period.
Next, let's go through the common approaches to interface rate limiting. Each algorithm is roughly implemented in Python + Redis, with illustrations as the centerpiece. Take your time and savor them~
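All the snippets below assume a `redis_conn` connection object. As a minimal setup sketch (assuming the redis-py client; any Redis client would do, and host/port/db are placeholder values):

```python
import redis

# redis_conn is used by every snippet below;
# these connection parameters point at a local Redis instance
redis_conn = redis.Redis(host='localhost', port=6379, db=0)
```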
Fixed window method
The fixed window method is the simplest rate limiting algorithm. Say we want to allow at most 100 requests per minute: starting from now, at most 100 requests may pass within one minute; when the minute ends, the counter resets and the count starts over, again and again.
Pseudocode implementation

```python
def can_pass_fixed_window(user, action, time_zone=60, times=30):
    """
    :param user: unique user ID
    :param action: ID of the interface the user is accessing (i.e. the user's action on the client)
    :param time_zone: length of the limited time period, in seconds
    :param times: how many requests are allowed to pass within the time period
    """
    key = '{}:{}'.format(user, action)
    # redis_conn is the Redis connection object
    count = redis_conn.get(key)
    if not count:
        # first request in this window: initialize the counter with an expiry
        redis_conn.setex(key, time_zone, 1)
        return True
    if int(count) < times:
        redis_conn.incr(key)
        return True
    return False
```
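A quick usage sketch (the user ID, action name, and limits here are made up for illustration):

```python
# allow at most 5 requests per 10 seconds for this user/action pair
for i in range(7):
    allowed = can_pass_fixed_window('user123', 'send_msg', time_zone=10, times=5)
    print(i, allowed)  # the first 5 calls print True, the last 2 print False
```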
Although this method is simple, it has a big problem: it cannot handle burst traffic across the boundary between two windows. As shown in the figure above, if 100 requests arrive in the second before the counter resets and another 100 in the second after, the server receives 200 requests (double the limit) within a very short time, which may crush the system. This happens because our counting granularity is too coarse. To reduce the impact of this boundary problem, we can use the sliding window method.
Sliding window method
Simply put, in the sliding window method the time window keeps moving forward as time passes, and a counter continuously tracks the number of requests inside the window, ensuring the maximum is never exceeded in any time period. For example, if the current window is 0s~60s with 40 requests, then 10s later the window becomes 10s~70s with 60 requests.
Both the sliding of the window and the counting can be implemented with a Redis sorted set. The score is a millisecond timestamp; the window's left boundary is the current timestamp minus the window size, so a range filter on the score circles out the window. The member value only needs to uniquely identify the user behavior, and the millisecond timestamp works fine for that too. Finally, count the number of requests inside the window and decide.
Pseudocode implementation

```python
import time

def can_pass_slide_window(user, action, time_zone=60, times=30):
    """
    :param user: unique user ID
    :param action: ID of the interface the user is accessing (i.e. the user's action on the client)
    :param time_zone: length of the limited time period, in seconds
    :param times: how many requests are allowed to pass within the time period
    """
    key = '{}:{}'.format(user, action)
    now_ts = time.time() * 1000
    # the member value only needs to be unique; the millisecond
    # timestamp is used as the unique value here
    value = now_ts
    # left boundary of the time window
    old_ts = now_ts - (time_zone * 1000)
    # record the behavior (redis-py 3.x style: {member: score})
    redis_conn.zadd(key, {value: now_ts})
    # delete data that falls before the time window
    redis_conn.zremrangebyscore(key, 0, old_ts)
    # get the number of behaviors inside the window
    count = redis_conn.zcard(key)
    # set an expiry so idle keys don't take up space
    redis_conn.expire(key, time_zone + 1)
    if not count or count < times:
        return True
    return False
```
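One caveat with the sketch above: the four Redis calls are issued separately, so concurrent requests can interleave between them. A pipeline sends them in a single round trip (redis-py wraps a pipeline in MULTI/EXEC by default); a minimal variant, assuming the same `redis_conn`:

```python
def can_pass_slide_window_pipelined(user, action, time_zone=60, times=30):
    key = '{}:{}'.format(user, action)
    now_ts = time.time() * 1000
    old_ts = now_ts - (time_zone * 1000)
    # group all four commands into one MULTI/EXEC transaction
    pipe = redis_conn.pipeline()
    pipe.zadd(key, {now_ts: now_ts})
    pipe.zremrangebyscore(key, 0, old_ts)
    pipe.zcard(key)
    pipe.expire(key, time_zone + 1)
    _, _, count, _ = pipe.execute()
    return count < times
```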
Although the sliding window method avoids the boundary problem, it still cannot prevent requests from piling up within a fine slice of time. For example, with a limit of 60 requests per minute, all 60 requests might arrive within the 59th second, which greatly weakens the sliding window's effect. To make traffic smoother, we can use the more advanced token bucket and leaky bucket algorithms.
Token bucket method
The idea of the token bucket algorithm is not complicated. Tokens are generated at a fixed rate and put into a bucket of fixed capacity; tokens beyond the bucket's capacity are discarded. Each request tries to take one token: only requests that obtain a token are let through, and requests that fail to get one are discarded.
Pseudocode implementation

The token bucket method, step by step:
- When a request arrives, compute the number of tokens generated since the last refill; the generation rate is fixed
- Tokens beyond the bucket's capacity are discarded
- A request that obtains a token passes; otherwise it is rejected
```python
def can_pass_token_bucket(user, action, time_zone=60, times=30):
    """
    :param user: unique user ID
    :param action: ID of the interface the user is accessing (i.e. the user's action on the client)
    :param time_zone: length of the limited time period, in seconds
    :param times: how many requests are allowed to pass within the time period
    """
    key = '{}:{}'.format(user, action)
    rate = times / time_zone  # token generation rate
    capacity = times          # bucket capacity
    tokens = redis_conn.hget(key, 'tokens')        # how many tokens are in the bucket
    last_time = redis_conn.hget(key, 'last_time')  # last token generation time
    now = time.time()
    tokens = float(tokens) if tokens else capacity
    last_time = float(last_time) if last_time else now
    delta_tokens = (now - last_time) * rate  # tokens generated since the last refill
    if delta_tokens > 1:
        tokens = tokens + delta_tokens  # add the newly generated tokens
        if tokens > capacity:
            tokens = capacity  # discard tokens beyond the bucket capacity
        last_time = now  # record the token generation time
        redis_conn.hset(key, 'last_time', last_time)
    if tokens >= 1:
        tokens -= 1  # the incoming request consumes one token
        redis_conn.hset(key, 'tokens', tokens)
        return True
    return False
```
The token bucket method limits the average inflow rate of requests. Its advantage is that it can absorb a certain amount of burst traffic while preserving the shape of the original traffic to some extent. It is not hard to implement and suits most application scenarios.
Leaky bucket algorithm
The idea of the leaky bucket algorithm is somewhat the opposite of the token bucket. Picture requests as water: water can flow into the leaky bucket at any rate, while the bucket leaks water out at a fixed rate. If the inflow is too fast, the bucket overflows, and overflowing requests are discarded.
As the figure above shows, the leaky bucket method does not limit the rate at which requests flow in, only the rate at which they flow out. In this way, burst traffic is shaped into a steady, smooth flow.
One thing about implementing the leaky bucket algorithm is worth noting: while browsing related material, I found that most pseudocode implementations online only implement half of it (more on that below).
According to Wikipedia, the leaky bucket algorithm has two implementations, as a meter and as a queue, whose ideas differ. Let me briefly introduce both.
Leaky bucket as a meter
The meter-based implementation is relatively simple: it is essentially a counter. When a message needs to be sent, check whether the counter still has room; if it does, the message can be processed, and if not, the message is discarded.
So where does this counter come from? The meter-based counter is the sending rate. For example, if you set the rate to at most 5 messages/s, the counter is 5; each message sent within one second decrements it by one, and when you try to send the 6th message, the counter has run out and the message is discarded.
This implementation is somewhat similar to the fixed window method introduced at the beginning, only with a finer time granularity, so no pseudocode is given here.
Leaky bucket as a queue
The queue-based implementation is more complex, but the principle is simple. It also has a counter, except this counter represents not the rate limit but the queue size. When a message needs to be sent, check whether the queue has room: if so, put the message into the queue, which serves messages in FIFO order; if the queue is full, the message is discarded.
After messages are put into the queue, a timer must be maintained whose period matches the rate we set. For example, at a rate of 5 messages/s, the timer period is 200ms: every 200ms the timer takes one message from the queue and sends it, or does nothing if the queue is empty.
Note that many leaky bucket pseudocode implementations online only implement the part where water flows into the bucket, not the key part where water leaks out. Implementing only the first half is really no different from a token bucket 😯
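To make the "leak" half concrete, here is a minimal in-process sketch of the queue-based leaky bucket. This is a single-machine toy rather than the Redis versions above, and `handle()` is a hypothetical stand-in for whatever actually processes a request:

```python
import queue
import threading
import time

def handle(request):
    print('processing', request)  # stand-in for real downstream work

class LeakyBucket:
    def __init__(self, capacity=30, leak_rate=5):
        # the bucket is a bounded FIFO queue; its size plays the counter's role
        self.bucket = queue.Queue(maxsize=capacity)
        self.interval = 1.0 / leak_rate  # e.g. 5 msgs/s -> one leak every 200ms
        threading.Thread(target=self._leak, daemon=True).start()

    def add(self, request):
        """Inflow: water may pour in at any rate; overflow is discarded."""
        try:
            self.bucket.put_nowait(request)
            return True
        except queue.Full:
            return False  # bucket overflowed, request discarded

    def _leak(self):
        """Outflow: take one request off the queue at a fixed rate."""
        while True:
            time.sleep(self.interval)
            try:
                handle(self.bucket.get_nowait())
            except queue.Empty:
                pass  # nothing to send this tick
```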
If all of the above feels like too much work to implement, I suggest trying the redis-cell module!
redis-cell
Redis 4.0 introduced a module system, and redis-cell is a rate-limiting module built on it. The module also uses the leaky bucket algorithm and provides an atomic rate-limiting command, which makes the rate limiting problem very simple. It needs to be installed separately (there are plenty of installation tutorials online), and it has only one command: CL.THROTTLE.
```
CL.THROTTLE user123 15 30 60 1
            ▲       ▲  ▲  ▲  ▲
            |       |  |  |  └── apply 1 operation (default if omitted): water drops consumed per request
            |       |  └──┴───── 30 operations / 60 seconds: the rate at which water leaks out
            |       └─────────── 15: max_burst, the capacity of the leaky bucket
            └─────────────────── key "user123": the user behavior being limited
```
After executing the above command, redis will return the following information:
```
> CL.THROTTLE laoqian:reply 15 30 60
1) (integer) 0   # 0 means allowed, 1 means rejected
2) (integer) 16  # capacity of the leaky bucket
3) (integer) 15  # remaining space in the bucket (left_quota)
4) (integer) -1  # if rejected, seconds until the bucket has space to retry (-1 when allowed)
5) (integer) 2   # seconds until the leaky bucket is completely empty
```
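Calling it from Python is also straightforward. Since redis-py has no dedicated wrapper for module commands, `execute_command` can send CL.THROTTLE directly; a minimal sketch reusing the parameter names from earlier (the function name and defaults are my own):

```python
def can_pass_redis_cell(user, action, max_burst=15, times=30, time_zone=60):
    key = '{}:{}'.format(user, action)
    # CL.THROTTLE key max_burst count period [quantity]
    reply = redis_conn.execute_command(
        'CL.THROTTLE', key, max_burst, times, time_zone, 1)
    # reply[0] is 0 when the request is allowed, 1 when it is rejected
    return reply[0] == 0
```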
With this Redis module, you can easily handle most rate limiting scenarios.
Finally, thank you for reading. Every like, comment, and share is the greatest encouragement for us~
If you have any questions, please discuss them in the comment area!