How to implement distributed locks with Redis?

Posted by Virii on Thu, 03 Mar 2022 13:15:01 +0100

Brief introduction

  I suspect that for many people the biggest motivation to learn distributed locks is not the needs of their own systems but the needs of interviewers... That in itself shows how important distributed locks are: they come up often as an interview question. Before diving in, we need to clear up a few questions.

1. Are locks important?

   Of course they are. Whenever multiple threads access a critical resource, you need a lock; otherwise you run into thread-safety problems.

2. Then why not use Java's built-in locks, such as synchronized and Lock?

   The point to be clear about is that Java's built-in locks only work within a single JVM. In a distributed service, requests are handled concurrently by multiple JVMs, and these locks cannot coordinate across them. In a distributed environment, the lock has to be provided by a third-party service.

3. What are the commonly used distributed lock implementations?
  • Distributed lock based on MySQL
  • Redis based distributed lock
  • Distributed lock based on ZooKeeper

   Put bluntly, any store that can hold data and update it atomically can be used to build a distributed lock, because all we need is to tell other threads that the resource is currently occupied. synchronized and Lock likewise store a flag that tells other threads whether the resource is occupied. There is no mystery to it.

Three implementation methods

  First of all, we need to understand what requirements a distributed lock should meet:

  1. Mutual exclusion. (Across the cluster, the protected method can be executed by only one thread on one machine at a time.)
  2. Reentrancy. (A recursive call by the thread that already holds the lock should not block, which avoids self-deadlock.)
  3. Lock timeout. (Avoids deadlock when the holder crashes, loops forever, or otherwise never releases the lock.)
  4. The lock must be released by the client that acquired it. (Unless the lock is released automatically on expiry, locking and unlocking must be done by the same client.)

Next, we briefly introduce the implementation principles of these three locks, along with their advantages and disadvantages.

Distributed lock based on MySQL

   MySQL based locks are mainly implemented in two ways: pessimistic locks and optimistic locks.

  1. Pessimistic lock: it mainly uses select ... where ... for update to lock the queried row and then operate on it. Note that in a condition such as "where name = lock", the name field must be indexed; otherwise the whole table ends up locked.
  2. Optimistic lock: it is based on the idea of CAS (compare-and-swap). It assumes that lock contention is rare, and a conflict is detected only when an update that checks the version fails. When contention is low it is a good solution, but heavy contention wastes CPU resources on retries.
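
The optimistic approach can be sketched without a database. The following is a minimal in-memory simulation, not real JDBC code: the hypothetical VersionedRow class stands in for a table row with a version column, and updateIfVersionMatches plays the role of an UPDATE statement that only matches the version the caller read earlier, returning the number of affected rows.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for one table row with a version column.
class VersionedRow {
    // Immutable snapshot of the row: payload plus version number.
    static final class Snapshot {
        final String value;
        final long version;

        Snapshot(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> row =
            new AtomicReference<>(new Snapshot("initial", 0L));

    Snapshot read() {
        return row.get();
    }

    // Models: UPDATE t SET value = ?, version = version + 1 WHERE version = ?
    // Returns the number of affected rows (1 on success, 0 on a version conflict).
    int updateIfVersionMatches(long expectedVersion, String newValue) {
        Snapshot current = row.get();
        if (current.version != expectedVersion) {
            return 0;
        }
        Snapshot next = new Snapshot(newValue, expectedVersion + 1);
        return row.compareAndSet(current, next) ? 1 : 0;
    }

    public static void main(String[] args) {
        VersionedRow r = new VersionedRow();
        long seen = r.read().version; // both writers read the same version
        System.out.println(r.updateIfVersionMatches(seen, "writer A")); // first writer wins
        System.out.println(r.updateIfVersionMatches(seen, "writer B")); // stale version, rejected
    }
}
```

A caller that gets 0 back has lost the race and must re-read the row and retry, which is where the CPU cost under heavy contention comes from.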
Advantages:
  • The implementation is relatively simple; MySQL itself resolves the contention.
  • The architecture is relatively simple: no additional third-party component is required, which keeps the whole system simpler.
Disadvantages:
  • Performance is relatively poor, and there is a risk of locking the table.
  • Acquisition is non-blocking: after a failed attempt the client has to poll, which consumes CPU and MySQL resources.
  • Transactions left uncommitted for a long time, or long polling, tie up too many MySQL connections.

Redis based distributed lock

  For more on Redis operations and features, please refer to this article: Redis learning notes. Here we go straight to the three commands we need: SETNX, EXPIRE and DEL.

  1. SETNX key val
    SETNX is short for SET if Not eXists. If the key does not exist, it stores the key-value pair and returns 1; if the key already exists, it does nothing and returns 0.
  2. EXPIRE key <timeout>
    Sets a timeout on the key, in seconds. After this time the key is deleted automatically, which avoids deadlock when the lock holder goes down.
  3. DEL key
    Deletes the specified key.

The implementation idea is:

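  1. To acquire the lock, use SETNX. The key is the lock name and the value is a randomly generated UUID.
  2. After acquiring the lock, use EXPIRE to set a timeout on it. Once the timeout elapses, the lock is released automatically. Because SETNX and EXPIRE are two separate commands, a crash between them leaves a lock that never expires; SET key value NX EX <timeout> does both in one atomic step.
  3. To release the lock, compare the stored UUID with your own to check that the lock is yours. If it is, execute DEL to release it. The check and the delete should run atomically, for example in a Lua script.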
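
These three steps can be sketched as follows. To stay self-contained, this is an in-memory simulation rather than a real Redis client: the hypothetical FakeRedisLockStore models set-if-absent with an expiry in one atomic step, plus a release that deletes the key only if the stored value matches the caller. The clock is passed in explicitly (an assumption made for the example) so that the expiry behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory model of the Redis operations used for locking.
class FakeRedisLockStore {
    private static final class Entry {
        final String owner;
        final long expiresAtMillis;

        Entry(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> keys = new HashMap<>();

    // Models set-if-absent plus expiry: succeeds only if the key is absent or expired.
    synchronized boolean tryLock(String key, String owner, long ttlMillis, long nowMillis) {
        Entry e = keys.get(key);
        if (e != null && e.expiresAtMillis > nowMillis) {
            return false; // someone else still holds the lock
        }
        keys.put(key, new Entry(owner, nowMillis + ttlMillis));
        return true;
    }

    // Models an atomic "delete only if the stored value is mine" release.
    synchronized boolean unlock(String key, String owner, long nowMillis) {
        Entry e = keys.get(key);
        if (e == null || e.expiresAtMillis <= nowMillis || !e.owner.equals(owner)) {
            return false; // missing, expired, or owned by another client
        }
        keys.remove(key);
        return true;
    }

    public static void main(String[] args) {
        FakeRedisLockStore redis = new FakeRedisLockStore();
        String clientA = UUID.randomUUID().toString();
        String clientB = UUID.randomUUID().toString();

        System.out.println(redis.tryLock("order:42", clientA, 30_000, 0));      // A acquires
        System.out.println(redis.tryLock("order:42", clientB, 30_000, 1_000));  // B is rejected
        System.out.println(redis.unlock("order:42", clientB, 2_000));           // B cannot release A's lock
        System.out.println(redis.tryLock("order:42", clientB, 30_000, 31_000)); // A's lock expired, B acquires
    }
}
```

The owner check in unlock is what stops a client whose lock has already expired from deleting a lock that now belongs to someone else.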
Advantages:
  • Thanks to Redis's high throughput, the performance is excellent.
Disadvantages:
  • The expiration time is hard to choose, so lock renewal has to be considered.
  • The implementation is relatively complex, with many factors to consider.
  • Acquisition is non-blocking: after a failed attempt the client has to poll, which consumes CPU resources.
  • If the master node goes down before the lock has been replicated, multiple clients may acquire the same lock.

Distributed lock based on ZooKeeper

   ZooKeeper's data model is a znode tree. Each slash-separated (/) path is a znode (for example /locks/my_lock), and each znode can also store data. There are four types of znode: persistent, persistent sequential, ephemeral, and ephemeral sequential.
   ZooKeeper distributed locks are built on ephemeral sequential nodes. A lock can be thought of as a node in ZooKeeper. To acquire the lock, a client creates an ephemeral sequential node under this lock node. When multiple clients try to acquire the lock at the same time, multiple ephemeral sequential nodes are created in order, but only the one with the lowest sequence number holds the lock. Every other node watches the node immediately before it. When that predecessor releases the lock, the watching node is notified and can acquire the lock immediately.
   Another reason for using ephemeral sequential nodes is that if a client creates one and then crashes, ZooKeeper deletes the node automatically once it detects that the client's session has ended, which is equivalent to releasing the lock automatically.
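
The ordering rule can be illustrated with a small in-memory model. This is only a sketch and does not use a ZooKeeper client: the hypothetical FakeSequentialLock keeps the children of one lock node in a sorted map, where join stands for creating an ephemeral sequential node and leave stands for its deletion, whether by an explicit unlock or by the end of the client's session.

```java
import java.util.TreeMap;

// Hypothetical in-memory model of ephemeral sequential nodes under one lock node.
class FakeSequentialLock {
    private final TreeMap<Long, String> children = new TreeMap<>(); // sequence number -> client
    private long nextSequence = 0;

    // Models creating an ephemeral sequential child; returns its sequence number.
    synchronized long join(String client) {
        long seq = nextSequence++;
        children.put(seq, client);
        return seq;
    }

    // The child with the smallest sequence number holds the lock.
    synchronized boolean holdsLock(long seq) {
        return !children.isEmpty() && children.firstKey() == seq;
    }

    // The node a waiter should watch: its immediate predecessor, or null if it is first.
    synchronized Long nodeToWatch(long seq) {
        return children.lowerKey(seq);
    }

    // Models deleting the node, on unlock or when the client's session ends.
    synchronized void leave(long seq) {
        children.remove(seq);
    }

    public static void main(String[] args) {
        FakeSequentialLock lock = new FakeSequentialLock();
        long a = lock.join("client-a");
        long b = lock.join("client-b");
        long c = lock.join("client-c");

        System.out.println(lock.holdsLock(a));   // a has the lowest number and holds the lock
        System.out.println(lock.nodeToWatch(c)); // c watches b, not a
        lock.leave(b);                           // b crashes while waiting
        System.out.println(lock.nodeToWatch(c)); // c now watches a
        lock.leave(a);                           // a unlocks
        System.out.println(lock.holdsLock(c));   // c is first in line
    }
}
```

Watching only the immediate predecessor means that a release wakes up one waiter instead of all of them.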

Advantages:
  • Effectively avoids a single point of failure and offers high availability.
  • It can be used as a reentrant lock.
  • It can be used as a blocking lock.
Disadvantages:
  • Nodes are created and deleted frequently, so performance is not as good as Redis.
  • If the client loses contact with ZooKeeper for long enough, its session expires and the lock is released even though the client may still be working.

Summary

Ease of implementation: MySQL database > ZooKeeper > Redis cache.
Performance: Redis cache > ZooKeeper > MySQL database.
Reliability: ZooKeeper > MySQL database > Redis cache.

Redis implementation code

   When implementing the lock with Redis, we need to consider the following issues:

  • Deadlock caused by downtime: set an expiration time.
  • Business logic runs longer than the expiration time: start a daemon thread that renews the lock.
  • The lock is released by someone else: write a unique ID into the lock, and when releasing it, check the ID first and then release.

So what if implementing all of this yourself is too much trouble? Just use a library that someone else has already packaged: Redisson. It is not only more convenient but also more stable. The code is as follows:

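import java.util.concurrent.TimeUnit;
import org.redisson.Redisson;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

// 1. Build the Config that Redisson needs for the distributed lock
Config config = new Config();
config.useSingleServer().setAddress("redis://127.0.0.1:6379").setPassword("password").setDatabase(0);

// 2. Construct RedissonClient
RedissonClient redissonClient = Redisson.create(config);

// 3. Get the lock object instance
RLock rLock = redissonClient.getLock(lockKey);
boolean acquired = false;
try {
    // 4. Try to acquire the lock, waiting at most waitTimeout seconds
    acquired = rLock.tryLock((long) waitTimeout, (long) leaseTime, TimeUnit.SECONDS);
    if (acquired) {
        // The lock was acquired; process the business logic here
    }
} catch (InterruptedException e) {
    // Interrupted while waiting for the lock: restore the flag and report the failure
    Thread.currentThread().interrupt();
    throw new RuntimeException("acquire lock fail", e);
} finally {
    // Unlock only if this thread actually holds the lock; otherwise unlock() throws
    if (acquired && rLock.isHeldByCurrentThread()) {
        rLock.unlock();
    }
}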

Redisson implements tryLock as follows:

    public boolean tryLock(long waitTime, long leaseTime, TimeUnit unit) throws InterruptedException {
        long time = unit.toMillis(waitTime);
        long current = System.currentTimeMillis();
        long threadId = Thread.currentThread().getId();
        
        // Lock acquisition attempt
        Long ttl = tryAcquire(leaseTime, unit, threadId);
        // lock acquired
        if (ttl == null) {
            return true;
        }

        // If the time taken to apply for a lock is greater than or equal to the maximum waiting time, the application for a lock fails
        time -= System.currentTimeMillis() - current;
        if (time <= 0) {
            acquireFailed(threadId);
            return false;
        }

        current = System.currentTimeMillis();

        /**
         * Subscribe to the lock-release event and wait in await for the subscription to complete, which avoids wasting resources on futile lock attempts:
         * If await returns false, the wait has already exceeded the maximum waiting time for the lock. Unsubscribe and return failure.
         * If await returns true, enter the loop and try to acquire the lock.
         */
        RFuture<RedissonLockEntry> subscribeFuture = subscribe(threadId);
        // Within the await method, CountDownLatch is used to implement blocking and obtain the result of asynchronous execution of subscription (Netty's Future is applied)
        if (!subscribeFuture.await(time, TimeUnit.MILLISECONDS)) {
            if (!subscribeFuture.cancel(false)) {
                subscribeFuture.onComplete((res, e) -> {
                    if (e == null) {
                        unsubscribe(subscribeFuture, threadId);
                    }
                });
            }
            acquireFailed(threadId);
            return false;
        }

        try {
            // Calculate the total time taken to acquire the lock. If it is greater than or equal to the maximum waiting time, acquiring the lock fails
            time -= System.currentTimeMillis() - current;
            if (time <= 0) {
                acquireFailed(threadId);
                return false;
            }

            /**
             * Once subscribed to the lock-release signal, keep retrying to acquire the lock within the maximum waiting time.
             * If the lock is acquired, return true immediately.
             * If the lock has still not been acquired when the maximum waiting time runs out, return false and leave the loop.
             */
            while (true) {
                long currentTime = System.currentTimeMillis();

                // Try to acquire the lock again
                ttl = tryAcquire(leaseTime, unit, threadId);
                // lock acquired
                if (ttl == null) {
                    return true;
                }
                // If the maximum waiting time is exceeded, false is returned to end the cycle, and lock acquisition fails
                time -= System.currentTimeMillis() - currentTime;
                if (time <= 0) {
                    acquireFailed(threadId);
                    return false;
                }

                /**
                 * Block while waiting for the lock (on the entry's semaphore, until an unlock message arrives or the timeout elapses):
                 */
                currentTime = System.currentTimeMillis();
                if (ttl >= 0 && ttl < time) {
                    //If the lock's remaining time to live (ttl) is less than the remaining wait time, wait for a permit from the entry's semaphore for at most ttl (returning early if a permit becomes available or the thread is interrupted).
                    getEntry(threadId).getLatch().tryAcquire(ttl, TimeUnit.MILLISECONDS);
                } else {
                    //Otherwise wait for a permit from the semaphore for at most the remaining wait time
                    getEntry(threadId).getLatch().tryAcquire(time, TimeUnit.MILLISECONDS);
                }

                // Update the remaining waiting time (maximum waiting time - elapsed blocking time)
                time -= System.currentTimeMillis() - currentTime;
                if (time <= 0) {
                    acquireFailed(threadId);
                    return false;
                }
            }
        } finally {
            // Unsubscribe from the unlock message whether the lock is obtained or not
            unsubscribe(subscribeFuture, threadId);
        }
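        // return get(tryLockAsync(waitTime, leaseTime, unit));  (kept as a comment: the loop above only exits via return, so a statement here would be unreachable)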
    }
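
The waiting-time bookkeeping in tryLock can be isolated into a small sketch. This is not Redisson code: BudgetedAcquire, its attempt callback and the pollMillis parameter are hypothetical names introduced here. The callback stands in for tryAcquire, the Semaphore stands in for the unlock notification, and pollMillis plays the role of the lock's remaining ttl. Unlike Redisson, this sketch swallows an interrupt and reports failure instead of throwing.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper showing the "subtract elapsed time from a wait budget" pattern.
class BudgetedAcquire {
    // Retries attempt until it succeeds or waitMillis is used up.
    // Between attempts it waits on releaseSignal for at most pollMillis.
    static boolean acquire(BooleanSupplier attempt, Semaphore releaseSignal,
                           long waitMillis, long pollMillis) {
        long remaining = waitMillis;
        while (true) {
            long start = System.currentTimeMillis();
            if (attempt.getAsBoolean()) {
                return true;
            }
            // Charge the time spent on the attempt against the budget.
            remaining -= System.currentTimeMillis() - start;
            if (remaining <= 0) {
                return false;
            }

            start = System.currentTimeMillis();
            try {
                // Wake up early if a release is signalled, otherwise after the timeout.
                releaseSignal.tryAcquire(Math.min(pollMillis, remaining), TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            // Charge the time spent waiting against the budget.
            remaining -= System.currentTimeMillis() - start;
            if (remaining <= 0) {
                return false;
            }
        }
    }

    public static void main(String[] args) {
        Semaphore releaseSignal = new Semaphore(0);
        int[] calls = {0};
        // The hypothetical lock becomes free on the third attempt.
        boolean ok = acquire(() -> ++calls[0] >= 3, releaseSignal, 1_000, 20);
        System.out.println(ok + " after " + calls[0] + " attempts");
    }
}
```

Capping each wait at the smaller of the two timeouts means a waiter retries as soon as the current lock could have expired, even if no unlock message ever arrives.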

Redisson watchdog lock renewal mechanism

    private <T> RFuture<Long> tryAcquireAsync(long leaseTime, TimeUnit unit, long threadId) {

        // If a lease time was specified, acquire the lock with that lease; the watchdog is not started
        if (leaseTime != -1) {
            return tryLockInnerAsync(leaseTime, unit, threadId, RedisCommands.EVAL_LONG);
        }

        // No lease time given: acquire the lock with the watchdog timeout (30 seconds by default)
        RFuture<Long> ttlRemainingFuture = tryLockInnerAsync(commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(), TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_LONG);
        
        // Once the lock has been acquired, start the scheduled task that keeps refreshing its expiration time
        ttlRemainingFuture.onComplete((ttlRemaining, e) -> {
            if (e != null) {
                return;
            }

            // lock acquired
            if (ttlRemaining == null) {
                scheduleExpirationRenewal(threadId);
            }
        });
        return ttlRemainingFuture;
    }

Renewal principle
A Lua script is used to reset the lock's expiration time to 30s (the default watchdog timeout).

/*
 The Watch Dog mechanism is essentially a background scheduled task. After the lock is acquired successfully, the thread holding the lock is put into RedissonLock.EXPIRATION_RENEWAL_MAP.
 Then, every 10 seconds (internalLockLeaseTime / 3), it checks whether the client still holds the lock key.
 (Checking whether the client still holds the key means traversing the thread ids in EXPIRATION_RENEWAL_MAP and looking each one up in Redis; if it exists, the key's expiration time is extended.)
 In this way the lock key's time to live keeps being extended. If the service goes down, the Watch Dog task disappears with it,
 so the key's expiration time is no longer extended. The key expires automatically after 30s, and other threads can then acquire the lock.
*/
private void scheduleExpirationRenewal(long threadId) {
    ExpirationEntry entry = new ExpirationEntry();
    ExpirationEntry oldEntry = EXPIRATION_RENEWAL_MAP.putIfAbsent(getEntryName(), entry);
    if (oldEntry != null) {
        oldEntry.addThreadId(threadId);
    } else {
        entry.addThreadId(threadId);
        renewExpiration();
    }
}

protected RFuture<Boolean> renewExpirationAsync(long threadId) {
    return commandExecutor.evalWriteAsync(getName(), LongCodec.INSTANCE, RedisCommands.EVAL_BOOLEAN,
            "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                "return 1; " +
            "end; " +
            "return 0;",
        Collections.<Object>singletonList(getName()),
        internalLockLeaseTime, getLockName(threadId));
}
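
The effect of the renewal script over time can be sketched with an in-memory model. This is an illustration, not Redisson's implementation: the hypothetical FakeRenewableLock keeps the holder fields of one lock key in a map, renew mirrors the script (if the holder's field exists, reset the expiry and return true), and the clock is passed in explicitly so the timeline is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of one lock key stored as a hash with an expiry.
class FakeRenewableLock {
    private final Map<String, Integer> holders = new HashMap<>(); // holder field -> reentrancy count
    private long expiresAtMillis = 0;

    // Acquires the lock if it is free, expired, or already held by the same holder.
    synchronized boolean tryLock(String holder, long leaseMillis, long nowMillis) {
        expireIfDue(nowMillis);
        if (!holders.isEmpty() && !holders.containsKey(holder)) {
            return false;
        }
        holders.merge(holder, 1, Integer::sum);
        expiresAtMillis = nowMillis + leaseMillis;
        return true;
    }

    // Mirrors the script: if the holder's field exists, reset the expiry and report success.
    synchronized boolean renew(String holder, long leaseMillis, long nowMillis) {
        expireIfDue(nowMillis);
        if (!holders.containsKey(holder)) {
            return false;
        }
        expiresAtMillis = nowMillis + leaseMillis;
        return true;
    }

    // Models Redis deleting the key once its time to live has run out.
    private void expireIfDue(long nowMillis) {
        if (nowMillis >= expiresAtMillis) {
            holders.clear();
        }
    }

    public static void main(String[] args) {
        long lease = 30_000; // assumed watchdog timeout
        FakeRenewableLock lock = new FakeRenewableLock();

        System.out.println(lock.tryLock("a:1", lease, 0));
        // The watchdog renews every lease / 3 while the holder is alive.
        for (long now = lease / 3; now <= lease; now += lease / 3) {
            System.out.println(lock.renew("a:1", lease, now));
        }
        System.out.println(lock.tryLock("b:7", lease, 35_000)); // still held thanks to renewals
        // The holder crashes after its last renewal at 30s, so the key expires at 60s.
        System.out.println(lock.tryLock("b:7", lease, 60_000));
    }
}
```

Renewing at one third of the lease leaves room for a couple of missed renewals before the key actually expires.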
