Some readers are a bit overconfident: you think you understand it, but if you don't build it yourself, how can you understand it thoroughly? Let's do it together!
Usage scenarios and technology selection
In a distributed multi-node deployment, shared data can be modified by several nodes at the same time. When data consistency is required, a global lock is needed to keep concurrent operations consistent, for example when deducting inventory or when listing, delisting, and updating the same product.
Distributed locks are commonly implemented with ZooKeeper or Redis. How to choose?
Production environments usually put performance first, so after weighing the pros and cons of each we generally prefer Redis.
Implementing distributed locks from 0 to 1
Step 1: building the basic lock and unlock capability
Jedis.set(key, value, params) 👏🏻
The extended SET options (NX, PX and friends) added in Redis 2.6.12 are really good. They make it possible to set the lock and its timeout in one atomic command, which prevents the deadlock that would otherwise occur if the service went down after locking~
(1) A distributed lock object that can lock and unlock needs at least a Jedis client, the corresponding Redis key, and a lock timeout:
import redis.clients.jedis.Jedis;

// Build the distributed lock object
public class DistributedLock {
    private Jedis jedis;
    private String lockName;
    private long lockExpireSecond;

    public DistributedLock(Jedis jedis, String lockName, long lockExpireSecond) {
        this.jedis = jedis;
        this.lockName = lockName;
        this.lockExpireSecond = lockExpireSecond;
    }
}
(2) Using the SetParams class provided by Jedis, pass NX and PX to jedis.set so that the lock and its timeout are set in one atomic operation:
// Requires redis.clients.jedis.params.SetParams and java.util.concurrent.TimeUnit
public void lock() throws BizException {
    String lockResult = null;
    try {
        // Set the NX and PX parameters
        SetParams params = new SetParams();
        params.nx();
        params.px(TimeUnit.SECONDS.toMillis(lockExpireSecond));
        // Acquire the lock; the value is a fixed string for now
        lockResult = this.jedis.set(this.lockName, "lockValue", params);
    } catch (Exception e) {
        LOG.error("lock error", e);
    }
    if ("OK".equals(lockResult)) {
        LOG.debug("locked success,lockName:{}", lockName);
    } else {
        throw new BizException("Get lock failed.");
    }
}
(3) Use the jedis.del command to unlock:
public boolean unlock() {
    boolean unlockResult = false;
    try {
        this.jedis.del(this.lockName);
        unlockResult = true;
    } catch (Exception e) {
        LOG.error("unLock error", e);
    }
    return unlockResult;
}
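One way a caller might combine lock() and unlock() is the usual try/finally pattern, so that the lock is released even when the business code throws. The sketch below is only an illustration: the SimpleLock interface and the LockTemplate class are hypothetical names, not part of the lock built above.

```java
import java.util.concurrent.Callable;

/** Hypothetical view of the lock above: just the two methods built so far. */
interface SimpleLock {
    void lock() throws Exception;   // throws if the lock cannot be acquired
    boolean unlock();
}

final class LockTemplate {
    private LockTemplate() {}

    /** Runs body while holding the lock and always releases it afterwards. */
    static <T> T withLock(SimpleLock lock, Callable<T> body) throws Exception {
        lock.lock();                // if this throws, body never runs and nothing is released
        try {
            return body.call();
        } finally {
            lock.unlock();
        }
    }
}
```

Acquiring the lock outside the try block matters: if lock() fails, the finally block must not delete a key that belongs to someone else.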
Step 2: locking failed. Give up immediately? Better to try a few more times
The constructor and lock() above show that the current implementation is all-or-nothing: it tries once and fails immediately if the lock is already taken. That does not meet production needs. In many scenarios the business logic runs very quickly, so waiting a little is enough. What can we do?
Let the caller define the retry interval and a total wait limit, so that retries are bounded
// New retry interval attribute (milliseconds, passed to Thread.sleep)
private long retryIntervalTime;

// Initialize the retry interval through the constructor
public DistributedLock(Jedis jedis, String lockName, long lockExpireSecond, long retryIntervalTime) {
    // ... omitted
    this.retryIntervalTime = retryIntervalTime;
}

// Add parameters for the maximum time to wait for the lock
public void lock(long timeout, TimeUnit unit) throws TimeoutException {
    String lockResult = null;
    try {
        // Set the NX and PX parameters
        SetParams params = new SetParams();
        params.nx();
        params.px(TimeUnit.SECONDS.toMillis(lockExpireSecond));
        // Lock start time
        long startTime = System.nanoTime();
        // Retry for a bounded time
        while (!"OK".equals(lockResult = this.jedis.set(this.lockName, "lockValue", params))
                && !isTimeout(startTime, unit.toNanos(timeout))) {
            Thread.sleep(retryIntervalTime);
        }
    } catch (InterruptedException e) {
        // Restore the interrupt flag so that callers can still observe it
        Thread.currentThread().interrupt();
        LOG.error("lock interrupted", e);
    } catch (Exception e) {
        LOG.error("lock error", e);
    }
    // The thrown exception type is changed to TimeoutException
    if ("OK".equals(lockResult)) {
        LOG.debug("locked success,lockName:{}", lockName);
    } else {
        throw new TimeoutException("Get lock failed because of timeout.");
    }
}
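The loop above calls isTimeout(startTime, timeoutNanos), which is not shown. A minimal sketch, assuming it simply compares the elapsed System.nanoTime() against the wait budget; the class name LockTimeouts is hypothetical, and in the lock class the method would more likely be a private helper.

```java
/** Hypothetical helper; the article does not show its isTimeout implementation. */
final class LockTimeouts {
    private LockTimeouts() {}

    /**
     * Returns true once at least timeoutNanos have elapsed since startNanos.
     * startNanos must come from System.nanoTime(); subtracting before
     * comparing keeps the check correct even if nanoTime wraps around.
     */
    static boolean isTimeout(long startNanos, long timeoutNanos) {
        return System.nanoTime() - startNanos >= timeoutNanos;
    }
}
```

System.nanoTime() is used rather than System.currentTimeMillis() because it is monotonic, so a wall-clock adjustment during the wait cannot shorten or lengthen the retry window.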
Step 3: you may only release a lock you acquired; never touch anyone else's
Consider a problem: to guard against a machine going down after locking, we set an expiration time on the lock, so that later requests can still acquire it even if the node holding it goes down before unlocking.
In the figure above, the unpredictable business execution time (or unexpected pauses such as GC) causes problems for the distributed lock.
Let's look at problem 1 first: thread 1 released the lock held by thread 2! What should we do?
Store an owner ID when locking and verify it when unlocking, so that you never release a lock that is not your own
// Other attributes are omitted; the lockOwner ID is added
private String lockOwner;

// Initialize the lockOwner identity through the constructor
public DistributedLock(Jedis jedis, String lockName, String lockOwner, long lockExpireSecond, long retryIntervalTime) {
    // ... omitted
    this.lockOwner = lockOwner;
}

public void lock(long timeout, TimeUnit unit) throws TimeoutException {
    String lockResult = null;
    try {
        // Set the NX and PX parameters
        SetParams params = new SetParams();
        params.nx();
        params.px(TimeUnit.SECONDS.toMillis(lockExpireSecond));
        // Lock start time
        long startTime = System.nanoTime();
        // The value written by SET is now lockOwner
        while (!"OK".equals(lockResult = this.jedis.set(this.lockName, this.lockOwner, params))
                && !isTimeout(startTime, unit.toNanos(timeout))) {
            Thread.sleep(retryIntervalTime);
        }
    } catch (InterruptedException e) {
        // Restore the interrupt flag so that callers can still observe it
        Thread.currentThread().interrupt();
        LOG.error("lock interrupted", e);
    } catch (Exception e) {
        LOG.error("lock error", e);
    }
    // ... omitted
}

public boolean unlock() {
    boolean unlockResult = false;
    try {
        // Get the value first and compare it with the current lockOwner before unlocking
        if (this.lockOwner.equals(this.jedis.get(this.lockName))) {
            this.jedis.del(this.lockName);
            unlockResult = true;
        }
    } catch (Exception e) {
        LOG.error("unLock error", e);
    }
    return unlockResult;
}
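The lockOwner value has to be unique across every node and thread that may compete for the lock. A bare thread ID is not enough, because two JVMs can easily have threads with the same ID. One possible way to build it, sketched here as an assumption rather than what the article uses, is a random per-JVM UUID combined with the thread ID; LockOwners is a hypothetical helper name.

```java
import java.util.UUID;

/** Hypothetical helper for building lockOwner values; not taken from the article. */
final class LockOwners {
    // One random ID per JVM, so two nodes never share the same prefix.
    private static final String NODE_ID = UUID.randomUUID().toString();

    private LockOwners() {}

    /** Owner ID for the calling thread, in the form "<jvm-uuid>:<thread-id>". */
    static String currentOwner() {
        return NODE_ID + ":" + Thread.currentThread().getId();
    }
}
```

The same thread always gets the same owner string, which is also what a later reentrant version of the lock would need.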
Some readers have pointed out that the unlock should be wrapped in a Lua script to make it atomic. They have a point. The implementation above works in the common case, because the delete only runs when the stored value matches our own owner ID. But GET and DEL are two separate commands: if the lock expires after the GET has matched and before the DEL runs, another client can acquire it in between, and our DEL then removes that client's lock. A Lua script closes this window, and it also saves one network round trip.
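A minimal sketch of the Lua-based unlock mentioned above, which makes the compare and the delete a single step on the Redis server. To keep the example self-contained it goes through a hypothetical ScriptExecutor interface; Jedis offers a method of the same shape, eval(String script, List<String> keys, List<String> args). The class name AtomicUnlock is also an assumption.

```java
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical abstraction over script execution. Jedis exposes a method of
 * this shape as eval(String, List<String>, List<String>).
 */
interface ScriptExecutor {
    Object eval(String script, List<String> keys, List<String> args);
}

final class AtomicUnlock {
    // Delete the key only if its value still equals the caller's owner ID.
    static final String UNLOCK_SCRIPT =
        "if redis.call('get', KEYS[1]) == ARGV[1] then "
      + "return redis.call('del', KEYS[1]) "
      + "else return 0 end";

    private AtomicUnlock() {}

    /** Returns true only if this owner still held the lock and it was deleted. */
    static boolean unlock(ScriptExecutor redis, String lockName, String lockOwner) {
        Object result = redis.eval(UNLOCK_SCRIPT,
                Collections.singletonList(lockName),
                Collections.singletonList(lockOwner));
        // DEL reports the number of removed keys as an integer reply.
        return Long.valueOf(1L).equals(result);
    }
}
```

With a real client the executor could be supplied as a lambda such as (script, keys, args) -> jedis.eval(script, keys, args).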
Step 4: concurrency conflicts caused by an expiration time that is too short
This is problem 2 in the previous figure: while thread 1 is still executing, its lock expires and is released, so thread 2 locks successfully and the two threads run the business logic at the same time. What should we do?
While the lock is held, extend its expiration time dynamically as needed
Choosing how to trigger the lock renewal is an important decision too. The JDK's native Timer, a scheduled thread pool, and Netty's Timer can all do it. Which one is better?
Weighing accuracy against resource consumption, the Timer in Netty, which uses the time wheel algorithm, should be the first choice. If it can manage heartbeat checks for thousands of connections, surely it can handle renewing a lock?
• First, build a global Timer to store and schedule the tasks.
• Second, register a timed task once locking succeeds.
• Third, when renewing, verify that the current thread still holds the lock.
• Finally, cancel the timed task when unlocking.
• Note that the task has to re-register itself every round, and thread interruption has to be taken into account.
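The steps above can be sketched independently of Redis and Netty. In the sketch below, the JDK's ScheduledExecutorService stands in for the Netty timer purely to keep the example dependency-free, and the actual renewal is passed in as a BooleanSupplier that returns true when the lock was still ours and was extended. RenewalLoop and its members are hypothetical names.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical renewal loop; a JDK scheduler stands in for the Netty timer. */
final class RenewalLoop {
    private final ScheduledExecutorService scheduler; // step 1: one shared scheduler
    private final BooleanSupplier renew;              // true if the lock was still held and extended
    private final long intervalMillis;
    private ScheduledFuture<?> pending;
    private boolean cancelled;

    RenewalLoop(ScheduledExecutorService scheduler, BooleanSupplier renew, long intervalMillis) {
        this.scheduler = scheduler;
        this.renew = renew;
        this.intervalMillis = intervalMillis;
    }

    /** Step 2: call once after the lock has been acquired. */
    void start() {
        schedule();
    }

    private synchronized void schedule() {
        if (cancelled) {
            return;
        }
        pending = scheduler.schedule(() -> {
            // Step 3: renew only while still held; re-register only on success.
            if (!isCancelled() && renew.getAsBoolean()) {
                schedule();
            }
        }, intervalMillis, TimeUnit.MILLISECONDS);
    }

    private synchronized boolean isCancelled() {
        return cancelled;
    }

    /** Step 4: call from unlock() to stop further renewals. */
    synchronized void cancel() {
        cancelled = true;
        if (pending != null) {
            pending.cancel(false);
        }
    }
}
```

Each task schedules its successor only after a successful renewal, so a lost lock ends the chain by itself, while cancel() covers the normal unlock path.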
Build a distributed lock context to store the global time wheel scheduler:
public class LockContext {
    private HashedWheelTimer timer;

    private LockContext() {
        // Time wheel parameters can come from the business's own configuration
        // long tickDuration = (Long) config.get("tickDuration");
        // int tickPerWheel = (int) config.get("tickPerWheel"); // Default 1024
        // boolean leakDetection = (Boolean) config.get("leakDetection");
        timer = new HashedWheelTimer(new DefaultThreadFactory("distributedLock-timer", true),
                10, TimeUnit.MILLISECONDS, 1024, false);
    }

    // Accessor used later through context.getTimer()
    public HashedWheelTimer getTimer() {
        return timer;
    }
}
Pass the context and scheduler into the distributed lock object through the constructor:
public class DistributedLock {
    // Context
    private LockContext context;
    // Timeout handle of the currently scheduled timer task
    private volatile Timeout lockTimeout;

    public DistributedLock(Jedis jedis, String lockName, String lockOwner, long lockExpireSecond,
                           long retryIntervalTime, LockContext context) {
        // ... other attributes are omitted
        this.context = context;
    }
    // ... other members are omitted
}
After locking is successful, execute the scheduler registration operation:
public void lock(long timeout, TimeUnit unit) throws TimeoutException {
    // ... locking strategy
    if ("OK".equals(lockResult)) {
        LOGGER.info("locked success,lockName:{}", lockName);
        try {
            // Register the recurring renewal task
            registerLoopReExpire();
        } finally {
            if (Thread.currentThread().isInterrupted() && this.lockTimeout != null) {
                LOGGER.warn("Thread interrupt, scheduled task cancel");
                this.lockTimeout.cancel();
            }
        }
    } else {
        throw new TimeoutException("Get lock failed because of timeout.");
    }
}
The method registerLoopReExpire() contains the actual task registration and renewal operations:
private void registerLoopReExpire() {
    LOGGER.info("Distributed lock deferred task registration");
    // Keep the returned Timeout on the lock object so that unlock can cancel it later
    this.lockTimeout = context.getTimer().newTimeout(new TimerTask() {
        @Override
        public void run(Timeout timeout) throws Exception {
            // Verify that the lock is still held and extend its expiration time
            boolean isReExpired = reExpireLock(lockName, lockOwner);
            if (isReExpired) {
                // Call itself to register the next round
                registerLoopReExpire();
            } else {
                lockTimeout.cancel();
            }
        }
    }, TimeUnit.SECONDS.toMillis(lockExpireSecond) / 2, TimeUnit.MILLISECONDS);
    LOGGER.info("Distributed lock delay task registration completed");
}
Several points deserve attention here:
• newTimeout() returns a Timeout object. We rely on it to manage the current task, so it has to be stored in a field of the lock.
• The renewal has to check both lockName and lockOwner: the expiration may only be extended while we still hold the lock. A Lua script is needed to make the check and the extension atomic.
• After the renewal, follow up according to the result: on success register the next task, on failure cancel the current one.
• The interval of the scheduled task must be shorter than the lock's expiration time. Use 1/2 or 1/3 of the expiration time, or a user-defined value.
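The article does not show reExpireLock. The following is a minimal sketch of how the atomic check-and-extend could look as a Lua script, under the assumption that the expiration is refreshed with PEXPIRE. The LuaRunner interface is hypothetical and mirrors the shape of Jedis's eval(String script, int keyCount, String... params); the extra expireMillis parameter is also an assumption, since the call shown earlier takes only lockName and lockOwner.

```java
/**
 * Hypothetical abstraction over script execution. Jedis exposes a method of
 * this shape as eval(String script, int keyCount, String... params).
 */
interface LuaRunner {
    Object eval(String script, int keyCount, String... params);
}

final class LockRenewal {
    // Extend the TTL only if the key still holds the caller's owner ID.
    static final String RENEW_SCRIPT =
        "if redis.call('get', KEYS[1]) == ARGV[1] then "
      + "return redis.call('pexpire', KEYS[1], ARGV[2]) "
      + "else return 0 end";

    private LockRenewal() {}

    /** Returns true if the lock was still held by lockOwner and its TTL was reset. */
    static boolean reExpireLock(LuaRunner redis, String lockName, String lockOwner, long expireMillis) {
        // One key (the lock name) followed by two arguments (owner, new TTL in ms).
        Object result = redis.eval(RENEW_SCRIPT, 1, lockName, lockOwner, String.valueOf(expireMillis));
        // PEXPIRE replies 1 when the timeout was set and 0 when the key is gone.
        return Long.valueOf(1L).equals(result);
    }
}
```

A return value of false means the lock has expired or now belongs to someone else, which is exactly the case in which the renewal chain should stop.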
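Let's verify: set the lock expiration time to 3 seconds and the business execution time to 10 seconds, then run:
It can be seen that the scheduled task renewed the lock 6 times. This matches the setup: with a 3 second expiration the task fires every 1.5 seconds, that is at 1.5, 3, 4.5, 6, 7.5 and 9 seconds into the 10 seconds of business execution. The last registration also succeeded, but that task was cancelled when the lock was released after the business logic finished.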
Summary and review
In this article we implemented a distributed lock from 0 to 1, filling in and refining the requirements step by step, from basic capability up to production needs.
It is worth mentioning that, apart from the renewal feature, most of the capabilities above have been tested in production. If you find any problem with the renewal implementation, please leave a comment so that we can correct it and improve together.
Of course, some pieces are still missing, such as the renewal Lua script executed through Jedis and the conversion to a reentrant lock. They are not posted here for reasons of space; interested readers can continue along the ideas above.
In addition, the implementations above assume a master-slave architecture, so the distributed lock may misbehave during a master-slave switch or in other failure scenarios. Personally, I think that in most scenarios it is not worth sacrificing efficiency for the extra safety of Redlock. Several other bloggers have already described this part very well; you can search for their articles and have a look.
Finally, comparing Redisson's distributed lock implementation with our own, we find that the main logic is basically the same, but Redisson is more complete in terms of reentrancy and efficiency (it is built on Netty).
What you learn on paper always feels shallow; you only really understand it by doing it yourself ~ let's encourage each other~