Preface
In our daily development, we often encounter the business scenario of "triggering an event after a period of time". For example:
- On an e-commerce platform, an order that is not paid within 30 minutes of being placed is cancelled automatically
- A red envelope that is not claimed within 24 hours is returned automatically
Common solutions
1. Scheduled scanning
Record each event's trigger time in advance, and have a scheduled task repeatedly query the database for events whose trigger time has arrived.
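As a rough illustration of the scanning idea, here is a minimal sketch in which an in-memory list stands in for the database table. The names OrderScanner, PendingOrder and scanOnce are made up for this example; a real system would run the scan from a scheduled task against an actual table.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Sketch of scheduled scanning; the list is a stand-in for a database table. */
class OrderScanner {
    /** Hypothetical row: an order id plus the time at which it should be cancelled. */
    static final class PendingOrder {
        final String orderId;
        final long triggerAtMillis;

        PendingOrder(String orderId, long triggerAtMillis) {
            this.orderId = orderId;
            this.triggerAtMillis = triggerAtMillis;
        }
    }

    /** One scan: remove and return the ids of all orders whose trigger time has arrived. */
    static List<String> scanOnce(List<PendingOrder> table, long nowMillis) {
        List<String> due = new ArrayList<>();
        for (Iterator<PendingOrder> it = table.iterator(); it.hasNext(); ) {
            PendingOrder order = it.next();
            if (order.triggerAtMillis <= nowMillis) {
                due.add(order.orderId);
                it.remove();
            }
        }
        return due;
    }

    public static void main(String[] args) {
        List<PendingOrder> table = new ArrayList<>();
        table.add(new PendingOrder("order-1", 1_000L));
        table.add(new PendingOrder("order-2", 5_000L));
        // A scheduled task would call scanOnce at a fixed interval.
        System.out.println(scanOnce(table, 2_000L)); // only order-1 is due
        System.out.println(scanOnce(table, 6_000L)); // now order-2 is due
    }
}
```

Every call walks the whole table, which is where the database load mentioned in this section comes from.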
This is the simplest approach, but it is not real-time. Running the scheduled task more often reduces the trigger latency, but the more frequent scans put more load on the database.
2. JDK solution
The JDK provides Timer and the delay queue DelayQueue.
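For reference, a minimal DelayQueue sketch follows. DelayQueue and Delayed are standard java.util.concurrent types; the DelayedTask class and the task names are invented for this example.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class DelayQueueDemo {
    /** A task that becomes available once its delay has elapsed. */
    static final class DelayedTask implements Delayed {
        final String name;
        final long triggerAtNanos;

        DelayedTask(String name, long delayMillis) {
            this.name = name;
            this.triggerAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            // Remaining delay; zero or negative means the task is due.
            return unit.convert(triggerAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<DelayedTask> queue = new DelayQueue<>();
        queue.put(new DelayedTask("cancel-order", 200));
        queue.put(new DelayedTask("return-red-envelope", 100));
        // take() blocks until the head's delay has expired,
        // so the task with the shorter delay comes out first.
        System.out.println(queue.take().name);
        System.out.println(queue.take().name);
    }
}
```

The queue lives entirely in the JVM heap, so everything in it is lost if the process stops.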
This approach suits a single-machine environment with low reliability requirements. Tasks and queues live in JVM memory, so it does not work in a distributed environment, and pending tasks cannot be recovered after a sudden shutdown.
3. Delayed messages in message middleware
The producer sends a delayed message, and consumers can consume it only after the specified time has passed. With this approach, our business code only has to handle messages whose delay has just expired.
There are many mature message-oriented middleware products. As a Java developer, I prefer the open source RocketMQ, because other message middleware is a black box to me, whereas with RocketMQ you can debug the source code yourself when you run into problems or doubts.
First look at RocketMQ delayed messages
Send a message with the official producer API
As you can see, the official method, setDelayTimeLevel(), takes a delay level rather than a user-defined delay time. This design puzzled me, which is why I wrote this article. The delay level '5' in the figure is just an arbitrarily chosen level (it corresponds to one minute); it is analyzed in detail later.
After sending, open the console immediately to check
The consume offsets of the topic's four queues have not changed, so consumers subscribed to the topic cannot consume this message immediately
After waiting for the delay to elapse
The message now appears in the topic, and its timestamp is exactly one minute after I sent it
Guess
Judging by the external behavior, a delayed message is not delivered to its topic directly; it passes through a "transit station" between the producer and the topic. The producer delivers the delayed message to the transit station, and a separate task takes expired messages from the transit station and delivers them to the topic.
Source code debug analysis
Tip: if you follow the producer's send() method, the logic may be hard to find because there are many layers of calls. The delay level is set through setDelayTimeLevel() of the Message class, so something must call getDelayTimeLevel() of Message. It is called in several places, so put a breakpoint inside getDelayTimeLevel() to find the delay-handling logic.
Producer delivery: CommitLog.java
A normal message that carries a delay level is sent directly to the "transit station": a special topic named SCHEDULE_TOPIC_XXXX
The special "transit station" topic SCHEDULE_TOPIC_XXXX cannot be viewed in the console, but it exists in our persistence directory, store. As shown in the figure, the queue number is 4, which is the delay level minus 1 (5 - 1 = 4), consistent with our source code analysis.
public PutMessageResult putMessage(final MessageExtBrokerInner msg) {
    ...
    if (msg.getDelayTimeLevel() > 0) {
        // Delayed-message handling
        if (msg.getDelayTimeLevel() > this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel()) {
            // A delay level above the maximum is clamped to the maximum
            msg.setDelayTimeLevel(this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel());
        }
        // The special topic constant "SCHEDULE_TOPIC_XXXX"
        topic = ScheduleMessageService.SCHEDULE_TOPIC;
        // The queue number is the delay level minus 1
        queueId = ScheduleMessageService.delayLevel2QueueId(msg.getDelayTimeLevel());
        // Backup real topic, queueId
        // Save the message's real topic and queueId as properties
        MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_TOPIC, msg.getTopic());
        MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_QUEUE_ID, String.valueOf(msg.getQueueId()));
        msg.setPropertiesString(MessageDecoder.messageProperties2String(msg.getProperties()));
        // Reset the message's topic to "SCHEDULE_TOPIC_XXXX"
        msg.setTopic(topic);
        msg.setQueueId(queueId);
    }
    ...
}
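The rewrite performed in putMessage above can be pictured with a small stand-alone model. This is only a sketch: Msg is a hypothetical stand-in for RocketMQ's message class, and the property keys are illustrative placeholders rather than values taken from the broker source.

```java
import java.util.HashMap;
import java.util.Map;

class TransitRewrite {
    static final String SCHEDULE_TOPIC = "SCHEDULE_TOPIC_XXXX";
    // Illustrative property keys, assumed for this sketch.
    static final String KEY_REAL_TOPIC = "REAL_TOPIC";
    static final String KEY_REAL_QUEUE_ID = "REAL_QID";

    /** Minimal stand-in for a broker-side message (not a RocketMQ class). */
    static final class Msg {
        String topic;
        int queueId;
        int delayLevel;
        final Map<String, String> properties = new HashMap<>();

        Msg(String topic, int queueId, int delayLevel) {
            this.topic = topic;
            this.queueId = queueId;
            this.delayLevel = delayLevel;
        }
    }

    /** Redirect a delayed message to the transit topic, remembering where it was headed. */
    static void redirectIfDelayed(Msg msg, int maxDelayLevel) {
        if (msg.delayLevel <= 0) {
            return; // not a delayed message: leave it untouched
        }
        if (msg.delayLevel > maxDelayLevel) {
            msg.delayLevel = maxDelayLevel; // clamp to the maximum level
        }
        msg.properties.put(KEY_REAL_TOPIC, msg.topic);
        msg.properties.put(KEY_REAL_QUEUE_ID, String.valueOf(msg.queueId));
        msg.topic = SCHEDULE_TOPIC;
        msg.queueId = msg.delayLevel - 1; // queue number = delay level - 1
    }

    /** Reverse step once the delay has expired: restore the original destination. */
    static void restore(Msg msg) {
        msg.topic = msg.properties.remove(KEY_REAL_TOPIC);
        msg.queueId = Integer.parseInt(msg.properties.remove(KEY_REAL_QUEUE_ID));
        msg.delayLevel = 0;
    }

    public static void main(String[] args) {
        Msg msg = new Msg("order-topic", 2, 5);
        redirectIfDelayed(msg, 18);
        System.out.println(msg.topic + " / queue " + msg.queueId); // SCHEDULE_TOPIC_XXXX / queue 4
        restore(msg);
        System.out.println(msg.topic + " / queue " + msg.queueId); // order-topic / queue 2
    }
}
```

Keeping the real destination inside the message itself means the transit topic needs no extra lookup table to know where each message belongs.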
Delay logic: ScheduleMessageService.java
1. Map each delay level to its delay duration: parseDelayLevel()
public boolean parseDelayLevel() {
    HashMap<String, Long> timeUnitTable = new HashMap<String, Long>();
    timeUnitTable.put("s", 1000L);
    timeUnitTable.put("m", 1000L * 60);
    timeUnitTable.put("h", 1000L * 60 * 60);
    timeUnitTable.put("d", 1000L * 60 * 60 * 24);
    // The string listing the delay of each level, for example:
    // String levelString = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h"
    String levelString = this.defaultMessageStore.getMessageStoreConfig().getMessageDelayLevel();
    try {
        String[] levelArray = levelString.split(" ");
        for (int i = 0; i < levelArray.length; i++) {
            String value = levelArray[i];
            String ch = value.substring(value.length() - 1);
            Long tu = timeUnitTable.get(ch);
            int level = i + 1;
            if (level > this.maxDelayLevel) {
                this.maxDelayLevel = level;
            }
            long num = Long.parseLong(value.substring(0, value.length() - 1));
            long delayTimeMillis = tu * num;
            // Store the mapping from delay level to delay duration
            this.delayLevelTable.put(level, delayTimeMillis);
        }
    } catch (Exception e) {
        log.error("parseDelayLevel exception", e);
        log.info("levelString String = {}", levelString);
        return false;
    }
    return true;
}
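To make the mapping concrete, here is a stand-alone sketch of the same parsing logic applied to the level string shown in the comment above. The class name DelayLevels and the method parse are made up for this example; unlike the broker code, this version throws on an unknown unit instead of logging and returning false.

```java
import java.util.HashMap;
import java.util.Map;

class DelayLevels {
    static final String DEFAULT_LEVELS =
        "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";

    /** Parse a level string into a map from delay level (1-based) to delay in milliseconds. */
    static Map<Integer, Long> parse(String levelString) {
        Map<String, Long> unitMillis = new HashMap<>();
        unitMillis.put("s", 1000L);
        unitMillis.put("m", 1000L * 60);
        unitMillis.put("h", 1000L * 60 * 60);
        unitMillis.put("d", 1000L * 60 * 60 * 24);

        Map<Integer, Long> table = new HashMap<>();
        String[] levels = levelString.trim().split("\\s+");
        for (int i = 0; i < levels.length; i++) {
            String value = levels[i];
            String unit = value.substring(value.length() - 1);
            Long perUnit = unitMillis.get(unit);
            if (perUnit == null) {
                throw new IllegalArgumentException("unknown time unit in: " + value);
            }
            long amount = Long.parseLong(value.substring(0, value.length() - 1));
            table.put(i + 1, perUnit * amount); // levels start at 1
        }
        return table;
    }

    public static void main(String[] args) {
        Map<Integer, Long> table = parse(DEFAULT_LEVELS);
        System.out.println(table.size()); // 18 levels
        System.out.println(table.get(5)); // 60000 ms: level 5 is "1m"
    }
}
```

This is why the delay level 5 used earlier corresponds to one minute: "1m" is the fifth entry in the string.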
2. Start a separate polling task for each delay level: start()
public void start() {
    // CAS (optimistic locking) ensures thread safety
    if (started.compareAndSet(false, true)) {
        this.timer = new Timer("ScheduleMessageTimerThread", true);
        for (Map.Entry<Integer, Long> entry : this.delayLevelTable.entrySet()) {
            // Iterate over the delay levels
            Integer level = entry.getKey();
            Long timeDelay = entry.getValue();
            Long offset = this.offsetTable.get(level);
            if (null == offset) {
                offset = 0L;
            }
            if (timeDelay != null) {
                // Create a delivery task for each delay level; the first run is delayed by 1s
                this.timer.schedule(new DeliverDelayedMessageTimerTask(level, offset), FIRST_DELAY_TIME);
            }
        }
        /* Persistent task */
        ...
    }
}
Delivering delayed messages: DeliverDelayedMessageTimerTask (inner class of ScheduleMessageService)
Once a delayed message in the "transit station" expires, it is converted back into a normal message and delivered to the target topic:
public void executeOnTimeup() {
    /* Get the messages for a specific delay level from SCHEDULE_TOPIC_XXXX */
    ...
    // Current time
    long now = System.currentTimeMillis();
    // The actual time at which the message should be delivered after the delay
    long deliverTimestamp = this.correctDeliverTimestamp(now, tagsCode);
    nextOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE);
    // Time left to wait
    long countdown = deliverTimestamp - now;
    if (countdown <= 0) {
        // Already expired, no need to wait
        MessageExt msgExt = ScheduleMessageService.this.defaultMessageStore.lookMessageByOffset(offsetPy, sizePy);
        if (msgExt != null) {
            try {
                // Convert the delayed message back into a normal message
                // (recall that CommitLog.java did the reverse, turning a normal message into a delayed one)
                MessageExtBrokerInner msgInner = this.messageTimeup(msgExt);
                // Deliver the message to the destination topic
                PutMessageResult putMessageResult = ScheduleMessageService.this.writeMessageStore.putMessage(msgInner);
                if (putMessageResult != null && putMessageResult.getPutMessageStatus() == PutMessageStatus.PUT_OK) {
                    continue;
                } else {
                    // XXX: warn and notify me
                    // Retry on error
                    log.error(
                        "ScheduleMessageService, a message time up, but reput it failed, topic: {} msgId {}",
                        msgExt.getTopic(), msgExt.getMsgId());
                    ScheduleMessageService.this.timer.schedule(
                        new DeliverDelayedMessageTimerTask(this.delayLevel, nextOffset), DELAY_FOR_A_PERIOD);
                    ScheduleMessageService.this.updateOffset(this.delayLevel, nextOffset);
                    return;
                }
            } catch (Exception e) {
                log.error(
                    "ScheduleMessageService, messageTimeup execute error, drop it. msgExt=" + msgExt
                        + ", nextOffset=" + nextOffset + ",offsetPy=" + offsetPy + ",sizePy=" + sizePy, e);
            }
        }
    } else {
        // countdown > 0: the message has not expired yet and must wait countdown milliseconds.
        // Reschedule this task after countdown milliseconds. The design here is very clever:
        // the task is postponed directly to the message's delivery time, so there is no need
        // to check repeatedly whether the message has expired.
        ScheduleMessageService.this.timer.schedule(
            new DeliverDelayedMessageTimerTask(this.delayLevel, nextOffset), countdown);
        ScheduleMessageService.this.updateOffset(this.delayLevel, nextOffset);
        return;
    }
    ...
    // If no delayed message is found, schedule the task again after 0.1s
    ScheduleMessageService.this.timer.schedule(
        new DeliverDelayedMessageTimerTask(this.delayLevel, failScheduleOffset), DELAY_FOR_A_WHILE);
}
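The deliver-or-wait decision in executeOnTimeup can be reduced to a few lines. The sketch below is a simplified model, not the broker code: a Deque of delivery timestamps stands in for one delay level's queue, and runOnce returns how long the next run should be postponed. The names LevelQueuePoller and runOnce are hypothetical; the 100 ms value mirrors the 0.1s retry mentioned in the comment above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class LevelQueuePoller {
    static final long DELAY_FOR_A_WHILE = 100L; // retry interval when nothing is waiting

    /**
     * Delivers every entry at the head of the queue whose time has come and
     * returns the number of milliseconds to wait before the next run.
     */
    static long runOnce(Deque<Long> deliverTimes, long now, List<Long> delivered) {
        while (!deliverTimes.isEmpty()) {
            long countdown = deliverTimes.peekFirst() - now;
            if (countdown > 0) {
                return countdown; // head not due yet: wait exactly until it is
            }
            delivered.add(deliverTimes.pollFirst()); // due: hand it to the real topic
        }
        return DELAY_FOR_A_WHILE; // queue drained: look again shortly
    }

    public static void main(String[] args) {
        Deque<Long> queue = new ArrayDeque<>(Arrays.asList(1_000L, 1_500L, 3_000L));
        List<Long> delivered = new ArrayList<>();
        System.out.println(runOnce(queue, 1_600L, delivered)); // 1400: wait for the 3000 entry
        System.out.println(delivered);                         // [1000, 1500]
        System.out.println(runOnce(queue, 3_000L, delivered)); // 100: queue is now empty
    }
}
```

Only the head of the queue is ever inspected, which works because entries in one level's queue become due in the order they were stored.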
RocketMQ delay message summary
- A "transit station" layer is added between the producer and the destination topic: a topic named SCHEDULE_TOPIC_XXXX
- By default this special topic has 18 queues, one for each delay level
- Each queue in the topic has its own task that checks whether the messages in the queue have expired; expired messages are delivered to the final destination topic
Characteristics
- Messages are partitioned by delay level, which improves file lookup performance
- Within each level's partition directory, the queue stays ordered from the earliest to the latest expiry time
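One way to see why a queue per delay level stays ordered: if every message in a queue has the same delay, delivery time is just store time plus a constant, so arrival order equals delivery order. The sketch below checks that property on sample numbers; it reflects my reading of the design, and the class and method names are invented for this example.

```java
class OrderingCheck {
    /** True if the delivery times (store time + delay) are non-decreasing in arrival order. */
    static boolean deliverTimesSorted(long[] storeTimes, long[] delays) {
        long previous = Long.MIN_VALUE;
        for (int i = 0; i < storeTimes.length; i++) {
            long deliverAt = storeTimes[i] + delays[i];
            if (deliverAt < previous) {
                return false;
            }
            previous = deliverAt;
        }
        return true;
    }

    public static void main(String[] args) {
        long[] storeTimes = {0L, 1_000L, 2_000L}; // messages arrive one second apart

        // Same delay level (5s each): delivery order matches arrival order.
        System.out.println(deliverTimesSorted(storeTimes, new long[] {5_000L, 5_000L, 5_000L})); // true

        // Arbitrary delays in a single queue: a later message can be due first.
        System.out.println(deliverTimesSorted(storeTimes, new long[] {60_000L, 5_000L, 1_000L})); // false
    }
}
```

With mixed delays in one queue, the head would no longer be the next message due, so checking only the head would not be enough.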
Guess
- If arbitrary precision is supported
To sum up: RocketMQ's delayed messages use "delay levels" to balance two concerns: the read and write performance of persisted messages with different delays, and the timeliness of delay triggering.