Message storage structure of RocketMQ

Posted by machina3k on Sat, 30 Oct 2021 09:20:18 +0200

Storage path configuration of files

#Storage path
#commitLog storage path
#Consumption queue storage path
#Message index storage path
#checkpoint file storage path
#abort file storage path

The storage of RocketMQ messages is divided into three parts:


CommitLog: stores the message body together with its metadata. All messages sent by Producers are appended to the CommitLog files sequentially, in the order they arrive.

The messages you send carry different topics and queue IDs. Whatever the topic, the broker appends every message to the same CommitLog, which keeps the write path sequential and fast.

CommitLog consists of multiple files, each with a fixed size of 1 GB by default. Each file is named after the offset of its first message.

How do you find data in the CommitLog files, for example the messages of one topic? After a message is saved, the CommitLog dispatches it to two kinds of index file: ConsumerQueue and IndexFile.
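The write-then-dispatch flow described above can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions, not RocketMQ's implementation: the class `ToyStore` and its methods are hypothetical, positions are list indexes rather than byte offsets, and nothing is written to disk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model: one shared log plus two lookup structures.
final class ToyStore {
    // Stand-in for the CommitLog: every message, whatever its topic, in arrival order.
    private final List<String> commitLog = new ArrayList<>();
    // Stand-in for ConsumerQueue: per "topic/queueId", the log positions of its messages.
    private final Map<String, List<Integer>> consumerQueues = new HashMap<>();
    // Stand-in for IndexFile: per "topic#key", the log positions of matching messages.
    private final Map<String, List<Integer>> keyIndex = new HashMap<>();

    // Append to the shared log first, then dispatch the position to both indexes.
    int put(String topic, int queueId, String key, String body) {
        int position = commitLog.size();
        commitLog.add(body);
        consumerQueues.computeIfAbsent(topic + "/" + queueId, k -> new ArrayList<>()).add(position);
        keyIndex.computeIfAbsent(topic + "#" + key, k -> new ArrayList<>()).add(position);
        return position;
    }

    // Consumption path: walk the queue's index and fetch each body from the log.
    List<String> readQueue(String topic, int queueId) {
        return resolve(consumerQueues.getOrDefault(topic + "/" + queueId, Collections.emptyList()));
    }

    // Query path: look the key up in the key index, then fetch from the log.
    List<String> findByKey(String topic, String key) {
        return resolve(keyIndex.getOrDefault(topic + "#" + key, Collections.emptyList()));
    }

    private List<String> resolve(List<Integer> positions) {
        List<String> bodies = new ArrayList<>();
        for (int position : positions) {
            bodies.add(commitLog.get(position));
        }
        return bodies;
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("TopicA", 0, "order-1", "a1");
        store.put("TopicB", 0, "order-2", "b1");
        store.put("TopicA", 0, "order-3", "a2");
        System.out.println(store.readQueue("TopicA", 0));       // [a1, a2]
        System.out.println(store.findByKey("TopicB", "order-2")); // [b1]
    }
}
```

The point of the design is that the log itself is never searched: both lookups go through a small index and then jump straight to a position in the log.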

A CommitLog file is named after the starting offset of the messages it holds. Its size is fixed at 1073741824 bytes, which is 1 GB.

When the file 00000000000000000000 is full, a new file of the same size is created immediately and named after the offset of the first message in it. With the default file size, the second file is 00000000001073741824, which means that the file starts at byte offset 1073741824 of the log.
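The naming rule can be checked with a few lines of arithmetic. The sketch below assumes that a file name is the starting byte offset zero-padded to 20 digits (as the name 00000000000000000000 in the listing suggests) and that the default size of 1073741824 bytes is in use. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: maps a byte offset in the log to the file that holds it.
final class CommitLogNames {
    // File size stated in the text: 1073741824 bytes (1 GB).
    static final long FILE_SIZE = 1073741824L;

    // Name of the file containing the offset: the offset of the file's
    // first byte, zero-padded to 20 digits.
    static String fileNameFor(long physicalOffset) {
        long fileStart = physicalOffset - (physicalOffset % FILE_SIZE);
        return String.format(Locale.ROOT, "%020d", fileStart);
    }

    // Position of the offset inside that file.
    static long positionInFile(long physicalOffset) {
        return physicalOffset % FILE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor(0L));             // 00000000000000000000
        System.out.println(fileNameFor(1073741824L));    // 00000000001073741824
        System.out.println(fileNameFor(1500000000L));    // 00000000001073741824
        System.out.println(positionInFile(1500000000L)); // 426258176
    }
}
```

Because every file has the same size, finding the file for an offset is a division rather than a search, which is why the offset is used as the name.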

[root@zjj102 commitlog]# ll
total 466152
-rw-r--r--. 1 root root 1073741824 10 May 25-21:26 00000000000000000000
[root@zjj102 commitlog]#


ConsumerQueue (named ConsumeQueue in the RocketMQ source and stored under the consumequeue directory): the index of the messages stored in the CommitLog. Each MessageQueue has its own ConsumerQueue files, and the consumption progress of a consumer group on that queue is expressed as a position in them.

To stay efficient, the ConsumerQueue does not store message content, only the position of each message in the CommitLog files. Through the ConsumerQueue, the content of a message can therefore be located quickly in the CommitLog.

The ConsumerQueue file also stores a hash of each message's tag, so messages can be filtered by tag directly on the ConsumerQueue without reading their content from the CommitLog. This is why tag filtering is very fast, and why the official recommendation is to filter messages by tag.
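As an illustration of what such an index entry can look like, the sketch below assumes the commonly documented 20-byte layout: an 8-byte CommitLog offset, a 4-byte message size, and an 8-byte tag hash taken from `String.hashCode()`. The class `ConsumeQueueEntry` and its methods are hypothetical names for this example, not RocketMQ classes.

```java
import java.nio.ByteBuffer;

// Hypothetical model of one fixed-size index entry.
final class ConsumeQueueEntry {
    static final int ENTRY_SIZE = 20; // assumed layout: 8 + 4 + 8 bytes

    final long commitLogOffset; // where the message starts in the CommitLog
    final int messageSize;      // how many bytes to read from there
    final long tagHash;         // hash of the tag, used for cheap filtering

    ConsumeQueueEntry(long commitLogOffset, int messageSize, long tagHash) {
        this.commitLogOffset = commitLogOffset;
        this.messageSize = messageSize;
        this.tagHash = tagHash;
    }

    static ConsumeQueueEntry forMessage(long commitLogOffset, int messageSize, String tag) {
        return new ConsumeQueueEntry(commitLogOffset, messageSize, tag.hashCode());
    }

    // Serialize to exactly ENTRY_SIZE bytes (big-endian, the ByteBuffer default).
    byte[] encode() {
        return ByteBuffer.allocate(ENTRY_SIZE)
                .putLong(commitLogOffset)
                .putInt(messageSize)
                .putLong(tagHash)
                .array();
    }

    static ConsumeQueueEntry decode(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new ConsumeQueueEntry(buffer.getLong(), buffer.getInt(), buffer.getLong());
    }

    // First-pass filter on the hash only. A hit still has to be confirmed
    // against the real tag, because different tags can share a hash code.
    boolean mayMatchTag(String wantedTag) {
        return tagHash == wantedTag.hashCode();
    }

    public static void main(String[] args) {
        ConsumeQueueEntry entry = forMessage(4096L, 256, "TagA");
        ConsumeQueueEntry copy = decode(entry.encode());
        System.out.println(copy.commitLogOffset);     // 4096
        System.out.println(copy.mayMatchTag("TagA")); // true
        System.out.println(copy.mayMatchTag("TagB")); // false
    }
}
```

Fixed-size entries mean the n-th message of a queue sits at byte n * 20, so an entry can be read without scanning, and the tag check touches only these 20 bytes.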

# Inside the consumequeue folder there is one subfolder per topic; the folder name is the topic name
[root@zjj102 consumequeue]# ls
BatchTest            RMQ_SYS_TRANS_HALF_TOPIC     TagFilterTest
OrderTopicTest       SCHEDULE_TOPIC_XXXX          TopicTest2
# Enter the topic folder of BatchTest
[root@zjj102 consumequeue]# cd BatchTest/
# Inside are the queue folders of this topic; each folder name is a queue ID
[root@zjj102 BatchTest]# ls
0  1  2  3
# Enter the folder of queue 0, which holds the files of queue 0. They are binary files of fixed size, each named after the offset of its first entry
[root@zjj102 BatchTest]# cd 0/
[root@zjj102 0]# ls
00000000000000000000  00000000000006000000  00000000000012000000
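The file names in this listing step by 6000000, which fits a file of 300000 entries of 20 bytes each. Assuming that layout (it is inferred from the listing and from the commonly documented defaults, not stated in this post), the file that holds the n-th entry of a queue can be computed as in this sketch. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: maps a logical queue offset to its index file.
final class ConsumeQueueFiles {
    static final int ENTRY_SIZE = 20;            // assumed bytes per entry
    static final int ENTRIES_PER_FILE = 300_000; // assumed entries per file
    static final long FILE_SIZE = (long) ENTRY_SIZE * ENTRIES_PER_FILE; // 6000000 bytes

    // queueOffset is the logical index of a message in its queue (0, 1, 2, ...).
    static String fileNameFor(long queueOffset) {
        long byteOffset = queueOffset * ENTRY_SIZE;
        long fileStart = byteOffset - (byteOffset % FILE_SIZE);
        return String.format(Locale.ROOT, "%020d", fileStart);
    }

    // Byte position of the entry inside that file.
    static long positionInFile(long queueOffset) {
        return (queueOffset * ENTRY_SIZE) % FILE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor(0L));         // 00000000000000000000
        System.out.println(fileNameFor(300_000L));   // 00000000000006000000
        System.out.println(fileNameFor(650_000L));   // 00000000000012000000
        System.out.println(positionInFile(650_000L)); // 1000000
    }
}
```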


IndexFile: this file provides auxiliary indexing on top of the ConsumerQueue files, such as timestamp-based filtering. For message queries, it offers a way to look up messages by key or by time interval. Looking up messages through IndexFile does not affect the main process of sending and consuming messages.

IndexFile supports lookups by key hash, by timestamp, and so on
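A key-hash lookup of this kind can be sketched as follows. The numbers are assumptions based on the commonly documented defaults (a 40-byte header, 5000000 hash slots of 4 bytes, and up to 20000000 index entries of 20 bytes), and the key is assumed to be built as topic + "#" + key. The class `IndexSlots` and its methods are hypothetical, and the sketch only computes positions; it does not read a real file.

```java
// Hypothetical helper: computes where a key's hash slot would sit in an index file.
final class IndexSlots {
    static final int HEADER_SIZE = 40;         // assumed header bytes
    static final int SLOT_SIZE = 4;            // assumed bytes per hash slot
    static final int ENTRY_SIZE = 20;          // assumed bytes per index entry
    static final int SLOT_COUNT = 5_000_000;   // assumed default slot count
    static final int MAX_ENTRIES = 20_000_000; // assumed default entry count

    // Assumed key format: the topic and the message key joined by '#'.
    static String indexKey(String topic, String messageKey) {
        return topic + "#" + messageKey;
    }

    // Non-negative hash of the key.
    static int positiveHash(String key) {
        int hash = Math.abs(key.hashCode());
        return hash < 0 ? 0 : hash; // Math.abs(Integer.MIN_VALUE) stays negative
    }

    // Byte position of the hash slot the key falls into.
    static long slotPosition(String key) {
        int slot = positiveHash(key) % SLOT_COUNT;
        return HEADER_SIZE + (long) slot * SLOT_SIZE;
    }

    // Total file size under these assumptions.
    static long fileSize() {
        return HEADER_SIZE + (long) SLOT_COUNT * SLOT_SIZE + (long) MAX_ENTRIES * ENTRY_SIZE;
    }

    public static void main(String[] args) {
        String key = indexKey("TopicTest2", "order-1001");
        System.out.println(slotPosition(key)); // same key always maps to the same slot
        System.out.println(fileSize());        // 420000040
    }
}
```

Under this scheme the slot points to the most recent entry for that hash, and entries with the same hash are chained together, much like the buckets of a hash map stored on disk.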

        // Instantiate with the specified consumer group name
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("defaultGroup");

        // Read messages starting from a timestamp.
        // The timestamp is only used when consumption starts from CONSUME_FROM_TIMESTAMP
        consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_TIMESTAMP);
        // Placeholder: supply the start time as a yyyyMMddHHmmss string
        String consumeTimestamp = "Start timestamp";
        consumer.setConsumeTimestamp(consumeTimestamp);

IndexFile files are similar to CommitLog files in that each file has a fixed size, but they are named after a timestamp instead of an offset

[root@zjj102 index]# ls

All three kinds of file are binary files

Other, less important files


The abort file is a marker file that RocketMQ uses to determine whether the program was shut down normally. It is created at startup and deleted when the service shuts down normally. If the server goes down or the service is stopped abnormally, for example with kill -9, the abort file is not deleted. RocketMQ can then tell that the last shutdown was abnormal and carry out data recovery at the next start.

Data recovery covers operations such as the CommitLog dispatching messages to the ConsumerQueue and IndexFile files. After an abnormal shutdown, the dispatch may have stopped halfway when the RocketMQ process died, so these tasks are resumed once RocketMQ starts again
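The marker-file idea is easy to reproduce. The following is a minimal sketch of the create-at-startup, delete-at-shutdown pattern described above, not RocketMQ's own code: the class `AbortMarker` and its method names are hypothetical, and a temporary directory stands in for the store directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper illustrating the abort marker file pattern.
final class AbortMarker {
    private final Path abortFile;

    AbortMarker(Path storeDir) {
        this.abortFile = storeDir.resolve("abort");
    }

    // Check this before onStartup(): a leftover marker means the previous
    // run never reached its normal shutdown path.
    boolean lastShutdownWasAbnormal() {
        return Files.exists(abortFile);
    }

    // Create the marker when the service starts.
    void onStartup() {
        try {
            if (!Files.exists(abortFile)) {
                Files.createFile(abortFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Remove the marker only on a clean shutdown.
    void onNormalShutdown() {
        try {
            Files.deleteIfExists(abortFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path storeDir = Files.createTempDirectory("store");

        AbortMarker firstRun = new AbortMarker(storeDir);
        System.out.println(firstRun.lastShutdownWasAbnormal()); // false
        firstRun.onStartup();
        // Simulated crash: onNormalShutdown() is never called.

        AbortMarker secondRun = new AbortMarker(storeDir);
        System.out.println(secondRun.lastShutdownWasAbnormal()); // true, so recovery would run
        secondRun.onStartup();
        secondRun.onNormalShutdown();
    }
}
```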


The checkpoint file records the data save checkpoint, that is, the point up to which data has been flushed to disk


The config/*.json files save key configuration information of RocketMQ, such as the Topic configuration, the consumer group configuration, and the message offsets of each consumer group. They are stored in JSON format, so they can be opened and read directly

Many RocketMQ monitoring tools read the config/*.json files to obtain MQ queue information

The overall message storage structure is shown in the figure below:

Topics: RocketMQ