1, Message queue
In computing, a message is data sent from one program to another, and a queue is a first-in, first-out (FIFO) container: whatever is put in first is taken out first. A message queue, then, is a container that temporarily holds messages while they are in transit. Using a queue removes the need for sending and receiving to happen synchronously: the sender simply puts messages into the queue and moves on, instead of blocking while it waits for the receiver.
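The FIFO behavior described above can be illustrated with Python's standard-library `queue` module. This is a minimal in-process sketch; a real message queue adds networking, routing, and persistence on top of the same idea:

```python
import queue

# A FIFO queue: items come out in the order they were put in.
q = queue.Queue()

# The "sender" can enqueue messages immediately, without waiting
# for any receiver to be ready (no blocking on the consumer side).
for msg in ["first", "second", "third"]:
    q.put(msg)

# The "receiver" drains the queue later, in insertion order.
received = []
while not q.empty():
    received.append(q.get())

print(received)
```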
1.1 Functions
- Application decoupling: split a large system into several smaller ones that interact and call each other through message queues.
- Traffic peak shaving: in high-traffic scenarios such as flash sales, the server can easily crash. Rejecting users outright is a very poor user experience, so instead we control the rate of request processing: requests that cannot be handled immediately go into a message queue, and users wait a short while, which is much better than being rejected.
- Message distribution: sending data to the other hosts and programs that need to receive it.
- Asynchronous processing: for example, after a user successfully purchases an item, the system sends an order SMS and email and shows the success page:
  - Without a message queue: first hand the order data to the SMS module to send the SMS, then to the email module to send the email, and only then show the success page.
  - With a message queue: show the success page immediately. Put the order data into a message queue, and let the SMS module and email module read it and send the notifications asynchronously in the background.
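The asynchronous-processing flow above can be sketched in-process with a queue and a background thread. Names like `order_queue`, `notifier`, and the order id are illustrative only, not a real API:

```python
import queue
import threading

order_queue = queue.Queue()
notifications_sent = []

def notifier():
    # Background worker: reads orders and "sends" SMS and email for each.
    while True:
        order = order_queue.get()
        if order is None:  # sentinel to stop the worker
            break
        notifications_sent.append(f"SMS for {order}")
        notifications_sent.append(f"email for {order}")

worker = threading.Thread(target=notifier)
worker.start()

# The purchase handler only enqueues the order and returns immediately,
# so the user sees the success page without waiting for SMS/email.
order_queue.put("order-1001")
print("success page shown")

order_queue.put(None)  # shut the worker down
worker.join()
print(notifications_sent)
```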
1.2 Comparison of mainstream message queues
| | Kafka | RocketMQ | RabbitMQ |
|---|---|---|---|
| Single-machine throughput | ~100,000 msg/s | ~100,000 msg/s | ~10,000 msg/s |
| Message latency | milliseconds | milliseconds | microseconds |
| Availability | Very high (distributed) | Very high (distributed) | High (master-slave) |
| Message loss | Theoretically none | Theoretically none | Low |
| Community activity | High | Medium | High |
2, RabbitMQ installation
2.1 installation
2.1.1 Docker mode
```shell
# for RabbitMQ 3.9, the latest series
docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.9-management

# for RabbitMQ 3.8,
# 3.8.x support timeline: https://www.rabbitmq.com/versions.html
docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.8-management
```
2.1.2 native mode (Ubuntu 20.04)
- Save the following code to a `rabbitMQ_install.sh` file:

```shell
#!/usr/bin/sh

sudo apt-get install curl gnupg apt-transport-https -y

## Team RabbitMQ's main signing key
curl -1sLf "https://keys.openpgp.org/vks/v1/by-fingerprint/0A9AF2115F4687BD29803A206B73A36E6026DFCA" | sudo gpg --dearmor | sudo tee /usr/share/keyrings/com.rabbitmq.team.gpg > /dev/null
## Cloudsmith: modern Erlang repository
curl -1sLf https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-erlang/gpg.E495BB49CC4BBE5B.key | sudo gpg --dearmor | sudo tee /usr/share/keyrings/io.cloudsmith.rabbitmq.E495BB49CC4BBE5B.gpg > /dev/null
## Cloudsmith: RabbitMQ repository
curl -1sLf https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-server/gpg.9F4587F226208342.key | sudo gpg --dearmor | sudo tee /usr/share/keyrings/io.cloudsmith.rabbitmq.9F4587F226208342.gpg > /dev/null

## Add apt repositories maintained by Team RabbitMQ
sudo tee /etc/apt/sources.list.d/rabbitmq.list <<EOF
## Provides modern Erlang/OTP releases
##
deb [signed-by=/usr/share/keyrings/io.cloudsmith.rabbitmq.E495BB49CC4BBE5B.gpg] https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-erlang/deb/ubuntu bionic main
deb-src [signed-by=/usr/share/keyrings/io.cloudsmith.rabbitmq.E495BB49CC4BBE5B.gpg] https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-erlang/deb/ubuntu bionic main

## Provides RabbitMQ
##
deb [signed-by=/usr/share/keyrings/io.cloudsmith.rabbitmq.9F4587F226208342.gpg] https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-server/deb/ubuntu bionic main
deb-src [signed-by=/usr/share/keyrings/io.cloudsmith.rabbitmq.9F4587F226208342.gpg] https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-server/deb/ubuntu bionic main
EOF

## Update package indices
sudo apt-get update -y

## Install Erlang packages
sudo apt-get install -y erlang-base \
    erlang-asn1 erlang-crypto erlang-eldap erlang-ftp erlang-inets \
    erlang-mnesia erlang-os-mon erlang-parsetools erlang-public-key \
    erlang-runtime-tools erlang-snmp erlang-ssl \
    erlang-syntax-tools erlang-tftp erlang-tools erlang-xmerl

## Install rabbitmq-server and its dependencies
sudo apt-get install rabbitmq-server -y --fix-missing
```
- Add execute permission to the file:

```shell
sudo chmod +x ./rabbitMQ_install.sh
```
- Run the script:

```shell
sudo bash ./rabbitMQ_install.sh
```
2.2 Using the management plugin
- Start the RabbitMQ service:

```shell
sudo systemctl start rabbitmq-server
```

- Create a RabbitMQ user:

```shell
sudo rabbitmqctl add_user 'username' 'password'
```

RabbitMQ also ships with a default guest user whose username and password are both guest (by default it can only connect from localhost).

- Grant the user the administrator tag:

```shell
sudo rabbitmqctl set_user_tags username administrator
```

- Grant the user full permissions on the default virtual host:

```shell
sudo rabbitmqctl set_permissions -p / username '.*' '.*' '.*'
```

- Enable the management plugin:

```shell
sudo rabbitmq-plugins enable rabbitmq_management
```

- Open the management page at http://localhost:15672/ and log in with the username and password created above.
3, RabbitMQ quick start
3.1 Terminology
- Producing: sending messages. A program that sends messages is called a "producer".
- Consuming: receiving messages. A program that receives messages is called a "consumer".
- Queue: where messages are stored. A queue is limited only by the host's memory and disk; it is essentially a large message buffer. Multiple producers can send messages to one queue, and multiple consumers can try to receive messages from one queue.

Note: producers, consumers, and the message queue can run on different machines; a producer and a consumer can also be the same program.
3.2 Hello World!
Next, we write two small Python programs. The producer sends a message (puts it into the queue) and the consumer receives it (takes it out of the queue); the message content is "Hello World!". This is the simplest way to use RabbitMQ.
- Install the Python client officially recommended by RabbitMQ:

```shell
pip install pika
```
- To send a message, create a new `send.py` file:

```python
#!/usr/bin/env python
import pika

# Connect to the local message broker
connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()

# Create a queue named hello
channel.queue_declare(queue='hello')

# Send a message
channel.basic_publish(exchange='',
                      routing_key='hello',   # queue name
                      body='Hello World!')   # message content
print(" [x] Sent 'Hello World!'")

# Close the connection; this flushes the network buffers,
# making sure the message was actually sent
connection.close()
```
- To receive messages, create a new `receive.py` file:

```python
#!/usr/bin/env python
import pika, sys, os

def main():
    # Connect to the local message broker
    connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = connection.channel()

    # Create a queue named hello
    channel.queue_declare(queue='hello')

    # Callback function, called whenever a message is received
    def callback(ch, method, properties, body):
        print(" [x] Received %r" % body)

    channel.basic_consume(queue='hello',               # queue to consume from
                          auto_ack=True,               # acknowledge messages automatically
                          on_message_callback=callback)  # specify the callback

    print(' [*] Waiting for messages. Press CTRL+C to exit')
    # Enter a loop and wait for messages
    channel.start_consuming()

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        print('Interrupted')
        try:
            sys.exit(0)
        except SystemExit:
            os._exit(0)
```
Note: both the consumer and the producer declare the same queue, which guarantees the queue exists no matter which side starts first.
- Start the consumer:

```shell
python receive.py
# [*] Waiting for messages. Press CTRL+C to exit
# [x] Received b'Hello World!'
```
- Start the producer:

```shell
python send.py
# [x] Sent 'Hello World!'
```
3.3 work queue
A task queue (also known as a work queue) stores resource-intensive tasks, each encapsulated as a message. The goal is to keep resource-intensive tasks from blocking the current process: dedicated worker processes take tasks (i.e. messages; "task" fits better here) out of the queue in the background and execute them. Moreover, when several workers run together, the tasks are conveniently shared among them.
- Modify `send.py` and save it as `new_task.py`; the changed part is:

```python
import sys

# The task to process
message = ' '.join(sys.argv[1:]) or "Hello World!"
channel.basic_publish(exchange='',
                      routing_key='hello',
                      body=message)  # the message body is now a task
```
- Modify `receive.py` and save it as `worker.py`:

```python
import time

def callback(ch, method, properties, body):
    print(" [x] Received %r" % body.decode())
    # Sleep one second per "." in the message body,
    # pretending the task takes that long to process
    time.sleep(body.count(b'.'))
    print(" [x] Done")
```
3.3.1 Round-robin dispatching
One advantage of task queues is easy horizontal scaling: if we have a large backlog of tasks and cannot keep up, we simply add more worker processes (hereafter "workers").
- Open two terminals and run `worker.py` in each:

```shell
python worker.py
# [*] Waiting for messages. Press CTRL+C to exit
```
- Open another terminal and run `new_task.py` several times:

```shell
python new_task.py First message.
python new_task.py Second message..
python new_task.py Third message...
python new_task.py Fourth message....
python new_task.py Fifth message.....
```
- See what each worker received:

```shell
# First worker:
# [x] Received 'First message.'
# [x] Done
# [x] Received 'Third message...'
# [x] Done
# [x] Received 'Fifth message.....'
# [x] Done

# Second worker:
# [x] Received 'Second message..'
# [x] Done
# [x] Received 'Fourth message....'
# [x] Done
```
By default, RabbitMQ sends each message to the next worker in sequence, so on average every worker receives the same number of tasks. This way of distributing messages is called round-robin.
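Round-robin dispatching can be simulated in a few lines of Python (an in-process sketch, not RabbitMQ's actual implementation; worker names are made up):

```python
from itertools import cycle

workers = ["worker-1", "worker-2"]
assignments = {w: [] for w in workers}

# Round-robin: each incoming task goes to the next worker in turn,
# regardless of how busy that worker already is.
rr = cycle(workers)
tasks = ["First.", "Second..", "Third...", "Fourth....", "Fifth....."]
for task in tasks:
    assignments[next(rr)].append(task)

print(assignments)
```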
3.3.2 Message acknowledgement
In the code above, if a worker is killed while executing a task, that task is lost. To avoid losing tasks, RabbitMQ provides a message acknowledgement mechanism:

- If the worker replies with an ack (acknowledgement), RabbitMQ considers the task successfully processed and deletes it from the queue.
- If the worker's connection terminates without an ack, RabbitMQ considers the task failed, re-queues it, and delivers it to another running worker.

Acknowledgements have a timeout, 30 minutes by default, which helps detect misbehaving workers.

Manual acknowledgement is on by default; in the earlier example we turned it off with auto_ack=True.
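If 30 minutes is too short for long-running tasks, recent RabbitMQ releases let the acknowledgement timeout be changed in `rabbitmq.conf` (the value is in milliseconds); a sketch, assuming the default config file location:

```ini
# /etc/rabbitmq/rabbitmq.conf
# Raise the delivery acknowledgement timeout to 1 hour (milliseconds).
consumer_timeout = 3600000
```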
- Continue modifying `worker.py`: remove the auto_ack parameter that disabled acknowledgements, and send the ack explicitly:

```python
def callback(ch, method, properties, body):
    print(" [x] Received %r" % body.decode())
    time.sleep(body.count(b'.'))
    print(" [x] Done")
    ch.basic_ack(delivery_tag=method.delivery_tag)  # send the ack

channel.basic_consume(queue='hello',
                      on_message_callback=callback)
```
With this code, even if you terminate a worker with CTRL+C while it is processing a task, the task will not be lost: RabbitMQ re-queues it and delivers it to another worker.
3.3.3 message persistence
We have learned how to keep tasks from being lost when a worker dies unexpectedly, but they are still lost if the RabbitMQ server itself stops. To survive a broker restart, two things must be done: mark both the queue and the messages as durable.
- Mark the queue as durable so that it survives a RabbitMQ restart:

```python
# The hello queue already exists (as non-durable), so we declare
# a new queue, task_queue
channel.queue_declare(queue='task_queue', durable=True)
```
Be careful: the durable queue declaration must appear in both the producer and the consumer code.
- Mark messages as persistent:

```python
channel.basic_publish(exchange='',
                      routing_key="task_queue",
                      body=message,
                      properties=pika.BasicProperties(
                          delivery_mode=pika.spec.PERSISTENT_DELIVERY_MODE
                      ))
```
Marking messages as persistent does not fully guarantee they will not be lost. It tells RabbitMQ to save the message to disk, but there is still a short window after RabbitMQ has accepted a message and before it has saved it. In addition, RabbitMQ does not do an fsync(2) for every message; a message may sit in the OS cache rather than being truly written to disk.

The persistence guarantee is therefore not strong, but it is good enough for our simple task queue. If you need a stronger guarantee, use publisher confirms.
3.3.4 fair dispatch
By default, RabbitMQ distributes messages evenly, even when some tasks are especially time-consuming and the worker assigned to them already has a backlog. This is because RabbitMQ dispatches a message as soon as it enters the queue: it does not look at a worker's number of unacknowledged messages, it just blindly sends the n-th message to the n-th worker. We can change this behavior with a setting.
- Configure fair dispatch in the worker:

```python
channel.basic_qos(prefetch_count=1)
```
The value 1 above tells RabbitMQ not to give a worker a new message until it has processed and acknowledged the previous one; instead, the message is dispatched to the next worker that is not busy.

If all workers are busy, the queue can fill up. Keep an eye on this, and either add more workers or use a message TTL.
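The effect of prefetch_count=1 can be approximated with a small simulation (illustrative only; the worker names and the one-second-per-dot cost are made up): each task goes to whichever worker is free earliest, so one slow task no longer holds up the rest.

```python
# Simulate fair dispatch: a task is only given to a worker that is
# idle (has acknowledged its previous task) at dispatch time.
tasks = ["a.", "b......", "c.", "d.", "e."]
workers = {"worker-1": 0.0, "worker-2": 0.0}  # time when each worker becomes free
assignments = {w: [] for w in workers}

for task in tasks:
    # Dispatch to whichever worker is free earliest.
    w = min(workers, key=workers.get)
    workers[w] += task.count(".")  # busy one "second" per dot
    assignments[w].append(task)

print(assignments)
```

Note how the long task `b......` ties up worker-2 while worker-1 keeps draining the short tasks; round-robin would have split them 3/2 regardless of cost.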
3.4 Publish/Subscribe
A publisher (i.e. producer) sends a message, and every subscriber (i.e. consumer) receives a copy. This pattern is called "publish/subscribe".
3.4.1 Exchanges
Earlier we said that messages sent by producers are stored in queues. That was a simplification to aid understanding: in fact, a producer never sends any message directly to a queue, but to an exchange. The exchange decides where each message goes next; the exact rules are defined by the exchange type.

The exchange types are direct, topic, headers, and fanout. Create a fanout exchange named logs:

```python
channel.exchange_declare(exchange='logs', exchange_type='fanout')
```

The fanout exchange is very simple: it broadcasts every message it receives to all the queues it knows about.

Send a message to the newly created exchange:

```python
channel.basic_publish(exchange='logs',  # select the exchange by name
                      routing_key='',
                      body=message)
```
3.4.2 temporary queue
Previously, we used named queues such as hello and task_queue. What happens if we declare a queue without a name? RabbitMQ creates a temporary queue with a generated name such as amq.gen-JzTY20BRgKO-HjmUJj0wLg.

We can also pass exclusive=True so that the queue is deleted as soon as the consumer's connection closes.

Create an exclusive temporary queue:

```python
result = channel.queue_declare(queue='', exclusive=True)
```
3.4.3 binding
Bind the exchange and the queue in the receiver; this binding is the subscription:

```python
channel.queue_bind(exchange='logs', queue=result.method.queue)
```

From now on, the logs exchange will deliver messages to our temporary queue. Any receiver bound to the same exchange this way receives the messages.

We can inspect the bindings with the following command (while a receiver is running):

```shell
sudo rabbitmqctl list_bindings
```
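To make the fanout behavior concrete, here is a toy in-process model (the exchange and queue names are illustrative, and this is not the pika API): every queue bound to the exchange receives its own copy of each published message.

```python
from collections import defaultdict

bindings = defaultdict(list)   # exchange name -> list of bound queues
queues = defaultdict(list)     # queue name -> stored messages

def queue_bind(exchange, queue):
    bindings[exchange].append(queue)

def fanout_publish(exchange, body):
    # A fanout exchange ignores the routing key and copies the
    # message to every queue bound to it.
    for q in bindings[exchange]:
        queues[q].append(body)

queue_bind("logs", "amq.gen-subscriber-1")  # hypothetical temporary queue names
queue_bind("logs", "amq.gen-subscriber-2")
fanout_publish("logs", "info: Hello World!")

print(queues["amq.gen-subscriber-1"])
print(queues["amq.gen-subscriber-2"])
```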
3.5 Routing
So far, in our publish/subscribe model, subscribers receive all messages from the publisher with no way to choose. Using the routing_key parameter when binding, however, a subscriber can select which messages it wants to receive.

The meaning of routing_key depends on the exchange type; the fanout type used earlier simply ignores it. We therefore need a different exchange type, for example a direct exchange.
3.5.1 Direct exchange
The algorithm behind it is simple:

- The routing_key of basic_publish() is compared with the routing_key of each queue_bind();
- Where they match, the message is placed into that queue; bindings that do not match are skipped;
- If nothing matches at all, the message is discarded.

```python
# Publish a message
channel.basic_publish(exchange='direct_logs',
                      routing_key='black',
                      body=message)

# Bind a queue
channel.queue_bind(exchange=exchange_name,
                   queue=queue_name,
                   routing_key='black')
```
RabbitMQ allows the same routing_key to be bound to multiple queues; the exchange then sends the message to every matching queue.
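The matching algorithm above can be modeled in a few lines (a toy in-process sketch, not the pika API; the queue names `errors_queue` and `audit_queue` are hypothetical):

```python
from collections import defaultdict

bindings = defaultdict(list)   # (exchange, routing_key) -> bound queues
queues = defaultdict(list)     # queue name -> stored messages

def queue_bind(exchange, queue, routing_key):
    bindings[(exchange, routing_key)].append(queue)

def direct_publish(exchange, routing_key, body):
    # Deliver to every queue whose binding key equals the routing key;
    # with no matching binding, the message is silently dropped.
    for q in bindings.get((exchange, routing_key), []):
        queues[q].append(body)

queue_bind("direct_logs", "errors_queue", routing_key="black")
queue_bind("direct_logs", "audit_queue", routing_key="black")  # same key, two queues
direct_publish("direct_logs", "black", "a black message")
direct_publish("direct_logs", "green", "a green message")      # no binding: dropped

print(queues["errors_queue"])
print(queues["audit_queue"])
```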
3.5.2 Topic exchange
The routing_key of a topic exchange must be a list of words separated by dots, for example quick.orange.rabbit, and at most 255 bytes long. As with the direct exchange, the routing_key used when publishing is matched against the binding keys: on a match the message goes to the queue, otherwise it is discarded. The real power of the topic exchange lies in two special symbols in binding keys:

- *: matches exactly one word. For example, *.orange.* matches any three-word routing_key with orange as the middle word.
- #: matches zero or more words. For example, rabbit.# matches any routing_key beginning with rabbit, with no limit on the number of words (within the 255-byte limit).
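The two wildcards can be captured in a short matching function (a pure-Python sketch of the matching rule described above, not RabbitMQ's implementation):

```python
def topic_matches(pattern, routing_key):
    """Return True if a topic binding pattern matches a routing key.
    '*' matches exactly one word; '#' matches zero or more words."""
    def match(p, k):
        if not p:
            return not k                   # pattern exhausted: key must be too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may swallow zero or more words
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:    # '*' matches any single word
            return match(rest, k[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))

print(topic_matches("*.orange.*", "quick.orange.rabbit"))  # True
print(topic_matches("*.orange.*", "orange.rabbit"))        # False: only two words
print(topic_matches("rabbit.#", "rabbit"))                 # True: '#' matches zero words
print(topic_matches("rabbit.#", "rabbit.slow.fox"))        # True
```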