Once a project adopts message middleware such as RabbitMQ, its message volume has usually grown to the point where a single standalone node is stretched thin, so a cluster is normally built. Let's build a cluster of three machines.
Node information
Node 1: 192.168.0.116 centos1
Node 2: 192.168.0.117 centos2
Node 3: 192.168.0.118 centos3
Configure hosts file
Configure the hosts file on every node so that the nodes can resolve one another by hostname.
# /etc/hosts
192.168.0.116 centos1
192.168.0.117 centos2
192.168.0.118 centos3
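The hosts entries above can be added idempotently, so that running the step again on a node does not create duplicates. The following is only a sketch: the function name `add_host_entry` is hypothetical, and the demo writes to a temporary file rather than to /etc/hosts.

```shell
#!/bin/sh
# Append an "IP hostname" entry to a hosts file unless the hostname is already mapped.
# The file path is a parameter so the function can be tried on a scratch copy first.
add_host_entry() {
    hosts_file="$1"; ip="$2"; name="$3"
    # Look for the hostname as a whole field on any non-comment line.
    if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                         END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Demo on a temporary file; on a real node the target would be /etc/hosts (needs root).
demo_hosts=$(mktemp)
add_host_entry "$demo_hosts" 192.168.0.116 centos1
add_host_entry "$demo_hosts" 192.168.0.117 centos2
add_host_entry "$demo_hosts" 192.168.0.118 centos3
add_host_entry "$demo_hosts" 192.168.0.116 centos1   # second call is a no-op
cat "$demo_hosts"
```

Passing the file path as an argument keeps the function testable; the same calls with /etc/hosts as the first argument would be run on centos1, centos2 and centos3.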
Copy cookie file
Make sure the RabbitMQ cookie file is identical on every node. Here the cookie file of the centos1 node is used: copy it to /var/lib/rabbitmq/.erlang.cookie or $HOME/.erlang.cookie on centos2 and centos3.
Cookie file location
The default path of the cookie file is /var/lib/rabbitmq/.erlang.cookie or $HOME/.erlang.cookie. The cookie acts as a shared secret: nodes in a cluster exchange it to authenticate one another.
- If you installed from an unpacked archive (binary installation or compiled from source), the file is in the $HOME directory, that is, $HOME/.erlang.cookie. When RabbitMQ runs as root the location is /root/.erlang.cookie; for other users it is /home/<username>/.erlang.cookie.
- If you installed from the rpm package, the file is in the /var/lib/rabbitmq directory.
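The two candidate locations can be checked in order, and copies of the cookie can be compared byte for byte once they have been collected in one place. This is a minimal sketch; `find_cookie` and `cookies_match` are hypothetical helper names, not RabbitMQ tools.

```shell
#!/bin/sh
# Print the first existing cookie file among the candidate paths given as arguments.
find_cookie() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Succeed only if every listed cookie file has the same content as the first one.
cookies_match() {
    reference="$1"; shift
    for other in "$@"; do
        cmp -s "$reference" "$other" || return 1
    done
    return 0
}

# Typical lookup on a node: package install path first, then the user's home directory.
find_cookie /var/lib/rabbitmq/.erlang.cookie "$HOME/.erlang.cookie" \
    || echo "no cookie file found"
```

After copying the cookie to another node, the file should be owned by the user that runs RabbitMQ and readable only by that user (for example mode 400 or 600); otherwise the node may refuse to use it.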
Find the .erlang.cookie file
You can check the RabbitMQ log, as shown below. The home directory here is /root, so my .erlang.cookie file is /root/.erlang.cookie.
[info] <0.270.0>
 node           : rabbit@centos1
 home dir       : /root (started as the root user)
 config file(s) : (none)
 cookie hash    : tCXB8mlCcGEGGV1cYRkQCg==
 log(s)         : /usr/local/rabbitmq_server/var/log/rabbitmq/rabbit@centos1.log
                : /usr/local/rabbitmq_server/var/log/rabbitmq/rabbit@centos1_upgrade.log
 database dir   : /usr/local/rabbitmq_server/var/lib/rabbitmq/mnesia/rabbit@centos1
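Because the cookie sits in the home directory reported in the startup banner, its path can be derived from the "home dir" entry of the log. The sketch below assumes the banner layout shown above; the function name `cookie_path_from_banner` is made up for illustration.

```shell
#!/bin/sh
# Read a RabbitMQ startup banner on stdin and print the cookie path implied by "home dir".
cookie_path_from_banner() {
    sed -n 's/^.*home dir[[:space:]]*:[[:space:]]*\([^[:space:]]*\).*$/\1\/.erlang.cookie/p' \
        | head -n 1
}

# Demo with two banner lines; on a node the log file itself would be piped in.
printf ' node           : rabbit@centos1\n home dir       : /root\n' \
    | cookie_path_from_banner
# prints /root/.erlang.cookie
```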
Configure cluster
There are three ways to configure a cluster; the rabbitmqctl tool is used here:
- Configure with the rabbitmqctl tool;
- Configure through the rabbitmq.config configuration file;
- Configure through the rabbitmq-autocluster plugin.
Start the RabbitMQ service on the three nodes:
[root@centos1 ~]# rabbitmq-server -detached
[root@centos2 ~]# rabbitmq-server -detached
[root@centos3 ~]# rabbitmq-server -detached
After startup the three nodes are still independent of one another. You can check this with the rabbitmqctl cluster_status command.
Join cluster
Using centos1 as the base node, join centos2 and centos3 to it. Taking centos2 as an example, the steps are as follows (repeat them on centos3):
[root@centos2 ~]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@centos2 ...
[root@centos2 ~]# rabbitmqctl reset
Resetting node rabbit@centos2 ...
[root@centos2 ~]# rabbitmqctl join_cluster rabbit@centos1
Clustering node rabbit@centos2 with rabbit@centos1
[root@centos2 ~]# rabbitmqctl start_app
Starting node rabbit@centos2 ...
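Since the same four steps are repeated on every joining node, they can be wrapped in a small function. This is a sketch under assumptions: `join_cluster` here is a hypothetical shell wrapper, and its optional second argument (for example `echo`) is a dry-run prefix so the commands can be previewed without touching a node.

```shell
#!/bin/sh
# Run (or, with a runner such as "echo", just print) the steps that join
# the local node to the cluster of the given seed node.
join_cluster() {
    seed="$1"
    runner="${2:-}"          # empty: execute for real; "echo": dry run
    $runner rabbitmqctl stop_app &&
    $runner rabbitmqctl reset &&
    $runner rabbitmqctl join_cluster "$seed" &&
    $runner rabbitmqctl start_app
}

# Dry run: print what would be executed on centos3.
join_cluster rabbit@centos1 echo
```

Chaining with `&&` stops at the first failing step, so a node is not restarted after a failed reset or join. Note that `rabbitmqctl reset` wipes the local node's data, so it belongs only on the node that is joining.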
Run rabbitmqctl cluster_status on any node to check the result:
[
 {nodes,[{disc,[rabbit@centos1,rabbit@centos2,rabbit@centos3]}]},
 {running_nodes,[rabbit@centos1,rabbit@centos2,rabbit@centos3]},
 {cluster_name,<<"rabbit@centos1">>},
 {partitions,[]},
 {alarms,[{rabbit@centos1,[]},{rabbit@centos2,[]},{rabbit@centos3,[]}]}
]
Management web UI view
Stop a node
That is how to build a three-node cluster. What happens if one node is stopped? Stop centos2 and see.
[root@centos2 ~]# rabbitmqctl stop_app
Check the status with rabbitmqctl cluster_status:
[
 {nodes,[{disc,[rabbit@centos1,rabbit@centos2,rabbit@centos3]}]},
 {running_nodes,[rabbit@centos1,rabbit@centos3]},
 {cluster_name,<<"rabbit@centos1">>},
 {partitions,[]},
 {alarms,[{rabbit@centos1,[]},{rabbit@centos3,[]}]}
]
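The stopped node is still listed under nodes but no longer under running_nodes. A sketch of pulling the running nodes out of this output follows; it assumes the Erlang-term layout shown above (other RabbitMQ versions may print a different format), and `running_nodes` is a hypothetical helper name.

```shell
#!/bin/sh
# Read Erlang-term cluster_status output on stdin and print one running node per line.
running_nodes() {
    tr -d ' \n' \
        | sed -n 's/.*{running_nodes,\[\([^]]*\)\]}.*/\1/p' \
        | tr ',' '\n'
}

# Demo with the status captured after stopping centos2.
status='[ {nodes,[{disc,[rabbit@centos1,rabbit@centos2,rabbit@centos3]}]},
 {running_nodes,[rabbit@centos1,rabbit@centos3]},
 {partitions,[]} ]'
printf '%s\n' "$status" | running_nodes
```

On a live node the input would come from `rabbitmqctl cluster_status` instead of the sample string.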
Management web UI view
Cluster node shutdown and startup
If all nodes in the cluster are shut down, make sure that the node that was shut down last is the first one to be started. If the first node started is not the one that was shut down last, it waits for the last-stopped node to come up, 30 seconds per attempt. If that node does not appear before the wait runs out, the node that was started first fails to start as well.
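Put differently, the safe start order is the shutdown order reversed. A small sketch of that rule, with `start_order` as a hypothetical helper that takes the nodes in the order they were stopped:

```shell
#!/bin/sh
# Given node names in the order they were shut down, print them in a safe
# start order: the node stopped last comes first.
start_order() {
    reversed=""
    for node in "$@"; do
        reversed="$node${reversed:+ }$reversed"
    done
    printf '%s\n' "$reversed"
}

# centos2 was stopped first, centos1 last.
start_order centos2 centos3 centos1
# prints: centos1 centos3 centos2
```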
There is a retry mechanism. By default a node retries 10 times, waiting 30 seconds each time, for the last-stopped node to start. In the log below this wait runs twice (once for the feature-flag migration and once during boot), which gives about 20 attempts in total. Version used here: RabbitMQ 3.8.9 on Erlang 23.0.
Once the retries are exhausted, the current node stops its own rabbit application.
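The total waiting time follows directly from the number of retries and the per-attempt timeout. The arithmetic is sketched below with a hypothetical `total_wait_seconds` helper; in rabbitmq.conf the two values are, to my knowledge, controlled by `mnesia_table_loading_retry_limit` and `mnesia_table_loading_retry_timeout`, which is worth confirming against the documentation for your version.

```shell
#!/bin/sh
# Total time in seconds that a node waits for its peers:
# number of retries multiplied by the per-attempt timeout in milliseconds.
total_wait_seconds() {
    retries="$1"; timeout_ms="$2"
    echo $(( $retries * $timeout_ms / 1000 ))
}

total_wait_seconds 10 30000   # one round of waiting: prints 300
total_wait_seconds 20 30000   # both rounds seen in the log: prints 600
```

That matches the timestamps in the retry log: each round runs for about five minutes, and the boot fails roughly ten minutes after the first attempt.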
Retry log:
2021-11-27 17:12:21.783 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2021-11-27 17:12:22.206 [debug] <0.2664.0> Lager installed handler lager_backend_throttle into lager_event
2021-11-27 17:12:51.784 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:12:51.784 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 8 retries left
2021-11-27 17:13:21.785 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:13:21.785 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 7 retries left
2021-11-27 17:13:51.786 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:13:51.786 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 6 retries left
2021-11-27 17:14:21.787 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:14:21.787 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 5 retries left
2021-11-27 17:14:51.788 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:14:51.788 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 4 retries left
2021-11-27 17:15:21.789 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:15:21.789 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 3 retries left
2021-11-27 17:15:51.790 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:15:51.790 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 2 retries left
2021-11-27 17:16:21.791 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:16:21.791 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 1 retries left
2021-11-27 17:16:51.792 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}
2021-11-27 17:16:51.792 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 0 retries left
2021-11-27 17:17:21.793 [error] <0.2656.0> Feature flag `quorum_queue`: migration function crashed: {error,{timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_durable_queue]}} [{rabbit_table,wait,3,[{file,"src/rabbit_table.erl"},{line,120}]},{rabbit_core_ff,quorum_queue_migration,3,[{file,"src/rabbit_core_ff.erl"},{line,60}]},{rabbit_feature_flags,run_migration_fun,3,[{file,"src/rabbit_feature_flags.erl"},{line,1602}]},{rabbit_feature_flags,'-verify_which_feature_flags_are_actually_enabled/0-fun-2-',3,[{file,"src/rabbit_feature_flags.erl"},{line,2269}]},{maps,fold_1,3,[{file,"maps.erl"},{line,233}]},{rabbit_feature_flags,verify_which_feature_flags_are_actually_enabled,0,[{file,"src/rabbit_feature_flags.erl"},{line,2267}]},{rabbit_feature_flags,sync_feature_flags_with_cluster,3,[{file,"src/rabbit_feature_flags.erl"},{line,2082}]},{rabbit_mnesia,ensure_feature_flags_are_in_sync,2,[{file,"src/rabbit_mnesia.erl"},{line,647}]}]
2021-11-27 17:17:21.793 [warning] <0.2656.0> Feature flags: the previous instance of this node must have failed to write the `feature_flags` file at `/usr/local/rabbitmq_server/var/lib/rabbitmq/mnesia/rabbit@centos2-feature_flags`:
2021-11-27 17:17:21.793 [warning] <0.2656.0> Feature flags: - list of previously disabled feature flags now marked as such: [empty_basic_get_metric]
2021-11-27 17:17:21.800 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2021-11-27 17:17:51.801 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:17:51.801 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 8 retries left
2021-11-27 17:18:21.802 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:18:21.802 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 7 retries left
2021-11-27 17:18:51.803 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:18:51.803 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 6 retries left
2021-11-27 17:19:21.804 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:19:21.804 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 5 retries left
2021-11-27 17:19:51.805 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:19:51.805 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 4 retries left
2021-11-27 17:20:21.806 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:20:21.806 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 3 retries left
2021-11-27 17:20:51.807 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:20:51.807 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 2 retries left
2021-11-27 17:21:21.808 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:21:21.808 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 1 retries left
2021-11-27 17:21:51.809 [warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]}
2021-11-27 17:21:51.809 [info] <0.2656.0> Waiting for Mnesia tables for 30000 ms, 0 retries left
2021-11-27 17:22:21.813 [info] <0.44.0> Application mnesia exited with reason: stopped
2021-11-27 17:22:21.813 [info] <0.44.0> Application mnesia exited with reason: stopped
2021-11-27 17:22:21.817 [error] <0.2656.0>
2021-11-27 17:22:21.818 [error] <0.2656.0> BOOT FAILED
2021-11-27 17:22:21.818 [error] <0.2656.0> ===========
2021-11-27 17:22:21.818 [error] <0.2656.0> Timeout contacting cluster nodes: [rabbit@centos3,rabbit@centos1].
2021-11-27 17:22:21.818 [error] <0.2656.0>
2021-11-27 17:22:21.818 [error] <0.2656.0> BACKGROUND
2021-11-27 17:22:21.818 [error] <0.2656.0> ==========
2021-11-27 17:22:21.818 [error] <0.2656.0>
2021-11-27 17:22:21.818 [error] <0.2656.0> This cluster node was shut down while other nodes were still running.
2021-11-27 17:22:21.818 [error] <0.2656.0> To avoid losing data, you should start the other nodes first, then
2021-11-27 17:22:21.818 [error] <0.2656.0> start this one. To force this node to start, first invoke
2021-11-27 17:22:21.818 [error] <0.2656.0> "rabbitmqctl force_boot". If you do so, any changes made on other
2021-11-27 17:22:21.818 [error] <0.2656.0> cluster nodes after this one was shut down may be lost.
2021-11-27 17:22:21.818 [error] <0.2656.0>
2021-11-27 17:22:21.818 [error] <0.2656.0> DIAGNOSTICS
2021-11-27 17:22:21.818 [error] <0.2656.0> ===========
2021-11-27 17:22:21.819 [error] <0.2656.0>
2021-11-27 17:22:21.819 [error] <0.2656.0> attempted to contact: [rabbit@centos3,rabbit@centos1]
2021-11-27 17:22:21.819 [error] <0.2656.0>
2021-11-27 17:22:21.819 [error] <0.2656.0> rabbit@centos3:
2021-11-27 17:22:21.819 [error] <0.2656.0> * connected to epmd (port 4369) on centos3
2021-11-27 17:22:21.819 [error] <0.2656.0> * node rabbit@centos3 up, 'rabbit' application not running
2021-11-27 17:22:21.819 [error] <0.2656.0> * running applications on rabbit@centos3: [lager,observer_cli,
2021-11-27 17:22:21.819 [error] <0.2656.0> stdout_formatter,
2021-11-27 17:22:21.819 [error] <0.2656.0> gen_batch_server,aten,cuttlefish,
2021-11-27 17:22:21.819 [error] <0.2656.0> inets,credentials_obfuscation,
2021-11-27 17:22:21.820 [error] <0.2656.0> recon,ranch,jsx,goldrush,xmerl,
2021-11-27 17:22:21.820 [error] <0.2656.0> tools,syntax_tools,ssl,
2021-11-27 17:22:21.820 [error] <0.2656.0> public_key,asn1,crypto,compiler,
2021-11-27 17:22:21.820 [error] <0.2656.0> sasl,stdlib,kernel]
2021-11-27 17:22:21.820 [error] <0.2656.0> * suggestion: use rabbitmqctl start_app on rabbit@centos3
2021-11-27 17:22:21.820 [error] <0.2656.0> rabbit@centos1:
2021-11-27 17:22:21.820 [error] <0.2656.0> * connected to epmd (port 4369) on centos1
2021-11-27 17:22:21.923 [error] <0.2656.0> * node rabbit@centos1 up, 'rabbit' application not running
2021-11-27 17:22:21.924 [error] <0.2656.0> * running applications on rabbit@centos1: [lager,observer_cli,
2021-11-27 17:22:21.924 [error] <0.2656.0> stdout_formatter,
2021-11-27 17:22:21.924 [error] <0.2656.0> gen_batch_server,aten,cuttlefish,
2021-11-27 17:22:21.924 [error] <0.2656.0> inets,credentials_obfuscation,
2021-11-27 17:22:21.924 [error] <0.2656.0> recon,ranch,jsx,goldrush,xmerl,
2021-11-27 17:22:21.924 [error] <0.2656.0> tools,syntax_tools,ssl,
2021-11-27 17:22:21.924 [error] <0.2656.0> public_key,asn1,crypto,compiler,
2021-11-27 17:22:21.924 [error] <0.2656.0> sasl,stdlib,kernel]
2021-11-27 17:22:21.924 [error] <0.2656.0> * suggestion: use rabbitmqctl start_app on rabbit@centos1
2021-11-27 17:22:21.924 [error] <0.2656.0>
2021-11-27 17:22:21.924 [error] <0.2656.0> Current node details:
2021-11-27 17:22:21.924 [error] <0.2656.0> * node name: rabbit@centos2
2021-11-27 17:22:21.924 [error] <0.2656.0> * effective user's home directory: /root
2021-11-27 17:22:21.924 [error] <0.2656.0> * Erlang cookie hash: tCXB8mlCcGEGGV1cYRkQCg==
2021-11-27 17:22:21.924 [error] <0.2656.0>
2021-11-27 17:22:21.924 [error] <0.2656.0>
2021-11-27 17:22:22.926 [info] <0.2655.0> [{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.2655.0>},{registered_name,[]},{error_info,{exit,{{timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]},{rabbit,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,138}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}},{ancestors,[<0.2654.0>]},{message_queue_len,1},{messages,[{'EXIT',<0.2656.0>,normal}]},{links,[<0.2654.0>,<0.44.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,1598},{stack_size,28},{reductions,368}], []
2021-11-27 17:22:22.926 [error] <0.2655.0> CRASH REPORT Process <0.2655.0> with 0 neighbours exited with reason: {{timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]},{rabbit,start,[normal,[]]}} in application_master:init/4 line 138
2021-11-27 17:22:22.927 [info] <0.44.0> Application rabbit exited with reason: {{timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]},{rabbit,start,[normal,[]]}}
2021-11-27 17:22:22.927 [info] <0.44.0> Application rabbit exited with reason: {{timeout_waiting_for_tables,[rabbit@centos3,rabbit@centos2,rabbit@centos1],[rabbit_user,rabbit_user_permission,rabbit_topic_permission,rabbit_vhost,rabbit_durable_route,rabbit_durable_exchange,rabbit_runtime_parameters,rabbit_durable_queue]},{rabbit,start,[normal,[]]}}
2021-11-27 17:22:22.928 [info] <0.44.0> Application sysmon_handler exited with reason: stopped
2021-11-27 17:22:22.928 [info] <0.44.0> Application sysmon_handler exited with reason: stopped
2021-11-27 17:22:22.932 [info] <0.44.0> Application ra exited with reason: stopped
2021-11-27 17:22:22.932 [info] <0.44.0> Application ra exited with reason: stopped
2021-11-27 17:22:22.933 [info] <0.44.0> Application os_mon exited with reason: stopped
2021-11-27 17:22:22.933 [info] <0.44.0> Application os_mon exited with reason: stopped
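A log this long is easier to judge from a short summary. The sketch below counts the table-wait timeouts and reports whether the "BOOT FAILED" banner appears; it relies only on the message texts visible in the log above, and `summarise_boot` is a hypothetical helper name.

```shell
#!/bin/sh
# Summarise a RabbitMQ log read on stdin: number of Mnesia table-wait
# timeouts and whether the boot ultimately failed.
summarise_boot() {
    awk '
        /Error while waiting for Mnesia tables/ { timeouts++ }
        /BOOT FAILED/                           { failed = 1 }
        END { printf "timeouts=%d boot_failed=%s\n", timeouts, (failed ? "yes" : "no") }
    '
}

# Demo with three representative lines; on a node the log file would be piped in.
printf '%s\n' \
    '[warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[]}' \
    '[warning] <0.2656.0> Error while waiting for Mnesia tables: {timeout_waiting_for_tables,[]}' \
    '[error] <0.2656.0> BOOT FAILED' \
    | summarise_boot
# prints: timeouts=2 boot_failed=yes
```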
Error returned on failure
[root@centos2 ~]# rabbitmqctl start_app
Starting node rabbit@centos2 ...
Error:
{:rabbit, {{:timeout_waiting_for_tables, [:rabbit@centos3, :rabbit@centos2, :rabbit@centos1], [:rabbit_user, :rabbit_user_permission, :rabbit_topic_permission, :rabbit_vhost, :rabbit_durable_route, :rabbit_durable_exchange, :rabbit_runtime_parameters, :rabbit_durable_queue]}, {:rabbit, :start, [:normal, []]}}}
Remove a node
If the node that was shut down last cannot be started because of some fault, you can use the rabbitmqctl forget_cluster_node command to remove it from the cluster.
If all nodes in the cluster were shut down abnormally, every node believes that it was not the last one to shut down. In that case, call the rabbitmqctl force_boot command on one node and start it; the cluster can then start normally.
[root@centos2 ~]# rabbitmqctl force_boot
Forcing boot for Mnesia dir /usr/local/rabbitmq_server/var/lib/rabbitmq/mnesia/rabbit@centos2
[root@centos2 ~]# rabbitmq-server -detached
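The two recovery paths can be written down as small wrappers. As before this is a sketch: `forget_node` and `force_start` are hypothetical function names, and the optional last argument (`echo`) is a dry-run prefix that only prints the commands. `forget_node` is meant to be run on a node that is still a running member of the cluster, not on the node being removed.

```shell
#!/bin/sh
# Remove an unrecoverable node from the cluster (run on a healthy member).
forget_node() {
    dead_node="$1"
    runner="${2:-}"          # empty: execute for real; "echo": dry run
    $runner rabbitmqctl forget_cluster_node "$dead_node"
}

# Force the local node to boot without waiting for its peers, then start it.
force_start() {
    runner="${1:-}"
    $runner rabbitmqctl force_boot &&
    $runner rabbitmq-server -detached
}

# Dry runs: show the commands without executing them.
forget_node rabbit@centos2 echo
force_start echo
```

As the BOOT FAILED message in the log warns, forcing a boot can lose changes made on other nodes after this one was shut down, so it is a last resort rather than a routine restart step.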
Summary
That is the whole process of configuring a multi-machine, multi-node cluster. Above all, keep the startup order of the nodes in mind: always start the node that was shut down last first, to avoid startup failures.