RabbitMQ High Availability: A Comprehensive Guide 101

May 26th, 2022

Organizations increasingly build applications from microservices because individual services are easy to change or extend. To keep these services working together, organizations use the RabbitMQ Message Broker to pass information between applications. RabbitMQ runs as a set of nodes in a cluster. If a node in the cluster fails, for example because of a network connectivity problem, the messages held on that node can be lost.

To avoid such loss, RabbitMQ offers Mirrored Queues. A Mirrored Queue keeps copies of a queue on several nodes, which gives you high availability of messages in the RabbitMQ Cluster. As a result, even if a node in the cluster fails, client applications can keep communicating through a mirror of the queue, and the messages are not lost.

In this tutorial, you will learn about RabbitMQ High Availability in great detail! Let’s get started!

Prerequisites

  • Basic understanding of the need for high availability.

What is RabbitMQ?

RabbitMQ high availability: RabbitMQ logo
Image credit: RabbitMQ

Developed in 2007, RabbitMQ is a popular open-source Message Broker written in Erlang. It is lightweight and is used by organizations of every size, from small startups to large enterprises. It can be deployed easily both on-premises and in the cloud, and it also supports federated and distributed configurations to provide high availability and scalability.

RabbitMQ runs on a wide range of operating systems and has client libraries for programming languages such as Java, .NET, Python, JavaScript, Ruby, Go, and more, so applications written in different languages can exchange messages. It can also be deployed with tools such as BOSH, Chef, Docker, and Puppet.

Key Concepts of RabbitMQ

  • Producer: The Producer sends messages to the broker. Messages reach a queue through an exchange, typically based on the queue name.
  • Queue: The Queue is a sequential data structure in which messages are stored until they are delivered.
  • Consumer: The Consumer subscribes to a queue and receives messages from the broker.
  • Exchange: The Exchange is the entry point of the broker. It takes messages from the producer and routes them to the appropriate queue.
  • Broker: The Message Broker stores the produced messages until they are consumed by another application connected to the broker.
  • Channel: A Channel is a lightweight virtual connection to the broker that shares a single TCP connection with other channels.

What are the Key Features of RabbitMQ?

1) Lightweight

RabbitMQ is lightweight: the application core and plugins such as the Management UI need less than 40MB of RAM to run. However, memory usage grows as messages accumulate in queues.

2) Third-party plugins

RabbitMQ has a flexible plugin system, including third-party plugins for storing messages in databases or using RabbitMQ for database writes.

3) Open-source

RabbitMQ was originally developed as an open-source project by Rabbit Technologies Ltd., a joint venture between CohesiveFT and LShift. It later became part of Pivotal Software, which is now part of VMware, and it is distributed under the Mozilla Public License. Because RabbitMQ is open source, you get the flexibility of an open code base together with the backing of a commercial vendor.

4) Flexibility in Controlling Messaging Trade-offs

RabbitMQ lets users control the trade-offs between message reliability, throughput, and performance. Messages can specify whether they should be saved to disk before delivery. Queues in a cluster can span multiple servers so that no messages are lost if a server fails.

5) Distributed Deployment

RabbitMQ can be deployed as clusters for high availability and throughput. Clusters can also be federated across multiple availability zones and regions.

6) Asynchronous Messaging

RabbitMQ supports multiple messaging protocols, message queuing, delivery acknowledgement, flexible routing to queues, and several exchange types.

7) Design with Minimal Message Loss

RabbitMQ is known for a design that minimizes message loss. For this reason, it is used in many critical projects where message loss is not acceptable.

What is RabbitMQ High Availability?

The essential requirement for RabbitMQ High Availability is a cluster setup. High availability is achieved by mirroring queues across different RabbitMQ nodes, and mirroring only works between nodes that belong to the same cluster. You therefore need to set up a cluster before you can set up RabbitMQ High Availability.

When you declare a queue by connecting to a node in the cluster, based on the attribute you pass, a queue will be created on any one of the nodes in the cluster. The node where the queue resides is called the Master Node for that particular queue. Later, when you set up RabbitMQ High Availability for that particular queue, a Mirrored Queue will be created in the other nodes of the cluster.

The messages in the master queue are synchronized with the mirror queues on the other nodes. Although mirror queues exist, all publish and consume requests are directed only to the master queue. Even if a publisher sends a request to a node that holds only a mirror, the request is forwarded to the node that hosts the master queue. RabbitMQ does this to maintain message integrity between the master queue and its mirrors.

How to Set up RabbitMQ High Availability?

RabbitMQ clusters provide replication, failover, and high availability features that protect your messaging from infrastructure failures and from downtime during maintenance and upgrades. A RabbitMQ High Availability deployment typically consists of three nodes, where users, virtual hosts, queues, exchanges, bindings, runtime parameters, and more are shared across all nodes. (This tutorial uses two nodes to keep the setup small.) If a node fails or is deleted, its replacement is resynchronized with the cluster.

Each node in the cluster is identified by a node name that consists of a prefix and a hostname, for example rabbit@node1.rabbit. The hostname must be unique within the cluster and must be resolvable by the other nodes, for instance through the Domain Name System (DNS).
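The prefix and hostname parts can be illustrated with a small shell sketch. It uses the node name rabbit@node1.rabbit that appears later in this tutorial; split_node_name is a hypothetical helper written for this example, not a RabbitMQ command.

```shell
#!/bin/sh
# split_node_name is a hypothetical helper, not part of RabbitMQ.
# It splits a node name of the form prefix@hostname into its two parts.
split_node_name() {
    node_name="$1"
    prefix="${node_name%%@*}"   # text before the first "@"
    host="${node_name#*@}"      # text after the first "@"
    printf '%s %s\n' "$prefix" "$host"
}

split_node_name "rabbit@node1.rabbit"
```

Run as written, the script prints "rabbit node1.rabbit". The second part is the hostname that the other nodes in the cluster have to be able to resolve.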

Before starting with RabbitMQ High Availability, it is assumed that you are already familiar with decoupling communication with RabbitMQ.

Follow the steps below to set up RabbitMQ High Availability:

Step 1: Define a Docker bridge network, which provides automatic DNS resolution between containers.

docker network create --subnet=192.168.0.0/16 cluster-network

Step 2: Create the cluster nodes as two Docker containers that run RabbitMQ, using the commands below.

# First node: fixed IP 192.168.0.10, default ports published on the host.
docker run -d -h node1.rabbit \
           --net cluster-network --ip 192.168.0.10 \
           --name rabbitNode1 \
           --add-host node2.rabbit:192.168.0.11 \
           -p "4369:4369" \
           -p "5672:5672" \
           -p "15672:15672" \
           -p "25672:25672" \
           -p "35672:35672" \
           -e "RABBITMQ_USE_LONGNAME=true" \
           -e RABBITMQ_ERLANG_COOKIE="cookie" \
           rabbitmq:3-management

# Second node: fixed IP 192.168.0.11, host ports shifted by one to avoid conflicts.
docker run -d -h node2.rabbit \
           --net cluster-network --ip 192.168.0.11 \
           --name rabbitNode2 \
           --add-host node1.rabbit:192.168.0.10 \
           -p "4370:4369" \
           -p "5673:5672" \
           -p "15673:15672" \
           -p "25673:25672" \
           -p "35673:35672" \
           -e "RABBITMQ_USE_LONGNAME=true" \
           -e RABBITMQ_ERLANG_COOKIE="cookie" \
           rabbitmq:3-management

Output:

RabbitMQ high availability: create cluster nodes
Image credit: blexin

At this point, the two nodes are still separate entities and do not yet form a cluster.

A) RabbitNode1

RabbitMQ high availability: cluster node 1
Image credit: blexin

B) RabbitNode2

RabbitMQ high availability: cluster node 2
Image credit: blexin

Step 3: To join the two nodes in the same cluster, ensure that the following ports are accessible.

  • 4369: It is used by epmd, a peer discovery service used by RabbitMQ nodes and CLI tools. Peer discovery locates nodes or peers for data communication in a peer-to-peer network.
  • 5672: It is the port used by the AMQP protocol.
  • 25672: It is used for communication between nodes and with CLI tools.
  • 35672-35682: It is the port range used by CLI tools for communication with nodes.
  • 15672: It is used by the management UI.

Among the configured environment variables is RABBITMQ_ERLANG_COOKIE. Its value is a shared secret key: two nodes of a cluster can interact with each other only if they use the same cookie, which is why both containers are started with the same value.
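The literal value "cookie" is fine for a local demo. As a minimal sketch, assuming you would rather use a value that is harder to guess, the script below generates a random cookie that can be passed to both containers. generate_cookie is a hypothetical helper written for this example.

```shell
#!/bin/sh
# generate_cookie is a hypothetical helper, not part of RabbitMQ or Docker.
# It prints 32 random bytes as 64 lowercase hexadecimal characters.
generate_cookie() {
    head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

cookie="$(generate_cookie)"
echo "Generated a cookie of ${#cookie} characters"

# Every node must receive the same value, for example:
#   docker run ... -e RABBITMQ_ERLANG_COOKIE="$cookie" ... rabbitmq:3-management
```

The value is generated once and reused for every node, because nodes with different cookies cannot join the same cluster.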

Step 4: Stop the RabbitMQ application on the rabbitNode2 node.

docker exec rabbitNode2 rabbitmqctl stop_app

Then join the two nodes.

docker exec rabbitNode2 rabbitmqctl join_cluster rabbit@node1.rabbit

Output:

RabbitMQ high availability: joining two nodes
Image credit: blexin

Step 5: Restart the RabbitMQ application on rabbitNode2.

docker exec rabbitNode2 rabbitmqctl start_app

Output:

RabbitMQ high availability: restart application
Image credit: blexin

Although the management UI is convenient, you can also obtain information about the cluster from the command line by running the command below.

docker exec rabbitNode1 rabbitmqctl cluster_status

Output:

RabbitMQ high availability: RabbitMQ information on cluster
Image credit: blexin
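The commands from Steps 4 and 5 can be collected into one small script so that the join stops at the first failing command. This is a sketch rather than a definitive procedure: join_node and the DOCKER variable are hypothetical names introduced here, while the rabbitmqctl subcommands are the ones used above.

```shell
#!/bin/sh
# DOCKER can be overridden (for example DOCKER=echo) to print the commands
# instead of running them.
DOCKER="${DOCKER:-docker}"

# join_node is a hypothetical helper, not a RabbitMQ or Docker command.
# $1: container that should join, $2: node name of a node already in the cluster.
join_node() {
    container="$1"
    seed_node="$2"
    # Stop the RabbitMQ application inside the container.
    "$DOCKER" exec "$container" rabbitmqctl stop_app || return 1
    # Join the cluster that the seed node belongs to.
    "$DOCKER" exec "$container" rabbitmqctl join_cluster "$seed_node" || return 1
    # Start the application again, now as a cluster member.
    "$DOCKER" exec "$container" rabbitmqctl start_app
}

# Example call, matching the containers created in Step 2:
#   join_node rabbitNode2 rabbit@node1.rabbit
```

Setting DOCKER=echo before calling join_node gives a dry run that only prints the three commands, which is a convenient way to check the arguments before touching a real cluster.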

Step 6: You can now send a message to your cluster with the Sender application. Once the application has run, open the management UI of both nodes: each one shows that the queue has been created and that the message has been enqueued. Both nodes list the queue because they belong to the same cluster.

A) RabbitNode1

RabbitMQ high availability: info on node 1
Image credit: blexin

B) RabbitNode2

RabbitMQ high availability: info on node 2
Image credit: blexin

If you stop one of the nodes in the cluster (RabbitNode1) while the message is in the queue, the management UI of RabbitNode2 shows that RabbitNode1 is not running.

RabbitMQ high availability: no. of queues in the application
Image credit: blexin
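The node failure described above can be reproduced from the command line. The sketch below assumes the container names from Step 2; simulate_node_failure and the DOCKER variable are hypothetical names introduced for this example.

```shell
#!/bin/sh
# DOCKER can be overridden (for example DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# simulate_node_failure is a hypothetical helper.
# $1: container to stop, $2: container that stays up and is queried.
simulate_node_failure() {
    # Stop the whole container, which takes the node down.
    "$DOCKER" stop "$1" || return 1
    # Ask the surviving node which nodes it still sees as running.
    "$DOCKER" exec "$2" rabbitmqctl cluster_status || return 1
    # List the queues and their message counts as seen from the surviving node.
    "$DOCKER" exec "$2" rabbitmqctl list_queues name messages
}

# Example call:
#   simulate_node_failure rabbitNode1 rabbitNode2
```

To bring the stopped node back afterwards, run docker start rabbitNode1.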

A RabbitMQ cluster replicates the data and state that the broker needs to operate. The contents of a queue, however, are located on a single node by default, so the termination of RabbitNode1 results in losing the message that was sent. To avoid such data loss, RabbitMQ allows you to create highly available queues called Mirrored Queues.

Step 7: To mirror the queues of the cluster, define a policy. A policy applies to all queues whose names match a pattern, which is given as a regular expression. Run the command below.

docker exec rabbitNode1 rabbitmqctl set_policy ha "." '{"ha-mode":"all"}'

  • ha: It is the name of the policy.
  • ".": It is the pattern, a regular expression that matches any queue name.
  • ha-mode: It is the mirroring mode. Setting it to "all" mirrors every matching queue to all nodes in the cluster.
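To see how the pattern selects queues, the sketch below tests a few queue names against two patterns. The queue names and the pattern_matches helper are hypothetical, and grep -E only approximates the regular expression engine that RabbitMQ itself uses, which should be close enough for simple patterns like these.

```shell
#!/bin/sh
# pattern_matches is a hypothetical helper: it succeeds when the queue name
# in $2 matches the regular expression in $1.
pattern_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Hypothetical queue names, checked against "." and a narrower pattern.
for queue in demo-queue ha.orders orders; do
    for pattern in "." "^ha\."; do
        if pattern_matches "$pattern" "$queue"; then
            echo "pattern '$pattern' matches queue '$queue'"
        fi
    done
done
```

The pattern "." matches every name, so the policy above mirrors all queues. To mirror only queues whose names start with "ha.", you could pass "^ha\." as the pattern in the set_policy command instead.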

Step 8: Create the queue again, this time as a highly available queue, by starting the Sender application. In the queue section of the management UI, you can check the details of the queue.

RabbitMQ high availability: queue demo stats
Image credit: blexin
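The same details can also be read from the command line. The sketch below assumes the classic mirrored queue info items of RabbitMQ 3.x (policy, pid, slave_pids, synchronised_slave_pids); show_mirrors and the DOCKER variable are hypothetical names introduced for this example.

```shell
#!/bin/sh
# DOCKER can be overridden (for example DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

# show_mirrors is a hypothetical helper.
# $1: container in which rabbitmqctl is executed.
show_mirrors() {
    # For each queue, print its name, the applied policy, the process of the
    # master queue, and the processes of its mirrors.
    "$DOCKER" exec "$1" rabbitmqctl list_queues name policy pid slave_pids synchronised_slave_pids
}

# Example call:
#   show_mirrors rabbitNode2
```

A queue that is covered by the ha policy should list that policy and at least one mirror process on the other node.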

If you now stop the RabbitNode1 container, the node is reported as not running, but unlike before, the queue is still available. Therefore, the message is not lost.

RabbitMQ high availability: no. of queues after update
Image credit: blexin

Step 9: To make sure the message is received even when a node is down, make a small change to your consumer application (the Receiver).

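using System.Collections.Generic;
using RabbitMQ.Client;

// Candidate endpoints: the AMQP ports published by rabbitNode1 and rabbitNode2.
var endPointList = new List<AmqpTcpEndpoint>
{
    new AmqpTcpEndpoint("localhost", 5672),
    new AmqpTcpEndpoint("localhost", 5673)
};
var factory = new ConnectionFactory();
// The client connects to an endpoint from the list that is available.
using (var connection = factory.CreateConnection(endPointList))
{
    // Create the channel and consume messages here, as in the existing Receiver.
}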

Step 10: The code above defines a list of endpoints (the nodes of the cluster), specifying the port that each node exposes for the AMQP protocol. The list is passed as a parameter to the CreateConnection method, which is provided by the RabbitMQ client for .NET. The client determines which endpoint is available and uses it to connect to the broker.

Run the Receiver application.

RabbitMQ high availability: receiver application
Image credit: blexin
RabbitMQ high availability: stats of receiver application
Image credit: blexin

Therefore, you can enable RabbitMQ’s High Availability between nodes in clusters by using the concept of Mirrored Queues.

Conclusion

In this article, you learned about deploying RabbitMQ High Availability using Docker containers. This article also covered the basic concepts and features of RabbitMQ. Organizations use RabbitMQ because it is open source and provides a reliable place for messages in microservice applications. Many well-known organizations, such as Robinhood, Reddit, Accenture, Stack, CircleCI, Tech Stack, Alibaba Travels, and more, use RabbitMQ.

Organizations capture valuable data points from a variety of Data Sources. However, transferring data from these sources into a Data Warehouse for a holistic analysis is a demanding task. It requires you to write and maintain complex functions to achieve a smooth flow of data. An Automated Data Pipeline helps solve this issue, and this is where Hevo comes into the picture. Hevo Data is a No-code Data Pipeline with 100+ pre-built Integrations that you can choose from.

Hevo can help you integrate data from 100+ data sources and load them into a destination to analyze real-time data at an affordable price. It will make your life easier and Data Migration hassle-free. It is user-friendly, reliable, and secure.

Share your experience of learning about RabbitMQ High Availability in the comments section below.
