RabbitMQ Durable Queue: A Comprehensive Guide 101

on Message Queue, message queuing, RabbitMQ • May 18th, 2022 • Write for Hevo

rabbitmq durable: FI

RabbitMQ is a free and open-source message broker and queueing software that receives messages from producers and distributes them to customers. It acts as a middleman, relieving pressure on web application servers while maximizing data transmission speed.

RabbitMQ Durable queues are those that can withstand a RabbitMQ restart. If a queue is not durable, all messages will be lost if RabbitMQ is shut down for any reason. For messages to survive restarts, both of these configurations must be true.

This article talks about RabbitMQ Durable queues extensively. In addition to that, it also gives a brief introduction to RabbitMQ.

Table Of Contents

What is RabbitMQ?

rabbitmq durable: rabbitmq logo
Image Source

RabbitMQ is an open-source message broker software (also known as message-oriented middleware) that was developed to support the Advanced Message Queuing Protocol (AMQP) and has since been extended with a plug-in architecture to support Simple (or Streaming) Text Oriented Message Protocol(STOMP), Message Query Telemetry Transport (MQTT), and other protocols.

The RabbitMQ server is written in Erlang and uses the Open Telecom Platform framework for clustering and failover. All major programming languages have client libraries for interacting with the broker and The Mozilla Public License applies to the source code.

It is a lightweight messaging system that can be deployed on-premises or in the cloud. Multiple messaging protocols are supported. To meet high-scale, high-availability requirements, It can be deployed in distributed and federated configurations.

Producer, Exchange, Queue, and Consumer are the four main components of RabbitMQ.

rabbitmq durable: rabbitmq architecture
Image Source
  • Producer: Messages are pushed to exchanges by a producer. Messages should not be sent at a rate that exceeds the Consumers’ ability to process them. It’s also in charge of generating routing keys.
  • Exchange: It is essentially a message routing rule. Binding is now required for a message to travel to a queue or a different exchange from the producers.
  • Queue: It’s a storage buffer for messages. Queues are given names to make it easier for applications to find them. Applications can choose their queue names or ask the broker to do so. Because such names are reserved by the broker for internal use, queue names can be up to 255 bytes of UTF-8 characters and cannot begin with ‘amq.’
  • Consumer: It takes messages from the queues and reads them. A pre-fetch limit is something that each customer can set (Otherwise known as QoS limit). This value represents the maximum number of unacknowledged messages the Consumer can handle at any given time.
rabbitmq durable: rabbitmq model
Image Source

Simplify ETL Using Hevo’s No-Code Data Pipeline

Hevo Data, a Fully-managed Data Pipeline platform, can help you automate, simplify & enrich your data replication process in a few clicks. With Hevo’s wide variety of connectors and blazing-fast Data Pipelines, you can extract & load data from 100+ Data Sources straight into Data Warehouses, or any Databases. To further streamline and prepare your data for analysis, you can process and enrich raw granular data using Hevo’s robust & built-in Transformation Layer without writing a single line of code!

GET STARTED WITH HEVO FOR FREE

Hevo is the fastest, easiest, and most reliable data replication platform that will save your engineering bandwidth and time multifold. Try our 14-day full access free trial today to experience an entirely automated hassle-free Data Replication!

Key Features of RabbitMQ

  • In-built Clustering: RabbitMQ’s clustering was designed with two purposes in mind. In case one node fails, it still allows the consumers and producers to keep functioning in the event and scaling messaging throughput linearly by adding additional nodes.
  • Flexible Routing: RabbitMQ offers several built-in exchange types for routing. For typical routing, messages are routed through exchanges even before arriving at queues. And, for complex routing, users can bind exchanges together or even write their exchange type as a plugin.
  • Reliability: Persistence, Delivery Feedback, Publisher Confirmation, and High Availability are prominent features of RabbitMQ that directly impact performance.
  • Security: RabbitMQ provides security on various tiers. Enforcing SSL-only communication and Client Certificate Checking can help secure client connections. User access may be controlled at the virtual host, ensuring high-level isolation of messages.

Understanding RabbitMQ Durable Queue Aspects

RabbitMQ Durable Queue Aspects: What is a Queue?

A queue is a sequential data structure that allows items to be enqueued (added) at the tail and dequeued (consumed) at the head. In the world of messaging technology, queues play an important role: many protocols and tools assume that publishers and consumers communicate using a queue-like storage mechanism.

  • FIFO Queues are used in RabbitMQ (“first-in, first-out”). Some queue features, such as priorities and consumer requeuing, can influence the ordering as seen by customers. Applications can refer to queues using their names.
  • Queue names can be chosen by applications or generated by the broker. Queue names can contain up to 255 UTF-8 characters.
  • The broker reserves queue names that begin with “amq.” for internal use. If you try to create a queue with a name that breaks this rule, you’ll get a 403 (ACCESS REFUSED) channel-level exception.
  • Queues have characteristics that define their behavior. There is a list of required properties and a list of optional properties:
    • RabbitMQ Durable Queue as a name (the RabbitMQ durable queue will survive a broker restart)
    • Exceptional (used by only one connection and the queue will be deleted when that connection closes)
    • Auto-remove (queue that has had at least one consumer is deleted when the last consumer unsubscribes)
    • Disputes (optional; used by plugins and broker-specific features such as message TTL, queue length limit, etc)
  • In practice, not all property combinations make sense. Auto-delete and exclusive queues, for example, should have server names. Client-specific or connection (session)-specific data is supposed to go into these queues.
  • When Auto-Delete or exclusive queues use well-known (static) names, there will be a natural race condition between RabbitMQ nodes that delete such queues and recover clients that try to re-declare them in the event of client disconnection and immediate reconnection. This can lead to connection recovery failures or exceptions on the client-side, causing unnecessary confusion and affecting application availability.
  • It is necessary to declare a queue before using it. If a queue isn’t already present, declaring one will create one. If the queue already exists and its attributes are identical to those in the declaration, the declaration will have no effect. A channel-level exception with code 406 (PRECONDITION FAILED) will be raised if the existing queue attributes differ from those in the declaration.
  • Optional queue arguments, also known as “x-arguments” in the AMQP 0-9-1 protocol due to their field name, are a map (dictionary) of arbitrary key/value pairs that clients can provide when a queue is declared.
  • Several features and plugins rely on the map, including
    • Type of Queuing (e.g. quorum or classic)
    • TTL Limit for messages and queues
    • Queue Mirroring options from the past
    • Maximum Priorities
  • However, there are some exceptions. For example, queue type (x-queue-type) and the maximum number of queue priorities (x-max-priority) must be specified at the time of queue declaration and cannot be changed later.
  • There are two ways to set optional queue arguments:
    • Using Policies, to Groups of Queues (recommended): When a client declares a queue, the former option is more flexible, non-intrusive, and does not necessitate application modifications or redeployments. As a result, it is highly recommended for the majority of users. Because some optional arguments, such as queue type or a maximum number of priorities, cannot be dynamically changed and must be known at declaration time, they can only be provided by clients.
    • Using Function Method: Clients provide optional arguments in a variety of ways, but they are usually an argument alongside the RabbitMQ durable, auto-delete, and other arguments of the function (method) that declares queues.

What makes Hevo’s ETL Process Best-In-Class

Providing a high-quality ETL solution can be a difficult task if you have a large volume of data. Hevo’s automated, No-code platform empowers you with everything you need to have for a smooth data replication experience.

Check out what makes Hevo amazing:

  • Fully Managed: Hevo requires no management and maintenance as it is a fully automated platform.
  • Data Transformation: Hevo provides a simple interface to perfect, modify, and enrich the data you want to transfer.
  • Faster Insight Generation: Hevo offers near real-time data replication so you have access to real-time insight generation and faster decision making. 
  • Schema Management: Hevo can automatically detect the schema of the incoming data and map it to the destination schema.
  • Scalable Infrastructure: Hevo has in-built integrations for 100+ sources (with 40+ free sources) that can help you scale your data infrastructure as required.
  • Live Support: Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
Sign up here for a 14-day free trial!

RabbitMQ Durable Queue  Aspects: Message Ordering

  • RabbitMQ queues are collections of messages that are arranged in a specific order. 
  • Priority and sharded queues do not always follow FIFO ordering.
  • Multiple Competing Customers, Consumer Priorities, and Message Redeliveries can all influence order placement. This is true for all redeliveries, including automatic after-channel-closure redeliveries and negative consumer acknowledgments.
  • Messages published on a single channel are assumed to be enqueued in publishing order in all queues to which they are routed. Message sequences will be routed concurrently and interleaved when publishing occurs on multiple connections or channels.
  • Initial deliveries to a single consumer (those with the redelivered property set to false) can be assumed to be performed in the same FIFO order in which they were enqueued by consuming applications. Original ordering can be affected by the timing of consumer acknowledgments and redeliveries, so it is not guaranteed for repeated deliveries (the redelivered property is set to true).
  • Messages will be dequeued for delivery in FIFO order if there are multiple consumers, but actual delivery will be split among them. If all of the customers have the same priorities, a round-robin selection process will be used. Consumers on channels with a prefetch value (the number of outstanding unacknowledged deliveries) of less than $1000 will be considered.

RabbitMQ Durable Queue Aspects: Durability and Durable Storage

  • Queues can be permanent or temporary. A RabbitMQ durable queue’s metadata is stored on a disc, whereas a transient queue’s metadata is stored in memory whenever possible. Some protocols, such as AMQP 0-9-1 and MQTT, make the same distinction for messages at publishing time.
  • Applications must use RabbitMQ durable queues and ensure that published messages are marked as persisted in environments and use cases where durability is important.
  • When the node boots up, the transient queues will be removed. As a result, they aren’t designed to survive a node restart. Transient queued messages will be deleted as well.
  • RabbitMQ Durable queues, including messages published as persistent, will be recovered when the node boots. Even if they were saved in RabbitMQ durable queues, messages published as transient will be discarded during recovery.

How to Choose Queues

  • RabbitMQ Durable Queues are recommended in the vast majority of cases. Only RabbitMQ durable queues make sense for replicated queues.
  • In most cases, whether a queue is durable or not has no impact on its Throughput or Latency. Only environments with an extremely high queue or binding churn — that is, where queues are deleted and re-declared hundreds or thousands of times per second — will see latency improvements for some operations, specifically bindings. The semantics of the use case determines whether a queue is durable or transient.
  • For workloads with transient clients, such as temporary WebSocket connections in User Interfaces, Mobile Applications, and devices that are expected to go offline or use switch identities, temporary queues can be a good choice. Such clients usually have a transient state that needs to be replaced when the client reconnects.
  • Transient Queues aren’t supported by all queue types. The assumptions and requirements of the underlying replication protocol, for example, necessitate the durability of Quorum Queues.
  • The length of a queue can be set and TTLs exist for queues and messages.
  • Both features can be used to prevent data from expiring and to limit the number of resources (RAM, disc space) that a queue can use at any given time, for example, when consumers go offline or their throughput falls behind publishers.
  • Messages are stored in RAM and/or on disc in queues. This is partially controlled by the client in some protocols (e.g. AMQP 0-9-1). This is done using a message property (delivery mode or, in some clients, persistent) in AMQP 0-9-1.
  • RabbitMQ should keep as many messages in RAM as possible when publishing messages as transient. Queues, on the other hand, will page even transient messages to disc if they run out of memory.
  • Messages routed to RabbitMQ durable queues are persisted in batches or after a set period has passed (fraction of a second).
  • Regardless of their persistence property, the lazy queue pages message out to disc more aggressively.

RabbitMQ Durable Queue Aspects: Durable Work Queues

Round-Robin Dispatching

  • The ability to easily parallelize work is one of the benefits of using a Task Queue. If you have a backlog of work, you can simply hire more workers and scale up quickly.
  • Here try running two worker.py scripts at the same time first. They’ll both receive messages from the queue.
  • Three consoles must be open. The worker.py script will be run by two people. Your two consumers, C1 and C2, will be these consoles.
# shell 1
python worker.py
# => [*] Waiting for messages. To exit press CTRL+C
# shell 2
python worker.py
# => [*] Waiting for messages. To exit press CTR
  • You’ll publish new tasks in the third one. After you’ve started the customers, you can send them the following messages:
# shell 3
python new_task.py First message.
python new_task.py Second message..
python new_task.py Third message...
python new_task.py Fourth message....
python new_task.py Fifth message....
  • Here take a look at what your employees receive:
# shell 1
python worker.py
# => [*] Waiting for messages. To exit press CTRL+C
# => [x] Received 'First message.'
# => [x] Received 'Third message...'
# => [x] Received 'Fifth message.....'
# shell 2
python worker.py
# => [*] Waiting for messages. To exit press CTRL+C
# => [x] Received 'Second message..'
# => [x] Received 'Fourth message....'
  • RabbitMQ sends each message in sequence to the next consumer by default. Each customer will receive the same number of messages on average. Round-robin is the term for this method of distributing messages.

Message Acknowledgment

  • A task can be completed in a matter of seconds. You might be wondering what happens if one of the customers starts a long task and dies halfway through it. When RabbitMQ delivers a message to the consumer, it is immediately marked for deletion in your current code. If you kill a worker in this case, you’ll lose the message it was just processing. You’ll also lose all of the messages that were sent to this work but have yet to be processed.
  • However, you do not want to lose any tasks. If one of your employees dies, you’d like the task to be assigned to another employee.
  • Message Acknowledgments are supported by RabbitMQ to ensure that a message is never lost. The consumer sends an ack(acknowledgment) to RabbitMQ to inform it that a specific message has been received, processed, and RabbitMQ is free to delete it.
  • RabbitMQ will understand that a message wasn’t fully processed and will re-queue it if a consumer dies (its channel is closed, the connection is closed, or TCP connection is lost) without sending an ack. It will quickly redeliver it to another consumer if there are other consumers online at the same time. Even if the workers die occasionally, you can be certain that no message will be lost.
  • Consumer delivery acknowledgment is subject to a 30-minute timeout by default. This aids in the detection of buggy (stuck) customers who fail to acknowledge deliveries. This timeout can be increased as described in Delivery Acknowledgement Timeout.
  • By default, manual message acknowledgment is enabled. You explicitly disabled them in previous examples by setting auto ack=True. Once you’ve completed a task, remove this flag and send a proper acknowledgment from the worker.
def callback(ch, method, properties, body):
    print(" [x] Received %r" % body.decode())
    time.sleep(body.count(b'.') )
    print(" [x] Done")
    ch.basic_ack(delivery_tag = method.delivery_tag)
channel.basic_consume(queue='hello', on_message_callback=callback)
  • You can be sure that nothing will be lost if you kill a worker with CTRL+C while it is processing a message with this code. All unacknowledged messages will be redelivered shortly after the worker dies.
  • The acknowledgment must be sent through the same channel as the delivery. Attempts to acknowledge using a different channel will result in a protocol exception at the channel level.

Message Durability

  • You’ve learned how to ensure that the task is completed even if the customer dies. However, if the RabbitMQ server goes down, your tasks will be lost.
  • Unless you tell RabbitMQ not to, it will forget the queues and messages when it exits or crashes. To ensure that messages are not lost, you must mark both the queue and the messages as durable.
  • First, you must ensure that the queue will continue to function after a RabbitMQ node restart. You must declare it as durable to do so:
channel.queue_declare(queue='hello', durable=True)
  • This command is correct in and of itself, but it will not work in your environment. This is because you’ve already created a hello queue that isn’t persistent. RabbitMQ will reject any program that tries to redefine an existing queue with different parameters. However, there is a quick fix: create a new queue with a different name, such as task queue:
channel.queue_declare(queue='task_queue', durable=True)
  • Both the producer and consumer code must be updated with this queue declare change. Even if RabbitMQ restarts, you’re confident that the task queue will not be lost. Now you must set the delivery mode property to pika.spec and mark your messages as persistent. PERSISTENT DELIVERY MODE
channel.basic_publish(exchange='',
                      routing_key="task_queue",
                      body=message,
                      properties=pika.BasicProperties(
                         delivery_mode = pika.spec.PERSISTENT_DELIVERY_MODE
                      ))

Fair Dispatch

  • You may have noticed that dispatching is still not working as it should. In a situation with two workers, for example, if all odd messages are heavy and all even messages are light, one worker will be constantly busy while the other will do very little work. 
  • RabbitMQ, on the other hand, is completely unaware of this and will continue to distribute messages evenly. Because RabbitMQ only dispatches a message when it enters the queue, this happens. It does not consider a consumer’s number of unanswered messages. It simply sends every nth message to the nth consumer blindly.
rabbitmq durable: rabbitmq fair dispatch message
Image Source
  • To overcome this, use the prefetch count=1 setting in the Channel#basic qos channel method. This instructs RabbitMQ not to send more than one message to a worker at a time using the basic.qos protocol method. 
  • Alternatively, don’t send a new message to a worker until the previous one has been processed and acknowledged. Instead, it will send it to the next worker who is not currently working.
channel.basic_qos(prefetch_count=1)
  • This is how everything put together looks like: 
New_task.py
#!/usr/bin/env python
import pika
import sys

connection = pika.BlockingConnection(
    pika.ConnectionParameters(host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)

message = ' '.join(sys.argv[1:]) or "Hello World!"
channel.basic_publish(
    exchange='',
    routing_key='task_queue',
    body=message,
    properties=pika.BasicProperties(
        delivery_mode=pika.spec.PERSISTENT_DELIVERY_MODE
    ))
print(" [x] Sent %r" % message)
connection.close()
Worker.py
#!/usr/bin/env python
import pika
import time

connection = pika.BlockingConnection(
    pika.ConnectionParameters(host='localhost'))
channel = connection.channel()

channel.queue_declare(queue='task_queue', durable=True)
print(' [*] Waiting for messages. To exit press CTRL+C')


def callback(ch, method, properties, body):
    print(" [x] Received %r" % body.decode())
    time.sleep(body.count(b'.'))
    print(" [x] Done")
    ch.basic_ack(delivery_tag=method.delivery_tag)
channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue='task_queue', on_message_callback=callback)

channel.start_consuming()
  • You can create a work queue with Message Acknowledgments and Prefetch Count. The durability options allow tasks to survive RabbitMQ restarts.

RabbitMQ Durable Queue: AMQP Model

  • AMQP 0-9-1 is a programmable protocol in the sense that applications, not brokers, define AMQP 0-9-1 entities and routing schemes. 
  • As a result, protocol operations such as declaring queues and exchanges, defining bindings between them, subscribing to queues, and so on have been included.
  • This provides a great deal of flexibility to application developers, but it also requires them to be aware of potential definition conflicts. Definition conflicts are uncommon in practice, but they frequently signal a misconfiguration.
  • Applications declare the AMQP 0-9-1 entities they require, specify routing schemes, and delete AMQP 0-9-1 entities that are no longer in use.
  • Messages are sent to AMQP 0-9-1 entities called exchanges. A message is routed through an exchange into one or more queues. The routing algorithm used is determined by the type of exchange and bindings. AMQP 0-9-1 brokers offer four different types of exchange:
Exchange typeDefault pre-declared names
Direct exchange(Empty string) and amq.direct
Fanout exchangeamq.fanout
Topic exchangeamq.topic
Headers exchangeamq.match (and amq.headers in RabbitMQ)
  • Exchanges are defined by several attributes in addition to the exchange type, the most important of which are:
    • Durability is a name that has been given to a product that (exchanges survive broker restart)
    • Auto Delete (exchange is deleted when the last queue is unbound from it)
    • Arguments(optional, used by plugins and broker-specific features)
  • Exchanges can be permanent or temporary. Transient exchanges do not survive broker restarts, whereas durable exchanges do (they have to be redeclared when the broker comes back online). Exchanges do not have to be durable in all scenarios and use cases.
  • Queues can be declared as durable or transient in AMQP 0-9-1. A durable queue’s metadata is stored on a disc, whereas a transient queue’s metadata is stored in memory whenever possible.
  • Applications must use durable queues and ensure that published messages are marked as persistent in environments and use cases where durability is important.
  • In the AMQP 0-9-1 model, messages have attributes. Some attributes are so common that they are defined in the AMQP 0-9-1 specification, so application developers don’t have to remember the exact name.
  • AMQP brokers use some attributes, but the majority are left to interpretation by the applications that receive them. Some attributes, known as headers, are optional. They’re similar to HTTP’s X-Headers. When a message is published, its attributes are set.
  • AMQP brokers treat messages’ payload (the data they carry) as an opaque byte array. The payload will not be examined or changed by the broker. Only attributes and no payload are allowed in messages. 
  • To serialize structured data and publish it as the message payload, serialization formats such as JSON, Thrift, Protocol Buffers, and MessagePack are commonly used. This information is typically communicated by protocol peers using the “content-type” and “content-encoding” fields, but this is only by convention.
  • Messages can be marked as persistent, causing the broker to save them to disc. The system ensures that received persistent messages are not lost if the server is restarted. 
  • Simply publishing a message to a durable exchange or routing it to a durable queue does not make it persistent: it is entirely dependent on the message’s persistence mode. The performance of persistent messages is affected (just like with data stores, durability comes at a certain cost in performance).

Conclusion

This article explains RabbitMQ Durable queues in detail. It also gives an overview of RabbitMQ. Using Durable Queues helps queues survive a broker start and they help keep messages around persistently for consumers to consume them whenever they want.

visit our website to explore hevo

Hevo Data, a No-code Data Pipeline provides you with a consistent and reliable solution to manage data transfer between a variety of sources and a wide variety of Desired Destinations, with a few clicks. Hevo Data with its strong integration with 100+ sources (including 40+ free sources) allows you to not only export data from your desired data sources & load it to the destination of your choice, but also transform & enrich your data to make it analysis-ready so that you can focus on your key business needs and perform insightful analysis using BI tools.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

No-code Data Pipeline For Your Data Warehouse