MongoDB High Availability: How to Deploy & Run it Simplified 101

Data Integration, Data Replication, Database Management Systems, MongoDB, Tutorials • May 26th, 2022

Organizations rely on high availability frameworks to keep their databases running when failures occur. MongoDB supports high availability through replication and scales out through sharding. If a server crashes or its hardware or software fails, replication lets another node that holds a copy of the data take over, so the system can recover. This article shows how to achieve and deploy such high availability using the replication process.

In this article, you will gain a holistic understanding of MongoDB, its key features, replication in MongoDB, and how to deploy Percona Server for MongoDB to achieve high availability.

Read along to find out in-depth information about MongoDB High Availability.

Prerequisites

  • Basic understanding of the need for high availability
  • Basic Understanding of MongoDB

What is MongoDB?

First released in 2009, MongoDB is an open-source, document-oriented NoSQL database designed for high volumes of data. Unlike relational databases, which store data in tables, MongoDB stores records as documents grouped into collections. A document is the basic unit of data in MongoDB and consists of field-value pairs. Documents in the same collection can have different fields and structures, and they are stored in BSON, a binary representation of JSON (JavaScript Object Notation) documents.

MongoDB has become one of the most popular NoSQL databases, offering features such as MapReduce and distributed document storage. As a result, many organizations use it. For instance, eBay is a multinational online organization that facilitates consumer-to-consumer and business-to-consumer sales through its website. It enables users to list items for sale, which other users can then bid on in auctions. To handle such extensive data, eBay uses MongoDB for operational resilience.

Key Features of MongoDB

Some of the key features of MongoDB are as follows:

1) Schema-less database 

MongoDB does not enforce a fixed schema, so each document can hold different types of data. Documents in the same collection can therefore differ in content, fields, and size.

2) Document Oriented

MongoDB is a document-oriented database where all the records are stored in documents and collections instead of rows and columns. These documents can store different types of data. A set of documents is called a collection, and each document has a unique primary key, stored in the _id field.
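
As a minimal sketch of the two features above, the plain JavaScript objects below stand in for two documents in one collection. The collection name products and every field except _id are hypothetical, chosen only for illustration.

```javascript
// Two hypothetical documents that could live in the same collection.
// They share no fixed schema: the second has fields the first lacks.
const products = [
  { _id: 1, name: "keyboard", price: 49.99 },
  {
    _id: 2,
    name: "monitor",
    price: 199.0,
    tags: ["4k", "hdmi"],
    stock: { warehouse: "A", qty: 12 },
  },
];

// Return the sorted list of field names of a document.
function fieldNames(doc) {
  return Object.keys(doc).sort();
}

// _id values must be unique within a collection; check that here.
function hasUniqueIds(docs) {
  return new Set(docs.map((d) => d._id)).size === docs.length;
}

console.log(products.map(fieldNames));
console.log(hasUniqueIds(products));
```

In the mongo shell, an array like this could be stored with db.products.insertMany(products), assuming a collection named products.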

3) Indexing

Indexing improves the performance of search queries. If you repeatedly query documents on the same fields, you can index those fields so that MongoDB does not have to scan every document. Any field in a MongoDB document can be indexed with primary or secondary indices, which makes queries faster.

4) Sharding

Sharding is used when you work with datasets that are too large for a single server. A large collection is partitioned by a shard key, and the resulting pieces are distributed across multiple MongoDB instances called ‘shards.’ The shards, together with query routers and config servers, form a sharded cluster.
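
To make the idea of distributing data across instances concrete, here is a toy sketch in plain JavaScript. It is not MongoDB's actual placement logic: toyHash, shardFor, distribute, the shard names and the userId key are all hypothetical. It only shows that a value derived from a key can decide deterministically which shard holds a document.

```javascript
// Toy string hash (djb2 style). MongoDB uses its own hashing internally.
function toyHash(value) {
  let h = 5381;
  for (const ch of String(value)) {
    h = (h * 33 + ch.codePointAt(0)) >>> 0; // wrap to unsigned 32 bits
  }
  return h;
}

// Hypothetical shard names for the illustration.
const shards = ["shard-a", "shard-b", "shard-c"];

// Pick a shard for a document based on its shard key value.
function shardFor(shardKeyValue, shardList = shards) {
  return shardList[toyHash(shardKeyValue) % shardList.length];
}

// Group documents by the shard they would land on.
function distribute(docs, keyField, shardList = shards) {
  const placement = Object.fromEntries(shardList.map((s) => [s, []]));
  for (const doc of docs) {
    placement[shardFor(doc[keyField], shardList)].push(doc);
  }
  return placement;
}

const sampleDocs = [{ userId: "alice" }, { userId: "bob" }, { userId: "carol" }];
console.log(distribute(sampleDocs, "userId"));
```

The same key always maps to the same shard, which is what lets a router send a query for one key to a single shard instead of all of them.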

What is Replication in MongoDB?

[Image: Replication in MongoDB]

Replication in MongoDB maintains data redundancy and high availability by keeping multiple copies of your data on multiple servers. It is implemented through replica sets, which contain one primary node and one or more secondary nodes. Write operations are sent to the primary node, which records them in its operation log (oplog). The secondary nodes then replicate these operations to keep their copy of the data in sync. If the primary node fails, the remaining nodes hold an election and one of the secondary nodes becomes the new primary.

What is MongoDB High Availability?

High availability refers to a system’s ability to operate continuously without failure for a set period of time. Moreover, HA works to ensure that a system meets an agreed-upon level of operational performance.
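
An agreed-upon level is often stated as an availability percentage. The sketch below, under the assumption of a 365-day year, converts such a percentage into a yearly downtime budget. The function name and the sample targets are hypothetical.

```javascript
const MINUTES_PER_YEAR = 365 * 24 * 60; // 525,600, ignoring leap years

// Allowed downtime in minutes per year for an availability percentage.
function downtimeMinutesPerYear(availabilityPercent) {
  if (availabilityPercent < 0 || availabilityPercent > 100) {
    throw new RangeError("availability must be between 0 and 100");
  }
  return ((100 - availabilityPercent) / 100) * MINUTES_PER_YEAR;
}

// Hypothetical targets, printed to one decimal place.
for (const target of [99, 99.9, 99.99]) {
  const minutes = downtimeMinutesPerYear(target).toFixed(1);
  console.log(`${target}% -> ${minutes} min/year`);
}
```

For example, a 99.9% target leaves roughly 525.6 minutes, or a little under nine hours, of downtime per year.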

When running critical services in a production environment, high availability is a must. It can be achieved by removing all single points of failure, including the database tier.

In MongoDB, high availability is achieved through replication. A replica set is a group of MongoDB processes (mongod instances) that maintain the same data set.

Getting Started with MongoDB High Availability

Founded in 2006, Percona develops open-source database software. Percona Server for MongoDB is a fully compatible, enhanced, open-source drop-in replacement for MongoDB Community Edition, built for high performance and availability.

Percona Server for MongoDB includes an in-memory storage engine, data-at-rest encryption, audit logging, external LDAP (Lightweight Directory Access Protocol) authentication with SASL (Simple Authentication and Security Layer), and hot backups, which help you run performant and efficient databases.

Deploying Percona Server for MongoDB High Availability

You need at least three nodes to implement MongoDB high availability: a replica set with one primary node and two secondary nodes. Alternatively, you can run two data-bearing nodes (one primary and one secondary) with an arbiter as the third node. The arbiter votes in elections but does not store data.
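
The three-node minimum follows from how elections work: a primary can only be elected when a majority of the voting members is reachable. The short sketch below works through that arithmetic. The function names are hypothetical.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  if (!Number.isInteger(votingMembers) || votingMembers < 1) {
    throw new RangeError("votingMembers must be a positive integer");
  }
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be lost while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(
    `${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

With two members, the loss of either one leaves no majority, so no primary can be elected. With three members, one can fail and the remaining two still form a majority. A fourth member adds no extra tolerance, which is why odd member counts are the usual choice.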

The example below uses three virtual machines running CentOS Linux release 7.3 and Percona Server for MongoDB version 4.2, with the following hostnames and IP addresses.

mongo-node8: 10.10.10.17
mongo-node9: 10.10.10.18
mongo-node10: 10.10.10.19
  • Step 1: Before installation, ensure that all the nodes are listed in the /etc/hosts file on each node. You can check the file with the following command.
[root@mongo-node9 ~]# cat /etc/hosts

Output:

[Image: Node configuration]
  • Step 2: You need to configure the Percona repository on each of the nodes. Then you can enable the repository for psmdb42, as shown below.
[root@mongo-node8 ~]# percona-release setup psmdb42

Output:

[Image: Configure Percona repository]
  • Step 3: Install the Percona Server for MongoDB packages.
[root@mongo-node8 ~]# yum install percona-server-mongodb*

Output:

[Image: Plugins]
  • Step 4: Repeat the installation on all the other nodes. Once the installation is complete, change the bindIp setting in /etc/mongod.conf from the localhost address (127.0.0.1) so that mongod also listens on the node's private IP address, as shown below.
[Image: Network interfaces]
  • Step 5: For security reasons, you can restrict the bindIp parameter to specific addresses by listing them with a comma as the separator, for example 127.0.0.1,10.10.10.17 on mongo-node8.
  • Step 6: Check that the three nodes can reach each other by connecting from one node to the MongoDB instance on another, as shown below.
[root@mongo-node8 ~]# mongo --host 10.10.10.19 --port 27017

Output:

[Image: MongoDB instance]
  • Step 7: Configure the replica set in MongoDB. Edit the /etc/mongod.conf file and uncomment the replication section. Add the replSetName parameter, as shown below.
replication:
  replSetName: "my-mongodb-rs"
  • Step 8: This installation uses ‘my-mongodb-rs’ as the replica set name. Restart the MongoDB service after adding the replication configuration.
service mongod restart
  • Step 9: Repeat the configuration on the other nodes.
  • Step 10: After configuring all the nodes, initialize replication from the mongo shell on one of the nodes. Run the rs.initiate() command, as shown below.
rs.initiate()
{
  "info2" : "no configuration specified. Using a default configuration for the set",
  "me" : "mongo-node8:27017",
  "ok" : 1,
  "$clusterTime" : {
    "clusterTime" : Timestamp(1604036305, 1),
    "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) }
  },
  "operationTime" : Timestamp(1604036305, 1)
}
my-mongodb-rs:OTHER>
my-mongodb-rs:PRIMARY>
  • Step 11: The first node, where you initiate the replication, will become the primary node. You need to add the rest of the nodes to join the replication. 
  • Step 12: Use the rs.add() command on the primary node to add each of the remaining nodes, as shown below.
my-mongodb-rs:PRIMARY> rs.add("mongo-node9:27017");
{
  "ok" : 1,
  "$clusterTime" : {
    "clusterTime" : Timestamp(1604037158, 1),
    "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) }
  },
  "operationTime" : Timestamp(1604037158, 1)
}
my-mongodb-rs:PRIMARY> rs.add("mongo-node10:27017");
{
  "ok" : 1,
  "$clusterTime" : {
    "clusterTime" : Timestamp(1604037170, 1),
    "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) }
  },
  "operationTime" : Timestamp(1604037170, 1)
}
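  • Step 13: Alternatively, instead of Steps 10 to 12, you can initiate the replica set with all three members at once by passing a configuration document to the rs.initiate() command, as shown below.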
rs.initiate({ _id: "my-mongodb-rs", members: [
  { _id: 0, host: "mongo-node8:27017" },
  { _id: 1, host: "mongo-node9:27017" },
  { _id: 2, host: "mongo-node10:27017" } ] })
  • Step 14: Check the current state of the replica set by running the rs.status() command on any node in the cluster.
my-mongodb-rs:PRIMARY> rs.status()

Output:

[Image: Checking the current replica set cluster]
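
The configuration document that Step 13 passes to rs.initiate() can also be generated from a list of hosts rather than typed by hand. The sketch below is plain JavaScript. buildReplicaSetConfig is a hypothetical helper, and the set name and hosts are the ones used in this example.

```javascript
// Build a configuration document of the shape used in Step 13:
// the set name as _id, plus one numbered member per host.
function buildReplicaSetConfig(setName, hosts) {
  if (!Array.isArray(hosts) || hosts.length === 0) {
    throw new Error("at least one host is required");
  }
  return {
    _id: setName,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const config = buildReplicaSetConfig("my-mongodb-rs", [
  "mongo-node8:27017",
  "mongo-node9:27017",
  "mongo-node10:27017",
]);

console.log(JSON.stringify(config, null, 2));
// In the mongo shell, this object could then be passed as rs.initiate(config).
```

Generating the document keeps the member _id values unique and in sequence, which is easy to get wrong when editing a long one-line command.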

Deploying Percona Server for MongoDB High Availability using ClusterControl

ClusterControl is a database management tool used to monitor and scale database clusters. It supports deploying Percona Server for MongoDB to achieve high availability. The deployment is simple: you start a new deployment and select the MongoDB replica set option, as shown below.

[Image: General & SSH Settings]
  • Step 1: Fill in details such as the SSH user, password, port, and cluster name. Passwordless SSH should be set up between the controller node and the target database nodes. After entering all the information, click Continue. A new window then opens, as shown below.
[Image: Deploy Database Cluster]
  • Step 2: Select the Percona Server as the vendor, and then select the Percona version you want to install. In case you have a custom MongoDB directory, you can specify it. Set admin user and password for your MongoDB. Instead of the default port, you can use another port. Lastly, fill in the IP address of your target database node in the Add Node combo box.
  • Step 3: Click on the “Deploy” button. It will then trigger a job to deploy a MongoDB cluster, as shown below.
[Image: Create Cluster]
  • Step 4: After the deployment, the overview page shows three running instances of Percona Server for MongoDB.
[Image: Profile (Active)]

The Topology view shows that you have one primary and two secondary nodes.

[Image: Profile Topology]
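
One way to confirm the same topology from the shell is to summarize the members array returned by rs.status(). The sketch below is plain JavaScript and relies only on the stateStr and health fields of each member. summarizeTopology, isHealthy and the sample object are hypothetical, and the sample is hand-written rather than captured from a real cluster.

```javascript
// Count members by state from an rs.status()-shaped object.
function summarizeTopology(status) {
  const counts = {};
  for (const member of status.members) {
    counts[member.stateStr] = (counts[member.stateStr] || 0) + 1;
  }
  return counts;
}

// True when exactly one primary exists and every member reports health 1.
function isHealthy(status) {
  const counts = summarizeTopology(status);
  return counts.PRIMARY === 1 && status.members.every((m) => m.health === 1);
}

// Hand-written sample with the same shape as rs.status() output.
const sampleStatus = {
  set: "my-mongodb-rs",
  members: [
    { name: "mongo-node8:27017", stateStr: "PRIMARY", health: 1 },
    { name: "mongo-node9:27017", stateStr: "SECONDARY", health: 1 },
    { name: "mongo-node10:27017", stateStr: "SECONDARY", health: 1 },
  ],
};

console.log(summarizeTopology(sampleStatus));
console.log(isHealthy(sampleStatus));
```

In the mongo shell, the same functions could be called as summarizeTopology(rs.status()) after pasting the definitions.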

Percona Server for MongoDB is now deployed as a highly available replica set using ClusterControl.

Conclusion

In this article, you learned how to deploy MongoDB for high availability using Percona Server for MongoDB. The article also covered ClusterControl, a tool for deploying and monitoring MongoDB databases. High availability in MongoDB helps organizations minimize the downtime caused by planned operations or system crashes.

Hevo Data, a No-code Data Pipeline, provides you with a consistent and reliable solution to manage data transfer between a variety of sources and a wide variety of destinations with a few clicks.

Hevo Data with its strong integration with MongoDB and 100+ Data Sources (including 40+ Free Sources) allows you to not only export data from your desired data sources & load it to the destination of your choice but also transform & enrich your data to make it analysis-ready. Hevo also allows the integration of data from non-native sources using Hevo’s in-built REST API & Webhooks Connector. You can then focus on your key business needs and perform insightful analysis using BI tools. 

Want to give Hevo a try? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You may also have a look at the amazing price, which will assist you in selecting the best plan for your requirements.

Share your experience of understanding MongoDB high availability in the comment section below! We would love to hear your thoughts.