Organizations rely on high availability frameworks to keep their databases running when failures occur. MongoDB provides high availability through features such as replication and sharding. If a server crashes or its hardware or software develops a fault, replication lets another node that holds a copy of the data take over, so the system can recover. Replication is therefore the basis for achieving and deploying high availability in MongoDB.
In this article, you will learn about MongoDB High Availability. You will gain an understanding of MongoDB, its key features, replication in MongoDB, and how to deploy Percona Server for MongoDB to achieve high availability.
What is MongoDB?
First released in 2009, MongoDB is an open-source, document-oriented NoSQL database built for high volumes of data. Unlike relational databases, which store data in tables, MongoDB stores records as documents grouped into collections. A document is the basic unit of data in MongoDB and consists of key-value pairs. Documents in the same collection can have different fields and structures. MongoDB stores documents in BSON, a binary representation of JSON (JavaScript Object Notation) documents.
MongoDB has become one of the most popular NoSQL databases, offering MapReduce capabilities and distributed storage. As a result, many organizations use it. For instance, eBay is a multinational e-commerce company that facilitates consumer-to-consumer and business-to-consumer sales through its website. It enables users to list items for sale, which other users can then bid on in auctions. To handle data at that scale, eBay uses MongoDB for operational resilience.
Key Features of MongoDB
Some of the key features of MongoDB are as follows:
1) Schema-less database
MongoDB does not enforce a fixed schema by default, so each document can hold different types of data. Documents in the same collection can therefore differ in content, fields, and size.
2) Document Oriented
MongoDB is a document-oriented database where records are stored as documents grouped into collections instead of rows and columns. These documents can store different types of data. A set of documents is called a collection, and each document is identified by a unique primary key stored in its _id field.
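As a small illustration of the two features above, the following sketch models a collection as a plain JavaScript array. The collection name and field names are made up for the example; only the _id field reflects MongoDB's convention of identifying each document by a unique key.

```javascript
// Hypothetical "users" collection: documents in one collection need not share fields.
const users = [
  { _id: 1, name: "Asha", email: "asha@example.com" },
  { _id: 2, name: "Ben", tags: ["admin", "ops"], address: { city: "Pune" } },
  { _id: 3, name: "Chen", age: 41 },
];

// Collect the field names used by each document as a comparable string.
const fieldSets = users.map((doc) => Object.keys(doc).sort().join(","));

// Every _id is unique, even though the other fields differ.
const uniqueIds = new Set(users.map((doc) => doc._id));

console.log(fieldSets);
console.log(uniqueIds.size === users.length);
```

Each document carries its own structure, which is why no table definition is needed before inserting data.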
3) Indexing
Indexing improves the performance of search queries. Without an index, MongoDB must scan every document in a collection to find those that match a query. Users can index any field in a document, either through the default primary index on _id or through secondary indexes, which makes queries faster.
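The idea can be sketched without a database. The toy below builds an in-memory lookup table on one field and compares it with a full scan. It is only an analogy: MongoDB's real indexes are B-tree structures created with createIndex(), and the function and field names here are hypothetical.

```javascript
// Toy data: 1,000 documents with a "sku" field (names are illustrative).
const products = Array.from({ length: 1000 }, (_, i) => ({ _id: i, sku: `SKU-${i}` }));

// Without an index: examine documents one by one until a match is found.
function findByScan(docs, sku) {
  let examined = 0;
  for (const doc of docs) {
    examined += 1;
    if (doc.sku === sku) return { doc, examined };
  }
  return { doc: null, examined };
}

// With an "index": a map from field value to document, built once up front.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) index.set(doc[field], doc);
  return index;
}

const skuIndex = buildIndex(products, "sku");
const scanned = findByScan(products, "SKU-999");
const indexed = skuIndex.get("SKU-999");

console.log(scanned.examined); // number of documents the scan had to look at
console.log(indexed._id);      // the same document, found by direct lookup
```

The scan has to walk the whole array to reach the last document, while the lookup goes straight to it. That difference is what an index buys on a large collection.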
4) Sharding
Sharding is used in MongoDB when you need to work with larger datasets. With sharding, you can distribute such data across multiple MongoDB instances. Large collections are split into pieces that are spread across multiple servers, called ‘shards.’ The shards are deployed together as a sharded cluster.
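A rough way to picture this is a function that maps each document's shard key to one of several shards. The sketch below uses a simple hash for that mapping. It is a deliberate simplification: a real MongoDB cluster splits data into ranges of shard key values (or of hashed values) and routes requests through mongos, and the shard names and hash function here are assumptions made for the example.

```javascript
// Hypothetical shard names; a real cluster routes queries through mongos.
const shards = ["shard0", "shard1", "shard2"];

// Simple deterministic string hash (illustrative, not MongoDB's hash function).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return h;
}

// Pick a shard for a document based on its shard key value.
function shardFor(shardKeyValue) {
  return shards[hashKey(shardKeyValue) % shards.length];
}

// Distribute some documents and count how many land on each shard.
const counts = { shard0: 0, shard1: 0, shard2: 0 };
for (let i = 0; i < 300; i++) {
  counts[shardFor(`user-${i}`)] += 1;
}
console.log(counts);
```

Because the mapping is deterministic, the same key always lands on the same shard, so a lookup by shard key only has to visit one server.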
What is Replication in MongoDB?
Replication in MongoDB maintains data redundancy and high availability by keeping multiple copies of your data on multiple servers. It is achieved through replica sets, which contain one primary node and one or more secondary nodes. Write operations are sent to the primary node, which records them in its operation log (oplog). The secondary nodes copy these operations and apply them to their own data sets. If the primary fails, the remaining members hold an election and one of the secondaries becomes the new primary.
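The write path described above can be sketched as a toy model. This is not how MongoDB is implemented: the class and method names are hypothetical, replication here happens on demand rather than continuously, and elections are not modeled at all. It only shows the idea of secondaries replaying an ordered log of writes.

```javascript
// Toy replica set member: a data store plus a count of applied log entries.
class Member {
  constructor(name) {
    this.name = name;
    this.data = new Map();
    this.applied = 0; // number of log entries applied so far
  }
  apply(entry) {
    this.data.set(entry.key, entry.value);
    this.applied += 1;
  }
}

class ToyReplicaSet {
  constructor(primary, secondaries) {
    this.primary = primary;
    this.secondaries = secondaries;
    this.oplog = []; // ordered list of write operations
  }
  // Writes go to the primary, which applies them and records them in the log.
  write(key, value) {
    const entry = { key, value };
    this.oplog.push(entry);
    this.primary.apply(entry);
  }
  // Each secondary catches up by applying the log entries it has not seen yet.
  replicate() {
    for (const s of this.secondaries) {
      while (s.applied < this.oplog.length) s.apply(this.oplog[s.applied]);
    }
  }
}

const toySet = new ToyReplicaSet(new Member("node8"), [new Member("node9"), new Member("node10")]);
toySet.write("a", 1);
toySet.write("b", 2);
toySet.replicate();
console.log(toySet.secondaries.map((s) => s.data.get("b")));
```

After replicate() runs, every secondary holds the same keys and values as the primary, which is what allows one of them to take over.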
What is MongoDB High Availability?
High availability (HA) refers to a system’s ability to operate continuously without failure for a set period of time. HA also aims to ensure that a system meets an agreed-upon level of operational performance.
When running critical services in a production environment, high availability is a must. It can be achieved by removing all single points of failure, including the database tier.
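An agreed-upon level of availability is often expressed as a percentage of uptime. As a worked example, the sketch below converts a few such targets into the downtime they allow per year. The targets are illustrative, not figures from this article, and the calculation assumes a 365-day year.

```javascript
// Allowed downtime for a given availability target over a period.
function allowedDowntimeHours(availabilityPercent, periodHours) {
  return (1 - availabilityPercent / 100) * periodHours;
}

const HOURS_PER_YEAR = 365 * 24; // 8760, ignoring leap years

for (const target of [99, 99.9, 99.99]) {
  const hours = allowedDowntimeHours(target, HOURS_PER_YEAR);
  console.log(`${target}% -> ${hours.toFixed(2)} hours of downtime per year`);
}
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why removing single points of failure matters so much at higher targets.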
In MongoDB, high availability is achieved through replication. A replica set is a group of MongoDB processes that maintain the same data.
Getting Started with MongoDB High Availability
Founded in 2006, Percona is a company that provides open-source database software and services. Percona Server for MongoDB is a fully compatible, enhanced, open-source drop-in replacement for MongoDB Community Edition, designed for high performance and availability.
Percona Server for MongoDB includes an in-memory storage engine, data-at-rest encryption, audit logging, external LDAP (Lightweight Directory Access Protocol) authentication with SASL (Simple Authentication and Security Layer), and backup features, all aimed at improving database performance and efficiency.
Deploying Percona Server for MongoDB High Availability
You need at least three nodes to implement MongoDB high availability. A typical replica set consists of one primary node and two secondary nodes. Alternatively, you can run two data-bearing nodes and use an arbiter as the third node.
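The reason for three nodes is that electing a primary requires a majority of the voting members. An arbiter votes but holds no data, which is why it can serve as the third member. The arithmetic can be sketched as follows; the function names are hypothetical.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be lost while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(`${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`);
}
```

With two members, losing either one leaves no majority, so no primary can be elected. Three members is the smallest set that survives the loss of one node.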
In the example below, you will run three virtual machines with CentOS Linux release 7.3 as the operating system and install Percona Server for MongoDB version 4.2. The nodes use the following IP addresses.
mongo-node8: 10.10.10.17
mongo-node9: 10.10.10.18
mongo-node10: 10.10.10.19
- Step 1: Before installation, ensure that all the nodes are listed in the /etc/hosts file on each node. You can check the file with the following command.
[root@mongo-node9 ~]# cat /etc/hosts
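Based on the addresses listed above, each node's /etc/hosts file would be expected to contain one line per node. The sketch below prints those lines. It assumes the simple "IP hostname" layout with no aliases, and hostsEntries is a hypothetical helper name.

```javascript
// Node names and addresses taken from the example above.
const nodes = [
  { name: "mongo-node8", ip: "10.10.10.17" },
  { name: "mongo-node9", ip: "10.10.10.18" },
  { name: "mongo-node10", ip: "10.10.10.19" },
];

// Render one "IP hostname" line per node, the layout /etc/hosts uses.
function hostsEntries(nodeList) {
  return nodeList.map((n) => `${n.ip} ${n.name}`).join("\n");
}

console.log(hostsEntries(nodes));
```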
- Step 2: You need to configure the Percona repository on each of the nodes. Then you can enable the repository for psmdb42, as shown below.
[root@mongo-node8 ~]# percona-release setup psmdb42
- Step 3: Next, install the Percona Server for MongoDB package.
[root@mongo-node8 ~]# yum install percona-server-mongodb*
- Step 4: Repeat the installation on all the other nodes. Once the installation is complete, change the bindIp setting in /etc/mongod.conf so that mongod listens on the node's private IP address instead of only on localhost.
- Step 5: For security reasons, you can restrict the addresses mongod listens on by listing specific IP addresses in the bindIp parameter, separated by commas.
- Step 6: Verify that the three nodes can connect to each other's MongoDB instances, as shown below.
[root@mongo-node8 ~]# mongo --host 10.10.10.19 --port 27017
- Step 7: Configure the replica set in MongoDB. Edit the /etc/mongod.conf file and uncomment the replication section. Add the replSetName parameter, as shown below.
replication:
  replSetName: "my-mongodb-rs"
- Step 8: This installation uses ‘my-mongodb-rs’ as the replica set name. Restart the MongoDB service after adding the replication configuration.
service mongod restart
- Step 9: Repeat the configuration on the other nodes.
- Step 10: After configuring all the nodes, initialize replication on one of them. Run the rs.initiate() command in the mongo shell, as shown below.
rs.initiate()
{
    "info2" : "no configuration specified. Using a default configuration for the set",
    "me" : "mongo-node8:27017",
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1604036305, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1604036305, 1)
}
my-mongodb-rs:OTHER>
my-mongodb-rs:PRIMARY>
- Step 11: The first node, where you initiated replication, becomes the primary node. You then need to add the remaining nodes to the replica set.
- Step 12: Use the rs.add() command on the primary node to add each of the other nodes, as shown below.
my-mongodb-rs:PRIMARY> rs.add("mongo-node9:27017");
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1604037158, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1604037158, 1)
}
my-mongodb-rs:PRIMARY> rs.add("mongo-node10:27017");
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1604037170, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1604037170, 1)
}
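- Step 13: Alternatively, instead of Steps 10 to 12, you can initiate the replica set with all of its members at once by passing a configuration document to the rs.initiate() command, as shown below.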
rs.initiate({
    _id: "my-mongodb-rs",
    members: [
        { _id: 0, host: "mongo-node8:27017" },
        { _id: 1, host: "mongo-node9:27017" },
        { _id: 2, host: "mongo-node10:27017" }
    ]
})
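If the list of hosts changes often, the configuration document can be generated instead of typed by hand. The following is a minimal sketch under that assumption. buildReplicaSetConfig is a hypothetical helper, not part of the mongo shell, and it assigns member _id values simply by position in the list.

```javascript
// Build a replica set configuration document from a set name and host list.
// buildReplicaSetConfig is a hypothetical helper, not a mongo shell function.
function buildReplicaSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const config = buildReplicaSetConfig("my-mongodb-rs", [
  "mongo-node8:27017",
  "mongo-node9:27017",
  "mongo-node10:27017",
]);

console.log(JSON.stringify(config, null, 2));
// In the mongo shell, the same object could be passed as rs.initiate(config).
```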
- Step 14: Check the current state of the replica set by running the rs.status() command on any node in the cluster.
my-mongodb-rs:PRIMARY> rs.status()
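The result of rs.status() is a large document, and the part that matters most for high availability is the members array. The sketch below shows one way to summarize it. The sampleStatus object is a hand-written, heavily trimmed stand-in rather than real output from this cluster, and summarize is a hypothetical helper name.

```javascript
// Trimmed-down, hand-written stand-in for an rs.status() result. Real output
// has many more fields; only "set", "members", "name", "health" and "stateStr"
// are used here.
const sampleStatus = {
  set: "my-mongodb-rs",
  members: [
    { name: "mongo-node8:27017", health: 1, stateStr: "PRIMARY" },
    { name: "mongo-node9:27017", health: 1, stateStr: "SECONDARY" },
    { name: "mongo-node10:27017", health: 1, stateStr: "SECONDARY" },
  ],
};

// Summarize which member is primary and which members are healthy secondaries.
function summarize(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  const secondaries = status.members
    .filter((m) => m.stateStr === "SECONDARY" && m.health === 1)
    .map((m) => m.name);
  return { set: status.set, primary: primary ? primary.name : null, secondaries };
}

console.log(summarize(sampleStatus));
```

A healthy three-node set should report exactly one primary and two secondaries. Anything else is worth investigating before relying on the cluster for failover.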
Deploying Percona Server for MongoDB High Availability using ClusterControl
ClusterControl is a database management tool used to monitor and scale database clusters. It supports deploying Percona Server for MongoDB to achieve high availability. The deployment is simple: you start a new deployment and select the MongoDB replica set option.
- Step 1: Fill in details such as the SSH user, password, port, and cluster name. You can set up passwordless SSH between the controller node and the target database nodes. After entering all the information, click Continue to move to the next window.
- Step 2: Select Percona as the vendor, and then select the version you want to install. If you have a custom MongoDB directory, you can specify it. Set the admin user and password for your MongoDB deployment. You can use a port other than the default. Lastly, enter the IP address of each target database node in the Add Node combo box.
- Step 3: Click the “Deploy” button. This triggers a job that deploys the MongoDB cluster.
- Step 4: After the deployment, the overview page shows three running instances of Percona Server for MongoDB.
The Topology view shows that you have one primary and two secondary nodes.
Percona Server for MongoDB is now deployed for high availability using ClusterControl.
Conclusion
In this article, you learned how to deploy MongoDB with high availability using Percona Server for MongoDB. The article also covered ClusterControl, a tool for deploying and monitoring MongoDB. High availability in MongoDB helps organizations keep their systems available by minimizing the downtime caused by planned operations or system crashes.
Manjiri is a proficient technical writer and a data science enthusiast. She holds an M.Tech degree and leverages the knowledge acquired through that to write insightful content on AI, ML, and data engineering concepts. She enjoys breaking down the complex topics of data integration and other challenges in data engineering to help data professionals solve their everyday problems.