A common challenge for every growing business is handling rapidly growing volumes of data efficiently. Alongside Traditional Relational Databases, organizations are now using Document-oriented Open-source NoSQL Databases such as MongoDB.
MongoDB Replication is an excellent feature that safeguards your data in case of any hardware or software failures. To enable this, you can easily set up the MongoDB Replica Set Configuration. These Replica Sets allow users to build extra data replicas for several purposes such as Disaster Recovery, Backup, Reporting, etc.
In this article, you will learn how to effectively set up the MongoDB Replica Set Configuration in 7 easy steps.
What is MongoDB?
MongoDB is a Document-oriented Open-source NoSQL Database. It stores data in documents and provides its own query language, the MongoDB Query Language (MQL), to interact with that data. MongoDB is popular among organizations because it is flexible: it lets you combine and store varied data without compromising on indexing options, data access, or validation rules.
A NoSQL database does not store data in rows and columns as traditional relational databases do; instead, MongoDB uses documents made up of key-value pairs. This model lets MongoDB store data without a fixed schema and scale horizontally when needed without disrupting the data model. Documents are stored in a format known as BSON (Binary JSON).
What is Replication in MongoDB?
Replication is the process of making copies of data across servers to ensure the high availability of data in case of failures. Replication of data ensures that the data is available all the time to users in case of any network failure or hardware failure that potentially causes data loss.
A MongoDB Replica Set is a group of mongod processes that maintain the same data set. Replication in MongoDB is done via Replica Sets, which provide Data Redundancy, High Availability, and fault tolerance against the loss of a server. A Replica Set also allows users to create additional data replicas for dedicated purposes such as Disaster Recovery, Backup, Reporting, etc.
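To make the fault-tolerance claim concrete, here is a minimal sketch of the arithmetic behind it. A replica set can elect a primary only while a strict majority of its voting members is reachable. The function names are hypothetical and chosen for illustration; the snippet is plain JavaScript and does not talk to a server.

```javascript
// Votes needed to elect a primary: a strict majority of the voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be unavailable while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majorityNeeded(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(
    `${n} members: majority ${majorityNeeded(n)}, tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

Note that four members tolerate no more failures than three, which is one reason an odd number of voting members is the usual choice.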
Why do you need Replication in MongoDB?
Replication is a technique to synchronize the data across multiple server nodes. The need for Replication can be understood from the following points:
- Data Replication replicates the data across servers, thereby ensuring high data availability.
- Data Replication provides a level of fault tolerance against any data loss.
- Keeping a copy of data in another server helps users recover the data in case of network failure or data loss.
- It can also serve for disaster recovery, reporting, etc.
- It can also spread read load: clients can be configured to read from secondary members, which reduces the load on any single server for data retrieval.
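As a rough illustration of the last point, a client can ask its driver to prefer secondary members for reads through the connection string. The sketch below only builds such a string; the helper name is hypothetical, and the host names and replica set name are the ones used later in this article. Reads served by a secondary can lag slightly behind the primary, so this suits reporting-style queries better than reads that must see the latest write.

```javascript
// Hypothetical helper: builds a replica set connection string.
// replicaSet and readPreference are standard MongoDB connection string options.
function replicaSetUri(hosts, replicaSet, readPreference) {
  const params = new URLSearchParams({ replicaSet, readPreference });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const uri = replicaSetUri(
  ["mongo-db1:27017", "mongo-db2:27017", "mongo-db3:27017"],
  "replicaset-01",
  "secondaryPreferred"
);

console.log(uri);
```

On a deployment with access control enabled, credentials would also have to be supplied; they are left out here.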
What is MongoDB Replica Set Configuration?
MongoDB performs Replication through a replica set, and the replica set's configuration controls how data is replicated from one server to another. In this blog post, you will learn about the various configuration options available and how you can use them to replicate your data effectively.
You can access the replica set configuration via the method rs.conf(). Given below is a sample configuration document:
{
  _id: <string>,
  version: <int>,
  term: <int>,
  protocolVersion: <number>,
  writeConcernMajorityJournalDefault: <boolean>,
  configsvr: <boolean>,
  members: [
    {
      _id: <int>,
      host: <string>,
      arbiterOnly: <boolean>,
      buildIndexes: <boolean>,
      hidden: <boolean>,
      priority: <number>,
      tags: <document>,
      secondaryDelaySecs: <int>,
      votes: <number>
    },
    ...
  ],
  settings: {
    chainingAllowed : <boolean>,
    heartbeatIntervalMillis : <int>,
    heartbeatTimeoutSecs: <int>,
    electionTimeoutMillis : <int>,
    catchUpTimeoutMillis : <int>,
    getLastErrorModes : <document>,
    getLastErrorDefaults : <document>,
    replicaSetId: <ObjectId>
  }
}
The above configuration document contains the following elements:
- _id: This is the name of the replica set.
- version: An incrementing number used to distinguish revisions of the replica set configuration document from previous iterations of the configuration.
- term: A number set by the primary that performs a reconfiguration; members use term together with version to agree on the newest configuration.
- configsvr: A boolean that indicates whether the replica set is used for a sharded cluster’s config servers.
- protocolVersion: The replication protocol version. It defaults to 1.
- writeConcernMajorityJournalDefault: Determines the behavior of the { w: "majority" } write concern if the write concern does not explicitly specify the journal option j.
- members: This array contains the information about the various members (servers) included in the replica set. For each member it stores the id, hostname, index-building behavior, priority, etc.
- settings: It contains optional settings that are applied to the whole replica set.
You can find more information about these configuration options from the Official MongoDB Documentation page.
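As an illustration of how the members array is typically edited, the sketch below changes the priority of one member in a configuration document. The helper name and the trimmed-down sample document are assumptions made for this example; a real document returned by rs.conf() carries more fields.

```javascript
// Returns a copy of a replica set configuration with one member's priority changed.
function withPriority(config, host, priority) {
  return {
    ...config,
    members: config.members.map((m) =>
      m.host === host ? { ...m, priority } : m
    ),
  };
}

// Trimmed-down stand-in for the output of rs.conf(); values are illustrative.
const current = {
  _id: "replicaset-01",
  version: 1,
  members: [
    { _id: 0, host: "mongo-db1:27017", priority: 1 },
    { _id: 1, host: "mongo-db2:27017", priority: 1 },
    { _id: 2, host: "mongo-db3:27017", priority: 1 },
  ],
};

const updated = withPriority(current, "mongo-db1:27017", 2);
console.log(JSON.stringify(updated.members, null, 2));

// In the MongoDB shell on the primary, the change would be applied with:
//   rs.reconfig(withPriority(rs.conf(), "mongo-db1:27017", 2))
```

A member with a higher priority is more likely to be elected primary, so this kind of change is a common way to express a preferred primary.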
How to set up MongoDB Replica Set Configuration?
Setting up Replication in MongoDB involves a few fairly simple steps. Ensure that MongoDB is installed on all nodes where you want to replicate the data. For MongoDB installation, you can refer to the guide available in the Official MongoDB Documentation. You can now follow the step-by-step procedure to set up Replication in MongoDB:
MongoDB Replica Set Configuration Step 1: Set up the hosts
For this example, the following nodes are used to replicate the data:
192.168.0.29 mongo-db1
192.168.0.30 mongo-db2
192.168.0.31 mongo-db3
Add the above entries to each node’s /etc/hosts file. Please ensure that all three IP addresses are present in the /etc/hosts file on every node.
MongoDB Replica Set Configuration Step 2: Set up hostname
On each node, set the hostname so that the node can be identified by name, not just by IP address.
$ sudo vim /etc/hostname ## On Node1
mongo-db1
$ sudo vim /etc/hostname ## On Node2
mongo-db2
$ sudo vim /etc/hostname ## On Node3
mongo-db3
MongoDB Replica Set Configuration Step 3: Generate Key
A shared key file lets the members of the replica set authenticate to each other without an externally supplied password.
To generate the key, execute the commands given below:
On Node 1 (mongo-db1)
# mkdir -p /etc/mongodb/keys/
# openssl rand -base64 756 > /etc/mongodb/keys/mongo-key
# chmod 400 /etc/mongodb/keys/mongo-key
# chown -R mongodb:mongodb /etc/mongodb
Now, you can copy the generated key file to all other nodes at the same location, i.e. /etc/mongodb/keys, with the same ownership and permissions.
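The value 756 in the openssl command is not arbitrary: MongoDB accepts key files of between 6 and 1024 base64 characters, and 756 random bytes encode to 1008 characters. The snippet below is a rough Node.js counterpart of that command, included only to show the arithmetic; it assumes Node.js is available and does not write any file.

```javascript
const { randomBytes } = require("crypto");

// Node.js counterpart of `openssl rand -base64 756`:
// 756 random bytes, base64-encoded (every 3 bytes become 4 characters).
const key = randomBytes(756).toString("base64");

console.log(key.length); // 756 / 3 * 4 = 1008 characters
console.log(key.length >= 6 && key.length <= 1024); // within MongoDB's key file limits
```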
MongoDB Replica Set Configuration Step 4: Configure Replica Set
Now that the nodes are prepared, the remaining step is to set each node’s bind address and provide a replica set name. On package-based installations, the MongoDB configuration file is typically found at /etc/mongod.conf.
On node 1 => mongo-db1
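# network interfaces
net:
  port: 27017
  # 127.0.0.1 is included so that the shell on this machine can connect locally
  bindIp: 127.0.0.1,192.168.0.29
#security:
security:
  authorization: enabled
  keyFile: /etc/mongodb/keys/mongo-key
#replication:
replication:
  replSetName: "replicaset-01"
Make the same changes on the other two nodes, using each node’s own IP address in bindIp.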
Once the changes are done, you can restart the MongoDB services by running the following command:
sudo systemctl restart mongod
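Since only the bind address differs between the three nodes, the per-node settings can be summarized with a small sketch. The helper below is hypothetical and simply prints the relevant sections for each node; it does not write any file. Including 127.0.0.1 in bindIp is an assumption made here so that a shell running on the node itself can connect locally.

```javascript
// Hypothetical helper: renders the net, security and replication sections for one node.
function renderMongodConf({ ip, keyFile, replSetName }) {
  return [
    "net:",
    "  port: 27017",
    // Assumption: 127.0.0.1 is added so the local shell can reach mongod.
    `  bindIp: 127.0.0.1,${ip}`,
    "security:",
    "  authorization: enabled",
    `  keyFile: ${keyFile}`,
    "replication:",
    `  replSetName: "${replSetName}"`,
  ].join("\n");
}

const nodes = {
  "mongo-db1": "192.168.0.29",
  "mongo-db2": "192.168.0.30",
  "mongo-db3": "192.168.0.31",
};

for (const [name, ip] of Object.entries(nodes)) {
  console.log(`# ${name}`);
  console.log(
    renderMongodConf({
      ip,
      keyFile: "/etc/mongodb/keys/mongo-key",
      replSetName: "replicaset-01",
    })
  );
}
```

The key file path and the replica set name must be identical on every node; only the IP address changes.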
MongoDB Replica Set Configuration Step 5: Initiate replication
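Log in to mongo-db1 (192.168.0.29), the node that will become the first primary, and open the MongoDB shell on that machine. On MongoDB 6.0 and later, the shell command is mongosh rather than mongo.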
$ mongo
> rs.initiate()
MongoDB Replica Set Configuration Step 6: Adding Instances to Replica set
Once the replica set is initiated, it’s time to add the other nodes to it with the following commands. Because Step 4 enables authorization, you may need to create an administrative user and authenticate before these commands are accepted.
rs.add("mongo-db2:27017")
rs.add("mongo-db3:27017")
Once you add the nodes, you will see { "ok" : 1 } in the output, which indicates a successful addition of the nodes to the replica set.
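As an alternative to initiating an empty set and adding members one by one, rs.initiate() also accepts a configuration document that lists all members up front. The sketch below builds such a document for the three nodes in this article; the helper name is hypothetical, and the final shell call is shown only as a comment because it needs a running mongod.

```javascript
// Hypothetical helper: builds a minimal configuration document for rs.initiate().
function buildReplicaSetConfig(name, hosts) {
  return {
    _id: name, // must match replSetName in the configuration file
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const config = buildReplicaSetConfig("replicaset-01", [
  "mongo-db1:27017",
  "mongo-db2:27017",
  "mongo-db3:27017",
]);

console.log(JSON.stringify(config, null, 2));

// In the MongoDB shell on mongo-db1, this would replace the separate
// rs.initiate() and rs.add() calls:
//   rs.initiate(config)
```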
MongoDB Replica Set Configuration Step 7: Check the Status
The status of the replica set can be checked with the following command:
rs.status()
The above command will produce output similar to the following (abbreviated). Member names appear as they were added to the set, so your output may show hostnames instead of IP addresses:
{
  "set" : "replicaset-01",
  "date" : ISODate("2022-02-10T06:15:02Z"),
  "myState" : 1,
  "members" : [
    {
      "_id" : 0,
      "name" : "192.168.0.29:27017",
      "health" : 1,
      "state" : 1,
      "stateStr" : "PRIMARY",
      "uptime" : 303165,
      "optime" : Timestamp(1644516902, 1),
      "optimeDate" : ISODate("2022-02-10T06:15:02Z"),
      "self" : true
    },
    {
      "_id" : 1,
      "name" : "192.168.0.30:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 302985,
      "optime" : Timestamp(1644516902, 1),
      "optimeDate" : ISODate("2022-02-10T06:15:02Z"),
      "lastHeartbeat" : ISODate("2022-02-10T06:15:02Z"),
      "lastHeartbeatRecv" : ISODate("2022-02-10T06:15:02Z"),
      "pingMs" : 0,
      "syncingTo" : "192.168.0.29:27017"
    },
    {
      "_id" : 2,
      "name" : "192.168.0.31:27017",
      "health" : 1,
      "state" : 2,
      "stateStr" : "SECONDARY",
      "uptime" : 302985,
      "optime" : Timestamp(1644516902, 1),
      "optimeDate" : ISODate("2022-02-10T06:15:02Z"),
      "lastHeartbeat" : ISODate("2022-02-10T06:15:02Z"),
      "lastHeartbeatRecv" : ISODate("2022-02-10T06:15:02Z"),
      "pingMs" : 0,
      "syncingTo" : "192.168.0.29:27017"
    }
  ],
  "ok" : 1
}
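The fields that matter most in this output are stateStr and health for each member. A minimal sketch of how they could be checked in a script is shown below; the function name is hypothetical, and the sample object is a shortened stand-in for real rs.status() output.

```javascript
// Summarizes the members of an rs.status()-style document by role and health.
function summarizeStatus(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  return {
    primary: primary ? primary.name : null,
    secondaries: status.members
      .filter((m) => m.stateStr === "SECONDARY")
      .map((m) => m.name),
    unhealthy: status.members
      .filter((m) => m.health !== 1)
      .map((m) => m.name),
  };
}

// Shortened stand-in for rs.status() output; values are illustrative.
const sampleStatus = {
  set: "replicaset-01",
  members: [
    { _id: 0, name: "192.168.0.29:27017", health: 1, stateStr: "PRIMARY" },
    { _id: 1, name: "192.168.0.30:27017", health: 1, stateStr: "SECONDARY" },
    { _id: 2, name: "192.168.0.31:27017", health: 1, stateStr: "SECONDARY" },
  ],
};

const summary = summarizeStatus(sampleStatus);
console.log(summary);

// In the MongoDB shell, the same function could be applied to live output:
//   summarizeStatus(rs.status())
```

A healthy three-node set should report exactly one primary, two secondaries, and an empty unhealthy list.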
Conclusion
In this article, you have learned how to set up MongoDB Replica Set Configuration with the help of a step-by-step procedure. MongoDB helps protect against data loss via its Replication feature. Using Replica sets, MongoDB provides Data Redundancy, High Availability, and Fault-tolerance by allowing users to create additional data replicas for dedicated purposes such as Disaster Recovery, Backup, Reporting, etc.
Apart from MongoDB, you would be using several applications and databases across your business for Marketing, Accounting, Sales, Customer Relationship Management, etc. To get a complete overview of your business performance, it is important to consolidate data from all these sources. To achieve this you need to assign a portion of your Engineering Bandwidth to Integrate Data from all sources, Clean & Transform it, and finally, Load it to a Cloud Data Warehouse or a destination of your choice for further Business Analytics. All of these challenges can be comfortably solved by a Cloud-Based ETL tool such as Hevo.
Visit our Website to Explore Hevo
Hevo, a No-code Data Pipeline, can seamlessly transfer data from a vast sea of 150+ sources such as MongoDB & MongoDB Atlas to a Data Warehouse or a Destination of your choice to be visualized in a BI Tool. It is a reliable, completely automated, and secure service that doesn’t require you to write any code!
Hevo will automate your data transfer process, hence allowing you to focus on other aspects of your business like Analytics, Customer Management, etc. Hevo provides a wide range of sources (150+ Data Sources, including 40+ Free Sources) that connect with 15+ Destinations. It will provide you with a seamless experience and make your work life much easier.
Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand.
Tell us about your experience of setting up the MongoDB Replica Set Configuration! Share your thoughts with us in the comments section below.