Two phase commit is a technique used in database management systems to ensure atomicity and consistency. How? It coordinates the saving of data (commit) across servers and the reversal of changes (rollback) when something goes wrong. A two phase commit MongoDB operation makes sure that failures during multi-document operations do not leave the database in an inconsistent state.
Transactional processing in MongoDB, and in databases in general, is expected to be ACID compliant. Operations on a single document in MongoDB are always atomic; writes that span multiple documents are not, unless they are wrapped in a multi-document transaction. Hence the need for a two phase commit style of coordination. A two phase commit performs commit operations on different database servers or nodes using a process flow that makes sure all the database servers stay consistent.
In this article, we’ll show you how a two phase commit is implemented in database servers in general, and how you can simulate a two phase commit operation in MongoDB.
What is MongoDB?
MongoDB is a popular open-source NoSQL database. It is a document-oriented dynamic schema database that stores data in JSON-like documents. This means you don’t have to worry about the data structure, the number of fields or the types of fields used to store values when storing your records. Documents in MongoDB are similar to JSON objects.
MongoDB stores and retrieves data through the WiredTiger storage engine, which has been its default since version 3.2. Since version 4.0 it also supports multi-document ACID transactions. In addition, it includes a comprehensive aggregation framework with pipelines, joins through the $lookup stage, and graph traversal through the $graphLookup stage.
You can change the structure of records (referred to as documents by MongoDB) by simply adding new fields or removing existing ones. This makes it simple to represent hierarchical relationships, arrays, and other more complex data structures. Many tech behemoths, including Facebook, eBay, Adobe, and Google, now use MongoDB to store massive amounts of data.
To learn more about MongoDB data modeling and the document relation types you can create in your database, check out our guide, Understanding MongoDB Data Modeling: A Comprehensive Guide. For best practices on optimizing your files and services in MongoDB, read MongoDB Configuration 101: Best Practices to Optimize your Files & Services.
Key Features of MongoDB
Compared to traditional relational databases, MongoDB has several distinguishing features. Some of these characteristics are discussed in greater detail below:
- Schema-less Database: MongoDB allows you to store different types of documents in a single collection (the equivalent of a table). In other words, multiple documents, each with its own set of fields, content, and size, can be stored in a single collection. As a result, MongoDB provides users with a great deal of flexibility.
- Document Indexing: Any field in a MongoDB document can be indexed with primary and secondary indexes, making data retrieval faster. Only the _id field is indexed by default.
- Scalability: Sharding in MongoDB enables horizontal scalability. Sharding is the process of distributing data across multiple servers. A large amount of data is partitioned into data chunks using the shard key, and these data chunks are evenly distributed across shards that span many physical servers.
- Replication: MongoDB ensures high data availability by replicating and distributing data across multiple servers, ensuring that if one fails, your data can still be retrieved from another.
Hevo Data, a No-Code Data Pipeline, helps load data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services and simplifies the ETL process. It supports 100+ Data Sources such as MongoDB, including 40+ Free Sources. Setup is a 3-step process: select the data source, provide valid credentials, and choose the destination.
Hevo will automate your data flow in minutes without writing any line of code. Its fault-tolerant architecture makes sure that your data is secure and consistent. Hevo provides you with a truly efficient and fully automated solution to manage data in real-time and always have analysis-ready data.
Let’s look at some of the salient features of Hevo:
- Fully Managed: It requires no management and maintenance as Hevo is a fully automated platform.
- Data Transformation: It provides a simple interface to perfect, modify, and enrich the data you want to transfer.
- Real-Time: Hevo offers real-time data migration. So, your data is always ready for analysis.
- Schema Management: Hevo can automatically detect the schema of the incoming data and map it to the destination schema.
- Connectors: Hevo supports 100+ integrations to SaaS platforms such as WordPress, FTP/SFTP, Files, Databases, BI tools, and Native REST API & Webhooks Connectors. It supports various destinations, including Data Warehouses such as Google BigQuery, Amazon Redshift, Snowflake, and Firebolt; Amazon S3 Data Lakes; Databricks; and MySQL, SQL Server, TokuDB, DynamoDB, MongoDB, and PostgreSQL databases, to name a few.
- Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
- Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
- Live Monitoring: Advanced monitoring gives you a one-stop view to watch all the activities that occur within Data Pipelines.
- Live Support: Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
What is Two Phase Commit MongoDB Operation?
This section explains what two phase commit operation is, in general, and in MongoDB.
Two phase commit enables a system to save data (commit) and to reverse changes when needed. Saving and reversing changes in a DBMS is made possible by transactional logging, in which each server logs information about the database operations it performs. In a distributed setting, each server keeps its own log. Imagine managing all of these! That’s where a two phase commit comes in.
How Does Two Phase Commit MongoDB Work?
There are two phases in the two phase commit operation. The first phase involves sending messages to each server:
- Phase 1: The coordinator sends a message to each of the servers asking it to prepare the transaction and record it in its own log. If the operation on a server is unsuccessful, that server returns a failure message; otherwise, it returns a success message.
The next step in the two phase commit operation is to commit.
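- Phase 2: Phase 2 is the committing phase. If every server responded with a success message in phase 1, the coordinator sends commit instructions to every connected server. If any server did not respond with a success message, nothing is committed: the coordinator tells the servers to roll back, and phase 1 has to be reinitiated if the transaction is to be retried.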
After the commit process, in the two phase commit MongoDB operation, the servers write the commits as part of the log records for reference purposes and send a message to the coordinator that the commit was successful.
If the commit was not successful, the coordinator sends a message to all the servers to roll back the transaction, and each server reports the result of its rollback to the coordinator.
These processes show you how the coordinator organizes and synchronizes the operations in a two phase commit operation.
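The message flow described above can be sketched in plain JavaScript. This is a minimal in-memory simulation under stated assumptions, not MongoDB code: the Participant class, the twoPhaseCommit function, and the canCommit flag are hypothetical names introduced for illustration, and each server's log is just an array.

```javascript
// Minimal in-memory simulation of a two phase commit (illustrative names only).
class Participant {
  constructor(name, canCommit = true) {
    this.name = name;
    this.canCommit = canCommit; // hypothetical switch to simulate a local failure
    this.log = [];              // stands in for the server's transaction log
    this.state = "idle";
  }

  // Phase 1: record the pending change in the log and vote.
  prepare(change) {
    if (!this.canCommit) {
      this.log.push({ type: "prepare-failed", change });
      this.state = "failed";
      return false; // failure message
    }
    this.log.push({ type: "prepared", change });
    this.state = "prepared";
    return true; // success message
  }

  // Phase 2 (success path): make the change permanent.
  commit() {
    this.log.push({ type: "committed" });
    this.state = "committed";
  }

  // Phase 2 (failure path): undo whatever was prepared.
  rollback() {
    this.log.push({ type: "rolled-back" });
    this.state = "rolled-back";
  }
}

// Plays the role of the coordinator.
function twoPhaseCommit(participants, change) {
  // Phase 1: ask every participant to prepare and collect the votes.
  const votes = participants.map((p) => p.prepare(change));

  // Phase 2: commit only if every vote was a success, otherwise roll back all.
  if (votes.every(Boolean)) {
    participants.forEach((p) => p.commit());
    return "committed";
  }
  participants.forEach((p) => p.rollback());
  return "rolled-back";
}

const outcome = twoPhaseCommit(
  [new Participant("A"), new Participant("B")],
  { balance: 100 }
);
console.log(outcome); // committed
```

Passing false as the second constructor argument for any participant makes the coordinator roll back every participant, including the ones that voted for success.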
Drawback of Two Phase Commit Operation
For all the synchronization and consistency benefits of two phase commits, there’s a major drawback: the coordinator.
The coordinator is responsible for almost every operation that takes place in a two phase commit. If it fails, transactions are left uncompleted and unsynchronized, which creates inconsistency. This is because after a server sends its feedback message to the coordinator, it cannot proceed until a commit or rollback instruction is received. For this reason, two phase commit is known as a blocking protocol.
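The blocking behaviour can be illustrated with a small sketch. It assumes a server that locks a record once it has voted in phase 1; the BlockingParticipant class and its methods are hypothetical and exist only to show why a missing coordinator decision stalls other work.

```javascript
// Sketch: a server that has voted must wait for the coordinator's decision.
class BlockingParticipant {
  constructor() {
    this.value = 0;
    this.pending = null; // change held while waiting for commit or rollback
  }

  // Phase 1: stage the change and vote yes. From here on the record is locked.
  prepare(newValue) {
    this.pending = newValue;
    return true;
  }

  // Any other write is refused while a decision is outstanding.
  write(newValue) {
    if (this.pending !== null) {
      throw new Error("blocked: waiting for coordinator decision");
    }
    this.value = newValue;
  }

  // Phase 2: only the coordinator's decision releases the lock.
  decide(decision) {
    if (decision === "commit") this.value = this.pending;
    this.pending = null;
  }
}

const server = new BlockingParticipant();
server.prepare(42);

// If the coordinator crashes now, no decision arrives and writes keep failing.
try {
  server.write(7);
} catch (err) {
  console.log(err.message); // blocked: waiting for coordinator decision
}

// Once the coordinator recovers and sends its decision, work can resume.
server.decide("commit");
server.write(7);
console.log(server.value); // 7
```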
Now that you’ve seen how a two phase commit works independently of any particular database management system, let’s get into the main purpose of this article: how do we implement a two phase commit in MongoDB?
How to Implement Two Phase Commit MongoDB Operation?
Before we proceed, note that MongoDB does not offer a two phase commit operation that applications can invoke directly. MongoDB’s document model lets you group related data in a single document using arrays and embedded documents, so many updates never need to span documents. For cases where related data is not contained in a single document, MongoDB supports multi-document transactions, although these should be considered a last resort.
So, how do we simulate a two phase commit MongoDB operation with transactional processing in MongoDB?
Two Phase Commit MongoDB Operation Use Case
A Reservation Processing System
This use case uses Node.js to implement the two phase commit process with MongoDB; however, the process flow is similar in other technologies.
Two Phase Commit MongoDB Operation Use Case: Pre-Configuration
Firstly, we need to install a module that allows us to set up replica sets.
A replica set in MongoDB is a group of MongoDB processes that maintain the same data set. Without a replica set, we can’t use transactions in MongoDB.
The module for configuring replica sets in MongoDB is ‘run-rs’, which is installed from npm (npm install run-rs -g) and started with the run-rs command. Just in case you’re wondering, it doesn’t conflict with your installed MongoDB and requires zero configuration.
Steps to Set Up Two Phase Commit MongoDB Process
Step 1: Create a main.js file.
The main.js file will contain all our transactional processes. The first thing to do is set up the database connection.
The connection code does two things. First, it specifies the connection URI for our MongoDB replica set, which includes the database name ‘test’, and connects to the database with client.connect(). Second, it defines a createListing function that takes two arguments: client (our connected database client) and newListing, which is the listing data for a reservation.
To create a listing, pass the listing data to the createListing function.
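The connection step described above might look roughly like the following. This is a sketch, not the article's original listing: it assumes the official mongodb Node.js driver is installed, that a local replica set named rs is listening on ports 27017 to 27019, and that listings live in a listingsAndReviews collection. The sample listing fields are hypothetical.

```javascript
// Sketch of main.js. The URI (hosts, ports, replica set name) is an assumption
// about a local replica set; adjust it to match your own setup.
const uri =
  "mongodb://localhost:27017,localhost:27018,localhost:27019/test?replicaSet=rs";

// Inserts one listing document; `client` is a connected MongoClient.
async function createListing(client, newListing) {
  const result = await client
    .db() // no name given, so the database named in the URI ("test") is used
    .collection("listingsAndReviews")
    .insertOne(newListing);
  console.log(`New listing created with id: ${result.insertedId}`);
  return result.insertedId;
}

async function main() {
  // Loaded lazily so the driver is only needed when main() actually runs.
  const { MongoClient } = require("mongodb");
  const client = new MongoClient(uri);
  try {
    await client.connect();
    await createListing(client, {
      name: "Infinite Views", // hypothetical sample data
      datesReserved: [],
    });
  } finally {
    // Close the connection whether or not the insert succeeded.
    await client.close();
  }
}

// Uncomment to run against a live replica set:
// main().catch(console.error);
```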
Next up is to set up transaction functions for the transactional processes.
Step 2: Setup transactions.
You need to set up transaction functions to initiate transactional processes. The source code included here is a sample implementation and is just for demonstration.
Next, add a reservation to the reservations array for the appropriate document in the users collection.
Check if the listing is already reserved for those dates. If so, abort the transaction.
Now, add the reservation dates to the datesReserved array for the appropriate document in the listingsAndReviews collection.
Step 3: End the session.
The createReservation function is where the bulk of the work happens. It takes client, userEmail, nameOfListing, reservationDates, and reservationDetails. The client argument is the database client we created in step 1; userEmail and the rest of the arguments are supplied by the caller.
The next steps involve getting references to the two collections used in the multi-document transaction, users and listingsAndReviews, and building the reservation. The reservation is built by a createReservationDocument helper that takes nameOfListing, reservationDates, and reservationDetails.
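A helper with that signature could be as small as the sketch below. The field names name and dates, and the sample values, are assumptions made for illustration; the article only fixes the three parameters.

```javascript
// Builds the reservation sub-document stored in a user's reservations array.
// The field names (name, dates) are assumptions made for this sketch.
function createReservationDocument(nameOfListing, reservationDates, reservationDetails) {
  return {
    name: nameOfListing,
    dates: reservationDates,
    // Copy any extra details (price, guests, and so on) onto the reservation.
    ...reservationDetails,
  };
}

const reservation = createReservationDocument(
  "Infinite Views",
  [new Date("2019-12-31"), new Date("2020-01-01")],
  { pricePerNight: 180, breakfastIncluded: true }
);
console.log(Object.keys(reservation).join(", "));
// name, dates, pricePerNight, breakfastIncluded
```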
The call to client.startSession() starts a new session, which all operations in the transaction are tied to. The succeeding code block does the following:
- Adds a reservation to the reservations array of the matching document in the users collection.
- Checks whether the listing is already reserved for any of the reservation dates. If it is, the transaction is aborted and the earlier update is rolled back, which mirrors the all-or-nothing nature of a two phase commit.
- Adds the reservation dates to the datesReserved array in the listingsAndReviews collection. The dates are added with MongoDB’s $addToSet update operator.
- Reports the matchedCount and modifiedCount from listingsAndReviewsUpdateResults.
To clarify the logic in the code, anything inside the session.withTransaction() function is part of the transaction. The operations inside run one after another and are committed or rolled back as a unit, so the transaction as a whole either succeeds or fails. That outcome determines the action to be performed after the transaction.
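Putting the steps together, a createReservation function along these lines would match the description above. Treat it as a sketch rather than the original listing: it assumes the official mongodb Node.js driver, a client connected to a replica set, users documents keyed by an email field, and listings keyed by a name field. The committed flag and the shape of the reservation are assumptions added for illustration.

```javascript
// Sketch of createReservation; `client` is a connected MongoClient.
async function createReservation(client, userEmail, nameOfListing, reservationDates, reservationDetails) {
  const users = client.db().collection("users");
  const listings = client.db().collection("listingsAndReviews");

  // Inlined equivalent of the createReservationDocument helper (assumed shape).
  const reservation = { name: nameOfListing, dates: reservationDates, ...reservationDetails };

  const session = client.startSession();
  let committed = false;
  try {
    await session.withTransaction(async () => {
      committed = false; // reset in case the driver retries the callback

      // 1. Add the reservation to the user's reservations array.
      await users.updateOne(
        { email: userEmail },
        { $addToSet: { reservations: reservation } },
        { session }
      );

      // 2. Abort if the listing is already reserved for any requested date.
      const clash = await listings.findOne(
        { name: nameOfListing, datesReserved: { $in: reservationDates } },
        { session }
      );
      if (clash) {
        await session.abortTransaction(); // also undoes the update from step 1
        return;
      }

      // 3. Record the dates on the listing.
      const listingsAndReviewsUpdateResults = await listings.updateOne(
        { name: nameOfListing },
        { $addToSet: { datesReserved: { $each: reservationDates } } },
        { session }
      );
      console.log(
        `${listingsAndReviewsUpdateResults.matchedCount} matched, ` +
          `${listingsAndReviewsUpdateResults.modifiedCount} modified`
      );
      committed = true;
    });
  } finally {
    // Always end the session, whether the transaction committed or not.
    await session.endSession();
  }
  return committed;
}
```

Every operation receives the same session option; an operation issued without it would run outside the transaction and would not be rolled back on abort.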
This code implementation is a sample implementation of a simulated two phase commit MongoDB process. As a recap of the major points of action, we covered:
- Database connection.
- Creation of multiple documents.
- Process of starting the session, and
- Starting transactions in the session.
With these four processes in mind, you can implement transactional processing in MongoDB using two phase commit MongoDB process.
Conclusion
Transactional processing in MongoDB is a simulated two phase commit MongoDB operation. This article has explained the process flow of two phase commit in database management systems and two phase commit MongoDB. It has also shown you the sample implementation in Node.js from the development view.
As said earlier, MongoDB uses the document model to group related document data rather than using a multi-document structure. However, if there is a need for multi-document transactions, the best approach is with transactional processing in MongoDB.
MongoDB Database stores valuable business data that can be used to generate insights. Companies need to analyze their business data stored in multiple data sources. The data needs to be loaded to the Data Warehouse to get a holistic view of the data.
Hevo Data is a No-code Data Pipeline solution that helps to transfer data from 100+ sources to desired Data Warehouse. It fully automates the process of transforming and transferring data to a destination without writing a single line of code.
Want to take Hevo for a spin? Sign Up here for a 14-day free trial and experience the feature-rich Hevo suite first hand.
Share your experience of learning about two phase commit MongoDB operation in the comments section below!