MongoDB to MySQL ETL: Steps to Migrate Data Easily

on Tutorial • February 21st, 2020 • Write for Hevo

MongoDB is a NoSQL database that stores objects in a JSON-like structure. It is usually classified as document-oriented storage because of the way it treats objects as documents. Schemaless databases like MongoDB offer unique versatility because of their ability to store semi-structured data. MySQL, on the other hand, is a structured database with a hard schema. It is a usual practice to use NoSQL databases for use cases where the number of fields will evolve as the development progresses. At some point in time when the use case matures itself, organizations will notice the overhead introduced because of their NoSQL schema and would want to migrate the data to hard structured databases with comprehensive querying ability and predictable query performance.  In this, we will focus on methods to move data from MongoDB to MySQL.

Understanding MongoDB

NoSQL databases offer a unique benefit when compared to their SQL counterparts – The ability to dynamically add new keys or attributes without worrying about a hard structure. This ability coupled with a comprehensive querying layer makes MongoDB one of the most widely used NoSQL databases. MongoDB Atlas makes it possible to create MongoDB clusters in any of the available cloud platforms in a short time. Developers of MongoDB also can choose to install it on their on-premise infrastructure. It is a true cloud agnostic horizontally scalable NoSQL database. MongoDB also supports master-slave based replication making it a highly available database.

Conceptually, data in MongoDB is stored into key-value pairs called documents. A document is a collection of key and value pairs. Documents for the same purpose or end goal are grouped together to form collections. A collection is vaguely equivalent to a table in an SQL based database. Multiple collections from a database. MongoDB supports indexes over collections. It is also capable of using the Map-Reduce paradigm for aggregation over a large amount of data. 

Understanding MySQL

MySQL is the most widely accepted SQL based database which powers the day to day operations for some of the biggest names in the industry including Facebook, Github, etc. It has a very comprehensive querying layer that can even process programming languages like  Python. It supports a wide variety of data types for the table fields including JSON. MySQL tables support granular read and write access level controls even to the extent of individual fields. Security can also be enforced using encrypting the entries – Tables and binary logs can be encrypted. 

Conceptually, MySQL is a query layer that can run on top of a number of storage engines. InnoDB is the default storage engine. There is support for other storage engines based on memory as well as file storage. Memory-based storage engines store data in RAM, enabling MySQL to run as an in-memory database. Other than the SQL interface and parser, the querying layer also contains an optimizer and cache. There is also a connection pool that works with these components to manage the incoming connection requests. 

Methods to Move Data from MongoDB to MySQL

There are two popular methods to perform MongoDB to MySQL ETL.

Method 1: Use a fully-managed, hassle-free solution like Hevo Data – 14 Day Free Trial.

Method 2: Write a Custom Code to move data from MongoDB to MySQL.

In this post, we will deep dive on Method 2. Towards the end, we will also discuss the cons of this approach and discover easier ways to load data from MongoDB to MySQL.

MongoDB to MySQL: Moving Data Using Custom Code

MongoDB and MySQL are extremely different databases with different schema strategies. This means there are a lot of things to consider before moving your data from a Mongo collection to MySQL. The simplest of the migration will contain the few steps below.

Use the default mongoexport tool to create a CSV from the collection.

mongoexport --host localhost --db classdb --collection student --type=csv  --out students.csv --fields first_name,middle_name,last_name, class,email

In the above command, classdb is the database name, student is the collection name and students.csv is the target CSV file which will contain data from MongoDB. An important point here is the –field attribute. This attribute should have all the list of fields that you plan to export from the collection. If you think about it, MongoDB follows a schema-less strategy and there is no way to ensure that all the fields are there in all the documents. If MongoDB was being used for the purpose it was designed for, there is a big chance that not all documents in the same collection have all the attributes. Hence while doing this export, you should be sure that these fields are there in all the documents. If the fields are not in all documents, MongoDB will not throw an error, but will just populate an empty value in its place.

Create a student table in MySQL to accept the new data.

CREATE TABLE students ( id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY, firstname VARCHAR(30) NOT NULL, middlename VARCHAR(30) NOT NULL, lastname VARCHAR(30) NOT NULL, class VARCHAR(30) NOT NULL, email VARCHAR(30) NOT NULL, )

Load the data into the MySQL table using the below command.

LOAD DATA LOCAL INFILE 'students.csv' INTO TABLE students FIELDS TERMINATED BY ','  ENCLOSED BY '"' LINES TERMINATED BY 'n' (firstname,middlename,lastname,class,email)

You have the data from MongoDB loaded into MySQL now.

Another alternative to this process would be to exploit the document storage capability of MySQL. This allows MongoDB documents to be directly loaded as a MySQL collection rather than a MySQL table. The caveat is that you will not be able to use the true power of MySQL’s structured data storage. And in most cases, that will be the reason why you chose to move the data to MySQL in the first place. 

But the above set of steps only works for a limited set of use cases and does not reflect the true challenges in migrating a collection from MongoDB to MySQL. Let us look into them in the next section.

MongoDB to MySQL: Limitations of Custom Code Approach

  1. In real life, there are going to be hundreds of fields in a collection and creating the first command(the first step) itself is going to be a pain trying to manually write all the fields.
  2. This approach needs you to have access to an instance with enough storage space to store the whole contents of the collection and then execute the LOAD DATA command.
  3. In MongoDB use cases, typically there will be many attributes that do not exist in all the documents. This method will populate empty values in such cases. This is not a desirable behavior and in most cases, you will need to write a custom script to handle this.
  4. This method is suitable for a first-time bulk load, but if your use case needs frequent syncing, it would be tough to handle the duplicate data rows and scheduling. 
  5. This is going to be an extremely time-consuming process if the source collection is large. If MongoDB is being used for serving a live website or mobile app, the developer will need to build additional throttling logic so that the incoming traffic is not affected.
  6. You could also end up in trouble if the fields inside your MongoDB has arrays as their values. These will need to be specifically handled using a script since the simple CSV export can lead to unexpected results here. 

A solution to all these complexities will be to use a third-party cloud-based ETL tool like Hevo.  Hevo can mask all the above concerns and provide an elegant migration process for your MongoDB collections. 

Simpler WAY TO MOVE DATA FROM MONGODB TO MySQL

Using a fully managed, no-code Data Pipeline platform like Hevo Data, you can load your data from MongoDB (and many other data sources) to MySQL in real-time effortlessly. Hevo makes complex data load from MongoDB to MySQL a cakewalk in 3 simple steps:

  • Authenticate and connect to your MongoDB database.
  • Select the replication mode: (a) Full Dump and Load (b) Incremental load for append-only data (c) Incremental load for mutable data.
  • Configure the MySQL destination and start moving data.

Hevo automatically flattens all the nested JSON data coming from MongoDB and automatically maps it to MySQL destination without any manual effort. 

Sign up for a 14-day free trial to experience a hassle-free data load from MongoDB to MySQL. 

No-code Data Pipeline for your Data Warehouse