MongoDB is a leading NoSQL database that offers a wide range of functionality and the versatility to store many data types and structures. PostgreSQL is a robust, more traditional RDBMS platform. Its rigid relational structure is good for storage but can restrict how easily you derive insights from your data. Moving data from PostgreSQL to MongoDB helps resolve this issue to some extent.

This article provides a step-by-step guide to transferring data from PostgreSQL to MongoDB.

What is MongoDB?

MongoDB is a schema-free NoSQL database developed by MongoDB Inc. It is written in C++ and JavaScript. It stores data in collections of documents, with the option of enforcing schemas where needed, and it doesn't follow the structure of a traditional database, in which data is stored in the form of rows.

Like most RDBMSs, MongoDB is easy to use, and its NoSQL model makes it approachable for individuals with little or no prior programming knowledge. MongoDB processes data in a semi-structured format, allowing large volumes of data to be processed in one go. It can be hosted on virtually every major cloud platform, including Google Cloud, Microsoft Azure, and Amazon Web Services.

MongoDB uses Binary JSON (BSON) and MQL as an alternative to SQL. BSON supports data types such as floating-point numbers, longs, and dates that regular JSON does not. MQL offers additional capabilities compared to standard SQL, making it a better fit for MongoDB's JSON-like documents.
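
For illustration, here is a minimal sketch (assuming a local MongoDB instance and a hypothetical users collection in a database called mydb) of how the mongosh shell accepts BSON types and MQL queries directly:

# Insert a document using BSON types that plain JSON lacks (Date and 64-bit long).
mongosh mydb --quiet --eval \
  'db.users.insertOne({ name: "Bob", signedUp: new Date(), totalLogins: NumberLong(10) })'

# An MQL query, roughly the equivalent of: SELECT * FROM users WHERE totalLogins > 5;
mongosh mydb --quiet --eval 'db.users.find({ totalLogins: { $gt: 5 } })'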

Solve your data replication problems with Hevo’s reliable, no-code, automated pipelines with 150+ connectors.
Get your free trial right away!

Key Features of MongoDB

MongoDB offers a wide range of features that make it a strong alternative to other databases. Some of these features are:

  • You can search by field, range query, and regular expression, which gives MongoDB strong support for ad hoc queries (see the sketch after this list).
  • It is a schemaless database, written in C++, that provides high performance.
  • Any field in a document can be indexed, and MongoDB supports master-slave replication.
  • It has an automatic load-balancing configuration that groups related data into shards across the database.
  • You can easily store files of any size (via GridFS) without complicating your stack, and manage them even in the event of a failure.
  • It lets you use JavaScript functions in place of stored procedures.
  • MongoDB also supports JSON data models, automated sharding, and embedded replication for high scalability and availability.
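
As a quick illustration of ad hoc querying and indexing (a minimal sketch; the users collection, field names, and database name are hypothetical):

# Index a field, then query by field, range, and regular expression in one go.
mongosh mydb --quiet --eval '
  db.users.createIndex({ lastName: 1 });
  db.users.find({ lastName: /^Smi/, loginCount: { $gte: 1, $lte: 100 } });
'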

To learn more about MongoDB, click this link.

Simplify Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, helps you load data from any data source, such as databases, SaaS applications, cloud storage, SDKs, and streaming services, and simplifies the ETL process. It supports 100+ data sources (including 30+ free data sources) like Asana, and setup is a 3-step process: select the data source, provide valid credentials, and choose the destination. Hevo not only loads the data onto the desired Data Warehouse/destination but also enriches the data and transforms it into an analysis-ready form without your having to write a single line of code.

GET STARTED WITH HEVO FOR FREE

Its completely automated pipeline delivers data in real-time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss, and it supports different forms of data. The solutions provided are consistent and work with a variety of BI tools as well.

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
SIGN UP HERE FOR A 14-DAY FREE TRIAL

What is PostgreSQL?

PostgreSQL, also known as Postgres, is a free, open-source RDBMS platform that is SQL-compliant and highly extensible. It was released on July 8, 1996, and was designed at the University of California, Berkeley, as a successor to the aging Ingres database. PostgreSQL is more traditional in how it stores data: it uses structured tables, unlike MongoDB, which uses documents in JSON format.

It is programmed in C and follows a monolithic architecture, meaning all of its components are united and work together systematically. It has a wide community and offers community support, with additional support for paid customers. It is widely used in the healthcare, banking, and manufacturing industries, and it permits the use of innovative backup mechanisms.

Key Features of PostgreSQL

PostgreSQL houses some unique features that make it a good alternative to other traditional RDBMSs. Some of these features are:

  • It supports a wide variety of data types, including document types.
  • Its monolithic architecture means the components work together in an automated manner.
  • It is ideal for banking systems and transactional workflows that involve risk management, BI, and more.
  • Storage is reliable thanks to its many fail-safes and redundancies.
  • It is open-source, free of cost, and feature-rich.
  • Its scalability is limited, as its processing power depends on the machine it runs on.
  • It runs on all major operating systems and is ACID (Atomicity, Consistency, Isolation, and Durability) compliant; a short transaction sketch follows this list.
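
Here is a minimal sketch of that ACID behavior (the accounts table and database name are hypothetical): both updates commit together, or neither does.

# Run an atomic transfer between two rows; a failure before COMMIT rolls back both updates.
psql -d mydb <<'SQL'
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
SQL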

To learn more about PostgreSQL, click this link.

Steps to Connect PostgreSQL to MongoDB

Transferring data from PostgreSQL to MongoDB is a simple process. Its complexity depends on the structure of the data in the PostgreSQL database and the document structure required in the MongoDB database.

Moving the data is a two-step process: export the data from PostgreSQL, then import it into MongoDB using the mongoimport command.

There are two methods to move data from PostgreSQL to MongoDB.
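
In outline, the two steps look like this (a sketch only; the table, file, and connection details are placeholders, and both methods below show complete variants):

# Step 1: export from PostgreSQL to a flat file.
psql -d mydb -c "\copy (SELECT * FROM users) TO '/tmp/users.tsv' WITH (FORMAT csv, DELIMITER E'\t', HEADER true)"

# Step 2: import the file into MongoDB.
mongoimport --uri "mongodb://localhost:27017/mydb" \
  --collection users --type tsv --file /tmp/users.tsv --headerline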

1. PostgreSQL to MongoDB: Using TSV to transfer data

The TSV format is a fast and clean option for PostgreSQL to MongoDB. However, it works best if the original PostgreSQL schema is relatively simple and you don't need to embed documents within other documents in a one-to-many relationship.

For instance, if you want to create a collection of documents that represent customers and plan on embedding each customer's orders in those documents, TSV won't work, as it doesn't support hierarchical data structures. Each row in the TSV becomes a document. You can still map values in each row to nested fields in your documents; you just can't embed documents within them.

You could create an address field with nested state and city fields, as in the example below. However, you couldn't store multiple address entities.

Consider this "users" table created in PostgreSQL:

postgres=# select * from users;
 userid | first_name | last_name | logins | upgraded | city | state 
--------+------------+-----------+--------+----------+------+-------
      1 | Bob        | Smith     |     10 | t        | NYC  | NY

The following COPY statement exports the table as TSV, renaming each column along the way:

COPY (SELECT
        userid AS "userId.int32()",
        logins AS "loginCount.int32()",
        first_name AS "firstName.auto()",
        last_name AS "lastName.auto()",
        CASE WHEN upgraded=TRUE THEN 'TRUE' ELSE 'FALSE' END AS "upgradedAccount.boolean()",
        city AS "address.city.auto()",
        state AS "address.state.auto()"
    FROM
        users
) TO '/tmp/users.tsv' WITH (FORMAT CSV, HEADER TRUE, DELIMITER E'\t');

As seen in the example, each exported column is renamed using an AS alias of the form mongoFieldName.type(); for instance, userId.int32() is such an alias. When mongoimport executes the import, it parses the header and creates fields based on these types. In the general case, you can use auto() and let mongoimport determine the type from context.
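
Given those aliases, the exported /tmp/users.tsv should look roughly like this (fields are tab-separated; spacing widened here for readability):

cat /tmp/users.tsv
userId.int32()  loginCount.int32()  firstName.auto()  lastName.auto()  upgradedAccount.boolean()  address.city.auto()  address.state.auto()
1  10  Bob  Smith  TRUE  NYC  NY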

For the UPGRADED column, which is of type BOOLEAN, PostgreSQL returns t for true and f for false, and those default values won't be recognized by mongoimport, hence the CASE statement used to map them to TRUE and FALSE.

Limited data nesting can also be achieved with the TSV migration process; the city and state fields above, mapped to address.city and address.state, are an example.

The last line of the query exports the data as CSV with a tab delimiter, which makes it TSV. This command imports the file into MongoDB:

mongoimport --uri "mongodb+srv://<mongodb_user>:<mongodb_password>@<atlas-cluster>.mongodb.net/<DATABASE>" \
  --collection users --type tsv --file users.tsv --headerline --columnsHaveTypes

This will result in the following document in the "users" collection:

{
    "userId" : 1,
    "loginCount" : 10,
    "firstName" : "Bob",
    "lastName" : "Smith",
    "upgradedAccount" : true,
    "address" : {
        "city" : "NYC",
        "state" : "NY"
    }
}
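
To spot-check the import (substitute your own connection string and database name):

mongosh "mongodb+srv://<mongodb_user>:<mongodb_password>@<atlas-cluster>.mongodb.net/<DATABASE>" \
  --quiet --eval 'db.users.findOne()'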

2. PostgreSQL to MongoDB: Using JSON to transfer data

This method for moving data from PostgreSQL to MongoDB is effective when the database schema is complex and the final documents require nested arrays of records. In that case, using JSON for the data migration is the best method.

To generate JSON from the results of a PostgreSQL query, you need three functions (a short demonstration follows the list):

  • row_to_json: returns a row as a JSON object, with column names as keys and column values as values.
  • array_to_json: returns an array of data as a JSON array.
  • array_agg: accepts a set of values and returns an array in which each value in the set becomes an element of the array.
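
Here is that demonstration (a minimal sketch; mydb is a placeholder for any database you can connect to), with the expected output shown in comments:

psql -d mydb <<'SQL'
-- row_to_json: one row becomes one JSON object.
-- Returns: {"id":1,"name":"Bob"}
SELECT row_to_json(u) FROM (SELECT 1 AS id, 'Bob' AS name) u;

-- array_agg collects a set of values; array_to_json renders it as a JSON array.
-- Returns: [1,2,3]
SELECT array_to_json(array_agg(x)) FROM generate_series(1, 3) x;
SQL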

Let's look at an orders table, which in our relational schema keeps a record for every product ordered by a user:

 id | userid | product | quantity | price 
----+--------+---------+----------+-------
  1 |      1 | shoes   |        4 | 50.75
  2 |      1 | razer   |       20 |  1.75

Here is an example of a PostgreSQL to MongoDB query using all three functions:

COPY (SELECT row_to_json(results)
FROM (
  SELECT userid, first_name, last_name,
    (
      SELECT array_to_json(array_agg(o))
      FROM (
        SELECT id, product, quantity, price
        FROM orders
        WHERE orders.userid = users.userid
      ) o
    ) AS orders
  FROM users
) results) TO '/tmp/orders.json' WITH (FORMAT text, HEADER FALSE);

The query above creates a file, orders.json, with one JSON document per user from the users table:

{
    "userid" : 1,
    "first_name" : "Bob",
    "last_name" : "Smith",
    "orders" : [
        {
            "id" : 1,
            "product" : "shoes",
            "quantity" : 4,
            "price" : 50.75
        },
        {
            "id" : 2,
            "product" : "razer",
            "quantity" : 20,
            "price" : 1.75
        }
    ]
}

Once you have run the query and saved the file, you can use mongoimport to import it:

mongoimport --uri "mongodb+srv://<mongodb_user>:<mongodb_password>@<atlas-cluster>.mongodb.net/<DATABASE>" \
  --collection orders --file orders.json
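
As before, you can spot-check the result (substitute your own connection details):

mongosh "mongodb+srv://<mongodb_user>:<mongodb_password>@<atlas-cluster>.mongodb.net/<DATABASE>" \
  --quiet --eval 'db.orders.countDocuments()'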

Conclusion

MongoDB is a leading database in the world of NoSQL platforms. It provides a wide range of functionality as well as flexibility in the structure and types of data it can store. PostgreSQL is also a renowned name in the field of RDBMS platforms; it is more traditional but robust. This article provided a guide to connecting PostgreSQL to MongoDB.

MongoDB and PostgreSQL are trusted platforms that a lot of companies use for the many benefits they provide, but transferring data from them into a data warehouse is a hectic task. An automated data pipeline helps solve this issue, and this is where Hevo comes into the picture. Hevo Data is a No-code Data Pipeline with 100+ pre-built integrations that you can choose from.

VISIT OUR WEBSITE TO EXPLORE HEVO

Hevo can help you integrate data from numerous sources and load it into a destination to analyze real-time data with a BI tool such as Tableau. It will make your life easier and make data migration hassle-free. It is user-friendly, reliable, and secure.

SIGN UP for a 14-day free trial and see the difference!

Share your experience of learning about steps to connect PostgreSQL to MongoDB in the comments section below.
