In relational database management systems (RDBMS), join operations are commonly employed for cross-table data analysis. However, MongoDB, as a NoSQL database, diverges from the norm by adopting a denormalized data model. Here, related data is stored together in a single document, contrasting with the traditional approach of spreading it across multiple tables or collections.

But MongoDB does provide a join functionality between collections where needed, through its rich $lookup aggregation pipeline stage. In this article, we will walk through MongoDB join two collections using practical examples and explore how to filter joined data using $lookup features like the where clause and the compass operator. Let’s get started!

What is MongoDB?

MongoDB is a non-relational Database Management System that is Open Source and Cross-Platform. MongoDB, which was founded in 2009, employs the Document-Oriented Database Model to organize data into documents and collections rather than tables. This enables it to store various types of data. Furthermore, it is distributed under the Server Side Public License (SSPL), which supports a novel mechanism for storing and retrieving large amounts of data. MongoDB is a straightforward and simple-to-configure database that provides high performance, automatic scalability, and high availability.

How To Join Two Collections In MongoDB?

MongoDB has some issues with linking data from one collection to another unless you use simple Script Query functions. To resolve this problem, we introduce the JOIN concept, which facilitates the relationship between the data.

We can join documents on collections in MongoDB by using the $lookup (Aggregation) function. $lookup(Aggregation) creates an outer left join with another collection and helps to filter data from merged data. If documents are part of a “joined” collection, the $lookup (Aggregation) function will return documents as a subarray of the original collection.

Syntax

In MongoDB’s JOIN operation, the goal is to connect one set of data to another set of data. To join two collections, we use the $lookup operator, whose syntax is defined below:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

The $lookup function accepts a document containing these fields:

  • From: It describes the collection within an identical database that should be used for executing the join, but with sharded collection restrictions.
  • LocalField: This field represents the input field from the documents to the stage of $lookup, which performs an equivalence match between the local-field and the foreign-field from the collection ‘from.’ If an input document does not have a value for a local-field, then this operator will give the field a null value for matching purposes.
  • foreignField: This field contains data from the documents in the ‘from’ collection with which an equivalence match can be made between the foreignField and the localField. When a document in the collection ‘from’ does not have a foreignField value, this operator will set the field to null for further matching purposes.
  • As: It specifies the name of the array field that needs to be added to the input documents. More so, a new array field also includes matching documents from the collection ‘from.’ The prevailing field will be overwritten if the stated name already exists in the input document.

In SQL, when a JOIN operation is performed, the result is always a new row that combines all the fields present from the parent and foreign tables. However, in MongoDB, the output documents are supplemented by an array of native collection documents.

Steps for Joining Two Collections in MongoDB

For performing MongoDB Join two collections, you must use the $lookup operator. It is defined as a stage that executes a left outer join with another collection and aids in filtering data from joined documents.

For example, if a user requires all grades from all students, then the below query can be written:

Students.aggregate([{
$lookup: {
From: ‘Grades’,
LocalField: ‘Student_id’,
foreignField: ‘Stud_id’,
as: ‘Student_grade’
}}]);

From the above query, we can see that the initial parameter named ‘from’ specifies the collection which is to be joined with the present one, and the ‘localField’ specifies the key in the current collection, which is to be matched with the foreign key in another collection by the ‘foreignField’ parameter. For retrieving the data, Student_id from Students and stud_id from Grades are matched and joined. In addition, the end parameter ‘as’ is added as a code-named to the data.

It is worth noting that here, within every record of the output data from the query above, the data from the joined table also appears as the primary element within an array. Similarly, every row from the output data from the query above will display as follows:

{_ID: String, Quantity: Number, Student_id: String, Student_grade: Array, Stud_id: String, Student_name: String}

The data that was joined by the query will be the first element in the grades array.

Example 1

The following example creates a sample dataset and inserts some documents to perform MongoDB Join two collections:

db.address.insertMany(
    [
        {
            "name": "Bob",
            "blk_no": 22,
            "street" : "dewey street",
            "city" : "United States of America"
        },
        {
            "name": "Jack",
            "blk_no": 25,
            "street" : "gordon street",
            "city" : "New Zealand"
        }
    ]
);
{
        "acknowledged" : true,
        "insertedIds" : [
                ObjectId("613594cdb59313217373673c"),
                ObjectId("613594cdb59313217373673d")
        ]
}

A few documents were inserted into the address collection. Now let’s add a few documents to another collection:

db.userInfo.insertMany(
    [
        {
            "contact_name": "Bob",
            "age": 27,
            "sex" : "male",
            "citizenship" : "Filipino"
        },
        {
            "contact_name": "Jack",
            "age": 22,
            "sex" : "male",
            "citizenship" : "Filipino"
        }
    ]
);
{
        "acknowledged" : true,
        "insertedIds" : [
                ObjectId("613594dbb59313217373673e"),
                ObjectId("613594dbb59313217373673f")
        ]
}

The userInfo collection was populated with some documents.

It is important to note that the name field in the address collection has the same values as the contact_name field in the userInfo collection.

Applying the $lookup function, we find an equality match as follows:

098db.userInfo.aggregate([
    { $lookup:
        {
           from: "address",
           localField: "contact_name",
           foreignField: "name",
           as: "address"
        }
    }
]).pretty();
Output:
{
        "_id" : ObjectId("613594dbb59313217373673e"),
        "contact_name" : "Bob",
        "age" : 27,
        "sex" : "male",
        "citizenship" : "Filipino",
        "address" : [
                {
                        "_id" : ObjectId("613594cdb59313217373673c"),
                        "name" : "Bob",
                        "blk_no" : 22,
                        "street" : "dewey street",
                        "city" : "United States of America"
                }
        ]
}
{
        "_id" : ObjectId("613594dbb59313217373673f"),
        "contact_name" : "Jack",
        "age" : 22,
        "sex" : "male",
        "citizenship" : "Filipino",
        "address" : [
                {
                        "_id" : ObjectId("613594cdb59313217373673d"),
                        "name" : "Jack",
                        "blk_no" : 25,
                        "street" : "gordon street",
                        "city" : "New Zealand"
                }
        ]
}

We matched the name field and the contact_name field of these two collections. Then, you can see the output where we joined the two collections. This is the simplest way to perform MongoDB join two collections from the same database.

Example 2

Create two sample collections, inventory and orders:

db.inventory.insertMany( [
   { prodId: 100, price: 20, quantity: 125 },
   { prodId: 101, price: 10, quantity: 234 },
   { prodId: 102, price: 15, quantity: 432 },
   { prodId: 103, price: 17, quantity: 320 }
] )

db.orders.insertMany( [
   { orderId: 201, custid: 301, prodId: 100, numPurchased: 20 },
   { orderId: 202, custid: 302, prodId: 101, numPurchased: 10 },
   { orderId: 203, custid: 303, prodId: 102, numPurchased: 5 },
   { orderId: 204, custid: 303, prodId: 103, numPurchased: 15 },
   { orderId: 205, custid: 303, prodId: 103, numPurchased: 20 },
   { orderId: 206, custid: 302, prodId: 102, numPurchased: 1 },
   { orderId: 207, custid: 302, prodId: 101, numPurchased: 5 },
   { orderId: 208, custid: 301, prodId: 100, numPurchased: 10 },
   { orderId: 209, custid: 303, prodId: 103, numPurchased: 30 }
] )

Creating a Joined View

We will use db.createView() to create a new view called sales, based on the orders collection:

db.createView( "sales", "orders", [
   {
      $lookup:
         {
            from: "inventory",
            localField: "prodId",
            foreignField: "prodId",
            as: "inventoryDocs"
         }
   },
   {
      $project:
         {
           _id: 0,
           prodId: 1,
           orderId: 1,
           numPurchased: 1,
           price: "$inventoryDocs.price"
         }
   },
      { $unwind: "$price" }
] )

In the provided example:

  • The $lookup stage utilizes the prodId field within the orders collection to connect documents in the inventory collection that share matching prodId fields.
  • The matched documents are incorporated as an array in the inventoryDocs field.
  • The $project stage picks specific fields from the available options.
  • The $unwind stage transforms the price field from an array into a single scalar value.

The resulting documents in the sales view are:

{ orderId: 201, prodId: 100, numPurchased: 20, price: 20 },
{ orderId: 202, prodId: 101, numPurchased: 10, price: 10 },
{ orderId: 203, prodId: 102, numPurchased: 5, price: 15 },
{ orderId: 204, prodId: 103, numPurchased: 15, price: 17 },
{ orderId: 205, prodId: 103, numPurchased: 20, price: 17 },
{ orderId: 206, prodId: 102, numPurchased: 1, price: 15 },
{ orderId: 207, prodId: 101, numPurchased: 5, price: 10 },
{ orderId: 208, prodId: 100, numPurchased: 10, price: 20 },
{ orderId: 209, prodId: 103, numPurchased: 30, price: 17 }

Querying the View

We can query the view to find out the total amount sold of each product.

db.sales.aggregate( [
   {
      $group:
         {
            _id: "$prodId",
            amountSold: { $sum: { $multiply: [ "$price", "$numPurchased" ] } }
         }
   }
] )

Output:

[
{ _id: 102, amountSold: 90 },
  { _id: 101, amountSold: 150 },
  { _id: 103, amountSold: 1105 },
  { _id: 100, amountSold: 600 }
]
Simplify MongoDB ETL & Analysis Using Hevo’s No-code Data Pipeline

Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates data pipelines that are flexible to your needs. With integration with 150+ Data Sources (40+ free sources), we help you not only export data from sources such as MongoDB & load data to the destinations but also transform & enrich your data, & make it analysis-ready.

Get Started with Hevo for Free

How MongoDB Joins Two Collections With Specific Conditions using Pipeline & $Lookup?

$lookup is used with a pipeline to perform MongoDB join two collections based on specific conditions requiring expressions.

For instance, if we want to get all users with orders greater than five

User.aggregate([{
$lookup: {
 from: 'Orders',
 let: {ID: '$ID'},
 pipeline: [{$match: {
  $expr: {
   $eq: [
    '$User_ID', '$$ID'
   ],
   $gt: [
    '$quantity', 5
   ]
  }}}]
  as: 'OrdersData'
}}]);

We can quickly join tables using pipelines within a $lookup.

How Does Mongodb Join Two Collections With The Where Clause?

MongoDB’s aggregate() method joins two collections with a where clause. By doing this, we can correlate two subqueries.

The pipeline operator is also used to apply the condition. Let’s look at the following example:

Example

In this example, we create two collections, and then we apply the where clause to join them using the $lookup aggregation.

MongoDB Join two collections: Lookup Aggregation

Two collections, orders, and warehouses are created, and a few documents are stored. We now combine both collections using an aggregation operation.

As you can see in the where clause condition, we use a pipeline function, assign the stock_item equal to order_item, and then add stock_data if the amount in stock is more significant than the order_qty.

This is the simplest way to apply where clauses to perform MongoDB Join two collections. However, you can learn other ways to do this by practicing more.

Using Compass To Join Two Collections In MongoDB

The MongoDB Compass GUI allows you to query, aggregate, and analyze your MongoDB data using a visual environment. 

The $lookup aggregation function performs MongoDB Join two collections for the same database.

Example

In this example, we’ll learn how to combine two database collections using $lookup aggregation.

In Compass, you can apply the aggregation operation by following these steps:

  • Open MongoDB Compass.
  • If desired, create a new collection and database.
  • We created two collections for genera and movies in the test database, and we also inserted some documents into both collections.
    • We now merge the two collections using the aggregation function.
    • On the Documents page, click Aggregation.

There’s a “pipeline stage” after you select Aggregation where there’s a dropdown list and where you have to choose which operation to perform; we choose the $lookup function.

Some fields in the collection need to be provided in this function. First, you write the collection name that you would like to join (movies) into the ‘from’ field, and write the fields of the local collection (genre list) and the fields of the “form collection” (type) in localField and foreignField, respectively, and in as write the output array field (movies).

You can see the output on the right side of the input after Aggregation.

This is the simple way to apply the $lookup aggregation using MongoDB compass. With more practice, you will better understand the aggregation operations and better grasp the compass environment.

Conclusion

One of the fundamental differences between SQL and NoSQL databases is the JOIN operator. The JOIN operation can be performed in SQL databases between two tables inside the same database or different. On the other hand, JOIN operations in MongoDB can be performed between collections implemented in a similar database, although the case is varied.

This guide introduced you to a few of the most essential operations for performing MongoDB Join Two Collections with examples. Read through these essentials for a deeper insight into the Join functionality:

Visit our Website to Explore Hevo

Hevo Data, a No-code Data Pipeline provides you with a consistent and reliable solution to manage data transfer between a variety of sources like MongoDB and a wide variety of Desired Destinations with a few clicks.

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable Hevo Pricing that will help you choose the right plan for your business needs.

Share your experience of learning how to perform MongoDB Join Two Collections in the comments below!

References

Samuel Salimon
Freelance Technical Content Writer, Hevo Data

Samuel specializes in freelance writing within the data industry, adeptly crafting informative and engaging content centered on data science by merging his problem-solving skills.

No Code Data Pipeline For MongoDB

Get Started with Hevo