In relational database management systems (RDBMS), join operations are commonly employed for cross-table data analysis. However, MongoDB, as a NoSQL database, diverges from the norm by adopting a denormalized data model. Here, related data is stored together in a single document, contrasting with the traditional approach of spreading it across multiple tables or collections.
But MongoDB join does provide a join functionality between collections where needed, through its rich $lookup
aggregation pipeline stage.
In this article, we will walk through MongoDB join two collections using practical examples and explore how to filter joined data using $lookup
features like the where clause and the compass operator. Let’s get started!
What is MongoDB?
MongoDB is a non-relational Database Management System that is Open Source and Cross-Platform. MongoDB, which was founded in 2009, employs the Document-Oriented Database Model to organize data into documents and collections rather than tables.
This enables it to store various types of data. Furthermore, it is distributed under the Server Side Public License (SSPL), which supports a novel mechanism for storing and retrieving large amounts of data. MongoDB is a straightforward and simple-to-configure database that provides high performance, automatic scalability, and high availability.
Hevo is the ideal data pipeline solution for integrating MongoDB as a source, enabling seamless data extraction, transformation, and loading. This ensures smooth data flow and real-time updates, optimizing your analytics and data management processes.
Let’s see some unbeatable features of Hevo Data:
- Fully Managed: Hevo Data is a fully managed service and is straightforward to set up.
- Schema Management: Hevo Data automatically maps the source schema to perform analysis without worrying about the changing schema.
- Real-Time: Hevo Data works on the batch as well as real-time data transfer so that your data is analysis-ready always.
- Live Support: With 24/5 support, Hevo provides customer-centric solutions to the business use case.
Get Started with Hevo for Free
How To Join Two Collections In MongoDB?
MongoDB has some issues with linking data from one collection to another unless you use simple Script Query functions. To resolve this problem, we introduce the MongoDB JOIN concept, which facilitates the relationship between the data.
We can join documents on collections in MongoDB join by using the $lookup
(Aggregation) function. $lookup
(Aggregation) creates an outer left join with another collection and helps to filter data from merged data. If documents are part of a “joined” collection, the <strong>$lookup</strong>
(Aggregation) function will return documents as a subarray of the original collection.
Syntax
In MongoDB’s JOIN operation, the goal is to connect one set of data to another set of data. To use MongoDB query join two collections, we use the $lookup
operator, whose syntax is defined below:
{
$lookup:
{
from: <collection to join>,
localField: <field from the input documents>,
foreignField: <field from the documents of the "from" collection>,
as: <output array field>
}
}
from
: Specifies the collection to join with. (Replace <collection to join>
with the actual collection name.)
localField
: The field from the input documents that will be matched with the foreignField
. (Replace <field from the input documents>
with the actual field name.)
foreignField
: The field from the documents in the “from” collection that will be matched with the localField
. (Replace <field from the documents of the "from" collection>
with the actual field name.)
as
: The name of the output array field that will contain the matching documents from the “from” collection. (Replace <output array field>
with the desired output field name.)
The $lookup function accepts a document containing these fields:
- From: It describes the collection within an identical database that should be used for executing the join, but with sharded collection restrictions.
- LocalField: This field represents the input field from the documents to the stage of
$lookup
, which performs an equivalence match between the local-field and the foreign-field from the collection ‘from.’ If an input document does not have a value for a local-field, then this operator will give the field a null value for matching purposes.
- foreignField: This field contains data from the documents in the ‘from’ collection with which an equivalence match can be made between the foreignField and the localField. When a document in the collection ‘from’ does not have a foreignField value, this operator will set the field to null for further matching purposes.
- As: It specifies the name of the array field that needs to be added to the input documents. More so, a new array field also includes matching documents from the collection ‘from.’ The prevailing field will be overwritten if the stated name already exists in the input document.
In SQL, when a JOIN operation is performed, the result is always a new row that combines all the fields present from the parent and foreign tables. However, in MongoDB, the output documents are supplemented by an array of native collection documents.
Steps for Joining Two Collections in MongoDB
For performing MongoDB Join two collections, you must use the $lookup
operator. It is defined as a stage that executes a left outer join with another collection and aids in filtering data from joined documents.
For example, if a user requires all grades from all students, then the below query can be written:
Students.aggregate([{
$lookup: {
From: ‘Grades’,
LocalField: ‘Student_id’,
foreignField: ‘Stud_id’,
as: ‘Student_grade’
}}]);
- Purpose: Join
Students
collection with Grades
collection to include students’ grades.
- Operation: Uses
$lookup
for a left outer join.
- Fields:
from
: Specifies the Grades
collection to join.
localField
: Matches Student_id
from Students
.
foreignField
: Matches Stud_id
from Grades
.
as
: Stores matching grades in Student_grade
array.
From the above query, we can see that the initial parameter named ‘from’ specifies the collection which is to be joined with the present one, and the ‘localField’ specifies the key in the current collection, which is to be matched with the foreign key in another collection by the ‘foreignField’ parameter. For retrieving the data, Student_id from Students and stud_id from Grades are matched and joined. In addition, the end parameter ‘as’ is added as a code-named to the data.
It is worth noting that here, within every record of the output data from the query above, the data from the joined table also appears as the primary element within an array. Similarly, every row from the output data from the query above will display as follows:
{_ID: String, Quantity: Number, Student_id: String, Student_grade: Array, Stud_id: String, Student_name: String}
The data that was joined by the query will be the first element in the grades array.
Load Data from MongoDB to any Data Warehouse
No credit card required
Example 1
The following example creates a sample dataset and inserts some documents to perform MongoDB Join two collections:
db.address.insertMany(
[
{
"name": "Bob",
"blk_no": 22,
"street" : "dewey street",
"city" : "United States of America"
},
{
"name": "Jack",
"blk_no": 25,
"street" : "gordon street",
"city" : "New Zealand"
}
]
);
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("613594cdb59313217373673c"),
ObjectId("613594cdb59313217373673d")
]
}
- Purpose: Insert multiple documents into the
address
collection.
- Documents Inserted:
- First Document:
name
: Bob
blk_no
: 22
street
: dewey street
city
: United States of America
- Second Document:
name
: Jack
blk_no
: 25
street
: gordon street
city
: New Zealand
- Result:
acknowledged
: true (indicates the operation was successful)
insertedIds
: Array of ObjectIds for the inserted documents:
- ObjectId(“613594cdb59313217373673c”)
- ObjectId(“613594cdb59313217373673d”)
A few documents were inserted into the address collection. Now let’s add a few documents to another collection:
db.userInfo.insertMany(
[
{
"contact_name": "Bob",
"age": 27,
"sex" : "male",
"citizenship" : "Filipino"
},
{
"contact_name": "Jack",
"age": 22,
"sex" : "male",
"citizenship" : "Filipino"
}
]
);
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("613594dbb59313217373673e"),
ObjectId("613594dbb59313217373673f")
]
}
The userInfo collection was populated with some documents.
It is important to note that the name field in the address collection has the same values as the contact_name field in the userInfo collection.
Applying the $lookup
function, we find an equality match as follows:
098db.userInfo.aggregate([
{ $lookup:
{
from: "address",
localField: "contact_name",
foreignField: "name",
as: "address"
}
}
]).pretty();
Output:
{
"_id" : ObjectId("613594dbb59313217373673e"),
"contact_name" : "Bob",
"age" : 27,
"sex" : "male",
"citizenship" : "Filipino",
"address" : [
{
"_id" : ObjectId("613594cdb59313217373673c"),
"name" : "Bob",
"blk_no" : 22,
"street" : "dewey street",
"city" : "United States of America"
}
]
}
{
"_id" : ObjectId("613594dbb59313217373673f"),
"contact_name" : "Jack",
"age" : 22,
"sex" : "male",
"citizenship" : "Filipino",
"address" : [
{
"_id" : ObjectId("613594cdb59313217373673d"),
"name" : "Jack",
"blk_no" : 25,
"street" : "gordon street",
"city" : "New Zealand"
}
]
}
We matched the name field and the contact_name field of these two collections. Then, you can see the output where we joined the two collections. This is the simplest way to perform MongoDB join two collections from the same database.
Example 2
Create two sample collections, inventory and orders:
db.inventory.insertMany( [
{ prodId: 100, price: 20, quantity: 125 },
{ prodId: 101, price: 10, quantity: 234 },
{ prodId: 102, price: 15, quantity: 432 },
{ prodId: 103, price: 17, quantity: 320 }
] )
db.orders.insertMany( [
{ orderId: 201, custid: 301, prodId: 100, numPurchased: 20 },
{ orderId: 202, custid: 302, prodId: 101, numPurchased: 10 },
{ orderId: 203, custid: 303, prodId: 102, numPurchased: 5 },
{ orderId: 204, custid: 303, prodId: 103, numPurchased: 15 },
{ orderId: 205, custid: 303, prodId: 103, numPurchased: 20 },
{ orderId: 206, custid: 302, prodId: 102, numPurchased: 1 },
{ orderId: 207, custid: 302, prodId: 101, numPurchased: 5 },
{ orderId: 208, custid: 301, prodId: 100, numPurchased: 10 },
{ orderId: 209, custid: 303, prodId: 103, numPurchased: 30 }
] )
Creating a Joined View
We will use db.createView()
to create a new view called sales, based on the orders collection:
db.createView( "sales", "orders", [
{
$lookup:
{
from: "inventory",
localField: "prodId",
foreignField: "prodId",
as: "inventoryDocs"
}
},
{
$project:
{
_id: 0,
prodId: 1,
orderId: 1,
numPurchased: 1,
price: "$inventoryDocs.price"
}
},
{ $unwind: "$price" }
] )
In the provided example:
- The
$lookup
stage utilizes the prodId field within the orders collection to connect documents in the inventory collection that share matching prodId fields.
- The matched documents are incorporated as an array in the inventoryDocs field.
- The
$project
stage picks specific fields from the available options.
- The
$unwind
stage transforms the price field from an array into a single scalar value.
The resulting documents in the sales view are:
{ orderId: 201, prodId: 100, numPurchased: 20, price: 20 },
{ orderId: 202, prodId: 101, numPurchased: 10, price: 10 },
{ orderId: 203, prodId: 102, numPurchased: 5, price: 15 },
{ orderId: 204, prodId: 103, numPurchased: 15, price: 17 },
{ orderId: 205, prodId: 103, numPurchased: 20, price: 17 },
{ orderId: 206, prodId: 102, numPurchased: 1, price: 15 },
{ orderId: 207, prodId: 101, numPurchased: 5, price: 10 },
{ orderId: 208, prodId: 100, numPurchased: 10, price: 20 },
{ orderId: 209, prodId: 103, numPurchased: 30, price: 17 }
Querying the View
We can query the view to find out the total amount sold of each product.
db.sales.aggregate( [
{
$group:
{
_id: "$prodId",
amountSold: { $sum: { $multiply: [ "$price", "$numPurchased" ] } }
}
}
] )
Output:
[
{ _id: 102, amountSold: 90 },
{ _id: 101, amountSold: 150 },
{ _id: 103, amountSold: 1105 },
{ _id: 100, amountSold: 600 }
]
Extract and Load from MongoDB to Snowflake
Extract and Load from MongoDB to BigQuery
Extract and Load from MongoDB to Databricks
Extract and Load from MongoDB to Redshift
How MongoDB Joins Two Collections With Specific Conditions using Pipeline & $Lookup?
$lookup is used with a pipeline to perform MongoDB join two collections based on specific conditions requiring expressions.
For instance, if we want to get all users with orders greater than five
User.aggregate([{
$lookup: {
from: 'Orders',
let: {ID: '$ID'},
pipeline: [{$match: {
$expr: {
$eq: [
'$User_ID', '$$ID'
],
$gt: [
'$quantity', 5
]
}}}]
as: 'OrdersData'
}}]);
We can quickly use MongoDB join tables using pipelines within a $lookup.
How Does Mongodb Join Two Collections With The Where Clause?
MongoDB’s aggregate()
method joins two collections with a where clause. By doing this, we can correlate two subqueries.
The pipeline operator is also used to apply the condition. Let’s look at the following example:
Example
In this example, we create two collections, and then we apply the where clause to join them using the $lookup
aggregation.
Two collections, orders, and warehouses are created, and a few documents are stored. We now combine both collections using an aggregation operation.
As you can see in the where clause condition, we use a pipeline function, assign the stock_item equal to order_item, and then add stock_data if the amount in stock is more significant than the order_qty.
This is the simplest way to apply where clauses to perform MongoDB Join two collections. However, you can learn other ways to do this by practicing more.
Using Compass To Join Two Collections In MongoDB
The MongoDB Compass GUI allows you to query, aggregate, and analyze your MongoDB data using a visual environment.
The $lookup
aggregation function performs MongoDB Join two collections for the same database.
Example
In this example, we’ll learn how to combine two database collections using $lookup
aggregation.
In Compass, you can apply the aggregation operation by following these steps:
- If desired, create a new collection and database.
- We created two collections for genera and movies in the test database, and we also inserted some documents into both collections.
- We now merge the two collections using the aggregation function.
- On the Documents page, click Aggregation.
There’s a “pipeline stage” after you select Aggregation where there’s a dropdown list and where you have to choose which operation to perform; we choose the $lookup
function.
Some fields in the collection need to be provided in this function. First, you write the collection name that you would like to join (movies) into the ‘from’ field, and write the fields of the local collection (genre list) and the fields of the “form collection” (type) in localField and foreignField, respectively, and in as write the output array field (movies).
You can see the output on the right side of the input after Aggregation.
This is the simple way to apply the $lookup
aggregation using MongoDB compass. With more practice, you will better understand the aggregation pipeline and operations and better grasp the compass environment.
Conclusion
MongoDB Join two collections is a powerful feature that enhances data retrieval and analysis. By leveraging techniques like the $lookup
stage in aggregation pipelines or Compass for visual joins, MongoDB enables seamless data integration. Whether it’s applying specific conditions, using a where
clause, or exploring advanced queries, MongoDB offers flexibility to meet diverse requirements.
For a no-code solution to integrate MongoDB with other data sources effortlessly, consider using Hevo. Want to take Hevo for a spin? SIGN UP and experience the feature-rich Hevo suite firsthand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.
FAQ on How To Join Two Collections In MongoDB
Can you join two collections in MongoDB?
Yes, joining two collections is possible in MongoDB. It involves combining data from multiple collections into one result set.
How to get data from one collection to another collection in MongoDB?
To get data from one collection to another collection in MongoDB, you can use the $merge operator. It can output to a collection in the same or different database.
How do I join multiple collections with $lookup MongoDB?
To join multiple collections with $lookup, you can apply $lookup before $group or after $group.
Samuel is a versatile writer specializing in the data industry. With over seven years of experience, he excels in data science, data integration, and data analysis, crafting engaging content on these topics. He is also adept at WordPress development. Samuel holds a Bachelor's degree in Computer Science from Lagos State University.