How to Set Up a NodeJS MongoDB Aggregation Pipeline?

As a MongoDB developer, you might have used map-reduce to calculate aggregated values. While map-reduce is a useful tool, it can be sluggish when dealing with large amounts of data. This is where MongoDB's Aggregation framework comes in.

The MongoDB Aggregation framework is used for querying your Database using multi-stage pipelines. These pipelines filter, organize, sort, transform, or execute functions on data by sending documents through several stages. The aggregated findings may then be used for analysis, visualization, or reporting.

In this post, you will learn how to set up a NodeJS MongoDB Aggregation Pipeline using the MongoDB Atlas Aggregation Pipeline Builder. Before moving to that, you will learn what the MongoDB Aggregation framework is and which types of Aggregation Stages are available in a MongoDB Aggregation Pipeline. You will also get familiar with the key features of NodeJS and MongoDB. At the end of this article, you will explore some of the limitations of MongoDB Aggregation Pipelines. So, let's get started.

What is NodeJS?

NodeJS (also written Node.js) is an open-source, cross-platform runtime environment that uses the V8 engine to run JavaScript code outside of a web browser. It's commonly used to build scalable network applications and web servers. NodeJS relies on asynchronous, non-blocking I/O: it starts an operation such as a network request or a file read and keeps executing other code instead of waiting for the result.

Key Features of NodeJS

NodeJS provides a plethora of robust features. Let’s have a look at these in detail below:

Asynchronous & Event-Driven: Most APIs in the NodeJS library are asynchronous (non-blocking), and their results are delivered through callbacks, promises, or events.
Highly Scalable: NodeJS can handle many concurrent requests efficiently.
Cross-Platform Compatibility: It is compatible with a wide range of operating systems, including Windows, Linux, Unix, and macOS.
Single-Threaded: JavaScript code runs on a single thread driven by an event loop, which avoids the overhead of creating a thread for every request.
Faster Code Execution: NodeJS is built on the V8 JavaScript engine, which is also used by Google Chrome.

What is MongoDB?

MongoDB is a document-oriented NoSQL Database designed to store and analyze massive volumes of data. It stores data as collections and documents rather than tables with rows and columns, unlike typical relational databases. The Collections are made up of many documents, each of which contains key-value pairs, which are the fundamental units of data.

MongoDB was initially released in February 2009. MongoDB, Inc. develops and maintains it under the Server Side Public License (SSPL).

Key Features of MongoDB

MongoDB has many features that distinguish it from other Databases. The following are some of the most important features:

High Data Availability & Durability: MongoDB's replication keeps copies of your data on multiple servers, so the data stays available and can be recovered if one server fails.
Horizontal Scalability: It supports horizontal scalability with the help of sharding.
Efficient Concurrent Management: It can successfully manage numerous concurrent read & write requests for the same data thanks to its superior concurrency controls and locking protocols.
Quick Indexing: It delivers excellent speed for every query thanks to a wide range of indices and features, including language-specific sort orders that facilitate complicated access patterns to datasets.
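Accelerated Ad-hoc Queries: MongoDB supports ad-hoc queries, such as searches by field, by range, or by regular expression, and these queries can use indexes to stay fast even when they have to evaluate millions of documents.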

Hevo is the ideal data pipeline solution for integrating MongoDB as a source, enabling seamless data extraction, transformation, and loading. This ensures smooth data flow and real-time updates, optimizing your analytics and data management processes.

Let’s see some unbeatable features of Hevo Data:

Fully Managed: Hevo Data is a fully managed service and is straightforward to set up.
Schema Management: Hevo Data automatically maps the source schema to perform analysis without worrying about the changing schema.
Real-Time: Hevo Data supports both batch and real-time data transfer so that your data is always analysis-ready.
Live Support: With 24/5 support, Hevo provides customer-centric solutions to the business use case.

Introduction to the MongoDB Aggregation Framework

In MongoDB, aggregation operations process many documents and return reduced, aggregated results. The Aggregation Pipeline in MongoDB's Query API lets you build a pipeline with one or more stages, each of which performs a different action on your data.

The Aggregation framework follows reasoning similar to the SQL "GROUP BY" clause. Pipelines and expressions are its two primary components. A pipeline is an ordered sequence of stages that can process a large number of documents in a short period. Expressions are evaluated against the input documents of a stage to compute the values that appear in its output documents. You can study your data in real time using the MongoDB Aggregation framework. Read along to learn how you can set up a NodeJS MongoDB Aggregation Pipeline using MongoDB Atlas.
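To make the GROUP BY comparison concrete, here is a minimal sketch of a two-stage pipeline in NodeJS. The orders collection and its customerId, status, and amount fields are hypothetical and only serve as an illustration. The function expects a Collection object obtained from the official mongodb driver.

```javascript
// Assumed (hypothetical) document shape: { customerId, status, amount }.
// Roughly comparable SQL:
//   SELECT customerId, SUM(amount) AS total, COUNT(*) AS orders
//   FROM orders WHERE status = 'paid' GROUP BY customerId;
const totalsPerCustomer = [
  // Stage 1: keep only paid orders (similar to WHERE).
  { $match: { status: 'paid' } },
  // Stage 2: group by customer and compute two accumulators (similar to GROUP BY).
  {
    $group: {
      _id: '$customerId',
      total: { $sum: '$amount' },
      orders: { $sum: 1 },
    },
  },
];

// `collection` is a Collection from the `mongodb` driver,
// for example client.db('shop').collection('orders') (hypothetical names).
async function getTotalsPerCustomer(collection) {
  return collection.aggregate(totalsPerCustomer).toArray();
}
```

Keeping the pipeline as a plain array, separate from the function that runs it, makes it easy to paste the same stages into the Atlas Aggregation Pipeline Builder later.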

Types of MongoDB Aggregation Stages

You can use the framework to build a MongoDB Aggregation Pipeline with one or more stages. The stages in a pipeline work as follows:

Each stage performs an operation on its input documents. For instance, a stage can filter, group, or calculate values in documents.
The output documents from one stage are passed as input to the next stage.
An Aggregation Pipeline can return results for groups of documents.

Many types of Aggregation Stages can be used in an aggregation pipeline. Some of these have been listed below:

$match: It filters the input documents according to the specified conditions.
$project: It reshapes each document, passing along a subset of the input fields or newly computed fields.
$geoNear: It outputs documents in the order of their proximity to a specific location.
$group: It groups documents by a specified key and computes aggregated values, such as sums or counts, for each group.
$limit: It passes only the first n documents from the input set to the next stage.
$skip: It ignores the first n documents from the input set.
$sort: It sorts all the input documents according to the specified sort keys.
$redact: It restricts the contents of documents based on information stored in the documents themselves.
$unwind: It deconstructs an array field with n elements and returns n documents, each holding one array element in place of the array.
$out: It takes all the documents returned from the prior stage and writes them to a specified collection. It must be the last stage in the pipeline.
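Several of the stages above are often chained together. The following sketch builds a pipeline that ranks products by units sold and returns one page of results. The document shape (a status field and an items array with sku and qty) and the function name are assumptions made for this example.

```javascript
// Builds a pipeline that ranks products by units sold, one page at a time.
// Assumed (hypothetical) document shape: { status, items: [{ sku, qty }, ...] }.
function buildTopProductsPipeline({ page = 1, pageSize = 10 } = {}) {
  return [
    // Keep only paid orders.
    { $match: { status: 'paid' } },
    // Emit one document per element of the items array.
    { $unwind: '$items' },
    // Sum the quantity sold for each SKU.
    { $group: { _id: '$items.sku', unitsSold: { $sum: '$items.qty' } } },
    // Best sellers first; _id breaks ties so paging stays stable.
    { $sort: { unitsSold: -1, _id: 1 } },
    // Pagination: skip the earlier pages, then take one page.
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize },
    // Rename _id to sku in the output.
    { $project: { _id: 0, sku: '$_id', unitsSold: 1 } },
  ];
}
```

The stage order matters here: $match runs first so that later stages see fewer documents, and $sort comes before $skip and $limit so that each page is taken from the ranked list.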

Steps to Set Up NodeJS MongoDB Aggregation Pipeline

Before you get started with the NodeJS MongoDB Aggregation Pipeline, make sure you have been granted at least the following roles:

Project Data Access Read-Only Role: To create and execute aggregation pipelines in the Data Explorer.
Project Data Access Read/Write Role: To use the $out stage in your pipeline.

Follow the steps below to set up the NodeJS MongoDB Aggregation Pipeline:

Step 1: Set up the Aggregation Pipeline Builder
Step 2: Create a new NodeJS MongoDB Aggregation Pipeline
Step 3: Export the NodeJS MongoDB Aggregation Pipeline to NodeJS

Step 1: Set up the Aggregation Pipeline Builder

MongoDB Atlas’ built-in Data Explorer allows you to explore and interact with your data by creating Aggregation Pipelines. To access your Data Explorer you can navigate to Databases → Browse Collections. If you are currently viewing a Database Deployment, you can simply click the Collections tab on the top.

Now follow the steps below to set up the Aggregation view in the Aggregation Pipeline Builder:

Firstly, select the Database for the Collection you want to work on as shown below.

NodeJS MongoDB Aggregation - Select Database

Now, on the left-hand side, you can see the Collections list. Select the desired Collection and then select the Aggregation view. A window similar to the one shown below will be displayed.

NodeJS MongoDB Aggregation - Create Aggregation View

Step 2: Create a new NodeJS MongoDB Aggregation Pipeline

After setting up the Aggregation view, follow the steps below to create the new NodeJS MongoDB Aggregation Pipeline.

Navigate to the bottom-left of the panel and click on the Select dropdown list as shown below.

NodeJS MongoDB Aggregation - Select Aggregation Stage

Now, choose one of the Aggregation Stages to work with as shown above.
Next, you can modify your Aggregation Stage as shown below. If Comment Mode is turned on, the builder adds comments with syntax guidance for the selected stage.

NodeJS MongoDB Aggregation - Aggregation Pipeline Result

To add more Aggregation Stages in your NodeJS MongoDB Aggregation Pipeline, you can either click on the ‘+‘ button or click the Add Stage button as shown below. The former adds the stage after the current stage whereas the latter adds the stage at the end of the Aggregation Pipeline.

NodeJS MongoDB Aggregation - Add New Stage

If you want to apply language-specific rules for string comparison, you can set them using the Collation button at the top of the Aggregation view.
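When the same pipeline runs from NodeJS rather than from the builder, a collation can be passed as an option to the driver's aggregate method. The sketch below assumes a hypothetical customers collection with a name field. The collation shown (English locale, strength 2) compares strings without regard to case.

```javascript
// Collation document: English rules, strength 2 ignores letter case.
const caseInsensitive = { locale: 'en', strength: 2 };

// `collection` is a Collection from the `mongodb` driver (hypothetical "customers").
async function findCustomersByName(collection, name) {
  const pipeline = [
    { $match: { name } },   // with this collation, 'alice' also matches 'Alice'
    { $sort: { name: 1 } }, // sorting follows the same collation rules
  ];
  return collection.aggregate(pipeline, { collation: caseInsensitive }).toArray();
}
```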

Step 3: Export the NodeJS MongoDB Aggregation Pipeline to NodeJS

After adding all the Aggregation Stages, you can export the Aggregation Pipeline to NodeJS or to another supported language such as Python 3, Java, or C#. Follow the steps below to export your Aggregation Pipeline to NodeJS:

Navigate to the top of the Atlas Aggregation Pipeline Builder and click on Export to Language.
Now, from the dropdown list, select Node, as shown below.

NodeJS MongoDB Aggregation - Export Aggregation Pipeline to NodeJS

To include the comments, you can tick the checkbox at the bottom.
Next, click on the Copy button at the top-right of your pane to copy the NodeJS MongoDB Aggregation Pipeline to the clipboard. Now, you can paste this code into your NodeJS file.

Great Work! You have successfully created a NodeJS MongoDB Aggregation Pipeline and exported the Pipeline to your NodeJS file using the MongoDB Atlas Aggregation Pipeline Builder.
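Once the pipeline is in your clipboard, it still has to be wired to a connection. The following is a minimal sketch of what such a NodeJS file might look like, not the exact code the builder generates. The stages, the shop database, the orders collection, and the MONGODB_URI environment variable are all assumptions for illustration. It requires the official driver (npm install mongodb).

```javascript
// Pipeline as it might be pasted from "Export to Language" (stages are hypothetical).
const pipeline = [
  { $match: { status: 'paid' } },
  { $sort: { amount: -1 } },
  { $limit: 10 },
];

async function main() {
  // Loaded here so the driver is only needed when the script actually connects.
  const { MongoClient } = require('mongodb');
  // MONGODB_URI holds your Atlas connection string (assumed environment variable).
  const client = new MongoClient(process.env.MONGODB_URI);
  try {
    await client.connect();
    const collection = client.db('shop').collection('orders'); // hypothetical names
    // aggregate() returns a cursor; iterate it to stream the results.
    for await (const doc of collection.aggregate(pipeline)) {
      console.log(doc);
    }
  } finally {
    // Always release the connection, even if the query fails.
    await client.close();
  }
}

// Runs only when a connection string has been provided.
if (process.env.MONGODB_URI) {
  main().catch(console.error);
}
```

Iterating the cursor with for await avoids loading every result into memory at once, which can matter when a pipeline returns many documents.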

Limitations of MongoDB Aggregation Pipelines

The aggregate command is subject to the following constraints when you run aggregation operations in a NodeJS MongoDB Aggregation Pipeline.

Result Size Restrictions: The results of the aggregate command can be returned as a cursor or stored in a collection. The BSON Document Size limit of 16 megabytes applies to each document in the result set, and the command returns an error if any single document exceeds it. The limit only applies to the returned documents; during pipeline processing, documents may exceed this size.
Stage Restrictions: Starting in MongoDB 5.0, the number of stages permitted in a single Aggregation Pipeline is limited to 1000.
Memory Restrictions: Each pipeline stage can use up to 100 megabytes of RAM. If a stage exceeds this limit and is not allowed to write temporary files to disk (for example, through the allowDiskUse option), MongoDB generates an error.

For the full list of limits, refer to the Aggregation Pipeline Limits page of the MongoDB Manual.
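As an illustration of working within these limits from NodeJS, the sketch below checks the stage count before running a pipeline and passes the allowDiskUse option so that memory-heavy stages may write temporary files to disk. The function name and the constant are made up for this example. Whether the option is honored can depend on your deployment, so treat this as a starting point rather than a guarantee.

```javascript
// Stage limit described above (MongoDB 5.0 and later).
const MAX_PIPELINE_STAGES = 1000;

// `collection` is a Collection from the `mongodb` driver.
// Returns the aggregation cursor so the caller decides how to consume it.
function aggregateWithDiskUse(collection, pipeline) {
  if (pipeline.length > MAX_PIPELINE_STAGES) {
    throw new RangeError(
      `Pipeline has ${pipeline.length} stages; the limit is ${MAX_PIPELINE_STAGES}`
    );
  }
  // allowDiskUse lets stages that exceed the memory limit use temporary files.
  return collection.aggregate(pipeline, { allowDiskUse: true });
}
```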

Conclusion

To summarize, in this article, you gained a basic understanding of how to create a NodeJS MongoDB Aggregation Pipeline. You learned about the different types of MongoDB Aggregation Stages and also explored some of the limitations associated with the MongoDB Aggregation Pipeline.

However, as a Developer, extracting complex data from a diverse set of data sources to your MongoDB Database or creating NodeJS MongoDB Aggregation pipelines by integrating with other sources can seem to be quite challenging. This is where a simpler alternative like Hevo can save your day! Hevo Data is a No-Code Data Pipeline that offers a faster way to move data from 150+ Data Sources, such as MongoDB and other 60+ Free Sources, into your Data Warehouse.

Want to take Hevo for a spin? SIGN UP and experience the feature-rich Hevo suite firsthand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

FAQ on NodeJS MongoDB Aggregation Pipeline

How to use aggregation in MongoDB in Nodejs?

To use aggregation in MongoDB with Node.js:
– Use the aggregate method of the MongoDB Node.js driver.
– Construct an aggregation pipeline comprising stages like $match, $group, $project, etc., to perform complex data transformations and calculations.

Is MongoDB good for aggregation?

Yes, MongoDB is well-suited for aggregation tasks due to its powerful aggregation framework, which allows for flexible and efficient processing of data.

What is the function of node aggregation?

The purpose of an Aggregation node is to apply aggregate functions to measures based on one or several attributes.

How many types of aggregation are there in MongoDB?

In MongoDB, there are three main types of aggregation operations:
– Aggregation Pipeline
– Map-Reduce (deprecated as of MongoDB 5.0 in favor of the Aggregation Pipeline)
– Single-Purpose Aggregation Methods
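For simple counts and distinct values, the single-purpose methods can be shorter than a full pipeline. The sketch below calls three such methods of the NodeJS driver on a hypothetical orders collection with a status field. The function name is an assumption for this example.

```javascript
// `collection` is a Collection from the `mongodb` driver (hypothetical "orders").
async function summarizeOrders(collection) {
  const [approxTotal, paidCount, statuses] = await Promise.all([
    // Fast estimate of the total document count, based on collection metadata.
    collection.estimatedDocumentCount(),
    // Exact count of the documents that match a filter.
    collection.countDocuments({ status: 'paid' }),
    // Distinct values of a single field.
    collection.distinct('status'),
  ]);
  return { approxTotal, paidCount, statuses };
}
```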

Shubhnoor Gill Research Analyst, Hevo Data

Shubhnoor is a data analyst with a proven track record of translating data insights into actionable marketing strategies. She leverages her expertise in market research and product development, honed through experience across diverse industries and at Hevo Data. Currently pursuing a Master of Management in Artificial Intelligence, Shubhnoor is a dedicated learner who stays at the forefront of data-driven marketing trends. Her data-backed content empowers readers to make informed decisions and achieve real-world results.