Build a NodeJS MongoDB Aggregation Pipeline in 3 Easy Steps

February 24th, 2022

As a MongoDB Developer, you might have used map-reduce to calculate aggregated values. While map-reduce is a useful tool, it can be sluggish when dealing with large amounts of data. This is where MongoDB’s powerful Aggregation framework comes in.

The MongoDB Aggregation framework is used for querying your Database using multi-stage pipelines. These pipelines filter, organize, sort, transform, or execute functions on data by sending documents through several stages. The aggregated findings may then be used for analysis, visualization, or reporting.

In this post, you will learn how to set up a NodeJS MongoDB Aggregation Pipeline using the MongoDB Atlas Aggregation Builder tool. Before moving to that, you will understand what the MongoDB Aggregation framework is and which types of Aggregation Stages are available in the MongoDB Aggregation Pipeline. Moreover, you will get familiar with the key features offered by NodeJS and MongoDB. At the end of this article, you will explore some of the limitations associated with MongoDB Aggregation Pipelines. So, let’s get started.

What is NodeJS?

NodeJS, or Node.js, is an open-source, cross-platform JavaScript runtime environment that uses the V8 engine to run JavaScript code outside of a web browser. It’s commonly used to build scalable network applications and web servers. NodeJS relies on asynchronous, non-blocking I/O, meaning it can start an operation such as a network request and continue with other work until the result arrives.

The event-driven runtime in NodeJS supports all types of HTTP requests. NodeJS follows the “JavaScript Everywhere” paradigm, which unifies server-side and client-side development around a single programming language. Developers can use JavaScript in server-side scripts to produce dynamic web content before it is sent to the user’s browser.

Key Features of NodeJS

NodeJS provides a plethora of robust features. Let’s have a look at these in detail below:

  • Asynchronous & Event-Driven: The APIs of the NodeJS library are asynchronous, that is, non-blocking. A server built with NodeJS does not wait for an API call to return data. It moves on to the next call and uses a notification mechanism called Events to receive the response to the earlier call once it is ready.
  • Highly Scalable: NodeJS handles concurrent requests efficiently. Its cluster module distributes incoming connections across worker processes so that all available CPU cores can be used. The ability of NodeJS to horizontally split applications lets businesses provide several app versions to different target groups.
  • Cross-Platform Compatibility: It is compatible with a wide range of operating systems, including Windows, Unix, Linux, Mac OS X, and mobile devices. Combined with a suitable packaging tool, a NodeJS application can be bundled into a self-contained executable.
  • Single-Threaded: NodeJS uses a single-threaded architecture with an event loop. Unlike traditional servers that create a limited number of threads to handle requests, the event mechanism allows the NodeJS server to respond in a non-blocking and scalable manner.
  • Faster Code Execution: NodeJS runs on the V8 JavaScript engine, which is also used by Google Chrome. NodeJS provides a wrapper around this engine, which lets JavaScript execute quickly on the server. As a result, request processing in NodeJS is efficient.
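
The asynchronous, event-driven behaviour described in the list above can be illustrated with a minimal sketch. The fetchLater helper is hypothetical and uses a timer to stand in for a slow API call; it is not part of NodeJS.

```javascript
// Records the order in which things happen.
const log = [];

// Hypothetical stand-in for a slow API call: the callback fires later,
// once the event loop picks up the expired timer.
function fetchLater(callback) {
  setTimeout(() => callback('data'), 0);
}

log.push('request sent');
fetchLater((data) => log.push(`received ${data}`));
// This line runs before the callback, because the call above did not block.
log.push('handling other work');
```

When the last line runs, the callback has not fired yet, so the server was free to do other work while the "request" was pending.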

Explore other key features, benefits, and lots more on the NodeJS homepage.

What is MongoDB?

MongoDB is a document-oriented NoSQL Database designed for storing and analyzing massive volumes of data. Unlike typical Relational Databases, it stores data as Collections and documents rather than tables with rows and columns. Each Collection is made up of many documents, and each document contains key-value pairs, which are the fundamental units of data.

MongoDB was initially released in February 2009. MongoDB, Inc. develops, maintains, and manages it under the Server Side Public License (SSPL).

Key Features of MongoDB

MongoDB has many features that distinguish it from other Databases. The following are some of the most important features:

  • High Data Availability & Durability: MongoDB’s Replication stores the same data, or shards of data, across several servers, which lets you back up and recover data from different servers. This delivers higher Data Availability and Stability and keeps data accessible even in the event of server breakdowns or other issues.
  • Horizontal Scalability: It supports horizontal scalability with the help of sharding, that is, it spreads data across several servers using the Shard Key. This allows expanding amounts of data to be managed effectively with minimal downtime.
  • Efficient Concurrent Management: It can successfully manage numerous concurrent read & write requests for the same data thanks to its concurrency controls and locking protocols.
  • Quick Indexing: It speeds up queries through a wide range of index types and features, including language-specific sort orders that support complex access patterns to datasets.
  • Accelerated Ad-hoc Queries: When executing ad-hoc queries, you may need to evaluate anywhere from hundreds to millions of values. MongoDB indexes BSON documents and uses the MongoDB Query Language (MQL) to return results for ad-hoc queries in real time.

Want to learn more about MongoDB? Explore MongoDB’s Homepage here.

Introduction to the MongoDB Aggregation Framework

In MongoDB, Aggregate operations are expressions that can be used to obtain reduced and aggregated results. The Aggregate Pipeline in MongoDB’s Query API lets you build a Pipeline with one or more stages, each of which performs a different action on your data.

The Aggregation framework serves a similar purpose to the SQL “GROUP BY” clause. Pipelines and Expressions are the two primary components of the Aggregation framework. A Pipeline is a sequence of stages that can process a large number of documents in a short period. Expressions are evaluated against the input documents within a stage and compute the values that appear in the output documents. You can study your data in real-time using the MongoDB Aggregation framework.
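
As a rough illustration of the “GROUP BY” comparison, the sketch below writes a two-stage pipeline as a plain JavaScript array next to an approximate SQL counterpart. The orders collection and the status, customerId, and amount fields are assumptions made up for this example.

```javascript
// Rough SQL counterpart (hypothetical table and columns):
//   SELECT customerId, SUM(amount) AS total
//   FROM orders
//   WHERE status = 'shipped'
//   GROUP BY customerId;

const pipeline = [
  // Stage 1: keep only shipped orders (comparable to WHERE).
  { $match: { status: 'shipped' } },
  // Stage 2: one output document per customer (comparable to GROUP BY).
  // The $sum expression computes the total for each group.
  { $group: { _id: '$customerId', total: { $sum: '$amount' } } },
];
```

Here the array is the Pipeline, each object is a stage, and values such as '$customerId' and { $sum: '$amount' } are Expressions.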

Read along to learn how you can set up a NodeJS MongoDB Aggregation Pipeline using MongoDB Atlas.

Simplify MongoDB ETL & Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, helps load data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services and simplifies the ETL process. It supports 100+ Data Sources such as MongoDB, including 40+ Free Sources. Setup is a 3-step process: select the data source, provide valid credentials, and choose the destination.

Hevo loads the data onto the desired destination in real-time, enriches, and transforms it into an analysis-ready form. Its completely automated pipeline, fault-tolerant, and scalable architecture ensure that the data is handled in a secure, consistent manner with zero data loss without having to write a single line of code.

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled securely and consistently with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.

Simplify your Data Analysis with Hevo today! 

Types of MongoDB Aggregation Stages

You can use the framework to build a MongoDB Aggregation Pipeline with one or more stages. An Aggregation Pipeline works as follows:

  • Each stage performs an operation on the input documents. For instance, a stage can filter, group, or calculate values in documents.
  • The output documents from one stage are passed on to the next stage.
  • An Aggregation Pipeline can return results for groups of documents.
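
The hand-off between stages can be sketched in plain JavaScript. The code below is only a toy model under simplifying assumptions (equality-only matching, three stages, in-memory arrays); it is not how MongoDB implements these stages, and the sample documents are made up.

```javascript
// Toy stand-ins for three stages. Each takes an array of documents and
// returns a new array. This is NOT how MongoDB implements them.
const stages = {
  $match: (docs, query) =>
    docs.filter((doc) =>
      Object.entries(query).every(([field, value]) => doc[field] === value)),
  $skip: (docs, n) => docs.slice(n),
  $limit: (docs, n) => docs.slice(0, n),
};

// Feeds the output of each stage into the next one.
function runPipeline(docs, pipeline) {
  return pipeline.reduce((current, stage) => {
    const [name, arg] = Object.entries(stage)[0];
    if (!stages[name]) throw new Error(`Unsupported stage: ${name}`);
    return stages[name](current, arg);
  }, docs);
}

const docs = [
  { item: 'pen', inStock: true },
  { item: 'ink', inStock: false },
  { item: 'pad', inStock: true },
  { item: 'cap', inStock: true },
];

const result = runPipeline(docs, [
  { $match: { inStock: true } }, // pen, pad, cap
  { $skip: 1 },                  // pad, cap
  { $limit: 1 },                 // pad
]);
console.log(result); // [ { item: 'pad', inStock: true } ]
```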

Many types of Aggregation Stages can be used in an aggregation pipeline. Some of these have been listed below:

  • $match: It filters the input documents according to the specified conditions.
  • $project: It generates a result set with a subset of the input fields or with computed fields.
  • $geoNear: It outputs documents in the order of their proximity to a specific location.
  • $group: It groups documents by one or more fields and performs aggregations on others.
  • $limit: It passes on only the first n documents from the input set.
  • $skip: It ignores the first n documents from the input set.
  • $sort: It sorts all the input documents according to the specified sort keys.
  • $redact: It restricts the contents of the documents based on information stored in the documents themselves.
  • $unwind: It deconstructs an array field with n elements. It returns n documents, each of which holds one element in place of the array.
  • $out: It takes all the documents returned from the prior stage and writes them to a specified collection.
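
To show how several of the stages listed above fit together, here is a minimal sketch of a pipeline that counts the most used tags. The posts documents, their published and tags fields, and the stageNames helper are hypothetical and exist only for this example.

```javascript
// Hypothetical "posts" documents look like:
//   { title: 'Intro', published: true, tags: ['mongodb', 'nodejs'] }
const topTagsPipeline = [
  { $match: { published: true } },                   // filter input documents
  { $unwind: '$tags' },                              // one document per tag
  { $group: { _id: '$tags', count: { $sum: 1 } } },  // count posts per tag
  { $sort: { count: -1 } },                          // most used tags first
  { $limit: 5 },                                     // keep the top five
];

// Returns the operator name of every stage, in order.
function stageNames(pipeline) {
  return pipeline.map((stage) => Object.keys(stage)[0]);
}

console.log(stageNames(topTagsPipeline).join(' -> '));
// $match -> $unwind -> $group -> $sort -> $limit
```

The order matters: placing $match first reduces the number of documents that the later stages have to process.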

To check out the other MongoDB Aggregation Stages, visit the Aggregation Pipeline Stages page of the MongoDB Manual.

Steps to Set Up NodeJS MongoDB Aggregation Pipeline

The MongoDB Aggregation framework is a strong, simple, and lightweight feature that enables you to significantly increase the efficiency of aggregated value computations without having to use map-reduce.

Developers can use MongoDB Atlas’ Aggregation Pipeline Builder to analyze and refine aggregation queries before putting them into application code, thereby saving hours of trial and error. You can take advantage of MongoDB’s powerful query capabilities thanks to features like drag-and-drop stages, code skeletons, and preview mode. The Aggregation Pipeline Builder, together with additional features like the Data Explorer, Triggers, and Charts, makes MongoDB Atlas a powerful tool for boosting Developer productivity.

Before you get started with the NodeJS MongoDB Aggregation Pipeline, make sure you meet the following requirements:

  • Project Data Access Read-Only Role: To create and execute aggregation pipelines in the Data Explorer.
  • Project Data Access Read/Write Role: To use the $out stage in your pipeline.
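
The two requirements above boil down to whether a pipeline contains a $out stage. The requiredRole helper below is a hypothetical convenience written for this article, not an Atlas API; it simply applies the rule as stated in the list.

```javascript
// Returns the role that the requirements above associate with a pipeline:
// read/write when it contains a $out stage, read-only otherwise.
// (Hypothetical helper; role names are copied from the list above.)
function requiredRole(pipeline) {
  const writes = pipeline.some((stage) => Object.keys(stage)[0] === '$out');
  return writes
    ? 'Project Data Access Read/Write'
    : 'Project Data Access Read-Only';
}

console.log(requiredRole([{ $match: { active: true } }]));
console.log(requiredRole([{ $match: { active: true } }, { $out: 'archive' }]));
```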

Follow the steps below to set up the NodeJS MongoDB Aggregation Pipeline:

Step 1: Set up the Aggregation Pipeline Builder

MongoDB Atlas’ built-in Data Explorer allows you to explore and interact with your data by creating Aggregation Pipelines. To access your Data Explorer you can navigate to Databases → Browse Collections. If you are currently viewing a Database Deployment, you can simply click the Collections tab on the top. 

Now follow the steps below to set up the Aggregation view in the Aggregation Pipeline Builder:

  • Firstly, select the Database that contains the Collection you want to work on.
  • Now, on the left-hand side, you can see the Collections list. Select the desired Collection and then click on Aggregation views. The Aggregation view for that Collection will be displayed.

Step 2: Create a new NodeJS MongoDB Aggregation Pipeline

After setting up the Aggregation view, follow the steps below to create the new NodeJS MongoDB Aggregation Pipeline.

  • Navigate to the bottom-left of the panel and click on the Select dropdown list.
  • Now, choose one of the Aggregation Stages to work with from the list.
  • Next, you can modify your Aggregation Stage. If the Comment Mode is turned on, syntax guidelines for the selected stage are shown to guide you.
  • To add more Aggregation Stages in your NodeJS MongoDB Aggregation Pipeline, you can either click on the ‘+’ button or click the Add Stage button. The former adds the stage after the current stage, whereas the latter adds the stage at the end of the Aggregation Pipeline.
  • If you want to add any specific language rules for string comparison, you can set them using the Collation button present at the top of the Aggregation view.
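
To give a feel for what language rules for string comparison mean, the sketch below pairs a collation document with a local approximation. The locale and strength values are only an example, and the comparison uses JavaScript’s built-in Intl.Collator as a rough analogy for strength 2 (case ignored, accents respected); it does not query MongoDB.

```javascript
// Collation document in the shape MongoDB accepts. The locale and
// strength values here are only an example.
const collation = { locale: 'en', strength: 2 };

// Local approximation using the built-in Intl.Collator: sensitivity
// 'accent' ignores case but still distinguishes accented letters,
// which is roughly what strength 2 means.
const collator = new Intl.Collator(collation.locale, { sensitivity: 'accent' });

function sameUnderCollation(a, b) {
  return collator.compare(a, b) === 0;
}

console.log(sameUnderCollation('apple', 'APPLE'));   // true
console.log(sameUnderCollation('resume', 'résumé')); // false
```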

Step 3: Export the NodeJS MongoDB Aggregation Pipeline to NodeJS

After adding all the Aggregation Stages, you can export the NodeJS MongoDB Aggregation Pipeline to NodeJS or another language such as Python 3, Java, or C#. Follow the steps below to export your NodeJS MongoDB Aggregation Pipeline to NodeJS:

  • Navigate to the top of the Atlas Aggregation Pipeline Builder and click on Export to Language.
  • Now, from the dropdown list, select Node.
  • To include the comments in the exported code, tick the checkbox at the bottom.
  • Next, click on the Copy button at the top-right of the pane to copy the NodeJS MongoDB Aggregation Pipeline to the clipboard. Now, you can paste this code into your NodeJS file.
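
Once the pipeline is pasted into a NodeJS file, it still has to be run against a collection. The following is a minimal sketch of one way to do that, assuming the mongodb driver package is installed and a connection string is supplied in the MONGODB_URI environment variable. The stage contents, the sample_db database, and the users collection are hypothetical, and the exact code Atlas exports may look different.

```javascript
// Pipeline as a plain array of stages. The stage contents and all
// names below are hypothetical.
const pipeline = [
  { $match: { status: 'active' } },
  { $sort: { createdAt: -1 } },
  { $limit: 10 },
];

// Runs the pipeline against any object that exposes the driver's
// collection.aggregate(pipeline) method and collects the results.
async function runAggregation(collection) {
  const cursor = collection.aggregate(pipeline);
  return cursor.toArray();
}

// Connects only when a connection string is supplied, so the file can be
// loaded without a database. Assumes the "mongodb" package is installed.
async function main() {
  const { MongoClient } = require('mongodb');
  const client = new MongoClient(process.env.MONGODB_URI);
  try {
    await client.connect();
    const collection = client.db('sample_db').collection('users');
    console.log(await runAggregation(collection));
  } finally {
    await client.close();
  }
}

if (process.env.MONGODB_URI) {
  main().catch(console.error);
}
```

Keeping runAggregation separate from the connection code makes it possible to exercise the pipeline with a stand-in collection object in tests.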

Great Work! You have successfully created a NodeJS MongoDB Aggregation Pipeline and exported the Pipeline to your NodeJS file using the MongoDB Atlas Aggregation Pipeline Builder.

Limitations of MongoDB Aggregation Pipelines

The Aggregate Command has the following constraints when it comes to using the Aggregation Operations in the NodeJS MongoDB Aggregation Pipeline.

  • Result Size Restrictions: The results of the aggregate command can be returned as a cursor or stored in a collection. The BSON Document Size limit of 16 megabytes applies to each document in the result set, and the aggregate command returns an error if any single document exceeds it. The limit only applies to the returned documents; documents may exceed this size during pipeline processing.
  • Stage Restrictions: Starting in MongoDB 5.0, a single pipeline is limited to 1000 Aggregation Pipeline stages.
  • Memory Restrictions: Each pipeline stage is allotted a total of 100 megabytes of RAM. By default, MongoDB generates an error if a stage exceeds this limit.
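
The stage and memory limits can be handled defensively in application code. The sketch below uses a hypothetical aggregateOptions helper: it rejects pipelines with more than 1000 stages and, when asked, returns the allowDiskUse option, which lets stages that would exceed the memory limit write temporary files to disk instead of failing. The largeData flag is an assumption introduced for this example.

```javascript
// Stage limit quoted in the list above.
const MAX_STAGES = 1000;

// Throws if the pipeline has too many stages. Otherwise returns options
// for collection.aggregate(). allowDiskUse lets stages that exceed the
// memory limit write temporary files to disk instead of failing.
function aggregateOptions(pipeline, { largeData = false } = {}) {
  if (pipeline.length > MAX_STAGES) {
    throw new Error(
      `Pipeline has ${pipeline.length} stages; the limit is ${MAX_STAGES}`);
  }
  return largeData ? { allowDiskUse: true } : {};
}

// Usage with a driver collection (not executed here):
//   collection.aggregate(pipeline, aggregateOptions(pipeline, { largeData: true }));
```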

To read more about the limitations of the MongoDB Aggregation Pipeline, refer to the Aggregation Pipeline Limits page of the MongoDB Manual.

Conclusion

To summarize, in this article you gained a basic understanding of how to create a NodeJS MongoDB Aggregation Pipeline. You learned the different types of MongoDB Aggregation Stages and also explored some of the limitations associated with the MongoDB Aggregation Pipeline.

However, as a Developer, extracting complex data from a diverse set of data sources to your MongoDB Database, or creating NodeJS MongoDB Aggregation Pipelines by integrating with other sources, can be quite challenging. This is where a simpler alternative like Hevo can save your day!

Hevo Data is a No-Code Data Pipeline that offers a faster way to move data from 100+ Data Sources such as MongoDB and other 40+ Free Sources, into your Data Warehouse to be visualized in a BI tool. You can easily create a NodeJS MongoDB Aggregation Pipeline using Hevo.

Want to take Hevo for a spin?

SIGN UP and experience the feature-rich Hevo suite first hand. You can also have a look at the unbeatable pricing that will help you choose the right plan for your business needs.

Feel free to ask questions or share your experience with NodeJS MongoDB Aggregation Pipelines in the comments section below!
