Handling data in recent times needs immense flexibility, and many will not require a rigid traditional RDMS. Here’s where NoSQL Databases come in handy and are popular today. With more flexible requirements and development of modern applications need the support of NoSQL Databases such as MongoDB. Organizations store and retrieve their business data from MongoDB using applications they use to run business activities.
MongoDB comes with many functions and features that help users handle data in the Database with ease. MongoDB aggregation functions are widely used to manipulate data in the Database. It allows users to perform different operations on data. MongoDB Substring is a function that retrieves substring from the data string.
In this article, you will learn about the MongoDB aggregation functions, key features, and architecture of MongoDB. You will also read about the MongoDB Substring function that is widely used in Data Analysis. Moreover, understand how to use the MongoDB Substring function using syntax and examples.
What is MongoDB?
MongoDB is a Document-Oriented NoSQL Database ideal for storing large amounts of data. MongoDB does not utilize tables and rows like typical relational Databases do. Instead, it uses collections and documents. Collections are analogous to relational Database tables in that they include groups of documents and functions.
As part of a platform-as-a-service offering, 10gen Software created MongoDB in 2007. In 2009, the company shifted to an open-source development method with commercial backing and extra services. 10gen had given the name MongoDB Inc. in 2012.
Documents comprise key-value pairs, which constitute MongoDB’s fundamental data type. MongoDB’s appeal stems from its efficient querying and rapid data retrieval capabilities, which provide more profound insights through Big Data Analytics.
Features of MongoDB
Some of the main features of MongoDB are listed below:
- Document-Oriented: There are collections in every Database, and those collections include documents. The number of fields in each document might change, making each one unique. Each document might be distinct in terms of its size and content.
- Flexibility: the document’s structure is more in accordance with how programmers design their classes and objects in their different programming languages. Programmers claim that their classes are rows and columns and contain an organized layout of keys and values.
- Shema Less: MongoDB’s data schema makes it possible to easily create hierarchical connections, store arrays, and other more complicated structures. In MongoDB, rows (or documents, as they’re known) don’t need to be pre-configured with a schema.
- Scalability: The fields in a MongoDB row can be constructed as per requirement on the fly. When it comes to scalability, the MongoDB Database is unmatched. Numerous businesses worldwide have set up clusters, some operating 100s of nodes and storing millions of documents.
MongoDB Architecture
MongoDB’s architecture is composed of the following components:
Database
Additionally, it refers to the physical container for information stored. Different Databases exist on a single MongoDB system. Each Database has its collection of data on the system files.
Collection
The collection contains a variety of materials from a variety of areas. There are no schemas in collections. A single Database is going to manage all of the collections.
Document
The document has been allocated to key values connected with dynamic schemas. The advantage of adopting various schemas is that documents do not have to share the same fields and can have a variety of data kinds.
To learn more about MongoDB, click here.
Effortlessly manage and migrate your sharded MongoDB data with Hevo. Hevo supports MongoDB as one of its 150+ data sources, ensuring seamless data integration and real-time synchronization.
- No-Code Solution: Easily connect and manage your MongoDB data without writing a single line of code.
- Flexible Transformations: Use drag-and-drop tools or custom scripts for data transformation.
- Real-Time Sync: Keep your destination data warehouse updated in real time.
- Auto-Schema Mapping: Automatically handle schema mapping for a smooth data transfer.
Join over 2000 satisfied customers, including companies like Voiceflow and Playtomic, who trust Hevo for their data integration needs. Check out why Hevo is rated 4.7 stars on Capterra.
Get Started with Hevo for Free
Aggregation in MongoDB
Aggregation operations in MongoDB are used to handle data records/documents and produce calculated results. It is analogous to SQL’s aggregate function. It aggregates information from several documents and then executes various operations on the gathered data, such as total, mean, minimum, and maximum, to deliver a calculated result.
MongoDB aggregates data in three distinct ways:
- Aggregation pipeline
- Map-reduce function
- Single-purpose aggregation
Aggregation Pipelines
A document aggregation pipeline is composed of one or more processing stages:
- Each step conducts an operation on the document processed. A stage, for example, can filter documents, group them, and calculate values.
- The documents generated by a step are forward to the following stage.
- A pipeline for document aggregation can produce results for groups of documents. For example, return the MongoDB substring, sum, mean, highest, and lowest values.
The following example of an aggregate pipeline has two steps and returns the total order number of burgers organized by Burger name:
db.orders.aggregate( [
// Stage 1: Filter Burger order documents by Burger size
{
$match: { size: "medium" }
},
// Stage 2: Group remaining documents by Burger name and calculate the total quantity
{
$group: { _id: "$name", totalQuantity: { $sum: "$quantity" } }
}
] )
Single Purpose Aggregation
Single Purpose Aggregation is utilized when we want quick access to documents, such as when we need to count the number of documents or when we need to locate all different values in a document.
It just gives access to the pipeline’s common aggregation mechanism via the count(), distinct(), and estimatedDocumentCount() methods, and hence lacks the pipeline’s flexibility and possibilities.
Let’s use a collection named “sales” that stores purchases:
{
_id: 5bd761dcdo323e45a93ccxsb,
saleDate: 2020-08-25T12:54:14.170+00:00,
items: [
{
"name": " paper",
"price": 19,
// ...
},
],
storeLocation: "St’ Louis",
customer: {
age: 35,
satisfaction: 4,
// ...
},
couponUsed: false,
purchaseMethod: "Delivey"
}
If we wanted to get the overall number of sales, we could run:
const totalNumberOfSales = await collection.countDocuments();
countDocuments() will aggregate the collection’s total number of documents and return it to us.
Integrate MongoDB to Redshift
Integrate MongoDB to Snowflake
Integrate MongoDB to BigQuery
Map Reduce
Map reduction is a technique used to aggregate results for vast amounts of data. Map Reduce serves two primary functions: first, it acts as a map, grouping all documents, and second, it acts as a reducer, performing operations on the grouped data.
Syntax:
db.collectionName.mapReduce(mappingFunction, reduceFunction, {out:'Result'});
MongoDB Substring
Users can use the MongoDB Substring function to extract a substring from a string. MongoDB Substring is an aggregation pipeline operator that helps users manipulate documents in the Database.
$substr
The MongoDB Substring ($substr) returns a substring of a string beginning at the supplied index point and terminating with the specified number of characters. The index is numeric.
Syntax:
{ $substr: [ <string>, <start>, <length> ] }
Here,
- If the start value is negative then the MongoDB Substring function will return an empty string “ ”.
- If the value of length is negative then the MongoDB Substring function will return the substring starting from the specified starting value or index till the full remaining length of the string.
The arguments may be any acceptable expression as long as the first is a string and the second and third are integers.
Consider the following MongoDB Substring example for stock levels collection of documents:
{ "_id" : 1, "item" : "PQR1", quarter: "12Q1", "description" : "product 1" }
{ "_id" : 2, "item" : "PQR2", quarter: "12Q4", "description" : "product 2" }
{ "_id" : 3, "item" : "ABC1", quarter: "15Q2", "description" : null }
The following procedure, which makes use of the MongoDB Substring $substr operator, converts the quarter value to a yearSubstring and a quarterSubstring. The following code is given below.
db.inventory.aggregate(
[
{
$project:
{
item: 1,
yearSubstring: { $substr: [ "$quarter", 0, 2 ] },
quarterSubtring: { $substr: [ "$quarter", 2, -1 ] }
}
}
]
)
The following output for MongoDB Substring extracted is returned by the operation, shown below.
{ "_id" : 1, "item" : "PQR1", "yearSubstring" : "12", "quarterSubstring" : "Q1" }
{ "_id" : 2, "item" : "PQR2", "yearSubstring" : "12", "quarterSubstring" : "Q4" }
{ "_id" : 3, "item" : "ABC1", "yearSubstring" : "15", "quarterSubstring" : "Q2" }
Start your MongoDB Integration in Real-time
No credit card required
Conclusion
In this article, you will learn about MongoDB and what are the aggregation functions in MongoDB. You also read about one of the widely used aggregation functions, MongoDB Substring that is used to extract a piece of string or a substring from the given string based on the conditions provided by the user. In which you went through examples of how to use MongoDB Substring $substr function with their outputs. MongoDB Substring helps in cleaning data values, splitting the strings, etc. that can be useful for companies in Data Analytics.
MongoDB substring is a valuable aggregation technique. MongoDB ranks 5th overall in ranking the most popular Database management systems worldwide as of January 2022. Performing all reporting, transformation, and sophisticated querying operations is critical to a Database developer’s job, and MongoDB is an efficient partner.
MongoDB Database stores valuable business data that can be used to generate insights. Companies need to analyze their business data stored in multiple data sources. The data needs to be loaded to the Data Warehouse to get a holistic view of the data. Hevo Data is a No-code Data Pipeline solution that helps to transfer data from 150+ data sources to desired Data Warehouse. It fully automates the process of transforming and transferring data to a destination without writing a single line of code.
Sign Up here for a 14-day free trial and experience the feature-rich Hevo suite first hand.
Share your experience of learning about the MongoDB Substring Aggregation function in the comments section below!
FAQs
How to Split a Collection in MongoDB?
MongoDB does not natively support splitting collections, but you can achieve it by querying subsets of data and inserting them into new collections.
db.originalCollection.find({field: "criteria"}).forEach(doc => { db.newCollection.insert(doc); });
How to Use $gt
in MongoDB?
The $gt
(greater than) operator filters documents where a field value is greater than a specified value.
db.collection.find({age: {$gt: 30}});
This retrieves documents where the age
field is greater than 30.
How to Compare Two Strings in MongoDB?
Use the $expr
operator with $eq
, $gt
, or $lt
to compare strings within a query.
db.collection.find({ $expr: {$gt: ["$stringField1", "$stringField2"]} });
This compares stringField1
and stringField2
values.
Dimple is an experienced Customer Experience Engineer with four years of industry proficiency, including the last two years at Hevo, where she has significantly refined customer experiences within the innovative data integration platform. She is skilled in computer science, databases, Java, and management. Dimple holds a B.Tech in Computer Science and excels in delivering exceptional consulting services. Her contributions have greatly enhanced customer satisfaction and operational efficiency.