Suppose you are working with a MongoDB collection named products that contains information about the products sold by your organization. If you want to analyze how many products were sold per quarter, you may need to count the documents that relate to a particular product.

MongoDB provides a method for exactly this. The MongoDB Count Method returns the number of documents in a collection, optionally restricted to the documents that match a query.

This article will provide you with a decent overview of the MongoDB Count Method along with its syntax and some practical examples. Read along to learn about the MongoDB Count method in detail!

What is MongoDB?

MongoDB is a well-known Open-Source NoSQL Database written in C++. MongoDB is a Document-oriented Database that uses JSON-like documents with a Dynamic Schema to store data. It means that you can store your records without having to worry about the Data Structure, the number of fields or the types of fields used to store values. Documents in MongoDB are similar to JSON Objects.

You can change the structure of records (which MongoDB refers to as Documents) by simply adding new fields or deleting existing ones.

This feature of MongoDB allows you to easily represent Hierarchical Relationships, Store Arrays, and other complex Data Structures. Nowadays, many tech giants, including Facebook, eBay, Adobe, and Google, use MongoDB to store colossal amounts of data.
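
As a rough illustration of the flexible document model described above, the sketch below shows two documents that could sit in the same collection. The field names (tags, dimensions, inStock) and values are hypothetical and chosen only for this example. The documents are written as plain JavaScript objects, which is close to how they appear in the MongoDB shell.

```javascript
// Two hypothetical documents for the same collection. They do not share
// the same set of fields, which a dynamic schema permits.
const productA = {
  name: "Desk Lamp",
  price: 25,
  tags: ["lighting", "office"],            // an array stored in a field
};

const productB = {
  name: "Standing Desk",
  price: 320,
  dimensions: { width: 140, depth: 70 },   // a nested (hierarchical) sub-document
  inStock: true,
};

// List the fields that appear in productB but not in productA.
const fieldsOnlyInB = Object.keys(productB).filter((key) => !(key in productA));
console.log(fieldsOnlyInB); // [ 'dimensions', 'inStock' ]
```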

Key Features of MongoDB

MongoDB offers a range of features that distinguish it from conventional relational databases. Some of these features are discussed below:

  • Schema-Less Database: A Schema-Less Database allows various types of Documents to be stored in a single Collection (the equivalent of a table). In other words, a single MongoDB collection can hold multiple Documents, each of which can have a different number of Fields, Content, and Size. One document does not have to resemble another, whereas Relational Databases require every row in a table to follow the same structure. This gives users a great deal of flexibility.
  • Indexing: MongoDB automatically creates a Primary Index on the _id field of every collection, and you can create Secondary Indices on any other field or combination of fields. Queries that can use an index avoid scanning every document, which makes it faster to retrieve information from the pool of data.
  • Scalability: Sharding in MongoDB allows for Horizontal Scalability. Sharding refers to the process of distributing data across multiple Servers. A large amount of data is partitioned into data chunks using the Shard Key, and these data chunks are evenly distributed across Shards that reside across many Physical Servers.
  • Replication: MongoDB offers high availability of data by creating multiple copies of the data and sending these copies to a different Server so that if one Server fails, the data can still be retrieved from another Server.
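
The idea behind shard-key partitioning can be sketched in a few lines of plain JavaScript. This is only a toy model: toyHash is not the hash function MongoDB uses, real clusters route documents by chunk ranges rather than a simple modulo, and the userId field and helper names are hypothetical.

```javascript
// Toy hash: NOT the hash function MongoDB uses for hashed sharding.
function toyHash(value) {
  let hash = 0;
  for (const ch of String(value)) {
    hash = (hash * 31 + ch.codePointAt(0)) % 1000003;
  }
  return hash;
}

// Map a shard key value to one of `shardCount` shards.
function shardFor(shardKeyValue, shardCount) {
  return toyHash(shardKeyValue) % shardCount;
}

// Group documents by the shard that their shard key value maps to.
function distribute(docs, shardKeyField, shardCount) {
  const shards = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    shards[shardFor(doc[shardKeyField], shardCount)].push(doc);
  }
  return shards;
}

const sampleDocs = [{ userId: "u1" }, { userId: "u2" }, { userId: "u3" }, { userId: "u4" }];
const shards = distribute(sampleDocs, "userId", 2);
console.log(shards.map((shard) => shard.length)); // documents held by each shard
```

The same shard key value always maps to the same shard, which is what lets a router send a query for one key to a single server instead of all of them.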

How to Use the MongoDB Count() Method?

MongoDB Count Method is used to count the number of documents matching specified criteria. This method does not actually perform the find() operation, but rather returns a numerical count of the documents that meet the selection criteria.

A) General Syntax

The general syntax for using the MongoDB Count Method is given below:

db.collection.count(query, options)

B) Parameters Involved in the MongoDB Count Method

The parameters involved in the syntax description on the MongoDB Count Method are as follows:

  • query: It represents the selection criteria. The type of this parameter is a Document.
  • options: It is an optional document that modifies the count operation. The available options are discussed below:
    • limit: It is used to impose a constraint on the maximum number of documents that need to be counted.
    • skip: It is an optional parameter that specifies the number of documents that should be skipped before the MongoDB Count Method starts the actual counting.
    • hint: It is an index name (string) or an index specification document that tells MongoDB which index to use for the query. If you specify an index that does not exist, the operation returns an error.
    • readConcern: Read Concern in MongoDB allows you to manage the consistency and isolation properties of the data read from your Replica Sets. You can leverage the readConcern parameter if you don’t want to use the default read concern.
    • collation: Collation allows users to impose language-specific string comparison rules such as letter case and accent marks rules. It can be specified for a collection or view, an index, or specific operations that support collation.
    • maxTimeMS: It is an optional parameter that specifies the maximum amount of time (in milliseconds) that the count operation is allowed to run.
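
To make the effect of skip and limit concrete, here is a minimal plain-JavaScript model of how the two options change a count. It is a sketch of the behaviour described above, not MongoDB's implementation. The helper name countWithOptions is hypothetical, and treating a limit of 0 as "no limit" is an assumption of this model. The commented line shows the corresponding shell call.

```javascript
// Model: start from the number of documents that match the query, apply
// skip first, then cap the result with limit (0 is treated as "no limit").
function countWithOptions(matchingDocs, { skip = 0, limit = 0 } = {}) {
  const remaining = Math.max(matchingDocs - skip, 0);
  return limit > 0 ? Math.min(remaining, limit) : remaining;
}

// Options document in the shape passed as the second argument to count().
const options = { skip: 1, limit: 3, maxTimeMS: 500 };

// Shell equivalent (needs a running MongoDB deployment):
//   db.collection.count(query, options)
console.log(countWithOptions(5, options));      // 3
console.log(countWithOptions(5, { skip: 10 })); // 0
```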

C) Usage Notes for Using the MongoDB Count Method

Some of the important points that you should be aware of before using the MongoDB Count Method are as follows:

  • You should avoid using the db.collection.count() method without a Query Predicate as it will return the results based on the collection’s metadata, which may result in an approximation of the count.
  • The resulting count on a Sharded Cluster does not correctly filter out orphaned documents. Orphaned Documents in MongoDB are documents on a shard that also exist in chunks on other shards as a result of failed migrations or incomplete migration cleanup after an abnormal shutdown.
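
One way to act on the first note is to choose the counting call explicitly. The sketch below assumes a collection object that exposes countDocuments() and estimatedDocumentCount(), the two methods that newer MongoDB shells and drivers provide as replacements for the deprecated count(). The helper name countDocs and the stub collection are hypothetical and exist only so the snippet runs without a server.

```javascript
// Return an exact count by default, or a metadata-based estimate on request.
// In mongosh these calls return numbers; in the Node.js driver they return
// Promises, so the caller would need to await the result there.
function countDocs(collection, filter = {}, { exact = true } = {}) {
  return exact
    ? collection.countDocuments(filter)        // counts the matching documents
    : collection.estimatedDocumentCount();     // reads the collection metadata
}

// Stand-in for a real collection, for illustration only.
const stubCollection = {
  countDocuments: (filter) => 3,
  estimatedDocumentCount: () => 5,
};

console.log(countDocs(stubCollection, { age: { $gt: 25 } })); // 3
console.log(countDocs(stubCollection, {}, { exact: false })); // 5
```

Keeping the choice in one helper makes it obvious at each call site whether an approximate figure is acceptable.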

D) Practical Examples

Let us understand the working of the MongoDB Count Method in detail with the help of some example queries. Suppose you have a collection named employees that stores the name and age of each employee:

{
    "_id" : ObjectId("600fab1788edfrcbcbc34e"),
    "name" : "John",
    "age" : 22
}
{
    "_id" : ObjectId("600fab1788edfrcbcbc34f"),
    "name" : "David",
    "age" : 32
}
{
    "_id" : ObjectId("600fab1788edfrcbcbc34g"),
    "name" : "Leo",
    "age" : 24
}
{
    "_id" : ObjectId("600fab1788edfrcbcbc34h"),
    "name" : "Theon",
    "age" : 41
}
{
    "_id" : ObjectId("600fab1788edfrcbcbc34i"),
    "name" : "Larry",
    "age" : 46
}

Now, if you want to count the total number of documents in the collection, the query would look something like this:

db.employees.count()

The output of the above query will be 5, as there are a total of five documents in the employees Collection.

Now, if you want to find out the total number of employees whose age is above 25, the query would look something like this:

db.employees.count({age:{$gt:25}})

Here, $gt means greater than. The output for the above query will be 3 as there are a total of three employees (David, Larry, and Theon) whose age is greater than 25.
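
If you do not have a MongoDB instance at hand, the two results above can be checked with a small plain-JavaScript sketch. It re-creates the sample documents as an array and evaluates a tiny subset of the query operators. The matches and count helpers are hypothetical and only mimic the behaviour described here; they are not how MongoDB evaluates queries internally.

```javascript
// The five sample documents (the _id field is omitted for brevity).
const employees = [
  { name: "John", age: 22 },
  { name: "David", age: 32 },
  { name: "Leo", age: 24 },
  { name: "Theon", age: 41 },
  { name: "Larry", age: 46 },
];

// A small subset of the comparison operators, for illustration only.
// Any operator not listed here will cause matches() to throw.
const operators = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
};

// True when every field condition in the query holds for the document.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    if (condition !== null && typeof condition === "object") {
      return Object.entries(condition).every(([op, bound]) =>
        operators[op](doc[field], bound)
      );
    }
    return doc[field] === condition; // plain equality, e.g. { name: "Leo" }
  });
}

// Count the sample documents that satisfy the query (all of them by default).
const count = (query = {}) => employees.filter((doc) => matches(doc, query)).length;

console.log(count());                     // 5
console.log(count({ age: { $gt: 25 } })); // 3
```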


How to Count Documents in a Sharded Cluster?

As discussed in the previous section, if orphaned documents exist or a chunk migration is in progress on a sharded cluster, the MongoDB Count Method without a Query Predicate can produce incorrect results. To avoid such situations, you can use the db.collection.aggregate() method.

Unlike the db.collection.count() method, the Aggregate method does not use the collection’s metadata to return the count. You can leverage the $count stage to retrieve the accurate number of documents in the required collection. The following operation, for example, counts the documents in a collection:

db.collection.aggregate( [
   { $count: "myCount" }
])
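
The same approach extends to filtered counts by placing a $match stage before $count. The sketch below only builds the pipeline as a plain JavaScript array so that it can be inspected without a server. The helper name buildCountPipeline is hypothetical, and the employees collection in the comment is borrowed from the earlier examples.

```javascript
// Build an aggregation pipeline that filters first and then counts.
function buildCountPipeline(filter = {}, outputField = "myCount") {
  return [
    { $match: filter },       // keep only the documents matching the filter
    { $count: outputField },  // emit one document: { <outputField>: <n> }
  ];
}

const pipeline = buildCountPipeline({ age: { $gt: 25 } });
console.log(JSON.stringify(pipeline));

// Shell usage (needs a running MongoDB deployment):
//   db.employees.aggregate(pipeline)
```

When no document matches the filter, the $count stage produces no output document at all, so calling code should treat an empty result as a count of zero.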

Conclusion

This article introduced you to MongoDB and the salient features that it offers. Furthermore, it introduced you to the MongoDB Count method which can help you in counting the number of documents in a collection.

As your business begins to grow, data is generated at an exponential rate across all of your company’s SaaS applications, Databases, and other sources.

To meet these growing storage and computing needs, you would have to invest a portion of your engineering bandwidth to integrate data from all sources, clean and transform it, and finally load it into a Cloud Data Warehouse for further Business Analytics.

All of these challenges can be efficiently handled by a Cloud-Based ETL tool such as Hevo Data.

Give Hevo a try by signing up for a 14-day free trial today.

Former Research Analyst, Hevo Data

Rakesh is a research analyst at Hevo Data with more than three years of experience in the field. He specializes in technologies, including API integration and machine learning. The combination of technical skills and a flair for writing brought him to the field of writing on highly complex topics. He has written numerous articles on a variety of data engineering topics, such as data integration, data analytics, and data management. He enjoys simplifying difficult subjects to help data practitioners with their doubts related to data engineering.