MongoDB Profiler Setup: An Easy Step-by-Step Guide 101

• February 25th, 2022

For any organization that generates data and maintains databases, improving database performance is essential to staying productive. As your database grows, you face more situations that can impede its reliability and performance. You can handle these by taking steps such as rewriting expensive queries or analyzing and optimizing your database schema.

MongoDB is a reliable document database platform whose tooling helps you build efficient and reliable databases. These tools include the MongoDB Profiler, mongostat, and mongotop.

In this write-up, we will look at how to set up the MongoDB Profiler and use it to get query-level insights: finding slow queries, specifying the threshold for slow queries, setting filters that determine which operations are profiled, and querying the profiler's output.

Prerequisites

  • MongoDB account: Log in to your MongoDB account. If you have not set one up yet, create one first.
  • mongod: mongod is the primary daemon process of MongoDB (the name is short for “Mongo daemon”). It runs in the background and manages all MongoDB server tasks, for example accepting requests, responding to clients, and managing memory.
  • mongosh: mongosh is short for MongoDB Shell. It is a full-featured JavaScript and Node.js REPL environment for interacting with MongoDB deployments. It can be used to test queries and operations directly against the database. mongosh is available as a standalone package from the MongoDB Download Center.
    The MongoDB documentation explains how to download and install the mongosh binary.

What is MongoDB?

MongoDB is a document database used to build highly available and scalable internet applications. Its flexible schema approach suits agile development, which makes it a tool of choice for developers building and deploying applications. It is a document-oriented NoSQL database management program and an alternative to traditional relational databases: it stores data in collections and documents rather than in tables and rows, which makes MongoDB useful when working with large sets of distributed data.

MongoDB is a tool that can manage document-oriented information by using a JSON-like format to store and retrieve information and this format directly maps to native objects in most modern programming languages.

MongoDB offers drivers for all the major programming languages and allows developers to start building applications immediately without having to configure a database first. MongoDB also makes it easy to store structured and unstructured data, and it can handle a high volume of data by scaling both vertically and horizontally.

Simplify Data Analysis with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, helps you load data from any data source, such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services, and simplifies the ETL process. It supports 100+ data sources (including 40+ free data sources) like Asana, and setup is a 3-step process: select the data source, provide valid credentials, and choose the destination. Hevo not only loads the data onto the desired Data Warehouse/destination but also enriches the data and transforms it into an analysis-ready form without having to write a single line of code.

GET STARTED WITH HEVO FOR FREE

Its completely automated pipeline delivers data in real time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different BI tools as well.

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.
SIGN UP HERE FOR A 14-DAY FREE TRIAL

What is Profiler in MongoDB?

A database system decides how to handle each query: which plan to follow and which physical operations to perform at any given time. The database profiler collects information about the queries that are executed on an individual database instance, including details about the database and the connections on that instance.

MongoDB has a built-in profiler that gives you query-level insights into why the database chooses a certain operation. It also provides in-depth details of the operations performed, such as the queries run and sampling time-slices. The profiler records the individual queries that are executed, covering all CRUD operations as well as configuration and management commands.

How to Use the MongoDB Profiler?

The MongoDB profiler collects detailed information about database commands executed against a running mongod instance and writes this data to the system.profile collection of each profiled database.

The profiler is disabled by default because it affects database performance and consumes a significant amount of memory. It can be enabled on a per-database or per-instance basis at one of the available profiling levels.

In this section, we will look at how to enable and configure the database profiler: specifying the threshold for slow operations, setting filters to determine profiled operations, checking your profiling level, and disabling the profiler, amongst other things.

What Information Do You Get from system.profile?

The profiler saves each profiled operation in db.system.profile. It can be queried like any other collection, so you can find a lot of information about queries directly.

Along with the query source and an overview of the execution steps, you can get the following information from db.system.profile:

  1. When the query was executed
  2. How long the query took to execute
  3. The number of documents examined to execute the query
  4. The type of plan the query execution used
  5. Whether the query was able to make full use of an index
  6. What kinds of locks were acquired during the execution

Enabling and Configuring MongoDB Profiler

The MongoDB profiler is enabled from mongosh, or through a driver using the profile command. To enable profiling and set the profiling level, call the db.setProfilingLevel() helper in mongosh and pass the profiling level you want as a parameter.

There are three profiling levels available in the database profiler:

Level | Description
0 | The profiler is off and does not collect any data. This is the default level. mongod still writes operations longer than the slowOpThresholdMs threshold to its log.
1 | The profiler collects data for slow operations only. By default, slow operations are those that take longer than 100 milliseconds, though you can use the slowOpThresholdMs runtime option or the setParameter command to modify your definition of slow operations.
2 | The profiler collects data for all database operations.

The example below enables profiling for all database operations using mongosh.

db.setProfilingLevel(2)

The shell returns a result as seen below.

{ "was" : 0, "slowms" : 100, "ok" : 1 }

The result above contains the previous level in “was”, the slow operation threshold in “slowms”, and the “ok” : 1 key-value pair, which indicates that the operation was successful.

How to Specify the Threshold for Slow Operations?

By default, the slow operation threshold is 100 milliseconds, so a database with a profiling level of 1 will record operations slower than 100 milliseconds. Changing the slow operation threshold for the database profiler applies to the entire mongod instance: when the threshold is altered, it changes for all databases on the instance, and it also affects the profiling subsystem’s slow operation threshold for the entire mongod instance. It is therefore advisable to set the threshold to the highest useful value.

To alter the threshold, pass two parameters to the db.setProfilingLevel() helper in mongosh. The first parameter, as seen earlier, is the profiling level. The second is an options document whose slowms field sets the slow operation threshold for the entire mongod instance, as shown in the example below.

db.setProfilingLevel(1, { slowms: 20 })

The example above sets the profiling level at 1 and the slow operation threshold for the mongod instance to 20 milliseconds.
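The db.setProfilingLevel() helper is a wrapper around the profile database command, which is also what you would send through a driver. The sketch below shows how such a command document could be assembled in plain JavaScript. The function name buildProfileCommand and its validation rules are our own assumptions for illustration, not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds the document for the
// `profile` database command that db.setProfilingLevel() wraps.
function buildProfileCommand(level, options = {}) {
  // Sanity check chosen for this sketch: only the three levels in the table.
  if (![0, 1, 2].includes(level)) {
    throw new RangeError("profiling level must be 0, 1, or 2");
  }
  const command = { profile: level };
  if (options.slowms !== undefined) {
    // Assumption of this sketch: reject negative or non-numeric thresholds.
    if (typeof options.slowms !== "number" || options.slowms < 0) {
      throw new RangeError("slowms must be a non-negative number");
    }
    command.slowms = options.slowms;
  }
  return command;
}

const command = buildProfileCommand(1, { slowms: 20 });
console.log(JSON.stringify(command)); // {"profile":1,"slowms":20}
// In mongosh, against a test deployment: db.runCommand(command)
```

Building the document separately makes it easy to review or log the exact setting before it is applied to the instance.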

How to Set Filters to Determine MongoDB Profiler Operations?

With the MongoDB profiler, you can set a filter to control which operations are profiled and logged at a given point in time. This is done by passing a filter option to the db.setProfilingLevel() helper in mongosh as seen below:

db.setProfilingLevel( 2, { filter: { op: "query", millis: { $gt: 2000 } } } )

The example above sets a profiling level of 2 and a filter of { op: "query", millis: { $gt: 2000 } }, so only query operations that take longer than 2 seconds will be logged to the profiler.
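To make the effect of that filter concrete, here is a plain-JavaScript sketch that applies the same condition to a few in-memory operations. The sample operations and the function name matchesProfilerFilter are invented for illustration. The real filtering happens inside mongod.

```javascript
// Sample operations, invented for illustration only.
const operations = [
  { op: "query", ns: "mydb.Log", millis: 2500 },
  { op: "query", ns: "mydb.Log", millis: 40 },
  { op: "update", ns: "mydb.Log", millis: 3100 },
];

// Plain-JavaScript equivalent of { op: "query", millis: { $gt: 2000 } }.
function matchesProfilerFilter(operation) {
  return operation.op === "query" && operation.millis > 2000;
}

const profiled = operations.filter(matchesProfilerFilter);
console.log(profiled.length); // 1
```

Only the first operation passes: the second query is too fast, and the slow update is not a query operation.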

How to Check MongoDB Profiler Level?

To view your profiling level, type the following in mongosh.

db.getProfilingStatus()

The shell will return a document that looks like the one below:

{ "was" : 0, "slowms" : 100, "sampleRate" : 1.0, "ok" : 1 }

The “was” field shows your current level of profiling, and “slowms” shows the slow operation threshold, that is, how long an operation must run in milliseconds before it is considered slow. In the example above, “slowms” is 100 milliseconds. The “sampleRate” field indicates the fraction of slow operations that should be profiled, so 1.0 means all of them.
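If you check the status from a script, it can help to turn the status document into a readable line. The following is a minimal sketch in plain JavaScript. The helper name describeProfilingStatus and the wording of its output are assumptions made for this example.

```javascript
// Level meanings as described in the profiling levels table.
const LEVEL_NAMES = { 0: "off", 1: "slow operations only", 2: "all operations" };

// Hypothetical helper: formats a document shaped like the output of
// db.getProfilingStatus().
function describeProfilingStatus(status) {
  const level = LEVEL_NAMES[status.was] ?? "unknown";
  // sampleRate is a fraction between 0 and 1; default to 1 if it is absent.
  const percent = Math.round((status.sampleRate ?? 1) * 100);
  return `level ${status.was} (${level}), slowms ${status.slowms} ms, ` +
    `sampling ${percent}% of slow operations`;
}

console.log(describeProfilingStatus({ was: 0, slowms: 100, sampleRate: 1.0, ok: 1 }));
// level 0 (off), slowms 100 ms, sampling 100% of slow operations
```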

To return only the profiling level, read the “was” field of the status document in mongosh as indicated below. (The legacy mongo shell also provides a db.getProfilingLevel() helper for this.)

db.getProfilingStatus().was

How to Disable MongoDB Profiler?

To disable MongoDB profiler, type db.setProfilingLevel(0) in mongosh as seen below.

db.setProfilingLevel(0)

When the level is set to 0, profiling is disabled.

How to View MongoDB Profiler Output?

Information about your database operations, such as read and write operations, cursor operations, and database commands, is stored by the database profiler and can be viewed by querying the system.profile collection. You can also attach a comment to a query by using the $comment operator, which makes it easier to analyze the data obtained from the profiler.

Below is a sample document from the system.profile collection, returned by the following command. Some of the fields found in the output are described after it.

db.system.profile.find().limit(1).pretty()

Output:

{
    "op" : "query",
    "ns" : "mydb.Log",
    "query" : {
        "find" : "Log",
        "filter" : {
            "EMP_ID" : "01778"
        }
    },
    "keysExamined" : 0,
    "docsExamined" : 90022,
    "cursorExhausted" : true,
    "keyUpdates" : 0,
    "writeConflicts" : 0,
    "numYield" : 703,
    "locks" : {
        "Global" : {
            "acquireCount" : {
                "r" : NumberLong(1408)
            }
        },
        "Database" : {
            "acquireCount" : {
                "r" : NumberLong(704)
            }
        },
        "Collection" : {
            "acquireCount" : {
                "r" : NumberLong(704)
            }
        }
    },
    "nreturned" : 60,
    "responseLength" : 17676,
    "protocol" : "op_command",
    "millis" : 40,
    "execStats" : {
        "stage" : "COLLSCAN",
        "filter" : {
            "EMP_ID" : {
                "$eq" : "01778"
            }
        },
        "nReturned" : 60,
        "executionTimeMillisEstimate" : 30,
        "works" : 90024,
        "advanced" : 60,
        "needTime" : 89963,
        "needYield" : 0,
        "saveState" : 703,
        "restoreState" : 703,
        "isEOF" : 1,
        "invalidates" : 0,
        "direction" : "forward",
        "docsExamined" : 90022
    },
    "ts" : ISODate("2018-09-09T07:24:56.487Z"),
    "client" : "127.0.0.1",
    "allUsers" : [ ],
    "user" : ""
}
  • op: This field identifies the type of operation you carried out.
  • ns: This field holds the target database and collection name.
  • query: This field stores the query document. Normally, if the document size is greater than 50KB, it will be truncated.
  • keysExamined: The number of index keys examined by the database to execute the query is found in this field.
  • docsExamined: This stores the total number of documents examined by the database.
  • nreturned: This field holds the number of documents returned by the query.
  • millis: This field contains the time in milliseconds that the operation took to execute.
  • ts: This field contains the timestamp of the query.
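Read together, these fields tell you how much work a query did for what it returned. The sketch below takes a subset of the sample document above and computes the number of documents examined per document returned. The helper name scanSummary and the shape of its result are our own choices for illustration.

```javascript
// Subset of the sample system.profile document shown above.
const entry = {
  op: "query",
  ns: "mydb.Log",
  keysExamined: 0,
  docsExamined: 90022,
  nreturned: 60,
  millis: 40,
  execStats: { stage: "COLLSCAN" },
};

// Hypothetical helper: summarizes the work done per returned document.
function scanSummary(doc) {
  const examinedPerReturned =
    doc.nreturned > 0 ? doc.docsExamined / doc.nreturned : doc.docsExamined;
  return {
    ns: doc.ns,
    millis: doc.millis,
    examinedPerReturned: Math.round(examinedPerReturned),
    // A COLLSCAN stage means the whole collection was scanned.
    collectionScan: doc.execStats?.stage === "COLLSCAN",
  };
}

console.log(scanSummary(entry));
```

For the sample document, roughly 1,500 documents were examined for every document returned, and keysExamined is 0 with a COLLSCAN stage, which suggests that an index on EMP_ID is worth considering.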

Sample Commands using the MongoDB profiler

This section contains some sample commands that can be used to query the system.profile collection to get different information about your database operations using the MongoDB profiler.

1: To return the 10 most recent log entries in the system.profile collection, run the following query:

db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()

2: To return queries that are performing large scans on the database, type the following:

db.system.profile.find({docsExamined:{$gt:10000}}).pretty()

3: To return all operations except command operations ($cmd), run a query like the one below:

db.system.profile.find( { op: { $ne : 'command' } } ).pretty()

4: To return all the operations for which some documents were moved, run the query below:

db.system.profile.find({moved:true}).pretty()

5: To return operations for a particular collection, type the query similar to the one below and it will return operations found in the mydb database’s test collection:

db.system.profile.find( { ns : 'mydb.test' } ).pretty()

6: To return operations slower than 5 milliseconds, run a query similar to the following:

db.system.profile.find( { millis : { $gt : 5 } } ).pretty()

7: To return the top 10 slowest aggregation/command queries, run a query similar to the one below:

db.system.profile.find({op: {$eq: "command" }}).sort({millis:-1}).limit(10).pretty();

8: To return information from a certain time range, run the following:

db.system.profile.find({
  ts : {
    $gt: new ISODate("2012-12-09T03:00:00Z"),
    $lt: new ISODate("2012-12-09T03:40:00Z")
  }
}).pretty()

9: To return the maximum and average time taken by each type of operation using aggregation, type the following command:

db.system.profile.aggregate(
{ $group : { 
   _id :"$op", 
   count:{$sum:1},
   "max_time":{$max:"$millis"},
   "avg_time":{$avg:"$millis"}
}}).pretty()

10: To return the maximum and average time taken by operations in each namespace (database and collection) using aggregation, run the query:

db.system.profile.aggregate(
{ $group : {
  _id :"$ns",
  count:{$sum:1},
  "max_time":{$max:"$millis"},
  "avg_time":{$avg:"$millis"}
}}).pretty()

11: To return the aggregation pipelines that were profiled, sorted from the most recently recorded, run the following query:

db.system.profile.find({
"command.pipeline": { $exists: true }
}, {
"command.pipeline":1
}).sort({$natural:-1}).pretty();

12: To return all queries with a particular comment, type the following query:

db.Customers.find({
"Name.Last Name" : "Johnston"
}, {
"_id" : NumberInt(0),
"Full Name" : NumberInt(1)
}).sort({
"Name.First Name" : NumberInt(1)
}).comment( "Find all Johnstons and display their full names alphabetically" );

//display all queries with comments
db.system.profile.find({ "query.comment": {$exists: true} }, { query: 1 }).pretty()
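To see what the $group stage in samples 9 and 10 computes, here is a plain-JavaScript sketch that produces the same count, max_time, and avg_time per operation type from an in-memory array. The sample entries and the function name statsByOperation are invented for illustration. On a real deployment the aggregation runs inside MongoDB.

```javascript
// Invented sample entries; real ones come from db.system.profile.
const entries = [
  { op: "query", millis: 40 },
  { op: "query", millis: 10 },
  { op: "update", millis: 30 },
];

// Mirrors { $group: { _id: "$op", count, max_time, avg_time } } in plain JS.
function statsByOperation(docs) {
  const groups = new Map();
  for (const { op, millis } of docs) {
    const group = groups.get(op) ?? { count: 0, total: 0, max_time: -Infinity };
    group.count += 1;
    group.total += millis;
    group.max_time = Math.max(group.max_time, millis);
    groups.set(op, group);
  }
  return [...groups].map(([op, group]) => ({
    _id: op,
    count: group.count,
    max_time: group.max_time,
    avg_time: group.total / group.count,
  }));
}

console.log(statsByOperation(entries));
```

With these entries, the query group has a count of 2, a max_time of 40, and an avg_time of 25.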

What Causes MongoDB Queries to Slow Down?

To find slow queries, you first need to define what you consider slow, that is, a threshold in milliseconds. You can, of course, save all queries and aggregations and then select only those that match your criteria. As seen above, you can use the following command to get all operations that took more than 20ms.

 // Find operations that took more than 20ms
 db.system.profile.find({ "millis": { $gt: 20 } })

On a heavily loaded system, you will find that this produces too much “noise”. In that case, it is much better to use profiling level 1, which saves only slow queries. However, some queries are executed very often, for example a website authentication routine: each execution takes a reasonable amount of time, yet the routine as a whole is inefficient. To catch these, you need a measure of cumulative time over a period, which requires full profiling at level 2. Still, it often makes sense to save only slow queries. In that case, collect data only for operations that take longer than the slowms threshold, which you can set to, for example, 30ms.
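One way to measure cumulative time is to group the profiler entries by namespace and operation type and sum their millis values. The sketch below builds such an aggregation pipeline as a plain JavaScript array. The function name cumulativeTimePipeline and the output field names are assumptions made for this example, and the grouping key is only one reasonable choice.

```javascript
// Hypothetical helper: builds a pipeline that ranks namespaces and operation
// types by the total time they consumed, highest first.
function cumulativeTimePipeline(limit = 10) {
  return [
    {
      $group: {
        _id: { ns: "$ns", op: "$op" },
        count: { $sum: 1 },
        total_millis: { $sum: "$millis" },
      },
    },
    { $sort: { total_millis: -1 } },
    { $limit: limit },
  ];
}

console.log(JSON.stringify(cumulativeTimePipeline(5), null, 2));
// In mongosh, with level 2 profiling enabled on a test deployment:
// db.system.profile.aggregate(cumulativeTimePipeline(5))
```

A frequently executed operation with a modest millis value per call can rank above a rare slow query here, which is exactly the case that a slowms threshold alone would miss.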

For more information about the MongoDB Profiler, you can visit MongoDB’s official documentation here.

Conclusion

In this piece, you have been introduced to MongoDB and its Database Profiler, which helps you handle issues with your database by collecting information about its operations that you can retrieve through a query.

It also explained how the MongoDB profiler can be enabled and disabled, how to check the MongoDB profiler level, and how to view the profiler output, with an example that explains its content. Having gone through this article, you should be more efficient in querying your database to get useful information.

MongoDB is a trusted source that a lot of companies use as it provides many benefits, but transferring data from it into a data warehouse is a hectic task. An automated data pipeline helps in solving this issue, and this is where Hevo comes into the picture. Hevo Data is a No-code Data Pipeline with 100+ pre-built integrations that you can choose from.

Visit our website to explore Hevo

Hevo can help you integrate your data from numerous sources and load it into a destination to analyze real-time data with a BI tool such as Tableau. It will make your life easier and data migration hassle-free. It is user-friendly, reliable, and secure.

SIGN UP for a 14-day free trial and see the difference!

Share your experience of learning about MongoDB Profiler in the comments section below.
