MongoDB Python Insertion 101: Syntax & Usage Simplified

By: Published: February 3, 2022

Apart from the traditional Relational Databases, a more flexible, scalable, reliable approach to handle and process rapidly growing data is by using NoSQL databases like MongoDB. It is an Open-Source Document Oriented Database Management System that allows you to store Unstructured Datasets.

One of the performance advantages of MongoDB is efficient data insertion. You can perform the MongoDB Python Insertion using the insert_one() and insert_many() methods. Instead of inserting documents one at a time, insert_many() sends a whole batch in a single call, which reduces round trips to the server and improves write throughput.

In this article, you will learn how to easily perform MongoDB Python Insertion.

What is MongoDB?

MongoDB is a NoSQL Open-Source Document Oriented Database developed for storing and processing high volumes of data. Unlike conventional Relational Databases, which use tables consisting of rows and columns, MongoDB uses collections and documents. A collection consists of several documents, and each document stores the basic units of data as key-value pairs.

Introduced in February 2009, the MongoDB database is designed, maintained, and managed by MongoDB, Inc. under the SSPL (Server Side Public License). Organizations such as Facebook, Nokia, eBay, Adobe, and Google use it for efficiently handling and storing their rapidly growing data. Drivers and libraries are available for programming languages such as C, C++, C#, Go, Java, Node.js, Perl, PHP, Python, Ruby, Scala, and Swift, along with tools such as Motor (an asynchronous Python driver) and Mongoid (an object-document mapper for Ruby).

Key Features of MongoDB

With constant efforts from the online community, MongoDB has evolved over the years. Some of its eye-catching features are:

  • High Data Availability & Stability: MongoDB’s Replication feature provides multiple servers for disaster recovery and backup. Since several servers store the same data or shards of data, MongoDB provides greater Data Availability & Stability. This ensures all-time data access and security in case of server crashes, service interruptions, or even good old hardware failure. 
  • Accelerated Analytics: You may need to consider thousands to millions of variables while running Ad-hoc queries. MongoDB indexes BSON documents and uses the MongoDB Query Language (MQL), which lets you run and adjust Ad-hoc queries in real-time. MongoDB supports field queries, range queries, and regular expression searches along with user-defined functions.
  • Indexing: With a wide range of indices and features with language-specific sort orders that support complex access patterns to datasets, MongoDB provides optimal performance for every query. For the real-time ever-evolving query patterns and application requirements, MongoDB also provisions On-Demand Indices Creation.
  • Horizontal Scalability: With the help of Sharding, MongoDB provides Horizontal Scalability by distributing data on multiple servers using the Shard Key. Each shard in every MongoDB Cluster stores parts of the data, thereby acting as a separate database. This collection of comprehensive databases allows efficient handling of growing volumes of data with zero downtime. The complete Sharding Ecosystem is maintained and managed by Mongos that directs queries to the correct shard based on the Shard Key.
  • Load Balancing: Real-time Replication and Sharding contribute towards large-scale Load Balancing. Ensuring top-notch Concurrency Controls and Locking Protocols, MongoDB can effectively handle multiple concurrent read and write requests for the same data.  
  • Aggregation: Similar to the SQL Group By clause, MongoDB can easily batch process data and present a single result even after executing several other operations on the group data. MongoDB’s Aggregation framework consists of 3 types of aggregations i.e. Aggregation Pipeline, Map-Reduce Function, and Single-Purpose Aggregation methods.
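To make the GROUP BY comparison in the Aggregation bullet concrete, here is a minimal sketch. The sample documents and the "city" field are assumptions made up for illustration. The pipeline is the kind of list you would pass to a pymongo collection's aggregate() method, and the pure-Python lines after it only imitate what the server would compute, so the block runs without a database.

```python
from collections import Counter

# Hypothetical sample documents; the "city" field is an assumption for illustration.
customers = [
    {"name": "Suraj", "city": "Delhi"},
    {"name": "Hanz", "city": "Mumbai"},
    {"name": "Jason", "city": "Delhi"},
]

# An aggregation pipeline is an ordinary list of stage dictionaries.
# With pymongo it would be passed to collection.aggregate(pipeline).
pipeline = [
    {"$group": {"_id": "$city", "count": {"$sum": 1}}},
    {"$sort": {"_id": 1}},
]

# Pure-Python imitation of the $group and $sort stages above, shown only to
# illustrate the shape of the result the server would return.
counts = Counter(doc["city"] for doc in customers)
grouped = [{"_id": city, "count": n} for city, n in sorted(counts.items())]
print(grouped)
```

The $group stage plays the role of GROUP BY city, and {"$sum": 1} plays the role of COUNT(*).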

What is Python?

Python is a widely-used interpreted, Object-Oriented, General-Purpose programming language. Launched in 1991, Python has now evolved into an advanced application development language offering features such as high-level data structures, dynamic typing, dynamic binding, etc. It is a powerful tool that is used for developing websites and software, Task Automation, Data Analysis, and Data Visualization. Popular third-party Python libraries for data science include TensorFlow, NumPy, Pandas, and Matplotlib.

Owing to its beginner-friendliness, it is also used by non-coders such as accountants and scientists, for a broad range of everyday tasks, like organizing finances, numerical computations, scientific graphs, etc. Python was specially developed for better readability with influences from English and Mathematics. It is completely free as it comes under the GPL-compatible license certified by the Open Source Initiative.

Key Features of Python

Since its inception, Python has become a popular choice for several tasks due to the following eye-catching features:

  • Interpreted Language: Python code is not compiled ahead of time into machine instructions; it is executed by an interpreter. In an interactive environment such as IDLE (Python's Integrated Development and Learning Environment), you can run one statement at a time and see its output immediately.
  • Dynamically Typed: Unlike statically-typed languages such as Java, Python doesn’t require you to declare the data type of a variable in advance. The Interpreter determines the data type at runtime.
  • Graphical User Interface(GUI) Support: You can easily create GUIs using Python. Modules in Python such as Tkinter, PyQt, wxPython, or Pyside can be used to achieve this. You also get to enjoy a huge number of GUI frameworks and various other cross-platform solutions.
  • Object Oriented Programming Language: Providing you a platform to solve real-world problems using the Object-Oriented Approach, Python allows you to implement the concepts of Encapsulation, Inheritance, Polymorphism, etc.
  • Flexible: Python can be extended with modules written in C or C++, so performance-critical parts of a program can be implemented and compiled in those languages. Python is also compatible with Windows, Mac, and Linux. Hence, if you write your code on Windows, you generally don’t need to change it for other platforms.
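The Dynamically Typed point in the list above can be seen in a few lines. This is a small illustrative sketch; the variable name and values are made up for the example.

```python
# The same name can be rebound to values of different types at runtime;
# no type declaration is needed.
rollno = 34
first_type = type(rollno).__name__

rollno = "34-A"
second_type = type(rollno).__name__

print(first_type, second_type)
```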

Simplify MongoDB ETL with Hevo’s No-code Data Pipeline

Hevo Data, a No-code Data Pipeline, helps to load data from any data source such as Databases, SaaS applications, Cloud Storage, SDKs, and Streaming Services and simplifies the ETL process. It supports MongoDB, MongoDB Atlas, and Python, along with 100+ data sources (Including 40+ Free Data Sources), and is a 3-step process: select the data source, provide valid credentials, and choose the destination. Hevo not only loads the data onto the desired Data Warehouse but also enriches the data and transforms it into an analysis-ready form without having to write a single line of code.

Its completely automated pipeline delivers data in real-time without any loss from source to destination. Its fault-tolerant and scalable architecture ensures that the data is handled in a secure, consistent manner with zero data loss and supports different forms of data. The solutions provided are consistent and work with different BI tools as well.

Check out why Hevo is the Best:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Schema Management: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Connectors: Hevo supports 100+ Integrations to SaaS platforms such as WordPress, FTP/SFTP, Files, Databases, BI tools, and Native REST API & Webhooks Connectors. It supports various destinations including Google BigQuery, Amazon Redshift, Snowflake, Firebolt, Data Warehouses; Amazon S3 Data Lakes; Databricks, MySQL, SQL Server, TokuDB, MongoDB, DynamoDB, PostgreSQL Databases to name a few.  
  • Hevo Is Built To Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular point in time.

How to perform MongoDB Python Insertion?

You can use the Python pymongo driver to easily perform the MongoDB Python Insertion task. To completely understand the process of MongoDB Python Insertion, you can go through the following aspects:

1) MongoDB Python Insertion: Insert One Document

To insert a single document or record, you can use the insert_one() method. This method takes a dictionary as a parameter that contains the names and values of each field in the document you want to insert. To perform the MongoDB Python insertion using insert_one(), you can check out the following example:

from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

myclient = MongoClient("mongodb://localhost:27017/")

try:
    # MongoClient connects lazily, so ping the server to confirm it is reachable
    myclient.admin.command("ping")
    print("Connected successfully!!!")
except ConnectionFailure:
    print("Could not connect to MongoDB")
    raise

# Select the database and the collection
db = myclient["mydatabase"]
mycollection = db["students"]

student_rec1 = {
    "name": "John Smith",
    "rollno": 34,
    "address": "47 North Street Delhi"
}
student_rec2 = {
    "name": "Max Earl",
    "rollno": 56,
    "address": "23 Park Street Delhi"
}

# insert_one() returns an InsertOneResult; inserted_id holds the new _id
student_id1 = mycollection.insert_one(student_rec1).inserted_id
student_id2 = mycollection.insert_one(student_rec2).inserted_id

print("Data inserted with student ids", student_id1, student_id2)

# Print every document in the collection
for record in mycollection.find():
    print(record)

Output:

Connected successfully!!!
Data inserted with student ids 5a02227b37b8552becf5ed2a 5a02227c37b8552becf5ed2b
{'_id': ObjectId('5a02227b37b8552becf5ed2a'), 'name': 'John Smith', 'rollno': 34, 'address': '47 North Street Delhi'}
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name': 'Max Earl', 'rollno': 56, 'address': '23 Park Street Delhi'}

In this example, you can observe the following elements:

  • MongoClient(): This is used to establish a connection with MongoDB from a connection string that specifies the host and port. In this case, the default port number is used i.e. 27017.
  • mydatabase: Here, the database named “mydatabase” is selected. If there is no existing database by that name, MongoDB creates it when data is first written to it.
  • students: After switching to “mydatabase”, the “students” collection is selected. This is where you will perform the MongoDB Python Insertion.
  • student_rec1, student_rec2: These are 2 dictionaries containing data for 2 students that are inserted into the “students” collection.

Note: If a document does not include an “_id” field when it is inserted, a unique ObjectId is generated and assigned to it automatically.
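An ObjectId is a 12-byte value, and its first 4 bytes hold the creation time as big-endian seconds since the Unix epoch. As a rough illustration, the helper below decodes that timestamp from one of the hex strings shown in the output above using only the standard library. The function name objectid_created_at is hypothetical; with pymongo installed, the generation_time attribute of bson's ObjectId gives you the same information.

```python
from datetime import datetime, timezone


def objectid_created_at(hex_id: str) -> datetime:
    """Return the creation time encoded in a 24-character ObjectId hex string."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    # The leading 4 bytes are a big-endian count of seconds since the Unix epoch.
    seconds = int.from_bytes(raw[:4], byteorder="big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_created_at("5a02227b37b8552becf5ed2a")
print(created.isoformat())
```

Because the timestamp comes first, automatically generated ObjectIds sort roughly in creation order.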

2) MongoDB Python Insertion: Return the _id field

After you have performed the MongoDB Python Insertion, you can retrieve the _id of the newly inserted document. You can achieve this by using the inserted_id property of the InsertOneResult object returned by the insert_one() method. For example,

newdict = { "name": "James", "address": "Beach Street 78" }

x = mycollection.insert_one(newdict)

print(x.inserted_id)

Output:

5b1910482ddb101b7042fcd7

3) MongoDB Python Insertion: Insert Multiple Documents

You can perform MongoDB Python Insertion for multiple documents in one go by using the insert_many() method. You can go through the following example to understand the insert_many() method:

from pymongo import MongoClient
newclient = MongoClient("mongodb://localhost:27017/")
db = newclient["newdatabase"]
mycollection = db["customers"]

newlist = [
  { "name": "Suraj", "address": "Park st 923"},
  { "name": "Hanz", "address": "Valley 42"},
  { "name": "Jason", "address": "Beach 854"}
  ]

rec = mycollection.insert_many(newlist)

#print list of the _id values of the inserted documents:
print(rec.inserted_ids)

Output: 

[ObjectId('5b19112f2ddb101964065487'), ObjectId('5b19112f2ddb101964065488'), ObjectId('5b19112f2ddb101964065489')]

In the above example, you can note the following elements:

  • newlist: This is a list containing multiple dictionaries.
  • rec.inserted_ids: Here, “rec” is an InsertManyResult object. Its inserted_ids property returns the list of _id values of the newly inserted documents, in the order the documents were given.

4) MongoDB Python Insertion: Insert Documents with Specified IDs

MongoDB allows you to specify the _id value yourself while inserting the documents. Note that the IDs you provide must be unique within the collection; inserting a document whose _id already exists raises an error. Check out the example below to understand how to enter the IDs:

from pymongo import MongoClient

newclient = MongoClient("mongodb://localhost:27017/")
db = newclient["newdatabase"]
mycollection = db["customers"]

newlist = [
  { "_id": 1, "name": "Sammy", "address": "New st 457"},
  { "_id": 2, "name": "Emma", "address": "Mountain 98"},
  { "_id": 3, "name": "Ron", "address": "River 49"}
  ]

rec = mycollection.insert_many(newlist)

#print list of the _id values of the inserted documents:
print(rec.inserted_ids)

Output: 

[1, 2, 3]
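Since duplicate _id values cause the insert to fail (pymongo raises BulkWriteError from insert_many() in that case), it can help to check a batch before sending it. The following is a minimal sketch: find_duplicate_ids is a hypothetical helper, it assumes the _id values are hashable (such as integers or strings), and it only detects duplicates within the list itself, not clashes with documents already stored in the collection.

```python
from collections import Counter


def find_duplicate_ids(documents):
    """Return the _id values that occur more than once, in first-seen order."""
    # Documents without an explicit _id are skipped; they get one automatically.
    counts = Counter(doc["_id"] for doc in documents if "_id" in doc)
    return [value for value, n in counts.items() if n > 1]


candidates = [
    {"_id": 1, "name": "Sammy", "address": "New st 457"},
    {"_id": 2, "name": "Emma", "address": "Mountain 98"},
    {"_id": 1, "name": "Ron", "address": "River 49"},
]
duplicates = find_duplicate_ids(candidates)
print(duplicates)
```

If the returned list is empty, the batch is at least internally consistent and can be passed to insert_many().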

Conclusion

In this article, you have learned how to perform MongoDB Python Insertion using the pymongo library. You can easily insert single or multiple documents using the insert_one() or insert_many() method, respectively. The inserted_id and inserted_ids properties allow you to retrieve the _id values of the newly added documents. If you don’t supply an _id while inserting a document, a unique one is assigned automatically.

After you have executed MongoDB Python Insertion, you can now start analyzing your data. To get a complete picture of your business performance and financial health, you need to consolidate data from MongoDB and all the other applications used across your business. To achieve this you need to assign a portion of your Engineering Bandwidth to Integrate Data from all sources, Clean & Transform it, and finally, Load it to a Cloud Data Warehouse or a destination of your choice for further Business Analytics. All of these challenges can be comfortably solved by a Cloud-Based ETL tool such as Hevo Data.  

Hevo Data, a No-code Data Pipeline can seamlessly transfer data from a vast sea of 100+ sources such as MongoDB, MongoDB Atlas & Python to a Data Warehouse or a Destination of your choice to be visualized in a BI Tool. It is a reliable, completely automated, and secure service that doesn’t require you to write any code!  

If you are using MongoDB as your NoSQL Database Management System and searching for a no-fuss alternative to Manual Data Integration, then Hevo can effortlessly automate this for you. Hevo, with its strong integration with 100+ sources & BI tools(Including 40+ Free Sources), allows you to not only export & load data but also transform & enrich your data & make it analysis-ready in a jiffy.

Want to take Hevo for a ride? Sign Up for a 14-day free trial and simplify your Data Integration process. Do check out the pricing details to understand which plan fulfills all your business needs.

Tell us about your experience of performing MongoDB Python Insertion! Share your thoughts with us in the comments section below.

Sanchit Agarwal
Former Research Analyst, Hevo Data

Sanchit Agarwal is a data analyst at heart with a passion for data, software architecture, and writing technical content. He has experience writing more than 200 articles on data integration and infrastructure. He finds joy in breaking down complex concepts in simple and easy language, especially related to database migration techniques and challenges in data replication.