Beyond traditional Relational Databases, NoSQL databases such as MongoDB offer a more flexible and scalable way to store and process rapidly growing data. MongoDB is an Open-Source, Document-Oriented Database Management System that allows you to store Unstructured Datasets.
One of the performance advantages of MongoDB is efficient data insertion. You can perform the MongoDB Python Insertion using the insert_one() and insert_many() methods. Instead of inserting data one document at a time, insert_many() sends a whole list of documents in a single call, which cuts down on round trips to the server.
In this article, you will learn how to easily perform MongoDB Python Insertion.
What is MongoDB?
MongoDB is a NoSQL Open-Source Document-Oriented Database developed for storing and processing high volumes of data. Compared to conventional Relational Databases, MongoDB uses collections and documents instead of tables consisting of rows and columns. A collection consists of several documents, and each document stores the basic units of data as key-value pairs.
Introduced in February 2009, the MongoDB database is designed, maintained, and managed by MongoDB Inc. under the SSPL (Server Side Public License). Organizations such as Facebook, Nokia, eBay, Adobe, Google, etc. prefer it for efficiently handling and storing their exponentially growing data. It offers drivers for programming languages such as C, C++, C#, Go, Java, Node.js, Perl, PHP, Python, Ruby, Scala, and Swift, along with libraries such as Motor (an asynchronous Python driver) and Mongoid (an Object-Document Mapper for Ruby).
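To make the collection and document model concrete, here is a small sketch in plain Python. The field names and values are made up for illustration; the point is only that a document is a set of key-value pairs and that documents in one collection need not share the same fields.

```python
import json

# A hypothetical "students" collection, modeled here as a plain list of documents.
# Each document is a dictionary of key-value pairs.
students = [
    {"name": "John Smith", "rollno": 34, "address": "47 North Street Delhi"},
    {
        "name": "Max Earl",
        "rollno": 56,
        # Nested values are allowed, unlike in a flat relational row.
        "contact": {"email": "max@example.com", "phones": ["555-0100"]},
    },
]

for doc in students:
    print(json.dumps(doc, indent=2))

# The two documents carry different sets of fields.
field_sets = [sorted(doc) for doc in students]
print(field_sets)
```

With pymongo, dictionaries shaped like these are what you pass to the insert methods covered later in this article.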
Key Features of MongoDB
With constant efforts from the online community, MongoDB has evolved over the years. Some of its eye-catching features are:
- High Data Availability & Stability: MongoDB’s Replication feature provides multiple servers for disaster recovery and backup. Since several servers store the same data or shards of data, MongoDB provides greater Data Availability & Stability. This ensures all-time data access and security in case of server crashes, service interruptions, or even good old hardware failure.
- Accelerated Analytics: You may need to consider thousands to millions of variables while running Ad-hoc queries. MongoDB indexes BSON documents and utilizes the MongoDB Query Language (MQL) that allows you to update Ad-hoc queries in real-time. MongoDB provides complete support for field queries, range queries, and regular expression searches along with user-defined functions.
- Indexing: With a wide range of indices and features with language-specific sort orders that support complex access patterns to datasets, MongoDB provides optimal performance for every query. For the real-time ever-evolving query patterns and application requirements, MongoDB also provisions On-Demand Indices Creation.
- Horizontal Scalability: With the help of Sharding, MongoDB provides Horizontal Scalability by distributing data across multiple servers using the Shard Key. Each shard in a MongoDB Cluster stores a part of the data and acts as a separate database. Together, the shards allow efficient handling of growing volumes of data with zero downtime. Queries reach the cluster through mongos, the query router, which directs each query to the correct shard based on the Shard Key.
- Load Balancing: Real-time Replication and Sharding contribute towards large-scale Load Balancing. Ensuring top-notch Concurrency Controls and Locking Protocols, MongoDB can effectively handle multiple concurrent read and write requests for the same data.
- Aggregation: Similar to the SQL Group By clause, MongoDB can easily batch process data and present a single result even after executing several other operations on the group data. MongoDB’s Aggregation framework consists of 3 types of aggregations i.e. Aggregation Pipeline, Map-Reduce Function, and Single-Purpose Aggregation methods.
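As a rough illustration of the Aggregation Pipeline mentioned above, the sketch below defines a pipeline as a list of stage documents and then reproduces what those stages compute in plain Python. The order documents and the group_totals helper are hypothetical; the helper only mirrors the result of the grouping and is not how MongoDB executes it.

```python
from collections import defaultdict

# Hypothetical order documents, as they might be stored in a collection.
orders = [
    {"customer": "Suraj", "amount": 120},
    {"customer": "Hanz", "amount": 80},
    {"customer": "Suraj", "amount": 30},
]

# An aggregation pipeline is a list of stage documents. This one groups
# orders by customer and sums the amounts (comparable to SQL GROUP BY).
# With pymongo and a live server it would be passed to collection.aggregate().
pipeline = [
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
    {"$sort": {"_id": 1}},
]


def group_totals(docs):
    """Pure-Python illustration of what the $group and $sort stages compute."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc["customer"]] += doc["amount"]
    return [{"_id": name, "total": totals[name]} for name in sorted(totals)]


result = group_totals(orders)
print(result)
```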
What is Python?
Python is a widely used interpreted, Object-Oriented, General-Purpose programming language. Launched in 1991, Python has now evolved into an advanced application development language offering features such as high-level data structures, dynamic typing, dynamic binding, etc. It is a powerful tool that is used for developing websites and software, Task Automation, Data Analysis, and Data Visualization. Popular Python libraries for data science include TensorFlow, NumPy, Pandas, and Matplotlib; these are third-party packages rather than part of the standard library.
Owing to its beginner-friendliness, it is also used by non-coders such as accountants and scientists, for a broad range of everyday tasks, like organizing finances, numerical computations, scientific graphs, etc. Python was specially developed for better readability with influences from English and Mathematics. It is completely free as it comes under the GPL-compatible license certified by the Open Source Initiative.
Key Features of Python
Since its inception, Python has become a popular choice for several tasks due to the following eye-catching features:
- Interpreted Language: Instead of compiling the whole program into machine instructions ahead of time, Python code is executed by an interpreter. In interactive mode, for example in IDLE (Python's Integrated Development and Learning Environment), the interpreter executes one statement at a time and displays its output immediately.
- Dynamically Typed: Unlike the statically-typed languages like Java, Python doesn’t require you to declare the data type of variable in advance. The Interpreter automatically decides the data type at runtime.
- Graphical User Interface (GUI) Support: You can easily create GUIs using Python. Modules such as Tkinter, PyQt, wxPython, or PySide can be used to achieve this. You also get to enjoy a huge number of GUI frameworks and various other cross-platform solutions.
- Object Oriented Programming Language: Providing you a platform to solve real-world problems using the Object-Oriented Approach, Python allows you to implement the concepts of Encapsulation, Inheritance, Polymorphism, etc.
- Flexible: You can extend Python with modules written in C or C++, and you can embed the Python interpreter in C/C++ applications. Python also runs on Windows, macOS, and Linux, so code written on Windows generally needs no changes to run on other platforms.
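The dynamic typing described in the list above can be seen in a few lines. This is only a minimal sketch; the variable name and values are arbitrary.

```python
# The same name can be bound to values of different types; no declaration
# is needed, and the type is determined at runtime.
value = 34
first_type = type(value).__name__

value = "thirty-four"
second_type = type(value).__name__

value = {"rollno": 34}
third_type = type(value).__name__

print(first_type, second_type, third_type)  # prints: int str dict
```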
How to perform MongoDB Python Insertion?
You can use the Python pymongo driver to easily perform the MongoDB Python Insertion task. To completely understand the process of MongoDB Python Insertion, you can go through the following aspects:
1) MongoDB Python Insertion: Insert One Document
To insert a single document or record, you can use the insert_one() method. This method takes a dictionary as a parameter that contains the names and values of each field in the document you want to insert. To perform the MongoDB Python insertion using insert_one(), you can check out the following example:
from pymongo import MongoClient

try:
    myclient = MongoClient("mongodb://localhost:27017/")
    print("Connected successfully!!!")
except:
    print("Could not connect to MongoDB")

# Select the database and the collection
db = myclient["mydatabase"]
mycollection = db["students"]

# Documents to insert
student_rec1 = {
    "name": "John Smith",
    "rollno": 34,
    "address": "47 North Street Delhi"
}
student_rec2 = {
    "name": "Max Earl",
    "rollno": 56,
    "address": "23 Park Street Delhi"
}

# Insert the documents one at a time
student_id1 = mycollection.insert_one(student_rec1)
student_id2 = mycollection.insert_one(student_rec2)
print("Data inserted with student ids", student_id1, " ", student_id2)

# Read back and print every document in the collection
cursor = mycollection.find()
for record in cursor:
    print(record)
- Import MongoClient:
  - from pymongo import MongoClient: Imports the MongoClient class from the pymongo library to connect to MongoDB.
- Establish Connection:
  - myclient = MongoClient("mongodb://localhost:27017/"): Creates a client for a MongoDB server running locally on the default port (27017). Recent pymongo versions connect lazily, so a server that is down is typically reported on the first operation rather than on this line.
  - print("Connected successfully!!!"): Prints a success message if the client is created without errors.
  - except: Catches any exception raised while creating the client and prints a failure message.
- Select Database:
  - db = myclient["mydatabase"]: Selects (or creates) a database named mydatabase.
- Select Collection:
  - mycollection = db["students"]: Selects (or creates) a collection named students within the database.
- Prepare Student Records:
  - student_rec1 and student_rec2: Dictionaries representing student records with fields for name, roll number, and address.
- Insert Student Records:
  - student_id1 = mycollection.insert_one(student_rec1): Inserts the first student record into the collection and stores the returned result object, whose inserted_id attribute holds the new document's _id.
  - student_id2 = mycollection.insert_one(student_rec2): Inserts the second student record into the collection and stores the returned result object in the same way.
  - print("Data inserted with student ids", student_id1, " ", student_id2): Prints the two insert results. To print only the IDs, use student_id1.inserted_id and student_id2.inserted_id.
- Retrieve and Print Records:
  - cursor = mycollection.find(): Retrieves all records from the students collection.
  - for record in cursor: Iterates over each record in the cursor.
  - print(record): Prints each student record.
Output:
Connected successfully!!!
Data inserted with student ids
{'_id': ObjectId('5a02227b37b8552becf5ed2a')}
{'_id': ObjectId('5a02227c37b8552becf5ed2b')}
{'_id': ObjectId('5a02227b37b8552becf5ed2a'), 'name': 'John Smith', 'rollno': 34, 'address': '47 North Street Delhi'}
{'_id': ObjectId('5a02227c37b8552becf5ed2b'), 'name': 'Max Earl', 'rollno': 56, 'address': '23 Park Street Delhi'}
In this example, you can observe the following elements:
- MongoClient(): This is used to establish a connection with MongoDB by passing a connection string that contains the host and the port number. In this case, the host is localhost and the default port number is used, i.e. 27017.
- mydatabase: Here, the database named "mydatabase" is selected. If there is no existing database by that name, MongoDB creates it when the first document is written to it.
- students: After switching to "mydatabase", the "students" collection is selected where you will perform the MongoDB Python Insertion.
- student_rec1, student_rec2: These are 2 dictionaries containing data for 2 students that are inserted into the "students" collection.
Note: If you do not specify an _id field while inserting a record, MongoDB automatically assigns it a unique identifier of type ObjectId.
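Building on the example above, the following sketch wraps insert_one() in a small helper that checks a record before inserting it. The insert_student helper, the REQUIRED_FIELDS tuple, and the InMemoryCollection class are all hypothetical names introduced here. The stand-in class exists only so the sketch runs without a MongoDB server, and it hands out integer IDs rather than ObjectId values.

```python
import itertools


class InMemoryCollection:
    """Hypothetical stand-in for a pymongo collection, so the sketch runs
    without a server. It mimics only the part of insert_one() used here."""

    class _Result:
        def __init__(self, inserted_id):
            self.inserted_id = inserted_id

    def __init__(self):
        self.docs = []
        self._ids = itertools.count(1)

    def insert_one(self, doc):
        # Like pymongo, add an _id to the document when none was provided.
        doc.setdefault("_id", next(self._ids))
        self.docs.append(doc)
        return self._Result(doc["_id"])


REQUIRED_FIELDS = ("name", "rollno", "address")


def insert_student(collection, record):
    """Validate a student record, insert it, and return its _id."""
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return collection.insert_one(record).inserted_id


students = InMemoryCollection()
first_id = insert_student(
    students,
    {"name": "John Smith", "rollno": 34, "address": "47 North Street Delhi"},
)
print("Inserted with _id", first_id)
```

Against a real server, you would pass the pymongo collection (for example, mycollection from the example above) instead of the stand-in; the helper itself only relies on insert_one() and inserted_id.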
2) MongoDB Python Insertion: Return the _id field
After you have performed the MongoDB Python Insertion, you can retrieve the _id of the newly inserted document. You can achieve this by using the inserted_id property of the InsertOneResult object returned by the insert_one() method. For example,
newdict = { "name": "James", "address": "Beach Street 78" }
x = mycollection.insert_one(newdict)
print(x.inserted_id)
- Create a New Dictionary:
  - newdict = { "name": "James", "address": "Beach Street 78" }: Defines a new dictionary with a student's name and address.
- Insert the Dictionary into MongoDB:
  - x = mycollection.insert_one(newdict): Inserts newdict into mycollection (which is the students collection).
- Print the Inserted ID:
  - print(x.inserted_id): Outputs the unique ID assigned to the newly inserted document in MongoDB.
Output:
5b1910482ddb101b7042fcd7
3) MongoDB Python Insertion: Insert Multiple Documents
You can perform MongoDB Python Insertion for multiple documents in one go by using the insert_many() method. You can go through the following example to understand the insert_many() method:
from pymongo import MongoClient

newclient = MongoClient("mongodb://localhost:27017/")
db = newclient["newdatabase"]
mycollection = db["customers"]

newlist = [
    { "name": "Suraj", "address": "Park st 923"},
    { "name": "Hanz", "address": "Valley 42"},
    { "name": "Jason", "address": "Beach 854"}
]

rec = mycollection.insert_many(newlist)

# print list of the _id values of the inserted documents:
print(rec.inserted_ids)
Output:
[ObjectId('5b19112f2ddb101964065487'), ObjectId('5b19112f2ddb101964065488'), ObjectId('5b19112f2ddb101964065489')]
In the above example, you can note the following elements:
- newlist: This is a list containing multiple dictionaries, one per document.
- rec.inserted_ids: Here, "rec" is an InsertManyResult object whose inserted_ids property returns the _id values of the newly inserted documents, in the order in which the documents were passed.
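When the source list is very large, one option is to send it in fixed-size batches, one insert_many() call per batch. The sketch below shows that idea; the chunked and insert_in_batches helpers and the default batch size of 1000 are assumptions made for illustration, not part of pymongo.

```python
def chunked(docs, batch_size):
    """Yield successive slices of docs with at most batch_size items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


def insert_in_batches(collection, docs, batch_size=1000):
    """Call collection.insert_many() once per batch and collect the new ids."""
    inserted = []
    for batch in chunked(docs, batch_size):
        # ordered=False lets the server attempt every document in the batch
        # even if one fails; drop it if stop-on-first-error matters to you.
        result = collection.insert_many(batch, ordered=False)
        inserted.extend(result.inserted_ids)
    return inserted


# The batching logic itself can be checked without a server.
customers = [{"name": f"customer-{i}"} for i in range(5)]
batch_sizes = [len(batch) for batch in chunked(customers, 2)]
print(batch_sizes)  # prints: [2, 2, 1]
```

With a live server, insert_in_batches(mycollection, newlist, batch_size=2) would issue two insert_many() calls for the three-document list above.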
4) MongoDB Python Insertion: Insert Documents with Specified IDs
MongoDB allows you to specify the _id value yourself while inserting documents. Note that the IDs you provide must be unique within the collection. Check out the example below to understand how to enter the IDs:
from pymongo import MongoClient

newclient = MongoClient("mongodb://localhost:27017/")
db = newclient["newdatabase"]
mycollection = db["customers"]

newlist = [
    { "_id": 1, "name": "Sammy", "address": "New st 457"},
    { "_id": 2, "name": "Emma", "address": "Mountain 98"},
    { "_id": 3, "name": "Ron", "address": "River 49"}
]

rec = mycollection.insert_many(newlist)

# print list of the _id values of the inserted documents:
print(rec.inserted_ids)
- Import MongoClient:
  - from pymongo import MongoClient: Imports the MongoClient class to connect to MongoDB.
- Connect to MongoDB:
  - newclient = MongoClient("mongodb://localhost:27017/"): Creates a client for a MongoDB server running locally.
- Select Database and Collection:
  - db = newclient["newdatabase"]: Selects (or creates) a database named newdatabase.
  - mycollection = db["customers"]: Selects (or creates) a collection named customers.
- Prepare Data for Insertion:
  - newlist: A list of dictionaries representing customer records, each with a unique _id, name, and address.
- Insert Multiple Documents:
  - rec = mycollection.insert_many(newlist): Inserts all documents in newlist into the customers collection.
- Print Inserted IDs:
  - print(rec.inserted_ids): Outputs the list of _id values of the newly inserted documents, which here are the values you supplied.
Output:
[1, 2, 3]
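Because the server rejects a duplicate _id (pymongo raises DuplicateKeyError for insert_one() and BulkWriteError for insert_many()), it can help to check a list for clashes before sending it. The find_duplicate_ids helper below is a hypothetical sketch; it only catches duplicates inside the list itself, not clashes with documents already stored in the collection.

```python
from collections import Counter


def find_duplicate_ids(docs):
    """Return the _id values that appear more than once in docs.

    Assumes the _id values are hashable and mutually sortable
    (for example, all integers or all strings).
    """
    counts = Counter(doc["_id"] for doc in docs if "_id" in doc)
    return sorted(_id for _id, n in counts.items() if n > 1)


newlist = [
    {"_id": 1, "name": "Sammy", "address": "New st 457"},
    {"_id": 2, "name": "Emma", "address": "Mountain 98"},
    {"_id": 2, "name": "Ron", "address": "River 49"},  # deliberate clash
]

duplicates = find_duplicate_ids(newlist)
print(duplicates)  # prints: [2]
```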
Conclusion
In this article, you have learned how to perform MongoDB Python Insertion using the pymongo library. You can easily insert single or multiple documents using the insert_one() or insert_many() method respectively. The inserted_id and inserted_ids properties allow you to retrieve the _id values of the newly added data. MongoDB will always assign a unique ID to a new record if you don't provide one while inserting the data.
After you have executed MongoDB Python Insertion, you can now start analyzing your data. To get a complete picture of your business performance and financial health, you need to consolidate data from MongoDB and all the other applications used across your business. To achieve this you need to assign a portion of your Engineering Bandwidth to Integrate Data from all sources, Clean & Transform it, and finally, Load it to a Cloud Data Warehouse or a destination of your choice for further Business Analytics. All of these challenges can be comfortably solved by a Cloud-Based ETL tool such as Hevo Data.
Hevo Data, a No-code Data Pipeline can seamlessly transfer data from a vast sea of 100+ sources such as MongoDB, MongoDB Atlas & Python to a Data Warehouse or a Destination of your choice to be visualized in a BI Tool. It is a reliable, completely automated, and secure service that doesn’t require you to write any code!
Frequently Asked Questions
1. How to insert data from Python to MongoDB?
You can insert data from Python into MongoDB by using the pymongo library.
2. How to insert a list in MongoDB using Python?
You can use insert_many() to insert a list of documents.
3. How to insert a date in MongoDB using Python?
To insert a date in MongoDB using Python, you can use the datetime module.
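As a brief sketch of the date case, the code below builds a document with a datetime value. pymongo stores datetime.datetime objects as BSON dates, so such a document can be passed to insert_one() directly. The make_event helper and the field names are hypothetical.

```python
from datetime import datetime, timezone


def make_event(name, when=None):
    """Build a document whose 'created_at' field is a timezone-aware datetime."""
    if when is None:
        when = datetime.now(timezone.utc)
    return {"name": name, "created_at": when}


event = make_event("signup", datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc))
print(event["created_at"].isoformat())  # prints: 2024-01-15T09:30:00+00:00
```

With a live server, mycollection.insert_one(event) would store created_at as a date that you can later use in range queries and sorting.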
Sanchit Agarwal is an Engineer turned Data Analyst with a passion for data, software architecture and AI. He leverages his diverse technical background and 2+ years of experience to write content. He has penned over 200 articles on data integration and infrastructures, driven by a desire to empower data practitioners with practical solutions for their everyday challenges.