Most enterprises collect vast volumes of data over time. This data usually contains important information about the business, its customers, and more. Storing it in a stable database is advisable to ensure data security and integrity. MongoDB provides a simple, document-oriented environment for storing and analyzing that data.

Given the vast amount of data stored in MongoDB’s document-like records, it can be challenging to pull out just the portions you need. In such situations, search engines like Elasticsearch come in handy.

With Elasticsearch’s fast indexing and analytical capabilities, you can pull filtered information out of your stored data quickly. All you need to do is connect Elasticsearch to MongoDB.

The Elasticsearch MongoDB NodeJs Integration is handy for replicating data from the database to Elasticsearch for indexing and filtering.

There are many ways to connect Elasticsearch and MongoDB. This article will walk you through the Elasticsearch MongoDB NodeJs Integration. 

Prerequisites

  • Basic knowledge of databases.

What is Elasticsearch?


Elasticsearch is an open-source search and analytics engine for various data types, including numerical, textual, geospatial, structured, and unstructured.

It is built on Apache Lucene and was first released in 2010. Elasticsearch is a central component of the Elastic Stack, also known as the ELK Stack (Elasticsearch, Logstash, and Kibana), a set of tools for data ingestion, enrichment, analysis, and visualization.

To use Elasticsearch, you ingest raw data into the platform, where it is parsed, normalized, and enriched before being indexed.

Once your data is indexed in Elasticsearch, you can run complex queries against it and use aggregations to retrieve summaries of it.

It is known for its simple REST APIs, scalability, speed, and distributed nature. Elasticsearch indexes several data types for many use cases like application search, logging and log analytics, business and security analytics, infrastructure metrics, container monitoring, etc.
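
As a rough illustration of what searching and summarizing indexed data can look like, the sketch below builds the JSON body of a search request: a match query for full-text search plus a terms aggregation that counts matching documents per group. The index name products and the fields description and category.keyword are hypothetical, and nothing is sent to a cluster.

```javascript
// Sketch only: builds the JSON body for an Elasticsearch search request.
// The field names passed in below are hypothetical examples.
function buildSearchBody(text, textField, groupField) {
  return {
    size: 10, // return at most 10 matching documents
    query: {
      match: { [textField]: text }, // full-text match on a single field
    },
    aggs: {
      // summary: number of matching documents per distinct field value
      per_group: { terms: { field: groupField } },
    },
  };
}

const body = buildSearchBody('wireless keyboard', 'description', 'category.keyword');

// This body could be sent as POST /products/_search.
console.log(JSON.stringify(body, null, 2));
```

Keeping the body as a plain object makes it easy to send through any HTTP client or through an Elasticsearch client library.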

Understanding MongoDB with Node.js


MongoDB is a simple document model NoSQL database that stores data in flexible, JSON-like documents. The fields can vary from document to document, and users can change the data structure over time.

It is well known for its load balancing and horizontal scaling features, which give application developers flexibility and scalability.

MongoDB has a long history of working with Node.js, an open-source server environment built on Chrome’s V8 JavaScript engine. The combination works well because MongoDB does not require a rigid schema.

Because MongoDB stores data as documents, web applications built with Node.js can conveniently store and access varied kinds of data.

The official MongoDB Node.js driver enables Node.js applications to integrate with MongoDB. The driver provides an asynchronous JavaScript API and implements the network protocols needed to read from and write to MongoDB databases.
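
A minimal sketch of the driver’s asynchronous API is shown below. It assumes the mongodb package has been installed with npm and that a connection string is supplied in a MONGODB_URI environment variable; the database name shop, the collection name carts, and the sample documents are hypothetical. Without that variable the script defines the functions and does nothing else.

```javascript
// Sketch only. Assumes `npm install mongodb` and a reachable MongoDB server
// whose connection string is supplied in the MONGODB_URI environment variable.
// The database "shop" and the collection "carts" are hypothetical names.

// Read up to `limit` documents matching `filter` from a driver collection.
async function findDocuments(collection, filter, limit) {
  return collection.find(filter).limit(limit).toArray();
}

async function main(uri) {
  const { MongoClient } = require('mongodb'); // loaded only when actually used
  const client = new MongoClient(uri);
  try {
    await client.connect();
    const carts = client.db('shop').collection('carts');
    // Documents in the same collection may carry different fields.
    await carts.insertMany([
      { customer: 'a@example.com', items: 2 },
      { customer: 'b@example.com', items: 1, coupon: 'SPRING' },
    ]);
    const docs = await findDocuments(carts, { items: { $gte: 1 } }, 10);
    console.log(docs);
  } finally {
    await client.close();
  }
}

// Runs only when a connection string is provided.
if (process.env.MONGODB_URI) {
  main(process.env.MONGODB_URI).catch(console.error);
}
```

Every driver call returns a promise, so async/await keeps the read and write steps in order without nested callbacks.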

In the next section, we will deep dive into the steps required for Elasticsearch MongoDB NodeJs Integration.

Scale your data integration effortlessly with Hevo’s Fault-Tolerant No Code Data Pipeline

As the ability of businesses to collect data explodes, data teams have a crucial role in fueling data-driven decisions. Yet, they struggle to consolidate the scattered data in their warehouse to build a single source of truth. Broken pipelines, data quality issues, bugs and errors, and lack of control and visibility over the data flow make data integration a nightmare.

1000+ data teams rely on Hevo’s Data Pipeline Platform to integrate data from 150+ sources in a matter of minutes. Billions of data events from sources as varied as SaaS apps, Databases, File Storage, and Streaming sources can be replicated in near real-time with Hevo’s fault-tolerant architecture. What’s more – Hevo puts complete control in the hands of data teams with intuitive dashboards for pipeline monitoring, auto-schema management, and custom ingestion/loading schedules.

This, combined with transparent pricing and 24×7 support, makes us the most special data pipeline software on review sites.

Take our 14-day free trial to experience a better way to manage data pipelines.


Steps for Elasticsearch MongoDB NodeJS Integration

The Elasticsearch MongoDB NodeJs Integration is beneficial for replicating data from the database to Elasticsearch for indexing and filtering.

Let’s look at the steps for setting up the Elasticsearch MongoDB NodeJs connection. Data is synchronized in real time by a Node-based MongoDB Elasticsearch connector.
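
To give a feel for what replicating documents into Elasticsearch involves, here is a sketch of one general technique: splitting documents into batches and building request bodies for Elasticsearch’s Bulk API. It is not taken from the connector’s source code; the index name mycarts and the sample documents are hypothetical.

```javascript
// Sketch only: batch documents and build Elasticsearch Bulk API bodies.
// The index name and the sample documents are hypothetical.

// Split an array into consecutive chunks of at most `size` items.
function toBatches(docs, size) {
  const batches = [];
  for (let i = 0; i < docs.length; i += size) {
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}

// Build the newline-delimited JSON body expected by POST /_bulk:
// one action line followed by one source line per document.
function buildBulkBody(indexName, docs) {
  const lines = [];
  for (const { _id, ...source } of docs) {
    // The MongoDB _id becomes the Elasticsearch document ID and is
    // left out of the document source.
    lines.push(JSON.stringify({ index: { _index: indexName, _id: String(_id) } }));
    lines.push(JSON.stringify(source));
  }
  return lines.join('\n') + '\n'; // the body must end with a newline
}

const sampleDocs = [
  { _id: 1, name: 'alpha' },
  { _id: 2, name: 'beta' },
  { _id: 3, name: 'gamma' },
];
const bodies = toBatches(sampleDocs, 2).map((batch) => buildBulkBody('mycarts', batch));
console.log(bodies.length); // 2
console.log(bodies[1]);
```

Sending documents in batches keeps individual requests small, and pausing between batches is a common way to avoid overloading the cluster.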

Figure: Elasticsearch MongoDB NodeJs Integration using node-mongodb-es-connector

Step 1: Installing and Setting up MongoDB

  • Download the MongoDB MSI installer package from the MongoDB download page.
  • Install MongoDB with the installation wizard.
Figure: MongoDB installation and setup
  • Accept the license agreement, then click Next.
  • Select the Complete setup > Run Service as Network Service user.
  • Click Install.
Figure: Choosing the setup type in the MongoDB installer
  • Create a folder named data on the C: drive, then create a folder named db inside it (C:\data\db).
  • To set up alias shortcuts for mongo and mongod, open your Hyper terminal running Git Bash.
  • Change to your home directory with the cd ~ command.
  • Create a file called .bash_profile with the touch .bash_profile command.
  • Open the file with the vim .bash_profile command. Press the I key to enter insert mode, then add alias lines for mongo and mongod that point to the executables in your MongoDB installation’s bin folder.
  • Press Esc to leave insert mode, then type :wq! to save and exit. Relaunch Hyper and run mongo --version. You should see something like this.
Figure: Verifying the setup
  • You have successfully installed and set up MongoDB.

Step 2: Installing Elasticsearch

  • Download the Windows .zip archive for Elasticsearch 8.3.2 from the Elastic downloads page.
  • Unzip it to create a folder called elasticsearch-8.3.2.
  • Change to that folder with cd C:\elasticsearch-8.3.2.
  • If you have disabled automatic index creation, configure action.auto_create_index in elasticsearch.yml so that system indices can still be created automatically.
  • Run Elasticsearch from the command line using .\bin\elasticsearch.bat.
  • To override settings at startup, pass them on the command line with the -E flag: .\bin\elasticsearch.bat -Ecluster.name=my_cluster -Enode.name=node_1.
  • Check that Elasticsearch is running by sending an HTTPS request to port 9200 on localhost via 
  • Enter the password for the elastic user that was generated during installation. A response from the node confirms that Elasticsearch is running.
  • You can also install and run Elasticsearch as a service. See the Elasticsearch documentation for details.
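
The same check can be sketched in Node.js with the built-in https module. This is only an illustration under assumptions: a local Elasticsearch 8.x node with security enabled, and the elastic user’s password supplied in an environment variable named ELASTIC_PASSWORD (a name chosen for this sketch). Without that variable the script only defines the functions.

```javascript
// Sketch only. Assumes a local Elasticsearch 8.x node with security enabled.
// The password is read from the ELASTIC_PASSWORD environment variable,
// a name chosen for this sketch.
const https = require('https');

// Build request options for GET https://localhost:9200/ with basic auth.
function buildHealthRequest(password) {
  return {
    host: 'localhost',
    port: 9200,
    path: '/',
    method: 'GET',
    auth: `elastic:${password}`,
    // For a real check, also pass the cluster's CA certificate through the
    // `ca` option so the node's self-signed certificate can be verified.
  };
}

function checkElasticsearch(password) {
  const req = https.request(buildHealthRequest(password), (res) => {
    let text = '';
    res.on('data', (chunk) => { text += chunk; });
    res.on('end', () => console.log(res.statusCode, text));
  });
  req.on('error', (err) => console.error('Request failed:', err.message));
  req.end();
}

// Runs only when a password is provided.
if (process.env.ELASTIC_PASSWORD) {
  checkElasticsearch(process.env.ELASTIC_PASSWORD);
}
```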

Step 3: Installing Node.js

  • Download the installer from the official Node.js website. The installer includes the npm package manager.
  • Install Node.js and NPM from the wizard.
Figure: Node.js installer
  • In the next step, make sure the npm package manager is selected for installation along with the Node.js runtime.
  • Click Install.
Figure: Clicking Install to begin the installation

Step 4: Downloading and Using GitHub node-mongodb-es-connector

  • Install the connector with the npm install es-mongodb-sync command.
  • To use this connector, you must create a JSON file named ElasticsearchIndexName.json or any name .json in the crawlerData folder.
  • You need to specify some required configurations:
    • m_database – MongoDB database.
    • m_collectionname – MongoDB collection.
    • m_filterfilds – MongoDB filter query for simple filtering (null default value).
    • m_returnfilds – the fields that MongoDB should return (null default value).
    • m_connection :
      1. m_servers – MongoDB servers (array).
      2. m_authentication – username, password, authsource, replicaset, and ssl (false default value).
    • m_documentsinbatch – an integer that specifies how many documents are sent to Elasticsearch per batch.
    • m_delaytime – milliseconds between batches (default value is 1000ms).
    • e_index – Elasticsearch index where documents are saved.
    • e_type – Elasticsearch type given to documents.
    • e_connection :
      1. e_server – URL of a running Elasticsearch cluster. 
  • To start, open the CMD command window and use the node app.js command.
  • To use an Elasticsearch ingest pipeline, install the Ingest Attachment Processor Plugin. It lets Elasticsearch extract the contents of files in common formats like PPT, XLS, and PDF using Apache Tika, a text extraction library. Use:
sudo bin/elasticsearch-plugin install ingest-attachment
  • Create a pipeline to Elasticsearch as follows:
PUT _ingest/pipeline/mypipeline
{
  "description" : "Extract attachment information from arrays",
  "processors" : [
    {
      "foreach": {
        "field": "attachments",
        "processor": {
          "attachment": {
            "target_field": "_ingest._value.attachment",
            "field": "_ingest._value.data"
          }
        }
      }
    }
  ]
}
  • Reference the pipeline in the connector’s JSON configuration file by adding "e_pipeline": "mypipeline".
  • Once the connector is running, the data appears in both MongoDB and Elasticsearch.
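
Putting the configuration keys from this step together, a configuration file could be generated along these lines. This is a sketch under assumptions: every value is a placeholder, and grouping the keys under top-level mongodb and elasticsearch objects is an assumption that should be checked against the connector’s README. The script prints the JSON instead of writing it to disk.

```javascript
// Sketch only: builds a connector configuration from the keys listed above.
// All values are placeholders, and grouping the keys under top-level
// "mongodb" and "elasticsearch" objects is an assumption; check the
// connector's README for the exact layout it expects.
const path = require('path');

function buildConnectorConfig() {
  return {
    mongodb: {
      m_database: 'myTest',
      m_collectionname: 'carts',
      m_filterfilds: null, // no filter query
      m_returnfilds: null, // no restriction on returned fields
      m_connection: {
        m_servers: ['localhost:27017'],
        m_authentication: {
          username: 'mongoUser',
          password: 'changeme',
          authsource: 'admin',
          replicaset: 'my_replica',
          ssl: false,
        },
      },
      m_documentsinbatch: 5000, // documents sent per batch
      m_delaytime: 1000, // milliseconds between batches
    },
    elasticsearch: {
      e_index: 'mycarts',
      e_type: 'carts',
      e_connection: { e_server: 'https://localhost:9200' },
      e_pipeline: 'mypipeline',
    },
  };
}

const config = buildConnectorConfig();
const targetFile = path.join('crawlerData', `${config.elasticsearch.e_index}.json`);
console.log(`Would write ${targetFile}:`);
console.log(JSON.stringify(config, null, 2));
```

Naming the file after the Elasticsearch index follows the ElasticsearchIndexName.json convention mentioned in this step.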

Kudos to you for completing the Elasticsearch MongoDB NodeJS connection.

Conclusion

In this article, you learned about Elasticsearch, a search and analytics engine, and MongoDB, a simple document-based database.

You also learned how to connect the two so that you can retrieve filtered sections of your stored data via Elasticsearch indexing.

Of the several ways to combine the two, this article focused on a Node.js-based connector that makes data stored in MongoDB databases available to Elasticsearch. In a nutshell, you have walked through the Elasticsearch MongoDB NodeJS connection steps.

Don’t forget to drop your comment on the Elasticsearch MongoDB NodeJS connection. Apart from MongoDB, you would use several applications and databases across your business for Marketing, Accounting, Sales, Customer Relationship Management, etc.

It is essential to consolidate data from all these sources to get a complete overview of your business performance. To achieve this, you need to assign a portion of your Engineering Bandwidth to Integrate Data from all sources, Clean & Transform it, and finally, Load it to a Cloud Data Warehouse or a destination of your choice for further Business Analytics.

All of these challenges can be comfortably solved by a Cloud-Based ETL tool such as Hevo Data.


Hevo Data, a No-code Data Pipeline, can seamlessly transfer data from a vast sea of 150+ sources such as MongoDB & MongoDB Atlas to a Data Warehouse or a Destination of your choice to be visualized in a BI Tool. It is a reliable, completely automated, and secure service that doesn’t require you to write any code!  

If you are using MongoDB as your NoSQL Database Management System and searching for a no-fuss alternative to Manual Data Integration, then Hevo can automate this. Hevo, with its strong integration with 150+ sources (including 50+ free sources), allows you to not only export & load data but also transform & enrich your data & make it analysis-ready in a jiffy.

Want to take Hevo for a ride? Sign Up for a 14-day free trial and simplify your Data Integration process. Check out the pricing details to understand which plan fulfills all your business needs.

Tell us about your experience of learning about the Elasticsearch MongoDB NodeJS Integration! Share your thoughts with us in the comments section below.

Freelance Technical Content Writer, Hevo Data

Disha is deeply passionate about data science, and possesses a knack for writing on data, software architecture, and technical content catered to data teams to solve intricate business challenges.
