In this blog, you will see how you can load data from Elasticsearch to PostgreSQL using 2 methods. The first method involves the use of Python Libraries to complete the data transfer process. The second method uses Hevo’s automated Data Pipeline to set up your Elasticsearch PostgreSQL connection. Read along to decide which method suits you the best!

Prerequisites

  • An Elasticsearch account.
  • A PostgreSQL account.
  • Working knowledge of Python Libraries.

Introduction to Elasticsearch

Elasticsearch is a document-oriented database used to store and manage structured, semi-structured, and unstructured data. Relational databases can be slow when retrieving data from huge data sets, whereas NoSQL databases like Elasticsearch are flexible enough to support more real-time workloads.

Related: Looking to replicate data from Elasticsearch to Databricks? Our blog on Elasticsearch to Databricks provides you with two simple and effective methods to achieve this seamless integration. If you’re new to Elasticsearch and want to learn how to ingest data effortlessly, check out our blog on how to ingest data to Elasticsearch.

Solve your data integration problems with Hevo’s reliable, no-code, automated pipelines with 150+ connectors.
Get your free trial right away!

Introduction to Elasticsearch Documents

Elasticsearch is a NoSQL database. This means instead of tables, data is stored in ‘documents’. NoSQL databases are non-relational, distributed, open-source, and scalable. A document is a JSON document that is similar to a row in a relational database. 

Here is an example of a document:

{
  "_id": 1,
  "status": "subscriber",
  "name": "John Smith",
  "country": "U.S.A"
}

{
  "_id": 2,
  "status": "subscriber",
  "name": "James Brown",
  "country": "U.K"
}
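To see the document/row analogy concretely, here is a minimal sketch in plain Python (standard json module only, using the sample field names from the example above) that flattens such documents into tuples resembling relational rows:

```python
import json

# Two Elasticsearch-style documents, as in the example above.
docs_json = """
[
  {"_id": 1, "status": "subscriber", "name": "John Smith", "country": "U.S.A"},
  {"_id": 2, "status": "subscriber", "name": "James Brown", "country": "U.K"}
]
"""

docs = json.loads(docs_json)

# Each document maps naturally to one row of a relational table.
rows = [(d["_id"], d["status"], d["name"], d["country"]) for d in docs]

for row in rows:
    print(row)
```

Each tuple here corresponds to a row you could insert into a PostgreSQL table with matching columns.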

Introduction to PostgreSQL


PostgreSQL is a powerful open-source Object-Relational Database. It not only implements the SQL language but also extends it with many features, allowing you to store and query very complex data workloads. It has a reputation for proven architecture, data integrity, reliability, and scalability, and it is backed by a rich open-source community.

Thanks to its robustness and feature-rich query layer, PostgreSQL is very popular in use cases that require a strict table structure. It has been under active development for more than 30 years, during which most of the major problems have been solved. This gives businesses the confidence to run it in production environments.

If you are looking to streamline your PostgreSQL workflow, do read our blog on PostgreSQL import CSV, saving you time and effort. And if you’re interested in optimizing performance and achieving high availability, don’t miss our guide on setting up PostgreSQL clusters.

Save 20 Hours of Frustration Every Week

Did you know that 75-90% of data sources you will ever need to build pipelines for are already available off-the-shelf with No-Code Data Pipeline Platforms like Hevo? 

Ambitious data engineers who want to stay relevant for the future automate repetitive ELT work and save more than 50% of the time that would otherwise be spent on maintaining pipelines. Instead, they use that time to focus on higher-value work like optimizing core data infrastructure, scripting non-SQL transformations for training algorithms, and more. 

Step off the hamster wheel and opt for an automated data pipeline like Hevo. With a no-code intuitive UI, Hevo lets you set up pipelines in minutes. Its fault-tolerant architecture ensures zero maintenance. Moreover, data replication happens in near real-time from 150+ sources to the destination of your choice including Snowflake, BigQuery, Redshift, Databricks, and Firebolt. 

Start saving those 20 hours with Hevo today.

Get started for Free with Hevo!

Methods to Connect Elasticsearch to PostgreSQL

The following 2 methods can easily set up your Elasticsearch to PostgreSQL integration:

Method 1: Using Python Libraries to Connect Elasticsearch to PostgreSQL

You can easily set up your Elasticsearch Postgres connection using Python Libraries with the following steps:

Step 1: Import Python Libraries

First, import the required Python libraries: elasticsearch, elasticsearch-dsl, and psycopg2. The elasticsearch library provides a common ground for all Elasticsearch-related code written in Python. elasticsearch-dsl is used to write and run queries against Elasticsearch, and psycopg2 is used to connect to the PostgreSQL database. 

import elasticsearch
from elasticsearch import Elasticsearch
import elasticsearch_dsl
from elasticsearch_dsl import Search
import psycopg2 

Step 2: Create an Elasticsearch Object

Next, create an Elasticsearch object to send a data request to Elasticsearch. You need your Elasticsearch database login credentials to set up a connection.

es = Elasticsearch(hosts="http://user_name:your_password@localhost:9200/")

Step 3: Connect to PostgreSQL Database

Similarly, establish a connection to your PostgreSQL database and create a cursor object to execute your INSERT command. 

conn = psycopg2.connect(host="localhost", database="postgres_db_name", user="postgres", password="postgres_password")
cur = conn.cursor()

Step 4: Get Records from Elasticsearch

Now we will use the Elasticsearch object to get the subscribers’ names.

s = Search(index="es_db_name").using(es).query("match", status="subscriber")
es_result = s.execute()
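For reference, the elasticsearch-dsl call above builds a standard match query under the hood. Its raw JSON body, which you could also send directly with the lower-level client, looks roughly like this (sketched here as a plain Python dict, no cluster required):

```python
import json

# Raw Elasticsearch query body equivalent to
# Search(...).query("match", status="subscriber")
query_body = {
    "query": {
        "match": {
            "status": "subscriber"
        }
    }
}

print(json.dumps(query_body, indent=2))
```

Seeing the raw body is handy when debugging, since it is what actually travels over the wire to Elasticsearch.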

Step 5: Read Elasticsearch Data into a List

Read es_result into a list which can then be moved to PostgreSQL.

subscriber_list = []
for x in es_result:
    subscriber_list.append(x.name)

All the names in the matching records will be appended to the empty subscriber_list list. 

Step 6: Load Data into PostgreSQL

Now we just have to load this list into the appropriate column in PostgreSQL.

q = "INSERT INTO the_table VALUES (%s)"
cur.executemany(q, [(name,) for name in subscriber_list])
conn.commit()
conn.close()

In the above section, we wrote a parameterized query that inserts values into a PostgreSQL table called ‘the_table’. Because each %s placeholder expects one row’s values as a tuple, every name in subscriber_list is wrapped in a single-element tuple before being passed to the cursor. Finally, the changes are committed and the PostgreSQL connection is closed.
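As a quick illustration (plain Python, no database required, with sample names standing in for real Elasticsearch results), this is the parameter shape psycopg2’s executemany() expects: one tuple per row, even when the row has a single column.

```python
# Names fetched from Elasticsearch (sample data for illustration).
subscriber_list = ["John Smith", "James Brown"]

# executemany() expects a sequence of row tuples,
# so each single value is wrapped in a one-element tuple.
params = [(name,) for name in subscriber_list]

print(params)  # [('John Smith',), ('James Brown',)]
```

Passing a flat list of strings instead of tuples would raise an error, which is a common stumbling block with single-column inserts.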

Step 7: Elasticsearch to PostgreSQL: Final Code

Here is the complete code to load data from Elasticsearch to PostgreSQL:

import elasticsearch
from elasticsearch import Elasticsearch
import elasticsearch_dsl
from elasticsearch_dsl import Search
import psycopg2

# Create an Elasticsearch object.

es = Elasticsearch(hosts="http://user_name:your_password@localhost:9200/")

# Connect to PostgreSQL and create a cursor object.

conn = psycopg2.connect(host="localhost", database="postgres_db_name", user="postgres", password="postgres_password")
cur = conn.cursor()

# Get the required data from the documents.

s = Search(index="es_db_name").using(es).query("match", status="subscriber")
es_result = s.execute()

# Load the data into a list.

subscriber_list = []
for x in es_result:
    subscriber_list.append(x.name)

# Load the list into a PostgreSQL table using the INSERT statement and close the connection.

q = "INSERT INTO the_table VALUES (%s)"
cur.executemany(q, [(name,) for name in subscriber_list])
conn.commit()
conn.close()

Method 2: Using Hevo Data to Connect Elasticsearch to PostgreSQL


Hevo Data, a No-code Data Pipeline, helps you directly establish an Elasticsearch PostgreSQL connection in a completely hassle-free & automated manner. Hevo connects to your Elasticsearch cluster using the Elasticsearch transport client and synchronizes the data available in the cluster to your destination.

Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

Hevo Data, an official PostgreSQL partner, offers you seamless Elasticsearch Postgres data migration in two very simple steps: 

  • Authenticate Source: Authenticate and configure your Elasticsearch account as the data source.
  • Configure Destination: Connect the PostgreSQL Database as the destination.

Here are more reasons to love Hevo:

  • Secure: Hevo has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
  • Auto Schema Mapping: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
  • Minimal Learning: Hevo, with its simple and interactive UI, is extremely simple for new customers to work on and perform operations.
  • Hevo is Built to Scale: As the number of sources and the volume of your data grows, Hevo scales horizontally, handling millions of records per minute with very little latency.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
  • Live Monitoring: Hevo allows you to monitor the data flow and check where your data is at a particular time.

With continuous real-time data movement, Hevo allows you to combine Elasticsearch data with your other data sources and seamlessly load it to PostgreSQL with a no-code, easy-to-setup interface. Try our 14-day full-feature access free trial!

Get Started with Hevo for Free

Conclusion

The article discussed 2 methods of connecting Elasticsearch to PostgreSQL. The first method is to load data from Elasticsearch to PostgreSQL using Python Libraries. But often, in reality, NoSQL data does not have a consistent structure: the data is usually semi-structured and needs to be transformed. There are several other complexities involved, all of which require heavy engineering bandwidth to deal with. 

Instead, if you need an ETL tool that handles moving data from Elasticsearch to PostgreSQL automatically, give Hevo, a no-code data pipeline, a try. Hevo can help you replicate data from Elasticsearch and 150+ data sources (including 50+ free sources) to PostgreSQL or a destination of your choice and visualize it in a BI tool. This makes Hevo the right partner to be by your side as your business scales.

Learn more about Hevo

Want to take Hevo for a spin? Sign up for a 14-day free trial and experience the feature-rich Hevo suite firsthand.

Share your thoughts on loading data from Elasticsearch to PostgreSQL in the comments below.

No-code Data Pipeline for your Data Warehouse