How To Achieve PostgreSQL Master Slave Replication

on Tutorials • June 20th, 2020 • Write for Hevo

Introduction

PostgreSQL is an extremely robust open-source database used by noted players like Skype, Reddit, Instagram, and OpenStreetMap. PostgreSQL allows replication to nodes that can run read-only queries.

This article walks you through the steps to achieve PostgreSQL Master Slave Replication. But let us first try to understand the need for replication.

When you start to face a spike in your website traffic or when you finally start to have a large number of subscribers to your services, your database might start to slow down. Once your database becomes the weak link in the chain, it will end up being vulnerable to all sorts of unexpected failures.

So how should you protect your database from this problem? You could upgrade your server to make it more powerful and allow it to handle more queries. It does solve the database’s bottleneck problem. But it is still vulnerable. If it fails, you have nothing to fall back on.

Let us think of another solution. If we add multiple servers in parallel, not only do you have scale but you also ensure that failure of one instance does not collapse the entire system. So if the main server fails, you have a second server ready to take over.

Here is what we will cover in this article:

Introduction to PostgreSQL Master Slave Replication

It is very likely that your website or application has a lot of read-queries and very few write queries. In such a case, the ideal strategy would be to replicate the master server into several slave servers. The slave servers are used only for read queries and when there is a fail-over. In the event of a fail-over, one of the slaves can become a master.

Replication allows changes happening in the cluster of the primary database to be copied to the standby database on another server. What is the point of this? Replication distributes the database to several different machines. This way, we have a backup, and queries can be handled without downtime.

There are many different ways to PostgreSQL master slave replication. Let us look into logical replication which replicates data objects based on their replication identity. Under logical replication, there is a publisher node that has several subscriber nodes pulling data. This type of replication is generally used when subscribers need to fetch incremental changes in data.

Prerequisites

  • PostgreSQL master instance
  • PostgreSQL slave instance
  • PostgreSQL 10

First of all, you need two server instances. Let us call one of them master_node and the other slave_node.

A Reliable Solution to Move Data from PostgreSQL to your Data Warehouse

Hevo Data, a No-code Data Pipeline, can help you move data from the slave to a database or a data warehouse of your choice with absolute ease. You do not have to deal with the grunt work because Hevo is fully automated and it needs minimal supervision from your side. Check out some of Hevo’s cool features :

  • Simplicity: Simplicity is the essence of Hevo. The whole interface is very intuitive and you don’t need any developer assistance to get your job done.
  • Scalability: With Hevo, you can scale up and down based on your needs. Hevo supports 100 + integrations ranging from marketing and sales applications to web and mobile applications at any scale. Check out the list of integrations here.
  • Fault-tolerant: With Hevo’s ability to detect anomalies in your incoming data and taking action, your analytics will run without facing a glitch.
  • Secure: Hevo ensures that your data is totally secure with end-to-end encryption and two-factor authentication.
  • 24*7 Support: Hevo’s team is available to serve you through chat, call, and mail.
  • Real-time Data Transfer: Hevo lets you move data in real-time and quickly gain valuable insights.

You can move your data from PostgreSQL to any data warehouse very easily using Hevo. Not convinced? You can try it for yourself. Why don’t you just give it a try by signing up for a free 14-day trial here?

Steps to Achieve PostgreSQL Master Slave Replication

Here are the steps we will be following:

  • Step 1: Creating a User in the Master Node

In the master_node, create a user called user_name. In order to do that, in the master node, enter the following commands.

$ psql 
CREATE ROLE user_name REPLICATION LOGIN PASSWORD ‘my_password’;

You will get an output that looks like this:

CREATE ROLE
  • Step 2: Authentication

The next step is to authenticate the user. To do that, we will edit the pg_hba.conf file. You have to know the path where this file is installed for you.

sudo vim path/of/your/file/pg_hba.conf

Create a line where, USER is user_name, the address is slave_node and the method is md5.

PostgreSQL Master Slave Replication : User Authentication

The purpose of this step is to allow  ‘user_name’ to connect to all the databases from the slave node.

  • Step 3: Configuring the wal_level

The next step is to configure the wal_level. wal_level determines the amount of information written to the WAL. It needs to be changed from the default ‘replica’ to ‘logical’. This will add the necessary amount of info to the wal_log which will allow the slave_node to get access to the master_node and copy the tabular data.

To edit the postgres.conf file, enter the following command in the master_node:

sudo vim path/of your/file/postgres.conf
PostgreSQL Master Slave Replication : wal_level Configuration

Since you have made changes to the wal_level, you need to restart PostgreSQL.

Use this command to restart:

sudo service postgresql-10 restart
  • Step 4: Creating a Publication

In this step, we will be creating a publication. That is, we set up a table in a database in the master node so that the slave node can copy it

To get a list of the databases, enter the following command :

$psql
l

Connect to one of the databases in the list.

c my_db

Pick a table in the database my_db and create a PUBLICATION using the following command :

CREATE PUBLICATION my_pub FOR TABLE my_table;

Next, we will grant user_name all the rights to my_table using the following command :

GRANT ALL ON my_table TO user_name;
  • Step 5: Configuring the Slave Node

In this step, we will configure the slave node. The first thing to do is to create the my_db database using this command in slave_node.

createdb my_db

Make sure that the pg_hba.conf file is prepared on the slave_node to allow the postgres user to authenticate from the master_node using the following command: 

sudo cat path/of/your/file/pg_hba.conf
PostgreSQL Master Slave Replication : Slave Node Configuration

Now go on master_node and type in the following command to get the structure of your data in my_table using a schema only dump. We do this step because PostgreSQL needs the table to exist in slave_node as well.

pg_dump my_db –t my_table –s

You can now pipe this over to slave_node.

pg_dump my_db –t my_table –s | psql my_db –h slave_node

This command will pipe the schema of my_table into the host – slave_node. The data inside the table doesn’t come with this because we have specified that we are piping schema only with the -s.

If you now check, you will find that both master_node and slave_node have my_table in the my_db database, but in the slave node, we only have the schema with the table being empty.

  • Step 6: Creating a Subscription in the Slave Node

Now let us begin replication to move the data in this table to the slave node. We do this by creating a SUBSCRIPTION.

Go to slave_node instance and enter the command stated below :

CREATE SUBSCRIPTION my_sub CONNECTION ‘dbname = my_db host = master_node user = user_name password = my_password’ PUBLICATION my_pub;

This will create the replication slot on the publisher.

  • Step 7: Verifying

In order to verify, you can enter this command in slave_node instance.

SELECT * FROM my_table;

Now you should see the data inside the table in the slave node instance as well.

To continue the verification further, go back to master_node, to enter the below command :

INSERT INTO my_table (column_1) VALUES (value);

Go back to slave_node and verify if the change can be noticed on the slave node instance.

Conclusion

The goal of replication is to make your database more available and secure. In this article, you saw how logical master-slave replication can be achieved with  PostgreSQL. The master and slave instances can now communicate with each other and your data will now be more secure.

But in order to set it all up using the above approach, you need to manually configure many details and that can eat away a lot of time and effort. The process requires a very detailed understanding of PostgreSQL and data engineering concepts. We have shown the process using the example of some simple data. But the reality is that a lot of data needs to be transformed before loading it into its destination.

But we can help you. Hevo is a No-code ETL tool that provides a very simple user interface that will handle all the configuration miseries for you. Hevo can set up an ETL pipeline for your specific needs and you can quickly move on to production.

In fact, you can try it out for free for 14 days and continue if you find it perfect for your needs. Sign up here for a free 14-day trial and test it out for yourself. No commitments.

What are your thoughts on PostgreSQL Master Slave Replication? We would love to hear from you in the comments.

No-code Data Pipeline for PostgreSQL