One of the major goals of storing data in a database is to ensure data availability. You would want to access your data at any time that you want. Having a single copy of the database can be dangerous. In case of an unprecedented event, you won’t be able to access it.

That’s why it’s a good idea for you to have multiple copies of your database. PostgreSQL database management system makes this possible. It supports the concept of replication that you can use to create multiple copies of your database.

What is Logical Replication?

When deploying applications in a production environment, it is always good for you to have multiple copies of the database. After making the copies, you must ensure that they are all in sync. This process of keeping database copies in sync is known as replication. 

There are two types of replication in PostgreSQL: 

  1. Physical replication
  2. Logical replication

The Architecture of PostgreSQL Logical Replication

The PostgreSQL logical replication follows the publish/subscribe model. The publisher node creates a publication, which is also called a change set or a replication set. The publication is simply a set of changes from one or more tables. The subscriber node creates a subscription, with the capability to subscribe to either one or more publications. Now that you know the basics of logical replication, how is the logical replication in PostgreSQL done? That’s what you will learn in the next section. 

Methods to setup PostgreSQL logical replication

PostgreSQL logical replication using Hevo’s automated data pipeline

Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates data pipelines that are flexible to your needs. With integration with 150+ Data Sources (40+ free sources), we help you not only export data from sources & load data to the destinations but also transform & enrich your data, & make it analysis-ready.

Get Started with Hevo for Free

PostgreSQL logical replication using manual method

Logical replication in PostgreSQL is achieved by the construction of a stream of logical data modifications from the WAL file. It involves setting up the wal_level = logical for PostgreSQL to convert binary changes into logical ones.

PostgreSQL Logical Replication Process

The logical replication in PostgreSQL follows the steps given below:

Step 1: A snapshot of the publishing database is copied to the subscriber. This step is also called the table synchronization phase. To reduce the amount of time spent in this phase, you can spawn multiple table synchronization workers. However, you can only have one synchronization worker for each table. 

 Step 2: After the copy is completed, subsequent changes made in the publisher node will be sent to the subscriber node. The changes will be sent as they happen in real-time. The changes will be applied in a commit order to ensure there is transactional consistency. 

Method 1: PostgreSQL logical replication using Hevo’s automated data pipeline

Postgresql logical replication
Image source

Hevo Data, an Automated No Code Data Pipeline, helps you replicate data from PostgreSQL to Data Warehouses, Business Intelligence Tools, or any other destination of your choice in a completely hassle-free & automated manner. Hevo supports data ingestion replication from PostgreSQL servers via Write Ahead Logs (WALs) set at the logical level. This mode is particularly useful when there is a need to replicate the entire database as it is.

Apart from logical replication, Hevo also supports PostgreSQL replication using Custom SQL (fetches data from Postgres source using a custom SQL query at a fixed frequency) and table mode (fetches data from multiple tables while maintaining control over data ingestion from each table individually)

Hevo provides you with an all-in-one solution for replicating data from Amazon Aurora PostgreSQL, Amazon RDS PostgreSQL, Azure PostgreSQL, and various other PostgreSQL sources. This way you can focus more on your Data Analysis and let Hevo manage the PostgreSQL Data Replication.

Check out the salient features of Hevo:

  • Quick Setup: Hevo with its automated features, can be set up within minutes in a simple 3 step process. Moreover, with its simple and interactive UI, it is extremely easy for new customers to work on and perform operations.
  • Auto Schema Mapping: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data from PostgreSQL and replicates it to the destination schema.
  • Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
Get Started with Hevo for Free

Check out our documentation to learn more about this method.

Method 2: PostgreSQL logical replication using manual method

Basic Syntax

To perform logical replication in PostgreSQL, you’ve to change the wal_level = logical parameter. The parameter tells the server to store additional information in WAL for converting binary changes into logical ones. To change the value of this parameter, you have to open the postgresql.conf file. 

On Linux, this file can be found in the following directory:

/etc/postgresql/version/main

On Windows, the file can be found in the following directory:

C:Program FilesPostgreSQLversiondata

After opening the file, go to the WRITE-AHEAD LOG settings. 

Uncomment the parameter wal_level and set it to the following:

wal_level = logical

You can then restart the PostgreSQL service. 

On Windows, you can open the Services dialog and restart the service from there. 

On Linux, you can stop and then start the service by running the following sequence of commands:

sudo service postgresql stop

sudo service postgresql start

To check whether the change has taken effect, run the following command on PostgreSQL terminal:

SHOW wal_level;

Once you get this, logical replication has been turned on. 

After that, you can create a publication by running the following command:

CREATE PUBLICATION my_publication FOR ALL TABLES;

If you don’t want to publish all tables, you can specify the tables that you need to publish. You can also choose to limit the changes that will be published. 

However, for DELETE and UPDATE operations to be replicated, the “published” table should have a REPLICA IDENTITY, which can be a primary key. This makes it possible for the modified rows to be identified in the subscriber. INSERT operations can be done without a replica identity. After the publications have been created, you can go ahead to create subscriptions in the subscriber node. 

You can use the following command for this:

CREATE SUBSCRIPTION my_subscription CONNECTION '... 
<connection string> ...' PUBLICATION my_publication;

The command will add a subscription named my_subscription for the current database that will begin to receive logical changes from the publication named my_publication. Once you run the command, a logical replication worker will be spawned, and this will receive logical changes coming from the publisher. On the side of the publisher, a walsender process will be spawned, which will be reading the WAL step-by-step, decode changes, and send them to the correct subscriber. 

With all the above knowledge, you have learnt how to setup Logical Replication Postgres and get into the practical stuff. Now, you will see a real example of how to set up logical replication in PostgreSQL. So, let’s start.

An Example of PostgreSQL Logical Replication

You will be setting up logical replication on a table and learning PostgreSQL logical replication step by step. Remember that to perform logical replication, you should have the publisher and the subscriber. In this case, you will create both in the same host. However, you will create them in different ports. 

First, you should create the publication. Let’s create it on the port 5432

Here are the steps to do this:

Step 1: Create a Table

First, create a table then insert a row into it. Run the following command:

create table table1(x int primary key, y int); 

The above command creates a table named table1 with two integer columns, x and y. Column x is the primary key for the table. The table is ready. 

Step 2: Insert a Row

Let’s insert a row into it by running this command:

insert into table1 values(10, 11);  

We have inserted a single row into the table named table1. 10 will be inserted into column x while 11 will be inserted into column y. 

Step 3: Set Up Publication

It’s now time to set up publication on the table. Here is the command for this:

create publication my_publication for table table1;   

The command will create a publication on table1 and give it the name my_publication. The publication is now ready. It’s time for you to create the subscription. You will do it on the port 5431. Following are the steps for this:

Step 3.1: Recreate the table on the subscriber node. Just run the following command:

create table table1(x int primary key, y int); 

Step 3.2: Next, we can create the subscription and give it the name my_subscription

Just run the following command:

CREATE SUBSCRIPTION my_subscription 
CONNECTION 'host=localhost port=5432 dbname=postgres' PUBLICATION my_publication;

Step 3.3: Since we had inserted one row into the table on the publisher side, we can check whether it has been replicated on the subscriber. Run the following command:

SELECT * FROM table1;  

So, the row has been replicated in the subscriber. 

Step 3.4: We should now check whether the changes are being replicated or not. We will simply insert a new row into the publisher and see whether this will be updated in the subscriber. So, run the following command on the publisher:

insert into table1 values(20, 25);  

Now, run the following command on the subscriber:

SELECT * FROM table1; 

So, the changes were replicated!

Limitations in PostgreSQL Logical Replication

Let’s discuss some limitations in the process of logical replication in PostgreSQL:

  1. It doesn’t replicate the schema or DDL.
  2. It doesn’t replicate sequences.
  3. Tables must have a primary key or unique key to participate in the PostgreSQL logical replication process.
  4. Bi-directional replication is not supported.
  5. Large objects are not replicated using PostgreSQL logical replication.

Know more about the limitations of PostgreSQL logical replication here.

Use Cases of Postgres Logical Replication

  • Notifying subscribers when changes happen in a single database or a subset of a database.
  • Triggers that go off when specific updates are sent to the subscriber.
  • Combining many databases into one (for example, analytical uses). replicating across many PostgreSQL major versions.
  • Replicating data between Linux and Windows PostgreSQL systems.
  • Giving several user groups access to duplicated data.
  • Allowing many databases to share a portion of the database.
  • The publisher and subscriber should have the same fully qualified name in the database.
  • The order of the columns in the subscriber table is irrelevant; the column names must match. Furthermore, a subscription table may have the same amount of columns or more.
  • Large objects and sequence data replication are not currently supported. 

Points to be noted before using logical replication Postgres in versions before PostgreSQL 12

1. Each publication has the ability to publish modifications to many subscribers, and each subscriber can subscribe to numerous publications.

2. You may use the ALTER PUBLICATION command to add or delete tables from an existing publication.

3. The subscriber is now unable to get a replication of the database structure and DDL definitions. The subscriber must have the published tables.

4. A standard table, not a view, materialized view, partition root table, or foreign table, may be duplicated. 

5. The publisher and subscriber should have the same fully qualified name in the database.

6. The order of the columns in the subscriber table is irrelevant; the column names must match. Furthermore, a subscription table may have the same amount of columns or more.

7. Large objects and sequence data replication are not currently supported

Conclusion

In this article, you have done the following:

  • You configured PostgreSQL to perform logical replication. 
  • You built the publisher and subscriber nodes. 
  • You demonstrated how to implement the publisher and the subscriber nodes on the same host. 
  • You demonstrated how to keep the publisher and the subscriber nodes in sync.

If you want to set up PostgreSQL logical replication without having to write code as described above, you can use an automated data integration solution such as Hevo.

Hevo Data is a No-code Data Pipeline that lets you extract, transform, and load data from a plethora of sources such as databases (like PostgreSQL), cloud applications, SDKs, etc. to your data warehouse. Hevo supports pre-built data integration from 150+ data sources.

visit our website to explore hevo

sign up for a 14 day free trial for a seamless data integration experience.

Share your experience of setting up PostgreSQL logical replication in the comment section below.

Nicholas Samuel
Technical Content Writer, Hevo Data

Skilled in freelance writing within the data industry, Nicholas is passionate about unraveling the complexities of data integration and data analysis through informative content for those delving deeper into these subjects. He has written more than 150+ blogs on databases, processes, and tutorials that help data practitioners solve their day-to-day problems.