PostgreSQL Logical Replication: Step-by-Step Explanation

Q: 2. What is the difference between physical and logical replication in PostgreSQL?

Physical Replication : Replicates the entire database cluster at the block level, including all transactions. Suitable for disaster recovery and high availability. Logical Replication : Replicates specific tables or changes at the row level, allowing selective replication and schema transformations.

When working with PostgreSQL in production, you might want to keep certain tables synced across different databases. This is where replication comes into play.

Replication in PostgreSQL is commonly used in production environments for high availability, load distribution, reporting, and data synchronization across environments. There are two types of replication in PostgreSQL: logical and physical.

Table of Contents

What Is PostgreSQL Logical Replication?

1. Physical replication

Physical replication in PostgreSQL is a binary-level replication method that transmits the write-ahead log (WAL) records from a primary server to one or more standby servers.

The replication involves streaming replication, file-based log shipping, or both to keep standby nodes synchronized with the primary.

This means you can replicate the entire database cluster, including databases, schemas, and configuration files, by applying changes at the storage level.

2. Logical replication

In contrast, PostgreSQL logical replication captures row-level changes, such as inserts, updates, and deletes, by decoding changes from the WAL. The decoded changes are then transmitted to another PostgreSQL instance, where they are replayed as SQL operations.

Logical decoding enables selective replication across PostgreSQL instances, making it suitable for migrations, data distribution, and integration use cases.

Understanding physical replication clarifies that logical replication focuses on targeted, row-level synchronization rather than full-cluster duplication. Now, let’s understand the structure of logical replication.

The Architecture of PostgreSQL Logical Replication

Logical replication in PostgreSQL is built on a publish and subscribe model, where changes in specific tables are replicated from one database to another using row-level data.

Publisher: The publisher node creates a publication on the source PostgreSQL database that defines what to replicate. It tracks row-level data changes and doesn’t replicate schema changes.

Subscriber: The subscriber node creates a subscription on the target PostgreSQL database that defines where to replicate. It connects to a publication on the source, receives the data changes, and applies them to local tables.

To sum it up, you can set up a publication on the source database to share table changes, and create a subscription on the target database to receive those changes.

But how is the logical replication in PostgreSQL done? Let’s find out!

PostgreSQL Logical Replication Step-by-Step Process

For accurate row-level sync, PostgreSQL logical replication follows a two-step process:

Step 1: Initial Table Synchronization

On creating the first subscription, PostgreSQL copies the existing rows of published tables to the subscriber, ensuring both nodes are in sync before replicating real-time changes.

The workflow:

Snapshot creation: A snapshot of each published table is taken from the publisher, which is associated with a Log Sequence Number (LSN).

Table sync: Each table is handled by a table synchronization worker to transfer data from the publisher to the subscriber using standard SQL copy.

Multiple sync workers can run in parallel, but only one per table. This is controlled using the max_sync_workers_per_subscription configuration parameter.

WAL buffers: Logical decoding captures the changes in the publisher from the WAL and buffers them until the sync finishes.

Consistency: Once the initial data is copied, the buffered changes are applied from the exact snapshot LSN, ensuring no changes are skipped.

Step 2: Streaming Data Changes

After the initial sync, PostgreSQL transitions to real-time change replication. Further modifications on the publisher are extracted from the WAL and streamed to the subscriber using logical decoding.

The workflow:

Capture changes: PostgreSQL reads changes from the WAL and converts them into logical messages. Changes include:

INSERT
UPDATE
DELETE

Streaming: Changes are transmitted through a persistent replication connection managed by the apply worker on the subscriber.

Enforcement: The apply worker applies changes consistently to maintain transactional consistency.

Note: Logical replication is asynchronous, which means there might be a slight delay depending on:

Network speed
System performance
WAL generation rate
Replication slot lag

Methods for PostgreSQL Logical Replication

There are primarily two methods to execute PostgreSQL logical replication:

Using Hevo’s automated data pipeline
Using the manual method

Method 1: PostgreSQL logical replication using Hevo’s automated data pipeline

Hevo offers a fully managed, no-code solution for logical replication of PostgreSQL data to cloud warehouses and BI tools via PostgreSQL’s WALs. Here’s how it works:

Step 1: Connect the PostgreSQL source

Hevo supports connections to:

Amazon RDS PostgreSQL
Amazon Aurora PostgreSQL
Azure Database for PostgreSQL
Self-hosted PostgreSQL instances

Enable wal_level = logical when configuring the source.

Step 2: Configure the replication mode

Hevo offers three modes for data ingestion:

Logical WAL replication: Uses PostgreSQL’s logical decoding feature to stream row-level changes in real time.

Custom SQL: To define a custom SQL query to extract data from the source at scheduled intervals.

Table mode: Pulls data from one or more tables individually for fine-grained control over ingestion.

Step 3: Set up the target destination

The workflow:

Choose your target destination (e.g., Redshift, Snowflake, BigQuery, etc.).
Define the target schema or dataset where replicated tables will be written.

Hevo automatically:

Creates destination tables (if they don’t exist)
Maps column types based on source schema
Handles nested or complex data structures during load

Step 4: Enable auto schema mapping

The workflow:

Hevo detects the source schema, which includes tables, columns, and data types during the initial sync.
The detected schema is mapped to the destination schema.
The platform detects new columns added over time and adjusts the mapping to reflect the changes.

You can manually review or customize the mapped schema in Hevo’s UI to ensure it aligns with your destination data model.

Step 5: Trigger real-time data replication

The workflow:

Once the initial data is ingested, Hevo begins incremental replication using PostgreSQL’s WAL-based Change Data Capture (CDC).

All new INSERT, UPDATE, and DELETE operations are captured in real time and applied to the destination.
Hevo continuously reads changes from the logical replication slot and applies them in real time.

This reduces replication lag, minimizes source load, and ensures up-to-date data with high throughput and low latency.

Step 6: Monitor and maintain replication with Hevo

Validate replication using Hevo’s key features:

Use the monitoring dashboard to track pipeline status, sync progress, and data latency in real time.
Set custom alerts and failure notifications for schema mismatches, ingestion delays, and transformation errors.
Use 24/7 live technical support to resolve configuration errors, schema drift, or replication slot mismanagement.

What customers say about Hevo:

Unlock the power of your Postgres data by seamlessly connecting it to various destinations, enabling comprehensive analysis in tools like Google Data Studio.

Check out Why Hevo is the right choice for you!

No-Code Platform: Easily set up and manage your data pipelines without any coding.
Auto-Schema Mapping: Automatically map schemas for smooth data transfer.
Pre and Post-Load Transformations: Transform your data at any stage of the migration process.
Real-Time Data Ingestion: Keep your data up-to-date with real-time synchronization.

Join over 2000 happy customers who trust Hevo for their data integration needs and experience why we are rated 4.7 on Capterra.

Move PostgreSQL Data for Free

Method 2: PostgreSQL logical replication using manual method

Below is a step-by-step breakdown for configuring logical replication manually:

Step 1: Set up logical replication parameters

To use logical replication, update your PostgreSQL configuration to set wal_level = logical. By default, it’s set to log physical changes, which won’t support logical replication.

Locate the postgresql.conf file:

For Linux: /etc/postgresql/<version>/main/postgresql.conf
For Windows: C:\Program Files\PostgreSQL\<version>\data\postgresql.conf

After opening the file, add the following parameter under the WRITE-AHEAD LOG setting:

wal_level = logical

Now, restart the PostgreSQL service after saving the configuration:

For Linux:

sudo service postgresql stop
sudo service postgresql start

For Windows: Use the Services control panel to restart the PostgreSQL service.

Next, we’ll verify the logical WAL level by running this SQL command:

SHOW wal_level;

Step 2: Create a publication on the publisher

Use this command to create a publication:

CREATE PUBLICATION my_publication FOR ALL TABLES;

Or, specify the tables you need to publish:

CREATE PUBLICATION my_publication FOR TABLE table1, table2;

You can also choose to limit the changes that will be published:

CREATE PUBLICATION my_publication FOR TABLE table1
WITH (publish = 'insert, update');

Note: For tables without a primary key, you must explicitly define one, or use REPLICA IDENTITY FULL:

ALTER TABLE table1 REPLICA IDENTITY FULL;

Step 3: Define the subscription

After the publications have been created, you can go ahead to create subscriptions in the subscriber node:

CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=<publisher_host> port=5432 dbname=yourdb user=replicator password=secret'
PUBLICATION my_publication;

Creates a subscription named my_subscription.
Subscribes to the my_publication publication.
Copies existing data from the published tables.
Starts streaming live changes to subscribers.

An Example of Postgres Logical Replication

Here, we have taken an example of two PostgreSQL instances running on the same host. Starting with one PostgreSQL instance running on port 5432 as the publisher, and another on port 5431 as the subscriber.

The pre-setup requirements include:

wal_level parameter in postgresql.conf must be set to logical.
max_replication_slots ≥ number of subscriptions.
max_wal_senders ≥ number of active replication connections.

Step 1: Create the table on the publisher

Connect to the PostgreSQL instance running on port 5432 and execute:

CREATE TABLE table1(x INT PRIMARY KEY, y INT);

The above command creates a table named table1 with two integer columns, x and y. Column x is the primary key for the table to identify rows during UPDATE and DELETE operations.

Step 2: Insert a row into the publisher

Here, we insert a test row in table1 to confirm the transfer of existing data to the subscriber.

INSERT INTO table1 VALUES (10, 11);

Inserts one row into table1 on port 5432.
Value 10 goes to column x, and 11 goes to column y.

Step 3: Set up publication for replication

Create a publication:

CREATE PUBLICATION my_publication FOR TABLE table1;

Creates a publication named my_publication.
PostgreSQL tracks row-level changes on table1 for replication to subscribers.

Step 4: Recreate the table on the subscriber node

Create the same table on the PostgreSQL instance running on port 5431:

CREATE TABLE table1(x INT PRIMARY KEY, y INT);

This step is necessary because logical replication doesn’t replicate schema changes.

Step 5: Set up replication on the subscriber

Now, we create a subscription to link with the publication:

CREATE SUBSCRIPTION my_subscription 
CONNECTION 'host=localhost port=5432 dbname=postgres' 
PUBLICATION my_publication;

Creates a subscription named my_subscription.
Connects to the publisher on port 5432.
Subscribes to the publication, my_publication.
Triggers initial synchronization to copy table1 data to the subscriber.

Step 6: Verify sync

Here we check the replication of rows inserted in Step 2 (10,11):

SELECT * FROM table1;

PostgreSQL takes a snapshot of the table on the publisher.
Copies the data to the subscriber using a table synchronization worker.
Inserts the data using standard SQL.

Step 7: Confirm streaming replication

We simply insert a new row into the publisher and check if it gets updated in the subscriber:

INSERT INTO table1 VALUES (20, 25);

Next, we run the following query on the subscriber to verify changes:

SELECT * FROM table1;

If you see this:

 x  | y
----+----
10  | 11
20  | 25

This confirms that the replication is working and that changes are being streamed from the publisher to the subscriber.

Limitations in Postgres Logical Replication

Let’s discuss some limitations in the process of logical replication in PostgreSQL:

It doesn’t replicate the schema or DDL.
It doesn’t replicate sequences.
Tables must have a primary key or unique key to participate in the PostgreSQL logical replication process.
Bi-directional replication is not supported.
Large objects are not replicated using PostgreSQL logical replication.

Take a look at the limitations of PostgreSQL logical replication in detail.

Use Cases of Postgres Logical Replication

Notifying subscribers when changes happen in a single database or a subset of a database.
Triggers that go off when specific updates are sent to the subscriber.
Combining many databases into one (for example, analytical uses). replicating across many PostgreSQL major versions.
Replicating data between Linux and Windows PostgreSQL systems.
Giving several user groups access to duplicated data.
Allowing many databases to share a portion of the database.
The publisher and subscriber should have the same fully qualified name in the database.
The order of the columns in the subscriber table is irrelevant; the column names must match. Furthermore, a subscription table may have the same amount of columns or more.
Large objects and sequence data replication are not currently supported.

Points to be noted before using logical replication Postgres in versions before PostgreSQL 12

1. Each publication has the ability to publish modifications to many subscribers, and each subscriber can subscribe to numerous publications.

2. You may use the ALTER PUBLICATION command to add or delete tables from an existing publication.

3. The subscriber is now unable to get a replication of the database structure and DDL definitions. The subscriber must have the published tables.

4. A standard table, not a view, materialized view, partition root table, or foreign table, may be duplicated.

5. The publisher and subscriber should have the same fully qualified name in the database.

6. The order of the columns in the subscriber table is irrelevant; the column names must match. Furthermore, a subscription table may have the same amount of columns or more.

7. Large objects and sequence data replication are not currently supported.

Conclusion

In this article, you have done the following:

You configured PostgreSQL to perform logical replication.
You built the publisher and subscriber nodes.
You demonstrated how to implement the publisher and the subscriber nodes on the same host.
You demonstrated how to keep the publisher and the subscriber nodes in sync.

If you want to set up PostgreSQL logical replication without having to write code as described above, you can use an automated data integration solution such as Hevo.

Want to take Hevo for a spin? Sign Up or a 14-day free trial and experience the feature-rich Hevo suite firsthand. Also checkout our unbeatable pricing to choose the best plan for your organization.

FAQs

1. How does logical replication work in PostgreSQL?

Logical replication in PostgreSQL copies data changes from a publisher database to one or more subscriber databases. It uses a publish-subscribe model, where the publisher sends logical changes (INSERT, UPDATE, DELETE) to subscribers through replication slots.

2. What is the difference between physical and logical replication in PostgreSQL?

Physical Replication: Replicates the entire database cluster at the block level, including all transactions. Suitable for disaster recovery and high availability.
Logical Replication: Replicates specific tables or changes at the row level, allowing selective replication and schema transformations.

3. What are the limitations of logical replication in PostgreSQL?

No support for DDL (e.g., ALTER TABLE).
Doesn’t replicate sequences automatically.
Requires a primary key or replica identity for tables.
Limited conflict resolution capabilities.

Nicholas Samuel Technical Content Writer, Hevo Data

Nicholas Samuel is a technical writing specialist with a passion for data, having more than 14+ years of experience in the field. With his skills in data analysis, data visualization, and business intelligence, he has delivered over 200 blogs. In his early years as a systems software developer at Airtel Kenya, he developed applications, using Java, Android platform, and web applications with PHP. He also performed Oracle database backups, recovery operations, and performance tuning. Nicholas was also involved in projects that demanded in-depth knowledge of Unix system administration, specifically with HP-UX servers. Through his writing, he intends to share the hands-on experience he gained to make the lives of data practitioners better.

PostgreSQL Logical Replication: Easy Step-by-Step Explanation

What Is PostgreSQL Logical Replication?

1. Physical replication

2. Logical replication

The Architecture of PostgreSQL Logical Replication

PostgreSQL Logical Replication Step-by-Step Process

Step 1: Initial Table Synchronization

Step 2: Streaming Data Changes

Methods for PostgreSQL Logical Replication

Method 1: PostgreSQL logical replication using Hevo’s automated data pipeline

Step 1: Connect the PostgreSQL source

Step 2: Configure the replication mode

Step 3: Set up the target destination

Step 4: Enable auto schema mapping

Step 5: Trigger real-time data replication

Step 6: Monitor and maintain replication with Hevo

Method 2: PostgreSQL logical replication using manual method

Step 1: Set up logical replication parameters

Step 2: Create a publication on the publisher

Step 3: Define the subscription

An Example of Postgres Logical Replication

Step 1: Create the table on the publisher

Step 2: Insert a row into the publisher

Step 3: Set up publication for replication

Step 4: Recreate the table on the subscriber node

Step 5: Set up replication on the subscriber

Step 6: Verify sync

Step 7: Confirm streaming replication

Limitations in Postgres Logical Replication

Use Cases of Postgres Logical Replication

Points to be noted before using logical replication Postgres in versions before PostgreSQL 12

Conclusion

FAQs

1. How does logical replication work in PostgreSQL?

2. What is the difference between physical and logical replication in PostgreSQL?

3. What are the limitations of logical replication in PostgreSQL?

Related articles

PostgreSQL Logical Replication: Easy Step-by-Step Explanation

What Is PostgreSQL Logical Replication?

1. Physical replication

2. Logical replication

The Architecture of PostgreSQL Logical Replication

PostgreSQL Logical Replication Step-by-Step Process

Step 1: Initial Table Synchronization

Step 2: Streaming Data Changes

Methods for PostgreSQL Logical Replication

Method 1: PostgreSQL logical replication using Hevo’s automated data pipeline

Step 1: Connect the PostgreSQL source

Step 2: Configure the replication mode

Step 3: Set up the target destination

Step 4: Enable auto schema mapping

Step 5: Trigger real-time data replication

Step 6: Monitor and maintain replication with Hevo

Method 2: PostgreSQL logical replication using manual method

Step 1: Set up logical replication parameters

Step 2: Create a publication on the publisher

Step 3: Define the subscription

An Example of Postgres Logical Replication

Step 1: Create the table on the publisher

Step 2: Insert a row into the publisher

Step 3: Set up publication for replication

Step 4: Recreate the table on the subscriber node

Step 5: Set up replication on the subscriber

Step 6: Verify sync

Step 7: Confirm streaming replication

Limitations in Postgres Logical Replication

Use Cases of Postgres Logical Replication

Points to be noted before using logical replication Postgres in versions before PostgreSQL 12

Conclusion

FAQs

1. How does logical replication work in PostgreSQL?

2. What is the difference between physical and logical replication in PostgreSQL?

3. What are the limitations of logical replication in PostgreSQL?

Related Articles

Optimize your data integration with Hevo!

Related articles