One of the major goals of storing data in a database is to ensure data availability. You would want to access your data at any time that you want. Having a single copy of the database can be dangerous. In case of an unprecedented event, you won’t be able to access it.
That’s why it’s a good idea for you to have multiple copies of your database. PostgreSQL database management system makes this possible. It supports the concept of replication that you can use to create multiple copies of your database.
What is Logical Replication?
When deploying applications in a production environment, it is always good for you to have multiple copies of the database. After making the copies, you must ensure that they are all in sync. This process of keeping database copies in sync is known as replication.
There are two types of replication in PostgreSQL:
- Physical replication
- Logical replication
The Architecture of PostgreSQL Logical Replication
The PostgreSQL logical replication follows the publish/subscribe model. The publisher node creates a publication, which is also called a change set or a replication set. The publication is simply a set of changes from one or more tables. The subscriber node creates a subscription, with the capability to subscribe to either one or more publications. Now that you know the basics of logical replication, how is the logical replication in PostgreSQL done? That’s what you will learn in the next section.
PostgreSQL Logical Replication Process
The logical replication in PostgreSQL follows the steps given below:
Step 1: A snapshot of the publishing database is copied to the subscriber. This step is also called the table synchronization phase. To reduce the amount of time spent in this phase, you can spawn multiple table synchronization workers. However, you can only have one synchronization worker for each table.
Step 2: After the copy is completed, subsequent changes made in the publisher node will be sent to the subscriber node. The changes will be sent as they happen in real-time. The changes will be applied in a commit order to ensure there is transactional consistency.
Method 1: PostgreSQL logical replication using Hevo’s automated data pipeline
Hevo Data, an Automated No Code Data Pipeline, helps you replicate data from PostgreSQL to Data Warehouses, Business Intelligence Tools, or any other destination of your choice in a completely hassle-free & automated manner. Hevo supports data ingestion replication from PostgreSQL servers via Write Ahead Logs (WALs) set at the logical level. This mode is particularly useful when there is a need to replicate the entire database as it is.
Apart from logical replication, Hevo also supports PostgreSQL replication using Custom SQL (fetches data from Postgres source using a custom SQL query at a fixed frequency) and table mode (fetches data from multiple tables while maintaining control over data ingestion from each table individually)
Hevo provides you with an all-in-one solution for replicating data from Amazon Aurora PostgreSQL, Amazon RDS PostgreSQL, Azure PostgreSQL, and various other PostgreSQL sources. This way you can focus more on your Data Analysis and let Hevo manage the PostgreSQL Data Replication.
Check out the salient features of Hevo:
- Quick Setup: Hevo with its automated features, can be set up within minutes in a simple 3 step process. Moreover, with its simple and interactive UI, it is extremely easy for new customers to work on and perform operations.
- Auto Schema Mapping: Hevo takes away the tedious task of schema management & automatically detects the schema of incoming data from PostgreSQL and replicates it to the destination schema.
- Incremental Data Load: Hevo allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
- Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
Get Started with Hevo for Free
Method 2: PostgreSQL logical replication using manual method
Basic Syntax
To perform logical replication in PostgreSQL, you’ve to change the wal_level = logical parameter. The parameter tells the server to store additional information in WAL for converting binary changes into logical ones. To change the value of this parameter, you have to open the postgresql.conf file.
On Linux, this file can be found in the following directory:
/etc/postgresql/version/main
On Windows, the file can be found in the following directory:
C:Program FilesPostgreSQLversiondata
After opening the file, go to the WRITE-AHEAD LOG settings.
Uncomment the parameter wal_level and set it to the following:
wal_level = logical
You can then restart the PostgreSQL service.
On Windows, you can open the Services dialog and restart the service from there.
On Linux, you can stop and then start the service by running the following sequence of commands:
sudo service postgresql stop
sudo service postgresql start
To check whether the change has taken effect, run the following command on PostgreSQL terminal:
SHOW wal_level;
Once you get this, logical replication has been turned on.
After that, you can create a publication by running the following command:
CREATE PUBLICATION my_publication FOR ALL TABLES;
If you don’t want to publish all tables, you can specify the tables that you need to publish. You can also choose to limit the changes that will be published.
However, for DELETE and UPDATE operations to be replicated, the “published” table should have a REPLICA IDENTITY, which can be a primary key. This makes it possible for the modified rows to be identified in the subscriber. INSERT operations can be done without a replica identity. After the publications have been created, you can go ahead to create subscriptions in the subscriber node.
You can use the following command for this:
CREATE SUBSCRIPTION my_subscription CONNECTION '...
<connection string> ...' PUBLICATION my_publication;
The command will add a subscription named my_subscription for the current database that will begin to receive logical changes from the publication named my_publication. Once you run the command, a logical replication worker will be spawned, and this will receive logical changes coming from the publisher. On the side of the publisher, a walsender process will be spawned, which will be reading the WAL step-by-step, decode changes, and send them to the correct subscriber.
With all the above knowledge, you have learnt how to setup Logical Replication Postgres and get into the practical stuff. Now, you will see a real example of how to set up logical replication in PostgreSQL. So, let’s start.
An Example of PostgreSQL Logical Replication
You will be setting up logical replication on a table and learning PostgreSQL logical replication step by step. Remember that to perform logical replication, you should have the publisher and the subscriber. In this case, you will create both in the same host. However, you will create them in different ports.
First, you should create the publication. Let’s create it on the port 5432.
Here are the steps to do this:
Step 1: Create a Table
First, create a table then insert a row into it. Run the following command:
create table table1(x int primary key, y int);
The above command creates a table named table1 with two integer columns, x and y. Column x is the primary key for the table. The table is ready.
Step 2: Insert a Row
Let’s insert a row into it by running this command:
insert into table1 values(10, 11);
We have inserted a single row into the table named table1. 10 will be inserted into column x while 11 will be inserted into column y.
Step 3: Set Up Publication
It’s now time to set up publication on the table. Here is the command for this:
create publication my_publication for table table1;
The command will create a publication on table1 and give it the name my_publication. The publication is now ready. It’s time for you to create the subscription. You will do it on the port 5431. Following are the steps for this:
Step 3.1: Recreate the table on the subscriber node. Just run the following command:
create table table1(x int primary key, y int);
Step 3.2: Next, we can create the subscription and give it the name my_subscription.
Just run the following command:
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=localhost port=5432 dbname=postgres' PUBLICATION my_publication;
Step 3.3: Since we had inserted one row into the table on the publisher side, we can check whether it has been replicated on the subscriber. Run the following command:
SELECT * FROM table1;
So, the row has been replicated in the subscriber.
Step 3.4: We should now check whether the changes are being replicated or not. We will simply insert a new row into the publisher and see whether this will be updated in the subscriber. So, run the following command on the publisher:
insert into table1 values(20, 25);
Now, run the following command on the subscriber:
SELECT * FROM table1;
So, the changes were replicated!
Limitations in PostgreSQL Logical Replication
Let’s discuss some limitations in the process of logical replication in PostgreSQL:
- It doesn’t replicate the schema or DDL.
- It doesn’t replicate sequences.
- Tables must have a primary key or unique key to participate in the PostgreSQL logical replication process.
- Bi-directional replication is not supported.
- Large objects are not replicated using PostgreSQL logical replication.
Know more about the limitations of PostgreSQL logical replication here.
Use Cases of Postgres Logical Replication
- Notifying subscribers when changes happen in a single database or a subset of a database.
- Triggers that go off when specific updates are sent to the subscriber.
- Combining many databases into one (for example, analytical uses). replicating across many PostgreSQL major versions.
- Replicating data between Linux and Windows PostgreSQL systems.
- Giving several user groups access to duplicated data.
- Allowing many databases to share a portion of the database.
- The publisher and subscriber should have the same fully qualified name in the database.
- The order of the columns in the subscriber table is irrelevant; the column names must match. Furthermore, a subscription table may have the same amount of columns or more.
- Large objects and sequence data replication are not currently supported.
Points to be noted before using logical replication Postgres in versions before PostgreSQL 12
1. Each publication has the ability to publish modifications to many subscribers, and each subscriber can subscribe to numerous publications.
2. You may use the ALTER PUBLICATION command to add or delete tables from an existing publication.
3. The subscriber is now unable to get a replication of the database structure and DDL definitions. The subscriber must have the published tables.
4. A standard table, not a view, materialized view, partition root table, or foreign table, may be duplicated.
5. The publisher and subscriber should have the same fully qualified name in the database.
6. The order of the columns in the subscriber table is irrelevant; the column names must match. Furthermore, a subscription table may have the same amount of columns or more.
7. Large objects and sequence data replication are not currently supported.
Read More About: Migrate Data from PostgreSQL to SQL Server
Conclusion
In this article, you have done the following:
- You configured PostgreSQL to perform logical replication.
- You built the publisher and subscriber nodes.
- You demonstrated how to implement the publisher and the subscriber nodes on the same host.
- You demonstrated how to keep the publisher and the subscriber nodes in sync.
If you want to set up PostgreSQL logical replication without having to write code as described above, you can use an automated data integration solution such as Hevo.
Want to take Hevo for a spin? Sign Up or a 14-day free trial and experience the feature-rich Hevo suite firsthand. Also checkout our unbeatable pricing to choose the best plan for your organization.
Nicholas Samuel is a technical writing specialist with a passion for data, having more than 14+ years of experience in the field. With his skills in data analysis, data visualization, and business intelligence, he has delivered over 200 blogs. In his early years as a systems software developer at Airtel Kenya, he developed applications, using Java, Android platform, and web applications with PHP. He also performed Oracle database backups, recovery operations, and performance tuning. Nicholas was also involved in projects that demanded in-depth knowledge of Unix system administration, specifically with HP-UX servers. Through his writing, he intends to share the hands-on experience he gained to make the lives of data practitioners better.