A PostgreSQL replication slot is a vital feature ensuring the master server retains necessary Write-Ahead Log (WAL) files, even when replicas are temporarily disconnected. In streaming replication scenarios with hot or archiving standbys, replication slots preserve WAL files, allowing the master to track standby lag. When a standby reconnects, preserved WAL files are decoded and applied.

This article emphasizes the significance of PostgreSQL replication slots, covers the types, and provides commands for the creation, deletion, and monitoring of PostgreSQL replication slots. Dive in to discover more about PostgreSQL Replication Slots and implement them effectively!

What are Replication slots?

A replication slot is a PostgreSQL feature that ensures the master server keeps the WAL logs required by replicas even when they are disconnected from the master.

When streaming replication is utilized between a master and some hot or archiving standbys, a replication slot is needed to keep the WAL files alive even when the replica is offline or disconnected.

If the standby goes down, the master can keep track of how much the standby lags and preserve the WAL files it requires until the standby reconnects. The WAL files are then decoded and played back on the duplicate.

Catalyze PostgreSQL Replication with Hevo’s no-code Data Pipeline

Hevo is the only real-time ELT No-code Data Pipeline platform that cost-effectively automates data pipelines that are flexible to your needs. With integration with 150+ Data Sources (40+ free sources), we help you not only export data from sources & load data to the destinations but also transform & enrich your data, & make it analysis-ready.

Start for free now!

Get Started with Hevo for Free

Why PostgreSQL Replication Slots are Necessary?

PostgreSQL Replication Slots

If you have a master and some hot or archiving standbys, and you are running streaming replication between them, a replication slot is used to retain the WAL files even when the replica is offline or disconnected.

The ability to connect to the server and conduct read-only queries while it is in archive recovery or standby mode is referred to as “hot standby.” This is important for replication as well as restoring a backup to a specific state with exceptional accuracy. The phrase “Hot Standby” also refers to the server’s capacity to transition from recovery to normal operation while users continue to run queries and/or maintain open connections.

This feature was introduced in PostgreSQL 9.4. Prior to that, if a standby went offline until it came back online, a set of WAL files had to be kept by the master. This is done by setting the wal_keep_segment correctly or high enough to keep the replica from falling behind too much.

If this is not done, then the standby can’t keep up with the master after the master deletes the WAL files that the standby is yet to replay and you will get an error.

ERROR: requested WAL segment 000000490000067B00000073 has already been removed

The problem with setting the correct wal_keep_segment is, it is difficult to know exactly how many WAL files need to be kept. This is guesswork and if it is set too high, the master will exhaust the space in the pg_wal directory (WAL files are stored in pg_wal directory. Prior to version 10, they were stored in the pg_xlog directory).

Alternatives to the above method are either WAL archiving which means you have to write a script to move the WAL files to another long term location or create the standby that has fallen behind again from scratch.

After replication slots came in with PostgreSQL 9.4, you don’t have to worry about wal_keep_segments because replication slots make sure that the WAL files are kept forever. If it so happens that the standby goes offline, the master can keep track of how much the standby lags and retain the WAL it needs files until the standby reconnects again. Then the WAL files can be decoded and replayed to the replica.

Types of PostgreSQL Replication Slots

PostgreSQL Replication Slots: Types

PostgreSQL Replication slots are of two types:

  • Physical PostgreSQL Replication Slots
  • Logical PostgreSQL Replication slots

Physical PostgreSQL Replication Slots

The changes that take place on the main server via streaming replication are recorded in the WAL segments. These WAL files are sent to the standby server and then replayed.

So a physical replication slot can be created on the primary server and the location up to where the transactions have been sent to the standby is stored. Now when the standby loses connection, the primary server will keep those WAL files.

Logical PostgreSQL Replication Slots

Logical replication was introduced in PostgreSQL 10. Logical replication brings over only the SQL-like changes. It does not work without replication slots. Logical replication data has to be decoded using a plugin.

Hevo takes care of setting up WAL for Logical Replication for your pipeline. The steps for the same can be found in this documentation.

How to Create PostgreSQL Replication Slots?

The function pg_create_physical_replication_slot is used to create a physical replication slot. This command has to be run in the master node.

Now, you can create a replication slot by the name ‘ocean’.

postgres=# select pg_create_physical_replication_slot(‘ocean’);

How to Monitor PostgreSQL Replication Slots?

The below command displays all the replication slots that exist on the database cluster.

postgres=# select * from pg_replication_slots;

These are the columns that you will see in the pg_replication_slots view:

  • slot_name: This is a unique identifier of the replication slot which can contain lower-case letters, underscore characters, and numbers.
  • plugin: For physical slots, it will be null.
  • slot_type: Slot_type is a text indicating whether the slot is physical or logical.
  • datoid: Physical slots have no associated databases and hence is null. For logical slots, it will be the OID of the database this slot is associated with.
  • active: This is a Boolean value. It is True if the slot is currently active and is False if it is inactive.
  • xmin: This represents the oldest transaction this slot requires the database to keep.
  • catalog_xmin: This is the oldest transaction needed by the database to retain that affects the system catalogs.
  • restart_lsn: Log Sequence Number (LSN) is a unique identifier in the transaction log. Restart_lsn is the oldest WAL which might be needed by this slot.

How to Drop PostgreSQL Replication Slots?

You already know that replication slots have to be created and deleted manually. Do not let the inactive slots stay because the master will retain the WAL files needed by the inactive slot indefinitely and will fill up space in the disk.

This is the command used to disable a replication slot:

postgres=# select pg_drop_replication_slot(‘ocean’);

If you are using an automated data pipeline platform like Hevo Data, you will receive end-to-end documentation support for manually dropping the replication slot in some deleted Pipelines with PostgreSQL as the Source.

Replication Slots Configuration

In PostgreSQL, data replication can be dependable and effective when replication slots are activated. You may enable the required permissions for handling replication slots by giving the REPLICATION role to the user.

These slots are essential to the replication process by allocating space on the server to hold replicated data before it is consumed. Replication slots need to be properly configured for maintenance of this setup.

The user has to be granted the REPLICATION role to configure replication slots. You can use the Postgres Create Replication Slot command to grant a REPLICATION role to a user.

CREATE ROLE <replication_role_name> REPLICATION LOGIN;

Replace replication_role_name with a name of your preference.

If it doesn’t already exist, use the following SQL statement once the role has been created to create a new login for Log-Based extraction.

CREATE USER <database> WITH <replication_role_name>;

System Variable Requirements

Replication slot implementation calls for manual configuration. You need to be aware of the variables that need to be changed and stated in your postgresql.conf file.

  • max_replication_slots: If this parameter is set to 0, replication slots are not enabled at all. You are required to specify a slot other than 0 (default) if you are using PostgreSQL versions less than 10 and 10 for PostgreSQL 10. This variable indicates how many replication slots are allowed. The server won’t start if you set the variable to a value less than the total number of available replication slots.
  • wal_level: Must be a replica or higher (a replica is a default). Setting hot_standby or archive will map to a replica. A replica is sufficient for a physical replication slot. Logical replication slots are recommended.
  • max_wal_senders: set to 10 by default; 0 in the 9.6 version indicates replication is disabled. It is recommended to set this to at least 16.
  • hot_standby: must be enabled in versions less than 10, as it is disabled by default. This is critical for standby nodes because when enabled, you can connect and run queries during recovery or standby mode.
  • primary_slot_name:  This variable is configured in recovery.conf on the standby node. This is the slot that the receiver or standby node will utilize when connecting to the sender (or primary).

It is important to remember that most of these variables require a database service restart to reload new values.

Advantages and Disadvantages of Replication Slots

Once activated, replication slots are quite valuable. “Replication Slots” are disabled by default and must be enabled explicitly. The following are some of the benefits of employing Replication Slots:

  • Ensures that the master has adequate WAL segments for all clones.
  • It prevents the master from removing rows that can trigger a replica recovery conflict.
  • Only when all clones have devoured the transaction log may the master recycle it. The benefit is that a slave can never fall so far behind that a re-sync is required.

There are certain disadvantages to replication slots, too:

  • If a replica is disconnected for an extended period, the replication slot on the master retains WAL files, potentially leading to increased storage requirements.
  • The need to monitor replication slots, especially in environments with multiple replicas, can add complexity to the management of PostgreSQL replication.
  • The WAL files are retained by the master when the replica disconnects. This also means that the pg_wal directory may run out of space. Imagine a scenario where the replica fails forever and cannot be recovered (an orphaned replication slot) or when a replica cannot replay the WAL segments fast enough.
  • You need to monitor the slots and manually drop them. Only when you do that will the master delete anything from the pg_wal directory. You will see how you can monitor and drop replication slots later. So, while WAL retention is taken care of without manual settings, pg_wal directory space needs to be manually monitored.

Conclusion

The article introduced you to PostgreSQL replication slots, their purpose, and their limitations. However, when it comes to working with replication, you need to be an expert at PostgreSQL to set up servers from scratch and manually configure several details. Moreover, most of the time, the data is not available in the right format and you will need data engineering and PostgreSQL administration skills to transform the data.

This is where Hevo comes into the picture. With Hevo, you can seamlessly perform PostgreSQL Replication in real-time. Furthermore, Hevo’s fault-tolerant architecture ensures a consistent and secure replication of your PostgreSQL data. Using Hevo will make your life easier and make Data Replication hassle-free.

Visit our Website to Explore Hevo

Want to take Hevo for a spin? Sign Up for a 14-day free trial and experience the feature-rich Hevo suite first hand. Check out our unbeatable Hevo Pricing to select the right plan for you!

Share your thoughts on PostgreSQL Replications Slots in the comments below!

Nikhil Annadanam
Freelance Technical Content Writer, Hevo Data

Nikhil specializes in freelance writing within the data industry, delivering informative and engaging content related to data science by blending his problem solving ability.

Get Started with Hevo