A PostgreSQL replication slot is a vital feature ensuring the master server retains necessary Write-Ahead Log (WAL) files, even when replicas are temporarily disconnected. In streaming replication scenarios with hot or archiving standbys, replication slots preserve WAL files, allowing the master to track standby lag. When a standby reconnects, preserved WAL files are decoded and applied.
This article emphasizes the significance of PostgreSQL replication slots, covers the types, and provides commands for the creation, deletion, and monitoring of Postgres replication slots. Dive in to discover more about Postgres Replication Slot and implement them effectively!
What are Replication slots?
A replication slot is a PostgreSQL feature that ensures the master server keeps the WAL logs required by replicas even when they are disconnected from the master.
When streaming replication is utilized between a master and some hot or archiving standbys, a replication slot is needed to keep the WAL files alive even when the replica is offline or disconnected.
If the standby goes down, the master can keep track of how much the standby lags and preserve the WAL files it requires until the standby reconnects. The WAL files are then decoded and played back on the duplicate.
Why PostgreSQL Replication Slots are Necessary?
If you have a master and some hot or archiving standbys, and you are running streaming replication between them, a replication slot is used to retain the WAL files even when the replica is offline or disconnected.
The ability to connect to the server and conduct read-only queries while it is in archive recovery or standby mode is referred to as “hot standby.” This is important for replication as well as restoring a backup to a specific state with exceptional accuracy. The phrase “Hot Standby” also refers to the server’s capacity to transition from recovery to normal operation while users continue to run queries and/or maintain open connections.
This feature was introduced in PostgreSQL 9.4. Prior to that, if a standby went offline until it came back online, a set of WAL files had to be kept by the master. This is done by setting the wal_keep_segment
correctly or high enough to keep the replica from falling behind too much.
If this is not done, then the standby can’t keep up with the master after the master deletes the WAL files that the standby is yet to replay and you will get an error.
ERROR: requested WAL segment 000000490000067B00000073 has already been removed
The problem with setting the correct wal_keep_segment
is, it is difficult to know exactly how many WAL files need to be kept. This is guesswork and if it is set too high, the master will exhaust the space in the pg_wal
directory (WAL files are stored in pg_wal
directory. Prior to version 10, they were stored in the pg_xlog
directory).
Alternatives to the above method are either WAL archiving which means you have to write a script to move the WAL files to another long term location or create the standby that has fallen behind again from scratch.
After replication slots came in with PostgreSQL 9.4, you don’t have to worry about wal_keep_segments
because replication slots make sure that the WAL files are kept forever. If it so happens that the standby goes offline, the master can keep track of how much the standby lags and retain the WAL it needs files until the standby reconnects again. Then the WAL files can be decoded and replayed to the replica.
Types of PostgreSQL Replication Slots
PostgreSQL Replication slots are of two types:
- Physical PostgreSQL Replication Slots
- Logical PostgreSQL Replication slots
1. Physical PostgreSQL Replication Slots
The changes that take place on the main server via streaming replication are recorded in the WAL segments. These WAL files are sent to the standby server and then replayed.
So a physical replication slot can be created on the primary server and the location up to where the transactions have been sent to the standby is stored. Now when the standby loses connection, the primary server will keep those WAL files.
2. Logical PostgreSQL Replication Slots
Logical replication was introduced in PostgreSQL 10. Logical replication brings over only the SQL-like changes. It does not work without replication slots. Logical replication data has to be decoded using a plugin.
Hevo takes care of setting up WAL for Logical Replication for your pipeline.
How to Create Replication Slot PostgreSQL?
The function pg_create_physical_replication_slot is used to create a physical replication slot. This command has to be run in the master node.
Now, you can create a replication slot by the name ‘ocean’.
postgres=# select pg_create_physical_replication_slot(‘ocean’);
Sync Data from PostgreSQL to BigQuery
Integrate PostgreSQL to Snowflake
Integrate PostgreSQL to Redshift
How to Monitor Postgres Replication Slots?
The below command displays all the replication slots that exist on the database cluster.
postgres=# select * from pg_replication_slots;
These are the columns that you will see in the pg_replication_slots view:
- slot_name: This is a unique identifier of the replication slot which can contain lower-case letters, underscore characters, and numbers.
- plugin: For physical slots, it will be null.
- slot_type: Slot_type is a text indicating whether the slot is physical or logical.
- datoid: Physical slots have no associated databases and hence is null. For logical slots, it will be the OID of the database this slot is associated with.
- active: This is a Boolean value. It is True if the slot is currently active and is False if it is inactive.
- xmin: This represents the oldest transaction this slot requires the database to keep.
- catalog_xmin: This is the oldest transaction needed by the database to retain that affects the system catalogs.
- restart_lsn: Log Sequence Number (LSN) is a unique identifier in the transaction log.
Restart_lsn
is the oldest WAL which might be needed by this slot.
How to Drop PostgreSQL Replication Slots?
You already know that replication slots have to be created and deleted manually. Do not let the inactive slots stay because the master will retain the WAL files needed by the inactive slot indefinitely and will fill up space in the disk.
This is the command used to disable a replication slot:
postgres=# select pg_drop_replication_slot(‘ocean’);
pg_drop_replication_slot('ocean')
: This is a PostgreSQL function used to remove a replication slot.
- Replication Slot: A replication slot is used to track the changes (such as WAL records) for logical replication, ensuring that a subscriber does not miss any changes.
'ocean'
: The name of the replication slot to be dropped. Replace 'ocean'
with the actual name of the replication slot you want to drop.
If you are using an automated data pipeline platform like Hevo Data, you will receive end-to-end documentation support for manually dropping the replication slot in some deleted Pipelines with PostgreSQL as the Source.
Replication Slots Configuration
In PostgreSQL, data replication can be dependable and effective when replication slots are activated. You may enable the required permissions for handling replication slots by giving the REPLICATION role to the user.
These slots are essential to the replication process by allocating space on the server to hold replicated data before it is consumed. Replication slots need to be properly configured for maintenance of this setup.
The user has to be granted the REPLICATION role to configure replication slots. You can use the Postgres Create Replication Slot command to grant a REPLICATION role to a user.
CREATE ROLE <replication_role_name> REPLICATION LOGIN;
CREATE ROLE <replication_role_name>
: This creates a new role (user) with the name <replication_role_name>
. You would replace <replication_role_name>
with the desired role name (e.g., replication_user
).
REPLICATION
: Grants the role the ability to use replication. This is typically required for setting up replication between primary and standby databases in PostgreSQL.
LOGIN
: Allows the role to log into the database, making it a valid user.
Replace replication_role_name with a name of your preference.
If it doesn’t already exist, use the following SQL statement once the role has been created to create a new login for Log-Based extraction.
CREATE USER <database> WITH <replication_role_name>;
CREATE USER <database>
: This creates a new user with the name <database>
. You would replace <database>
with the desired username (e.g., replication_user
).
WITH <replication_role_name>
: This grants the newly created user the role <replication_role_name>
, which has replication privileges.
- You would replace
<replication_role_name>
with the name of the role that has been granted replication access (e.g., replication_role
).
System Variable Requirements
Replication slot implementation calls for manual configuration. You need to be aware of the variables that need to be changed and stated in your postgresql.conf
file.
max_replication_slots:
If this parameter is set to 0, replication slots are not enabled at all. You are required to specify a slot other than 0 (default) if you are using PostgreSQL versions less than 10 and 10 for PostgreSQL 10. This variable indicates how many replication slots are allowed. The server won’t start if you set the variable to a value less than the total number of available replication slots.
wal_level
: Must be a replica or higher (a replica is a default). Setting hot_standby
or archive will map to a replica. A replica is sufficient for a physical replication slot. Logical replication slots are recommended.
max_wal_senders:
set to 10 by default; 0 in the 9.6 version indicates replication is disabled. It is recommended to set this to at least 16.
hot_standby:
must be enabled in versions less than 10, as it is disabled by default. This is critical for standby nodes because when enabled, you can connect and run queries during recovery or standby mode.
primary_slot_name:
This variable is configured in recovery.conf
on the standby node. This is the slot that the receiver or standby node will utilize when connecting to the sender (or primary).
It is important to remember that most of these variables require a database service restart to reload new values.
Advantages and Disadvantages of Replication Slots
Once activated, replication slots are quite valuable. “Replication Slots” are disabled by default and must be enabled explicitly. The following are some of the benefits of employing Replication Slots:
- Ensures that the master has adequate WAL segments for all clones.
- It prevents the master from removing rows that can trigger a replica recovery conflict.
- Only when all clones have devoured the transaction log may the master recycle it. The benefit is that a slave can never fall so far behind that a re-sync is required.
There are certain disadvantages to replication slots, too:
- If a replica is disconnected for an extended period, the replication slot on the master retains WAL files, potentially leading to increased storage requirements.
- The need to monitor replication slots, especially in environments with multiple replicas, can add complexity to the management of PostgreSQL replication.
- The WAL files are retained by the master when the replica disconnects. This also means that the pg_wal directory may run out of space. Imagine a scenario where the replica fails forever and cannot be recovered (an orphaned replication slot) or when a replica cannot replay the WAL segments fast enough.
- You need to monitor the slots and manually drop them. Only when you do that will the master delete anything from the pg_wal directory. You will see how you can monitor and drop replication slots later. So, while WAL retention is taken care of without manual settings, pg_wal directory space needs to be manually monitored.
Integrate your PostgreSQL data in minutes!
No credit card required
Conclusion
The article introduced you to PostgreSQL replication slots, their purpose, and their limitations. However, when it comes to working with replication, you need to be an expert at PostgreSQL to set up servers from scratch and manually configure several details. Moreover, most of the time, the data is not available in the right format and you will need data engineering and PostgreSQL administration skills to transform the data.
This is where Hevo comes into the picture. With Hevo, you can seamlessly perform PostgreSQL Replication in real-time. Furthermore, Hevo’s fault-tolerant architecture ensures a consistent and secure replication of your PostgreSQL data. Using Hevo will make your life easier and make Data Replication hassle-free. Try a 14-day free trial and experience the feature-rich Hevo suite firsthand. Also, check out our unbeatable pricing to choose the best plan for your organization.
Frequently Asked Questions
1. What is replication slot postgres?
Replication slots are a feature in PostgreSQL that ensure the replication process between a primary database and its replicas is reliable and consistent.
2. How to check replication slot size in PostgreSQL?
You can check the size of a replication slot in PostgreSQL by querying the system catalog tables.
3. What is maximum replication slots in postgres?
The maximum number of replication slots in PostgreSQL is controlled by the ‘max_replication_slots’ configuration parameter.
Nikhil is an accomplished technical content writer with extensive expertise in the data industry. With six years of professional experience, he adeptly creates informative and engaging content that delves into the intricacies of data science. Nikhil's skill lies in merging his problem-solving prowess with a profound grasp of data analytics, enabling him to produce compelling narratives that resonate deeply with his audience.