What is pg_standby
?
pg_standby
is a command-line utility included with PostgreSQL that supports the creation of a warm standby server. It is used as a restore_command
to manage the restoration of Write-Ahead Logs (WAL) from a backup during recovery or when setting up a standby server. This utility allows a PostgreSQL server to remain in continuous recovery mode, supporting high availability and read-only queries.
Use Cases for pg_standby
- High Availability: Maintaining a warm standby server that can be quickly promoted to a primary server in case of a failure.
- Read-Only Query Load Balancing: Distributing read-only query load across multiple servers to improve performance.
- Disaster Recovery: Ensuring data is replicated to a standby server, which can be promoted if the primary server is lost.
How to Use pg_standby
The pg_standby
utility is typically used within the recovery.conf
file (for PostgreSQL versions prior to 12) or postgresql.conf
and standby.signal
files (from PostgreSQL 12 onwards) as the restore_command
. It is invoked by the PostgreSQL server during recovery or standby mode.
Here’s the basic syntax for the pg_standby
command:
pg_standby [OPTIONS]... <archiveLocation> <WALFileName> <recoveryTarget>
Examples of pg_standby
in Action
Example 1: Basic Setup
To use pg_standby
as the restore_command
, add the following line to your recovery.conf
(or postgresql.conf
for PostgreSQL 12 and later):
restore_command = 'pg_standby /path/to/wal_archive %f %p'
In this example, /path/to/wal_archive
is the directory where archived WAL files are stored, %f
represents the name of the WAL file to restore, and %p
is the path to copy the WAL file to.
Example 2: Using a Trigger File
You can specify a trigger file that, when present, will end recovery and promote the standby server to a primary server:
restore_command = 'pg_standby -t /path/to/trigger_file /path/to/wal_archive %f %p'
Creating the trigger file (touch /path/to/trigger_file
) will promote the standby server.
Example 3: Delaying Recovery
To delay recovery by a certain number of seconds to prevent the standby server from being too close to the primary server’s state, use the -d
option:
restore_command = 'pg_standby -d 300 /path/to/wal_archive %f %p'
This delays recovery by 5 minutes (300 seconds).
Common Mistakes and Issues
- Permission Errors: Ensure that the user running the PostgreSQL server has read and write permissions to the
archiveLocation
. - Incorrect Path: The path provided to
pg_standby
must be the correct location where the WAL files are archived. - Trigger File Mismanagement: Be careful with the creation and deletion of the trigger file. Creating it prematurely can cause an unwanted failover.
- WAL Retention: Ensure that WAL files are retained in the archive for as long as they may be needed by the standby server. Pruning them too soon can break replication.
Troubleshooting and Errors
- WAL File Not Found: If
pg_standby
reports that a WAL file is missing, ensure that your archive command is successfully copying WAL files to the specified archive location. - Connection Issues: If the standby server cannot connect to the primary server, check network connectivity, firewall settings, and
pg_hba.conf
configurations. - Version Mismatch: Make sure that the primary and standby servers are running the same version of PostgreSQL, as WAL formats can differ between versions.
By using pg_standby
, you can create a robust standby system that enhances the availability of your PostgreSQL databases. However, it’s essential to monitor your standby servers, manage WAL archives properly, and test your failover procedures to ensure that your high availability setup functions correctly when needed.