Guide to Using pg_standby

What is pg_standby?

pg_standby is a command-line utility included with PostgreSQL that supports the creation of a warm standby server. It is used as a restore_command to manage the restoration of Write-Ahead Logs (WAL) from a backup during recovery or when setting up a standby server. This utility allows a PostgreSQL server to remain in continuous recovery mode, supporting high availability and read-only queries.

Use Cases for pg_standby

  • High Availability: Maintaining a warm standby server that can be quickly promoted to a primary server in case of a failure.
  • Read-Only Query Load Balancing: Distributing read-only query load across multiple servers to improve performance.
  • Disaster Recovery: Ensuring data is replicated to a standby server, which can be promoted if the primary server is lost.

How to Use pg_standby

The pg_standby utility is typically used within the recovery.conf file (for PostgreSQL versions prior to 12) or postgresql.conf and standby.signal files (from PostgreSQL 12 onwards) as the restore_command. It is invoked by the PostgreSQL server during recovery or standby mode.

Here’s the basic syntax for the pg_standby command:

pg_standby [OPTIONS]... <archiveLocation> <WALFileName> <recoveryTarget>

Examples of pg_standby in Action

Example 1: Basic Setup

To use pg_standby as the restore_command, add the following line to your recovery.conf (or postgresql.conf for PostgreSQL 12 and later):

restore_command = 'pg_standby /path/to/wal_archive %f %p'

In this example, /path/to/wal_archive is the directory where archived WAL files are stored, %f represents the name of the WAL file to restore, and %p is the path to copy the WAL file to.

Example 2: Using a Trigger File

You can specify a trigger file that, when present, will end recovery and promote the standby server to a primary server:

restore_command = 'pg_standby -t /path/to/trigger_file /path/to/wal_archive %f %p'

Creating the trigger file (touch /path/to/trigger_file) will promote the standby server.

Example 3: Delaying Recovery

To delay recovery by a certain number of seconds to prevent the standby server from being too close to the primary server’s state, use the -d option:

restore_command = 'pg_standby -d 300 /path/to/wal_archive %f %p'

This delays recovery by 5 minutes (300 seconds).

Common Mistakes and Issues

  • Permission Errors: Ensure that the user running the PostgreSQL server has read and write permissions to the archiveLocation.
  • Incorrect Path: The path provided to pg_standby must be the correct location where the WAL files are archived.
  • Trigger File Mismanagement: Be careful with the creation and deletion of the trigger file. Creating it prematurely can cause an unwanted failover.
  • WAL Retention: Ensure that WAL files are retained in the archive for as long as they may be needed by the standby server. Pruning them too soon can break replication.

Troubleshooting and Errors

  • WAL File Not Found: If pg_standby reports that a WAL file is missing, ensure that your archive command is successfully copying WAL files to the specified archive location.
  • Connection Issues: If the standby server cannot connect to the primary server, check network connectivity, firewall settings, and pg_hba.conf configurations.
  • Version Mismatch: Make sure that the primary and standby servers are running the same version of PostgreSQL, as WAL formats can differ between versions.

By using pg_standby, you can create a robust standby system that enhances the availability of your PostgreSQL databases. However, it’s essential to monitor your standby servers, manage WAL archives properly, and test your failover procedures to ensure that your high availability setup functions correctly when needed.

Leave a Comment