Backing up and Restoring Postgres using pg_basebackup

There are several great tools available that handle backing up and managing the backups of your PostgresQL database. It is really important to understand the underlying process that these tools use though as well as the standard postgres commands are that you would need to run in case you ever need to do it manually.

There are 2 types of backups that you can take in Posgres, logical and physical. Logical backups are in the form if the SQL statements necessary to recreate the database (not necessarily in a human readable form). The 2 tools to take logical backups are pg_dump and pg_dumpall.

Physical backups are an exact copy of what is stored, in this case it is a copy of the actual file system as stored on disk.

A physical backup lacks some of the flexibility of the logical one (until you restore it). The advantage is that you can restore very quickly – the backup can almost be started up exactly as it is taken as it is a copy of the files. Also, you can restore to a specific point in time (PITR) when you also store the Wal files.

We use a tool called pg_basebackup to take these backups. YOu can fine the documentation on the command here: https://www.postgresql.org/docs/10/app-pgbasebackup.html

Taking a pg_basebackup is very simple, you simply issue the command and provide the data directory path of the instance that you want to back up:

pg_basebackup -h mydbserver -D /usr/local/pgsql/data

There are various options that you can specify such as storing the backup in another location or compressing the backup. The above command is the core of what you need though.

Restoring a backup that was taken with pg_basebackup.

The backup taken with pg_basebckup is an exact copy of your data directory so, all you need to do to restore from that backup is to point postgres at that directory and start it up.

SO, assuming that you wanted to restore the current system to that backup, you would first stop postgres, then replace your current data directory with the backup and then start it back up. Your database would then be as it was when the backup was taken.

Restoring the a Point in Time. (PITR)

The above is an overly simplistic example for a real world system. More likely, you would have an exact or rough point in time that you would like to recover to.

This is possible by combining the pg_basebackup with also archiving and keeping all of your WAL files.

Assuming you are archiving your WAL and have the files, it is easy to restore to any point in time that you have covered.

You would start this restore in the same way. You stop postgres, create your data directory from the backup but the you would perform a few extra steps:

First make sure that the WAL archive location is available to your server.

Create a recovery.conf file in the data directory and put the following option in there:

restore_command = 'cp /path/to/archive/%f "%p"'
recovery_target_time = '2019-09-01 08:10:00 UTC'
recovery_target_inclusive = false

The restore command is simply a copy command that lets postgres know how to get the wal files back into its pg_xlog directory. It will work out which ones that it needs.

The recovery_target_time is fairly self explanitory, it is when you would like to recover to.

recovery_target_inclusive – It is almost impossible that you would be able to recover to the exact time that you specify. If this setting is false then you would recover to the closest time before and if it is true then it would be the closest time after.

Once this is all set, you simply start Postgres up and assuming everything is correct, it will start up for you.

Leave a Comment