Postgres Vacuum and AutoVacuum.

Basics of vacuum

Postgres implements multi-version concurrency control (MVCC) by keeping old versions of changed tuples instead of actually deleting them. Over time, keeping all of those out-of-date versions becomes a big burden in terms of storage and performance, and you end up with bloated tables and indexes. If not dealt with, they would eventually fill up your disks, but they would probably make the database unusable before then. So we have a handy process to clean it all up: vacuum. Postgres vacuum goes through your tables and indexes and cleans out dead tuples, that is, tuples that can no longer be needed by any transaction.

Vacuum can be run with a number of parameters.

VACUUM with no parameters runs on every table in the current database that the user has permission to vacuum.

VACUUM ANALYZE runs a vacuum and then runs an analyze on each table to update the statistics for the optimizer.

Vacuum on its own will clear out the dead tuples, but because of the way a table is written to disk, the freed space will not be available to the operating system or to other tables. Only the table that already owned the space will be able to reuse it. The other option is to run VACUUM FULL. This takes an exclusive lock on the table and rewrites it to a new file so that no unnecessary space is used, which frees the space for other objects to use. You have to be careful with VACUUM FULL, as locking the table means that nothing else can access it while it runs. This could easily cause an outage.
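The difference between the two can be pictured with a toy model. This is only a sketch: the `ToyTable` class and its slot list are made up for illustration and are far simpler than the way Postgres actually lays out pages on disk.

```python
class ToyTable:
    """Toy model of a table file: a list of slots marked 'live', 'dead' or 'free'."""

    def __init__(self, slots):
        self.slots = list(slots)

    def vacuum(self):
        # Plain vacuum: dead slots become free for this table to reuse,
        # but the file keeps its size.
        self.slots = ["free" if s == "dead" else s for s in self.slots]

    def vacuum_full(self):
        # Vacuum full: copy only the live slots into a new, smaller file.
        self.slots = [s for s in self.slots if s == "live"]

    def insert(self):
        # A new row reuses a free slot if there is one; otherwise the file grows.
        if "free" in self.slots:
            self.slots[self.slots.index("free")] = "live"
        else:
            self.slots.append("live")


table = ToyTable(["live", "dead", "dead", "live"])
table.vacuum()
print(len(table.slots))  # 4: same file size, two slots now reusable
table.insert()
print(len(table.slots))  # 4: the new row went into a freed slot
table.vacuum_full()
print(len(table.slots))  # 3: the rewritten file holds only live rows
```

In this model a plain vacuum never shrinks the file; it only stops it from growing as fast, which is why a table that is already badly bloated stays large until it is rewritten.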

Why do you need to vacuum your Postgres database?

As mentioned above, every update causes a new row version to be written rather than changing the actual data on disk, and deleting does not actually remove rows. So over time your tables and indexes become bloated. This uses unnecessary disk space and also harms performance, as there is more data to be sifted through for each query.
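One rough way to see how much of a table is dead weight is to compare the live and dead tuple counts that Postgres tracks in the pg_stat_user_tables view. The sketch below only does the arithmetic; the table names and row counts are made up, and running the query against a real database is left out.

```python
# The counters come from the pg_stat_user_tables statistics view.
# They are estimates maintained by Postgres, not exact counts.
STATS_QUERY = "SELECT relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables"


def dead_fraction(n_live_tup, n_dead_tup):
    """Return the share of a table's tuples that are dead (0.0 for an empty table)."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0


# Hypothetical rows, shaped like the result of STATS_QUERY.
rows = [("orders", 9000, 1000), ("customers", 500, 0)]
for relname, live, dead in rows:
    print(f"{relname}: {dead_fraction(live, dead):.1%} dead")
```

A table whose dead fraction keeps climbing between checks is a sign that vacuum is not keeping up with the update and delete traffic on it.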

The vacuum process is so important to the Postgres database that you shouldn't leave it to manual intervention. Postgres has the autovacuum process to take care of it most of the time.

The autovacuum daemon works in the background to vacuum the databases on a server. It consists of several processes. The autovacuum launcher starts the autovacuum worker processes, aiming to start one worker in each database every autovacuum_naptime seconds. That means that if you have more than one database in a cluster, a new worker is started roughly every autovacuum_naptime / num dbs seconds, and each database gets visited about once every autovacuum_naptime seconds.
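The naptime arithmetic can be sketched as follows. This assumes the launcher spreads its worker starts evenly across the databases; the function name and the example numbers are illustrative only.

```python
def launch_interval(naptime_seconds, num_databases):
    """Seconds between worker starts, assuming starts are spread evenly."""
    if num_databases < 1:
        raise ValueError("need at least one database")
    return naptime_seconds / num_databases


# With a naptime of 60 seconds and 4 databases, a worker starts about
# every 15 seconds.
print(launch_interval(60, 4))  # 15.0
```

So adding databases to a cluster makes the launcher start workers more often, which is worth keeping in mind on clusters with a large number of databases.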

When an autovacuum worker is spawned, it checks each table in its database and vacuums the ones that need it. The autovacuum_max_workers parameter sets the maximum number of workers that can be running at one time. This means that if there are several large tables, all the workers can get tied up vacuuming those tables while the other tables wait.
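A small simulation shows how a few large tables can hold up everything else. It is a simplified model, not how Postgres schedules work internally: it assumes tables are handed out in list order to whichever worker frees up first, and the durations are invented.

```python
import heapq


def start_times(durations, max_workers):
    """Return when each job starts if at most max_workers run at once.

    Jobs are taken in list order and given to the worker that frees up first.
    """
    free_at = [0.0] * max_workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    starts = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        starts.append(start)
        heapq.heappush(free_at, start + duration)
    return starts


# Three large tables (60 minutes each) followed by one small table (1 minute),
# with 3 workers allowed.
print(start_times([60, 60, 60, 1], max_workers=3))  # [0.0, 0.0, 0.0, 60.0]
```

The small table needs only a minute of work, but in this model it cannot start until one of the large tables finishes an hour later.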
