One of the primary responsibilities of the DBA is to create regular backups. However, maintenance of the database is not limited to this.
The mechanisms that enable a relational database to run fast are based on a complex storage system that degrades over time. It leads to a general database slowdown.
What does PostgreSQL Database Maintenance Consist of?
The following commands will help keep the PostgreSQL database in good shape:
- ANALYZE
- VACUUM
- REINDEX
These commands will be discussed in more detail below.
ANALYZE
When building a query, the query optimizer relies on data distribution statistics. As data changes, these statistics become outdated.
The ANALYZE command updates the data statistics. This command does not scan all the data but makes a random statistically significant sample, so this command is not particularly time consuming.
It makes sense to run ANALYZE manually either after bulk-adding/updating/deleting, for example, after import or synchronization. Or after running a VACUUM command. It is considered good practice to combine VACUUM with ANALYZE.
Use cases:
Running ANALYZE on a single column
ANALYZE [table-name].[column-name]
For a separate table, with the VERBOSE option to track the progress of the command
ANALYZE VERBOSE [table-name]
For the entire database, type
ANALYZE
VACUUM
When executing DELETE and UPDATE commands, PostgreSQL does not physically delete the data but only marks it as deleted. It leads to a disproportionate increase in the database size because the deleted data is still on disk. The VACUUM command solves this problem.
There are two types of this command:
- VACUUM – obsolete data in the database is marked as writable. This command will not reduce the size of the database file but allows you to reuse the space.
- VACUUM FULL – the database file is completely overwritten. It results in reducing the size of the database as small as possible. However, this command takes a lot of time and completely blocks the table.
Use cases:
Running VACUUM on a specific table
VACUUM(ANALYZE) <table-name>
Running VACUUM FULL with the VERBOSE option to track the progress of the cleanup. Please note that this command blocks the database
VACUUM(FULL, VERBOSE)
Running for the entire database
VACUUM(ANALYZE)
When running VACUUM, keep in mind that this command places a significant load on the database.
REINDEX
Over time, blank or nearly blank pages form in the indexes. To optimize the index and reduce the space it takes up, you can run a REINDEX command from time to time. Also, REINDEX can fix a broken index.
The REINDEX command completely blocks read access to the table. As a rule, it is executed manually if there is suspicion of index corruption or as one of the measures to increase the speed of queries.
Use cases:
REINDEX can be applied both to the entire database and to a table or a specific index. To rebuild the index named my_index, use the following command
REINDEX INDEX my_index;
To rebuild all indexes in a table
REINDEX TABLE my_table;
And to rebuild all indexes in the database
REINDEX DATABASE broken_db;
About the AUTOVACUUM Daemon
PostgreSQL has an AUTOVACUUM Daemon that automatically executes a VACUUM(ANALYZE) command when the number of dead records exceeds a threshold. More details about the execution principles are available here.
However, relying on AUTOVACUUM alone may not be the best solution. The command places a significant load on the hard drive. If VACUUM starts at the time of peak load on the database, it will lead to significant delays.
What and When to Run?
You can run VACUUM and ANALYZE commands once a day at the time of minimum load on the server, for example, at night. It will prevent them from being automatically executed by the AUTOVACUUM Daemon during peak loads.
It makes sense to run REINDEX command manually when there are index issues.
How to Automate VACUUM ANALYZE
All the commands described above are SQL queries that can be run using the psql utility from the console. In its simplest form, this command looks like this
psql -u postgres -p[password] -d [my-database] -q ‘VACUUM(ANALYZE)’
In Linux, you can schedule a regular run of the console command via cron
00 00 * * * psql -u postgres -p[password] -d [my-database] -q ‘VACUUM(ANALYZE)’
In Windows, this can be done using the Windows Schedule, as follows:
Press win+r, type taskschd.msc
Select Create Task
From the Triggers tab, add execution schedule
From the Action tab, add the following data into the program field, specify psql in add argument field:
-u postgres -p[password] -d [my-database] -q ‘VACUUM(ANALYZE)’
How to Automate Maintenance Jobs with SqlBak
The SqlBak service is focused primarily on the regular creation of backups. However, it also allows you to automate database maintenance.
Compared to automation through standard operating system utilities, SqlBak has several advantages:
- Email notifications
- Job history
- Running .sh or .bat scripts
To automate PostgreSQL maintenance using SqlBak follow these steps:
Install a thin client on the server and connect the database. Connection details are available here.
Create a maintenance job.
Add SQL script.
VACUUM ANALYZE
Specify your email (optional) to receive job execution updates.
Start the job.