The Importance of Backups
MySQL Backup Tools
It’s More Than Just Running the Backup Command
Where to Store Backups
Backup Time and Frequency
Testing Backups Routinely
Automating the Process
The Importance of Backups
Backups are an important part of database administration and process management. Developing a robust and sophisticated backup strategy can help keep your database safe from a myriad of potential problems.
Because the list of possible pitfalls is overwhelming, we generally cannot assume the reason why our database got broken. After the breakdown, the database can find itself in two possible states:
- The database is corrupted and won’t start. Typically, this is a consequence of hardware issues that have resulted in partially or completely damaged database files. This is not only possible when a meteorite hits your server, it usually happens due to errors in data storage devices.
- The database is starting up, but invalid data was written into it. This can be due to a human factor such as accidentally deleting a table. Or it can be the result of an error in the software that works with the database – say a programmer wrote a code that, instead of deleting data from a particular day, deletes data for the entire year.
The first case can be quickly resolved if you have a hot standby server. However, both cases can be solved only if you have a backup with correct data for every day (backup-point-in-time).
So even if your budget allows you to set up master-slave replication and keep a standby server running at all times, you still need to back up your databases regularly.
MySQL Backup Tools
A backup is a file or a group of files that contain information which allows you to restore data in a database. There are three different ways to back up your data in MySQL. Each has its own pros and cons.
mysqldump is a utility that is part of MySQL Server. It generates SQL script that can recreate a database. The main advantage of this backup method is that the backup is created in an open editable format.
The main drawback of the method is the speed when working with large databases. Creating a backup and restoring a database from a backup takes a while. It should also be noted that in order to implement incremental backups, you will need to additionally backup bin log files.
Xtrabackup is an open-source utility developed by percona. Xtrabackup makes a backup by copying the database files. The distinctive feature of Xtrabackup is that you do not need to stop MySQL instance (hot backup), you only need to stop it to perform the restore operation.
This backup is much faster than mysqldump. Xtrabackup allows you to create incremental and differential backups as well as particular backups. However, unlike mysqldump, the backup file cannot be opened and edited in a text editor.
MySQL Enterprise Backup is a hot backup utility developed by Oracle. It is largely identical to Xtrabackup. Detailed feature comparison of the two is available here.
It’s More Than Just Running the Backup Command
The command that creates the backup is just a step in a number of actions required to secure your data. To be on the safe side, after creating a backup, you need to perform a few more steps.
Backup Compression
Normally a backup compresses very well. It is always advisable to perform compression, as this saves space in the storage, and also reduces the load on the network when transferring a backup file. You can compress a backup either by streaming or after the backup is created.
Backup Encryption
Most likely, the data in your database is protected at least by a password. However, if someone takes possession of the backup file, then they will have access to all the data. Therefore, the backup file must be encrypted before being sent to storage. The easiest way to do this is to set a password during file compression.
Transferring the File to Storage
Keeping a single copy of your backup along with the MySQL instance on the same server is a bad idea, because in the event of a disk failure, you will lose both the MySQL server and the backups. A backup must be moved away from the server where it was created. The best option is to have multiple copies of backups in different locations.
Deleting Outdated Backups
Wherever you store your compressed backups, they must be deleted. Otherwise, over time, they may exhaust the available free space, and this, in turn, may make it impossible to save new backups to the storage.
Failure Notifications
It is unlikely that you will perform these steps manually over and over. In one way or another, you will have to automate the process, containing all the steps described above. But at each of these steps, errors may occur. In this case, a notification must be sent to the DBA’s email so that they analyze the issue and fix it.
Where to Store Backups
The easiest way is to transfer a backup file via FTP to another server or NAS. However, in the era of cloud storage, you can delegate the storage of backups to one of the cloud providers. It’s no harder than working with FTP. Most large cloud storages support console commands for transferring and working with data in the cloud.
Cold storage space is usually very cheap. The most popular storage options are AWS S3, Azure Storage and Google Cloud Storage, but there are many options on the market that you might find work better for you.
When choosing cloud storage, consider not only the price, but also the speed of data exchange between your server and cloud storage. As a rule, large vendors support multiple endpoints, allowing high-speed access to them worldwide.
Backup Time and Frequency
In a nutshell – the more often, the better. However, the backup process consumes server resources and backups consume storage space, so a compromise must be found. Incremental backups can allow you to back up more often, and they take up less space. However, even with incremental backups, you need to make full backups regularly.
When deciding the backup frequency and timing, one has to account for various factors. The include the maximum tolerable downtime hours for data recovery, the intensity of the server load, the time and how often the software that works with the database is updated and the working hours of employees that could potentially break something in the database.
Most companies run backups every night.
Testing Backups Routinely
Perhaps the most important thing that people tend to forget is to regularly check the recovery script. The automated backup process with failure notifications is decent. However, the main task of the backup is not just to put the backup file in cloud storage, but also to have a copy of the database that can be restored. Try to restore your database to a different server from time to time, and make sure that the data is restored exactly the way it should be.
Automating the Process
You can create a backup and send it to the storage manually. However, you should not rely on doing it manually every day. You can forget or get sick, and your time is worth money. The process of creating a backup can and should be automated.
There are two notable ways to approach this.
Creating a Number of Scripts to Automate the Process
Thanks to very powerful shell scripts, process automation in Linux is not the most difficult task. By creating a bash script, you can have in place the solution that suits you. An example of such a script can be found here.
You might face difficulties if you need to send a backup to cloud storage that has no client for Linux. Also, when you have multiple servers you have to spend time supporting the scripts you have written, run regular tests and make sure that backups are created and restored consistently.
Using Third Party Utilities
Although you can automate literally everything with a bash script, in practice it will take time to implement a working, transparent and properly tested solution. Therefore, it makes sense to consider third-party utilities that implement the entire process.
For example, the SqlBak service can create, compress, encrypt and transfer a backup to cloud storage automatically. It will also test the recovery process by performing a recovery on another server and email the result and the execution log to you.