The script starts by creating a copy of all of the important information in a staging area. Keeping the staging area intact between runs speeds up every backup after the first, because rsync does most of the copying and only transfers what has changed. After the staging area is updated, we use rdiff-backup as our incremental backup system. The rdiff-backup repository can then be copied to other local and offsite locations to increase redundancy in the backup. The output is sent by email every time the backup runs, so we are automatically updated on its status and can correct any errors that may have taken place.
Following is a description of the tools that are used to back up our data, as well as some of the tools that are holding our data. Different sets of data must be backed up in different ways in order to maximize efficiency and reliability.
We use a mixture of standard desktop hardware, server hardware, and virtual dedicated servers for our infrastructure. Services requiring reliability all run on the virtual dedicated servers, which provide uptime guarantees. All of our backups are gathered and stored on two local computers, each with a RAID 1 array, and copied to an offsite computer.
Bash is an interactive shell and a scripting language that is installed by default on Linux, and can be configured on many other operating systems. We use it for the backup system because it makes it easy to call other programs that already do much of the work that the backup requires. This gets us a full backup in less than 100 lines of code. The small amount of code makes it easy to proofread and test for errors.
Rsync is a very common program for copying data from one location to another when bandwidth is limited, or when large amounts of data must be copied. It first checks whether the file exists in the destination and, if it does, whether it is the same file. If the file already exists and is up to date, it is not copied. For our backup, this saves a lot of time since generally only a small number of files change.
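As a minimal sketch of this behavior, the following copies a small directory tree twice. The temporary directory and the file names are made up for illustration; the rsync options shown (-a, --delete, --itemize-changes) are standard ones. On the second run, only the file that changed should be listed as transferred.

```shell
#!/bin/bash
# Illustration only: the directories and files here are throwaway examples.
set -eu

# Skip the demo if rsync is not available on this machine.
command -v rsync >/dev/null 2>&1 || { echo "rsync is not installed; skipping demo"; exit 0; }

work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "one" > "$work/src/a.txt"
echo "two" > "$work/src/b.txt"

# First run: both files are new, so both are copied.
rsync -a --delete "$work/src/" "$work/dst/"

# Change one file. Its size changes, so rsync will notice the difference.
echo "two, changed" > "$work/src/b.txt"

# Second run: --itemize-changes prints one line per item that is updated.
changed=$(rsync -a --delete --itemize-changes "$work/src/" "$work/dst/")
echo "$changed"

copied=$(cat "$work/dst/b.txt")
rm -rf "$work"
```

The trailing slash on the source directory matters: it tells rsync to copy the contents of the directory rather than the directory itself.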
After the staging area has been updated, we use rdiff-backup to keep track of the last 10 backups, which is 10 days of history when the script runs once a day. rdiff-backup copies files from the source to an rdiff-backup repository. Instead of deleting files that have been removed and overwriting files that have changed, it stores a backup of any changes. It is possible to go into the repository later and restore any version that has not been removed.
OpenVZ allows us to quickly and easily set up new virtualized environments, known as containers. OpenVZ containers are not fully virtualized, so they do not require as many resources as traditional virtual machines. Our OpenVZ containers are used to quickly and easily set up testing environments, and to provide some of our services within our office that do not have the uptime requirements of the services running on the offsite virtual dedicated servers.
Logical Volume Management
Logical Volume Management (LVM) is an abstraction layer between the hard drive and the partitions and filesystems on that hard drive. One or more hard drives or partitions, known as physical volumes in LVM terminology, can be combined into a volume group. The volume group can then be divided into multiple logical volumes, which behave like partitions and can be formatted as a normal partition would be. LVM has many advantages over normal partition management, including resizing logical volumes and volume groups while the filesystems are mounted, and moving logical volumes from one physical drive to another, also while the filesystems are mounted. Our backup script takes advantage of the ability to create a snapshot of a filesystem. A snapshot is another logical volume that presents an image of the original logical volume frozen in time. While the snapshot exists, changes to the original logical volume are tracked using space in the volume group that is set aside for the snapshot when it is created, so the snapshot keeps showing the files as they were. Business can continue as normal while we create a backup from a consistent set of files, and we no longer need to worry about someone writing to the files while we are backing them up.
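The snapshot lifecycle can be sketched as follows. The volume and mount point names are the ones the backup script uses later in this article; the run helper and the DRY_RUN switch are hypothetical additions so that the sketch only prints the commands by default and is safe to run on a machine without LVM.

```shell
#!/bin/bash
# Sketch of the snapshot lifecycle. With DRY_RUN=1 (the default here) the
# commands are only printed, not executed.
set -eu

DRY_RUN=${DRY_RUN:-1}
ORIGIN=/dev/filestore/vz            # logical volume to snapshot
SNAP_NAME=vzsnap                    # name of the snapshot volume
SNAP_DEV=/dev/filestore/$SNAP_NAME
MOUNT_POINT=/vzsnap

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

snapshot_backup() {
    # Set aside 2G of the volume group to track changes during the backup.
    run lvcreate -L2G -s -n "$SNAP_NAME" "$ORIGIN"
    run mkdir -p "$MOUNT_POINT"
    run mount "$SNAP_DEV" "$MOUNT_POINT"
    # ... copy files out of $MOUNT_POINT here ...
    run umount "$MOUNT_POINT"
    run lvremove --force "$SNAP_DEV"
}

plan=$(snapshot_backup)
echo "$plan"
```

The snapshot is removed as soon as the copy is finished because it only has a limited amount of space for tracking changes; if that space fills up, the snapshot becomes unusable.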
MySQL is a database that comes with its own set of backup tools to ensure that you get a consistent copy of the database. The mysqldump tool locks the database when necessary so that it does not copy the database while some other program is in the middle of making a change to it.
Subversion is a version control system. We use it for all of our development projects to ensure that each team of developers is working with the most recent version of the code. Because all of our projects use Subversion, we have built up a large collection of code that would take a long time to fully back up each day. We use Subversion tools along with some bash scripting to compare the most recent version of the code in version control with the version that we last backed up, and only back up the changes.
Grep, Sed, and Others
There are many other tools used throughout the script. The two biggest examples are grep, which searches its input for lines matching a regular expression, and sed, a stream editor which takes some input and changes it as specified. If you want to know more about any of these tools, you can look them up in the Linux manual pages, or you can find more information on them using your favorite search engine.
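As a small illustration of the two tools working together, the sketch below cleans up a list of IDs with sed and then searches it with grep. The list is made-up sample data, not output from a real command.

```shell
#!/bin/bash
# Made-up sample data: a right-aligned list of numeric IDs, one per line.
ids='       101
       102
      1010'

# sed edits each line as it streams past; here it strips leading whitespace.
clean=$(echo "$ids" | sed 's/^[[:space:]]*//')

# grep -c counts matching lines. A plain pattern matches anywhere in the
# line, so "101" matches both 101 and 1010.
matches=$(echo "$clean" | grep -c "101")

# grep -x only accepts lines that match the pattern in full, and -q
# suppresses output so that only the exit status is used.
if echo "$clean" | grep -qx "101"; then
    found=yes
else
    found=no
fi

echo "substring matches: $matches, exact match found: $found"
```

The difference between substring and whole-line matching is worth keeping in mind whenever grep is used to look for a numeric ID in a longer listing.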
Following are the variables that are used in this script. Some of the values have been changed for security reasons.
# space separated list of servers that have openvz and lvm set up
CONTAINER_SERVERS="fox walrus"
# server to put a second copy of the backup on, located in MN
SECONDARY_BACKUP="walrus"
# offsite server
OFFSITE_BACKUP="offsite.server.com"
# Location on the openvz servers where the lvm snapshot of the openvz partition is mounted
SRC="/vzsnap/"
# Location on the main backup server to collect all files
STAGING="/backup/staging/"
# Location of the rdiff-backup repository
RECENT="/backup/recent_backup/"
# Location on the second server for a copy of the rdiff-backup repository
SECOND_RECENT="/backup/recent_backup/"
# Location on the offsite server for a copy of the rdiff-backup repository
OFFSITE_RECENT="/home/backup/recent_backup/"
# Command locations
LVCREATE=/sbin/lvcreate
LVREMOVE=/sbin/lvremove
VZLIST=/usr/sbin/vzlist
VZCTL=/usr/sbin/vzctl
# containers
echo "Staging Containers"
for host in $CONTAINER_SERVERS
do
    containerlist=`ssh $host $VZLIST -aH -o veid`
    for container in $containerlist
    do
        echo "Backing up $container from $host"
        ssh $host $LVCREATE -L2G -s -n vzsnap /dev/filestore/vz > /dev/null
        # check if the container is running
        run="`ssh $host vzlist | grep --only-matching $container`"
        if [ "$run" == "$container" ]
        then
            # check if mysql server is running inside the container
            if [ -f /vz/private/$container/etc/init.d/mysql ]
            then
                ssh $host "vzctl exec $container 'if [ -f /var/run/mysqld/mysqld.pid ]; then echo mysql is running; echo /var/run/mysqld/mysqld.pid > /mysqld.pid; fi'"
                if [ -f /vz/private/$container/mysqld.pid ]
                then
                    pass="`ssh $host cat /vz/private/$container/etc/mysql/debian.cnf | grep --max-count=1 password | sed 's/^password = //'`"
                    user="debian-sys-maint"
                    ssh $host vzctl exec $container mysqldump --user=$user -p$pass --all-databases > /backup/staging/$container.dump
                    ssh $host vzctl exec $container rm /mysqld.pid
                fi
            fi
        fi
        ssh $host mkdir -p $SRC
        ssh $host mount /dev/filestore/vzsnap $SRC
        rsync -a --delete root@$host:$SRC/private/$container $STAGING/
        rsync -a root@$host:$SRC/etc/vz/conf/$container.conf $STAGING/
        ssh $host umount $SRC
        ssh $host $LVREMOVE /dev/filestore/vzsnap --force > /dev/null
        ssh $host rmdir $SRC
    done
done
For each server that is running OpenVZ containers, we want to make a full backup of each of the containers.
- We use LVM to create a snapshot of the filesystem. This gives us a version of the files that we can copy that is consistent with itself at a given point in time.
- In the case that MySQL is running in the container, we want to make sure that we do not get a corrupt copy of the database just in case it was in the middle of a commit when the LVM snapshot was taken. For this, we will use the standard MySQL backup tool mysqldump. This will also allow us to easily examine an old version of the database without fully restoring the entire container.
- Finally, we copy the configuration for the container, and all of the files within the container.
This will give us enough to restore any of our containers to the point at which the backup was taken.
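The step that reads the MySQL maintenance password can be tried on its own. The sketch below runs the same grep and sed pipeline against a throwaway file laid out like /etc/mysql/debian.cnf on a Debian system; the password value is made up for the example.

```shell
#!/bin/bash
# Illustration only: a temporary file in the layout of /etc/mysql/debian.cnf.
# The password value is invented for this example.
set -eu

cnf=$(mktemp)
cat > "$cnf" <<'EOF'
[client]
host     = localhost
user     = debian-sys-maint
password = examplepass
socket   = /var/run/mysqld/mysqld.sock
[mysql_upgrade]
host     = localhost
user     = debian-sys-maint
password = examplepass
socket   = /var/run/mysqld/mysqld.sock
EOF

# The password appears once per section, so --max-count=1 stops grep after
# the first matching line. sed then strips the "password = " prefix.
pass=$(grep --max-count=1 password "$cnf" | sed 's/^password = //')
echo "$pass"

rm -f "$cnf"
```

Passing the password with -p places it on the command line, where other users on the same machine may be able to see it in the process list. One alternative, which the script above does not use, is to point mysqldump at the same file with its --defaults-file option; treat that as a suggestion to evaluate rather than a drop-in change.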
Version Control (SVN)
echo "Staging Source Control"
REPOS=`ssh scm.mentormate.com ls /files/repo/svn/`
echo $REPOS
for repo in $REPOS
do
    echo "processing svn repo $repo"
    head=`svn info svn+ssh://scm.mentormate.com/files/repo/svn/$repo/ | grep "^Revision: .*" | sed 's/^Revision: //'`
    curfile=$STAGING/svn.$repo.lastversion
    # check if there is currently a backup of the repo
    if [ -f $curfile ]
    then
        # if yes, read the last saved file
        current=`cat $curfile`
    else
        # if no, start from the beginning
        current=0
    fi
    if [ $current -lt $head ]
    then
        if [ $current -ne 0 ]
        then
            current=$(( $current + 1 ))
        fi
        padcurrent=`printf "%05d" $current`
        padhead=`printf "%05d" $head`
        # take incremental backup
        ssh scm.mentormate.com svnadmin dump -r$current:$head --incremental /files/repo/svn/$repo | gzip --best > $STAGING/svn.$repo.$padcurrent-$padhead.dump.gz
        echo "$head" > $curfile
    else
        echo "$repo does not have any changes"
    fi
done
We use Subversion for version control. It’s very easy to dump a full backup of a Subversion repository, but when the repositories get large, this can start to take a lot of time and bandwidth. It is better to only back up the new revisions.
- Get a list of all subversion repositories
- The head is the version that was most recently committed to the repository. This can be obtained by checking the svn info.
- The head that we backed up the last time that we made a backup was saved to a file. This gives us a range of revisions, from just after the last backup that we made up to the current head.
- An incremental backup is taken of only the new revisions, and the current head is written to a file. The filename will tell us which revisions were backed up.
When restoring this, the files can be combined and loaded into a new repository, or they can be loaded into a new repository one after the other. If you decide that there are too many files for one repository, you can simply delete all of the files for that repository, including the last revision file, from the staging area, and a full backup will be taken the next time the backup script runs.
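The revision bookkeeping can be isolated into a small function and tried without a Subversion server. In the sketch below, next_dump_range is a hypothetical helper and the repository name and revision numbers are made up; the range and filename logic follows the script above.

```shell
#!/bin/bash
# Sketch of the revision bookkeeping. next_dump_range is a hypothetical
# helper; the repository name and revision numbers below are examples.
set -eu

# Print "FROM TO FILENAME" for the next incremental dump, or print nothing
# when the last backed-up revision is already at the head.
next_dump_range() {
    local repo=$1 last=$2 head=$3 from
    if [ "$last" -ge "$head" ]; then
        return 0
    fi
    if [ "$last" -eq 0 ]; then
        from=0                  # no earlier backup: start from revision 0
    else
        from=$(( last + 1 ))    # continue after the last backed-up revision
    fi
    printf '%d %d svn.%s.%05d-%05d.dump.gz\n' "$from" "$head" "$repo" "$from" "$head"
}

first=$(next_dump_range example 0 120)      # first backup ever
second=$(next_dump_range example 120 135)   # fifteen new revisions
third=$(next_dump_range example 135 135)    # nothing new
printf '%s\n' "$first" "$second" "[$third]"
```

Because the revision numbers in the filenames are zero padded, the dump files for one repository sort in revision order. When restoring, they can be loaded one after the other into a repository created with svnadmin create, for example with a loop such as: for f in svn.example.*.dump.gz; do gunzip -c "$f" | svnadmin load /path/to/new/repo; done. The repository name and path in that loop are placeholders.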
Backup Router Configuration
wget -q https://192.168.1.1/diag_backup.php --post-data 'Submit=Download' --no-check-certificate --http-user=admin --http-password="**********" --output-document=$STAGING/pfsense.xml
We use pfSense for our router software. This allows us great flexibility at a very low cost. pfSense has a web page with a button for downloading the configuration file. Wget can be set to send the appropriate information to request the configuration file. In the event that the router hardware fails, we can quickly install pfSense on new hardware, and load the configuration.
Crontab from the Backup Server
crontab -l > $STAGING/crontab.`hostname`
If our main backup server goes down, we want to make sure that we know everything that it was doing. All of our important backup jobs run as a single user, so we only need to get one crontab here.
Staging to Incremental Backup
rdiff-backup $STAGING/ $RECENT/
rdiff-backup --remove-older-than 10B $RECENT/
Now that we have everything in the staging area, we want to copy it to a more long term location. We will do this using rdiff-backup, which will store multiple revisions of our files without making a full copy of each one. We then need to remove old revisions of the files so that the repository does not get too large. In this case, we keep 10 versions of the files.
rsync -azq --delete $RECENT/ $SECONDARY_BACKUP:$SECOND_RECENT/
rsync -azq --delete --bwlimit=10 $RECENT/ user@$OFFSITE_BACKUP:$OFFSITE_RECENT/
The final step is to make sure that we have multiple copies of our big backup. The rdiff-backup repository is copied to a second onsite server, for easy restore in the event of hardware failure on the backup machine, and to an offsite server.
By gathering all of our information into one place, then taking an incremental backup and sending that to remote servers, we have easily created a backup script that stores incremental versions of all of our important data in one place.