Automate backups

README: this section assumes that you have done the previous step, Add an EBS volume, so that the database data sits on a separate volume from the OS and programs.

In this step we will automate backups of both our instance, which holds the root volume containing the OS and the programs, and our data volume, which holds the Postgres data files. We will do this by calling the AWS API from bash scripts, and then use cron jobs to trigger the scripts at defined intervals.

Note that the current Amazon guide for configuring the EC2 tools to access the API on Ubuntu is here. The guide seems much better structured now than when I first looked at it 3 years ago! So a lot of what I say about configuration here can also be found in the official guide.

Please note also that the Java-based EC2 API tools used in this guide have since been superseded by the AWS CLI tools, which use Python. If starting from scratch, I would recommend installing only the newer AWS CLI tools and replacing the EC2 commands below with the CLI equivalents.
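
For example, assuming the AWS CLI is installed and configured (aws configure), the rough CLI equivalents of the two backup commands used later in this section would look something like this (the instance ID, volume ID, names and device name are placeholders):

  # create an image of the instance, excluding the data volume attached as /dev/sdf
  aws ec2 create-image --instance-id i-xxxxxxxx --name "MyServer1-root-2015-03-04" \
    --block-device-mappings '[{"DeviceName":"/dev/sdf","NoDevice":""}]'

  # create a snapshot of the data volume
  aws ec2 create-snapshot --volume-id vol-xxxxxxxx --description "MyServer1-data-2015-03-04"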

The EC2 API tools need Java, so first we install that and then define the JAVA_HOME variable.

  1. SSH into the server: ssh -i /path/to/my-key-pair.pem ubuntu@{Elastic IP}
  2. Install Java: sudo apt-get install -y openjdk-7-jre
  3. Open the file containing the environment variables: sudo vi /etc/environment
  4. Add this line so that JAVA_HOME is defined on reboot: JAVA_HOME="/usr"
  5. Define JAVA_HOME in the current session: export JAVA_HOME=/usr
  6. Test the JAVA_HOME setting: $JAVA_HOME/bin/java -version. You should see something like the result above

Now we download and configure the EC2 API tools. For this part you need both the "Access Key ID", which at the time of writing you can get from your account's Security Credentials page in the AWS console, and the "Secret Access Key", which you will hopefully have noted down when you created your account. Looking now at my account credentials, there is a notice that Amazon wants me to use IAM, its permissions system, instead of logging in as the root user with the CLI tools. Given that I am the only user of my AWS account, I will not do this right now, but if you want to look into it further, read this.

  1. Install unzip: sudo apt-get install unzip
  2. Download the zipped tools: wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
  3. Make an "ec2" user directory: sudo mkdir /usr/local/ec2
  4. Unzip into the ec2 directory: sudo unzip ec2-api-tools.zip -d /usr/local/ec2 and note the name of the directory containing the version number. In my case it is "ec2-api-tools-1.7.3.0"
  5. Open the file containing the environment variables: sudo vi /etc/environment
  6. Append the following to the PATH variable, changing the directory name as needed: :/usr/local/ec2/ec2-api-tools-1.7.3.0/bin
  7. Add a new line, changing the directory name as needed: EC2_HOME="/usr/local/ec2/ec2-api-tools-1.7.3.0"
  8. Add a new line: AWS_ACCESS_KEY=<access key>
  9. Add a new line: AWS_SECRET_KEY=<secret key>
  10. Close and save. Now we set the variables in the current session:
  11. export EC2_HOME=/usr/local/ec2/ec2-api-tools-1.7.3.0
  12. export PATH=$PATH:$EC2_HOME/bin
  13. export AWS_ACCESS_KEY=<access key>
  14. export AWS_SECRET_KEY=<secret key>
  15. That's it. Now test the configuration by running ec2-describe-regions. You should get a list of ec2 regions as shown. If so, congratulations!

Now we can communicate with the AWS API. But the default region is US East, "ec2.us-east-1.amazonaws.com", so we probably need to change that.

  1. In the AWS console, click on EC2, then look at the URL. Mine ends with the parameter "?region=eu-west-1". Comparing this to the table of EC2 regions we see above, that means I need to change my region to "ec2.eu-west-1.amazonaws.com"
  2. Back into the environment variables: sudo vi /etc/environment
  3. Add a new line: EC2_URL=ec2.eu-west-1.amazonaws.com
  4. Again, export it for the current session: export EC2_URL=ec2.eu-west-1.amazonaws.com
  5. Now test: ec2-describe-volumes. You should see information about the 2 volumes we currently manage in this region, MyServer1-root and MyServer1-data
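
For reference, after these steps my /etc/environment looks roughly like the following. Your existing PATH entries and the tools version number may differ, and the keys are of course placeholders:

  PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/ec2/ec2-api-tools-1.7.3.0/bin"
  JAVA_HOME="/usr"
  EC2_HOME="/usr/local/ec2/ec2-api-tools-1.7.3.0"
  AWS_ACCESS_KEY=<access key>
  AWS_SECRET_KEY=<secret key>
  EC2_URL=ec2.eu-west-1.amazonaws.com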

That completes the configuration. Now let's look at the backup commands we need.


Before we create a bash script to automate a task, we test the individual commands to make sure they work. First we'll test the command to back up the instance, excluding the data volume. This will create an image (AMI) as we saw in the previous section, and it will also shut down the server, so we need to be prepared for that.

  1. In the AWS console, click on EC2, then on Volumes on the left, then select the data volume (MyServer1-data). In the Description tab below, look at the Attachment Information. Make a note of the Instance ID (something like "i-ab0c12d3") and the device name of this volume (probably "/dev/sdf")
  2. Now click on AMIs on the left and note the images (AMIs) that you currently have. We will use this screen to check that a new image has actually been created
  3. Enter the following, replacing the Instance ID for sure, the device name if needed, and the server name and date if you wish: ec2-create-image i-xxxxxxxx -n "MyServer1-root-2015-03-04" -b "/dev/sdf=none". The last part (the "-b" parameter) excludes the data volume from the image, meaning our data is stored separately. This way, we can boot up any combination of instance and data that we want
  4. You will see a server shutdown message, and you will be locked out for 3 or 4 minutes while the image is created and the server reboots
  5. Go back to the AMIs screen in the AWS console. You should see the new image with the status "Pending". The AMI name column has the same value as the "-n" parameter we entered in the command
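
If you prefer the command line, this should also list the AMIs owned by your account, with the new image in the "pending" state:

  # list the AMIs owned by this account
  ec2-describe-images -o self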

Now we will test the command to backup the data volume. This method is slightly different: volumes are backed up as "snapshots". If you click on Snapshots under EC2 in the AWS console, you will see some snapshots which were automatically created when we created the instance images in Add an EBS volume and just now using the CLI tools. These are the snapshots of our root volume and form part of the image.

  1. In the AWS console, click on EC2, then on Volumes on the left, and make a note of the Volume ID of the data volume (something like "vol-abc0d1ef")
  2. Now click on Snapshots on the left and note the snapshots that you currently have. We will use this screen to check that a new snapshot has actually been created
  3. Enter the following, replacing the Volume ID for sure, and the volume name and date if you wish: ec2-create-snapshot vol-xxxxxxxx -d "MyServer1-data-2015-03-04"
  4. There will be a small pause then you should get the command prompt back
  5. Go back to the Snapshots screen in the AWS console. You should see the new snapshot with the status "Pending". The description column has the same value as the "-d" parameter we entered in the command
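
As with the images, this should list the snapshots owned by your account from the command line, with the new snapshot in the "pending" state:

  # list the snapshots owned by this account
  ec2-describe-snapshots -o self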

Great, so now we have working commands to make the instance image and the data volume snapshot. Let's put them into bash scripts and set up logging of the results.


Now we will create the instance backup script, run it, and log the results to a log file.

  1. From Ubuntu home: cd ~
  2. Create a directory for the bash scripts: mkdir scripts
  3. Create a directory for the log files: mkdir scripts/logs
  4. Create the script for the instance backups: vi scripts/backup_instance.sh
  5. Enter the code as shown above (a sketch with some explanations is given after this list)
  6. Make the script executable: chmod u+x scripts/backup_instance.sh
  7. Run the script: /home/ubuntu/scripts/backup_instance.sh
  8. It should run, meaning you will be kicked off the instance for a few minutes, and a new AMI should appear in the AWS console
  9. When you can SSH back in, verify the script log was written: less scripts/logs/backups.log
  10. Verify the archive log was written: less scripts/logs/archive_backups.log
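
As a rough guide, a minimal backup_instance.sh might look something like the sketch below. The instance ID is a placeholder, the log layout (a brief status line in backups.log, with the full command output appended to archive_backups.log) is just one way of arranging it, and EC2_HOME, the AWS keys, EC2_URL and the PATH addition are assumed to be picked up from /etc/environment:

  #!/bin/bash
  # Sketch of an instance backup script -- adapt the instance ID, device name
  # and log locations to your own setup. EC2_HOME, the AWS keys, EC2_URL and
  # the PATH addition are assumed to come from /etc/environment.

  INSTANCE_ID="i-xxxxxxxx"                        # your instance ID
  NAME="MyServer1-root-$(date +%Y-%m-%d)"         # image name with today's date
  LOGDIR="/home/ubuntu/scripts/logs"

  echo "$(date) - creating image $NAME" >> "$LOGDIR/backups.log"

  # -b "/dev/sdf=none" excludes the data volume, so only the root volume is imaged
  ec2-create-image "$INSTANCE_ID" -n "$NAME" -b "/dev/sdf=none" \
    >> "$LOGDIR/archive_backups.log" 2>&1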

Now for the data volume backup script, which is similar.

  1. From Ubuntu home: cd ~
  2. Create the script for the data volume backups: vi scripts/backup_data.sh
  3. Enter the code as shown above (a sketch is given after this list).
  4. Note the use of xfs_freeze to freeze ("-f") and unfreeze ("-u") the XFS file system containing our data. We do this to make sure that we get a consistent snapshot
  5. Make the script executable: chmod u+x scripts/backup_data.sh
  6. Run the script: /home/ubuntu/scripts/backup_data.sh
  7. It should run and a new snapshot should appear in the AWS console
  8. Verify the script log was written: less scripts/logs/backups.log
  9. Verify the archive log was written: less scripts/logs/archive_backups.log
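
Again as a rough guide, a minimal backup_data.sh might look something like this. The volume ID is a placeholder, the mount point (here /data) depends on where you mounted the data volume in Add an EBS volume, and xfs_freeze needs root, so the ubuntu user must be able to run it with passwordless sudo (or the script must run as root):

  #!/bin/bash
  # Sketch of a data volume backup script -- adapt the volume ID, mount point
  # and log locations to your own setup.

  VOLUME_ID="vol-xxxxxxxx"                        # your data volume ID
  DESC="MyServer1-data-$(date +%Y-%m-%d)"         # snapshot description with today's date
  MOUNT_POINT="/data"                             # assumed mount point of the XFS data volume
  LOGDIR="/home/ubuntu/scripts/logs"

  echo "$(date) - creating snapshot $DESC" >> "$LOGDIR/backups.log"

  # freeze the XFS file system while the snapshot is taken so that it is consistent
  sudo xfs_freeze -f "$MOUNT_POINT"
  ec2-create-snapshot "$VOLUME_ID" -d "$DESC" \
    >> "$LOGDIR/archive_backups.log" 2>&1
  sudo xfs_freeze -u "$MOUNT_POINT"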

So now we have the tested bash scripts. It just remains to schedule them with cron so that they run automatically.


The crontab is where we define cron jobs, i.e. scripts that are run automatically on a schedule. We will add an entry for the backup_instance script to run once a week, since we don't change the programs very often, and an entry for the backup_data script to run every night, just after midnight.

  1. Open crontab in editing mode: crontab -e
  2. The first time you do this, it will ask which editor you want to use. Enter "3" for vim, which is the one I have shown in previous sections
  3. As shown in the image, scroll to the bottom, then add this line so that cron can find the EC2 commands we use, like "ec2-create-snapshot": PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/ec2/ec2-api-tools-1.7.3.0/bin
  4. Add this line to run backup_instance.sh at 00:01 every Saturday and log to backups.log: 01 00 * * 6 /home/ubuntu/scripts/backup_instance.sh >> /home/ubuntu/scripts/logs/backups.log 2>&1. The details of the mysterious crontab scheduling options are here.
  5. Add this line to run backup_data.sh at 00:10 every day and log to backups.log: 10 00 * * * /home/ubuntu/scripts/backup_data.sh >> /home/ubuntu/scripts/logs/backups.log 2>&1
  6. Save the crontab in the normal vim way, i.e. :x. The first time you do this, it will ask you to enter an encryption key. I don't use that, I just press enter twice
  7. You should temporarily adjust the crontab times so that a job runs a minute or two in the future, to make sure it works. Then less scripts/logs/backups.log to see the result; you should see something like the picture above. You should of course also have a new snapshot or AMI in the AWS console
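
Put together, the bottom of the crontab ends up looking something like this (adjust the tools version number in the PATH as needed):

  PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/ec2/ec2-api-tools-1.7.3.0/bin

  # weekly instance image, Saturdays at 00:01
  01 00 * * 6 /home/ubuntu/scripts/backup_instance.sh >> /home/ubuntu/scripts/logs/backups.log 2>&1

  # nightly data volume snapshot at 00:10
  10 00 * * * /home/ubuntu/scripts/backup_data.sh >> /home/ubuntu/scripts/logs/backups.log 2>&1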

This was a long, technical section, congratulations to anyone who made it through all that!


