Understanding Tape Backup Strategies
December 9, 2009. Posted by General Zod in Backup, Tech.
In the past, one of the many roles I’ve occupied in the IT industry is that of the Backup Administrator for a previous employer. It never really became a very time-consuming job (unless something was broken), but I do believe that it required a good deal more skull sweat than most people believe.
At this previous job, I was fortunate enough to have access to an auto-loading 4-drive tape library with a 50 cartridge capacity… and a pool of close to 200 tapes to use. This meant that I only had to go through the monotony of loading in about 30-40 fresh tapes about once a month. Then, it was a simple matter to export 6 or 7 tapes each week to send to offsite storage. Even with data restore requests and regular administrative maintenance, I usually would not dedicate much more than 2 hours per week to this duty. (If you are doing this job for your company, then YMMV.)
While I was employed by this company, I was actually managing the backup solutions for multiple groups and organizations. As a result, I dealt with many non-technical people (read as "managers") who never seemed to understand what I was referring to when I described the backup strategies I was utilizing. This is not their fault. They have other concerns in their lives, and should not be expected to think about such things on a regular basis. If they understood everything that you do, then they wouldn’t need you (so look upon this as job security). The big drawback was that I was continuously having to re-explain my strategies to each of them. To that end, I had written a blurb on various backup strategies that I’d kept in a TXT file, and would simply paste it into an email whenever I was asked for clarification.
If you’re not a very technical person, then you may feel that this starts off sounding a bit complicated, but soon you will realize that it’s just a matter of learning the vocabulary. Once you understand what your Backup Administrator is talking about, then the rest will just fall into place.
Types of Backups
Full backups are just what they sound like. A full backup job archives a copy of every file and folder from a specified drive to the target tape media. This type of backup is usually run once a week or once a month (depending on the importance of the data and how often the data changes). As each file or folder is duplicated to tape, its Archive bit is reset (which I will discuss the importance of in more depth momentarily). The down side to running a full backup is that it takes longer to run than any other backup job.
The Archive bit is a flag that exists on every file to indicate whether or not the file needs to be backed up. This bit is enabled whenever a file is changed, and is reset during Full and Incremental backup jobs.
Incremental backups are used to back up only the files that have changed since the most recent Full or Incremental backup. Each file’s Archive bit is examined during the job run, and only those files whose Archive bit is enabled will be copied. The Archive bit is reset after each file is backed up. Using incremental backups will provide quick backup jobs, but will require more restore jobs to be performed to recover a specific day’s data in its entirety.
Full backups run every Sunday night.
Incremental backups run every night (except Sunday).
On Friday, someone wants to restore the entire contents of a folder as it would have appeared at the close of business on Wednesday. This would require performing 4 separate data restore jobs.
Sunday (full) + Monday (inc) + Tuesday (inc) + Wednesday (inc)
Differential backups are used to copy the sum of all files that have been changed since the last full or incremental backup. As with the Incremental backup, each file’s Archive bit is examined during the job run, and only those files whose Archive bit has been enabled will be copied. The difference is that the Archive bit is NOT reset on the files during the Differential backup job, so each Differential backup contains everything the previous one did, plus whatever has changed since.
Combining the use of Differential and Incremental backups is not generally a good idea unless you really know what you are doing. However, it does make life easier if you have an enormously large amount of data to back up.
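To make the Archive bit behavior a little more concrete, here is a minimal Python sketch of the three job types. It is only a toy model: the run_backup function, the file names, and the dictionary of flags are all made up for illustration, and real backup software tracks far more than this.

```python
def run_backup(files, kind):
    """Return the names copied by one job and update Archive bits in place.

    files maps a (hypothetical) file name to its Archive bit,
    where True means "needs backing up".
    kind is "full", "incremental", or "differential".
    """
    if kind == "full":
        copied = sorted(files)  # every file, flagged or not
    else:
        copied = sorted(name for name, bit in files.items() if bit)
    if kind in ("full", "incremental"):
        for name in copied:
            files[name] = False  # Full and Incremental reset the bit
    return copied  # Differential leaves the bits untouched


files = {"a.txt": True, "b.txt": True, "c.txt": True}
sunday = run_backup(files, "full")            # copies all three files
files["a.txt"] = True                         # a.txt is edited on Monday
monday = run_backup(files, "differential")    # copies a.txt, bit stays set
files["b.txt"] = True                         # b.txt is edited on Tuesday
tuesday = run_backup(files, "differential")   # copies a.txt and b.txt
wednesday = run_backup(files, "incremental")  # copies both, resets the bits
thursday = run_backup(files, "incremental")   # nothing changed, copies nothing
```

Notice that Tuesday's differential job re-copies a.txt even though it was not touched that day, because Monday's differential job left its Archive bit enabled.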
I was responsible for backing up a 12 TB SAN that was quite full, and the users made a habit of updating the contents rather vigorously.
Due to the network performance at the site, running a full backup of that beast would usually take 24+ hours to complete. This was NOT something that I wanted to do on a weekly basis. So backing up that SAN looked something like this…
Full backup — 1st Saturday of each Month.
Incremental backups — All other Saturdays.
Differential backups — Every Sunday thru Friday.
If someone wanted to restore the entire contents of a folder as it would have appeared at the close of business on the 3rd Wednesday of the month (in a month where that Wednesday falls between the 2nd and 3rd Saturdays), there would be 3 restore jobs required.
Restored from 1st Saturday of the month (full)
Restored from 2nd Saturday of the month (inc)
Restored from 3rd Wednesday of the month (diff)
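Both restore examples above follow the same rule: start at the backup for the day you want, then walk backwards collecting every Incremental until you reach a Full, skipping any earlier Differentials along the way. Here is a rough Python sketch of that rule. The restore_chain function and the job labels are hypothetical, and it assumes the schedule is given as a simple chronological list that contains a Full backup somewhere before the target.

```python
def restore_chain(jobs, target):
    """Return the labels of the jobs to restore, oldest first.

    jobs is a chronological list of (label, kind) pairs, where kind is
    "full", "inc", or "diff". target is the index of the job whose
    point in time should be restored.
    """
    chain = [jobs[target][0]]
    if jobs[target][1] != "full":
        for label, kind in reversed(jobs[:target]):
            if kind == "diff":
                continue  # its contents are repeated by a later job
            chain.append(label)
            if kind == "full":
                break  # nothing older than the Full is needed
    return chain[::-1]


# Weekly schedule: Full on Sunday, Incremental on the other nights.
weekly = [("Sunday", "full"), ("Monday", "inc"), ("Tuesday", "inc"),
          ("Wednesday", "inc"), ("Thursday", "inc")]
weekly_chain = restore_chain(weekly, 3)

# SAN schedule, in a month where two Saturdays precede the 3rd Wednesday.
monthly = ([("1st Saturday", "full")]
           + [("week 1 daily", "diff")] * 6
           + [("2nd Saturday", "inc")]
           + [("week 2 daily", "diff")] * 3
           + [("3rd Wednesday", "diff")])
monthly_chain = restore_chain(monthly, len(monthly) - 1)
```

The weekly schedule needs all four jobs from Sunday through Wednesday, while the SAN schedule only needs three, even though far more backup jobs ran in between.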
Tape Rotation Cycles
Another important aspect of your tape backup strategy is how to rotate your tapes. Most organizations do not want to buy new tapes every week to accommodate their needs. Doing so would become quite expensive very quickly… not to mention you would soon become buried in used data cartridges. Typically, a tape rotation is decided upon based on how long the data needs to be stored on tape. After this time has elapsed, the tapes will be erased and reused.
Round-Robin is the simplest of tape rotations to understand. This type of cycle uses a single tape set for a set period of time, but only keeps the information for the length of that time period. This is usually a sufficient backup solution for very small businesses. However, it forces the Administrator to re-use the same tapes over and over again. Due to wear and tear, these tapes would need to be replaced frequently. Another con to this type of rotation is that there is no long-term storage solution.
A weekly Round-Robin tape rotation would mean that there are tapes marked "Monday", "Tuesday", "Wednesday", etc. Each day’s tapes are used on that specified day of the week, and then re-used 7 days later.
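As a quick illustration, picking the tape for a weekly Round-Robin rotation is nothing more than a lookup by day of the week. The TAPE_LABELS list and the round_robin_tape function below are made-up names for this sketch.

```python
from datetime import date

# One tape set per day of the week, in the order used by date.weekday().
TAPE_LABELS = ["Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday"]


def round_robin_tape(day):
    """Return the label of the tape set to load on the given date."""
    return TAPE_LABELS[day.weekday()]  # weekday(): Monday is 0, Sunday is 6


this_week = round_robin_tape(date(2009, 12, 9))   # a Wednesday
next_week = round_robin_tape(date(2009, 12, 16))  # same tape, 7 days later
```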
Grandfather-Father-Son is a more commonly used tape rotation strategy. The concept behind this strategy revolves around performing a large-sized monthly backup which is sent to long-term storage (or "grandfather"), performing a moderately-sized weekly backup which is sent to storage for a moderate period of time (or "father"), and performing small-sized daily backups which are only required to be kept until the next weekly backup is run (or "son").
I prefer to use the Grandfather-Father-Son tape rotation for the backup of my above mentioned 12 TB SAN.
Grandfather = Monthly Full backups (which are stored offsite for 1 year).
Father = Weekly Incremental backups (which are stored offsite for 3 months).
Son = Daily Differential backups (never leave the building, and are overwritten every 30 days).
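If it helps, the mapping from calendar date to tier for this particular schedule (Full on the 1st Saturday, Incremental on the other Saturdays, Differential on every other day) can be sketched in a few lines of Python. The gfs_tier function is a hypothetical helper written for this example, not part of any backup product.

```python
from datetime import date


def gfs_tier(day):
    """Classify a date as (tier, backup type) under the SAN schedule above."""
    if day.weekday() == 5:  # Saturday (Monday is 0)
        if day.day <= 7:    # the 1st Saturday always falls on day 1 to 7
            return ("Grandfather", "Full")
        return ("Father", "Incremental")
    return ("Son", "Differential")


first_saturday = gfs_tier(date(2009, 12, 5))
second_saturday = gfs_tier(date(2009, 12, 12))
midweek = gfs_tier(date(2009, 12, 9))
```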
Incremented Media is another simple method of tape rotation. The idea behind it is to simply have a collection of numbered tapes which are used in order until the end of their numerical run. After you run out of tapes, you start over again. This method results in an even amount of wear and tear across all of your media. However, if you need to restore data it will be a complicated process to determine which tapes you require. Luckily, the good backup applications will track that information for you.
A series of 10 incremented tapes would be used in the following order:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4…
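The same sequence falls out of simple modular arithmetic. In this small sketch, incremented_tape is a made-up helper, and both the run number and the tape numbers are assumed to start at 1.

```python
def incremented_tape(run_number, tape_count=10):
    """Return the tape number (1-based) for the Nth backup run (1-based)."""
    return (run_number - 1) % tape_count + 1


order = [incremented_tape(n) for n in range(1, 15)]
```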
Tower of Hanoi is the most complicated tape rotation strategy to use. This strategy is based on a puzzle by the same name. The Tower of Hanoi puzzle was invented by a mathematician named Édouard Lucas in 1883. In the puzzle, there is a series of rings (of different sizes) which are all stacked on the left-most of 3 pegs. The objective is to move all of the rings to the right-most peg. However, you can only move one ring at a time, and no ring can be moved on top of a ring of smaller size. If you’d like to try this puzzle, then you can check out a flash version at…
In relation to tape rotations, the Tower of Hanoi is a recursive cycle which does provide a very good long-term storage strategy for your data. However, it is very complicated to understand and keep track of. Basically, a new layer of tape rotations is added for each tape set that is added, and each new tape set is only reused half as often as the one before it. Mathematically, if we assume that a different tape set is used each week, then the Nth tape set is first used in week 2^(N-1), and is reused every 2^N weeks after that. Personally, I find this rotation strategy entirely too complicated for most needs. I would only recommend you consider attempting to use it if you are required to store tape backups for several years.
If we assume that each tape set represents a full week of backups, then you can see the pattern in which each tape set is used.
Series of Weeks: A, B, A, C, A, B, A, D, …
Weeks each Tape Set is Used:
A 1, 3, 5, 7, …
B 2, 6, …
C 4, …
D 8, …
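One way to reproduce this pattern in code is to count how many times the week number divides evenly by 2: odd weeks use set A, weeks divisible by 2 but not 4 use set B, and so on. The Python sketch below does this with a bit trick. The hanoi_tape_set function is a hypothetical helper, and it assumes an unlimited supply of lettered tape sets, whereas a real rotation would cap the number of sets.

```python
def hanoi_tape_set(week):
    """Return the tape set letter for a 1-based week number."""
    if week < 1:
        raise ValueError("week numbers start at 1")
    lowest_bit = week & -week                   # isolates the lowest set bit
    trailing_zeros = lowest_bit.bit_length() - 1
    return chr(ord("A") + trailing_zeros)       # 0 -> A, 1 -> B, 2 -> C, ...


series = [hanoi_tape_set(week) for week in range(1, 9)]
weeks_for_b = [week for week in range(1, 17) if hanoi_tape_set(week) == "B"]
```

Here series comes out in the same A, B, A, C, A, B, A, D order shown above, and weeks_for_b shows set B being reused every 4 weeks.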
If you are trying to decide upon a tape backup strategy, then I would probably make a quick recommendation based simply off the size of your organization.
Small organizations: I have seen many small organizations that can fit all of their data files onto a single tape. Since backing up this small amount of data does not take much time, many small organizations prefer to do full backups on a daily basis, and to rotate their tapes in a round-robin strategy. Assuming the company has no actual IT personnel on staff, this method of daily full backups puts everyone’s mind at ease and does not require a large collection of tapes. However, keep a couple of spare tapes on-hand in the event one of them breaks.
Medium organizations: Companies of this size usually prefer to implement a Grandfather-Father-Son rotation method. Since this size of organization probably has a small IT staff, they should all be made aware of how the backup jobs and tape rotations are implemented. To prevent confusion, the dates of the backups and their associated tapes should be documented for future reference. The actual backup methods used can vary depending upon your needs, but I would recommend a minimum of the following… Monthly Full backups, Weekly Incremental backups, and Daily Differential backups.
Large organizations: It is assumed that large organizations will have a large IT staff. Backup Administrators will be responsible for managing a large amount of data which will no doubt be changing as fast as the surface of a lake. To that end, they will most likely prefer to tighten up the backup jobs to require Differential backups to be performed every couple of hours, Incremental backups every evening, and Full backups a minimum of once per week. The tape rotation would also be at the Backup Administrator’s discretion.
These are just samples and suggestions. I do recommend that you develop your own backup solution based on your company’s personalized needs.
If you are a small company with no IT staff, then it would probably be wise for you to call an outside IT contractor to give you recommendations, help set up your backup solution, and even train some of your staff on how to perform the backups.
If your company is large enough to employ your own IT staff, then feel free to give them your input, but put your trust into them to make the right choices. After all, that’s what they were hired for…