This is a set of instructions for performing backups (for now just incrementals are described in detail).

first, log into vnfe4 as root:
[user@bh3 ~]$ ssh root@vnfe4
root@vnfe4's password:
the password is available via Matt Choptuik.

make sure a tape is in the drive
Go to incremental (daily) backups
Go to level-0 (monthly) backups

Incremental backups

[root@vnfe4 ~]# mt -f /dev/nst0 status
SCSI 2 tape drive:
File number=448, block number=0, partition=0.
Tape block size 1024 bytes. Density code 0x81 (DLT 15GB compressed).
Soft error count since last status=0
General status bits on (81010000):
EOF ONLINE IM_REP_EN

[root@vnfe4 ~]# mt -f /dev/nst1 status
SCSI 2 tape drive:
File number=18, block number=0, partition=0.
Tape block size 1024 bytes. Density code 0x81 (DLT 15GB compressed).
Soft error count since last status=0
General status bits on (81010000):
EOF ONLINE IM_REP_EN

WARNING: the above output is specific to the tape that was in the drive when this page was written. The important information is the last line, which shows that the tape is at the end of file (EOF), meaning the end of the data on the tape, and that the drive is online (ONLINE).
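
The check described above can be scripted. This is only a minimal sketch: it assumes the flags always appear on the line directly after "General status bits on", as in the output shown, and the helper names status_flags and has_flag are made up for illustration (they are not part of mt).

```shell
# status_flags: hypothetical helper (not part of mt).
# Reads `mt status` output on stdin and prints the flags line,
# i.e. the line that follows "General status bits on (...)".
status_flags() {
    awk '
        grab { print; grab = 0 }                  # the line after the header
        /^General status bits on/ { grab = 1 }    # header found
    '
}

# has_flag FLAG: hypothetical helper. Reads a flags line on stdin and
# succeeds only if FLAG is one of the whole words on it.
has_flag() {
    tr -s ' \t' '\n\n' | grep -qx "$1"
}

# Intended use on the backup host (needs a tape drive, so left commented out):
#   mt -f /dev/nst0 status | status_flags | has_flag ONLINE && echo "drive is online"
```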

NOTE: only two drives exist, /dev/nst0 and /dev/nst1.

I use /dev/nst0 for daily (incremental) backups; /dev/nst1 is used for monthly (level-0) backups.

daily (incremental) backups:

first we want to put in a fresh tape.
[root@vnfe4 ~]# mt -f /dev/nst0 eject
this rewinds the tape in the drive, then ejects it. The ejected tape will be labelled incremental #, where # is some integer from 1 to 5.
select a new tape, which will be labelled incremental #+1 (if # is 5 then insert tape 1)
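
The rotation rule above is simple modular arithmetic. A small sketch, assuming exactly five incremental tapes numbered 1 to 5 as described; the function name next_tape is made up for illustration:

```shell
# next_tape N: print the number of the tape that follows tape N
# in the five-tape rotation (5 wraps around to 1).
next_tape() {
    echo $(( $1 % 5 + 1 ))
}

next_tape 3   # prints 4
next_tape 5   # prints 1
```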

make sure the tape is loaded properly:

[root@vnfe4 ~]# mt -f /dev/nst0 status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 1024 bytes. Density code 0x81 (DLT 15GB compressed).
Soft error count since last status=0
General status bits on (41010000):
BOT ONLINE IM_REP_EN

If it returns anything other than what is shown above, the tape did not load properly. In that situation try

[root@vnfe4 ~]# mt -f /dev/nst0 load

if this does not change the status, you will need to try a different tape. There is a possibility that the tape is bad. Label it as bad and do not use it again.
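
The "loaded properly" test can be expressed as a small check on the status output. This is a sketch only: it assumes a freshly loaded tape always reports File number=0 together with the BOT and ONLINE flags, as in the output above, and tape_ready is a made-up name.

```shell
# tape_ready: hypothetical helper. Reads `mt status` output on stdin and
# succeeds only if the tape is at file 0 and both BOT and ONLINE are reported.
tape_ready() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q 'File number=0,' &&
    printf '%s\n' "$status" | grep -qw 'BOT' &&
    printf '%s\n' "$status" | grep -qw 'ONLINE'
}

# Intended use on the backup host (needs a tape drive, so left commented out):
#   mt -f /dev/nst0 status | tape_ready || mt -f /dev/nst0 load
```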

Assuming the tape is good and has now loaded properly you are ready to perform the daily (incremental) backups. The scripts needed for backups are:
/home/backups/scripts/head-store-incremental-backup
/backups/scripts2/bh-incremental_find
/backups/scripts2/bh-incremental_tar

The first of these scripts handles both the search for new files and the tape backup. The other two split the work: the first finds the files that need to be backed up, and the second actually puts them on the tape. All three are in place in crontab as:

[root@vnfe4 ~]# crontab -e
# Format of lines: #min hour daymo month daywk cmd
00 23 * * 0,1,2,3,4,5,6 /home/backups/scripts/head-store-incremental-backup > /dev/null 2>&1
00 23 * * 0,1,2,3,4,5,6 /backups/scripts2/bh-incremental_find
00 02 * * 0,1,2,3,4,5,6 /backups/scripts2/bh-incremental_tar

NOTE: These are set to late night runs. This is to keep the head free from processes during the time we expect people to be online.
NOTE: in crontab all columns are separated by a space or tab character
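
To make the column layout concrete, here is a small sketch that splits one of the entries above into its six columns. The function name describe_cron_line is made up for illustration; cron itself does the real parsing.

```shell
# describe_cron_line LINE: split one crontab entry into its six columns.
# `read` splits on spaces and tabs; the last variable receives the rest
# of the line, so the command keeps any arguments of its own.
describe_cron_line() {
    printf '%s\n' "$1" | {
        read -r min hour daymo month daywk cmd
        printf 'min=%s hour=%s daymo=%s month=%s daywk=%s cmd=%s\n' \
            "$min" "$hour" "$daymo" "$month" "$daywk" "$cmd"
    }
}

describe_cron_line '00 02 * * 0,1,2,3,4,5,6 /backups/scripts2/bh-incremental_tar'
# prints: min=00 hour=02 daymo=* month=* daywk=0,1,2,3,4,5,6 cmd=/backups/scripts2/bh-incremental_tar
```

The day-of-week list 0,1,2,3,4,5,6 covers every day, so it selects the same days as a plain * in that column.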

The partitions being backed up on vnfe4 are:
/home
/home2
/home3
/root

The partitions being backed up on bh machines are:
/home
/home2
/root
/etc

I have included /home2 on all machines so that the scripts will not need to be changed when new HDDs are installed (until, of course, we have 3 home partitions).

Feel free to read the scripts for details; ultimately the part that should concern you is the list of machines that are being backed up. These scripts are available online on another page in my sys-admin pages.

When the backups are complete (i.e. sometime the next day) do:

[root@vnfe4 ~]# mt -f /dev/nst0 status

this will tell you how many files are now stored on the tape.
There will be 4 files per partition (in the case of vnfe4) or 4 files per machine (in the case of the bh machines)

The first file is a blank,
the second is a label indicating what partition/machine is backed up,
the third is a blank,
the fourth will be the files themselves.
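
Given that layout, the position of any label or archive on the tape is simple arithmetic. The sketch below assumes the four-file layout exactly as described (blank, label, blank, data) and that tape file numbers start at 0, as in the "File number=" field of the status output. The function names are made up for illustration.

```shell
# label_file_index K: tape file number of the label of the K-th backup set (K >= 1)
label_file_index() {
    echo $(( 4 * ($1 - 1) + 1 ))
}

# data_file_index K: tape file number of the data archive of the K-th backup set
data_file_index() {
    echo $(( 4 * ($1 - 1) + 3 ))
}

data_file_index 1    # prints 3
data_file_index 3    # prints 11
label_file_index 3   # prints 9
```

If the layout assumption holds, rewinding the tape and then moving forward by the printed number of files (mt -f /dev/nst0 fsf 11 for the third set) should land at the start of that archive.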

to access the information on the tapes first rewind the tape:

entirely → mt -f /dev/nst0 rewind
n-files → mt -f /dev/nst0 bsf n (recall you will want to do multiples of 4 to find what you are interested in)

(note: mt -f /dev/nst0 fsf n moves forward n files on the tape)

then

tar tvf /dev/nst0

which lists only the file names; it does not extract the files themselves.

once you find the partition you are interested in recovering you will need to move back one file on the tape again

mt -f /dev/nst0 bsf 1

cd /backups/restore

tar xvf /dev/nst0

the last command extracts the files to the hard disk in /backups/restore. This process may take a while; tapes are much slower than other backup methods. They make up for that lack of speed by being more reliable overall.

Once complete:

ls -alh /backups/restore

to make sure you recovered the correct files.
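
The restore steps above can be collected into one function. This is a sketch, not one of the existing backup scripts: the names run, restore_from_tape and DRY_RUN are made up, and it assumes the tape is already positioned at the archive you want. By default it only prints the commands; set DRY_RUN=0 on the backup host to actually execute them.

```shell
# run CMD...: print the command in dry-run mode (the default),
# otherwise execute it in the current shell.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restore_from_tape TAPE DEST: the same sequence as in the text:
# list the archive, step back one file, change to DEST, extract.
# Each step runs only if the previous one succeeded.
restore_from_tape() {
    run tar tvf "$1" &&
    run mt -f "$1" bsf 1 &&
    run cd "$2" &&
    run tar xvf "$1"
}

restore_from_tape /dev/nst0 /backups/restore
```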

Level-0 backups

first we want to put in a fresh tape.
[root@vnfe4 ~]# mt -f /dev/nst1 eject
this rewinds the tape in the drive, then ejects it. Take this tape and place it in an empty plastic sleeve.
If backing up vnfe4 (head), find a tape which does not contain last month's backup (I usually select one that is really old to prevent over-wearing a particular tape). This new tape will be labelled head-store. The month/year of the backup should be written in pencil on the tape's label. I leave a pencil in the cluster, but if it is not there (they go missing all the time), buy some on Matt's account at the bookstore; they have some really cheap pencils.

make sure the tape is loaded properly:

[root@vnfe4 ~]# mt -f /dev/nst1 status
SCSI 2 tape drive:
File number=0, block number=0, partition=0.
Tape block size 1024 bytes. Density code 0x81 (DLT 15GB compressed).
Soft error count since last status=0
General status bits on (41010000):
BOT ONLINE IM_REP_EN

If it returns anything other than what is shown above, the tape did not load properly. In that situation try

[root@vnfe4 ~]# mt -f /dev/nst1 load

if this does not change the status, you will need to try a different tape. There is a possibility that the tape is bad. Label it as bad and do not use it again. Another sign of a bad tape is one that will not load in the machine. If the tape drive does not load the tape, do not force it. Label it as bad and leave it. I am not certain these tapes are bad, but I would not trust data on them until I can verify they are good.

Assuming the tape is good and has now loaded properly you are ready to perform the monthly backups. The scripts needed for backups are:
/backups/scripts/head-store-level-0-backup_find
/backups/scripts/head-level-0-backup_tar
/backups/scripts2/bh-level-0_find
/backups/scripts2/bh0-level-0_tar
/backups/scripts2/bh-level-0_tar

The "find" scripts find the files that need to be backed up, and the "tar" scripts actually put them on the tape. All are in place in crontab as:

[root@vnfe4 ~]# crontab -e
# Format of lines: #min hour daymo month daywk cmd
00 20 03 1-12 * /backups/scripts/head-store-level-0-backup_find
00 01 04 1-12 * /backups/scripts/head-level-0-backup_tar
25 20 23 1-12 * /backups/scripts2/bh-level-0_find
30 21 23 1-12 * /backups/scripts2/bh0-level-0_tar
00 02 24 1-12 * /backups/scripts2/bh-level-0_tar

NOTE: These are set to night runs. This is to keep the head free from processes during the time we expect people to be online.
NOTE: in crontab all columns are separated by a space or tab character
NOTE: the "find" scripts find the files to be backed up. Since this is a much longer process than for the incrementals, Scott Noble and I separated it from the tar step.
NOTE: bh0 has a lot of files on it (several years' worth of students' HW and projects and ...), so it is set up to save to a separate disk when necessary. bh2 is also on that separate script, since the 410 TA has the same overload of files. This may be a problem in the future, since starting and stopping a write is slower than just copying a single large file (i.e. we may need to zip these old HW files in the future, otherwise backups are far too slow).
NOTE: the tar scripts actually copy the files from the HDD to the tape
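
If the old HW files ever do need to be bundled, one possible approach is to pack each old directory into a single compressed archive before the backup runs. This is a sketch of that idea only; it is not what the existing scripts do, and the function name bundle_dir and the paths in the comment are hypothetical.

```shell
# bundle_dir DIR OUT: pack DIR into the single compressed archive OUT.
# Paths inside the archive are relative to DIR's parent directory.
bundle_dir() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Hypothetical example (paths are placeholders, left commented out):
#   bundle_dir /home/someuser/old_hw /home/someuser/old_hw.tar.gz
#   tar -tzf /home/someuser/old_hw.tar.gz    # list the archive to verify it
```

Only after listing the archive and confirming it is complete would it be reasonable to remove the original directory.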

The bh backups depend on a script local to the bh machines called /usr/local/find_all. It may be problematic because it depends on the DNS names bh#.physics.ubc.ca; as the phas IT people change our DNS, this script may need to be modified.
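
A quick way to notice such a DNS change before the monthly run is to check that every bh name still resolves. The sketch below is an assumption-laden illustration: the function names are made up, the range of machine numbers is a placeholder you must fill in yourself, and it relies on getent being available on the host.

```shell
# bh_hostnames FIRST LAST: print bhN.physics.ubc.ca for N = FIRST..LAST.
bh_hostnames() {
    n="$1"
    while [ "$n" -le "$2" ]; do
        echo "bh${n}.physics.ubc.ca"
        n=$(( n + 1 ))
    done
}

# unresolved_hosts: read host names on stdin and print the ones
# that `getent hosts` cannot resolve.
unresolved_hosts() {
    while read -r host; do
        getent hosts "$host" > /dev/null || echo "$host"
    done
}

# Example (the range 0..3 is a placeholder, left commented out):
#   bh_hostnames 0 3 | unresolved_hosts
```

Any name this prints is one that find_all would presumably fail to reach.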

The partitions being backed up on vnfe4 are:
/home
/home2
/home3
/root

The partitions being backed up on bh machines are:
/home
/home2
/root
/etc

I have included /home2 on all machines so that the scripts will not need to be changed when new HDDs are installed (until, of course, we have 3 home partitions).

Feel free to read the scripts for details; ultimately the part that should concern you is the list of machines that are being backed up. These scripts are available online on another page in my sys-admin pages.

When the backups are complete (i.e. sometime the next day) do:

[root@vnfe4 ~]# mt -f /dev/nst1 status

this will tell you how many files are now stored on the tape.
There will be 4 files per partition (in the case of vnfe4) or 4 files per machine (in the case of the bh machines)

The first file is a blank,
the second is a label indicating what partition/machine is backed up,
the third is a blank,
the fourth will be the files themselves.
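
One way to sanity-check a finished level-0 run is to compare the file number reported by mt with the number of files you expect. This sketch assumes the tape started empty and that, at the end of the data, the reported file number equals the number of files written; both function names are made up for illustration.

```shell
# file_number: read `mt status` output on stdin and print the
# value of the "File number=" field.
file_number() {
    sed -n 's/^File number=\([0-9][0-9]*\),.*/\1/p'
}

# expected_files N: number of tape files for N backup sets (4 files each).
expected_files() {
    echo $(( 4 * $1 ))
}

expected_files 4   # prints 16 (vnfe4: /home, /home2, /home3, /root)

# Intended use on the backup host (needs a tape drive, so left commented out):
#   [ "$(mt -f /dev/nst1 status | file_number)" -eq "$(expected_files 4)" ] \
#       || echo "unexpected number of files on tape"
```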

to access the information on the tapes first rewind the tape:

entirely → mt -f /dev/nst1 rewind
n-files → mt -f /dev/nst1 bsf n (recall you will want to do multiples of 4 to find what you are interested in)

(note: mt -f /dev/nst1 fsf n moves forward n files on the tape)

then

tar tvf /dev/nst1

which lists only the file names; it does not extract the files themselves.

once you find the partition you are interested in recovering you will need to move back one file on the tape again

mt -f /dev/nst1 bsf 1

cd /backups/restore

tar xvf /dev/nst1

the last command extracts the files to the hard disk in /backups/restore. This process may take a while; tapes are much slower than other backup methods. They make up for that lack of speed by being more reliable overall.

Once complete:

ls -alh /backups/restore

to make sure you recovered the correct files.


Maintained by ajpenner@physics.ubc.ca. Supported by Matt Choptuik