Thursday, 16 September 2010

Complete backup solution implementation


The following is the complete backup solution, implemented using a combination of amanda and rsync scripts. This is the documentation for the backup process.

Backups are handled by a primary backup server running the amanda server package, version 3.1.2-1.rhel5.i386, with clients running the amanda client packages. An rsync backup script synchronizes the amanda configuration files and amanda vtapes (files containing amanda backups) to the remote backup server.

The two backup servers are redbck01 and redbck02. Both are HP ProLiant DL360 machines running CentOS 5.5 with 1.8 TB of storage and 6 GB of memory. redbck01 is the primary backup server and redbck02 is the secondary offsite backup server.

Amanda server configuration:
redbck01 is configured as the amanda server. It runs the configuration 'DailySet1' every night to back up the disk list entries (clients) specified in the disk list file. The amanda configuration files for 'DailySet1' are located in /etc/amanda/DailySet1/. amanda reads the amanda.conf and disklist files in this config directory when it runs the backup configuration.
The following lines from /etc/amanda/DailySet1/amanda.conf define the details of the backup configuration:
mailto "root" # emails the amanda report after the amanda backup runs
tpchanger "chg-disk" # tape changer for virtual tapes
tapedev "file://space/vtapes/DailySet1/slots" # the tape device (where the virtual tapes are defined and data is stored)
tapetype "HARDDISK" # use hard disk instead of tapes (vtape config)
dumpcycle "4 weeks" # the number of days in dump cycle before starting to reuse the tapes
runspercycle 20 # the number of amdump runs in dumpcycle days
tapecycle 25 tapes # the number of tapes in rotation
logdir "/etc/amanda/DailySet1/log" # log directory
define tapetype HARDDISK {
length 50000 mbytes # the maximum capacity of each slot is 50 GB * 25 slots = 1.25 TB max
}
auth "bsdtcp" # use bsdtcp for authentication between amanda server and clients
define dumptype comp-user-tar {
user-tar # we're using comp-user-tar in our disk list entries as the dump type as it uses fast compression
compress client fast
}
Amanda determines the backup levels with its own algorithm, which guarantees a full backup of every disklist entry within a dumpcycle and incremental backups as necessary.
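The virtual tape settings in amanda.conf (a chg-disk changer, a slots directory and 25 tapes of 50000 MB) imply one directory per slot on disk. The following is a minimal sketch of how such a layout could be prepared, not the procedure that was used on redbck01. The function names make_vtape_slots and vtape_capacity_mb and the slotN directory naming are assumptions, and labelling the tapes is left out.

```shell
#!/bin/sh
# Hypothetical helper: create numbered slot directories for chg-disk vtapes.
# Usage: make_vtape_slots <slots-dir> <count>
make_vtape_slots() {
    slots_dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        mkdir -p "$slots_dir/slot$i" || return 1
        i=$((i + 1))
    done
}

# Capacity check from the tapetype definition: slot size in MB times slot count.
vtape_capacity_mb() {
    echo $(( $1 * $2 ))
}

# Example (as the amandabackup user on the backup server):
#   make_vtape_slots /space/vtapes/DailySet1/slots 25
#   vtape_capacity_mb 50000 25    # prints 1250000 (MB, i.e. 1.25 TB)
```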
The following are the disklist entries from the /etc/amanda/DailySet1/disklist file, which defines the clients and directories being backed up along with the dumptype:
xenora02 / comp-user-tar
clxbld01 /etc comp-user-tar
clxbld01 /root comp-user-tar
clxbld01 /home comp-user-tar
clxbld01 /usr/svn comp-user-tar
clxbld01 /usr/local comp-user-tar
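Each simple disklist entry needs three whitespace-separated fields: host, directory and dumptype. As a rough sketch, a check like the one below flags an entry where two fields have run together. The function name check_disklist is hypothetical, and the check only understands one-line entries, not the multi-line form with braces.

```shell
#!/bin/sh
# Hypothetical helper: report disklist lines that do not have at least
# three whitespace-separated fields (host, disk, dumptype).
# Blank lines and lines starting with '#' are ignored.
check_disklist() {
    awk '
        /^[ \t]*(#|$)/ { next }
        NF < 3 {
            printf "line %d: expected host, disk and dumptype: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example:
#   check_disklist /etc/amanda/DailySet1/disklist
```

This does not replace amcheck, which remains the real test of the configuration.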

The amanda server uses the /var/lib/amanda/.amandahosts file to allow the backup clients to connect back to the server when doing restores. Specify the fully qualified domain names of all clients as follows:

xenora02 root amindexd amidxtaped

To check that the amanda configuration is working without any errors, run the amcheck tool on redbck01 as the amandabackup user:
-sh-3.2$ amcheck DailySet1
To run the backup manually on redbck01, as amandabackup, run amdump to start the DailySet1 backup.
-sh-3.2$ amdump DailySet1
After amdump finishes, it emails the report to the addresses specified in the config file.
To find out what has been backed up, we can use the amadmin tool with the find argument for a quick summary of all backups.
-sh-3.2$ amadmin DailySet1 find
amdump will run automatically every night. Crontab entries for the amandabackup user are:
0 16 * * 1-5 /usr/sbin/amcheck -m DailySet1 # every weekday at 4 pm run amcheck to verify the configuration is working
5 0 * * 1-6 /usr/sbin/amdump DailySet1 # every Mon-Sat at 5 past midnight run amdump on the DailySet1 config
amdump maximum runtime at the current configuration is approximately: 4 hours
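When a backup is started by hand, the check and the dump can be chained so that amdump only runs when amcheck passes. This is a minimal sketch under the assumption that amcheck exits non-zero when it finds problems. The function name run_backup and the AMCHECK and AMDUMP override variables are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run amdump only if amcheck succeeds first.
# AMCHECK and AMDUMP can be overridden, e.g. for a dry run.
AMCHECK=${AMCHECK:-/usr/sbin/amcheck}
AMDUMP=${AMDUMP:-/usr/sbin/amdump}

run_backup() {
    config=$1
    if ! $AMCHECK "$config"; then
        echo "amcheck failed for $config, skipping amdump" >&2
        return 1
    fi
    $AMDUMP "$config"
}

# Example (as the amandabackup user):
#   run_backup DailySet1
```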
Synchronizing redbck01 and redbck02:
Our main approach to backups is that, if a client fails, we must be able to recover its data as quickly as possible, with no single point of failure. For example, if a client dies and its data has to be restored to a newly built client, we can get the data from either redbck01 or redbck02. Because redbck01 is the primary backup server and sits inside the network, recovering from it is quicker. If redbck01 is lost in a disaster, we can get the data from the offsite backup server redbck02 by configuring the amanda client accordingly.
The data is therefore synchronised between redbck01 and redbck02 using rsync. The amanda server package is installed on redbck02; the data it needs from redbck01 is '/etc/amanda/', which holds the amanda configuration files, and '/space/vtapes', which holds the backups from the clients (disk list entries). An rsync backup script, 'redbck02rsync.sh', syncs this data from redbck01 to redbck02 every night.
Crontab entries for user root on redbck01 which runs the rsync script every night are:
#0 0 * * 1-6 /root/redbck02rsync.sh -s /etc/amanda -d redbck02:/etc
#0 1 * * 1-6 /root/redbck02rsync.sh -s /space/vtapes -d redbck02:/space
Amanda Client Configuration:
Amanda clients use the /var/lib/amanda/.amandahosts file to allow the backup server to connect to the amanda client, so the file must contain the following entry for the backup server:
redbck01.
The amanda client configuration file is /etc/amanda/amanda-client.conf, which is used for connecting to the server when restoring data. It contains the configuration name, the index server (either redbck01 or redbck02, depending on the situation), the tape_server and the authentication protocol. The following are some of the lines from /etc/amanda/amanda-client.conf:
conf "DailySet1" # your config name
index_server "redbck01." # your amindexd server
tape_server "redbck01." # your amidxtaped server
auth "bsdtcp"
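Switching a client between the primary and the offsite server comes down to changing index_server and tape_server in amanda-client.conf. The following is a minimal sketch of doing that with sed. The function name set_backup_server is hypothetical, and any trailing comment on the two rewritten lines is dropped.

```shell
#!/bin/sh
# Hypothetical helper: rewrite index_server and tape_server in an
# amanda-client.conf file so that amrecover contacts a different server.
# All other lines are kept as they are.
# Usage: set_backup_server <server-name> <conf-file>
set_backup_server() {
    server=$1
    conf=$2
    tmp="$conf.tmp.$$"
    sed -e "s|^index_server .*|index_server \"$server\"|" \
        -e "s|^tape_server .*|tape_server \"$server\"|" \
        "$conf" > "$tmp" && mv "$tmp" "$conf"
}

# Example (as root on the client, with the server's full name):
#   set_backup_server <fully-qualified-name-of-redbck02> /etc/amanda/amanda-client.conf
```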
Recovery
To recover data to a client after data loss or system failure, we can use either the primary backup server redbck01 or the offsite backup server redbck02. We must specify the configuration, index_server and tape_server details in the /etc/amanda/amanda-client.conf file.
After the configuration is in place, run amrecover as root to start the data recovery process.
[root@dlxmkt01 ~]# amrecover
AMRECOVER Version 3.1.2. Contacting server on redbck01. ...
220 redbck01 AMANDA index server (3.1.2) ready.
Setting restore date to today (2010-09-16)
200 Working date set to 2010-09-16.
200 Config set to DailySet1.
501 Host dlxmkt01 is not in your disklist.
Trying host dlxmkt01 ...
501 Host dlxmkt01 is not in your disklist.
Trying host dlxmkt01 ...
501 Host dlxmkt01 is not in your disklist.
Use the sethost command to choose a host to recover
amrecover>
The commands below demonstrate a recovery of a set of files and directories to the "/tmp/amanda/phooper/" directory. amrecover may not ask for the sethost command if the system hostname is a fully qualified domain name.
amrecover> listhost
200- List hosts for config DailySet1
201- xenora02.
201- clxbld01.
201- isobck01.
200 List hosts for config DailySet1
amrecover> sethost dlxmkt01.
200 Dump host set to dlxmkt01.
amrecover> listdisk
200- List of disk for host dlxmkt01.
201- /
200 List of disk for host dlxmkt01.
amrecover> setdisk /
200 Disk set to /.
amrecover> cd /
amrecover> ls
2010-09-16-00-05-03 var/
2010-09-16-00-05-03 usr/
2010-09-16-00-05-03 tmp/
2010-09-16-00-05-03 tdev15/
2010-09-16-00-05-03 sys/
2010-09-16-00-05-03 srv/
2010-09-16-00-05-03 selinux/
2010-09-16-00-05-03 sbin/
2010-09-16-00-05-03 root/
2010-09-16-00-05-03 proc/
2010-09-16-00-05-03 opt/
2010-09-16-00-05-03 mnt/
2010-09-16-00-05-03 media/
2010-09-16-00-05-03 lib64/
2010-09-16-00-05-03 lib/
2010-09-16-00-05-03 initrd/
2010-09-16-00-05-03 home/
2010-09-16-00-05-03 etc/
2010-09-16-00-05-03 dev/
2010-09-16-00-05-03 data/
2010-09-16-00-05-03 boot/
2010-09-16-00-05-03 bin/
2010-09-16-00-05-03 .
2010-09-14-00-05-03 init
2010-09-14-00-05-03 .autofsck
amrecover> lcd /
amrecover> add /tmp/amanda/phooper
Added dir /tmp/amanda/phooper/ at date 2010-09-16-00-05-03
amrecover> extract
Extracting files using tape drive changer on host redbck01.
The following tapes are needed: DailySet1-15
Extracting files using tape drive changer on host redbck01.
Load tape DailySet1-15 now
Continue [?/Y/n/s/d]? y
amrecover> quit
200 Good bye.
If redbck01 fails, we can recover the data from redbck02 by running the rsync script 'redbck02rsync.sh', located in the /root/ directory, from redbck02.

Wednesday, 15 September 2010

Rsync Backup Script

A backup script for synchronising directories between two or more systems. This version adds a few things to the basic script. The most obvious addition, at least for anything that is handed to the system scheduler, was logging. The next two are sort of ad-hoc judgment calls. A usage message (which is quite lengthy) was added simply because, well, once the script has been running for two months, a user who had to rerun it by hand frankly wouldn't have a clue. The last item was an override for the flags and protocol in the form of additional arguments. The rsync flags are passed in as a quoted string, which makes them easier to parse in the script. The reader should feel free to change the script as they see fit.

Adding the ability to do different multiple sources and/or destinations was dropped due to the amount of heads and arms the script would have to grow to facilitate the capability. Additionally, logging is simplified by invoking the script by itself for each session. Remember the golden Unix utility rule:

Do one thing and do it well

Bearing that in mind, multiple sources and destinations would be better served by a completely different script, lest the current one morph into some sort of evil hydra.

Adding Logging

It is a shell script and as such it makes perfect sense to use the shell utility logger to access syslog directly.

PROTO=ssh
RSYNC=rsync
RSYNC_FLAGS="-az --delete -e $PROTO"
LOGCMD=logger
LOGID=rsyncondemand # The name of our /var/log/messages entries
LOG_FLAGS="-t $LOGID" # Flags for the logger interface

Note that UTIL was changed to RSYNC. It is pretty much a given that rsync is being used, so for sanity it was changed. The additions here are the logger command and one flag, which sets what the entry will look like. Here is a sample from /var/log/messages:

Mar 16 00:32:12 pyxis syncondemand[3410]: Test

The rest

There are only three entries, however, anyone could feel free to change them. They are as follows:

for i in $PROTO $RSYNC $LOGCMD
do
    if ! type ${i} >/dev/null; then
        $LOGCMD $LOG_FLAGS "${i} not found"
        bomb "${i} not found"
    fi
done

A message in the utilities checker loop. If, for instance, the utility were installed blindly on a new machine and tossed into crontab, this rather informative message would let the administrator know that they need to install a utility or application.
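The checker loop relies on a bomb function that is not shown in this post. The following self-contained sketch shows one way the pieces could fit together. The bomb stand-in, log_msg and check_utils are hypothetical names, and this bomb returns a failure status rather than exiting the script.

```shell
#!/bin/sh
LOGID=rsyncondemand

# Log to syslog when logger is available; always echo to stderr as well.
log_msg() {
    if command -v logger >/dev/null 2>&1; then
        logger -t "$LOGID" "$*" 2>/dev/null || true
    fi
    echo "$LOGID: $*" >&2
}

# Hypothetical stand-in for the script's bomb function: report and fail.
bomb() {
    log_msg "$*"
    return 1
}

# Return non-zero if any of the named utilities is missing from PATH.
check_utils() {
    for util in "$@"; do
        if ! command -v "$util" >/dev/null 2>&1; then
            bomb "$util not found"
            return 1
        fi
    done
}

# Example:
#   check_utils ssh rsync logger || exit 1
```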



$LOGCMD $LOG_FLAGS "Starting $RSYNC operation from $SRC to $DST"
$RSYNC $RSYNC_FLAGS $SRC $DST
$LOGCMD $LOG_FLAGS "Finished $RSYNC operation from $SRC to $DST"

Start and stop messages record how long each run takes. What happens if the runs start to take extraordinarily long? These messages let the admin know ahead of time and can easily be added to log analysis tools.
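If the elapsed time itself should appear in the output, the operation can be wrapped in a small timer. This is a sketch only: timed_run is a hypothetical name, and it prints to standard output instead of calling logger so that it stays self-contained.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command and report how long it took, in seconds.
# Uses date +%s (seconds since the epoch), as provided by GNU and BSD date.
timed_run() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "finished with status $status after $((end - start)) seconds"
    return $status
}

# Example:
#   timed_run rsync -az --delete -e ssh /etc/amanda redbck02:/etc
```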

Additional Flags & Proto

Note the top of the script with the following global definitions:

PROTO=ssh
RSYNC_FLAGS="-az --delete -e $PROTO"

In this version only those two are altered. Changing the protocol requires that the actual final rsync invocation be changed as well.

First, make the call and global different:

PROTO=ssh
RSYNC_FLAGS="-az --delete -e "
...
$RSYNC $RSYNC_FLAGS $PROTO $SRC $DST

Note that the protocol now must be defined either by the ssh default or using the modification.

Next, all that has to be done is to add a method to change the default rsync flags and protocol. It can all be done in the switch/case loop:

while getopts s:d:f:p: ch; do
    case ${ch} in
        s) SRC=${OPTARG}
        ;;
        d) DST=${OPTARG}
        ;;
        f) RSYNC_FLAGS=${OPTARG}
        ;;
        p) PROTO=${OPTARG}
        ;;
    esac
done
shift $((${OPTIND} - 1))

The only downside is that the -f has to use double quotes to replace all of the strings - regrettable - yes; but there is a usage message after all.
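To see how the overrides behave, the option parsing can be tried on its own. The sketch below parses the four options and only prints the rsync command it would run. build_rsync_cmd is a hypothetical name, and the -e option is added at the end here rather than kept inside RSYNC_FLAGS, which is a simplification.

```shell
#!/bin/sh
# Sketch: parse -s, -d, -f and -p, then print the rsync command that
# would be run. Nothing is executed.
build_rsync_cmd() {
    PROTO=ssh
    RSYNC_FLAGS="-az --delete"
    SRC=
    DST=
    OPTIND=1
    while getopts s:d:f:p: ch; do
        case ${ch} in
            s) SRC=${OPTARG} ;;
            d) DST=${OPTARG} ;;
            f) RSYNC_FLAGS=${OPTARG} ;;
            p) PROTO=${OPTARG} ;;
            *) return 1 ;;
        esac
    done
    echo "rsync $RSYNC_FLAGS -e \"$PROTO\" $SRC $DST"
}

# Examples:
#   build_rsync_cmd -s /etc/amanda -d redbck02:/etc
#   build_rsync_cmd -s /etc/amanda -d redbck02:/etc -f "-avz" -p "ssh -p 2222"
```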

The Easy Part - Usage

Usage messages in shell scripts, frankly, rock, simply because there is one simple, catch-all way to do them, even though they look kind of freaky relative to the indentation of the rest of the script. As usual, only the code can explain it:

usage() {     cat <<_usage_>

Basically it is just redirection and yes, it does make a well formed script look really ugly; however, using that particular method over typing echo... echo... echo... and avoiding echo calls makes it well worth the apparent ugliness.
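As an illustration of the heredoc technique, here is a minimal sketch of what such a usage function could look like. The wording of the message is an assumption based on the options discussed in this post, not the original text of the script.

```shell
#!/bin/sh
# Sketch of a heredoc-based usage function. The closing _usage_ marker
# has to start in the first column, which is why it breaks the indentation.
usage() {
    cat <<_usage_
Usage: ${0##*/} -s source -d destination [-f "rsync flags"] [-p protocol]
  -s  source directory
  -d  destination, e.g. host:/path
  -f  rsync flags, quoted as one string
  -p  remote shell passed to rsync -e (default: ssh)
_usage_
}

# Example:
#   usage >&2; exit 1
```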


Example Crontab entries for the script are:

$ crontab -e

0 1 * * 1-6 /root/redbck02rsync.sh -s /etc/amanda -d redbck02:/etc

0 5 * * 1-6 /root/redbck02rsync.sh -s /space/vtapes -d redbck02:/space


Summary
This simple script should be a good start to organizing and managing rsync style backups. It is small, concise and easy to use or modify. Some food for thought might be adding ssh options (such as forced versions) and extended operations like cloning.


passwordless ssh and rsync script

Method1 using RSA authentication


The SSH protocol is recommended for remote login and remote file transfer. It provides confidentiality and security for data exchanged between two computer systems through the use of public key cryptography. The OpenSSH server provides this kind of setup under Linux and is installed by default. This how-to covers generating and using ssh keys for automated usage such as:

  1. Automated Login using the shell scripts.
  2. Making backups.
  3. Run commands from the shell prompt etc.

Task: Generating SSH Keys

First, log on to your workstation (for example, log on to the workstation called admin.fbsd.nixcraft.org as the user vivek). Please refer to the following sample setup. You must be logged in on your local system AS THE USER you wish to make passwordless ssh connections as.

Fig.01 ssh key based authentication


Create the cryptographic Key on FreeBSD / Linux / UNIX workstation, enter:
ssh-keygen -t rsa


Assign the pass phrase (press [enter] key twice if you don't want a passphrase). It will create 2 files in ~/.ssh directory as follows:

  • ~/.ssh/id_rsa : identification (private) key
  • ~/.ssh/id_rsa.pub : public key

Use scp to copy id_rsa.pub (the public key) to the rh9linux.nixcraft.org server as the authorized_keys2 file. This is known as installing the public key on the server.


scp .ssh/id_rsa.pub vivek@rh9linux.nixcraft.org:.ssh/authorized_keys2
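Copying the key over the authorized keys file replaces whatever keys were already there. Where that matters, the key can be appended instead. The following is a minimal sketch meant to run on the server after the public key file has been copied there under some other name. The function name add_authorized_key and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized keys file
# unless that exact line is already present, then restrict permissions.
# Usage: add_authorized_key <public-key-file> <authorized-keys-file>
add_authorized_key() {
    pubkey=$1
    auth=$2
    mkdir -p "$(dirname "$auth")" || return 1
    touch "$auth" || return 1
    if ! grep -qxF -f "$pubkey" "$auth"; then
        cat "$pubkey" >> "$auth" || return 1
    fi
    chmod 600 "$auth"
}

# Example (on the server):
#   add_authorized_key /tmp/id_rsa.pub "$HOME/.ssh/authorized_keys2"
```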


From the FreeBSD workstation, log in to the server:


ssh rh9linux.nixcraft.org


Changing the pass-phrase on workstation (if needed):


ssh-keygen -p


Using ssh-agent to avoid typing the pass-phrase repeatedly
At the FreeBSD workstation, type:


ssh-agent $BASH
ssh-add


Type your pass-phrase

Now ssh will not prompt for the pass-phrase. The two commands above can be added to your ~/.bash_profile file so that the agent is set up as soon as you log in to the workstation.
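For a login profile, a variant that starts an agent only when the session does not already have one may be more convenient. This is a sketch, not part of the original how-to. It uses the eval form of ssh-agent instead of starting a new shell, and start_agent_once is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper for ~/.bash_profile: start ssh-agent and load the
# default key only if this session has no agent yet.
start_agent_once() {
    if [ -n "$SSH_AUTH_SOCK" ]; then
        return 0    # an agent is already available in this session
    fi
    # ssh-agent -s prints Bourne-shell commands that export the agent socket.
    eval "$(ssh-agent -s)" >/dev/null || return 1
    ssh-add
}

# Example (in ~/.bash_profile):
#   start_agent_once
```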

Deleting the keys held by ssh-agent

To list keys, enter:
ssh-add -l
To delete all keys, enter:
ssh-add -D
To delete a specific key, enter:
ssh-add -d key


Method2 using DSA authentication

Here is a list of the steps that I had to do to get automatic replication of /home/folder1 (or any other folder) on one server to /home/folder2 on another server:

Passwordless SSH

To get replication working securely, you first need to be able to connect via SSH without using passwords:

First server setup

ssh-keygen -t dsa

(press enter twice to give a blank password)

cd
vi .ssh/config

Press "i" to enter insert mode and copy this into the file:

Host remotehost
User remoteuser
Compression yes
Protocol 2
RSAAuthentication yes
StrictHostKeyChecking no
ForwardAgent yes
ForwardX11 yes
IdentityFile /home/localuser/.ssh/id_remotehost_dsa

Do NOT change the last line - it is supposed to say remotehost (not an actual host name). Now,

:wq

(save and exit vi)

chmod 700 .ssh
vi .ssh/id_dsa.pub

It should look like this:

ssh-dss AAAA..............v root@HOSTNAMEOFSRV01

where the dots stand for lots of random letters and numbers. Select it all and copy it. Make sure that it is all on one line with no spaces at the start or finish (which will happen if you copy it using putty on windows; test it by pasting it into notepad).
Tip: To copy from putty on windows, select the text from within vi and press Ctrl + Shift. To paste text, enter insert mode and press the right mouse button.

Second Server Setup

cd
vi .ssh/authorized_keys

Enter insert mode (press i) and paste the key, again ensuring that there are no spare newlines or spaces. Save the file and exit vi (press :wq then return, as above). Now you just need to set some permissions otherwise SSH will ignore the files you just created:

chmod 700 .ssh
chmod 644 .ssh/authorized_keys

Testing passwordless SSH

On the first server, type

ssh srv02

where srv02 = the hostname of the second server. It could be an IP address too.

If it just logs you in (no passwords), then you are done. If not, double check the steps above and start searching Google for your errors.
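For scripts and cron jobs, the same test can be made non-interactive. With BatchMode=yes, ssh fails instead of asking for a password, so the exit status shows whether key-based login works. This is a minimal sketch. The function name can_login_without_password and the SSH override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: succeed only if login works without any prompt.
# SSH can be overridden, e.g. for testing.
SSH=${SSH:-ssh}

can_login_without_password() {
    $SSH -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Example:
#   if can_login_without_password srv02; then
#       echo "passwordless ssh works"
#   fi
```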


Replication

You have two options for replication: Unison and Rsync.

  • Rsync is one-way (will overwrite changes on the second server).

  • Unison is two-way (will allow changes on both servers, though clearly not at the same time!).

Setting up RSYNC

Skip to the unison section if you want two-way replication

Rsync is normally installed so I will not go through installing it. To make the rsync connection run the following command on srv01:

rsync -e ssh -avz --delete /home/folder1/ srv02hostname:/home/folder2
rsync -e ssh -avz --delete /etc/amanda/ redbck02:/etc/amanda/
rsync -e ssh -avz --delete /space/vtapes/ redbck02:/space/vtapes/

again, where srv02 is the hostname or IP of srv02. This will make /home/folder2 on srv02 (the second server) identical to /home/folder1 (be aware that this will delete all files in /home/folder2 on srv02 that are not in /home/folder1 on srv01!)
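Whether the source path ends in a slash decides what ends up on the other side, and with --delete it also decides which destination directory gets cleaned out. The sketch below demonstrates the difference locally in a temporary directory, without ssh or --delete. demo_trailing_slash is a hypothetical name, and rsync must be installed for it to run.

```shell
#!/bin/sh
# Local demonstration of how a trailing slash on the source changes the result.
# Prints the entries created under each destination directory.
demo_trailing_slash() {
    work=$(mktemp -d) || return 1
    mkdir -p "$work/src/folder1" "$work/with" "$work/without"
    echo data > "$work/src/folder1/file.txt"

    # Trailing slash: copy the contents of folder1 into the destination.
    rsync -a "$work/src/folder1/" "$work/with/"
    # No trailing slash: copy folder1 itself into the destination.
    rsync -a "$work/src/folder1" "$work/without/"

    echo "with slash:    $(ls "$work/with")"
    echo "without slash: $(ls "$work/without")"
    rm -rf "$work"
}

# Example:
#   demo_trailing_slash
```

With the trailing slash the destination directory itself is made to match the source, so it should name the directory that is meant to become the copy.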

You can put as many of these as you like in the crontab (crontab -e). You now have rsync set up: congratulations.

Setting up UNISON

cd /bin
wget http://www.cis.upenn.edu/~bcpierce/unison/download/stable/latest/unison.linux-textui
mv unison.linux-textui unison
chmod +x unison

Then run this example on the first server to sync the /var/www/html/ directories on both:

unison /var/www/html ssh://srv02hostname//var/www/html -batch

again, where srv02 is the hostname or IP of srv02

This will take a very long time to run for the first time but is very quick after that.

You can put as many of these as you like in the crontab (crontab -e). You now have unison set up: congratulations.



References:

http://www.davz.net/static/howto/sshkeys

Tuesday, 14 September 2010

Extend LVM Disk Space With New Hard Disk

This is a step-by-step guide to extending the disk space of a logical volume group configured under LVM version 1.x on Redhat Enterprise Linux AS 3. The guide has also been used to extend LVM disk space with a new SCSI hard disk configured with LVM version 2.x on Debian Sarge 3.1.

So, it’s good enough to serve as a reference for Linux users, who plan to extend LVM disk space in Linux distributions other than Redhat and Debian Linux.

Although it’s not necessary, it’s advised to perform a full file system backup before carrying out this exercise!

The riskiest step is resizing the file system that resides in an LVM logical volume. Make sure the right file system resizer tool is used. If you use resize2fs to resize a Reiserfs file system, I guess you’ll know how bad the consequences will be.

Apparently, you’ll need resize_reiserfs to resize a Reiserfs file system, which is part of the reiserfsprogs package.
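The choice of tool can be written down as a small lookup. This is a sketch covering only the file system types named in this guide. resizer_for is a hypothetical name, and the df -PT call in the example assumes GNU df.

```shell
#!/bin/sh
# Hypothetical helper: print the resize tool that matches a file system type.
resizer_for() {
    case $1 in
        ext2|ext3) echo resize2fs ;;
        reiserfs)  echo resize_reiserfs ;;
        *)         echo "no resizer known for $1" >&2; return 1 ;;
    esac
}

# Example (assumes GNU df, where -T adds a file system type column):
#   fstype=$(df -PT /home | awk 'NR == 2 { print $2 }')
#   resizer_for "$fstype"
```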

These are the steps to extend the /home file system, which is mounted on logical volume /dev/vg0/lvol1 of volume group vg0, using a new 36GB SCSI hard disk added to RAID 0 on an HP Smart Array 5i Controller.


1) Log in as root user and type init 0 to shutdown Redhat Enterprise AS 3 Linux.

2) Add the new 36GB SCSI hard disk. Since the HP Smart Array 5i is configured for RAID 0, it’s fine to mix hard disks of different capacities, but the hard disk speeds must be the same! A mix of 10K and 15K RPM hard disks might cause Redhat Enterprise Linux to fail to boot properly.

Normally, HP Smart Array 5i Controller will automatically configure new hard disk as a logical drive for RAID 0. If not, press F8 on boot up to get in HP Smart Array 5i Controller setup screen and manually create logical drive as part of RAID 0.

How to tell if new hard disk is not configured as logical drive for RAID 0?

Physically, the hard disk green light should be on or blinking to indicate that it’s online to RAID system.

From OS level, 3rd hard disk in RAID 0 of HP Smart Array 5i Controller is denoted as /dev/cciss/c0d2. So, type

fdisk /dev/cciss/c0d2

at the root command prompt. If an error message such as Unable to open /dev/cciss/c0d2 is returned, the new hard disk is not online to the RAID system or to Redhat Linux.

3) Boot up Redhat Enterprise Linux into multi-user mode and confirm it’s working properly. This step is not necessary, but it’s a good practice to prove that the server is working fine after each change has been made, be it a major or minor change.

4) Type init 1 at root command prompt to boot into single user mode. Whenever possible, boot into single user mode for system maintenance as to avoid inconsistency or corruption.

5) At the root command prompt, type

fdisk /dev/cciss/c0d2

to create a partition on the 3rd SCSI hard disk added to RAID 0. Each hard disk needs at least one partition (maximum 4 primary partitions per hard disk) before it can be used in a Linux system.

6) While at the fdisk command prompt, type m to view fdisk command options.

7) Type n to add a new partition, followed by p to go for primary partition type.

8) Type 1 to create the first partition. Press ENTER to accept the default first cylinder of 1, and press ENTER again to accept the default value for the last cylinder, which creates a single partition that uses all of the hard disk space.

9) Type t to change the partition system id, or partition type. As there is only one partition, partition 1 is automatically selected. Type L to list all supported partition types. As shown in the partition type listing, type 8e to set partition 1 to the Linux LVM partition type.

10) Type p to confirm that partition /dev/cciss/c0d2p1 has been created in the partition table. Type w to write the changed partition table to the hard disk and exit fdisk.

11) Type df -hTa to confirm the type of the /home file system, which is mounted on logical volume /dev/vg0/lvol1. In this case, it’s an ext3 file system.

12) Type umount /home to un-mount /home file system from Redhat Enterprise Linux.

13) Next, type LVM command

pvcreate /dev/cciss/c0d2p1

to create a new LVM physical volume on the new partition /dev/cciss/c0d2p1.

14) Now, type another LVM command

vgextend vg0 /dev/cciss/c0d2p1

to extend LVM volume group vg0, with that new LVM physical volume created on partition /dev/cciss/c0d2p1.

15) Type pvscan to display physical volumes created in Linux LVM system, which is useful to answer questions such as “How many physical volume created in volume group vg0?”, “How much of free disk space left on each physical volume?”, “How do I know which physical volume should be used for a logical volume?” “Which physical volume has free disk space for used with a logical volume?”, etc.



Sample output of pvscan command:

ACTIVE PV "/dev/cciss/c0d0p4" of VG "vg0" [274.27GB / 0 free]
ACTIVE PV "/dev/cciss/c0d1p1" of VG "vg0" [33.89GB / 0 free]
ACTIVE PV "/dev/cciss/c0d2p1" of VG "vg0" [33.89 GB / 33.89 GB free]
total: 3 [342.05 GB] / in use: 3 [342.05 GB] / in no VG: 0 [0]

Alternatively, type vgdisplay vg0 | grep PE to confirm that the new physical volume has been added to volume group vg0. Take note of Free PE / Size, 35GB in this case, which is the free disk space added to volume group vg0 by the new physical volume.

16) Execute LVM command

lvextend -L +33G /dev/vg0/lvol1 /dev/cciss/c0d2p1

to extend the size of logical volume /dev/vg0/lvol1 of volume group vg0 by 33GB on physical volume /dev/cciss/c0d2p1.

17) Now the riskiest steps start. Type this command

e2fsck -f /dev/vg0/lvol1

to force an ext3 file system check on /dev/vg0/lvol1. The file system must be confirmed to be in a good state before any changes are made to it.

CAUTION – Utility e2fsck is only used to check EXT file systems such as ext2 and ext3, and not other file systems such as the Reiserfs file system!

Once the ext file system check completes without errors or warnings, type command

resize2fs /dev/vg0/lvol1

to resize the EXT3 file system of /home, which is mounted on logical volume /dev/vg0/lvol1, until it takes up all of the free disk space added to /dev/vg0/lvol1.

CAUTION – Utility resize2fs is only used to resize EXT file system such as ext2 and ext3, and not other file systems such as Reiserfs file system!

Both the e2fsck and resize2fs utilities are part of the e2fsprogs package. Both take some minutes to complete, depending on the size of the target file system.

If everything is all right, type mount /home to re-mount the /home file system. Next, type df -h to confirm that the /home file system has been extended successfully.
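The LVM and file system commands from steps 12 to 17 can be collected into a dry-run helper that only prints them for review. This is a sketch for the ext2/ext3 case described here. print_extend_plan is a hypothetical name and nothing is executed.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the commands from steps 12 to 17 for a
# given mount point, partition, volume group, logical volume and size.
# Nothing is executed; review the output before running it by hand.
print_extend_plan() {
    mnt=$1; part=$2; vg=$3; lv=$4; size=$5
    cat <<EOF
umount $mnt
pvcreate $part
vgextend $vg $part
lvextend -L +$size /dev/$vg/$lv $part
e2fsck -f /dev/$vg/$lv
resize2fs /dev/$vg/$lv
mount $mnt
EOF
}

# Example:
#   print_extend_plan /home /dev/cciss/c0d2p1 vg0 lvol1 33G
```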

references:
http://www.walkernews.net/2007/02/27/extend-lvm-disk-space-with-new-hard-disk/