≡ Menu

rsync

A shell variable may be assigned to by a statement using following syntax:
var=value

If value is not given, the variable is assigned the null string. In shell program it is quite useful to provide default value for variables. For example consider rsync.sh script:
#!/bin/bash
RSRC=$1
LOCAL=$2
rsync -avz -e 'ssh ' user@myserver:$RSRC $LOCAL

This script can be run as follows:
$ ./rsync.sh /var/www .
$ ./rsync.sh /home/vivek /home/vivek

It will sync remote /home/vivek directory with local /home/vivek directory. But if you need to supply default values for a variable you can write as follows:

#!/bin/bash
RSRC=$1
LOCAL=$2
: ${RSRC:="/var/www"}
: ${LOCAL:="/disk2/backup/remote/hot"}
rsync -avz -e 'ssh ' user@myserver:$RSRC $LOCAL

: ${RSRC:="/var/www"} ==> this means if the variable RSRC is not already set, set the variable to /var/www. You can also write same statement with following code:

if [ -z "$RSRC" ]
then
   RSRC="/var/www"
fi

You can also execute a command and set the value to returned value (output). For example if the variable NOW is not already set, execute command date and set the variable to the todays date using date +"%m-%d-%Y":

#!/bin/bash
NOW=$1
....
.....
: ${NOW:=$(date +"%m-%d-%Y")}
....
..

How do you install and use rsync to synchronize files and directories from one location (or one server) to another location? - A common question asked by new sys admin.
[click to continue…]

Perform backups for the Linux operating system

This question asked again and again by a new Linux sys admins:

How do I perform backups for my Linux operating system?

So I am putting up all necessary information you ever need to know about backup. The main aim is to provide you necessary software, links and commands to get started as soon as possible.

Backup is essential

First a backup is essential. You need a good backup strategy to:

  • Minimize time from disaster such as server failure or human error (file deleted) or acts of God
  • To avoid downtime
  • Save money and time
  • And ultimately to save your job ;)

A backup must provide

  • Restoration of a single/individual files
  • Restoration of file systems

What to backup?

  • User files and dynamic data [databases] (stored in /home or specially configured partitions or /var etc).
  • Application software (stored in /usr)
  • OS files
  • Application configuration files (stored in /etc, /usr/local/etc or /home/user/.dotfiles)

Different types of backups

  • Full backups: Each file and directory is written to backup media
  • Incremental backups (Full + Incremental backup): This backups are used in conjunction with full backup. These backups will be incremental if each original piece of backed up information is stored only once , and then successive backups only contain the information that changed since the previous one. It use file's modification time to determine which file need to backup.

So when you restore incremental backup:

  1. First restore the last full backup
  2. Next every subsequent incremental backup you need to restore

Preferred Backup Media

  1. Tape (old and trusted method)
  2. Network (ftp, nas, rsync etc)
  3. Disk (hard disk, optical disk etc)

Test backups

Please note that whichever backup media you choose, you need to test your backup. Perform tests to make sure that data can be read from media.

Backup Recommendation

My years of experience show that if you follow following formulas you are most likely to get back your data in worst scenario:
(a) Rotate backup media
(b) Use multiple backup media for same data such as ftp and tape
(c) Keep old copies of backups offsite

In short create good disaster recovery plan.

General procedure to restore a Linux/UNIX box

There is not golden rule or procedure but I follow these two methods:

Method # 1: Reinstall everything, restore everything, and secure everything

Use this method (bare metal recovery) if your server is cracked or hacked or hard drive is totally out of order:

  1. Format everything
  2. Reinstall os
  3. Configure data partitions (if any)
  4. Install drivers
  5. Restore data from backup media
  6. Configure security

Method # 2: Use of recovery CD/DVD rom

Use this method if your box is not hacked and system cannot boot or MBR damaged or accidental file deletion etc:

  1. Boot into rescue mode.
  2. Debug (or troubleshoot) the problem
  3. Verify that disk partitions stable enough (use fsck) to put backup data
  4. Install drivers
  5. Restore data from backup media
  6. Configure security

Linux (and other UNIX oses) backup tools

Luckily Linux/UNIX provides good set of tools for backup. We have almost covered each and every tool mentioned below. Just follow the link to get more information about each command and its usage:

It is also recommended that you use RAID or LVM (see consistent backup with LVM) or combination of both to increase reliability of data.

A note about MySQL or Oracle database backup

Backing up database server such as MySQL or Oracle needs more planning. Generally you can apply a table write lock and use mysql database dump utility to backup database. You can also use LVM volume to save database data.

A note about large scale backup

As I said earlier tar is good if you need to backup small amount of data that does not demands high CPU or I/O. Following are recommended tools for backup that demands high CPU or I/O rate:

(a) amanda - AMANDA, the Advanced Maryland Automatic Network Disk Archiver, is a backup system (open source software) that allows the administrator to set up a single master backup server to back up multiple hosts over network to tape drives/changers or disks or optical media.

(b) Third party commercial proprietary solutions:
Top three excellent commercial solutions:

If you are looking to perform the tasks of protecting large-scale computer systems use above solutions and following two books will give you good idea:

Recommended further readings

I hope this small how to provide enough information to anyone to kick start your backup operation. Tell me if I am missing something or if you have a better backup solution or strategy, please comment back.

This is Part II in a series on Execute Commands on Multiple Linux or UNIX Servers Simultaneously. The full series is Part I, Part II, and Part III. Many times, you want to execute a command not only on one server, but also on several servers. For example, find out

  • Version of kernel
  • Version of Apache web server
  • Update static html or images files on all web servers via rsync
  • Find out user information, server information, memory usage etc
  • Security/patch checking

tentakel

I have already covered how to execute commands on multiple Linux or UNIX servers via shell script. The disadvantage of script is commands do not run in parallel on all servers. However, several tools exist to automate this procedure in parallel. With the help of tool called tentakel, you run distributed command execution. It is a program for executing the same command on many hosts in parallel using ssh (it supports other methods too). Main advantage is you can create several sets of servers according requirements. For example webserver group, mail server group, home servers group etc. The command is executed in parallel on all servers in this group (time saving). By default, every result is printed to stdout (screen). The output format can be defined for each group.

How it works?

Consider the following sample setup:

admin workstation   Group                  Hosts
|----------------> www-servers        host1, host2,host3
|----------------> homeservers        192.168.1.12,192.168.1.15
IP: 192.168.1.1

You need to install tentakel on admin workstation (192.168.1.1). We have two group servers, first is group of web server with three host and another is homeservers with two hosts.

The requirements on the remote hosts (groups) need a running sshd server on the remote side. You need to setup ssh-key based login between admin workstation and all group servers/hosts to take full advantage of this tentakel distributed command execution method.

System requirement

Tentakel requires a working Python installation. It is known to work with Python 2.3. Python 2.2 and Python 2.1 are not supported. If you are using old version of python then please upgrade it.

Configuration

Let us see howto install and configure tentakel.

Step # 1 : Download tentakel

Visit sourceforge home page to download tentakel or download RPM files from tentakel home page.

Step # 2: Install tentakel

Untar source code, enter:

# tar -zxvf tentakel-2.2.tgz

You should be root user for the install step. To install it type

# make
# make install

Step # 3 Configure groups

For demonstration purpose we will use following setup:

   admin pc                    Group           hosts
Running Debian Linux       homeservers     192.168.1.12 192.168.1.15
User: jadmin

Copy sample tentakel configuration file tentakel.conf.example to /etc directory

# cp tentakel.conf.example /etc/ tentakel.conf

Modify /etc/tentakel.conf according to above setup, at the end your file should look like as follows:

# first section: global parameters
set ssh_path="/usr/bin/ssh"
set method="ssh"  # ssh method
set user="jadmin"   # ssh username for remote servers
#set format="%d %o\n" # output format see man page
#set maxparallel="3"  # run at most 3 commands in parallel
# our home servers with two hosts
group homeservers ()
+192.168.1.12 +192.168.1.15
# localhost
group local ()
+127.0.0.1

Save the file and exit to shell prompt. Where,
group homeservers () : Group name
+192.168.1.12 +192.168.1.15 : Host inclusion. name is included and can be an ip address or a hostname.

Step # 4 Configure SSH password less login

Configure ssh-key based login to avoid password prompt between admin workstation and group servers for jadmin user.

Step # 5 Test tentakel

Login as jadmin and type the following command:

$ tentakel -g homeservers

interactive mode
tentakel(homeservers)>

Where,
-g groupname: Select the group groupname The group must be defined in the configuration file (here it is homeservers). If not specified tentakel implicitly assumes the default group.

At tentakel(homeservers)> prompt type command uname and uptime command as follows:

exec "uname -mrs"
exec "uptime"

Few more examples
Find who is logged on all homeservers and what they are doing (type at shell prompt)

$ tentakel -g homeservers "w"

Executes the uptime command on all hosts defined in group homeservers:

$ tentakel -g homeservers uptime

As you can see, tentakel is very powerful and easy to use tool. It also supports the concept of plugins. A plugin is a single Python module and must appear in the $HOME/.tentakel/plugins/ directory. Main advantage of plugin is customization according to your need. For example, entire web server or mysql server farm can be controlled according our requirements.
However, tentakel is not the only utility for this kind of work. There are programs that do similar things or have to do with tentakel in some way. The complete list can be found online here. tentakel should work on almost all variant of UNIX/BSD or Linux distributions.

Time is a precious commodity, especially if you're a system administrator. No other job pulls people in so many directions at once. Users interrupt you constantly with requests, preventing you from getting anything done and putting lots of pressure on you. What do you do? The answer is time management. Read our book review of Time Management for System Administrators. Continue reading Execute commands on multiple hosts using expect tool Part III of this series.

Reference:

  • Read tentakel man page for tentakel configuration options
  • tentakel home page

Update: Damon confirmed that it works on Windows too with little modification.