DNS and Offsite Failover

The Internet was engineered with failover and high availability in mind. Using basic concepts of DNS, it’s easy to implement automatic offsite failover in the event of a network outage or disaster.

Ten-year afterthought: this isn’t the ideal setup for failover. The situation that brought it about was that we were a web team hosting on a university network that was incredibly unstable. We felt passionate about uptime, so we placed an offsite server in another location and convinced the IS department to let us host a secondary DNS server. I used the secondary DNS server to provide failover in an environment where the concept of failover was foreign. It was a creative solution to uptime in a parent culture where uptime was not a priority: when the primary network went down, so did the primary DNS, which left our secondary server as the only authoritative DNS.

Setup:

  • Master DNS and web servers are located at our primary location.
  • Secondary DNS and web servers are located at an offsite location.

Logic:

  • If Internet connectivity is lost at the primary location, both the web and DNS servers there will fail to respond to requests.
  • The secondary DNS server continually monitors these services and, in the event of an outage, modifies the slave’s zone file so that requests are directed to the offsite location.
  • When the master comes back online, DNS replication automatically kicks back in and reverts the slave to its original settings.

I have written the following script to monitor both the primary web and DNS servers for outages. When the number of servers still up falls below $limit, a search and replace is performed on the slave’s zone file. I avoided a graceful restart via ‘rndc’ so that I would not have to increment the serial number on the slave and thus throw the master/slave out of sync.
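For illustration, here is what that search and replace does to the slave’s zone file, using the addresses from the script below (the hostnames are hypothetical):

; before failover, as replicated from the master
www    IN  A   10.0.0.30
www2   IN  A   10.0.0.40

; after failover, rewritten by the monitoring script
www    IN  A   10.10.0.20
www2   IN  A   10.10.0.20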

#!/usr/bin/perl -w
use strict;
# Randy Sofia 2009
# DNS Modifier
# Check listed servers to see if they are up. 
# If less than $limit are up then @searchreplace $zonefile.
 
#----------------[SETTINGS]----------------
my @webservers; 
push @webservers, ("10.0.0.30", "10.0.0.40");
 
my @dnsservers;
push @dnsservers, ("10.0.0.38", "10.0.0.22");
 
my $zonefile="/etc/namedb/slave/db.slave.edu";
 
my @searchreplace;
push @searchreplace, ('10.0.0.30', '10.10.0.20');
push @searchreplace, ('10.0.0.40', '10.10.0.20');
 
my $limit=1;    # Minimum amount of servers up 
#------------------------------------------
 
my $result;
my $totalservers=scalar(@webservers) + scalar(@dnsservers);
my $upcount=$totalservers;
 
foreach my $webserver (@webservers) {
        $result = `/usr/local/bin/wget -q -t1 -T2 -O - $webserver`;
        $upcount-- if (!$result);
}
 
foreach my $dnsserver (@dnsservers) {
        $result = `/sbin/ping -t1 -c1 $dnsserver`;
        $upcount-- if ($result =~ /\b0 packets received/);
}
 
printf ("%d/%d servers up\n", $upcount, $totalservers);
 
&ChangeDNS($zonefile, @searchreplace) if ($upcount < $limit);
 
sub ChangeDNS  {
        my $filebuffer;
        my $file=shift;
        open READZONEFILE, '<', $file or die $!;
 
        while (my $line = <READZONEFILE>) {
                for (my $i=0; $i<scalar(@_); $i=$i+2) {
                        $line =~ s/\Q$_[$i]\E/$_[$i+1]/g;  # \Q...\E so the dots in the IPs match literally
                }
                $filebuffer .= $line;
        }
        close(READZONEFILE);
        open WRITEZONEFILE, ">", $file or die $!;
                print WRITEZONEFILE $filebuffer;
        close(WRITEZONEFILE);
        system("/etc/rc.d/named restart");
}

Set the TTL of your DNS records to a couple of minutes, and schedule the script to run every minute or so from your crontab. I did not write this script with portability in mind, so you may have to modify the locations of your binaries and zone file; the paths above are the FreeBSD defaults. You’ll also need to install wget. Master/slave DNS configuration is not covered here.
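As a sketch, that means lowering the TTL in the zone file:

$TTL 120    ; two minutes, so clients re-resolve quickly after a change

and adding the monitor to the slave’s crontab (the script path is a placeholder):

* * * * * /location/of/monitor_script > /dev/null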

Warning: Do not attempt to implement this unless you understand the logic behind it and are capable of performing tests in a staging environment. Additional variables, such as a tertiary offsite location, require further tweaks to the logic that makes this solution work. We are available for consulting if in doubt.

Incremental Snapshots: ZFS and `rsync`

Many modern file systems provide a feature called snapshots. As the name implies, snapshots allow you to make an “image” of a file system with one beneficial feature: multiple “images” can be made over time without the space requirements of storing multiple copies of your data. This is accomplished by referencing previously unchanged data rather than storing it multiple times over.

Perhaps you’d like to provide web developers access to files from 20 days ago simply by browsing to a directory /home/backup/20_days_ago/. From there they can inspect individual files, determine whether 20 days is sufficient, and if not, try /home/backup/21_days_ago. If the set of files you want to back up is 90GB and you’d prefer not to waste 90GB * 30 days (2.7TB) worth of disk space, snapshots are for you.

Both ZFS and UFS can create snapshots at the file system level, and it’s relatively simple to do. If you don’t have the luxury of ZFS, or snapshots aren’t feasible on your current file system, there is another quick and dirty method: rsync.
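For reference, a UFS2 snapshot on FreeBSD can be taken with a mount update and then browsed through a memory disk (a sketch; the snapshot file lives inside the file system being snapshotted, and the md unit number is arbitrary):

mount -u -o snapshot /home/.snap/today /home
mdconfig -a -t vnode -f /home/.snap/today -u 4
mount -r /dev/md4 /mnt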

The ZFS Method:

#!/usr/local/bin/bash
days=31
pool=home/sites
# take today's snapshot, e.g. home/sites@DAILY-2009-06-01
zfs snapshot $pool@DAILY-`date +%Y-%m-%d`
# sort newest first; everything from the $days-th snapshot on gets destroyed
zfs list -t snapshot -o name | grep $pool@DAILY- | sort -r | tail -n +$days | xargs -n 1 zfs destroy

Make the above script executable and give it a place in your daily crontab. It will automatically purge snapshots older than the specified number of days, and it can easily be modified to take arguments or create snapshots in smaller increments.
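For example, a nightly crontab entry along these lines (the script path is a placeholder):

0 2 * * * /location/of/zfs_snapshot_script > /dev/null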

Files can be found and browsed in the directory /home/sites/.zfs/snapshot/
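To see what’s available, or to pull a single file back out, something along these lines works (the snapshot date and file name are illustrative):

# list the daily snapshots with their creation times
zfs list -t snapshot -o name,creation | grep home/sites@DAILY-

# restore one file from 20 days ago
cp /home/sites/.zfs/snapshot/DAILY-2009-05-12/index.html /home/sites/index.html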

The rsync Method:
Update: while this still works and is quite clever, rsnapshot makes this method much simpler to accomplish.
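For reference, a minimal rsnapshot setup mirroring the layout below might look like this (a sketch; fields in rsnapshot.conf must be tab-separated, and older versions use ‘interval’ instead of ‘retain’):

# /etc/rsnapshot.conf (excerpt)
snapshot_root	/home/backup/
retain	daily	31
backup	/home/sites/	localhost/

# crontab
0 23 * * * /usr/local/bin/rsnapshot daily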

Thanks to hard links, this can also be accomplished without file system snapshots: rsync’s --link-dest option hard-links unchanged files against the previous day’s copy, so each “snapshot” only consumes space for files that changed. Here is a quick way to store the last 31 days’ worth of “snapshots” of a directory on any ’nix file system. In this example we’ll create snapshots of /home/sites/ in /home/backup/{0..31}_days_ago

#!/usr/local/bin/bash
source=/home/sites/
dest=/home/backup/

test -d $dest || mkdir $dest
# drop the oldest snapshot
rm -rf $dest"31_days_ago/"

# shift each snapshot back a day: 30 -> 31, 29 -> 30, ..., 0 -> 1
for i in {31..1}
do
     prev=$(($i-1))
     test -d $dest$prev"_days_ago" && mv $dest$prev"_days_ago" $dest$i"_days_ago"
done
# copy today's files; anything unchanged is hard-linked against yesterday's snapshot
/usr/local/bin/rsync -a --delete --link-dest=$dest"1_days_ago" $source $dest"0_days_ago/"

Don’t forget to set it as executable: `chmod 755 above_script`

Cron entry:
0 23 * * * /location/of/above_script > /dev/null

Each time the script is run, it moves each directory x_days_ago to x+1_days_ago and creates 0_days_ago as the most recent snapshot.
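You can confirm that unchanged files are shared rather than copied by comparing inode numbers across snapshots (the file name is illustrative); identical inode numbers mean the “copies” occupy the disk only once:

ls -li /home/backup/0_days_ago/index.html /home/backup/1_days_ago/index.html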