SUMMARY: Mirroring POP/SMTP servers

From: Rich Kulawiec (rsk@itw.com)
Date: Wed Feb 05 1997 - 16:40:51 CST


First, my original query:

        I've been asked to take a look at replicating a POP/SMTP server in
        order to minimize possible downtime in the event of a disk failure.
        
        I've already figured out that simply mirroring the relevant disk
        partitions via DiskSuite will work if the replicas are both on
        the same machine...but what I'd like to work out is a way to
        do this with the replicas on *different* machines. I've been
        searching the web pretty heavily this morning looking for any
        indication of how the state of both replicas can be kept in sync,
        but haven't found anything terribly helpful. Anybody out there
        managed to pull this off (in any environment, including SunOS,
        Solaris, BSDI, Linux)?

Answers seem to fall into three categories:

1. Use (various) high-availability/redundancy products from Qualix,
        Uniq, or Sun.
2. Try various approaches related to RAID/dual-ported disks.
3. Try to keep things in sync with various bits of other software.

Each is discussed below, with some responses reproduced in their entirety.

====================
Solutions of type 1:
====================

        From: cguevara@velu.com (Ing. Carlos R. Guevara)
        
        What this looks like is a CLUSTER of e-mail machines....for high
        availability purposes....
        
        Qualix Group distributes a software application called FIRSTWATCH or
        Qualix HA....that lets you do just that....It connects two machines via
        redundant NICs and they both connect to the same disk subsystem ....
        
        The advantage of this package (which is priced depending on the
        machines used, so if it's just going to be used for e-mail could be
        small machines so the solution is not that expensive) is that it takes
        into consideration all aspects of a failure and switchover...whether
        just SENDMAIL crashed or it's a full system panic, and the second
        machine takes over for the entire duties of the redundant system.
        
        It can be used as both symmetric HA (each machine can do whatever, and
        when one goes down, the other takes over its duties and keeps doing
        its own as well, i.e., one machine can be an NFS server, the other the
        mail server, if the mail server goes down, the other machine handles
        both NFS and mail tasks, and vice-versa..) or asymmetric HA (one
        machine is just the back-up of the first, and can do no real work)....
        
        A client of mine is using Qualix HA (a.k.a. Firstwatch) and it is just
        a great tool...
        
        Even if the entire machine doesn't go down, it automatically can try to
        restart a service or fail over to the other machine....Its
        functionality may be very good if you want highly available mail
        services.....
        
        P.S. - This is not the only HA software on the market....for Sun, other
        important players are OpenVision and Sun itself....but both their
        products are considerably more expensive (Sun's requires you to use
        their larger machines with fully redundant system boards, power
        supplies, etc. - accomplishes much higher data availability, but the
        cost is huge).
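
As a rough illustration of the symmetric versus asymmetric distinction described above (a sketch only, not how Qualix HA or any other product is implemented), the difference comes down to what each node is doing before the failure. The host names, service names, and the surviving_duties helper below are all hypothetical.

```python
def surviving_duties(duties, failed_node):
    """Return a new node -> services map after failed_node goes down.

    All services of the failed node move to the first surviving node.
    A real HA product would also move IP addresses and storage; this
    sketch only models which node ends up responsible for what.
    """
    survivors = [node for node in duties if node != failed_node]
    if not survivors:
        raise RuntimeError("no surviving node to take over")
    result = {node: list(duties[node]) for node in survivors}
    result[survivors[0]].extend(duties[failed_node])
    return result


# Symmetric HA: both nodes do real work and back each other up.
symmetric = {"hostA": ["nfs"], "hostB": ["mail"]}

# Asymmetric HA: hostB is an idle standby for hostA.
asymmetric = {"hostA": ["mail"], "hostB": []}

if __name__ == "__main__":
    print(surviving_duties(symmetric, "hostB"))
    print(surviving_duties(asymmetric, "hostA"))
```

In the symmetric case the survivor must be sized to carry both workloads at once, which is worth keeping in mind when choosing "small machines" to hold the price down.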

Dan Penrod <penrod@wcnewmedia.com> pointed out that you can find
out more about the HA system by checking http://www.qualix.com.

And Peter Marelas <maral@phase-one.com.au> gave another web pointer:
http://www.ov.com/.

        From: Glenn Satchell <Glenn.Satchell@uniq.com.au>

        You might like to check out our Web pages for a product we wrote to do
        just this. It's called UPFS, the UP File System. There is a nice
        management overview and a technical white paper which goes into some of
        the details. Currently runs on Suns under Solaris 2.4, 2.5 and 2.5.1.

        http://www.uniq.com.au

And peter.bestel@uniq.com.au (Peter Bestel) seconded that idea. (Which means
that those guys at Uniq in Australia are awake!)

====================
Solutions of type 2:
====================

        From: Jim Harmon <jim@telecnnct.com>

        This doesn't actually answer your question, but it may offer an
        alternative solution if you're already using a RAID system... bearing
        in mind I have no information about the size or criticality of your
        system, or of your budget to solve this problem.
        
        I've come across a dual/quad-hosted 5-channel Raid Controller that you
        can share between (as the name implies :) 2-4 hosts.
        
        This would allow using a single RAID file system such as your POP/SMTP
        server to actually be hosted on 2 machines. One as prime, the second
        as backup. This could be as secondary filesystem so that each host
        would boot and function independantly until one or the other failed.
        
        Depending on the size of the settup, cost is somewhere between 10K and
        25K. Company is BoxHill, and the controller is called "Raid Box 5300"

        From: "Robert Tommaselli" <rtommase@fir.fbc.com>

        You can use ODS for disk mirroring on a set of dual-ported SCSI
        disks, i.e., connect your external disks to two separate machines.
        Using disk sets you can negotiate control of the disks, and your
        replicas take care of themselves on the data disks anyway. The
        CPUs negotiate who has control and who takes control in a failure
        situation. You have now set up a High Availability (HA) system.
        OpenVision will sell this to you for 10K a server. You can
        write shell scripts that do the same.

          xxxxxxxxx            |---------------|           xxxxxxxxx
          x cpu 1 x ...........|     disks     |.......... x cpu 2 x
          xxxxxxxxx            |_______________|           xxxxxxxxx
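
The "who takes control in a failure situation" logic mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ODS disk-set mechanism or any vendor's code: the standby probes the primary's SMTP port and declares it dead only after several consecutive failures. The FailoverMonitor class, the smtp_port_open helper, and the threshold of 3 are all hypothetical choices.

```python
import socket


def smtp_port_open(host, port=25, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverMonitor:
    """Declare the primary dead only after several consecutive failed probes."""

    def __init__(self, probe, threshold=3):
        self.probe = probe          # callable returning True while the primary is healthy
        self.threshold = threshold  # consecutive failures required before takeover
        self.failures = 0
        self.taken_over = False

    def tick(self):
        """Run one probe; return True only on the tick that triggers takeover."""
        if self.taken_over:
            return False
        if self.probe():
            self.failures = 0       # any success resets the count
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.taken_over = True
            return True
        return False
```

A standby might call something like FailoverMonitor(lambda: smtp_port_open("mail.example.com")).tick() once a minute (the host name is a placeholder). Note that this sketch only decides when to take over. With dual-hosted disks, the standby must also make sure the primary has really released the disks before mounting them, since two hosts writing the same filesystem at once will corrupt it.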

        From: Dan Baritchi <dan.baritchi@mci.com>

        You can certainly do this, but it may not be worth the trouble.
        However... You can use DiskSuite (or another similar product, e.g.
        Veritas Volume Manager) to mirror one storage array to another. This can
        be done even with the arrays being several miles (or more) apart. Call
        your Sun Microsystems sales rep and ask about data replication
        possibilities. However, you may just want to have another machine
        (your last-resort MX record) spool all your mail and then process that
        spool when your primary machine comes online.

        If you have the funds, and this is really critical, you might look at
        Sun's Solstice High Availability software.... but that can get pretty
        expensive. This way you have several (2+) servers working in unison,
        and if one dies, the other takes over its job. And if both are
        working with a shared storage array (mirrored to another array for
        redundancy) you should not have any data loss (with a little luck).
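
The last-resort MX suggestion above relies on the way sending mail servers order MX records: the lowest preference value is tried first, and higher values are used only when the preferred hosts are unreachable. A minimal sketch of that ordering follows; the host names and preference values are made up for illustration, and real MTAs also randomize among hosts of equal preference, which this ignores.

```python
# Hypothetical MX records for an example domain: (preference, host).
MX_RECORDS = [
    (100, "backup-spool.example.com"),  # last-resort host that only queues mail
    (10, "mail.example.com"),           # primary POP/SMTP server
]


def delivery_order(mx_records):
    """Return hosts in the order a sending MTA would try them (lowest preference first)."""
    return [host for _preference, host in sorted(mx_records)]


def pick_target(mx_records, is_up):
    """Return the first reachable host in delivery order, or None if all are down."""
    for host in delivery_order(mx_records):
        if is_up(host):
            return host
    return None


if __name__ == "__main__":
    def primary_is_down(host):
        return host != "mail.example.com"

    print(delivery_order(MX_RECORDS))
    print(pick_target(MX_RECORDS, primary_is_down))
```

This protects inbound SMTP only. Mail already delivered to mailboxes on the failed primary stays unavailable to POP users until that machine is back.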

"Roger B.A. Klorese" <rogerk@veritas.com> suggested:

Check back with us late this year... we have network mirroring extensions to
Volume Manager coming.

====================
Solutions of type 3:
====================

        From: Thomas Looney <looney@unite.net>

        Take a look at IMAP version 4. Sun (www.sun.com) has both clients
        and servers available. It allows users to request their email
        from multiple servers, in case one goes down or to cope with load.

        From: Justin Young <justiny@cluster.engr.subr.edu>

        Okay. There is a popular Perl program called mirror that you can get
        at a number of places. It uses FTP, however, so any decent packet sniffer
        will be able to see your passwords. However, if you're using sendmail
        then maybe security isn't your biggest problem. This will
        automatically synchronize files at a specified time interval between
        two machines. One possible way to secure it is to use two ethernet
        cards on each machine. This is cheaper and less restrictive than a
        firewall. Or you can really spend big bucks and cluster your
        machines. However, this only works with Ultra Enterprise servers. Let
        me know if you *need* a mirror URL. You can search archie or get it
        from the packages directory of http://wuarchive.wustl.edu/. l8r
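
To make the interval-based synchronization idea concrete, here is a minimal one-way sync sketch. It is not the mirror program itself and it works on local paths rather than over FTP; the function names and the two-second timestamp tolerance are assumptions chosen for illustration.

```python
import os
import shutil

MTIME_SLOP = 2.0  # seconds; tolerate filesystems with coarse timestamps


def needs_copy(src, dst):
    """True if dst is missing, differs in size, or is clearly older than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or s.st_mtime - d.st_mtime > MTIME_SLOP


def sync_tree(src_root, dst_root):
    """Copy new or changed files from src_root to dst_root.

    Returns the sorted list of relative paths that were copied.
    Files deleted from the source are not removed from the destination.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also carries over the modification time
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

Run from cron, something like this keeps a second machine roughly current, but two limits apply to any periodic copy of a mail spool: mail delivered since the last run is missing on the replica, and a mailbox copied while it is being written may be captured mid-message.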

        From: Chris Liljenstolpe <cds@io.com>

        We are looking at doing something similar. While it is not a
        full replication, the place I am putting the single point of
        failure is much more reliable than a general computer. I am
        planning on two or more POP/SMTP servers (only one would be
        MX'ing at a time), spools would be local, but maildrops (and
        therefore the POP source) would be served off of a NetApp NFS filer.

        -+Chris

And so, a hearty round of thank-yous to:

Ing. Carlos R. Guevara
Dan Penrod
Jim Harmon
Thomas Looney
Justin Young
Robert Tommaselli
Peter Marelas
Dan Baritchi
Chris Liljenstolpe
Glenn Satchell
Peter Bestel
Roger B.A. Klorese

for taking the time to provide ideas, hints, pointers, and solutions.

---Rsk
Rich Kulawiec
rsk@itw.com


