SUMMARY: (but not really) backups of multi-user systems

From: Rich Kulawiec (rsk@itw.com)
Date: Tue Mar 11 1997 - 21:13:01 CST


This topic keeps coming up on this mailing list, so I thought that
I would take this note and send it along to everyone -- I sent an
earlier draft of this to someone who'd asked about it, but never
saw a summary from that person. I invite comments on this and will
post a real summary (per our usual protocol) once they stop flowing in.

Notes: I use the terms "dump" and "restore" because that's what I'm
accustomed to. Substitute "ufsdump" and "ufsrestore" as needed.

About Multiuser Backups on Unix Systems

Unless you have a *really* specialized environment, this is much less
of a problem than you might think. Oh, sure, the folks who peddle
add-on backup software, like Legato Networker and Budtool and the
like, will try to tell you that unless you use their package you
risk losing data or other horrible things. And, very strictly speaking,
they're right -- but not by much. Yes, there's a non-zero, finite chance
that something will go amiss and you'll miss a file here or a directory
there, but the chance is pretty darn small. *Unless* you have a
really specialized environment, like I said above. (I'll explain
what that means later.)

Why? Because all modern versions of dump have code in them that
goes to some lengths to try to cope with filesystems that may be
changing while they're being dumped. That code has a history
that involves Berkeley, CalTech, and Purdue -- and possibly others --
and goes back most of a decade. Back in the days of 4.3BSD and 2.8BSD,
lots of folks were running Unix systems in 24x7 production mode, mostly
in academic environments where taking the systems down at 3 AM to do
backups wasn't a terribly good option, because zillions of budding
hackers were banging away at ADM3A's and VT100's at that hour, and
got testy if their compiler projects were interrupted. So various
and sundry people started figuring out ways to be able to run dump
(a) without ending up with an unreadable dump image on tape and
(b) without skipping half the filesystem. If you take a look at
the code for dump in the freely-available BSD sources, you'll find
most of this work -- I also know that it is definitely in the SunOS 4.1.4
dump, because I've seen the source code (the nice part about working
somewhere with a source license) and I have a number of reasons to
believe that it's in Solaris's ufsdump, Ultrix's dump, etc. (After all,
why wouldn't it be, since any of Sun/Digital/et al. could just grab
it from the BSD source tree? No, of course, I would never disassemble
their code to check. :-) )

Now, this code is not completely bulletproof -- in fact, I know some
explicit ways to break it by sending dump a SIGSTOP, then doing some
ugly things to the filesystem, then sending it a SIGCONT and watching
it fall all over itself. But that's an awfully artificial case, and
I've never seen it arise on a real-world machine.

This is why I've deployed live backups on every Unix network I've touched
over the last ten years. And in that time, I have yet to encounter a
dump tape that I couldn't restore. The size of those networks ranges
from tiny (my Sparc here at home) to rather large (several hundred
machines with several hundred filesystems and a dozen tapedrives).
I've probably pulled close to a thousand tapes during that time to
extract files for restoration, and haven't been disappointed yet.

So what I'm telling you is that unless there are some really severe
circumstances in place at your site, you can probably do this too
and not lose any sleep over it. There are a number of things you can
do to put the odds in your favor, as well -- most of them are probably already
true, but I think it's worth my while to list them.

1. The quieter the filesystems are, the better. (No surprise.)
For most sites, this means doing dumps in the middle of the night.
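
As a rough sketch, a root crontab entry along these lines will kick
things off at 2 AM -- the script path and log file here are just
examples, substitute whatever you actually use:

    # run the nightly dumps at 2:00 AM, logging to a file we can check later
    0 2 * * * /usr/local/adm/nightly-dump >> /var/adm/dump.log 2>&1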

2. The less busy the machine is, the better. This not only relates to #1,
but it means that more CPU cycles will be available for dump, which
means that dump runs faster, which means that it runs in less elapsed
time, which means a smaller window in which the filesystem can change.

3. Dumps which span multiple tapes are a *bad* idea. Besides a long
history of multi-tape related bugs, and besides the pain-in-the-ass
that this represents, it also means that there could be a substantial
amount of time going by while somebody figures out that tape #1 is
full and feeds tape #2 to dump. Again, the faster dump runs, the less
time the filesystems have to change. (And as an aside, I have yet
to use a tape stacker/jukebox/carousel that didn't make me want
to dropkick it after a month.)

4. Doing dumps across the network is not the greatest idea in the
world. It's hard to do securely, and it really drops the throughput
rate, which means that dumps take longer, which means...(you know).
Tapedrives are now getting cheap enough that putting a reasonably
high-capacity drive on each machine isn't totally unrealistic -- and in
some cases, it's a much better/cheaper solution than putting a stacker
on one central machine.
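
To make the "local drive" case concrete, here's the sort of invocation
I'm talking about, assuming a Solaris box with a tape drive at
/dev/rmt/0 and its root filesystem on c0t3d0s0 -- adjust the device
names for your own hardware:

    # level 0 dump of / to the local no-rewind tape device; the 'u' flag
    # updates /etc/dumpdates so that later incrementals know where to start
    ufsdump 0uf /dev/rmt/0n /dev/rdsk/c0t3d0s0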

5. If you have database applications (e.g. Oracle) then dumping the
raw database files is nice...but probably not useful. Use the utilities
which come with the database package to take an ASCII snapshot of the entire
database and make sure *that* is backed up. Same for Sybase or whatever
other applications you run that store their data in some customized
internal format -- but export/import it via ASCII. Having your data
in ASCII also means that if disaster ensues, you can at least attack
the problem with standard Unix tools like sed/awk/perl, whereas if
it's in the raw database form...well, you're stuck. Also, these snapshot
tools are usually able to take advantage of their knowledge of
the database's internal structure in order to create a static
and self-consistent picture of the database's contents.
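
The details depend entirely on which database you run -- Oracle and
Sybase each ship their own export tools -- but the shape of it is
simple. As one sketch, with something like PostgreSQL the snapshot is
a one-liner (the database name and target path below are just
examples), and the resulting flat file gets swept up by the regular
filesystem dump:

    # write a plain-text snapshot of the database where the nightly
    # filesystem dump will find it
    pg_dump mydb > /export/dbdumps/mydb.sql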

6. Back up *all* the filesystems on *all* your machines. I don't
care if you're using rdist to keep /usr/local in sync -- back it
up anyway. Should there be an rdist problem, or an intruder, or
any other kind of problem that's restricted to a single machine,
you will want that backup image. Besides, tape is cheap, cycles
are cheap, and system administrator time is scarce and expensive.

7. Use a rotating schedule of backups, full (level 0) and incremental
(levels 1-9). If you can do the rotation daily, that's even better.

For example:

Machine   Filesystem   Mon  Tue  Wed  Thu  Fri  Sat  Sun
fred      /             0    1    2    3    4    5    6
fred      /usr          0    1    2    3    4    5    6
barney    /             4    5    6    0    1    2    3
barney    /usr          4    5    6    0    1    2    3
barney    /home00       2    3    4    5    6    0    1

There's a bunch of reasons for doing this. For starters, a file
that was being changed in fred:/usr on Monday when the full dump
came through will probably not be changing on Tuesday when the
partial comes through. [Note: it pays to examine your "cron"
and "at" job queues to make sure that large batch jobs of whatever
nature are not trying to run at the same time as your backups.
It's a Bad Thing to run the one job that you have that modifies
/etc/passwd every night at the exact same time that you're trying
to create a dump of /. ;-) ]

This also helps balance out the size of the dump images that are headed
for tape -- very helpful if you're putting a bunch of dump images on
one tape, which you probably are. For instance, the level 5 dump
of barney:/home00 on Thursday is probably going to be pretty small,
which is good, because barney:/usr is getting dumped at level 0...
probably on the same tape.

This also relates back to #6: dumping fred:/usr and barney:/usr on
different schedules means that even if the tape with fred:/usr
at level 0 on it fails or gets lost, at least you have barney:/usr
at level 0. Hey, it's better than starting over with distribution media.
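
If you want to see how the rotation might look in practice, here's a
rough sketch of a per-machine dump script that picks the level from
the day of the week. This implements fred's row in the table above;
the filesystem and device names are placeholders:

    #!/bin/sh
    # choose the dump level by weekday: full dump on Monday,
    # incrementals (levels 1-6) the rest of the week
    case `date +%a` in
        Mon) LEVEL=0 ;;
        Tue) LEVEL=1 ;;
        Wed) LEVEL=2 ;;
        Thu) LEVEL=3 ;;
        Fri) LEVEL=4 ;;
        Sat) LEVEL=5 ;;
        Sun) LEVEL=6 ;;
    esac

    # dump / and /usr back-to-back onto the same tape, using the
    # no-rewind device so the second image lands after the first
    for fs in /dev/rdsk/c0t3d0s0 /dev/rdsk/c0t3d0s6
    do
        ufsdump ${LEVEL}uf /dev/rmt/0n $fs || exit 1
    done

For barney you'd just shuffle the case statement around so that its
level 0 lands on Thursday instead.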

8. If you're still worried, then set up your scripts to do a
"restore tf" or "restore tvf" on each dump image when you're
done scribbling it on the tape. This also has the nice
side-effect of giving you a catalog of all your dump tapes,
which is nice when a user comes up to you and says "Can you
restore /home00/luser/foobar? Uh...no, I don't know when
I changed it last."
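
A sketch of what that looks like, assuming the image you just wrote is
the first (or only) one on the tape -- the catalog path is just an
example:

    # rewind and read back the table of contents; the listing doubles as
    # a catalog you can grep when someone asks for a file back
    mt -f /dev/rmt/0n rewind
    ufsrestore tf /dev/rmt/0n > /var/adm/dumpcat/fred-root.`date +%y%m%d`

If there are several images on the tape, follow the rewind with an
"mt -f /dev/rmt/0n fsf N" to skip forward to the one you want to check.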

9. About the "verify" option to dump: the comment in the manual page
(for most versions of dump) is pretty accurate: if you try this on
anything but an unmounted filesystem, it's probably going to whine
at you. Actually, it'd be really nice if there was just a "verbose"
flag that would emit the names of files/directories as they're being
dumped, but that's not really a realistic expectation. I think
the "answer", if there is one, for people who want to do some kind
of verification on their backup images, is to rewind and
do a "restore tf" or "restore tvf" on the dump image. See #8 above.

10. I told you I'd explain what a "really specialized environment"
is. Well, I'd say that:

- An environment with multiple databases which are used 24x7 with
no real slowdown, making it difficult to find a window during which
they can be dumped to ASCII, e.g. a transaction-processing environment

- An environment with non-vanilla filesystems, e.g. journaled filesystems

- An environment whose CPU utilization is so high 24x7 that it's
difficult to grab enough cycles to run dump in a reasonable period
of time

constitute really specialized environments. This doesn't necessarily
mean that you can't use plain old dump for your backups; but it does
mean that you may need to be somewhat more clever about how you do it.

11. One more note: using tar or cpio (or GNU tar or bar or whatever)
just makes it worse. All of them have their problems, and none of
them have code to cope with non-quiescent filesystems.

12. This may all sound pretty darn complicated -- looking back, I certainly
have written a lot here. Chalk it up to this morning's coffee finally
kicking in. ;-) But the bottom line is that 99% of the people
out there can just use dump/ufsdump with a little planning and
avoid the expense and hassle of going single-user or using one
of the third-party tools. And given what I've seen happen to sites
using those third-party tools, you really do want to avoid them.
(I'm aware of one site that has an expensive 3rd-party backup
package which now occupies 2 people full-time as they try to coax
it to actually do their backups and restores. It's unbelievable.)

Cheers,
Rich Kulawiec
rsk@itw.com
