SUMMARY: How safe are compressed tape dumps?

From: David Carter (dmc@cam.sri.com)
Date: Tue May 04 1993 - 11:39:04 CDT


I asked:

> We have a 2.3GB Exabyte drive and would like it to be bigger.
>
> Some experimentation suggests it is quite feasible to dump and restore
> compressed files, like this:
>
> # dump 0f - /dev/rsd0a | compress | dd of=/dev/nrst0 bs=112b
>
> # dd if=/dev/nrst0 bs=112b | zcat | restore if -
>
> For the "/" partition I tried dumping, only about half as much tape is
> used as in direct-to-tape dumping; the dump takes 25% longer, and uses
> about 45% of the available CPU (this on a Sparc-2) rather than the
> negligible amount used by "dump" alone.
>
> My main worry is about data integrity. If there is any corruption in a
> compressed file, uncompressing falls over, whereas restore direct from
> tape might be able to bypass the corrupted area and recover what
> follows it. (I'm guessing).
>
> Can anyone say whether introducing compress/uncompress into the
> process does increase the risk of not being able to restore? Are there
> any other drawbacks to this I haven't thought of? (Is the only reason
> anyone buys special hardware compression boxes to save CPU cycles on
> the tape host?) And would any dd/dump/tar experts like to suggest
> improvements to the commands I listed above?

There seems to be a fair consensus...

* It is more risky to use compress, at least in principle. A bad block
will foul up all that follows for that filesystem, whereas "restore"
should be able to skip bad blocks.

* HOWEVER, at least five people said they had been doing essentially
what I suggest for some time (years in some cases) and had had no ill
effects. (Sounds a bit like the warnings against dumping active
filesystems: the risk is non-zero, but we all do it, and I at least
have never heard of anyone coming to grief through the practice).

* If I do go for a scheme like this, I should use gzip rather than
compress. It seems gzip has some error-recovery capabilities, and
gives better compression ratios into the bargain. It's said to be on
prep.ai.mit.edu:/pub/gnu/gzip-1.0.7.tar. (Haven't tried it yet; a
gzip-based variant of my commands is sketched after the list below.)
Some people mentioned other utilities too:

  afio (free code) -- can compress individual files, hence more robust, but
  doesn't offer interactive restore, and there are problems with
  Unix-domain sockets.

  "BRU" (Backup-Restore-Utility) from Enhanced Software Technologies Inc.
  Sounds more flexible than dump/restore, e.g. there is wildcarding on
  filenames for compress or not and dump or not.

  amanda, from cs.umd.edu, uses compress by default.
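
Here, for what it's worth, is how my original commands might look with
gzip substituted for compress. This is only a sketch: the device names
and block size are just those from my example above, and the "-9" (best
compression) level is an assumption, since I haven't actually tried gzip
yet.

  # dump 0f - /dev/rsd0a | gzip -9 | dd of=/dev/nrst0 bs=112b

  # dd if=/dev/nrst0 bs=112b | gzip -dc | restore if -

("gzip -dc" uncompresses to stdout, so it plays the same role that zcat
did in the original pipeline.)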

---

For some people (not for me) compressed dumps are not an option because the extra time they take won't fit into the overnight backup window.

Another point worth considering is how I would get everything back if we had a major disaster and had to restore a dump tape on someone else's kit: they would need a working gzip/zcat there as well as restore.

---

Conclusion: I think I will go for some form of compression, using gzip and/or one of the other free utilities mentioned above. We dump everything important just about every night, so a bad block needn't spell disaster, especially if compressed dumps are alternated with uncompressed ones on different nights.
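
As an illustration of the alternation idea, a cron-driven script along
these lines would do (untested; the partition, tape device, block size
and gzip level are just the values from my example, and the odd/even
split is arbitrary):

  #! /bin/sh
  # Sketch only: compressed dump on odd days of the month, plain on even.
  FS=/dev/rsd0a
  TAPE=/dev/nrst0

  day=`date +%d`
  if [ `expr $day % 2` -eq 1 ]; then
      # odd day: pipe the dump through gzip before writing to tape
      dump 0f - $FS | gzip -9 | dd of=$TAPE bs=112b
  else
      # even day: let dump write straight to the tape
      dump 0f $TAPE $FS
  fi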

Thanks to:

"Jim Phillips" <PHILLIPS@syr.ge.com> David Fetrow <fetrow@pike.biostat.washington.edu> John A. Murphy <jam@philabs.Philips.Com> Keith Pilotti <kfp@qualcomm.com> Postmaster <Piete.Brooks@cl.cam.ac.uk> djm@blue.millipore.com (Drew Montag) feldt@phyast.nhn.uoknor.edu (Andy Feldt) hkatz@nucmed.NYU.EDU (Henry Katz) lsf@holmes.astro.nwu.edu (Sam Finn) matt@wbst845e.xerox.com (Matt Goheen) paulo@dcc.unicamp.br (Paulo Licio de Geus) rakowski@land.nr.usu.edu (Andrew Rakowski) shandelm@jpmorgan.com (Joel Shandelman) strombrg@hydra.acs.uci.edu swc@cs.unh.edu (Steve W Chappelow)

David


