SUMMARY: Is it ufsdump or is it format that is croaking?

From: Syed Zaeem Hosain (szh@zcon.com)
Date: Sun Jul 20 1997 - 22:56:46 CDT


Hi, all.

Here is a summary, although I have not yet resolved the problem. So far,
I have done 15 or 16 relabels (with various flavors of cylinders left
out, etc.) and an equal number of reloads of the files from the tar
tape, and still cannot get ufsdump to work cleanly.

********************** THANKS TO (in no special order):

Greg Price greg_price@stortek.com
Steve Butterfield teve.butterfield@pcs.co.uk
Juergen Schreiner Juergen.Schreiner@teefax.mch.sni.de
Marc S. Gibian gibian@stars1.hanscom.af.mil
Tim Evans tkevans@eplrx7.es.dupont.com
Casper Dik casper@holland.Sun.COM
Viet Hoang vhoang@lucent.com
Jingyi Zhou jzhou@bam.bam.com
Jim Harmon jim@telecnnct.com
Brion Leary brion@dia.state.ma.us
Glenn Satchell glenn@uniq.com.au
David Robson robbo@box.net.au.

and my apologies to anyone I left out or forgot to list.

********************** RESULTS

No go yet! I am still having drive problems doing ufsdump, and am
about to just call the manufacturer (although the drive seems to be
working normally otherwise) and see if I can get it replaced - I think
it is still under warranty.

The other final possibility is that ufsdump may have a bug. In which
case, I will have to wait and check out Solaris 2.6 since it is not
clear that Solaris 2.5.1 ufsdump will have any bug fixes released
soon.

********************** SUMMARY OF SUGGESTIONS

1. Is the drive fsck'ing correctly?

Yes, I have tried fsck'ing the partition on that disk, with and without
having it on-line, in multi- and single-user mode. Every time, it comes
up clean without any problems. Even when ufsdump fails, the fsck passes
perfectly.

2. Is the drive quiescent when doing the ufsdump?

Yes. I have tried it with and without the partition being mounted. I
also tried (entirely in single user mode):

        a fresh format of the disk
        a fresh partition and label
        a fresh reload of the tar'ed tape of the files
        a fresh ufsdump

and had it fail.

3. You should never use cylinder 0 as part of the swap (this came from
many people).

I had thought this problem was not an issue in Solaris 2.X, but I tried
the suggestion anyway. I relabelled the disk and made sure that I had
the first cylinder (and then, on a second try, the first two cylinders)
set aside for an unused and unmounted "root" partition, made sure that
the swap partition started after those cylinder(s), and it did not make
a difference. Ufsdump still failed.

I even tried to set aside the first 64 megabytes of the drive in slice
0 and slice 1 (intending to come back later and add them back in as
swap), and ufsdump failed then too, even though the slice 0 and slice 1
partitions were not added in or used in any way.

4. There is a bug in Sun ufsdump. Sun is aware of it and will release a
fix soon.

I am leaning towards this and/or the drive being bad as the problem.

5. Solaris 2.5.1 may be having some difficulty with code that is being
converted from 32 bit to 64 bit and still has bugs.

Quite possible. See above response too. Needless to say, I also did get
responses from people who have been working with large drives *without*
any known problems. So, I do not know what to say!

5. Solaris does not support partitions larger than 2GB.

This one I had great difficulty accepting. Some respondents also said
that they are using Solaris 2.5.1 with large drives (in fact, I was
under the impression that Solaris 2.3 onwards had eliminated this as an
issue). Sun informs me that disk drives of 3 to 9GB should not present
any problem at all with partitions larger than 2GB!

Nevertheless, I went ahead and tried the suggestion anyway. I tried
relabeling the disk with two cylinders as root, 64MB as swap, 2GB as
/export/home and the rest as another partition, and still got the
ufsdump error!
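
The cylinder arithmetic behind a layout like this can be sketched as
follows. This is only an illustration: it uses the geometry that format
reports in the original post (2781 cylinders, 21 heads, 101 sectors per
track), assumes 512-byte sectors, and makes its own rounding choices
(swap rounded up, /export/home rounded down so it stays under 2GB). The
function names are hypothetical, and the exact cylinder ranges actually
used in the test are not recorded in this summary.

```python
# Geometry as reported by format for this disk (see the original post).
HEADS, SECTORS, CYLS = 21, 101, 2781
SECTOR_BYTES = 512                      # assumption: 512-byte sectors
CYL_BYTES = HEADS * SECTORS * SECTOR_BYTES

MB = 1024 * 1024
GB = 1024 * MB


def cyls_at_least(nbytes):
    """Smallest whole number of cylinders holding at least nbytes."""
    return -(-nbytes // CYL_BYTES)      # ceiling division


def cyls_at_most(nbytes):
    """Largest whole number of cylinders not exceeding nbytes."""
    return nbytes // CYL_BYTES


def layout():
    """Return (name, first_cyl, last_cyl) for a hypothetical layout."""
    sizes = [
        ("root (unused)", 2),
        ("swap", cyls_at_least(64 * MB)),
        ("/export/home", cyls_at_most(2 * GB)),
    ]
    used = sum(n for _, n in sizes)
    sizes.append(("rest", CYLS - used))

    result, start = [], 0
    for name, n in sizes:
        result.append((name, start, start + n - 1))
        start += n
    return result


for name, first, last in layout():
    print(f"{name:<14} cylinders {first:>4} - {last:>4}")
```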

6. Slice 2 should be called "backup". It is a reserved word, and the OS
will recognize it.

I had trouble believing this was the problem. But I tried it anyway.
The slice 2 (full disk) partition is labelled "backup" now, just like
the other boot disk. No luck. I still get ufsdump errors.

7. "In this case you do have duplicated track. Try it, I am almost
sure."

After questioning, I determined that the person was referring to the
fact that slice 2 overlaps with other partitions. But I strongly
believe that this is irrelevant! This partition (originally called 'C'
in SunOS 4.1.X) has always been the entire disk. It is useful to keep
it there for information and other purposes; the boot disk has this
partition in this form too; plus I am not using that full partition
anyway. So I did not try to change/remove the slice 2 partition in any
further tests.

8. There might be some problem with the entry in /etc/vfstab.

I tried ufsdump with the raw partition name as well as the mounted
partition name as well as single-user mode to make sure that any
mounted drive partitions were quiescent. No luck!

9. Does the problem happen/get worse over time? I.e. the partition
starts out okay and gets bad later?

Nope. Entirely in single user mode, I have tried: format, newfs, mount
partition, reload old files from tar tape, umount partition, ufsdump
and had it fail!

10. Am I using Prestoserve?

Nope. So this should not be an issue, fortunately!

11. Am I using the correct version of ufsdump?

As far as I can tell. This is the February patched version built into
"Solaris 2.5.1 4/97 Hardware" release and incorporates the last known
ufsdump patch (103847-02) unless there is a more recent version of that
patch.

12. Am I using raw device names, or the mounted partition names?

I have tried both, even after fresh relabelling, and no luck!

13. Solaris ufsdump can detect end of tape, so tape density and length
are not needed anymore.

Ah! Okay, this is good to know for the future. I tried my simple backup
scripts without the density and length info, just for grins, but still
no luck!

14. Micropolis drives are not good.

Yes, well, I do agree with you. I have had bad experiences with them in
the past, and I do not know *why* I allowed myself to be talked into
buying this one. The lower price probably. Oh, well, time to call the
people I bought this drive from some years back and hope it is still
under warranty.

********************** ORIGINAL POST

Hi, all.

I am having a problem with "ufsdump" in "Solaris 2.5.1 4/97 Hardware"
that I am very concerned about and would appreciate some very quick
feedback about what I may be seeing.

I have just recently (finally) migrated to Solaris 2.5.1 on my home
system and am experiencing some flakiness that I cannot explain yet.
The disk drive in question is working quite normally otherwise, but
I do not like operating without clean backups that I can rely on.

********************** INFO

First some background info:

System: Sparc 5/170 clone
OS: Solaris 2.5.1 4/97 Hardware
Patches: None beyond what is included in Solaris 2.5.1 4/97
Disk: Micropolis 1936

Information on the disk from format:

format> current
Current Disk = c0t1d0
<Andataco MICROP 1936-21MW1002002 cyl 2781 alt 2 hd 21 sec 101>
/iommu@0,10000000/sbus@0,10001000/espdma@4,8400000/esp@4,8800000/sd@1,0

format> inquiry
Vendor: MICROP
Product: 1936-21MW1002002
Revision: HW0A

partition> print
Current partition table (original):
Total disk cylinders available: 2781 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0 unassigned    wm       0               0         (0/0/0)          0
  1       swap    wm       0 -   59       62.14MB    (60/0/0)    127260
  2 unassigned    wm       0 - 2780        2.81GB    (2781/0/0) 5898501
  3 unassigned    wm       0               0         (0/0/0)          0
  4 unassigned    wm       0               0         (0/0/0)          0
  5 unassigned    wm       0               0         (0/0/0)          0
  6 unassigned    wm       0               0         (0/0/0)          0
  7 unassigned    wm      60 - 2780        2.75GB    (2721/0/0) 5771241

As you can see from the above, I do not have any overlapping partitions
on the drive or anything. I use the slice 1 for extra swap, and slice
7 for the /export/home directory on my system.
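
As a sanity check on the label itself, the block counts in the table
can be recomputed from the geometry format reports (hd 21, sec 101). The
sketch below assumes 512-byte sectors and that a slice's block count is
simply cylinders x heads x sectors per track; the helper names are
illustrative only.

```python
# Geometry from "format> current": cyl 2781 alt 2 hd 21 sec 101
HEADS = 21
SECTORS_PER_TRACK = 101
SECTOR_BYTES = 512                      # assumption: 512-byte sectors

BLOCKS_PER_CYL = HEADS * SECTORS_PER_TRACK

# (slice, first cylinder, last cylinder, block count printed by format)
SLICES = [
    (1, 0, 59, 127260),
    (2, 0, 2780, 5898501),
    (7, 60, 2780, 5771241),
]


def slice_blocks(first_cyl, last_cyl):
    """Blocks in a slice spanning first_cyl..last_cyl inclusive."""
    return (last_cyl - first_cyl + 1) * BLOCKS_PER_CYL


def size_mb(blocks):
    """Size in megabytes (1 MB = 1024 * 1024 bytes)."""
    return blocks * SECTOR_BYTES / (1024 * 1024)


for num, first, last, reported in SLICES:
    blocks = slice_blocks(first, last)
    status = "ok" if blocks == reported else "MISMATCH"
    print(f"slice {num}: {blocks} blocks, {size_mb(blocks):.2f} MB ({status})")
```

Under these assumptions the computed block counts agree with the Blocks
column for slices 1, 2 and 7, so the label arithmetic itself looks
consistent.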

********************** SYMPTOMS

During "ufsdump" on the /export/home slice, I get the following
warnings (many of them with different numbers for block):

  DUMP: Warning - block 336740416 is beyond the end of `/dev/rdsk/c0t1d0s7'
  DUMP: Warning - block 1077957184 is beyond the end of `/dev/rdsk/c0t1d0s7'

and also the following types of errors:

  DUMP: bread: dev_seek error
  DUMP: bread: DEV_LSEEK2 error
  DUMP: Warning - cannot read sector 2760016522 of `/dev/rdsk/c0t1d0s7'
  DUMP: bread: DEV_LSEEK2 error
  DUMP: Warning - cannot read sector 2760016523 of `/dev/rdsk/c0t1d0s7'

with equally bizarre blocks for other messages. After 32 messages, I
get a request "if I want to continue" and as long as I keep entering
"yes", it eventually finishes the ufsdump after giving me a few hundred
of these "errors" and "warnings". But I am not sure of the quality of
the files on the backup! I do not have another disk of that size to
ufsrestore the data and see if anything is amiss.
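
To put the reported numbers in perspective, they can be compared with
the size of slice 7 from the partition table. The sketch below assumes
the numbers in the DUMP messages are in the same 512-byte units as the
Blocks column, which the messages themselves do not confirm, and it does
not identify a cause; it only shows how far out of range the values are.

```python
SLICE7_BLOCKS = 5771241                 # Blocks column for slice 7
SECTOR_BYTES = 512                      # assumption: 512-byte sectors
INT32_MAX = 2**31 - 1                   # largest signed 32-bit value

# Block/sector numbers quoted in the DUMP warnings above.
REPORTED = [336740416, 1077957184, 2760016522, 2760016523]


def classify(block, slice_blocks=SLICE7_BLOCKS):
    """Return (inside the slice?, fits in a signed 32-bit integer?)."""
    in_slice = 0 <= block < slice_blocks
    fits_int32 = block <= INT32_MAX
    return in_slice, fits_int32


for b in REPORTED:
    in_slice, fits = classify(b)
    print(f"{b:>11} (0x{b:08x}): in slice={in_slice}, fits int32={fits}")

slice_bytes = SLICE7_BLOCKS * SECTOR_BYTES
print(f"slice 7 is {slice_bytes} bytes; exceeds int32: {slice_bytes > INT32_MAX}")
```

With these assumptions, all four reported numbers fall far outside the
slice, the two "cannot read sector" values are also larger than a signed
32-bit integer can hold, and the slice itself is larger than 2^31 bytes.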

Needless to say, the very first thing I did was to "tar" all the files
on that partition (very successfully, I might add! No errors, and the
tar read-back later also proceeded normally) onto a 4mm DAT tape (on
the same tape drive as I used for "ufsdump") and did a completely fresh
format and newfs operation on the slice. After reloading the tar tape
back in, I did another ufsdump and got the same error messages (but I
am not sure whether the block numbers are the same this time).

I have also checked the cables, terminations, etc. The drive is also
working quite well and normally, so I do not think this is a SCSI
problem. Other "ufsdumps" on other disk slices proceed quite normally,
so I am at a bit of a loss.

********************** QUESTIONS

Why is "ufsdump" messing up like this? Why didn't format or newfs give
me any errors if indeed the drive is messed up? I did a verify after
the format and it also passed completely. Can anyone please help out
here?

                                                        Thanks!

                                                                Z

---------------------------------------------------------------------
| Syed Zaeem Hosain         P. O. Box 610097         (408) 441-7375 |
| Z Consulting Group        San Jose, CA 95161       szh@zcon.com   |
---------------------------------------------------------------------


