SUMMARY: Disk label problem

From: SGauthier@domainpharma.com
Date: Tue Jul 06 1999 - 12:36:31 CDT


Give the prize to Michael Maciolek for solving the problem caused by my
stupid error.

This guy is great! Hope they pay you well, Mike! Incidentally, where did
you learn about that?

I've included everyone's response with specific responses to their
suggestions.

One common theme in most responses to my problem was that I could pull
the partition table from the /etc/format.dat file... the problem with that
is it assumes I used a standard partition map... which of course I did not!
... just my luck.

I appreciate the effort everyone put into their responses. I'll try to be
more detailed in future questions to this list... save everyone some
time.

So I revived the partition map... only to end up in the same place I was
when I blew the disk partition map away... dealing with another disk
problem... :)
Except in this case, it's an intermittent SCSI failure... I was trying to
ufsdump the data off the disk that started the whole fiasco.

================================================================================
From ArtMan:

Sounds to me all you did was to re-write the label on the OS drive?
If so and that is ALL you did, it should be easy to get back, just
restore the label to what the Cylinder data was before the re-label.
Since it sounds like you have two drives of the same model number and
same "default" cylinder layout, just use format->"app disk"->part->print
to record what the app drive looks like and replicate that to the OS disk
via format->"os disk"->part-> and create the partition map ->label to
record the label back onto the os drive.

                              artman
================================================================================
My response: Basically the same as above... I forgot to mention that the
partition map wasn't standard from /etc/format.dat.
          Appreciate the info... if only it had been that simple! :)
(Users wouldn't be all over me!)

================================================================================
From Stephen P. Richardson:

Quoting SGauthier@domainpharma.com:
>
> Is there a way to retrieve the last partition map or reconstruct it?

I'm afraid it sounds like you are outta luck on this - if you
overwrote the label, and don't have a backup, that exhausts the
possibilities that I am aware of. However, you may be able to
piece this back together. If one of the standard partition schemes
in /etc/format.dat was used, you may be able to just use that.

You can mount root (assuming it starts at block 0) and take a look
in /etc/format.dat.

Even if that doesn't work, you may be able to guess by looking at the
size of root, setup a partition of that size, then check the next
partition out (with fsck). The fsck will fail until you find the
correct first cylinder. You may be able to iteratively work your
way through this, although since swap has no filesystem it could
get a bit tricky. It may be worth it to try anyway.

> feelin' stupid...

Yeah, well, that happens to all most of us from time to time -
keeps us humble.

--
Regards,
Stephen

================================================================================
My response: Good idea as well... thanks for the vote of confidence... this experience definitely deflated my head! :)

================================================================================
From Richard Bond:

Yes, on the Linux list.

- Richard Bond

================================================================================
My response: Thanks, but what's the list??? :)

================================================================================
From Michael Maciolek:

Did you get an answer to this yet? If you've got no info available from backups, this method should work for you.

The contents of the partitions should still be intact; it is only the partition table that's gone. So if you knew the start points and sizes of all the partitions, you could rebuild your partition table and everything would be fine.

I'm going to assume your boot disk is c0t3d0 and it contains at least five partitions, not necessarily in this order: root, swap, /usr, /var, /opt.

Each filesystem contains information about its own size in its header - the only problem is 'swap', which is a raw partition and has no filesystem (therefore, no helpful header to tell you how big it is.)

Start by doing an 'fstyp -v /dev/rdsk/c0t3d0s2 | head -12' and look at the 12th line, which tells you how many cylinders are in the first filesystem. Use this to relabel your root partition. While you're in there, set slice 3 to start just after root; make it span the rest of the disk. You'll truncate it later, but for the moment, it gives you a 'handle' on everything on the disk that's not part of the root partition.

Now, run an fsck on the root filesystem, just to be sure it's OK. Mount it on a temporary mount point so you can peek at your vfstab, which will tell you the names and slice numbers for all your other partitions, e.g. root on slice 0, swap on slice 1, var on slice 3, usr on slice 4, or whatever you have on your system.

Your goal is to find the end of the swap partition (and the start of the next real filesystem after it.) Unless you're a very good guesser, you're not going to poke around randomly...you want a more methodical approach.

Every filesystem has a filesystem header, and one constant element in that header is a 'magic number', hexadecimal 0x011954, which occurs 9564 bytes past the beginning of the filesystem, i.e. it's stored in bytes 9564-9567.

All you have to do is look for occurrences of the magic number, which will either be (a) a random natural occurrence of that bit pattern, (b) an actual filesystem header [SUCCESS], or (c) a replica of the filesystem header.
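
To tell case (a) from case (b) at a candidate start point, the check described above can be sketched in shell. This is only a minimal sketch, assuming a big-endian (SPARC) UFS filesystem; has_ufs_magic is a hypothetical helper name, not a Solaris tool, and the demonstration runs on a scratch file made with mktemp rather than on a real slice.

```shell
#!/bin/sh
# Sketch: test whether the UFS magic number sits 9564 bytes into a device
# or image file. has_ufs_magic is a hypothetical helper, not a Solaris tool.
has_ufs_magic() {
    # Read bytes 9564-9567 and join them into one hex string.
    bytes=$(dd if="$1" bs=1 skip=9564 count=4 2>/dev/null \
            | od -An -t x1 | tr -d ' \n')
    # Assumption: big-endian on-disk layout, as on SPARC.
    [ "$bytes" = "00011954" ]
}

# Demonstration on a scratch file: 9564 zero bytes, then the magic number
# written as octal escapes for 0x00 0x01 0x19 0x54.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=9564 count=1 2>/dev/null
printf '\000\001\031\124' >> "$img"

if has_ufs_magic "$img"; then
    result="magic found"
else
    result="no magic"
fi
echo "$result"
rm -f "$img"
```

Pointed at a real slice instead of the scratch file, a "no magic" result would suggest the candidate start is a chance occurrence or a replica rather than the real header.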

Use od to dump the contents of slice 3 (which you set up to start just past the end of root - slice 0 - remember?) and grep for the magic number:

od -x /dev/rdsk/c0t3d0s3 | grep '0001 1954$'

This will look for the 'magic number' (which happens to occur at the end of a line of od's output). There may be other random occurrences of data which are identical to the magic number, so be prepared to do some trial-and-error work. Let it run for a while - it may take as much as an hour, depending on the size of your swap area, to chug through all that space and locate the magic number that marks the start of the next partition. Be patient.

Eventually, you'll see a line that looks something like this, and it should be followed closely by another identical (except for the left-most number) line.

276542520 0000 0008 0000 035c 0000 0560 0001 1954
276562520 0000 0008 0000 035c 0000 0560 0001 1954
                                        ---- ----

The first number is an octal address - it's the location of this particular 16-byte string of numbers. If you subtract octal 22520 from the first number, you should get the start of that partition, i.e. the size of your swap space.

Do the necessary arithmetic - convert to decimal, divide by 512 bytes/sector to get the 'sector count' - then go into 'format' and adjust slice 1 to have that size, and slice 3 to start where 1 ends.
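
As a worked example of that arithmetic, here is the sample address from the od output above run through the shell. The address is the illustrative one from this message, not a value from a real disk, and the result is an offset from the start of slice 3. The variable names are made up for the sketch.

```shell
#!/bin/sh
# In POSIX shell arithmetic, a constant with a leading 0 is read as octal.
line_addr=0276542520   # octal address od printed for the matching line
magic_line=022520      # octal offset of that line within a filesystem

start_bytes=$(( line_addr - magic_line ))   # filesystem start, in bytes
start_sectors=$(( start_bytes / 512 ))      # same offset in 512-byte sectors

echo "filesystem starts $start_bytes bytes ($start_sectors sectors) into slice 3"
# prints: filesystem starts 49979392 bytes (97616 sectors) into slice 3
```

If swap sits directly after root, that sector count is also the size to give slice 1.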

Now, you should be able to do 'fstyp -v /dev/rdsk/c0t3d0s3 | head -12' and get the size of the next slice, fix the partition table, fsck, set the next slice to span the remainder of the disk, and repeat until done.

================================================================================
My response: IT WORKED!!! Like a charm... what helped me get through the trial-and-error portion of the solution fast was the fact that I knew the swap partition was the same size as the RAM. Making the block number fall on a cylinder boundary was all it took for trial and error!
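
The cylinder-boundary shortcut can be sketched like this. It is a sketch only: the geometry and the candidate sector offsets are made-up illustration values, and the real heads and sectors-per-track figures have to come from the disk's own geometry.

```shell
#!/bin/sh
# Sketch: keep only candidate start offsets that land on a cylinder boundary.
# nhead, nsect and the candidate list are hypothetical illustration values.
nhead=14
nsect=72
sectors_per_cyl=$(( nhead * nsect ))

on_boundary=""
for sectors in 97616 131040 262080; do
    # A slice can only start where the offset divides evenly into cylinders.
    if [ $(( sectors % sectors_per_cyl )) -eq 0 ]; then
        on_boundary="$on_boundary $sectors"
    fi
done
echo "candidates on a cylinder boundary:$on_boundary"
```

Any magic-number hit whose implied start is not a whole number of cylinders can be thrown out without trying fsck on it.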

================================================================================
From Carsten B. Knudsen:

Hi Steve,

If the two disks have (or should I say: had :^o) exactly identical layouts, it is possible with Solaris 2.x format(1m) to take a slice table from one disk, save it, and apply it to another:

 1: Enter the format utility
 2: Select the healthy disk
 3: Go to the "partition" submenu
 4: Try "print" just to make sure :-)
 5: Use "name" to give this table a temporary name, e.g. "veryown"
 6: Quit "partition"
 7: Using the "disk" command, select the broken disk WITHOUT QUITTING FORMAT
 8: Enter "partition" again
 9: Using "select", pick your saved partition table.
10: "label" to save it to disk
11: Quit format

I have learned the hard way (!) that format(1m) only allows this when the two disks are EXACTLY identical. But then, in your case, I guess there's a fair chance that they are - assuming that the machine came with two disks.

I don't know if this is of any use, but good luck anyway.

regards,

/Carsten

================================================================================
From Carsten #2:

Hi again,

On second thoughts, my previous suggestion was perhaps too obvious - if that would do it, you would probably already have copied it in by hand, right... so, sorry for sending such a basic reply to a not-so-basic question :-)

If you are fluent in C, you might consider writing a little program that scans through the entire disk, looking for UFS headers and reporting their locations and sizes. Given that, you might be able to reconstruct the partition table including swap space and all. Have a look at the ufs_fs(4) man page.
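
A rough sketch of that kind of scan, done in shell rather than C and reporting locations only (not sizes), might look like the following. scan_magic is a hypothetical helper name. The sketch assumes big-endian (SPARC) on-disk data and that each filesystem starts on a 16-byte boundary, so the magic number ends an od output line. It dumps single bytes, so the pattern does not depend on the byte order of the machine running od, and it prints decimal addresses to avoid the octal conversion. The demonstration scans a scratch file, not a real disk.

```shell
#!/bin/sh
# Sketch: print the decimal address of every 16-byte od line that ends in
# the UFS magic number. scan_magic is a hypothetical helper name.
scan_magic() {
    # -A d gives decimal addresses; -t x1 dumps one byte at a time.
    od -A d -t x1 "$1" \
        | grep ' 00 01 19 54$' \
        | sed -e 's/ .*//' -e 's/^0*\(.\)/\1/'   # keep address, drop zero padding
}

# Scratch file: 9564 zero bytes followed by the magic number.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=9564 count=1 2>/dev/null
printf '\000\001\031\124' >> "$img"

offset=$(scan_magic "$img")
# The matching line sits 9552 bytes (octal 22520) into its filesystem.
echo "magic line at byte $offset, filesystem would start at byte $(( offset - 9552 ))"
rm -f "$img"
```

On a real disk the function could print several addresses, including chance matches and replica headers, so each one still needs checking by hand.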

good luck - again.

/Carsten

================================================================================
From Carsten #3:

Oops, did I say "ufs_fs(4)" ? I meant "fs_ufs(4)"./CBK

================================================================================
My response: Good idea... along the lines of what Michael came up with... Interesting idea of having a C program automate the solution, but I certainly plan on NOT letting this happen again! I was thinking of adding the cylinder #'s to the /etc/vfstab file as a comment line for each partition. Since I always make the root partition slice 0, even if I lost the partition map, making a partition span the whole disk will still give me access to the root partition and therefore the /etc/vfstab.

================================================================================

THANK YOU ALL! AWESOME LIST!

-Steve

                                 \\|//
                                 (0~0)
------------------------------oooO-(_) -Oooo------------------------------------
Steve Gauthier                      Domain Pharma Corp
UNIX Systems Administrator          10 Maguire Road
PHONE: (781) 778 - 3953             Lexington, MA 02421
FAX:   (781) 778 - 3800
E-mail: sgauthier@domainpharma.com
_________________________Oooo.____________________
                  .oooO  (___)


