SUMMARY: metaset and state database slice question

From: Lineberger, Aaron <alineberger_at_navcirt.navy.mil>
Date: Wed Jun 22 2005 - 15:30:34 EDT

I received no responses on this, apart from another admin who wanted to
know whether I had heard anything. Here is what I found after throwing
together a quick and dirty test environment:

Setup: 2 Ultra-10 systems with a single Qlogic 2200 card in each, using
LUN masking and persistent binding so each host sees only the LUN I was
interested in, zoned on Qlogic SANbox2 FC switches to see the same LUNs
as presented by our Zzyzx (don't buy one) RAID array. It should be noted
that this is not a multi-node diskset, nor is it set up using mediators,
nor is it using Sun Cluster software. The mediator thing is still a
little fuzzy to me: eventually I'll be placing this on systems that
have multiple FC HBAs, and I am not sure whether I'll need a mediator
configuration or not.

Pre-Configuration: It took a while to get the controller number to match
on both systems because the qlc driver took over every time I removed
the qla2200 driver. After removing both, restoring my original
driver_aliases, driver_classes, minor_perm, and name_to_major files, and
putting my new path_to_inst file in place, everything worked out peachy.
(Note on the path_to_inst file: I modified this on one of the systems
since I used its second FC card in the second test machine; this made
everything the same, from the HBA instance number all the way to the
c2t130d6 designation.) Also, rpc.metad requires .rhosts authentication,
or possibly root needs to be in group 14, but I used .rhosts for root on
each system. This does not mean you have to enable the r-commands in
inetd.conf, just the SLVM entries!

Disk Set Configuration: So, now that my FC SAN attached LUN has the same
controller/target/disk information on both systems I'm able to start
building the diskset. On the master I ran:

>metaset -s test-set -a -h host1 host2

#This creates the set and adds the hosts to it.

>metaset -s test-set -a c2t130d6

#This answered my original question: it completely destroyed my
original partition information, and slice 2 does get blown away. Do this
with a new disk, or back up all the data first so that you can restore
it if you are using an existing disk. If you plan on using a multi-node
diskset in the future, make partition 7 at least 256MB, starting at
cylinder 0. This was my partition layout:

Part      Tag    Flag     Cylinders         Size            Blocks
  0        usr    wm      25 - 65532      813.65GB    (65508/0/0) 1706352384
  1 unassigned    wm       0                0         (0/0/0)              0
  2 unassigned    wm       0                0         (0/0/0)              0
  3 unassigned    wm       0                0         (0/0/0)              0
  4 unassigned    wm       0                0         (0/0/0)              0
  5 unassigned    wm       0                0         (0/0/0)              0
  6 unassigned    wm       0                0         (0/0/0)              0
  7        usr    wu       0 -    24      317.97MB    (25/0/0)        651200
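
As a rough check on the layout above, the numbers are consistent with
26048 blocks per cylinder (1706352384 / 65508), assuming 512-byte
blocks. The sketch below derives that figure and the smallest slice 7
that would reach the 256MB mark. The function names are made up for
illustration, and the arithmetic syntax needs a POSIX shell (ksh or
/usr/xpg4/bin/sh on Solaris 9, not the legacy /bin/sh).

```shell
# Sanity-check the partition table arithmetic (assumes 512-byte blocks).
# All function names here are illustrative, not part of any Solaris tool.

# Blocks per cylinder, from a slice's block count ($1) and cylinder count ($2).
blocks_per_cyl() {
    echo $(( $1 / $2 ))
}

# Size in whole MB (rounded down) of $1 cylinders at $2 blocks per cylinder.
# 2048 blocks of 512 bytes make 1MB.
cyls_to_mb() {
    echo $(( $1 * $2 / 2048 ))
}

# Smallest number of cylinders that holds at least $1 MB
# at $2 blocks per cylinder (integer ceiling division).
min_cyls_for_mb() {
    echo $(( ($1 * 2048 + $2 - 1) / $2 ))
}
```

With the values from the table, blocks_per_cyl 1706352384 65508 gives
26048 and min_cyls_for_mb 256 26048 gives 21, so the 25 cylinders I gave
slice 7 leave a little headroom.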

>metaset -s test-set -t

#This makes the machine you are on take control of the diskset so you
can actually use it.

>metainit -s test-set d99 1 1 c2t130d6s0

#This creates the concat/stripe volume for my LUN assigned to my
diskset. I chose this since the underlying LUN is actually a RAID5
array.

>newfs -Tv /dev/md/test-set/rdsk/d99

#This creates the UFS filesystem on the volume. Note that if you put
this filesystem in /etc/vfstab with mount at boot set to yes, it will
fail to mount at boot because the system is not the diskset owner at
boot time. I'm still looking into this; a temporary solution looks to
be an rc script.
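
A minimal sketch of such an rc script, untested on these machines. The
set name and volume are the ones used in this post; the mount point
/export/test, the script name, and the RUN variable are assumptions for
illustration. Setting RUN=echo prints the commands instead of running
them. It does not use -f, so it is only meant for the host that should
normally own the set.

```shell
#!/bin/sh
# Hypothetical /etc/rc3.d/S99testset: take the diskset and mount its
# filesystem after boot. The mount point below is an assumption.
SET=test-set
VOL=/dev/md/$SET/dsk/d99
MNT=/export/test        # assumed mount point, not from the original setup
RUN=${RUN:-}            # set RUN=echo for a dry run

diskset_start() {
    # Take ownership first; the mount cannot succeed without it.
    $RUN metaset -s "$SET" -t || return 1
    $RUN mount -F ufs "$VOL" "$MNT"
}

diskset_stop() {
    # Unmount before releasing so the other host gets a clean filesystem.
    $RUN umount "$MNT" || return 1
    $RUN metaset -s "$SET" -r
}

# Only act when called with an argument, as init does for rc scripts.
case "${1:-}" in
start) diskset_start ;;
stop)  diskset_stop ;;
esac
```

With this approach the vfstab entry, if there is one, would keep mount
at boot set to no so that the normal boot-time mount does not fail.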

Now you can mount/umount the filesystem to your heart's content! To
gracefully fail over the filesystem to the second host, do the
following:

On the owner host (metaset -s test-set shows me that host1 is the
current owner), umount the filesystem, then run:

>metaset -s test-set -r

# This releases the ownership of the diskset.

While logged in on the second host, take ownership, adding -f (force) if
necessary:

>metaset -s test-set -t

#This takes ownership of the diskset on the second host.

You can verify you have ownership by running metaset -s test-set. All
volume manager functions need to use -s test-set to act upon the
diskset. (e.g. metadb -s test-set -i, metastat -s test-set)
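
The ownership check can be scripted. The sketch below pulls the owning
host out of metaset output. It assumes the usual layout, in which the
Host section lists each host and marks the current owner with Yes in the
Owner column; the function name is illustrative, and the sample in the
comment is typed from memory rather than captured from these machines.

```shell
# Print the host that currently owns a diskset, reading the output of
# "metaset -s <set>" on stdin. Prints nothing if no host owns the set.
# Assumed layout of the relevant section:
#   Host                Owner
#     host1              Yes
#     host2
diskset_owner() {
    awk '
        $1 == "Host" && $2 == "Owner" { in_hosts = 1; next }  # section header
        in_hosts && NF == 0           { in_hosts = 0 }        # blank line ends the section
        in_hosts && $2 == "Yes"       { print $1 }            # owner row
    '
}
```

Usage would be along the lines of metaset -s test-set | diskset_owner,
comparing the result against the local hostname before deciding whether
a take or a release is needed.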

I did run into some issues in setting this up and at one point ended up
running metaset -s test-set -P to completely purge the test-set diskset
from the system. I was able to recreate the set, add c2t130d6 to the
set, and create the d99 volume again using metainit without destroying
the data on the disk! (No newfs in there...)

All in all this was a good exercise in the Solaris Volume Manager, and
it taught me that most people take LV management for granted when using
VXFS or AIX. ;)

Best Regards,

Aaron Lineberger

*RFC2795 (IMPS) Compliant

-----Original Message-----
From: Lineberger, Aaron
Sent: Monday, June 13, 2005 4:40 PM
To: 'sunmanagers@sunmanagers.org'
Subject: metaset and state database slice question

All,

I have a general metaset question for anyone using SVM on Solaris 9 that
has migrated from a single system to two systems using disk sets. The
metaset man page states:

For use in disk sets, disks must have a dedicated slice (six or seven)
that meets specific criteria:

        o  Slice must start at sector 0

        o  Slice must include enough space for disk label

        o  State database replicas cannot be mounted and does not
           overlap with any other slices, including slice 2

The third statement was the one that really caught my attention as I
have always seen examples on this list which show slice 2 to be the
entire disk which overlaps slice 7 which holds the state database
information.
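
One way to see whether an existing label already meets these criteria is
to inspect prtvtoc output. The sketch below reads that output on stdin
and checks only two points: that slice 7 starts at sector 0, and that no
other non-empty slice (slice 2 included) overlaps it. It assumes the
usual prtvtoc data columns (partition, tag, flags, first sector, sector
count, last sector) and a VTOC label where the replica slice is 7; the
function name is illustrative.

```shell
# Exit 0 if slice 7 starts at sector 0 and no other non-empty slice
# overlaps it; exit 1 otherwise. Reads prtvtoc output on stdin.
# Assumed data columns: partition tag flags first_sector count last_sector
check_replica_slice() {
    awk '
        # Skip comment lines, short lines, and zero-length slices.
        /^\*/ || NF < 6 || $5 + 0 == 0 { next }
        { lo[$1] = $4 + 0; hi[$1] = $6 + 0; seen[$1] = 1 }
        END {
            if (!seen[7] || lo[7] != 0) exit 1
            for (p in seen)
                if (p != 7 && lo[p] <= hi[7] && hi[p] >= lo[7])
                    exit 1
            exit 0
        }
    '
}
```

Usage would be something like prtvtoc /dev/rdsk/c2t130d6s2 |
check_replica_slice. A conventional label with slice 2 spanning the
whole disk fails the check, which matches the man page wording.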

Out of sheer curiosity, can anyone explain why this is part of the
criteria for disk sets? I find it interesting that it's not mentioned in
the metadb man page.

As for migrating a system into an environment where it will use disk
sets with another host, does this mean that just because slice 2 gets
changed to accommodate this criteria that I need to back up and then
restore the data on slice 0 even though the actual sizes of slice 0 and
slice 7 will not change? Judging by the man page it would seem that this
is the case:

     If the existing partition table does not meet these criteria,
     Solaris Volume Manager repartitions the disk. A portion of each
     drive is reserved in slice 7 (or slice 6 on an EFI labelled
     device), for use by Solaris Volume Manager. The remainder of the
     space on each drive is placed into slice 0. Any existing data on
     the disks is lost by repartitioning.

Anyone have thoughts on this or experience in adding a second system to
an existing SVM configuration?

Please reply to the list and not to my email address. TIA

Aaron Lineberger