Summary: Different minor numbers on 2 E3000s connected to SSA112

From: bernhard_fank@ukl.uni-heidelberg.de
Date: Thu Mar 12 1998 - 11:20:22 CST


Thanks to all who answered (full list at the end):

With Rogerio Rocha's hint I found the solution. My mistake was that I tried
to make the "backup" server behave like the server (the owner of the
array). This didn't work. I should have tried it the other way round. (I
stubbornly hadn't, because I wanted to show SUN that a bug somewhere else
was causing the difference. ;-]) The "backup" server has "holes" (gaps) in
the numbering of the ssd instances; the server does not. All other lines of
the two path_to_inst files were the same.
So I
1. copied the path_to_inst of the "backup" server to the original server,
2. destroyed all metadevices and the metaset, and removed the /dev and
   /devices entries pointing to the array, and
3. rebuilt the /dev and /devices entries with boot -rs.
4. Then I created the metaset again. The server and the "backup" server
   became hosts of the same set, with the server as the owner, of course.
5. After I created the metadevices again, everything looked fine.

- be careful with this procedure -
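
The "holes" mentioned above can be spotted without reading the file by eye.
The following is only a minimal sketch, not part of the procedure I ran. It
assumes the usual path_to_inst line layout ("physical path" instance
"driver", as in the entries quoted further down); the function names
parse_path_to_inst and instance_gaps are made up for illustration.

```python
import re

# Assumed layout of one path_to_inst entry: "physical path" instance "driver"
ENTRY = re.compile(r'^"(?P<path>[^"]+)"\s+(?P<inst>\d+)\s+"(?P<driver>[^"]+)"\s*$')


def parse_path_to_inst(text):
    """Return (path, instance, driver) tuples, skipping comments and blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = ENTRY.match(line)
        if match:
            entries.append(
                (match.group("path"), int(match.group("inst")), match.group("driver"))
            )
    return entries


def instance_gaps(entries, driver="ssd"):
    """List the unused instance numbers between the lowest and highest in use."""
    used = {inst for _, inst, drv in entries if drv == driver}
    if not used:
        return []
    return [n for n in range(min(used), max(used) + 1) if n not in used]
```

Reading a saved copy of /etc/path_to_inst from each host and passing the
text through instance_gaps(parse_path_to_inst(text)) shows which host has
gaps in its ssd numbering.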

Bernhard Fank

-----------------------------------------------------------------
My original question:
>Hi sun-managers,
>Two E3000 servers should back each other up. For this reason they are
>connected via Fibre Channel to an SSA112. But the minor numbers of the
>disks on the SSA112 differ between the hosts (SDS, for example, uses the
>driver numbers). The servers have the same configuration.
>Analyzing the problem:
> 1. Usual solution: We deleted the path_to_inst and the corresponding
> /dev and /devices entries, then rebuilt the path_to_inst, /dev and
> /devices with boot -ar.
> --> No change in minor numbers.
> 2. Checking the hardware: We put the local boot disks of one server
> into the other and booted that server.
> --> The server couldn't use the minor numbers of the other.
>We couldn't find a solution with SUN's staff (silver contract). They
>advised us to disconnect the array and optical modules and reconfigure,
>then connect the empty array and reconfigure. Afterwards we should fill the
>empty array disk layer by disk layer and reconfigure each time.
>We can't stop the system for such a long time and we don't think this is
>a solution to our problem. See "Analyzing the problem" above.
>I would be very pleased if you could help us. Further description
>below. ....

>----------------------------------------------------------------------
>Example from the merged and sorted path_to_inst files (the two hosts are
>called krz10 and krz18; the first 4 addresses don't differ):
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,1 1 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,1 1 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,2 2 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,2 2 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,3 3 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,3 3 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,4 4 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@0,4 4 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,0 5 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,0 16 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,1 6 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,1 17 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,2 7 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,2 18 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,3 8 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,3 19 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,4 9 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@1,4 20 ssd
>krz10 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@2,0 10 ssd
>krz18 /sbus@2,0/SUNW,soc@d,10000/SUNW,pln@a0000000,802949/ssd@2,0 32 ssd
>....
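
A listing like the one above can also be produced by a small script instead
of merging and sorting by hand. This is a rough sketch only: it assumes both
files use the standard quoted layout ("physical path" instance "driver"),
and the names instances_by_path and diff_instances are hypothetical.

```python
import re

# Assumed layout of one path_to_inst entry: "physical path" instance "driver"
ENTRY = re.compile(r'^"([^"]+)"\s+(\d+)\s+"([^"]+)"\s*$')


def instances_by_path(text, driver="ssd"):
    """Map each physical path bound to `driver` to its instance number."""
    result = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())  # comment and blank lines do not match
        if match and match.group(3) == driver:
            result[match.group(1)] = int(match.group(2))
    return result


def diff_instances(owner_text, backup_text, driver="ssd"):
    """Return (path, owner_instance, backup_instance) for paths that disagree."""
    owner = instances_by_path(owner_text, driver)
    backup = instances_by_path(backup_text, driver)
    return [
        (path, owner[path], backup[path])
        for path in sorted(owner.keys() & backup.keys())
        if owner[path] != backup[path]
    ]
```

Only paths present on both hosts are compared; an empty result means the
two hosts agree on every shared ssd instance number.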

------------------ Answers -------------------------------
(lines starting with --- are my comments)
<mdb@dosmanos.cwiz.com (Martin D. Baldenegro) >:
Try the following things:
  1. move the path_to_inst file to old.path_to_inst
  2. create a new path_to_inst file with only the
     following line in it:
     #path_to_inst_bootstrap_1
  3. remove the entries under /dev/vx
  4. do this on both systems and boot with the -r option.

--- It sounds good, but I didn't try it.
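
Steps 1 and 2 of this suggestion could be scripted roughly as follows. This
is an untested illustration of the suggestion as quoted, not something that
was run on these hosts; the function name reset_path_to_inst and the etc_dir
parameter are assumptions, chosen so the sketch can be tried on a scratch
copy first. Steps 3 and 4 (cleaning /dev/vx and the reconfiguration boot)
are deliberately left out.

```python
from pathlib import Path

# The single line the new file should contain, per the suggestion above.
BOOTSTRAP_LINE = "#path_to_inst_bootstrap_1\n"


def reset_path_to_inst(etc_dir):
    """Save path_to_inst as old.path_to_inst and write a bootstrap-only file."""
    etc = Path(etc_dir)
    current = etc / "path_to_inst"
    saved = etc / "old.path_to_inst"
    if saved.exists():
        # Never clobber an earlier backup copy.
        raise FileExistsError(f"{saved} already exists; refusing to overwrite it")
    current.rename(saved)               # step 1: keep the old file
    current.write_text(BOOTSTRAP_LINE)  # step 2: new file with the one line
    return saved
```

Pointing etc_dir at a copy of /etc rather than the live directory keeps the
sketch harmless while checking what it does.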
------------------------------------------------------------
"Coffindaffer Virginia" <Virginia.Coffindaffer@wang.com>
pointed me to HA software, as did Mark Spooner
<mark.spooner@Central.Sun.COM>:
Qualix HA+ modifies the numbers as part of a Highly Available NFS server
installation.

--- We don't want that for now.
--- SDS supports disksets with one host being the owner and the
--- second the backup. No additional products at this time.
------------------------------------------------------------

Seth Rothenberg <SROTHENB@montefiore.org>
reported that he faced a similar problem. The device names differed
(c2t4d4s4 vs. c3t4d4s4). They renamed them and rebooted.

--- Our problem was with the minor numbers.

------------------------------------------------------------

<rogerio@bvl.pt>:
We had a similar problem, with a 2000 and a 3000.
Solution (it's working) from our friendly supplier:
 Edit /etc/path_to_inst so that there is only one
reference to each array, and the major and minor are the same.
     E.g. these entries:
"/io-unit@f,e0200000/sbi@0,0/SUNW,soc@1,0/SUNW,pln@a0000000,8424cf/ssd@4,4" 79 " ssd"
"/io-unit@f,e0200000/sbi@0,0/SUNW,soc@1,0/SUNW,pln@b0000000,8424cf/ssd@4,4" 174 "ssd"
"/io-unit@f,e1200000/sbi@0,0/SUNW,soc@3,0/SUNW,pln@b0000000,8424cf/ssd@4,4" 39 " ssd"
   were modified to
"/io-unit@f,e0200000/sbi@0,0/SUNW,soc@1,0/SUNW,pln@b0000000,8424cf/ssd@4,4" 54 " ssd"
Yes, we did have three (3) entries for our older array... and one of them
even had a different controller address and fiber port.
So, even with two different CPUs, we do have disk sets configured
and working.
HTH, though I didn't have time to answer before,

--- Fine, I acted in a similar way. I just didn't understand why f(79,174,39)=54.




