SUMMARY: Sun Cluster 3.2, did devices over 100

From: Markus Mayer <>
Date: Mon Nov 26 2007 - 09:17:14 EST
Hi all,

Thanks for the responses - only three (Francisco Mauro Puente, Dean Ross-Smith,
Martin Preßlaber), but they were sufficient to help me find my way.

Dean pointed out the did manual page, where it mentions that did devices are
dynamically generated in groups of 100 at a time -  the next 100 are
generated only when there are more disks available to the cluster than there
are did devices, hence nothing over 100 for me.
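In other words, new DID instances only appear once the cluster sees more disks than it has instances allocated. A minimal sanity check is to compare the highest allocated DID instance against the number of LUNs presented. A sketch, assuming `scdidadm -L`-style output (the sample lines below are made up; on a live node you would pipe the real command instead):

```shell
# Made-up sample lines in the format of `scdidadm -L` (instance, host:device,
# DID path). On a real cluster node, replace the sample with:  scdidadm -L
sample='1 wombat:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1
35 wombat:/dev/rdsk/c6t0d5 /dev/did/rdsk/d35
100 wombat:/dev/rdsk/c6t0d6 /dev/did/rdsk/d100'

# Highest DID instance currently allocated: take the first field of each
# line, sort numerically, keep the last (largest) value.
highest=$(printf '%s\n' "$sample" | awk '{print $1}' | sort -n | tail -1)
echo "highest DID instance: d$highest"
```

If that number equals the disk count, nothing above it will be generated until more disks appear.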

Martin suggested setting the cluster back into install mode and configuring the
HDLM so that it doesn't show any disks.

In the end, I got our storage admin to remove all disks from the cluster. I
rebooted, cleared the did device tree, got rid of all did devices and every
reference to the storage, rebooted, and made sure everything was cleaned of
the HDLM and storage devices (it wasn't, so I repeated the cleaning, this time
successfully), then rebooted again. The disks were then made available from
the storage again, I reconfigured the HDLM, rebooted, reconfigured the did
devices, rebooted once more because the devices weren't immediately
accessible, renamed the did devices again with lower numbers, and rebooted a
final time. Now everything seems to be at least partly OK.  Somehow I had
that win... feeling.
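For the record, here is a rough command-level sketch of that sequence. It is an assumption-laden reconstruction, not a recipe: I take the post's `didadm` to mean `scdidadm`, the flags are the ones mentioned in the thread, and d101:d35 is just an example mapping.

```shell
# Run on each node, with the storage LUNs unmapped first (hedged sketch,
# Sun Cluster 3.2 / Solaris 10; adapt instance names before use).
scdidadm -C           # clean up DID instances whose disks are gone
devfsadm -Cv          # prune dangling /devices entries and /dev links
init 6                # reboot; re-check and repeat the clean if HDLM devices remain

# ...after the LUNs are re-presented and HDLM is reconfigured:
scdidadm -r           # repopulate the DID namespace from the visible disks
scdidadm -t d101:d35  # example rename back to an instance under 100
init 6                # reboot so the new device links become usable
```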

It also seems there is some kind of bug in the version of the HDLM that we
are using.  It rears its head with the message "No such device" when trying
to access any disk with format, although I can actually access, partition, and
put file systems on all the disks.  Let's see if this changes in a newer
version of HDLM.

Thanks and regards

On Wednesday 21 November 2007, Markus Mayer wrote:
> Hi all,
> I'm having a hard time with Sun Cluster 3.2 in a two node cluster.  We have
> a Hitachi AMS500 and HDLM, Solaris 10 update 3, all patches until
> the start of September installed.  I have 30 disks from the storage array
> made available, and to keep an overview of which disk came from which
> pool on the array and what each was for, I renamed the did devices on
> the disks (eg: didadm -t d35:d101).  The renaming procedure went through
> without any problems, however since rebooting, I can do nothing with these
> did devices any more.  For example, the boot messages for one of these
> disks are:
>
>   Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: Could not register disk
>   Nov 21 13:12:48 wombat Cluster.CCR: /usr/cluster/bin/scgdevs: Could not stat /dev/global/dsk/d123s2:
>   Nov 21 13:12:48 wombat Cluster.CCR: No such file or directory
> didadm -c or -R returns no errors, so it thinks everything is ok.
> I have tried everything I can think of to access the disks, to remove them
> from the device tree, to rename them again to device numbers under 100, and
> to clean the did device tree, all with no success (didadm -C
> then -r, devfsadm -Cv, didadm -t d101:d35).
> I have examined the problem and seen that there are no did device entries
> (links) in /dev/did greater than d100, the devices go from did/d0 to
> did/d100, there is nothing for did/d101 and onwards.  The same applies
> in /devices/pseudo/did.  According to Sun documentation, these should
> however be dynamically generated.
> I can see two possibilities here -
> 1. Find a way to get cluster to generate the missing device entries.
> 2. Find a way to remove all the problem entries and go back to something
> more conservative.
> So far I have not been able to find a way to do either of these.  I have
> considered the possibility of manually brutally deleting the troublesome
> entries, however I don't know what consequences this will have.
> I would be grateful to anyone who could point me in a direction to either
> get cluster to delete the troublesome entries and start again, or generate
> the missing entries.
> Thanks
> Markus
> _______________________________________________
> sunmanagers mailing list
Received on Mon Nov 26 09:17:51 2007

This archive was generated by hypermail 2.1.8 : Thu Mar 03 2016 - 06:44:07 EST