SUMMARY: A1000 disks not seen with lad utility

From: Rami Aubourg <rami.aubourg_at_ifrance.com>
Date: Tue Feb 04 2003 - 11:38:30 EST
Hello, gurus,

First of all, many thanks to all those who replied. Particularly:
Mike Salehi
Dominic Clarke
William Enestvedt

It was a rather tense moment for me, so I appreciated even more the 
community's support.

In fact, the problem was neither with the Ultra10 nor with any
configuration quirks.

All seven of my disks just happened to be physically dead (*Yes. It's
true*), which is why I couldn't see them. We rebuilt everything from a
remote backup, and now it's rolling along fine, apart from some cold
sweat.

What happened was rather incredible. The battery that the A1000 was
plugged into failed during the night, causing it to switch the A1000 on
and off every 5 seconds. That treatment apparently ended up killing all
of my disks. All seven of them.

What's funny is that I had two SunFire V100s on the same battery, and
they suffered no damage. Neither did the two internal disks of the
Ultra10.

Apparently, that could be because the SunFires and the Ultra10 don't
start their disks right away: there is some latency in the boot process
before the disks are started. The A1000, on the other hand, just does
what it is told, that is: start the disks, shut them down, start them
again, and so on.

So it is possible to totally screw up a disk array built for maximum
redundancy and data safety under certain particular conditions, i.e. a
bad electrical supply in my case. I believe I'm the only one who's had
this kind of experience. It's one of the very rare bad surprises I've
had with Sun hardware. Any feedback is welcome.

Rami Aubourg

Original post below.


***********************************************************
Hello, gurus,

I had an Ultra10 connected to an A1000 with three mirrored disks. We had
a serious power failure last night, and today I can't connect to the
A1000 anymore.
A probe-scsi-all sees the seven channels attached to the three mirrored
disks, plus the hot spare, and that's all. lad says there are no RAID
devices. rm6 saw no disks. The internal disk LEDs light up, but not the
ones on the A1000 corresponding to the disks. We changed the SCSI card,
the SCSI cable, and the controllers on the A1000, and tried putting the
disks in an entirely new A1000. Same effect. The last thing I can
imagine is the terminator, or a deeper problem with the motherboard.
The internal system disks on the Ultra10 are all right.

My colleague is with the people who sold us the array and the Ultra10
right now, with the hardware, trying to fix the problem with them. I'm
setting up another server from yesterday's backup in case everything
else fails. I'm also searching for clues as to how we could access the
disks on the A1000.

Has anyone had this kind of problem with an A1000 before? Were there
other tools that you used to access the disks? And in case the disks are
all right, is there a way to revert to a normal filesystem and plug them
into another server, without using the A1000?

Thanks in advance,

Rami Aubourg
*****************************************************
-- 
Lost Knowledge Sets Back Civilization

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Tue Feb 4 11:38:43 2003
