SUMMARY: Write errors on storage array disks

From: Avrami, Louis (L.Avrami@dialogic.com)
Date: Fri Jul 24 1998 - 18:00:36 CDT


Hello all,

        This summary is late because we wanted to confirm the result
first: we applied the 'gummy bear' fix to the disks within our storage
arrays, and haven't received any error messages since.

        Thanks to everyone who replied, especially to Joy Silva and Joe
Harman of Sun, who recommended that we apply FCO (field change order)
A0081-1, which adds an "energy absorbing foam pad to the bracket handle"
of each disk within our storage array.

        In retrospect, this solution made perfect sense. Construction had
begun on the room next to our data center almost to the day when these
errors began to appear on our disks (a total of three separate disks were
affected). From what I understand, the disks vibrate until the heads are
"off-track", causing the errors.
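
        For anyone who wants to check their own logs for the same pattern,
here is a minimal Python sketch that tallies "track following error"
reports per disk and per parent device path. It assumes the syslog layout
shown in the original message quoted below (a device path and ssd instance
on the WARNING report, followed later by the ASC line). The function name
count_track_errors is hypothetical and only for illustration.

```python
import re
from collections import Counter

# Device line of an error report, e.g.
# "/sbus@6,0/.../SUNW,pln@a0000000,779a45/ssd@0,4 (ssd11):"
DEVICE_RE = re.compile(r"(/\S+)\s+\((ssd\d+)\):")
# ASC text reported for the errors discussed in this thread.
TRACK_RE = re.compile(r"track following error", re.IGNORECASE)


def count_track_errors(lines):
    """Count track following errors per (parent path, ssd instance).

    Assumption: each error report names its device before the ASC line,
    as in the log excerpt quoted in this message.
    """
    counts = Counter()
    current = None  # device of the report currently being read
    for line in lines:
        match = DEVICE_RE.search(line)
        if match:
            path, instance = match.group(1), match.group(2)
            # Everything before the final "/ssd@..." component is the
            # parent (controller) path shared by disks in one array.
            parent = path.rsplit("/", 1)[0]
            current = (parent, instance)
        elif TRACK_RE.search(line) and current is not None:
            counts[current] += 1
            current = None
    return counts
```

On Solaris the messages usually land in /var/adm/messages, so something
like count_track_errors(open("/var/adm/messages")) would give the tally.
Several disks under the same parent path all reporting this error, as we
saw, points to something common to the array (in our case, vibration).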

        This was the fix. Our next steps will be to upgrade Volume Manager
and the firmware.

        Thanks again to everyone who replied.

Lou Avrami

Get the Dialogic Edge at http://www.dialogic.com

> -----Original Message-----
> From: Avrami, Louis [SMTP:L.Avrami@dialogic.com]
> Sent: Wednesday, July 01, 1998 7:43 PM
> To: 'ssa-managers@Eng.Auburn.EDU';
> 'sun-managers@sunmanagers.ececs.uc.edu'
> Subject: Write errors on storage array disks
>
> Hello all,
>
> We're having a problem on one of our storage arrays ... if anyone
> can offer any advice, it would be appreciated.
>
> Platform is a Sun Enterprise 4000, Solaris 2.5.1, Veritas Volume
> Manager 2.1.1, with an RSM 2000 and two SSA 112 storage array devices.
> Firmware is version 3.11. We're using RAID 1, mirroring, on all disks.
>
> Below are the errors that we are receiving, on a fairly regular
> basis:
>
> Jul 1 10:14:14 itsun2 unix: WARNING: /sbus@6,0/SUNW,soc@d,10000/SUNW,pln@a0000000,779a45/ssd@0,4 (ssd11):
> Jul 1 10:14:14 itsun2 unix: Error for Command: write(10) Error Level: Retryable
> Jul 1 10:14:14 itsun2 unix: Requested Block: 1536 Error Block: 1536
> Jul 1 10:14:14 itsun2 unix: Vendor: SEAGATE Serial Number: 00984542
> Jul 1 10:14:14 itsun2 unix: Sense Key: Hardware Error
> Jul 1 10:14:15 itsun2 unix: ASC: 0x9 (track following error), ASCQ: 0x0, FRU: 0xfb
> Jul 1 11:27:29 itsun2 login: REPEATED LOGIN FAILURES ON /dev/pts/49 FROM 146.152.12.29
> Jul 1 13:58:17 itsun2 unix: WARNING: /sbus@6,0/SUNW,soc@d,10000/SUNW,pln@a0000000,779a45/ssd@0,2 (ssd9):
> Jul 1 13:58:17 itsun2 unix: Error for Command: write(10) Error Level: Retryable
> Jul 1 13:58:17 itsun2 unix: Requested Block: 3426496 Error Block: 3426508
> Jul 1 13:58:17 itsun2 unix: Vendor: SEAGATE Serial Number: 01489534
> Jul 1 13:58:17 itsun2 unix: Sense Key: Hardware Error
> Jul 1 13:58:17 itsun2 unix: ASC: 0x9 (track following error), ASCQ: 0x0, FRU: 0xfb
>
> Eventually we will get an error like this:
>
> Jun 28 08:21:28 itsun2 unix: WARNing: vxvm:vxio: write error on Plex vol06-01 of volume vol06 offset 16 length 4
> Jun 28 08:21:28 itsun2 unix: WARNing: vxvm:vxio: Plex vol06-01 detached from volume vol06
>
>
> and one of the mirrors will detach. The mirrors can be reattached
> without any problems.
>
> Yes, we should upgrade Volume Manager to the latest version (2.5.3)
> ASAP. And we also intend to upgrade the firmware to the latest release,
> version 3.12.
>
> We've applied patch 104708-13, without any effect. Perhaps there
> are some other patches that might be needed?
>
> Can anyone suggest any other causes/solutions for these problems?
> It seems that the three disks involved in these errors are all on the
> same controller, so could the problem be the controller?
>
> Any ideas would be appreciated.
>
>
>
> Lou Avrami
>
> Get the Dialogic Edge at http://www.dialogic.com


