SUMMARY: SCSI timeout errors when using external disk pack

From: Andy Malato <andym_at_oak.njit.edu>
Date: Fri Dec 03 2004 - 17:47:08 EST
I never found the exact cause of the problem, but I solved it by replacing
my 4 internal 4GB drives with 4 x 9GB drives, thereby eliminating the
need for external disks.

I would like to thank the following for their suggestions:

Rafael Hinojosa --

Suggested that the SCSI controller may be at fault, and I believe that
this might indeed be the problem.  Rafael also suggested other helpful
tests, such as running iostat -Mxn 1 to get an idea of the drive throughput.

Aleksander Pavic --

Suggested analysing the drives with the diagnostic tools provided by
format.  This actually helped to identify a bad drive, but the bad drive
did not prove to be the cause of the problem.

Matthew Stier --

Thought the problem might have been caused by a clash between SCSI and
UltraSCSI buses and drives, and suggested adding the following to
/etc/system:

* Begin SunSolve FAQ 1510
*set scsi_options=0x58
set scsi_reset_delay=10000
* End SunSolve FAQ 1510
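
As an aside, the value in the commented-out scsi_options line can be
decoded bit by bit.  The sketch below is in Python and assumes the bit
assignments from the Solaris header sys/scsi/conf/autoconf.h as best
recalled; they are not taken from the FAQ itself and should be checked on
the system in question.  Under that assumption, 0x58 leaves parity, linked
commands and disconnect/reconnect enabled while turning off synchronous,
tagged, fast and wide transfers.  Note that a leading * marks a comment in
/etc/system, so as listed above only scsi_reset_delay would take effect.

```python
# Bit names and values as recalled from Solaris <sys/scsi/conf/autoconf.h>.
# This table is an assumption; verify it against the header on your system.
SCSI_OPTION_BITS = {
    0x008: "disconnect/reconnect",
    0x010: "linked commands",
    0x020: "synchronous transfer",
    0x040: "parity",
    0x080: "tagged queuing",
    0x100: "fast SCSI",
    0x200: "wide SCSI",
    0x400: "Ultra (Fast-20) SCSI",
}


def decode_scsi_options(value):
    """Return (enabled, disabled) feature-name lists for a scsi_options mask."""
    enabled = [name for bit, name in SCSI_OPTION_BITS.items() if value & bit]
    disabled = [name for bit, name in SCSI_OPTION_BITS.items() if not value & bit]
    return enabled, disabled


if __name__ == "__main__":
    enabled, disabled = decode_scsi_options(0x58)
    print("enabled: ", ", ".join(enabled))
    print("disabled:", ", ".join(disabled))
```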


I also discovered a utility called scsiinfo, which I downloaded from:

        ftp://ftp.cs.toronto.edu/pub/jdd/scsiinfo/scsiinfo-4.7.shar

Running scsiinfo showed that whichever target ID was timing out at the
time (targets 2.0 and 8.0 both had problems) had its channel identified as
"NOISY", which would suggest a cable problem, but as mentioned the cable
was replaced several times.
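
To see which targets time out and how often, the warnings in
/var/adm/messages can be tallied.  The following is a minimal Python
sketch, not something used at the time; it assumes the warning text has
the same wording as the glm messages quoted in the original post below,
and the function name count_timeouts is made up for illustration.  The
sample lines in the sketch are constructed in that shape.

```python
import re
from collections import Counter

# Matches the glm warning text as quoted in the original post.
TIMEOUT_RE = re.compile(r"Connected command timeout for Target (\d+)\.(\d+)")


def count_timeouts(lines):
    """Count 'Connected command timeout' warnings per target string (e.g. '2.0')."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[f"{match.group(1)}.{match.group(2)}"] += 1
    return counts


if __name__ == "__main__":
    # Constructed sample in the shape of the quoted messages; in practice
    # the lines would be read from /var/adm/messages.
    sample = [
        "WARNING: /pci@1f,4000/scsi@2 (glm1):",
        "        Connected command timeout for Target 2.0",
        "WARNING: /pci@1f,4000/scsi@2 (glm1):",
        "        Target 2 reducing sync. transfer rate",
        "WARNING: /pci@1f,4000/scsi@2 (glm1):",
        "        Connected command timeout for Target 8.0",
    ]
    for target, n in count_timeouts(sample).most_common():
        print(f"Target {target}: {n} timeout(s)")
```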



Again, thanks to all for their efforts in trying to help me solve the above
problem.



        ---Andy



! Date: Wed, 1 Dec 2004 17:26:06 -0500 (EST)
! From: Andy Malato <andym@oak.njit.edu>
! To: sunmanagers@sunmanagers.org
! Subject: SCSI timeout errors when using external disk pack
!
!
!
! System setup is an E450 running Solaris 9 kernel patch 117171-07.
!
! The system has 4 internal 4.2GB disks (Model: DDRS-34560) and an
! external SPARCstorage MultiPack 2 with 4 of those same drives, used for
! mirroring the internal drives.
!
! The problem seems to be with the external disk pack as the problem goes
! away when the MultiPack is removed.  Every so often the following messages
! appear in /var/adm/messages.
!
!
!         WARNING: /pci@1f,4000/scsi@2 (glm1):
!                 Connected command timeout for Target 2.0
!         WARNING: /pci@1f,4000/scsi@2 (glm1):
!                 Target 2 reducing sync. transfer rate
!         WARNING: /pci@1f,4000/scsi@2 (glm1):
!                 got SCSI bus reset
!         WARNING: /pci@1f,4000/scsi@2/sd@2,0 (sd17):
!                 SCSI transport failed: reason 'timeout': retrying command
!
!
! The above messages show connection timeout problems for Target 2.0, but a
! similar message has also been logged for Target 8.0.
!
! The disk drives for target 2 and 8 have been replaced along with the scsi
! cable connecting the MultiPack to the E450.  In addition, a different
! MultiPack was used, but the above problem still occurs.  The MultiPack is self
! terminating, and does not require the use of a terminator, however a
! terminator was placed on the back of the MultiPack to see if the problem
! was a result of the MultiPack not terminating itself correctly, but again,
! this did not prove effective.
!
! Hopefully someone has seen a situation like this before and can provide
! insight into solving the above problem.
!
!
! Thanks for any help that one can provide.
!
!
!
!         ---Andy
! _______________________________________________
! sunmanagers mailing list
! sunmanagers@sunmanagers.org
! http://www.sunmanagers.org/mailman/listinfo/sunmanagers
!
Received on Fri Dec 3 17:47:39 2004