SUMMARY: SCSI errors

From: Peter Belding (peter@jarthur.Claremont.EDU)
Date: Thu Mar 05 1992 - 07:04:18 CST


Sorry for the delay in summarizing...

My original question:
> We recently added a "<Maxtor P-12S cyl 1431 alt 2 hd 15 sec 93>" as sd2 to a
> SparcStation 1. The system already had two internal 100Mb drives (sd0,1), an
> external 1.2Gb (sd3), and two Exabyte tape drives (st0,1). Ever since, we
> have been getting the following errors:
>
> vmunix: esp0: Current command timeout for Target 2 Lun 0
> vmunix: sd2: SCSI transport failed: reason 'reset': retrying command
>
> where the sd2 in the SCSI transport message ranges from sd0 to sd3. Every
> time we get this error, the machine freezes for about 30 seconds. I assume
> it's waiting for the SCSI bus to reset itself or something along those
> lines. The timeout doesn't seem to occur in any particular cycle.
>
> Have we overloaded the SCSI bus or is there something else I'm missing?
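
To see whether the failures cluster on one device or spread across sd0 to sd3, the messages can be tallied per device. The following is only a minimal sketch, assuming the log lines look exactly like the two quoted above; the function name and the sample list are made up for illustration and are not from any of the responses.

```python
import re
from collections import Counter

# Matches lines of the form quoted above, e.g.
# "vmunix: sd2: SCSI transport failed: reason 'reset': retrying command"
TRANSPORT_RE = re.compile(r"\b(sd\d+): SCSI transport failed: reason '([^']+)'")


def tally_transport_failures(lines):
    """Count SCSI transport failures per (device, reason) pair."""
    counts = Counter()
    for line in lines:
        match = TRANSPORT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample input: the two messages from the original question.
sample = [
    "vmunix: esp0: Current command timeout for Target 2 Lun 0",
    "vmunix: sd2: SCSI transport failed: reason 'reset': retrying command",
]

for (device, reason), n in sorted(tally_transport_failures(sample).items()):
    print(f"{device}: {n} x {reason}")
```

In practice the lines would come from the system log rather than a hard-coded list; where that log lives depends on the local syslog setup.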

The responses were:

- check the length of the SCSI cabling and keep it as short as possible.
- make sure the drive is terminated
- use active termination
- some MAXTORs have bad firmware.
- some sparcs have bad scsi controllers

What we ended up doing was getting our vendor to replace the drive. A
possibly useful thing to note is the drive now has a very different
cylinder, etc. count. <Maxtor P-12S cyl 2076 alt 2 hd 13 sec 74> Then again,
the vendor may just be playing with our minds. :-) It seems to be working
now.

Many thanks to all who responded.

More detailed responses (somewhat edited but not butchered by above
summarizing)

----
From: Arnold de Leon <arnold@Synopsys.COM>

Also what firmware version is the Maxtor? We had problems with HB17.

----
From: Forrest Cook <cook@stout.atd.ucar.EDU>

Peter, we have some Maxtor P-12S disks and spent a while tracking down an
unreliability problem that turned out to be caused by the write-ahead cache
on the disk. Maxtor has a couple of unix programs that can disable the cache.
If you need a copy, drop me a line and I will put the directory in our ftp
directory.

----
From: canuck@rice.edu (Mike Pearlman)

You may have exceeded the cable length restrictions. Here is a quick
checklist:

1) Check your terminator. Be sure that none of the devices in the middle of
   the SCSI chain are terminated. (Devices can have internal terminating
   resistors.)

2) Check your cable connections. Use as short cables as you can. Although the
   SCSI standard says up to 6 meters, this involves a cable impedance that is
   not available off the shelf and presumes perfect impedance matching with
   connectors and internal cabling. When you compute your SCSI cable length
   you need to account for the internal cabling in the devices as well as
   impedance mismatches. As a rough rule of thumb, figure 0.5 m to 0.75 m for
   each separate box attached.

3) Make sure that the disks are connected closer to the CPU than the tape
   drives, i.e. tapes at the end of the chain.
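
The checklist above can be turned into a rough sanity check. This is only a sketch under stated assumptions: the `Device` class, the `check_chain` function, and the example cable lengths are hypothetical, the 6 m limit and the per-box allowance are the figures quoted in the checklist, and the estimate ignores impedance mismatches entirely.

```python
from dataclasses import dataclass

MAX_BUS_LENGTH_M = 6.0      # limit quoted in the checklist
PER_BOX_ALLOWANCE_M = 0.75  # upper end of the 0.5 m to 0.75 m rule of thumb


@dataclass
class Device:
    name: str
    kind: str                 # "disk" or "tape"
    cable_m: float            # external cable leading to this box (assumed)
    terminated: bool = False


def check_chain(chain):
    """Return warnings for a SCSI chain listed from the CPU outward."""
    warnings = []
    if not chain:
        return warnings

    # 1) Only the last device in the chain should be terminated.
    for dev in chain[:-1]:
        if dev.terminated:
            warnings.append(f"{dev.name}: terminated in the middle of the chain")
    if not chain[-1].terminated:
        warnings.append(f"{chain[-1].name}: last device is not terminated")

    # 2) Estimated length = external cables plus an allowance per box.
    estimated = sum(dev.cable_m + PER_BOX_ALLOWANCE_M for dev in chain)
    if estimated > MAX_BUS_LENGTH_M:
        warnings.append(
            f"estimated length {estimated:.2f} m exceeds {MAX_BUS_LENGTH_M:.1f} m"
        )

    # 3) Disks should sit closer to the CPU than tape drives.
    seen_tape = False
    for dev in chain:
        if dev.kind == "tape":
            seen_tape = True
        elif dev.kind == "disk" and seen_tape:
            warnings.append(f"{dev.name}: disk placed after a tape drive")

    return warnings


# Hypothetical layout; the cable lengths are invented for illustration.
example = [
    Device("sd2", "disk", 1.0),
    Device("st0", "tape", 1.0),
    Device("sd3", "disk", 1.0),
    Device("st1", "tape", 1.0, terminated=True),
]
for warning in check_chain(example):
    print(warning)
```

The allowance uses the pessimistic end of the rule of thumb, so a chain that passes this check may still be too long in practice.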

Now for the bad news: MAXTOR has never seemed to get their firmware correct.
I have a setup where, with a MAXTOR drive in a cabinet, I see the same error
message that you do, especially when I am dumping the disk to my Exabyte.
With a Seagate or Fujitsu replacing the MAXTOR in the cabinet, no errors.

----
From: ray@isor.vuw.ac.nz

That error message reminds me of what happens when a SPARCstation 1 loses
control of SYNC SCSI and has to reset the bus. Are you attempting to use SYNC
SCSI?

----
From: Margarita M Suarez <marg@decagram.cc.columbia.edu>

i came across scsi problems when installing a CD-rom reader on a SS1 with an external (3rd party) 1.2gig drive and a tape drive already attached. the SS1 had an internal 200M disk (?).

the errors were:

esp: data transfer overrun
sd0: SCSI transport failed
esp0: target 3 now synchronous

i called sun, and they said that the scsi controller could have been out of
date. they had me open up the machine, and indeed, the CPU board was Rev 07
when currently they are up to Rev 09. apparently, there were SCSI bus length
problems with scsi controllers Rev 07 and below.

----
Additional responses from:

himes@comet.ucar.edu (David Himes)
Eckhard.Rueggeberg@ts.go.dlr.de
David Fetrow <fetrow@biostat.washington.edu>
cdr@kpc.com (Carl Rigney)
mti!dave
kalli!kevin@fourx.Aus.Sun.COM (Kevin Sheehan {Consulting Poster Child})
action@thynne.psych.nwu.edu (System Analyst)
poffen@sj.ate.slb.com (Russ Poffenberger)

Thanks again to all who responded.

-Peter Belding
 peter@jarthur.claremont.edu
 uunet!jarthur!peter


