SUMMARY: Exabyte errors: Problem with drive or tapes?

From: Scott Butler (skb@cypress.com)
Date: Mon Apr 27 1992 - 16:21:22 CDT


Sorry about taking so long to summarize, but I wanted to wait
until I had solved my problems so that I posted accurate info.

Major (unrelated) system crashes slowed my progress. (Actually,
one of the few good backups was made just prior to the
crash and saved our butt!)

A condensed version of my original posting:
>
> AAAAUUUGGGHHHH!!!!
>
> I have read glowing praise from others about the wonderful problem
> solving abilities of the readers of this list. Who knows, I may
> already be a winner...
>
> Two out of three nights my console window is reporting the following
> exabyte errors:
>
> st1: Error for command 'write', Error Level: 'Fatal'
> Block: 1151 File Number: 0
> Sense Key: Hardware Error
> st1: Error for command 'write file mark', Error Level: 'Fatal'
> Block: 1151
> Sense Key: Media Error
>
> This pair of errors may be repeated several times. It doesn't matter
> whether I am doing a dump, tar or dd to the device, and the errors
> do not seem to happen in the same place on each tape (although
> I write multiple files to a tape, so the dump message "DUMP: Tape
> write error 907 feet into tape 1" is not much help.)
>
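
For anyone who wants to see how often each error pair shows up in a saved
console log, here is a minimal Python sketch. It assumes the messages look
exactly like the excerpt above (an "Error for command" line followed later
by a "Sense Key" line); the function name and the sample data are made up
for illustration and are not part of any Sun tool.

```python
import re
from collections import Counter

# Patterns modelled on the console excerpt above, e.g.
#   st1: Error for command 'write', Error Level: 'Fatal'
#   Sense Key: Hardware Error
ERROR_RE = re.compile(
    r"(?P<dev>st\d+): Error for command '(?P<cmd>[^']+)', "
    r"Error Level: '(?P<level>[^']+)'"
)
SENSE_RE = re.compile(r"Sense Key: (?P<key>.+?)\s*$")


def tally_tape_errors(lines):
    """Count (device, command, sense key) triples in console output."""
    counts = Counter()
    pending = None  # most recent error line still waiting for its sense key
    for line in lines:
        # Strip mail quoting ("> ") so pasted excerpts parse too.
        text = line.lstrip("> ").rstrip()
        match = ERROR_RE.search(text)
        if match:
            pending = (match.group("dev"), match.group("cmd"))
            continue
        match = SENSE_RE.search(text)
        if match and pending:
            counts[pending + (match.group("key"),)] += 1
            pending = None
    return counts


# Sample input copied from the excerpt above.
SAMPLE = [
    "> st1: Error for command 'write', Error Level: 'Fatal'",
    "> Block: 1151 File Number: 0",
    "> Sense Key: Hardware Error",
    "> st1: Error for command 'write file mark', Error Level: 'Fatal'",
    "> Block: 1151",
    "> Sense Key: Media Error",
]

if __name__ == "__main__":
    for (dev, cmd, key), n in sorted(tally_tape_errors(SAMPLE).items()):
        print(f"{dev}  {cmd:<16}  {key:<14}  {n}")
```

Feeding it a whole night's console output gives a quick count per command
and sense key, which may help show whether the failures cluster on one kind
of operation.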

Respondents:
        fabrice@sj.ate.slb.com (Fabrice Le Metayer)
        guyton@rand.org (Jim Guyton)
        ks@cypress.com (Krishnan Sampathkumar)
        kd@redwood.cray.com (Kevin Drysdale)
        decwrl!uunet.UU.NET!deltam!tigger!jt (jim wills)
        mark@maui.Qualcomm.COM (Mark Erikson)
        danielle@systems.caltech.edu (danielle sanine)
        feldt@phyast.nhn.uoknor.edu (Andy Feldt)
        daniel@CANR.Hydro.Qc.CA (Daniel Hurtubise)
        mis@seiden.com (Mark Seiden)

Many of the replies were requests to summarize whatever I found out
because the problems were happening there too.

Most of the suggestions were in some way or another related to
SCSI problems. Fabrice forwarded some SCSI hints from
Jim Guyton. This advice seems to be the most applicable to my
problems and I have included it below.

There were other suggestions related to SLC and IPC machines, which
claimed that the SCSI controllers were pretty messed up. If you have one
of those machines, you should see about swapping your SCSI controller
boards and getting a patch.

A couple of others suggested that my heads were bad, or I had bad
tapes. The problem was happening to many tapes and I doubted that
most of my tapes had suddenly gone bad. The drive was replaced
less than a month ago under a service contract, which made me doubt
that the heads would be bad. (Although having the drive replaced again
would have been my next step.)

*THE FIX* for me seemed to be shortening my SCSI cables. My 8200 is
at the end of a chain including a hard disk shoebox and a CD-ROM
device. The total cable length was right at 19', which is near the max.
I am now closer to 6'. Since shortening my cables a week ago, I have
not experienced the problem again.
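
To put the "near max" figure in numbers: the single-ended SCSI limit is
6 meters, or roughly 19.7 feet. The Python sketch below adds up the cable
segments in a chain and compares the total with that limit. The individual
segment lengths are hypothetical; only the totals of about 19' and 6' come
from my setup. Cabling inside the enclosures counts toward the total too, so
treat the result as a lower bound.

```python
FEET_PER_METER = 3.28084
MAX_BUS_METERS = 6.0  # single-ended SCSI bus limit


def bus_length_report(segments_ft, max_m=MAX_BUS_METERS):
    """Return (total_feet, limit_feet, fraction_of_limit) for a cable chain."""
    total_ft = sum(segments_ft)
    limit_ft = max_m * FEET_PER_METER
    return total_ft, limit_ft, total_ft / limit_ft


# Hypothetical segment lengths in feet (host -> disk box -> CD-ROM -> 8200).
# Only the approximate totals (19' before, 6' after) are from the post.
BEFORE_FT = [6.0, 6.0, 6.0, 1.0]
AFTER_FT = [2.0, 2.0, 2.0]

if __name__ == "__main__":
    for label, segments in (("before", BEFORE_FT), ("after", AFTER_FT)):
        total, limit, frac = bus_length_report(segments)
        print(f"{label}: {total:.1f} ft of {limit:.1f} ft allowed ({frac:.0%})")
```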

If I seem indecisive in claiming that shortening the cables fixed my
problem, it is only because my problems seemed to disappear several
days before the new cables arrived. However, I am told that SCSI
problems are often intermittent and one can go a long time before
noticing a problem. So I am willing to believe that this was
my problem.

-------------------------------------------------------------------------------
Scott Butler Telephone : (612) 851-5036
Cypress Semiconductor Inc. Fax : (612) 851-5199
2401 E. 86th Street E-Mail : skb@cypress.com
Bloomington,MN 55425, USA
-------------------------------------------------------------------------------

attachment:
----------------------------------------------------------------
SCSI HINTS:
----------------------------------------------------------------

Date: Wed, 1 Apr 92 09:04:58 PST
From: fabrice@sj.ate.slb.com (Fabrice Le Metayer)
>
> This may be helpful to you. It is not related directly to Exabytes, but
> to SCSI. However, there are some good hints.
>
> Regards,
>
> --
> Fabrice
> -- ,
> Fabrice Le Metayer DOMAIN : fabrice@sj.ate.slb.com
> Schlumberger Technologies - ATE UUCP : {amdahl,decwrl,uunet}!sjsca4!fabrice
> San Jose, CA 95110 BELL : (408) 437-5114
>
> "Argue for your limitations, and sure enough, they are yours."
> -- Richard Bach
>
> ----- Begin Included Message -----
>
 [....]
> From: Jim Guyton <guyton@rand.org>
>
> I've had lots of fun with SCSI over the last few years. Here's
> a checklist for you ...
>
> 0) Triple check the cables. Replace all/most of the scsi bus
> cables and see if the problem goes away. Make 'em shorter if
> you can.
>
> 1) Are BOTH ends of your SCSI bus terminated? Hint: the innards of
> the SS/1 and SS/2 have room on the bus for two internal disks and
> that counts as one end of the bus! Hint #2: you can run with the
> bus not-properly terminated for a long time w/out knowing it.
> I've purchased internal scsi terminators for us, but it's usually
> only needed when the cable length gets a tad long.
>
> 2) Have you tried turning off DISCONNECT-RECONNECT in the kernel?
>
> 3) If (1) and (2) aren't enough, I've got a nifty little SCSI disk mode
> page editor that I've been hacking on (and off; mostly off) over the
> last year that will let you examine (and change) the mode registers
> in the disk. It's of limited use w/out either my handholding you
> through it, or unless you've tried to read a SCSI manual and/or want
> to learn all about it ...
>
> ----- End Included Message -----
 


