SUMMARY: SCSI drive errors on 4/370

From: Jeff Pack (jfp@inel.gov)
Date: Wed Apr 22 1992 - 16:43:54 CDT


Thanks again to sun-managers; you always come through when the vendors
don't. The original problem was SCSI bus resets on a 4/370 with four Maxtor
P-12 drives attached. Suggestions came from:

>From danielle@systems.caltech.edu

>I don't know if this is your problem, but I had two Maxtor 1.0 GB
>drives fail in 4 months, then FINALLY the vendor informed me that the
>firmware on the disks was faulty. Seems the read-ahead cache causes
>various, intermittent errors/failures. I have some software that will
>determine the Rev # of the firmware and check to see if the read-ahead
>cache is turned on. Let me know if you would like me to email it to
>you. Unless you bought the drives VERY recently, I would bet this is
>your problem.

The drives are recent; Danielle sent me the software, and with it I was
able to confirm that the firmware is current. The software is also very
useful for checking the status of the read-ahead cache. Thanks much, Danielle.

*********

>From Perry_Hutchison.Portland@xerox.com

>Is there anything else on the SCSI bus besides the Maxtor drives? (Try
>a probe-scsi if the 4/370's ROM supports it.) I have seen stuff like
>this when trying to mix Adaptec- and Emulex-based shoeboxes on a sun-3
>under SunOS 3.5.

Just the 4 Maxtor drives were on the bus, but we're getting warmer...

*********

>From seiden.com!mis@seiden.com

>anyway, the terminations should be at the ends of the bus.
>sm0 perhaps has one? phys sd2 should have it instead...

>sd3 should (and does, according to your note) have the other.

The wiring scheme on the 4/370 is strange; because of the peripheral
trays, it's hard to trace how the wiring goes. Sun verified that my
terminations were correct.

*********

>From noel@essex.ac.uk

>The only odd thing I can see is that the snapshots refer to lun 1 all the
>time, and I take it that the drives are embedded SCSI (I don't know the
>drive designator off hand), so there are no LUN 1's at all, they're all 0.
>Is there anything running that's trying to access sd1/3/5 etc? I would be
>inclined to drop those lines from the kernel (you'll never use them) to
>avoid any aberrant program trying to get at non-existent hardware.

Bingo! Although the only 4 devices on the bus were the Maxtor drives, the
kernel configuration defined additional devices on LUN 1. When I commented
out the additional devices, the symptoms disappeared.
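
For anyone who wants to check their own config for the same thing, here is
a minimal Python sketch of the idea. It is not the config from this machine.
It assumes SunOS 4.x-style entries of the form
"disk sd1 at sm0 drive 001 flags 0", where the octal drive number encodes
target*8 + LUN; the function name and the sample lines are hypothetical.

```python
import re

# Assumed line shape (not taken from the original post):
#   disk sd1 at sm0 drive 001 flags 0
# The drive field is read as octal, encoding target * 8 + LUN.
DISK_LINE = re.compile(
    r"^\s*disk\s+(\w+)\s+at\s+(\w+)\s+drive\s+([0-7]+)\b"
)

# Hypothetical sample entries for illustration only.
SAMPLE = """\
disk sd0 at sm0 drive 000 flags 0
disk sd1 at sm0 drive 001 flags 0
disk sd2 at sm0 drive 010 flags 0
disk sd3 at sm0 drive 011 flags 0
"""


def comment_out_nonzero_luns(config_text):
    """Comment out disk entries whose LUN is not 0.

    Returns the edited text and the list of device names disabled.
    """
    kept = []
    disabled = []
    for line in config_text.splitlines():
        match = DISK_LINE.match(line)
        if match:
            drive = int(match.group(3), 8)   # octal drive number
            _target, lun = divmod(drive, 8)  # low octal digit is the LUN
            if lun != 0:
                kept.append("#" + line)      # disable, but keep for reference
                disabled.append(match.group(1))
                continue
        kept.append(line)
    return "\n".join(kept), disabled


if __name__ == "__main__":
    text, names = comment_out_nonzero_luns(SAMPLE)
    print(text)
    print("disabled:", ", ".join(names))
```

Under those assumptions, the entries at LUN 1 are commented out and the
LUN 0 entries are left alone. Check the drive-number encoding against your
own release before relying on it, then rebuild the kernel as usual.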

So, the moral of the story is, if you don't need it, take it out of the
kernel!

Thanks again, everyone.

-- 
---------------------------------------------------------------
Jeff Pack                         UUCP:  ...!uunet!inel.gov!jfp
Idaho National Engineering Lab    Internet:  jfp@inel.gov
P. O. Box 1625 M.S. 2603          Phone:  (208) 526-0007
Idaho Falls, ID 83415             FAX:  (208) 526-9936


