SUMMARY: Exabyte 8500 I/O error ??? (REPOST)

From: Real Page (rpage@coyote.matrox.com)
Date: Wed Apr 01 1992 - 07:45:59 CST


Sorry to repost, but I only got the first 89 lines back from the
relay, so it was truncated at some point.

Greetings,

Here is the summary of responses to my query. No real solution, but lots of
hints.

The problem seems to exist at many sites, but nobody really
has an explanation for the failure.

It might be related to the SunOS st driver, the EXB-5000,
some other SCSI device in the chain, power supply noise,
etc.

I suspect defective tapes, so I will be ordering better-quality tapes.

During the last week, I saw no failure on 6 tapes, while the
week before I had 3 failures in a row.

I will investigate further during the week.

Thanks to:

        "Cuong C Nguyen (408)764-6863" <cuongc@nad.3com.com>
        Tomas.Stephanson@eua.ericsson.se
        quejoh@calamari.Auto-trol.COM (Quentin Johnson)
        mark@maui.Qualcomm.COM (Mark Erikson)
        poffen@sj.ate.slb.com (Russ Poffenberger)
        uunet!tekbspa!edward (Edward Chien)
        Rebecca_Burwell.PARC@xerox.com
        davee@lightning.mitre.org (David N. Edwards)
        jem@gel1.gel1 (Jim Myrick)
        Perry_Hutchison.Portland@xerox.com
        tkevans@eplrx7.es.duPont.com (Tim Evans)
        ben@boxhill.com (Benjamin Monderer)

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Original posting:

From: rpage@coyote.matrox.com (Real Page)
Message-Id: <9203252356.AA04415@coyote.matrox.com>
Subject: Exabyte 8500 I/O error ???
To: sun-managers@eecs.nwu.edu (Sun Managers)
Date: Wed, 25 Mar 92 18:56:26 EST
X-Mailer: ELM [version 2.3 PL11]
Status: OR

Greetings,

I have an EXB-8500 in an EXB-10i stacker hooked to a SparcStation1 under
SunOS 4.1.2, no patches installed.

Almost every tape I use ends with the messages shown below.
We use SONY P6-120MP tapes (10/09/91).

I had the 8500 replaced because of a mechanical problem, but the error
still happens.

% /usr/ucb/rsh bourse /etc/dump 0bsfu 112 226000 bigbrother:/dev/nrst8 /dev/rsd2g

  ...

  DUMP: 38.94% done, finished in 0:23
  DUMP: write: I/O error

  DUMP: write: I/O error

  DUMP: Tape write error 11161 feet into tape 1
  DUMP: fopen on /dev/tty fails
  DUMP: The ENTIRE dump is aborted.

and the console gets:

st0: Error for command 'write', Error Level: 'Fatal'
        Block: 3616 File Number: 17
        Sense Key: Media Error
        Vendor (Exabyte EXB-8500 8mm Helical Scan) Unique Error Code: 0x3
esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate
st0: Error for command 'write file mark', Error Level: 'Fatal'
        Block: 3616
        Sense Key: Media Error
        Vendor (Exabyte EXB-8500 8mm Helical Scan) Unique Error Code: 0x3

Then the tape drive is in a weird state:

% mt -f /dev/nrst8 status
/dev/nrst8: no tape loaded or drive offline
esp0: Target 4 now Synchronous at 4.0 mb/s max transmit rate

The only way to get the drive back online is to "push" the eject button
on the unit. Not the best way to automate backup...
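
A backup script can at least detect this state before it starts the next
dump. The following is a minimal sketch, assuming the offline message is
worded exactly as in the mt output quoted above; the function names and the
TAPE variable are hypothetical and not part of any tool mentioned here.

```shell
#!/bin/sh
# Sketch only: decide from `mt status` output whether the drive looks usable.
# The matched text is the message quoted in the posting; other drivers or
# OS releases may word it differently (assumption).

# tape_offline: read mt status output on stdin, succeed if the drive
# reports that no tape is loaded or that it is offline.
tape_offline() {
    grep 'no tape loaded or drive offline' >/dev/null
}

# check_tape: hypothetical wrapper around mt for the device in $TAPE.
TAPE=${TAPE:-/dev/nrst8}
check_tape() {
    if mt -f "$TAPE" status 2>&1 | tape_offline; then
        echo "tape drive $TAPE is offline; eject and reload by hand" >&2
        return 1
    fi
    return 0
}
```

Calling check_tape before each dump lets the script stop and notify an
operator instead of aborting on the first write.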

I cleaned the heads, checked the cabling, and changed tapes, a lot of
tapes, but it still fails most of the time (>75%). I need this unit to work
for at least 5 tapes in a row to do a complete backup of my Suns, and I have
never been able to get it to run for more than one or two tapes in a row.

I also tried the driver that came with Budtool. I had better success
writing to tape, but the same kind of problem still occurred:

Mar 23 17:14:21 bigbrother vmunix: smt0: Delta Microsystems Copyright 1990 rev 1.3 3/10/89 (REL) -85Qanx0 SS-5000T 03U1
Mar 24 01:43:38 bigbrother vmunix: smt0: Media change
Mar 24 04:07:57 bigbrother vmunix: smt0::MEDIUM ERROR::33075:
Mar 24 04:07:57 bigbrother vmunix:
Mar 24 04:08:04 bigbrother vmunix: smt0::MEDIUM ERROR::33075:
Mar 24 04:08:04 bigbrother vmunix:
Mar 24 04:12:21 bigbrother vmunix: smt0::MEDIUM ERROR::33075:
Mar 24 04:12:21 bigbrother vmunix:
Mar 24 04:12:21 bigbrother vmunix: smt0: smtclose failed to write file mark
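
To see how often these errors occur, the log can be tallied per day. This is
a rough sketch, assuming the smt0 messages keep the format shown above; the
function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: count smt0 MEDIUM ERROR lines per day in syslog output.
# Assumes each line starts with "Mon DD" as in the excerpt above.

# medium_errors_per_day: read log lines on stdin, print "Mon DD count"
# for every day that has at least one MEDIUM ERROR entry.
medium_errors_per_day() {
    awk '/smt0::MEDIUM ERROR/ { n[$1 " " $2]++ }
         END { for (d in n) print d, n[d] }'
}
```

For example, `medium_errors_per_day < /var/adm/messages`, assuming that is
where the kernel messages are logged on the host.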

Do you experience the same kind of failure?
Could the tapes be too old?
Is it related to SunOS 4.1.2?
To the firmware on the 8500?
To pure bad luck?

I will summarize any relevant information.

-- 
+-----------+---------------------+----------------------+-------------+
| Real Page | (514)685-7230 #2359 | (514)685-7030 Fax    | Vaux mieux  |
| Administrateur des Systemes     | root@matrox.com      | jamais que  |
| Systemes Electroniques Matrox   | Real.Page@matrox.com | d'avoir tort!
+---------------------------------+----------------------+-------------+

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Responses:

Date: Thu, 26 Mar 1992 19:03:25 -0800
From: "Cuong C Nguyen (408)764-6863" <cuongc@nad.3com.com>

Hi,

I also had a very similar problem with the Exabyte 8500 and SunOS 4.1.2. My machine is a SparcII. I talked to a Sun SE, who said there are too many problems with the 8500 (??). So, to be safe (none of us can afford a bad backup, I assume), I used a 4/330 and SunOS 4.1.1...

Please summarize if you find another solution.

Regards,
Cuong C Nguyen
3Com/NAD SysAdmin
cuongc@nad.3com.com

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ Date: Fri, 27 Mar 92 10:20:54 +0100 From: Tomas.Stephanson@eua.ericsson.se

In erinet.mailing-list.sun.managers you write:

>Greetings,

>I have a EXB-8500 in a EXB-10i stacker hooked to a SparcStation1 under
>SunOS 4.1.2, no patches installed.

>About every tape I use finished with the following messages:
>We use SONY P6-120MP (10/09/91)

>I had the 8500 replaced because of a mechanical problem, but the error
>still happen.

We are having the same problems with an 8500 on a 4/690. We have changed everything except the 4/690.

Fri Mar 27 03:31:00 MET 1992
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/rid000g          983547  611200  273992    69%    /export/u1
/etc/dump 0sdbf 360000 460 20 /dev/nrst8 /dev/rid000g
  DUMP: Date of this level 0 dump: Fri Mar 27 03:31:02 1992
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/rid000g (/export/u1) to /dev/nrst8
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 1237218 blocks (604.11MB) on 0.34 tape(s).
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 22.96% done, finished in 0:16
  DUMP: Tape write error 36901 feet into tape 1
  DUMP: fopen on /dev/tty fails
  DUMP: The ENTIRE dump is aborted.
Fri Mar 27 03:37:59 MET 1992

Tomas Stephanson                  Email: Tomas.Stephanson@eua.ericsson.se
Ellemtel telecommunication labs   Voice: +46 8 7273881
17 170 Alvsjo SWEDEN              FAX:   +46 8 7274168

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Date: Fri, 27 Mar 92 10:08:42 MST
From: quejoh@calamari.Auto-trol.COM (Quentin Johnson)

One of my coworkers, who is on the Sun Managers mailing list, forwarded your problem to me.

I'm evaluating an EXB-8500 in an EXB-10i stacker with budtool. It too is connected to a SparcStation1 but with SunOS 4.1.1.

I'm using 3M D8-112 tapes and have no problems. It seems I've heard SONY P6-120MP tapes are okay to use. Maybe you got a bad batch of them or maybe they're write protected?

>I did use the driver that came with Budtool, I had better success with
>writing to tape but the same kind of problem was happening:

Yes -- use the smt device.

You should give Delta Microsystems tech support a call or whoever you got Budtool from.

Quent Johnson (quejoh@auto-trol.com)
Unix System Administrator
Auto-trol Technology, Denver

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ Date: Fri, 27 Mar 92 09:24:14 PST From: mark@maui.Qualcomm.COM (Mark Erikson)

Whenever we have had a rash of strange problems, it was due to power supply problems. 8200s and especially 8500s are not happy with out-of-tolerance voltages and noisy power.

Mark Erikson mark@qualcomm.com

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Date: Fri, 27 Mar 92 09:43:40 PST
From: poffen@sj.ate.slb.com (Russ Poffenberger)

Exabytes have a tendency to "wear" or become unreliable with use. We have gone through MANY 8200's. Good thing they are under contract.

About your only option is to have them serviced. It may just be alignment, or it could be head wear.

Russ Poffenberger               DOMAIN: poffen@sj.ate.slb.com
Schlumberger Technologies       UUCP:   {uunet,decwrl,amdahl}!sjsca4!poffen
1601 Technology Drive           CIS:    72401,276
San Jose, Ca. 95110             Voice:  (408)437-5254  FAX: (408)437-5246

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Date: Fri, 27 Mar 92 10:19:00 PST
From: uunet!tekbspa!edward (Edward Chien)

I had a lot of trouble using the newer Sony P6-120MP with a little 'F' on the package. (Be careful with that little 'F'. I heard that it is a coating to prevent data storage usage.) I got those write error messages on 50%+ of the tapes, so I switched to the HG tapes. I have had no problems at all using SONY (the ones without the little 'F'; the newer ones have it) and MAXELL. In 3 months and 120+ tapes, I have not experienced a single write I/O error!!!

Hope this helps.

Edward Chien (edward@tss.com | uunet!tekbspa!edward)
Systems Engineer, Teknekron Software Systems, Inc.
(415)617-2450

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ Date: Fri, 27 Mar 1992 10:20:08 PST From: Rebecca_Burwell.PARC@xerox.com

The people we deal with in buying our tape drives told us a few months ago that in an 8200 it is okay to use the SONY P6-120MP. In the 8500 they recommend using either the Exabyte tape or the SONY Metal HG. Apparently there are far fewer problems with the 8500 if you use the higher-quality tapes.

Don't think this is your problem, though.

*becky*

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Date: Fri, 27 Mar 92 14:42:07 EST
From: davee@lightning.mitre.org (David N. Edwards)

Greetings!

We have a similar problem with our Exabyte 8200 under SunOS 4.1_PSR_A. We don't get write errors, but the job will simply stop. No errors, and the processes seem to be swapped out. We've yet to figure it out.

I am buying a stacker in the near future, and my choice is between the ACL and Exabyte 10i. If you get a chance, would you tell me whether you need any device drivers to run the stacker? Is it in the SCSI chain, or are the robotics run by a serial connection to the CPU? Does SunOS4.1.2 know about the EXB-8500? Thanks in advance for any hints!

Regards,

Dave Edwards
MITRE Corp.
davee@mitre.org

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ Date: Fri, 27 Mar 92 15:44:29 EST From: jem@gel1.gel1 (Jim Myrick)

I had some trouble recently with our auto backup on an 8500. The problem was traced to a disk drive, however. Our backup would lock up, and occasionally write error messages would appear. I had to turn off the read-ahead cache on our 1.2 Gbyte Maxtor drives that we bought from Andataco. We still need new firmware, but for now turning off the read-ahead cache has worked. This may not be any help if you don't have these drives, but it is worth a try. Good luck.

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Date: Fri, 27 Mar 1992 13:07:49 PST
From: Perry_Hutchison.Portland@xerox.com

Methinks you are running out of tape. You may need to set the density in order to get the Exabyte to work properly. We are using the following on an 8200:

rsh -n mars rdump 0ufsbd dumphost:/dev/nrst0 6000 126 54000 /dev/rsd0g

For an 8500, you probably need to change the 6000 to 12000 OR the 54000 to 108000 -- not both, but I don't know which.
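
The "not both" advice can be checked with simple arithmetic. In the command
above, the key letters 0ufsbd take their arguments in order, so 6000 is the
size in feet (s), 126 the blocking factor (b), and 54000 the density (d).
The sketch below multiplies size by density to get a nominal byte count. It
ignores inter-record gaps, so it is a simplification rather than dump's
exact calculation, and the function name is hypothetical. The point is only
that doubling either argument doubles the length dump assumes.

```shell
#!/bin/sh
# Sketch only: nominal tape length in bytes implied by dump's s and d
# arguments (feet * 12 inches/foot * bytes/inch). Inter-record gaps are
# ignored, so this is not dump's exact formula (assumption).

# nominal_bytes FEET DENSITY
nominal_bytes() {
    echo $(( $1 * 12 * $2 ))
}

nominal_bytes 6000 54000     # settings quoted for the 8200
nominal_bytes 12000 54000    # size doubled
nominal_bytes 6000 108000    # density doubled
```

The last two calls print the same value, twice the first. Doubling both
arguments would quadruple the assumed length, and dump would then expect far
more room on the tape than the drive can provide.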

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ From: tkevans@eplrx7.es.duPont.com (Tim Evans) Date: Mon, 30 Mar 92 8:08:25 EST

Verily, Real Page hath said unto me:

I'm seeing the same errors, but not all the time, on 4.1.1. I have *not* however seen the situation where the tape drive was unusable; it at least keeps trying until my dump script runs all the way through.

Please do summarize and post the information you get. Thanks.

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Date: Mon, 30 Mar 92 09:43:06 EST
From: ben@boxhill.com (Benjamin Monderer)

We have been seeing problems of this type, and have not been able to identify the problem. We have one site with 25 drives that has been seeing the problem regularly, but intermittently, on most of the drives. Sometimes 50% of the dumps fail.

We are not able to reproduce the problem at our site, which makes the problem very annoying.

Would you be willing to help us debug the problem? If so, use the SunOS st driver for your dumps and turn on the st driver error messages:

Set st_error_level to 0x0 to get all request sense responses from the tape drive. Use the following commands to make the change in your running kernel. This will not crash or otherwise affect your operation, except that you will get more messages from the st driver.

# adb -w /vmunix /dev/kmem
st_error_level/W0
st_error_level?W0
^D

Send the kernel messages to me when you have a failure. We can decode them and I will tell you if we find a solution.

Thank you very much!!

Ben

------------------

Benjamin Monderer (ben@boxhill.com)
BoxHill Systems Corporation
161 Avenue of the Americas
New York, NY 10013
Tel: (212)989-HILL (4455)   Fax: (212)989-6817

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

-- 
+-----------+---------------------+----------------------+-------------+
| Real Page | (514)685-7230 #2359 | (514)685-7030 Fax    | Vaux mieux  |
| Administrateur des Systemes     | root@matrox.com      | jamais que  |
| Systemes Electroniques Matrox   | Real.Page@matrox.com | d'avoir tort!
+---------------------------------+----------------------+-------------+



This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:06:40 CDT