SUMMARY: Problems with Exabyte Tape Drive

From: Leslie E. Forte (leslie@tigger.georgetown.edu)
Date: Wed Feb 19 1997 - 09:01:29 CST


Sorry for the delay in this summary - I wanted to find a resolution before
posting. My original message was:

******
I have been having a heck of a time with our 10 Gig 8mm Tape Desktop
Storage Pack - it is an Exabyte 8500 attached to a Sparc 20 running 2.5.
It started as a hardware problem, with the drive eating the tapes that I
would put in it. I called hardware support and they came out and replaced
the drive with an identical model. Then, when I put tapes in it, it would
spit them out after 30 seconds, with the error lights on. So, I called
hardware and they replaced the drive again. Now it will accept the tape
and it gives me a ready light, but when I do an mt status for /dev/rmt/0
(its correct device name) I get "/dev/rmt/0: no tape loaded or drive
offline". I have rebooted the machine with a -r to reconfigure and have
checked the device files to see if perhaps they were corrupt, and they
look fine. I then ran through drvconfig and tapes with a sun technician
to no avail. The cable I assume is ok, since it is in the middle of a
scsi chain and the disk drives after it are working just fine. Is that an
incorrect assumption? Nothing changed after the hardware replacement,
just the internal drive, no changes to the target id or anything.

Sun software support says it is definitely hardware, and hardware is
telling me it is definitely software, and neither of them will help me any
further. I am lost and I really need to get some backups done.
*******

SOLUTION:

I should have included in my post some more information about the
situation (such as the fact that probe-scsi-all did indeed show the
device, address numbers, etc), so most of the responses I got were asking
for more information. I did get some very useful information about SCSI
chains and the st.conf file. One notable comment is that Exabyte is not
a very well-liked tape drive! =)

In the end, I still cannot get the tape drive to work with its original
system, so I took it off and attached it to another Solaris box to at
least do remote backups. Since it works fine on the other box (using the
same cabling from the first box), I have pretty much ruled out hardware
and will now concentrate on finding a software solution.

Anyway, thanks to everyone who responded, I really appreciate you all
taking the time to help me out with this problem! Below are all of the
responses I got in case anyone needs them.

Best,

Leslie Forte

------------------------------------------------------------------------
Leslie Forte Reiss Science 238
UNIX Systems Administrator Phone: (202) 687-3108
Academic Computing Services Fax: (202) 687-6003
Georgetown University leslie@georgetown.edu
------------------------------------------------------------------------

RESPONSES:

From: Michael Kohne <mhkohne@moberg.com>:

First off, let me say that I can't stand exabyte. As a company, I don't
think much of them, or their products. We had an exabyte 4mm dat that gave
us a similar series of troubles, which was eventually replaced by them with
an 8mm unit (apparently even they can't get their 4mm units to work right).
I'd personally put more faith in sun's support people knowing what they are
doing than exabyte's.

I don't have specific knowledge of your problem but I'd lay odds on it
being the exabyte. Does scsiinfo or scsiping give anything useful on the
drive? If not, try hooking it to some other type of machine (I actually
like macs for this) and use a good scsi probing utility on it. This can
often reveal information that is hidden behind unix drivers.

Also, I would try getting some spare cables and see if you can make
anything different happen. You should probably also check all the cables
and connectors for problems - it's not that hard to bend those pins, and
depending on which pin you bend, it might cause interesting problems that
don't manifest the way you'd expect. Try shuffling devices around on the
SCSI bus (as in, change which drive is cabled first).

-------------------------------------------------------------------------
From: Ric Anderson <ric@rtd.com>:

If you have a hardware and software maintenance contract with Sun,
(which it sounds like you do) call in the problem and don't stop
escalating until it's fixed. Also cry on your salesperson's shoulder.
He or she will be bright enough to see lost sales if this doesn't get
resolved :-)

Of course, if you want to poke about yourself, I'd try the following -
most of which you've probably already done :-)
1. Halt the machine (a crash a day keeps the users away)
2. do a "probe-scsi-all" and make sure the tape DRIVE as well as
    the stacker answer the probe properly, and at the expected SCSI ID.
3. If they don't, power off the tape enclosure and pop the skins on
    it and verify that the 50 pin ribbon connector is fully seated
    in the drive. It's amazing what a slight angle on that connector
    can foul up.
4. Check the power plug also, at the back of the drive to make sure
    it is fully seated.
5. If the hardware looks fine, go to /dev/rmt and remove all the
    links. Then do a boot -r (or /usr/sbin/drvconfig followed by
    /usr/sbin/tapes).
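
Before removing the links in step 5, it can help to see which ones are
actually broken. The following is a minimal Python sketch (not part of the
original reply) that lists symbolic links in a directory whose targets no
longer exist. The function name `dangling_links` is hypothetical, and the
sketch assumes the entries under /dev/rmt are ordinary symbolic links into
the device tree.

```python
import os


def dangling_links(directory):
    """Return sorted names of symlinks in `directory` whose targets are missing."""
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # islink() is true for any symlink; exists() follows the link and
        # is false when the target it points to is not there.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(name)
    return broken
```

Calling `dangling_links("/dev/rmt")` on the affected machine would show
whether any tape links point at device nodes that are gone. An empty result
suggests the links themselves are not the problem.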

---------------------------------------------------------------------

From: John Stoffel <jfs@fluent.com>:

Leslie,

My first suggestion is to open up the case holding the drive and to
check and make sure that any DIP switches are set correctly for your
system. This should have been done when they took away the original
drive, but they may have forgotten to do so. Of course this assumes
there are some dip switches.

Then try shutting down the system and checking for the drive with a
'probe-scsi-all' command from the prom level. Does the drive come
back with the same name and rev numbers as before? You might have
gotten a drive with newer/older rev firmware that your system doesn't
understand properly.

What happens if you put a cleaning tape into the drive? Does it make
some whirring noises and then spit it out again? Then try booting the
system (boot -rv) and seeing what 'mt status' says then. Watch the
console carefully as it boots up to make sure it sees the tape drive
properly. The -v flag will give you verbose output.
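
If the check has to be repeated after every change, a small wrapper saves
typing. This is a minimal Python sketch (not part of the original reply)
that runs `mt -f <device> status` and looks only for the error text quoted
in the original post. The function names are hypothetical, and no other mt
output is modeled because none is shown in this thread.

```python
import subprocess

# Error text quoted in the original post; other mt messages are not modeled.
NO_TAPE_MESSAGE = "no tape loaded or drive offline"


def drive_reports_offline(mt_output):
    """True if the output contains the 'no tape loaded or drive offline' text."""
    return NO_TAPE_MESSAGE in mt_output


def mt_status(device="/dev/rmt/0"):
    """Run `mt -f <device> status` and return stdout and stderr combined."""
    result = subprocess.run(
        ["mt", "-f", device, "status"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

For example, `drive_reports_offline(mt_status())` would return True while
the drive is still in the state described in the original message.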

Go out and buy a DLT tape drive. You'll be happier and I wish I could
do the same with our jukebox.

----------------------------------------------------------------------
From: Frank Pardo <fpardo@tisny.com>:

Previous summaries on this list have talked about the position of
devices on the SCSI chain. You might try removing everything but the
tape drive, as a test. And if that works, try adding other devices one
at a time, both closer to the computer than the tape drive, and outboard
from it.

Again quoting from previous summaries, the difference between active and
passive SCSI terminators can be important. People are always saying to
use active terminators.

This may sound stupid, but... Have you experimented with more than one
tape in the drive? It could be that your test tape is defective...

-----------------------------------------------------------------------
From: "Dan A. Zambon" <dzambon@afit.af.mil>:

Hi,
I am not sure if this will help you or not. I have two tape
stackers (from MTI) on my Sun environment and have had troubles
galore with them. However, when I replace the exabyte drives
inside the stacker, I have to make sure that the CEI numbers
of the new replacement drive match those of the drive
being replaced.

For example, on my 10 tape stacker I just replaced the EXB-8505
drive. The CEI number on the drive is 870010*025. This number
(except for the 025 part) must be exactly the same for both drives.
If not, the mt (or tar) commands do not recognize the device.
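
The comparison described above can be written down as a minimal Python
sketch (not part of the original reply). It assumes the part that may
differ is whatever follows the `*` in the CEI number, as in the 870010*025
example. The function names and the second number used in the usage note
are hypothetical.

```python
def cei_base(cei_number):
    """Return the part of a CEI number before the '*' separator."""
    return cei_number.split("*", 1)[0].strip()


def cei_compatible(old_drive, new_drive):
    """True when two CEI numbers agree on everything before the '*'."""
    return cei_base(old_drive) == cei_base(new_drive)
```

With these definitions, `cei_compatible("870010*025", "870010*031")` is
True, while `cei_compatible("870010*025", "870011*025")` is False.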

I hope this helps - and I wish you all the luck....

---------------------------------------------------------------------
From: Mark Hargrave <root@wisdom.maf.nasa.gov>:

Have you tried a different brand of tapes?

---------------------------------------------------------------------

From: ssayer@aisys.com:

Have you checked to make sure that this device is not internally
terminated and that the SCSI bus itself is not incorrectly terminated in
some manner?

----------------------------------------------------------------------
From: Jay Lessert <jayl@latticesemi.com>:

[horror story clipped...]

My sympathies.

1) We *have* run into two bad 8500 refurb jobs in a row. It's possible.

2) We have run into "weak" power supplies on exb-10 stackers (the
    older ones, without the LCD display). One time we ended up running
    a stacker for a year with:

    - the cover off,
    - the stacker mechanism running off the internal power supply
    - the tape drive itself running off an external power supply

    It was the only way we could make it work, and we tried
    *everything*.

3) Just because the drive is in the middle of the SCSI chain doesn't
    mean the SCSI chain is ok (one or both of the internal SCSI
    connectors could be completely open, for example). So if you
    haven't done it yet, rewire the SCSI chain completely, or move
    the stacker to another cpu and try it there with another cable.

----------------------------------------------------------------------
From: Jim Harmon <jim@telecnnct.com>:

What you didn't mention here are the following things:

        IS the drive Fast SCSI, FastWide SCSI, active, passive?

        How long is your entire SCSI chain, SE? Differential?

        How many SCSI devices are ON the chain? 3? 4? 5? 6?

        What is the ADDRESS of the tapedrive? (I assume it's 5 or 6
        by default)

        Is your kernel configured to support the tape drive on an
        address other than 5 or 6?

Any of these issues could impact your problem. :)
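
To show why chain length and device count matter, here is a minimal Python
sketch (not part of the original reply). The limits in it are assumptions
taken from commonly cited SCSI-2 guidance: 6 m for single-ended buses, 3 m
for single-ended Fast SCSI, 25 m for differential, and at most 7 devices
besides the host adapter on a narrow bus. The function name and parameters
are hypothetical.

```python
# Assumed maximum total bus lengths in metres, keyed by (signalling, fast).
MAX_BUS_LENGTH_M = {
    ("single-ended", False): 6.0,
    ("single-ended", True): 3.0,
    ("differential", False): 25.0,
    ("differential", True): 25.0,
}

# A narrow bus has 8 IDs and the host adapter uses one of them.
MAX_DEVICES = 7


def check_chain(cable_lengths_m, signalling="single-ended", fast=False,
                device_count=0):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    total = sum(cable_lengths_m)
    limit = MAX_BUS_LENGTH_M[(signalling, fast)]
    if total > limit:
        problems.append(
            "bus length %.1f m exceeds %.1f m limit" % (total, limit))
    if device_count > MAX_DEVICES:
        problems.append(
            "%d devices exceeds %d allowed" % (device_count, MAX_DEVICES))
    return problems
```

Remember to include the cabling inside each enclosure in
`cable_lengths_m`, since internal ribbon cable counts toward the total.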

-------------------------------------------------------------------------

From: Bob Woodward <bobw@kramer.filmworks.com>:

Check your /kernel/drv/st.conf file. I'm running Solaris 2.4 on a Sparc 20
and though I'm using a DLT tape, I still have the settings for the Exabyte
drives. I'll include the relevant entries from my file so you can check
them against what's in yours:

(stuff at the top of the file.......)
        "EXABYTE EXB-2501", "Exabyte 2501 MiniQIC", "WtQIC",
        "EXABYTE EXB-2502", "Exabyte 2502 MiniQIC", "WtQIC",
        "EXABYTE EXB-8205", "Exabyte 8205", "Exa8200c",
        "EXABYTE EXB8500C", "Exabyte 8500c", "Exa8500c",
        "EXABYTE EXB-8505", "Exabyte 8505", "Exa8500c",
        "EXABYTE IBM-8505", "IBM Flavor 8505", "Exa8500c",
        "EXABYTE IBM-85XL", "IBM Flavor 8505XL", "Exa8500c",
(more stuff in the middle of the file.....)
Exa8200c= 1,0x35,0,0xd639,4,0x14,0x14,0x14,0x90,1
Exa8500c= 1,0x35,0,0xd639,4,0x14,0x15,0x15,0x8c,1
WtQIC = 1,0x32,512,0xc40a,1,0x00,0;
(a bunch more other stuff at the bottom of the file.....)

The commas at the end of the first section are important and the last
'group' entry for the second section should end with a semicolon. (The
WtQIC line is the last entry of the parameter listings in my file.)
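
As a quick consistency check on entries like the ones above, here is a
minimal Python sketch (not part of the original reply). It assumes the
parameter lines follow the layout version, type, block size, options,
number of densities, the density codes, and then the default density. That
layout is my reading of the st driver documentation, so verify it against
the man page for your release. The function name is hypothetical.

```python
def parse_st_property(line):
    """Parse one st.conf parameter line into a dict, checking density count."""
    name, _, value = line.partition("=")
    # int(text, 0) accepts both decimal and 0x-prefixed hexadecimal fields.
    fields = [int(f.strip(), 0) for f in value.strip().rstrip(";").split(",")]
    if len(fields) < 7:
        raise ValueError("too few fields in %r" % line)
    version, drive_type, block_size, options, density_count = fields[:5]
    densities = fields[5:-1]
    if len(densities) != density_count:
        raise ValueError(
            "%s: expected %d density codes, found %d"
            % (name.strip(), density_count, len(densities)))
    return {
        "name": name.strip(),
        "version": version,
        "type": drive_type,
        "block_size": block_size,
        "options": options,
        "densities": densities,
        "default_density": fields[-1],
    }
```

Feeding it the Exa8500c line above yields four density codes and a default
density index of 1. A line whose density count disagrees with the number of
codes raises ValueError, which is the kind of typo that is easy to miss by
eye.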

Hope this helps a little bit. If need be, I can probably email the whole
file to you if you think that will help.

Of course, I'm getting ready to go for the weekend so I won't be back until
Tuesday. Good luck.
---------------------------------------------------------------------------
From: White Gary SrA USAFE CSS/SCOE <Gary.White@ramstein.af.mil>:

Ensure the target ID of the tape drive does not conflict with any of
your other devices. Do a probe-scsi in PROM mode to ensure all devices
are unique and are being seen.

Gary White


