Pre-SUMMARY: Problem with SPARCserver 4/390: memory configuration?

From: Claus Assmann (ca@informatik.uni-kiel.de)
Date: Fri Jun 12 1998 - 11:32:21 CDT


Original question is appended.

Preliminary summary:
The system still doesn't work, but the problem from the original question
has vanished. After removing and re-inserting all boards and tightening the
screws, this particular error was gone. So it seems the CPU board
didn't have full contact with the VME bus, and therefore recognized neither
the memory board nor the IPI controller.
However, the IPI disks seem to be unusable (fsck fails even when run manually),
and we still can't get through a complete suninstall run (for the SCSI disk):
the system still crashes nondeterministically, with no indication of
what the problem might be.
Someone is now completely disassembling the system, cleaning it, and putting
it back together (at least I hope so :-).

Thanks to:
Jeff Wasilko <jeffw@smoe.org>
bismark@alta.Jpl.Nasa.Gov (Bismark Espinoza)
James Ford <jford@tusc.net>
Joel Lee <joellee@continuus.com>
Sean Ward <sdward@uswest.com>
Brian <brw@jazz.njit.edu>
for their help!

Suggestions were:
- checking jumpers, PROM version, NVRAM
  (we had done this: the jumpers are identical, the PROM revision of the
  new board is one level higher, and the NVRAM has been transferred
  along with the EPROM containing the hostid etc.).
- testing the memory (the test now succeeds)
- an offer to ask hardware maintenance people (maybe we'll come
  back to this).

Most of the respondents had just gotten rid of their old manuals, but
we still have the docs that came with the system.

Thanks again,

Claus
-----
Original question:
We've got a problem with our SPARCserver 4/390 running SunOS 4.1.1 with
32MB of RAM (8MB on the CPU board, 24MB on a separate memory board).
It occasionally crashes. After some investigation, we ordered a new CPU
board. We got this today and tried to install it. However, the system
fails its selftest. We connected a terminal to it and started the
system in DIAG mode. The problem seems to be these messages:

<Probing Main Memory>
<Found 0x00000008 MB of Main Memory>
 error: Configuration Register incorrect for 8 MB
  exp = 0xf , obs = 0x00000005
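
For anyone comparing the two values in the output above, here is a small
Python sketch that lists which bit positions differ between the expected
and the observed value. It only does the bit arithmetic; what each bit of
the configuration register actually means is not known here, and the
function name is made up for illustration.

```python
def differing_bits(expected, observed):
    """Return the bit positions (0 = least significant) where the values differ."""
    diff = expected ^ observed  # XOR leaves a 1 wherever the two values disagree
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]

# Values copied from the diagnostic output; their meaning is not interpreted here.
exp, obs = 0xF, 0x00000005
print(f"exp = {exp:04b}, obs = {obs:04b}")
print("differing bit positions:", differing_bits(exp, obs))
```

With the values from the message, bits 1 and 3 are set in the expected
value but clear in the observed one.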

Questions: which configuration register does this message refer to?
Does this message mean that the memory board is defective, or is this
a misconfiguration of the new CPU board?
We set the appropriate values in the EEPROM (0x20 instead of 0x08),
but the error remains. We can boot the system from a CD-ROM,
but it crashes either at the reboot (after installation of a
mini-root) or at the latest during suninstall.
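
As a quick check of the two EEPROM values mentioned above, the following
Python sketch does the arithmetic. It assumes the EEPROM field simply holds
the installed memory size in megabytes as one byte; that reading fits the
values 0x08 and 0x20, but it is an assumption and not taken from the Sun
documentation. The function name is hypothetical.

```python
def eeprom_memory_value(megabytes):
    """Format a memory size in MB as a one-byte hex value (assumed encoding)."""
    if not 0 <= megabytes <= 0xFF:
        raise ValueError("size does not fit in one byte")
    return f"0x{megabytes:02x}"

cpu_board_mb = 8       # memory on the CPU board
memory_board_mb = 24   # memory on the separate memory board
total_mb = cpu_board_mb + memory_board_mb

print(eeprom_memory_value(cpu_board_mb))  # CPU board memory alone
print(eeprom_memory_value(total_mb))      # both boards together
```

Under that assumption, 8 MB corresponds to 0x08 and the full 32 MB to 0x20.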

Is there a way to test the remaining 24MB RAM?
What else should we try?
BTW: the system no longer recognizes the two IPI drives,
which hold all of our data (yes, we have a backup, but we would
like to boot from one drive).


