SUMMARY: Ecache SRAM Data Parity Error

From: Cathy Hargrave (cathy@mercury.stm.com)
Date: Tue Feb 11 1997 - 13:24:28 CST


hi

my thanks to
   Benjamin Cline <benji@hnt.com>
   Fletcher B. Cocquyt <fletch@ttmc.com>
   Jeff <jeffw@smoe.org>
for responding.

my original posting:

>
> i had a sun ultra-1 crash with the following error message:
>
> panic[cpu0]/thread=0x507a6fc0: CPU0 Ecache SRAM Data Parity Error: AFSR 0x00000000 00400080 AFAR 0x000000 00 100ffff0
> syncing file systems... [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1] [1]
>

some of the cpus on the early ultras were built with defective cache memory.
sun is shipping me a new motherboard.

i want to investigate sun's verification test suite, but i haven't had a
chance yet to look at it.

i appreciate the help. this error was very infrequent (i could document only
one occurrence, a month ago), so your help saved me a lot of work and
my company a lot of money. the machine is under warranty, but there is no
service contract.

cathy

-- 
Cathy L. Hargrave Smith                 phone: (972) 466-7599
SGS-Thomson Microelectronics, Inc.      fax:   (972) 466-7279
1310 Electronics Drive - MS 600         e-mail: cathy@stm.com
Carrollton, TX   75006

--------------------------------------------------------------------------------
From benji@hnt.com Mon Feb 10 13:36:43 1997

I don't _know_ what it means, but to hazard a guess, it looks like there was a parity error in the static RAM used for the external CPU cache. This might be caused by overheating, or it might indicate a serious hardware problem (hopefully your machine is still under warranty and/or you have a service contract!). If you can, I'd try running Sun's Verification Test Suite (shipped with Solaris 2.5+, but on a separate CD from the core OS) for 24-48 hours, and see if the problem happens again.
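
For readers unfamiliar with the term, here is a minimal sketch of how a parity check catches a single flipped bit in stored data. It is only an illustration: the function names are hypothetical, and the scheme shown (one even-parity bit per byte) is an assumption chosen for simplicity, not a description of how the Ultra's Ecache SRAM is actually organized.

```python
def even_parity_bit(byte: int) -> int:
    """Return the extra bit that makes the total count of 1 bits even."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte: int) -> tuple[int, int]:
    """Model writing a byte to memory together with its parity bit."""
    data = byte & 0xFF
    return (data, even_parity_bit(data))


def check(word: tuple[int, int]) -> bool:
    """Model reading a byte back: recompute parity and compare."""
    data, parity = word
    return even_parity_bit(data) == parity


stored = store(0x5A)
print(check(stored))  # data intact, parity matches

# Simulate a hardware fault: one data bit flips while the byte sits in memory.
corrupted = (stored[0] ^ 0x08, stored[1])
print(check(corrupted))  # parity no longer matches, so the error is detected
```

A single parity bit can only say that something changed; it cannot say which bit flipped or repair it, and two flipped bits in the same byte cancel out and go unnoticed. That limitation is one reason a detected parity error in a cache typically ends in a panic rather than a silent fix.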

--------------------------------------------------------------------------------
From fletch@ttmc.com Mon Feb 10 14:46:52 1997

We had this problem with one of our CPUs on a new Ultra 2 system. We opened a call with SunService and they replaced the CPU module. Apparently, some of the earlier CPUs were made with a defective type of cache memory. Anyway, the replacement fixed our problem.

--------------------------------------------------------------------------------
From jeffw@smoe.org Mon Feb 10 17:50:40 1997

Very likely, you have a bad CPU. Make sure that the fan is running (the one on the CPU). There's a known problem with the fans stopping; although this wouldn't cause your problem, it could lead to failures.

You'll need to replace your CPU board.


