SUMMARY: Panic cpu0 writeback data parity error

From: Zion_Huang@Focus-Healthcare.CCMAIL.compuserve.com
Date: Tue Apr 21 1998 - 14:21:05 CDT


     To All Sun Managers:
     
     I still have not been able to determine the cause of the CPU0 panic
     on our Ultra-2 with 200 MHz CPUs.
     
     
     Here is a summary of what I have tried so far.
     
     
     Original Question:
     
     To all Sun Managers:
     
     Not long ago, Kun Li sent out a summary about a panic[cpu0] problem
     that was tracked down to a CPU problem on Ultra 2 167 or 200 MHz
     systems.
     
     According to that summary, the problem is fixed if the CPU number
     is 501-4791-02 or later.
     
     I checked our system and found that our CPU number is
     501-4791-04-5596, but the system still reboots itself at random,
     with the following error messages in the error log.
     
     I wonder whether the previous summary is correct, or whether the
     error message I got is something else.
     
     
     Thanks for your feedback and I will summarize.
     
     
     Zion
     
     
     The system is a dual 200-MHz Enterprise Ultra-2 running Solaris 2.6.
     The machine is less than one month old and has rebooted at random
     twice.
     Apr 16 06:16:52 vizion unix: panic[cpu0]/thread=0x60fa1ba0: CPU0 Writeback Data Parity Error: AFSR 0x00000000 00800800 AFAR 0x000001ff 30000000
     Apr 16 06:16:52 vizion unix: syncing file systems... 56 49 49 49 49 49
     49
     49 49 49 49 49 49 49 49 49 49 49 49 49 49 49 49 49 49 49 49 49 49
     49panic[cpu0]/thread=0x30043e80: panic sync timeout
     Apr 16 06:16:52 vizion unix: 7472 static and sysmap kernel pages
     Apr 16 06:16:52 vizion unix: 64 dynamic kernel data pages
     Apr 16 06:16:52 vizion unix: 199 kernel-pageable pages
     Apr 16 06:16:52 vizion unix: Copyright (c) 1983-1997, Sun Microsystems, In
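
For anyone who wants to track how often this panic recurs, here is a minimal sketch that pulls the AFSR and AFAR values out of a saved log. It assumes the message format is exactly as shown in the log above, and that each register is a 64-bit value printed as two 32-bit hex halves. The function name `parse_parity_panic` is hypothetical, and the sketch does not try to decode what the individual register bits mean.

```python
import re

# Pattern for the "Writeback Data Parity Error" panic entry as it appears in
# the log above. Assumption: each register is printed as two 8-digit hex words.
PANIC_RE = re.compile(
    r"panic\[cpu(?P<cpu>\d+)\]/thread=(?P<thread>0x[0-9a-fA-F]+):\s*"
    r"CPU\d+\s+Writeback\s+Data\s+Parity\s+Error:\s*"
    r"AFSR\s+0x(?P<afsr_hi>[0-9a-fA-F]{8})\s+(?P<afsr_lo>[0-9a-fA-F]{8})\s+"
    r"AFAR\s+0x(?P<afar_hi>[0-9a-fA-F]{8})\s+(?P<afar_lo>[0-9a-fA-F]{8})"
)


def parse_parity_panic(text):
    """Return one dict per parity-error panic entry found in text."""
    # Mail clients and terminals may wrap one log entry across several lines,
    # so collapse every run of whitespace to a single space before matching.
    flat = " ".join(text.split())
    results = []
    for m in PANIC_RE.finditer(flat):
        results.append({
            "cpu": int(m.group("cpu")),
            "thread": m.group("thread"),
            # Join the two halves into one integer per register.
            "afsr": int(m.group("afsr_hi") + m.group("afsr_lo"), 16),
            "afar": int(m.group("afar_hi") + m.group("afar_lo"), 16),
        })
    return results
```

Calling `parse_parity_panic(open(path).read())` on a saved copy of the messages file (the path is up to you) gives a list that can be compared between reboots, for example to see whether the AFAR value is the same each time.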
     
     Thanks to the following responders:
     
     Colin Melville
     Sandeep
     Leif Ericksen
     
     Summaries:
     
     Leif suggested checking the fan in the CPU box. The fan is working
     fine.
     
     Sandeep said that this error has been common to a batch of CPUs
     from Sun and suggested replacing them. Our CPUs are pretty new and
     at the -04 level, but the problem still appears.
     
     Colin suggested running vts to stress-test the processors. I ran
     it, but the result was clean, with no errors at all.
     
     My last resort is to call Sun to come out and swap the CPUs. I will
     keep you all posted if I have new findings. It seems I am not the
     only one running into this problem, and I hope some Sun technical
     staff on this list can shed light on it. I am new to Sun servers,
     and this is a brand new system, yet it is too unreliable for
     production deployment. I am rather disappointed by my short
     experience with the Sun system so far. :(
     
     
     Zion
