SUMMARY : IPC crashes - panic: error in swapping in u-area

From: vanandel@rsf.atd.ucar.edu
Date: Tue Mar 10 1992 - 17:50:08 CST


(I never saw my response, so I'm resending it. Sorry if this got out twice!)
My query was:

>I have a Sparc IPC, SUNOS 4.1.1, that has started crashing with
>"panic: error in swapping in u-area". The backtrace is not real informative:

>_panic(0xf8141308,0xf8206000,0x4000,0x2,0x4,0xd05) + 6c
>_swapin(0xf81b1e64,0x42,0x0,0xfffffffe,0x4000,0x2) + 64
>_sched(0xf8067124,0xf81b1e64,0x1,0x24,0x2,0x4) + 33c
>_main(0x4000e6,0xffffffff,0x29b41443,0xfffe,0xf8135000,0xf81b13d8) + 374

>The machine has 24 Meg of memory (8 original 1 Meg SIMMs, 4 3rd party 4 Meg
>SIMMS)

>I've reseated all the SIMMs, and the power-on memory test (selftest-#megs = 24)
>never shows a problem. The monitor's "test-memory" command also works just
>fine. I tried running /usr/diag/sundiag/sundiag. If I just test physical
>memory, everything is fine. When I try to test virtual memory I get this same
>panic. Now I've pulled my 3rd party memory out, and am trying to run with
>"just" 8 Megabytes. Again, sundiag can test physical memory all right, but
>hangs when trying to test virtual memory. And if I try to start up X11R5, I
>get the same panic.

......
--------------------------------------------------------------------

Many of the responses told me to suspect disk problems, rather than physical
memory or MMU problems, since the error occurred in "swapin", and Unix can't
gracefully handle disk I/O errors that occur during swapping. I checked the
kernel error logs, and didn't find any indication of hard disk problems.

However, the problem did turn out to be related to swap I/O, and was not
caused by bad SIMMs or MMU hardware. The IPC is running diskless and therefore
swaps over the network. I had improperly added an additional swap file,
because I didn't have enough space in /export/swap.
The /etc/fstab entry was:

/dt/bight.swap swap swap rw 0 0

Unfortunately, the file "/dt/bight.swap" WASN'T WRITEABLE over NFS by my
diskless client (permission 600 on the server). SunOS 4.1.1's "swapon" accepts
an unwriteable swapfile without complaint, but as soon as the system tries to
use the unwriteable swapfile, you get an NFS error, and then (some of the time)
the kernel panic. My short-term solution was to make "/dt/bight.swap" world
writeable, and that fixed the problem. A better solution is to expand
/export/swap so I don't need this auxiliary swapfile. My real solution is to
get another local disk back in this IPC.
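
Since "swapon" did not catch the unwriteable swapfile, a check along the
following lines could be run on the client before enabling swap. This is only
a rough sketch in Python, not a tool that shipped with SunOS 4.1.1; the
function names (swap_entries, can_open_for_write, unwritable_swap_files) are
hypothetical, and it assumes the usual whitespace-separated fstab layout with
the type in the third field, as in the entry above. Opening the file
read/write is also only an approximation: some NFS clients do not report a
permission problem until the first actual write.

```python
def swap_entries(fstab_text):
    """Return the first field of every fstab line whose type field is 'swap'."""
    entries = []
    for line in fstab_text.splitlines():
        # Drop comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Assumed layout: filesystem, mount point, type, options, freq, pass
        if len(fields) >= 3 and fields[2] == "swap":
            entries.append(fields[0])
    return entries


def can_open_for_write(path):
    """Try to open path read/write; 'r+b' neither creates nor truncates it."""
    try:
        with open(path, "r+b"):
            return True
    except OSError:
        return False


def unwritable_swap_files(fstab_text, check=can_open_for_write):
    """List swap entries that the calling user cannot open for writing."""
    return [path for path in swap_entries(fstab_text) if not check(path)]
```

Calling unwritable_swap_files() on the contents of the client's /etc/fstab
would list "/dt/bight.swap" for as long as the client cannot open that file
for writing.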

Additional information from the many helpful responses:
1) Just because SIMMs pass physical memory diagnostics, you can't necessarily
rule them out as a source of failure.

2) No one knows of any memory diagnostics other than sundiag and the
PROM-based ones.

3) If the problem is associated with physical disk errors, you should check
the cabling, verify that the drive firmware is up-to-rev, and run the format
program to check the disk.

As usual, the collective knowledge of this mailing list is fantastic!

Thanks much!

        Joe VanAndel Internet:vanandel@ncar.ucar.edu
        NCAR - ATD/RSF
        P.O. Box 3000 Fax: 303-497-2044
        Boulder, CO 80307-3000 Voice: 303-497-2071


