SUMMARY: Systems start disk thrashing

From: Stuart.Little@ssu.stcssl.co.uk
Date: Sun Jul 30 1995 - 06:26:25 CDT


Hi All,

Part of the original question (24th June):
> The problem is after about six days the systems
> slow down to the point of being unusable. When caught
> just before they became totally unusable vmstat
> showed they were paging heavily and you can hear constant
> disk activity.

I originally received a few responses pointing to a leak in
volume management (patch 101907-05) or possibly in named (patch
102479-01). With both patches applied we still had the problem.
Some people suggested using top to spot the errant process, but
there wasn't one. Using crash we spotted a kernel leak, but
couldn't understand why our test configurations worked fine
while the delivered systems fell over :-(
Finally it clicked: in the delivered systems one component was
missing, so every attempt to open a socket to it failed. Our
application would close the socket on failure and attempt the
open/connect again. A piece of test code proved the point: the
open/connect/close cycle had a leak.
Sun have provided a patch, 101945-33 (not yet official), that
appears to have nailed this one.
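
The test code itself isn't included here. For anyone wanting to
try something similar, the following is only a rough sketch of
an open/connect/close loop, not the code we actually ran. The
function name connect_cycle, its arguments and its return
convention are made up for illustration, and it uses inet_pton(),
which assumes a reasonably modern POSIX system.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical reproducer: open a TCP socket, try to connect to a
 * peer, close the socket, and repeat.
 *
 * Returns the number of connect() calls that failed, or -1 if the
 * address cannot be parsed or socket() itself fails.
 */
int connect_cycle(const char *ip, unsigned short port, int iterations)
{
    struct sockaddr_in addr;
    int failures = 0;
    int i;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                  /* not a valid IPv4 address */

    for (i = 0; i < iterations; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);   /* open */
        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0)
            failures++;             /* peer missing or refusing */
        close(fd);                  /* close whether or not connect worked */
    }
    return failures;
}
```

Calling something like connect_cycle("127.0.0.1", 9999, 100000)
against a port with nothing listening (the address and port here
are just examples), while watching vmstat in another window, is
one way to see whether memory keeps draining on an unpatched
system.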

Thanks to the following for their replies:-
Tim Wort <tim@access.com>
Kevin.Sheehan@uniq.com.au (Kevin Sheehan {Consulting Poster Child})
johnkm@netcom.com (John K. Mickelson)
Andreas Stoll <astoll@hypo.de>
D.White@ee.surrey.ac.uk
john@starinc.com (John Malick)
mshon@sunrock.east.sun.com (Michael J. Shon {*Prof Services} Sun Rochester)
David Ellis <dellis@bbn.com>
Jerry Lugert <jerry@sequana.com>
Per.Akesson@carmenta.se (Per Akesson)

Cheers,
  Stuart


