SUMMARY: update hogging cpu

From: Lindy Foster (lindy@olsen.ch)
Date: Fri Mar 19 1993 - 15:20:10 CST


I originally wrote:

>We have a fresh SS2 with a 200Mb internal disk running 4.1.1b. It has
>a couple of fairly large processes (>20Mb) running on it. The load average
>goes up to ~2-5 and the interactive response declines to 0 (i.e., we
>can't even rup the machine). When we manage to get top running (by
>starting that up before the load goes up), we get to see that _update_
>is actually the culprit, using most of the CPU all the time.
>Has anyone ever seen this, or have a clue what could cause it? When we
>kill -STOP update, everything is rosy; this doesn't seem like such a good
>permanent solution, however ;-). Any suggestion welcome! Thanks,

Some additional information that I should have included the first time:

        The SS2 has 64Mb of RAM and approx. 150Mb of swap. It doesn't seem
        to be swapping at all. The processes are two instances of the same
        program, each about 25Mb in size. We actually see the same behaviour
        with just one of these running. Also, "time sync" tells us that
        approx. 800 i/o's are done just to run sync.

I'd like to thank the following for making suggestions:

        Svante Lindahl <sl@os.se>
        meisner@dlrtcs.da.op.dlr.de (Robert Meisner DT)
        dsnmkey@guinness.ericsson.se (Martin Kelly)
        cyerkes@jpmorgan.com (Chuck Yerkes)
        weingart@inf.ethz.CH
        poffen@sj.ate.slb.com (Russ Poffenberger)
        Christian Lawrence <cal@soac.bellcore.com>
        stern@sunne.East.Sun.COM (Hal Stern - NE Area Systems Engineer)

Many respondents suggested that we were simply swapping. This was not the
case, but I had given insufficient information for that to be determined.
One suggested renicing the jobs after starting them; we already did this,
and this was also not the problem. Two quoted me the man page for update,
but I had already read it ;-)

Svante Lindahl <sl@os.se> suggested the following patch:

>Have you tried patch 100259-03?
>
>Patch-ID# 100259-03
>Keywords: 4.1.1 ufs_inactive syncip nfs performance attachment vmstat
>Synopsis: SunOS 4.1.1: ufs_inactive patch
>Date: 13/Aug/91
>SunOS RELEASE: 4.1.1 *4.1
>BugID's fixed with this patch: 1054999
>Architectures for which this patch is available: sun3, sun3x, sun4, sun4c
>Patches which may conflict with this patch: Quickcheck 1.0
>Obsolete By: SysVR4
>On a 4/490 server, an NFS client is performing a large file copy operation.
>After a few minutes, the server slows down and the LED's on the back move
>very slowly. It may take minutes to execute any command. Running vmstat
>during the copy shows a sudden increase in the page attach rate and the
>CPU utilization hits 100%.

What we did (and it worked): we applied patch #100379-01. Here's the
lowdown:

Patch-ID# 100379-01
Keywords: intelligent swap block freeing algorithm.
Synopsis: SunOS 4.0.3;4.1;4.1.1: an intelligent swap block freeing algorithm.
Date: 9/Sep/91

SunOS RELEASE: 4.1.1, 4.1, 4.0.3 and 4.0.3c

Topic: an intelligent swap block freeing algorithm.

BugID's fixed with this patch: 1033861

Architectures for which this patch is available: sun4, sun4c

Patches which may conflict with this patch:

Obsolete By:

Problem Description:

Bug ID 1033861 -

OLD SWAP CODE
-------------
The old swap code started with an in-order free list. Every allocation of
more than one page was put back in reverse order, so programs with large
data segments caused the swap free list to become scrambled. This can lead
to some nasty behavior: specfs clusters I/O, which helps in many cases, but
clustering becomes worse than useless if adjacent blocks on the free list
are unrelated pages.

NEW SWAP CODE
-------------
This does an in-order insertion into the free list. swap_alloc
hasn't changed; it still takes the first thing off the list. swap_free
now walks down the list until it finds the right spot (a rough sketch of
the idea follows).
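
For illustration only, here is a minimal sketch of that kind of in-order
insertion; the names (struct swapblk, swap_free_old, swap_free_new) are
invented for the example and are not the actual vm_swap.c code:

/* Old behaviour: push each freed block on the head of the list, so a
 * multi-page run comes back in reverse order and the free list
 * gradually gets scrambled. */
struct swapblk {
    unsigned int blkno;                 /* position on the swap device */
    struct swapblk *next;
};

void
swap_free_old(struct swapblk **head, struct swapblk *b)
{
    b->next = *head;
    *head = b;
}

/* New behaviour: walk down the list and insert in block-number order,
 * so the free list stays sorted and clustered I/O stays useful. */
void
swap_free_new(struct swapblk **head, struct swapblk *b)
{
    struct swapblk **pp = head;

    while (*pp != NULL && (*pp)->blkno < b->blkno)
        pp = &(*pp)->next;
    b->next = *pp;
    *pp = b;
}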

POTENTIAL PROBLEMS
------------------
Some machines have large swap partitions. If you had a large process
allocated at the far end of the swap device, you could get N^2 behavior on
exit(). For a 5 meg process, that could turn into something like a
second. This seems too long. We added a single hint to the system: it
records the location of the last block freed. We check against this hint
when we are freeing a block. If the block is after the hint, we start
there; otherwise we start at the beginning of the free list. For the 5 meg
process, most of the time will be spent finding the first insertion point.
After that, it's a simple linked-list manipulation. This has the effect
of turning multi-page frees into single-page frees in terms of time.
This doesn't solve the problem for many small processes, but it does solve
it for one big one, which should be the only noticeable case.
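
A sketch of the hint, using the same invented structures as above (again,
not the real kernel code): remember where the last free landed and, when
the next block belongs after that point, resume the walk there instead of
at the head.

static struct swapblk *last_freed;      /* where the previous free went */

void
swap_free_hinted(struct swapblk **head, struct swapblk *b)
{
    struct swapblk **pp = head;

    /* If the block being freed lies beyond the hint, the sorted list
     * lets us resume the walk at the hint; otherwise start at the head. */
    if (last_freed != NULL && last_freed->blkno < b->blkno)
        pp = &last_freed->next;

    while (*pp != NULL && (*pp)->blkno < b->blkno)
        pp = &(*pp)->next;
    b->next = *pp;
    *pp = b;
    last_freed = b;
}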

TURNING IT ON AND OFF
---------------------
Notice: this new behavior is conditional on "swap_order". The default is
on. It can be switched off on a running system by saying

    echo swap_order/W0 | adb -w /vmunix /dev/kmem

You may switch it on and off on a running system at will. However, if you
switch it off, run for a while, and then switch it on, it will take the
system some time before the list gets reordered. You can force a reorder
by writing a small program that mallocs as much space as possible and
then exits.
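
Such a program might look roughly like this (a sketch only; the 1Mb chunk
size is arbitrary, and touching that much memory will itself page heavily,
so run it when the machine is otherwise quiet):

/*
 * Malloc as much as possible, touch it, then exit so it is all freed.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK (1024 * 1024)             /* allocate in 1Mb pieces */

int
main(void)
{
    char *p;
    long mb = 0;

    while ((p = malloc(CHUNK)) != NULL) {
        memset(p, 1, CHUNK);            /* make the pages real */
        mb++;
    }
    printf("allocated about %ld Mb, exiting\n", mb);
    return 0;
}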

The fix includes a patched version of:

        vm_swap.o

INSTALL:
As root:
# mv /sys/sun{4,4c}/OBJ/vm_swap.o /sys/sun{4,4c}/OBJ/vm_swap.o.orig
# cp {4.1.1,4.1,4.0.3,4.0.3c}/sun{4,4c}/vm_swap.o /sys/sun{4,4c}/OBJ/vm_swap.o
# chmod 444 /sys/sun{4,4c}/OBJ/vm_swap.o

Then rerun /etc/config <kernel-name>, and make and install the new kernel.
Please refer to the system and network administration manual for details
on building and installing a new kernel; a rough outline of the steps is
sketched below.
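
On a 4.1.x machine the rebuild goes roughly like this (a sketch only: the
conf directory path and kernel name here are assumptions and depend on how
your kernel sources are installed):

# cd /usr/kvm/sys/sun4c/conf            (use .../sun4/conf on a sun4)
# /etc/config <kernel-name>
# cd ../<kernel-name>
# make
# mv /vmunix /vmunix.orig
# cp vmunix /vmunix
# reboot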
 
 --------------------------------------------------------------

Thanks again to all!

                        *lindy*
lindy@olsen.ch


