SUMMARY: CPU extremely busy with high percentage of system processes

From: John Lee <thesunlover2002_at_yahoo.com>
Date: Tue May 28 2002 - 10:36:05 EDT
Question:
I have a Sun system (an NFS client) that currently has
a CPU problem. System/kernel time is consuming 50% to
70% of the CPUs. Here is the info from top: "CPU
states: 7.0% idle, 20.0% user, 72.9% kernel, 0.1%
iowait, 0.0% swap Memory: 1024M real, 479M free, 152M
swap in use, 1639M swap free". The user processes look
fine in both 'top' and '/usr/ucb/ps -aux'; none of
them consumes much CPU.

Solution:
No magic solution yet. I will post another question
about this issue.

Answers: Thank you all who answered.
james.brown@us.abb.com wrote:
Use snoop to check for NFS errors from other boxes.
That is a likely thing to look for. If you have a lot
of network traffic the kernel will consume the CPU
cycles.

"Kevin Buterbaugh" <Kevin.Buterbaugh@lifeway.com>
wrote:
If you're running Solaris 8, run prstat and see what
it says is using the most CPU. If you're running
Solaris 7 or earlier, you're going to have to rely on
top (although it's not supported by Sun and not always
100% accurate) for graphical info. How many CPUs does
the box have? If it has just one, context switching
may be your problem. Finally, does what "sar -u" and
vmstat tell you agree with what top says? If they
don't, believe sar and / or vmstat, not top.
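
One way to make that comparison is to average the CPU
columns of a vmstat run and set them beside top's "CPU
states" line. This is only a sketch: it assumes the usual
Solaris vmstat layout, where the last three columns are
us, sy and id, and it skips the first sample because that
one reports averages since boot. The function name
vmstat_cpu_avg is made up for illustration.

```shell
# Average the us/sy/id columns of vmstat output read from stdin.
# Header lines are ignored because their last field is not numeric;
# the first data row (since-boot averages) is skipped.
vmstat_cpu_avg() {
    awk '$NF ~ /^[0-9]+$/ {
             rows++
             if (rows == 1) next
             us += $(NF-2); sy += $(NF-1); id += $NF; n++
         }
         END {
             if (n) printf "us=%d sy=%d id=%d\n", us/n, sy/n, id/n
         }'
}
```

Typical use would be something like "vmstat 5 6 |
vmstat_cpu_avg". If sy stays far above us here as well,
top's kernel figure is probably telling the truth.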

"William Yodlowsky" <wyodlows@andromeda.rutgers.edu>
wrote:
We had a very similar issue here. We had moved a lot
of local disk data to a remote machine, and the
processing of the NFS and IP stack traffic alone was
consuming most of the CPU (same symptoms). Altering
the NFS read/write block size helped, but ultimately
the machine was just too slow to keep up, and the
hardware was upgraded.
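
For reference, the NFS read/write block size is set on the
client with the rsize and wsize mount options. The sketch
below only prints the command so it can be checked before
running it; the server name, paths and the 8192 value are
placeholders, not settings anyone in this thread reported
using.

```shell
# Print (do not run) a Solaris NFS mount command with explicit
# read/write transfer sizes.  All four arguments are placeholders.
nfs_mount_cmd() {
    # $1 = server:/path  $2 = mount point  $3 = rsize  $4 = wsize
    echo "mount -F nfs -o rsize=$3,wsize=$4 $1 $2"
}

nfs_mount_cmd fileserver:/export/data /data 8192 8192
```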

"sajeev nv" <sajeev20@rediffmail.com> wrote:
Are you running a web server or LDAP-type applications?
This can also be caused by an application or by Oracle.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following answer was from "Karl Vogel"
<vogelke@dnaco.net>. Thank you very much Karl !!

Answer:
Karl Vogel wrote:
We've run into this several times. Some suggestions
are below; our system is behaving much better.
First, I'd recommend upgrading to Solaris-8 if you're
not already there. It's stabler than 2.6 and has some
nice performance improvements. It comes with
filesystem journalling, which makes disk performance
absolutely *fly*. Solaris-8 also comes with a
different paging scheme, which replaces what earlier
releases called priority paging.
http://www.sun.com/sun-on-net/performance/priority_paging.html

-----------------------------------------------------------------------------
http://docs.sun.com/ab2/coll.709.2/SOLTUNEPARAMREF/
Overview of Solaris System Tuning
Tuning a Solaris System
Tuning the Solaris Kernel
Special Structures
Viewing System Configuration Information
kstats
Solaris Kernel Tunables
NFS Tunable Parameters
-----------------------------------------------------------------------------
You won't be able to get this unless you have a
SunSolve Online account.
http://sunsolve.sun.com/private-cgi/retrieve.pl?type=2&doc=stb/1442
White Papers/Tech Bulletins 1442
Delivering Performance on Sun: System Tuning
Greg Schmitz and Allan Esclamado, 30-Apr-1999
This document focuses on techniques for performance
tuning for the Sun computing environment.  It is aimed
at system administrators. Each chapter concentrates on
a different subsystem of the computing environment 
(e.g., Tuning the Solaris Kernel, Memory, Tuning Disk
Subsystems, etc.) and the specific things that can be
done to increase performance.
-------------------------------------------------------------------------
If you have an application that's an incredible swap
hog, or the system is really slowing down, try adding
the lines below to /etc/system and rebooting. I run
with these settings and they've never caused me
trouble.

*  Swap
*  System keeps 1/8th of all memory for swap, which is too much for
*  a 4GB system. Reduce that to 32 Mbytes (4096 8K pages).
set swapfs_minfree=4096

*  Memory management
*  http://www.carumba.com/talk/random/tuning-solaris-checkpoint.txt
*  Tuning Solaris for FireWall-1
*  Rob Thomas robt@cymru.com, 14 Aug 2000
*
*  On firewalls, it is not at all uncommon to have quite a bit of
*  physical memory.  However, as the amount of physical memory is
*  increased, the amount of time the kernel spends managing that
*  memory also increases.  During periods of high load, this may
*  decrease throughput.
*
*  To decrease the amount of memory fsflush scans during any scan
*  interval, we must modify the kernel variable autoup.  The default
*  is 30.  For firewalls with 128MB of RAM or more, increase this
*  value.  The end result is less time spent managing buffers,
*  and more time spent servicing packets.
set autoup = 120
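
The swapfs_minfree value is a page count, so it depends on
the page size of the machine. A small sketch of the
arithmetic follows; mb_to_pages is a made-up helper name,
and the page size should come from running "pagesize" on
the target box rather than being assumed.

```shell
# Convert a size in megabytes to a page count, given the page size
# in bytes (run "pagesize" on the target machine to find it).
mb_to_pages() {
    expr "$1" \* 1024 \* 1024 / "$2"
}

mb_to_pages 32 8192     # prints 4096
```

On a machine with 4K pages the same 32 Mbytes would come
out as 8192 pages instead.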

-------------------------------------------------------------------------
Check your most popular applications (using truss) for
the following:

* lots of kernel-level system calls, like open(),
read(), write(); all of these require an interrupt
plus kernel attention.
* lots of fork() or exec() calls to start new
processes; fork() under Solaris is extremely
expensive.
* lots of open files; a program called "lsof" can tell
you how many file descriptors are being used by
anything on your system.
* any opendir()/readdir() calls for walking through
directories to find a file; any given directory is
stored in a hash table, but the contents of the
directory have to be scanned linearly, so files in
large directories (~1000 or more files) will take much
longer to open or close.
* size of your inode caches, which keep track of
previously-accessed files.  Run the DNLC script below
as root to see your hit-rate percentage.  If it's
under 90-95%, you need to up the cache size. The
easiest way to do that is change maxusers in
/etc/system to a nice high number like 2048.
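
To get a quick picture of which calls dominate, a saved
truss log can be tallied per system call. The sketch below
assumes the log has one call per line in the form
"name(args) = result" (truss run without -f); the function
name count_syscalls and the log path are made up for
illustration. "truss -c" prints a similar summary by
itself.

```shell
# Summarize a saved truss log read from stdin: count how often each
# system call appears, busiest first.  Lines without "(" (signals
# and the like) are ignored.
count_syscalls() {
    awk '{
             i = index($0, "(")
             if (i > 1) n[substr($0, 1, i - 1)]++
         }
         END { for (s in n) print n[s], s }' |
        sort -rn
}
```

Typical use would be "truss -o /tmp/app.truss -p PID" for
a while, then "count_syscalls < /tmp/app.truss".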

Run "mount" to see how your filesystems are set up. 
I'm pretty sure you can mount filesystems with
"noatime" turned on, meaning don't bother updating the
access time whenever a file is opened. We use this
under Solaris-8, and it makes a *huge* difference if
you're doing something to a large number of small or
medium sized files.
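
A rough way to spot filesystems still updating access
times is to filter the "mount" output. This sketch assumes
the Solaris layout "<mount point> on <device>
<opt/opt/...> on <date>"; lacks_noatime is a made-up name,
and NFS mounts will show up in the list too.

```shell
# Read "mount" output on stdin and print the mount points whose
# slash-separated option list does not include noatime.
lacks_noatime() {
    awk '$2 == "on" && ("/" $4 "/") !~ /\/noatime\// { print $1 }'
}
```

Run it as "mount | lacks_noatime". To make the option
permanent, it would normally go in the mount-options field
of /etc/vfstab.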

---------------------------------------------------------------------------
#!/bin/sh
#
# NAME:
#    dnlc
#
# SYNOPSIS:
#    dnlc
#
# DESCRIPTION:
#    "dnlc" reports on Directory name lookup cache statistics from
#    the kernel.  This corrects a bug in vmstat.
#
#    To change the kernel values, add something like this
#    to /etc/system and reboot.  Both "nnn" numbers should be the
#    same.
#
#         set ncsize = nnn
#         set ufs_ninode = nnn
#
# AUTHOR:
#    Kimberley Brown - UKAC Kernel Support
#    comp.unix.solaris

PATH=/bin:/usr/bin:/usr/local/bin
export PATH

adb -k /dev/ksyms /dev/mem <<END
="**  Directory/Inode Cache Statistics  **"
="----------------------------------------"
ufs_ninode/D"Inode cache size"
ncsize/D"Directory name cache size"
ncstats/D"# of cache hits that we used"
+/D"# of misses"
+/D"# of enters done"
+/D"# of enters tried when already cached"
+/D"# of long names tried to enter"
+/D"# of long names tried to look up"
+/D"# of times LRU list was empty"
+/D"# of purges of cache"
*ncstats%1000>a
*(ncstats+4)%1000>b
*(ncstats+14)%1000>c
<a+<b+<c>n
<a*0t100%<n=D"Hit rate percentage"
="(See /usr/include/sys/dnlc.h for more information)"
END

exit 0
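
For anyone checking the numbers by hand, the hit-rate lines
at the end of the adb script appear to work out as below.
This is a sketch under two assumptions: the counters are
4-byte values, and the offsets +4 and +14 (hex) pick out
the "misses" and "long names looked up" counters. The
function name dnlc_hit_rate is made up.

```shell
# Same arithmetic as the last lines of the adb script: scale each
# counter down by 1000, then hits * 100 / (hits + misses + long lookups).
# Fails with an expr error if all three scaled counters are zero.
dnlc_hit_rate() {
    a=`expr "$1" / 1000`      # cache hits
    b=`expr "$2" / 1000`      # misses
    c=`expr "$3" / 1000`      # long names looked up
    expr "$a" \* 100 / \( "$a" + "$b" + "$c" \)
}

dnlc_hit_rate 9000000 500000 500000    # prints 90
```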
Received on Tue May 28 11:02:06 2002
