[Summary] Diagnosis fun...

From: <Jason.Shatzkamer_at_cexp.com>
Date: Wed Feb 25 2004 - 07:26:57 EST
All,

OK, some fantastic responses.....Darren, thank you, as always, for your
advice....both here, and on vrts, I always enjoy your posts....

BUT....

The guy that hit the nail RIGHT ON THE HEAD this time was Frank
Smith....Frank, well done, I swear you defined my problematic system to a T
(see Frank's response below), and all based on some preliminary sar
output....VERY well done....

I suppose it's still too early to tell what kind of environment I'll see
when 500 users are pounding away at the system, so perhaps it's too soon for
a summary....that recognized, I will detail the steps I took throughout the
evening, the end result of which brought a system from 3% user, 60% system,
1% iowt, and 33% idle, to 67% user, 20% system, 18% iowt, and 2%
idle....these numbers are based on running multiple user-like loads via
scripts...
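
Those utilization numbers are the standard sar CPU columns....anyone who
wants to watch the same thing can run something like this while the load
is on (interval and count are arbitrary):

	# sar -u 5 60		(reports %usr, %sys, %wio, and %idle)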

I am aware that:

1. 2% idle time sucks, and more than likely the system is CPU bound, and can
benefit from some more hardware...
2. No script can emulate a true production user load

So nobody yell at me that it's too soon to know for sure how the system will
behave during peak hours.....

However:

1. I watched a historical peak of 4% user time rise to a never-before-seen
80% user time....
2. I watched the usr/kernel ratio completely pull a 180, which is truly what
I was after at this point
3. I watched as top went from a historical "2 or 3 on cpu" statistic, to "10
on cpu", on an 8 processor box
4. I watched mutex lock contention dive (commands for checking this are
just after this list)
5. I watched involuntary context switches dive
6. Cross calls are still high, but at least there is actual data processing
going on behind them
7. I watched reports that took 10-15 minutes to run, run in 1-2 minutes,
with hardly any impact on the kernel
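
For items 4 and 5, the numbers come straight from mpstat's smtx and icsw
columns, and lockstat will name the actual kernel locks if you want the
gory details....both need to run while the load is on (the "sleep 10" is
just lockstat's sampling window, and it needs root):

	# mpstat 5		(smtx = mutex spins, icsw = involuntary
				 context switches, per CPU)
	# lockstat sleep 10	(kernel lock contention stats, per lock)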

My Approach:

1. High smtx in mpstat, combined with high cross calls, combined with high
system calls, combined with 3% user / 60% system numbers, told me quite
clearly that (the mpstat invocation itself is sketched after this list):
	A. Solaris was spending most of its time kernel thrashing
		a. Lots of cache transfers
	B. User processes were not getting enough CPU time to actually
process, by the time their giant process cache was pulled from other CPUs
		a. Very low user time
	C. Very much a problem accessing some shared resource inherent to
the app
		a. High smtx in mpstat
	D. App inherently has a HUGE dependence on system calls, issuing
forks, execs, lseeks, fopens, etc., at an alarming rate
		a. High number of system calls, high system time
2. Most of the symptoms seemed to have to do with CPU load (or lack
thereof), and cache thrashing
	A. Memory was not being touched
3. How can I reduce mutex lock contention, and at the same time drastically
increase the time the system spends handling user requests? So:
	A. Let's see if we can't make these processes stay on the cpu a
little bit longer
	B. Let's see if we can't shrink some of the caches where the mutex
lock contention is coming from
	C. Let's see if we can't take some advantage of the fact that the
app depends so heavily on the kernel (system calls), rather than try to
change its nature
		a. If each app request becomes a system call, then in
effect, each user request MUST be chaperoned by a kernel thread
		b. As such, maybe I should concentrate on streamlining my
kernel, and let the app ride the coattails of the more efficient kernel
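
The mpstat run behind all of the diagnosis in item 1 is nothing
exotic....these are the columns I was staring at, per CPU:

	# mpstat 5

	xcal = cross calls, syscl = system calls, smtx = mutex spins,
	icsw = involuntary context switches, usr/sys/wt/idl = CPU split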

That said, here is what I did, after hours of research and benchmark / load
testing:

1. I removed any /etc/system entries that expanded the size of IPC
parameters, specifically semaphores and message queues, and let them default
to whatever Solaris wants by nature
2. I replaced the default Solaris 8 TimeSharing dispatch table (dispadmin)
with a StarFire TimeSharing dispatch table
3. I changed the setting of LD_LIBRARY_PATH so that the alternate Solaris
thread libraries (/usr/lib/lwp) were used during dynamic linking of the
application code, rather than the Solaris default of /usr/lib or
/usr/ccs/lib (sketches of all three changes follow below)

And that's it!
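
At the command level, the three changes look roughly like this....the IPC
parameter names are just examples of the KIND of entries I pulled (yours
will differ), and the path to the Starfire table is wherever you have a
copy stashed:

1. In /etc/system, remove (or comment out with a leading "*") the IPC
expansions, then reboot:

	* set semsys:seminfo_semmni=1024
	* set semsys:seminfo_semmns=4096
	* set msgsys:msginfo_msgmni=1024

2. Save the stock TS dispatch table, then load the alternate one (its
longer time quanta are what keep processes on the CPU longer):

	# dispadmin -c TS -g > /var/tmp/TS.stock
	# dispadmin -c TS -s /var/tmp/TS.starfire

3. Point the app at the alternate thread library before it starts
(Bourne shell syntax):

	LD_LIBRARY_PATH=/usr/lib/lwp
	export LD_LIBRARY_PATH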

Numbers changed like night and day...NEVER has anyone seen this system run
THIS application at the speed it ran tonight....hopefully, I will see the
same results with the typical user load....I will know for sure by lunch :-)

If there is no further summary, or HELP REQUEST ;-), then you can be sure
that today went well....

I've seen this problem (kernel intensive, single threaded app) mentioned on
the list many times....Frank Smith has obviously endured it once or
twice....maybe these techniques will prove useful to others....

Thanks to All, Again,....J.~

****************************************************************************

I doubt you'll find anything wrong with the hardware.  It looks like the
machine has plenty of idle time available, it isn't swapping, and the disk
latency is usually pretty low (except for one time in the afternoon).
   That would correspond with the snappy command line response you are
seeing.  The system time you are seeing is probably mostly VxFS, NFS, and
network connection service time.
    Your protection faults (pflt/s) and validity faults (vflt/s) seem
somewhat high.  I'm not that familiar with jBASE, but my guess would be
there is serious contention for some shared resource internal to the app
that is causing all the processes to spend their time sleeping.  You seem to
have plenty of spare disk, memory, and CPU if only the process were able to
use it.
   Does this app work speedily during off hours and then crater as the
number of simultaneous users climbs past a certain point?  I suspect that it
does, but it is difficult to track down the bottleneck.  If the app has any
profiling support built in (or could be compiled in) that would narrow down
the problem, but it still may not be fixable.  A single-threaded app can
only do so much before it bogs itself down spending all its time managing
context switches and hardly any time doing actual work.
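   Even without built-in profiling, truss can give you a crude system
call profile of a running process (use a PID that prstat shows as busy;
interrupt truss after a minute or so and it prints a summary table):

	# truss -c -p 12345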
    Adding hardware may not help, other than faster CPUs.  More CPUs or RAM
won't help as you already have idle time and no swapping.  If you are lucky
you will find some unnecessary use of locks in the code that you can
remove.  Perhaps you can move parts of the app to a different machine.
Your disk I/O doesn't seem to be a real problem, but mounting your
filesystems with the noatime option can speed that up considerably
(assuming you don't need atime for something, like scripts that remove
unused files). Also, I noticed on the jBASE web site that if you are
using j1 through j4 jBASE files:

This method, however, needs a significant amount of administration and
maintenance in a very volatile environment, where data is being added,
removed or changed significantly over short periods of time.

I have no idea what that means exactly, but I assume it means there are
things you need to do on a regular basis to tune and optimize those files.
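   Back on the noatime option: for VxFS it is just a mount option, and a
remount should turn it on for a live filesystem (the device and mount
point here are placeholders for your own):

	# mount -F vxfs -o remount,noatime /dev/vx/dsk/datadg/vol01 /data

To make it permanent, add noatime to the mount options field in
/etc/vfstab.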

Good luck,
Frank Smith


****************************************************************************

> Looking for:
> 
> 1. Anyone see anything glaring as far as isolating problematic Solaris 
> subsystem (i.e. memory, cpu, etc.)

Not really.  The actual number of forks doesn't look *too* high, but I wonder
if the processes that are forking are really big.  Normally the memory
required should be copy on write, so that the fork itself shouldn't be too
heavyweight, but the high system time points me to look in that direction.
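
If you want to quantify it, 'sar -c' breaks out the fork and exec rates
directly (interval and count arbitrary):

	# sar -c 5 12		(watch fork/s and exec/s, plus scall/s
				 for total system calls)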

> 2. Can something like this be caused by a bad RAM chip? Bad CPU? (No 
> errors in system logs)

I doubt it.

> 3. Some advice in narrowing down definitive cause, troubleshooting 
> checklists, tools, general approach to finding the needle in the haystack?
> 
> As always, thanks to all....any and all additional information is 
> available upon request....

Ugh..  yeah...  Nothing off the top of my head.  I don't know if you can see
mutex locks on the sar output.  Maybe a quick 'mpstat 5' or something to see
if those look out of place.

Darren Dunham

****************************************************************************
> 1. Database app written in PICK basic, jBASE to be specific 
> (obviously, this is a major CAUSE of the problem, just got to figure out
> where, exactly)
> 	Not multithreaded, LOTS of forks, assuming application code is the

Yuck!

> culprit, but in order to rewrite app, I need to identify which Solaris
> subsystem is hurting
> 2. ~500 users connecting via telnet

I would run prstat for a bit, to see which processes are eating your CPU
time.  Probably your app, as fork() is an expensive operation...
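
Something like this, sorted by CPU usage, top 10 processes, 5 second
samples:

	# prstat -s cpu -n 10 5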

> 1. Anyone see anything glaring as far as isolating problematic Solaris 
> subsystem (i.e. memory, cpu, etc.)

Bit hard to tell (need vmstat and/or mpstat output), but it looks like you
have a CPU starvation problem.  Have you set priority_paging?
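
For reference, that's a single line in /etc/system plus a reboot (a
Solaris 2.6/7 tunable; Solaris 8's new page cache made it largely
unnecessary):

	set priority_paging=1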

> 2. Can something like this be caused by a bad RAM chip? Bad CPU? (No 
> errors in system logs)

Nope.

> 3. Some advice in narrowing down definitive cause, troubleshooting 
> checklists, tools, general approach to finding the needle in the haystack?

I'm guessing that PICK Basic is horribly inefficient...

--
Rich Teer, SCNA, SCSA

****************************************************************************

Jason Shatzkamer, MCSE, SSA
Corporate Express Imaging
1096 E Newport Center Drive
Suite # 300
Deerfield Beach, FL 33442
(800) 828-9949 x5415
(954) 379-5415
Jason.Shatzkamer@cexp.com
http://imaging.cexp.com
