SUMMARY: ps is telling wrong memory-sizes

From: bernhard_fank@ukl.uni-heidelberg.de
Date: Wed Apr 01 1998 - 01:58:53 CST


When I looked at the ps results I was not careful enough.
My original message (quoted with >), as interpreted by John L. Wynstra
(e-mail: john@its.brooklyn.cuny.edu):
     
> Hi sun-managers,
> a process is allocating all our swap-memory on a SPARC-Ultra 2
> running Solaris 2.5.1.
>
> 1. When I look at such a process (not exactly the problematic
> one) with
> /usr/bin/ps -Al -o vsz=20 -o user=5 -o comm=10 -o pid=5 | grep 17990
> I'm told it's using:
> SIZE User Process PID
> 1147552 r3madm dw.sapR3 17990
  ^^^^^^^
> 2. When I look at it with
> /usr/ucb/ps -auxv | grep 17990
> I'm told it's using:
> PID TT S TIME SIZE RSS %CPU %MEM COMMAND
> 17990 ? S 0:12114755233280 0.0 6.5 dw.sapR3M_DVEBMGS00
                ^^^^^^^
     
> /* difficult to read! */
     
Peter Polasek and Casper Dik wrote the same; see below.

Peter Polasek also wrote: Both results strongly suggest that the
dw.sapR3M_DVEBMGS00 is a memory hog.
You are right, Peter. But these processes (there are 14 of them) are using a
large amount of shared memory (640 MB). I have to look at them more closely.
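
The shared memory matters when adding up sizes, because vsz covers the whole
address space of a process, including any shared segment it has attached. The
sketch below is only a rough illustration of that effect. It assumes, purely
for the sake of the arithmetic, that all 14 processes report the same vsz as
PID 17990 and that each one maps the full 640 MB segment; neither assumption
was verified in this thread, and the function names are made up.

```python
# Figures taken from the thread; treating them as identical for every
# process is an ASSUMPTION made only to illustrate the arithmetic.
VSZ_KB = 1147552          # vsz reported by ps for one dw.sapR3 process
SHARED_KB = 640 * 1024    # the 640 MB of shared memory mentioned above
N_PROCS = 14              # number of such processes


def naive_total_kb(vsz_kb, n_procs):
    """Sum vsz over all processes, counting the shared segment every time."""
    return vsz_kb * n_procs


def shared_once_total_kb(vsz_kb, shared_kb, n_procs):
    """Count each process's private part, plus the shared segment once."""
    private_kb = vsz_kb - shared_kb
    return private_kb * n_procs + shared_kb


naive = naive_total_kb(VSZ_KB, N_PROCS)
adjusted = shared_once_total_kb(VSZ_KB, SHARED_KB, N_PROCS)
print(f"naive sum of vsz:      {naive / 1024:.0f} MB")
print(f"shared counted once:   {adjusted / 1024:.0f} MB")
```

Under these assumptions the naive sum is roughly twice the adjusted figure,
which is why per-process sizes alone can overstate the real demand.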

Birger Wathne from beautiful Norway told me where to find help:
go to http://www.sun.com/sunworldonline/ and find the performance
column by Adrian Cockcroft.

Thank you
Bernhard Fank
University Hospital Heidelberg
-----------------------------------------------------------------
The results from both commands are consistent; however, it is extremely
difficult to parse the ps results when memory sizes are larger than 99 MB,
because adjacent columns then run together.
The '0:12114755233280' result should be interpreted as '0:12' CPU time,
'1147552' KB total memory size, and '33280' KB resident physical memory.
The total memory size of 1147552 KB (roughly 1.1 GB) exactly matches that
returned from the formatted ps command (I'm not sure how you derived the
8965 MB figure).
Both results strongly suggest that the dw.sapR3M_DVEBMGS00 is a memory hog.
Because it is difficult, even for a leaky program, to accumulate this much
memory in only 12 seconds of CPU time, it is likely that the program either
has declared a huge global memory array or has very large variable
declarations in 'main()' (the stack size limitations should prevent this
from happening in a subroutine). The other possibility is that there is
something like a 'while(1)' malloc loop. I would truss the process to
distinguish these two possibilities (pipe the results into a file and grep
for malloc).

Peter Polasek
phone: (201)617-2626
FAX: (201)330-9772
e-mail: pete@brass.com

Automated Securities Clearance, Inc.
800 Harbor Boulevard
Weehawken, NJ 07087
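
Peter's reading of the fused '0:12114755233280' column can be reproduced
mechanically. The following is a minimal sketch, not a general ps parser. It
assumes that TIME is printed as minutes, a colon, and exactly two digits of
seconds, and it uses the SIZE value already known from the /usr/bin/ps output
to decide where SIZE ends and RSS begins. The function name is hypothetical.

```python
def split_fused_ps_token(token, known_size_kb):
    """Split a fused TIME+SIZE+RSS token into its three fields.

    ASSUMPTIONS: TIME has the form M:SS (two-digit seconds), and the SIZE
    value is already known from another source such as `ps -o vsz`.
    """
    colon = token.index(":")
    time_part = token[:colon + 3]        # minutes, ':', two digits of seconds
    rest = token[colon + 3:]             # SIZE and RSS run together
    size_text = str(known_size_kb)
    if not rest.startswith(size_text):
        raise ValueError("token does not contain the expected SIZE value")
    rss_text = rest[len(size_text):]
    if not rss_text.isdigit():
        raise ValueError("nothing usable left over for RSS")
    return time_part, int(size_text), int(rss_text)


cpu_time, size_kb, rss_kb = split_fused_ps_token("0:12114755233280", 1147552)
print(cpu_time, size_kb, rss_kb)
```

Without a known SIZE the split is ambiguous, which is the reason the two ps
outputs have to be read side by side.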
             
-----------------------------------------------------------------
1. No, the "vsz" parameter is in kilobytes, not pages. So it's around 1GB
(still a sizable amount)

osz is the size in pages.

2. Also in kilobytes; 1147552/33280

Casper Dik <casper@holland.Sun.COM>
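
Casper's point about units can be checked with a little arithmetic. The
sketch below assumes an 8 KB page size, which is the usual value on UltraSPARC
systems but was not stated in the thread. Under that assumption, misreading
the vsz value as a page count gives a figure matching the 8965 MB mentioned
earlier, which may be where that number came from.

```python
VSZ_KB = 1147552   # SIZE / vsz from the ps output, in kilobytes
RSS_KB = 33280     # RSS from the ps output, in kilobytes
PAGE_KB = 8        # ASSUMPTION: 8 KB pages (typical for UltraSPARC)

# Correct reading: vsz is already in kilobytes.
vsz_mb = VSZ_KB / 1024
rss_mb = RSS_KB / 1024

# Mistaken reading: treating the same number as a count of 8 KB pages.
misread_mb = VSZ_KB * PAGE_KB / 1024

# What osz (size in pages) would be for this vsz, under the same assumption.
osz_pages = VSZ_KB // PAGE_KB

print(f"vsz: {vsz_mb:.1f} MB, rss: {rss_mb:.1f} MB")
print(f"vsz misread as pages: {misread_mb:.0f} MB")
print(f"osz equivalent: {osz_pages} pages")
```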
------------------------------------------------------------------
Go to http://www.sun.com/sunworldonline/ and find the performance
column by Adrian Cockcroft. Download the tools provided in the
latest article. Then read the back issues of the column to get an
in-depth understanding.

Birger@sdata.no (Birger Wathne)


