Several weeks after I first asked about the strange RPC error
messages on our 3/280 file server running 4.0.3, I have finally
found out what was happening. Thanks to all who responded to my
question.
The problem is due to a bug in SunOS 4.1's rpc.lockd. My
workstation is plugged into the same subnet as the file server.
All the workstations on that subnet run 4.0.3 except mine, which
runs 4.1, and the messages came from the rpc.lockd running on my
workstation.
When rpc.lockd starts, it seems to be okay; "ps aux" says:
USER       PID %CPU %MEM   SZ  RSS TT STAT START   TIME COMMAND
root       109  0.0  0.0   84    0 ?  IW   Oct  2  0:00 rpc.lockd
However, after running for an extended and unpredictable length of
time, it may grow to SZ=16000 and RSS=6000, or some other arbitrarily
large size. I talked to a Sun engineer and learned that this is a
known, documented bug, with patch # 100037-01 for 4.0, 4.0.1 and 4.0.3
and patch # 100075-01 for 4.1.
I haven't had time to install the patch yet, since the problem doesn't
happen all the time. When it does, I just kill rpc.statd and rpc.lockd
and restart them, and then things are fine for a while.
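Until the patch is installed, the bloated state could be spotted by
checking the SZ column of "ps aux" for rpc.lockd. What follows is only
a minimal sketch in Python, not a tested tool. The function name
bloated_lockd, the limit of 8000, and the second sample line are
assumptions made up for illustration; only the first sample line comes
from the output quoted earlier.

```python
# Hypothetical limit: the post only reports growth to about SZ=16000,
# so any cutoff between the healthy and bloated sizes is a guess.
SZ_LIMIT = 8000


def bloated_lockd(ps_output, sz_limit=SZ_LIMIT):
    """Return (pid, sz, rss) for each rpc.lockd line whose SZ exceeds sz_limit."""
    lines = ps_output.strip().splitlines()
    if not lines:
        return []
    # Locate the columns by name from the header line. PID, SZ and RSS all
    # come before START, so the space inside "Oct 2" does not shift them.
    header = lines[0].split()
    pid_col = header.index("PID")
    sz_col = header.index("SZ")
    rss_col = header.index("RSS")
    hits = []
    for line in lines[1:]:
        fields = line.split()
        if not any(f == "rpc.lockd" or f.endswith("/rpc.lockd") for f in fields):
            continue
        try:
            pid = int(fields[pid_col])
            sz = int(fields[sz_col])
            rss = int(fields[rss_col])
        except (IndexError, ValueError):
            # Skip lines whose columns cannot be read as numbers.
            continue
        if sz > sz_limit:
            hits.append((pid, sz, rss))
    return hits


# First sample: the healthy line quoted in this post.
HEALTHY = """USER PID %CPU %MEM SZ RSS TT STAT START TIME COMMAND
root 109 0.0 0.0 84 0 ? IW Oct 2 0:00 rpc.lockd"""

# Second sample: an invented line using the SZ and RSS values mentioned above.
BLOATED = """USER PID %CPU %MEM SZ RSS TT STAT START TIME COMMAND
root 109 0.0 0.0 16000 6000 ? S Oct 2 0:00 rpc.lockd"""

print(bloated_lockd(HEALTHY))
print(bloated_lockd(BLOATED))
```

A non-empty result would be the cue to kill and restart rpc.statd and
rpc.lockd by hand as described above; the sketch deliberately does not
kill or restart anything itself.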
Thanks again for all the responses.
================================ Address ===============================
Tim Chan, System Engineer, Teknekron Communications Systems
(415)-649-3645 2121 Allston Way, Berkeley, CA 94704
uucp : ucbcad!tcs!tim or uunet!tcs!tim