SUMMARY: Patch 101318-70

From: rnf@spitfire.tbird.com
Date: Sat Jun 17 1995 - 14:31:39 CDT


Hi Sun Managers,

My original problem was that after installing patch 101318-70 on my Solaris 2.3
Sparc 2, my console filled with messages like the following:

> Jun 6 16:05:50 spitfire automountd[117]: dev 65000000 not in mnttab
> Jun 6 16:05:50 spitfire automountd[117]: dev 66730a00 not in mnttab

Thanks to the following for their help:

Peter.Bestel@uniq.com.au (Peter Bestel)
Lyle Miller <lmiller@aspensys.com>
Scott.Kamin@Central.Sun.COM (Scott Kamin [Sun Denver SE])
Laura Taylor <ltaylor@hootowl.bbn.com>
jjr@edi-nola.com (Jack Reiner)

Most of the replies suggested problems with the automounter. It was recommended
that I:
1. Restart the automounter.
2. Kill the automounter if not needed.
3. Check for spaces after the names in the automounter files (this evidently
   causes the observed messages due to a bug in the automounter).
4. Make sure diskless clients and server systems were using the same patch
   revision.
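For suggestion 3, a quick grep over the map files will show any entries with
trailing whitespace. This is only a sketch: the map file names below are the
common defaults, and the [[:space:]] class assumes a reasonably modern grep,
so adjust both for your site.

```shell
#!/bin/sh
# check_maps FILE... - print, with line numbers, any automounter map entries
# that end in a space or tab (trailing blanks reportedly trigger the bug).
check_maps() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        # /dev/null forces grep to prefix matches with the file name.
        grep -n '[[:space:]]$' "$f" /dev/null
    done
}

# Typical invocation; these map names are only the common defaults.
check_maps /etc/auto_master /etc/auto_home
```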

I originally ftp'd the patch from sunsite.unc.edu. Just to make sure that
I had the most recent revision, I ftp'd it again from sunsolve. The files
I got from sunsolve differed from those at sunsite.unc.edu. I don't know
whether it was just a transfer problem, but the README files from sunsolve
were more complete.
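To confirm which revision of the patch actually got installed (and that the
server and clients match, per suggestion 4), you can filter the output of
showrev. This is only a sketch: the "Patch: 101318-70 ..." line format assumed
here is the usual Solaris one, so check it against your own showrev output.

```shell
#!/bin/sh
# patch_rev BASE - read `showrev -p` output on stdin and print the installed
# revision(s) of the given patch base number (e.g. 101318 -> 101318-70).
patch_rev() {
    grep "Patch: $1-" | awk '{print $2}'
}

# Run the same pipeline on the server and on each diskless client.
showrev -p | patch_rev 101318
```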

I reinstalled the patch on a Sparc 10 and a Sparc 2 and have had no problems
with either system since.

jjr@edi-nola.com (Jack Reiner) sent the following message. It did not affect me,
but you may find it useful.

         WARNING: BUG in Solaris 2.3 kernel patch 101318-70

A few weeks ago I asked sun-managers for opinions of Solaris 2.3 kernel patch
101318-70 and received generally positive answers. I did get the feeling,
though, that no one had been running it for very long.

One week after I installed this patch on my server and diskless clients, it
caused a major problem. Due to flooding here in New Orleans, we lost power a
couple of times within a few days, and our IPCs would not boot.

It took Sun four people and 15 hours to find this bug (at 1:30am - it was a
long night :-(

Running this patch, an IPC diskless client that is stopped "ungracefully,"
such as being powered off, will hang on boot. This happens because certain
system files are not properly deleted or ignored during boot.

This is an excerpt from email I received from SUN:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
The diskless client is executing /sbin/swapadd (called from
/etc/rcS.d/S40standardmounts.sh), which tries to mount the swap partition from
the server. While doing this, it attempts to talk to the local lockd to lock
/var/yp/binding/cache_binding. This file, together with the xprt* files found
in /var/yp/binding, was introduced by 101318-64 to improve ypbind performance.
However, no lockd is running until the system reaches run level 2, so the
process keeps getting the RPC_TIMEDOUT error and retries forever. I think it
should be O.K. to remove these files when the system is restarted; removing
them in S40standardmounts.sh is one solution.

I've tested this workaround with success:

original /etc/rcS.d/S40standardmounts.sh

        #ident "@(#)standardmounts 1.3 92/07/22 SMI"

        #
        # Add physical swap.
        #
        /sbin/swapadd -1

here is the modified /etc/rcS.d/S40standardmounts.sh

        #ident "@(#)standardmounts 1.3 92/07/22 SMI"

        #cleanup

        rm /var/yp/binding/xprt.tcp.?
        rm /var/yp/binding/xprt.ticlts.?
        rm /var/yp/binding/xprt.ticots.?
        rm /var/yp/binding/xprt.ticotsord.?
        rm /var/yp/binding/xprt.udp.?
        rm /var/yp/binding/ypbind.pid
        rm /var/yp/binding/`domainname`/cache_binding

        #
        # Add physical swap.
        #
        /sbin/swapadd -1
        #

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Make sure that you modify the CLIENT'S S40* file (most likely on the
boot server in /export/root/clientname/etc/rcS.d/S40standardmounts.sh ).
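If you have several diskless clients, the same edit has to be made in each
client's root. A small loop over the export tree saves doing it by hand; the
/export/root/<client> layout used here is the conventional one and may differ
on your boot server.

```shell
#!/bin/sh
# list_client_scripts DIR - for each diskless client root under DIR (the
# usual layout is /export/root/<client>), print the path of its S40 script
# so you can apply the same rm-lines edit to every client.
list_client_scripts() {
    for root in "$1"/*; do
        f="$root/etc/rcS.d/S40standardmounts.sh"
        [ -f "$f" ] && echo "$f"
    done
}

list_client_scripts /export/root
```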

Clients that are shut down gracefully apparently do not have this problem.

This bug is not documented in the README.101318-70 file.

I doubt that this bug affects workstations booting from their own local o/s,
because my 1+ has been powered off and on and rebooted just fine. Of course,
if there are enough sun spots . . . :-)

Hope this heads-up saves some other people from an all-nighter.

Regards,
Jack Reiner
jjr@edi-nola.com



This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:10:27 CDT