SUMMARY: "lockfs -u" on Sol2.3 w/ PC/TCP,PC-NFS (STILL UNRESOLVED)

From: Rex Walters (rex@aisg.com)
Date: Tue Apr 12 1994 - 08:29:43 CDT


My problem was that removing a write lock from a local filesystem on a
SunOS-5.3 system occasionally gives "permission denied" errors, even to
root. The write lock was created as part of a dump sequence.

Since my posting I've verified the following:

    1) Of 17 filesystems being backed up on the server using this
        method, only one causes the problem to occur.

    2) The problem does not occur *every* time the troublesome
        filesystem is backed up with the mirroring method. (It usually
        causes the problem to occur, but not always).

    3) The only filesystem accessed from the server by PCs running
        PC/TCP and PC-NFS is the one causing the problems. Most of the
        PCs are running PC/TCP, but at least one uses PC-NFS.

    4) The only sure-fire way I've discovered to remove the write lock
        is to reboot the server. There *MUST* be another way!

Since I've been unable to resolve this problem, and since this is one
of the smallest filesystems (700 MB), I've been doing "normal" dumps of
the live filesystem (the whole mirror, not just one of the submirrors)
for just this one filesystem. This hasn't caused any significant
performance or reliability problems (yet!), but this exception has made
my backup scripts needlessly complex.

I'm convinced that there is some bug with the PC client NFS software
(or our configuration/installation/usage of the PC software -- I don't
exactly pride myself on my PC skills :-) that is triggering the problem
in the first place. This is the problem that ultimately needs to be
resolved.

*BUT* until I can find a less disruptive way of removing the write lock
than rebooting the server, I will have a very difficult time
testing/debugging any solutions to the primary problem (it's a 24x7
production shop -- they're threatening a Singapore-style caning if I reboot
again :-).

Does anyone have any suggestions for either of these problems?

I only received one reply to my original posting: "why go to all that
trouble -- why not just dump the live filesystem?".

Like most people I've dumped live filesystems in multiuser mode for
several years without problems, but our current environment can't
afford the performance hit while dumps occur. Offlining submirrors on
their own controllers allows us to do dumps with virtually no impact on
performance. It is only a side benefit that this method prevents the
well-known possibility of corrupting a dump tape when it is created
from a live file system.

Best Regards,

--
Rex Walters

Original posting:

>Date: Mon, 28 Mar 94 16:34:05 EST
>To: sun-managers@eecs.nwu.edu
>
>From: "Rex Walters" <rex@aisg.com>
>Subject: "lockfs -u" on Solaris 2.3 gives "permission denied" to root
>
>I am using disk mirroring on our NFS server to allow us to perform
>backups during the day on a "live" filesystem. Although this is usually
>successful, twice now I've been unable to remove a write lock on a file
>system that was temporarily locked while the submirror was detached.
>Several client programs become inoperable because of the write lock.
>
>I'm running Solaris 2.3 + Online:Disksuite 2.0.1 on a SparcServer-1000.
>I've installed all of the Sun recommended patches:
>
>    101316-01 Synopsis: SunOS 5.3: Socket library is not signal safe
>    101317-06 Synopsis: SunOS 5.3: lp jumbo patch
>    101318-36 Synopsis: SunOS 5.3: Jumbo patch for kernel
>    101327-04 Synopsis: SunOS 5.3: security, tape, and group patches for tar
>    101329-09 Synopsis: SunOS 5.3: Jumbo NIS+ patch, automountd security,
>                        autofs and loopback mounts
>    101331-03 Synopsis: SunOS 5.3: fixes for package utilities
>    101344-05 Synopsis: SunOS 5.3: Jumbo NFS patch
>    101345-02 Synopsis: SunOS 5.3: cpio can't be used with multi-volume tapes
>    101347-01 Synopsis: SunOS 5.3: system hangs due to mblk memory leak
>    101362-08 Synopsis: OpenWindows 3.3: Server (Xsun) Jumbo Patch
>
>Every filesystem on the server is a two way mirror, and we are
>performing backups by temporarily off-lining one side of each mirror
>while it is dumped to tape.
>
>I'm following the procedure outlined in the OLDS manual: write lock the
>file system, run metaoffline, unlock the file system, dump the data from
>the submirror, run metaonline to resync the submirror.
>
>This has worked great for the most part, but twice now I haven't been
>able to remove the write lock on one of the file systems after off-lining
>the submirror. I get a "permission denied" error -- even though logged
>in as root. Both the "lockfs -w" and "lockfs -u" commands are executed
>by the same shell script (run by root).
>
>The excerpt from the transcript is as follows:
>
>    # /usr/sbin/lockfs -w /export/p05
>    # /usr/opt/SUNWmd/sbin/metaoffline /dev/md/dsk/d120 /dev/md/dsk/d122
>    Device /dev/md/dsk/d122 is offline
>    # /usr/sbin/lockfs -u /export/p05
>    /export/p05: Permission denied
>
>Nothing I've tried short of rebooting the server will allow me to remove
>the write lock.
>
>This has happened twice now (in as many days). The first time it
>occurred, I thought it was because I had aborted a dump script by
>sending it SIGINT and SIGQUIT (^C and ^\ -- I panicked when I realized
>I'd specified a wrong metadevice). Today, though, the scripts were
>allowed to run to completion. Both times it was the same file system
>(/export/p05).
>
>My only suspicion is that it has something to do with some PC clients
>running PC-TCP or PC-NFS that are bypassing the write lock somehow.
>/export/p05 is the only partition that is being accessed by PC clients.
>Regardless, I view the "permission denied" error as a problem on the
>server side.
>
>*ANY* suggestions for how to prevent the problem from occurring, or for
>how to remove the write lock without rebooting will be greatly
>appreciated. We are in a 24 hour production environment, and server
>reboots are quite painful.
>
>This list has been a terrific resource in the past for me; I'm hoping
>that my first specific request will find me a quick answer!
>
>Thanks for any assistance,
>--
>Rex Walters
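
For reference, the five-step procedure from the original posting (write
lock, metaoffline, unlock, dump, metaonline) could be sketched roughly as
the shell function below. This is an illustration only, not the script
actually in use. The function names (run, backup_submirror), the DRY_RUN
switch, the tape device argument, and the use of ufsdump with "0uf" as
the dump command are all assumptions; the lockfs and metaoffline paths
are taken from the transcript. The sketch defaults to dry-run mode, so it
only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only. Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them for real (assumption: run as root).
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical function: backup_submirror FS MIRROR SUBMIRROR TAPE
backup_submirror() {
    fs=$1
    mirror=$2
    submirror=$3
    tape=$4

    # 1. Write lock the file system so the submirror is consistent.
    run /usr/sbin/lockfs -w "$fs" || return 1

    # 2. Take one side of the mirror offline; undo the lock on failure.
    if ! run /usr/opt/SUNWmd/sbin/metaoffline "$mirror" "$submirror"; then
        run /usr/sbin/lockfs -u "$fs"
        return 1
    fi

    # 3. Unlock the file system. This is the step that fails with
    #    "Permission denied" in the posting; report it and stop so the
    #    operator can inspect the state.
    if ! run /usr/sbin/lockfs -u "$fs"; then
        echo "WARNING: could not remove write lock on $fs" >&2
        return 2
    fi

    # 4. Dump the offlined submirror (ufsdump and its options are assumed).
    run /usr/sbin/ufsdump 0uf "$tape" "$submirror"
    status=$?

    # 5. Bring the submirror back online so it resyncs, even if the dump failed.
    run /usr/opt/SUNWmd/sbin/metaonline "$mirror" "$submirror"
    return $status
}
```

Calling backup_submirror /export/p05 /dev/md/dsk/d120 /dev/md/dsk/d122
/dev/rmt/0 in dry-run mode prints the five commands in order (the tape
device here is a placeholder). Checking the exit status of "lockfs -u"
separately at least lets a script flag the stuck lock immediately rather
than discovering it when clients start failing.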


