SUMMARY: Stale NFS file handle Problem

From: Hong Trac (hongt@sa-cgy.valmet.com)
Date: Tue Jun 20 1995 - 14:20:42 CDT

Hi everyone,

First I would like to thank the following people for helping me
solve the problem. Once again, "sun-managers" has come through:

        Nico Garcia, raoul@mit.edu
        Yves Hardy, yves@suntech.abcomp.be
        Melissa Metz, melissa@columbia.edu

and others who might be sending responses after this summary.

* The Problem:
===========

On a SPARC 20 file server running SunOS 4.1.3_U1, the console
gives the following messages every few seconds:

fcntl: Stale NFS file handle
rpc.lockd: unable to do cnvt

* Solutions: (replies are included at end)
=========

  I followed Yves's suggestion to run rpc.lockd in debug mode (-d) and
  was able to identify two Sparc5 clients (Solaris 2.4) that were causing
  the "Stale NFS" errors. The problem went away after rebooting these
  two clients. I will get patch #100075 later.

  I haven't tried Nico's and Melissa's suggestions yet, but I will
  next time. Melissa has a nice script that helps find the
  client (if Melissa doesn't mind, I can send it to you on request).

  Thanks again Melissa, Yves and Nico.

Hong Trac

======================================================================
Hong Trac                           Valmet Automation (Canada) Ltd.
Phone: (403)-253-8848               10333 Southport Road S.W.
Fax:   (403)-253-2926               Calgary, Alberta T2W 3X6
Email: Hong.Trac@sa-cgy.valmet.com  Canada
======================================================================

* Replies:
=======

1. From Nico Garcia:
----------------

Try, on the machines exporting the directories in question, running
"exportfs -uav; exportfs -av". This flushes the state of NFS interactions
by turning *off* NFS exporting, then turning it back on (I think this
is how I fixed it last time).

Fair warning: NFS does not work well

You can also unmount the NFS imported directories on the client machines
by hand: this may also help.

Nico Garcia
raoul@mit.edu

2. From Yves Hardy:
---------------

First verify that the appropriate lockd patch is installed on all
clients and servers on the network.

   SunOS 4.1.X Patch# 100075

   On the system reporting the errors, kill and restart rpc.lockd:

        # rpc.lockd -d 3

   The "-d" option will put rpc.lockd in debug mode. This may pinpoint a
   client on the network which is trying to access a nonexistent file
   or directory on the file server. Once a client is identified, correct
   the mounts on the client so that it properly accesses the file server.
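
   The kill-and-restart step could be sketched roughly as follows. This
   is only an illustration, not part of Yves's reply: the helper name
   lockd_pids is made up here, and it assumes BSD-style "ps -ax" output
   with the PID in the first column.

```shell
#!/bin/sh
# Hypothetical helper: read "ps -ax" output on stdin and print the PID
# (first column) of every process whose command name is rpc.lockd,
# with or without a leading directory path.
lockd_pids() {
    awk 'NR > 1 {
        for (i = 2; i <= NF; i++) {
            n = split($i, parts, "/")
            if (parts[n] == "rpc.lockd") { print $1; break }
        }
    }'
}

# Intended use on the server, as root. Left commented out because lock
# service is unavailable until rpc.lockd is running again:
#
#   for pid in `ps -ax | lockd_pids`; do kill $pid; done
#   rpc.lockd -d 3
```

   Matching on the command name rather than grepping the whole line
   keeps unrelated processes (rpc.statd, the pipeline itself) out of
   the kill list.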

Regards,

Yves Hardy from Belgium.

3. From Melissa Metz:
-----------------

Here is our internal note about the "unable to do cnvt" errors:

Problem:
Spewing errors about:

fcntl: Stale NFS file handle
rpc.lockd: unable to do cnvt

Diagnosis:
A client of this NFS server has a stale file handle (one which no
longer matches the state of the disk) open and locked.

Solution:
Kill the offending client process, or reboot the client.

Procedure:

On the server, run:

    /sh/sy/subsys/scripts/efindlockmgr

This will run rpcinfo -p, find the lockmgr processes/ports, and then
run etherfind on those ports.
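
The port-finding step might look something like the sketch below. It is
not the efindlockmgr script itself; the function name lockmgr_ports is
invented for illustration, and it assumes the usual "rpcinfo -p" column
layout (program, version, protocol, port, service name).

```shell
#!/bin/sh
# Hypothetical helper: read "rpcinfo -p" output on stdin and print each
# distinct port whose service name contains "lockmgr" (for example
# nlockmgr), in ascending numeric order.
lockmgr_ports() {
    awk '$5 ~ /lockmgr/ { print $4 }' | sort -n | uniq
}

# Intended use on the server:
#
#   rpcinfo -p | lockmgr_ports
#
# Each port printed would then be handed to the packet tracer; the
# etherfind filter syntax is not shown here, see etherfind(8C).
```

The same port is usually registered for several protocol versions, which
is why the list is de-duplicated before use.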

Look for the host that shows up again and again; this is the culprit
client.
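
Spotting the repeat offender by eye can be replaced with a tally. The
sketch below assumes the client host names have already been pulled out
of the trace, one per line; that input format and the function name
busiest_clients are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read one client host name per line on stdin and
# print "count host" pairs, with the most frequent host first.
busiest_clients() {
    sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

The first line of output names the host that appears most often in the
trace, which is the likely culprit.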

Try to find and kill a process on that client which would be accessing
this NFS server, or reboot the client.

I will include our "efindlockmgr" script below.

Melissa Metz
Unix Systems Group

---------------------------------------------------------------------------

This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:10:27 CDT