SUMMARY: rpc.mountd goes into uninterruptible wait

From: jol@columba.att.com
Date: Sat Apr 10 1993 - 03:07:48 CDT


It turned out one of the optical drives was disabled. That appears
to have been the problem.
Thanks to Bill Roome wdr@allegra!att.com for a good explanation of
how mountd interacts with the CommVault nfsd daemons.

ORIGINAL:

Hi all,
 
I have an SS2 running 4.1.1. The system has 32 Megs of memory, an internal
SCSI adaptor and an SBUS SCSI adaptor. The internal SBUS adaptor has
1 400Meg internal drive and 4 external drives connected to it. I
changed the config file to use the extra target ID for the disk rather
than a tape unit. The system also has an ATT ABARS optical juke box
connected to it.
 
The system is running DMS, which lets you traverse the files on the
optical platters as if they were a regular Unix file system. The
optical platter is an NFS mount even though it is local to this system.
All of our other Suns mount this optical file system along with 2
other magnetic disk drives.
 
Sometimes, when the other Suns try to mount any disks from this system,
they hang waiting for the mount to complete. If you log in to the system,
it is up and running, but a df hangs and so does an exportfs. A ps -aux
shows rpc.mountd as follows:
   root 139 0.0 0.0 68 0 ? D Mar 12 1:37 rpc.mountd -n
 
According to the man page it is in a non-interruptible wait state.
There are 24 nfsd running. The other suns mount the optical platter
using the following options:
 
archive -ro,soft,port=2051,timeo=160,retrans=12 system:/archive
as recommended by the folks from ABARS. Up until this point I have
been rebooting to clear it up.
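
To spot processes stuck in this state without reading the whole listing,
the ps output can be filtered on the STAT column. The following is a
minimal Python sketch and was not part of the original exchange. It
assumes the BSD-style column order USER PID %CPU %MEM SZ RSS TT STAT
START TIME COMMAND and that the columns are separated by whitespace,
which real ps output does not always guarantee. The function name
uninterruptible and the init sample line are made up for illustration;
only the rpc.mountd line comes from the post.

```python
def uninterruptible(ps_lines):
    """Return (pid, stat, rest_of_line) for entries whose STAT starts with 'D'."""
    found = []
    for line in ps_lines:
        fields = line.split()
        # Assumed layout: USER PID %CPU %MEM SZ RSS TT STAT START TIME COMMAND
        if len(fields) < 9 or not fields[1].isdigit():
            continue  # skip the header and anything malformed
        stat = fields[7]
        if stat.startswith("D"):
            # START may be one or two tokens ("Mar 12"), so the tail of the
            # line is kept whole instead of guessing where COMMAND begins.
            found.append((int(fields[1]), stat, " ".join(fields[8:])))
    return found


sample = [
    "USER PID %CPU %MEM SZ RSS TT STAT START TIME COMMAND",
    "root 139 0.0 0.0 68 0 ? D Mar 12 1:37 rpc.mountd -n",  # from the post
    "root 1 0.0 0.0 52 0 ? IW Mar 12 0:00 /sbin/init -",    # made-up example
]

for pid, stat, rest in uninterruptible(sample):
    print(pid, stat, rest)
```

In practice the lines would come from the output of ps -aux rather than a
hard-coded list.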
 
Does anyone know what causes this non-interruptible wait state?
This system is connected to a PRODON router and is on a net all by itself.
Does anyone know of a graceful way to bring this system's exporting
capabilities back to life? The next time this happens I'm just going to
kill the mountd and restart it but I would like to find out what causes it.
 
Thanks much in advance! I'll summarize.
 
Joe Ledva
jol@columba.att.com
   or
columba!jol@ludwig.att.com

RESPONSE:

When a client mounts a 3dfs or dms file system, mountd must
talk to the 3d-nfsd or dms-nfsd servers. If those are hung,
mountd will hang. If it happens again, do ls -ld /archive
and also on the dms mount points. If those hang,
then the 3dfs/dms servers are hung.
 
If those are okay, then something's really strange in mountd.
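
The ls -ld check above can itself hang the shell it is typed into, so
it may help to run it with a time limit. The following is a rough
Python sketch of that idea, not anything from Bill's reply. The
function name probe, the 10-second limit and the polling interval are
all assumptions, and the dms mount points are not named in this thread,
so only /archive is listed.

```python
import subprocess
import time


def probe(path, timeout=10.0, interval=0.2):
    """Run `ls -ld path` and report 'ok', 'error', or 'hung'."""
    child = subprocess.Popen(
        ["ls", "-ld", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = child.poll()  # non-blocking; None while ls is still running
        if status is not None:
            return "ok" if status == 0 else "error"
        time.sleep(interval)
    # Deliberately no wait() here: a child stuck in uninterruptible wait
    # cannot be reaped, and waiting on it would hang this script as well.
    return "hung"


if __name__ == "__main__":
    # /archive is from the post; add the dms mount points for your site.
    for mount_point in ["/archive"]:
        print(mount_point, probe(mount_point))
```

A result of "hung" for /archive or a dms mount point would point at the
3dfs/dms servers, following the reasoning above; "ok" everywhere would
point back at mountd itself.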
 
BTW, the 24 nfsd's are 3 sets of 8:
        nfsd -- the vanilla sun nfs mag-disk servers
        3d-nfsd -- the nfs servers for 3dfs
        dms-nfsd -- the nfs servers for dms
The 3d & dms daemons are independent, except that dms-nfsd calls
3d-nfsd (via normal nfs channels) to resolve @ references.
 
Also, the 3d-hprtseg command prints the status of each of the 8 3d-nfsd
processes -- what request type they're working on, how many requests
they've served, etc (3d-hprtseg reads a shared segment to get
that info). I think there's a similar command for dms, named
dms-prtCache or something like that. You have to run them as root
on your abars server.
 
        - Bill Roome wdr@allegra!att.com


