SUMMARY: Clone fsystem on NFS server

From: Imre Kolos (Imre.Kolos@eth.ericsson.se)
Date: Thu Sep 02 1999 - 05:57:04 CDT


I got 4 replies to my question (moving an NFS-shared fsystem to a new device,
see the original post below). Thanks go to:

Jochen Bern <bern@penthesilea.uni-trier.de>
Kevin Sheehan <kevin@joltin.com>
Arthur Darren Dunham <add@netcom.com>
Todd A. Fiedler <todd.a.fiedler@mail.sprint.com>

The consensus is NO (read on :)): I can't move an NFS-shared file system
to new device(s) on the server without disturbing the NFS clients
(the mounts become Stale NFS filehandles on the client),
*unless*
I can preserve the inode structure and the device name of the fsystem.

When a client wants to access a file, the server gives it a `filehandle',
unique for that file and file system. The client then refers to
the file by this handle. The server calculates the handle from the
file system ID (FSID) and the file ID (FID, I'd say the inode number).
(SunOS 4 has a `showfh' command; for Solaris there is an fhfind script in
the SunSolve NFS PSD/FAQ.)
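
To make the FSID/FID dependence concrete, here is a toy sketch in shell.
It is NOT how the server really encodes a handle (the real one is an opaque
blob); the pseudo_fh function and the ID numbers are made up for
illustration only.

```shell
#!/bin/sh
# Toy model only: a real NFS filehandle is an opaque blob built by the server.
# pseudo_fh FSID FID  ->  prints the two IDs as one hex string
pseudo_fh() {
    printf '%08x:%08x\n' "$1" "$2"
}

# Hypothetical IDs: the inode (FID) is the same, but the FSID has changed
# because the file system now lives on a different device.
before=`pseudo_fh 100 1234`
after=`pseudo_fh 200 1234`

if [ "$before" = "$after" ]; then
    echo "handle unchanged, client carries on"
else
    echo "handle changed, client reports a stale handle"
fi
```

The point of the sketch: keeping the inodes alone is not enough, both halves
of the handle have to survive the move.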

The inodes can be preserved, for example with mirroring (see the original
post).
What's more, if the file system is under some volume management software,
i.e. the device name is not hardware related, there is a good chance that
the VM has some way to preserve the device name while moving the file system
around. Thanks to Arthur and Todd for the idea.
(Or, if it were a single disk, I could swap the SCSI IDs between the old and
the new one after the mirroring.)

Enough small talk :), I set up a test server and went cooking.
It works with DiskSuite !!! (Todd said it can be done with Veritas too.)
If you have a file system on a concat or trans device and want to move it
to another concat or stripe device, you can do it while the file system is
basically online and without rebooting the server.

Here are the tests:
(Solaris 2.6, DiskSuite 4.2)

/dev/md/dsk/d1 is /export/home, ~1 Gbyte; let's move it to d2, ~1.7 Gbyte

| server>metastat
| d1: Concat/Stripe
| Size: 2296350 blocks
| Stripe 0:
| Device Start Block Dbase
| c0t0d0s6 0 No
|
| d2: Concat/Stripe
| Size: 3499335 blocks
| Stripe 0:
| Device Start Block Dbase
| c0t0d0s7 0 No
|
| server>unshare /export/home
| server>umount /export/home
| server>metarename d1 d5
| d1: has been renamed to d5
| server>metainit d1 -m d5
| d1: Mirror is setup
| server>mount /export/home
| server>shareall
| server>metattach d1 d2
| d1: submirror d2 is attached
- wait till the mirror-resync is complete.
| server>metastat
| d1: Mirror
| Submirror 0: d5
| State: Okay
| Submirror 1: d2
| State: Resyncing
| Resync in progress: 54 % done
| Pass: 1
| Read option: roundrobin (default)
| Write option: parallel (default)
| Size: 2296350 blocks
-
| server>unshare /export/home
| server>umount /export/home
| server>metaclear d1
| d1: Mirror is cleared
| server>metarename d2 d1
| d2: has been renamed to d1
| server>mount /export/home
| server>shareall
| server>df -k /export/home
| Filesystem kbytes used avail capacity Mounted on
| /dev/md/dsk/d1 1111677 2278 1053816 1% /export/home
- The move is done; only, because of the mirroring, the fsystem still has the
- same size as the old one had -> growfs
| server>growfs -M /export/home /dev/md/rdsk/d1
| /dev/md/rdsk/d1: 3499334 sectors in 3703 cylinders of 15 tracks, 63 sectors
| 1708.7MB in 116 cyl groups (32 c/g, 14.77MB/g, 3712 i/g)
| super-block backups (for fsck -F ufs -o b=#) at:
| 32, 30336, 60640, 90944, 121248, 151552, 181856, 212160, 242464, 272768,
| 303072, 333376, 363680, 393984, 424288, 454592, 483872, 514176, 544480,
| 574784, 605088, 635392, 665696, 696000, 726304, 756608, 786912, 817216,
| 847520, 877824, 908128, 938432, 967712, 998016, 1028320, 1058624, 1088928,
| 1119232, 1149536, 1179840, 1210144, 1240448, 1270752, 1301056, 1331360,
| 1361664, 1391968, 1422272, 1451552, 1481856, 1512160, 1542464, 1572768,
| 1603072, 1633376, 1663680, 1693984, 1724288, 1754592, 1784896, 1815200,
| 1845504, 1875808, 1906112, 1935392, 1965696, 1996000, 2026304, 2056608,
| 2086912, 2117216, 2147520, 2177824, 2208128, 2238432, 2268736, 2299040,
| 2329344, 2359648, 2389952, 2419232, 2449536, 2479840, 2510144, 2540448,
| 2570752, 2601056, 2631360, 2661664, 2691968, 2722272, 2752576, 2782880,
| 2813184, 2843488, 2873792, 2903072, 2933376, 2963680, 2993984, 3024288,
| 3054592, 3084896, 3115200, 3145504, 3175808, 3206112, 3236416, 3266720,
| 3297024, 3327328, 3357632, 3386912, 3417216, 3447520, 3477824,
| server>df -k /export/home
| Filesystem kbytes used avail capacity Mounted on
| /dev/md/dsk/d1 1693969 2278 1636108 1% /export/home
- on the client too
| client>df -k .
| Filesystem kbytes used avail capacity Mounted on
| server:/export/home/user
| 1693969 2278 1636108 1% /home/user
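
The "wait till the mirror-resync is complete" step can be scripted. A minimal
sketch, assuming metastat keeps printing a "Resync in progress" line while
the resync runs (as in the output above). The function names and the
METASTAT and POLL variables are my own, not part of DiskSuite.

```shell
#!/bin/sh
# resync_running MIRROR: true while metastat still reports a resync.
# METASTAT can be overridden (e.g. for testing); the default is the real command.
resync_running() {
    ${METASTAT:-metastat} "$1" 2>/dev/null | grep "Resync in progress" >/dev/null
}

# wait_for_resync MIRROR: poll every POLL seconds (default 60) until done.
wait_for_resync() {
    while resync_running "$1"; do
        sleep "${POLL:-60}"
    done
}
```

Usage would be something like `wait_for_resync d1' between the metattach
and the second unshare/umount.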

I started editing /home/user/file on the client with a text editor before the
start of the move, and could save during the mirror resync and after it had
finished. The client is OK, no Stale NFS handles.
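
The concat test above, condensed into a script sketch. It uses only the
commands from the transcript; the run/offline/online/move_* helper names and
the DRYRUN switch are my own. By default it only prints what it would do.
Treat it as an outline to adapt on a test box, not a finished tool.

```shell
#!/bin/sh
# DRYRUN=1 (the default here) only prints the commands;
# DRYRUN=0 really runs them and stops at the first failure.
DRYRUN=${DRYRUN:-1}

run() {
    if [ "$DRYRUN" = 1 ]; then echo "+ $*"; else "$@" || exit 1; fi
}

offline() { run unshare "$1"; run umount "$1"; }
online()  { run mount "$1";   run shareall; }

# move_start FS OLD NEW TMP
# Rename OLD to TMP, build a one-way mirror called OLD on top of it,
# bring the file system back and attach NEW as the second submirror.
move_start() {
    offline "$1"
    run metarename "$2" "$4"
    run metainit "$2" -m "$4"
    online "$1"
    run metattach "$2" "$3"       # the resync starts here
}

# move_finish FS OLD NEW
# After the resync: drop the mirror, give NEW the old name, grow the fsystem.
move_finish() {
    offline "$1"
    run metaclear "$2"
    run metarename "$3" "$2"
    online "$1"
    run growfs -M "$1" "/dev/md/rdsk/$2"
}

move_start  /export/home d1 d2 d5
# ... wait here until metastat no longer shows "Resync in progress" ...
move_finish /export/home d1 d2
```

The file system is offline only for the two short unshare/umount windows;
the long part (the resync) runs while it is mounted and shared.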

With trans devices it went this way:

/export/home is trans d1, move it to trans d2

| server>metastat
| d1: Trans
| State: Okay
| Size: 2296350 blocks
| Master Device: d6
| Logging Device: c0t0d0s4
|
| d6: Concat/Stripe
| Size: 2296350 blocks
| Stripe 0:
| Device Start Block Dbase State Hot Spare
| c0t0d0s6 0 No Okay
|
| d2: Trans
| State: Okay
| Size: 3499335 blocks
| Master Device: d7
| Logging Device: c0t0d0s5
|
| d7: Concat/Stripe
| Size: 3499335 blocks
| Stripe 0:
| Device Start Block Dbase State Hot Spare
| c0t0d0s7 0 No Okay
|
| c0t0d0s4: Logging device for d1
| State: Okay
| Size: 131097 blocks
|
| Logging Device Start Block Dbase
| c0t0d0s4 258 No
|
| c0t0d0s5: Logging device for d2
| State: Okay
| Size: 131097 blocks
|
| Logging Device Start Block Dbase
| c0t0d0s5 258 No
|
| server>df -k /export/home
| Filesystem kbytes used avail capacity Mounted on
| /dev/md/dsk/d1 1111677 2278 1053816 1% /export/home
| server>unshare /export/home
| server>umount /export/home
| server>metaclear d1
- DiskSuite can't mirror trans devices.
| d1: Trans is cleared
| server>metainit d1 -m d6
| d1: Mirror is setup
| server>mount /export/home
| mount: the state of /dev/md/dsk/d1 is not okay
| and it was attempted to be mounted read/write
| mount: Please run fsck and try again
| server>metaclear d1
| d1: Mirror is cleared
| server>fsck /dev/md/rdsk/d6
| ** /dev/md/rdsk/d6
| ** Last Mounted on /export/home
| ** Phase 1 - Check Blocks and Sizes
| ** Phase 2 - Check Pathnames
| ** Phase 3 - Check Connectivity
| ** Phase 4 - Check Reference Counts
| ** Phase 5 - Check Cyl groups
|
| FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX? y
|
| 495 files, 2277 used, 1109400 free (400 frags, 138625 blocks, 0.0% fragmentation)
| server>metainit d1 -m d6
| d1: Mirror is setup
| server>mount /export/home
| server>shareall
| server>metaclear d2
| d2: Trans is cleared
| server>metattach d1 d7
| d1: submirror d7 is attached
- mirror-resync
| server>unshare /export/home
| server>umount /export/home
| server>metaclear d1
| d1: Mirror is cleared
- I put it back online, growfs might take a long time.
| server>metarename d7 d1
| d7: has been renamed to d1
| server>mount /export/home
| server>shareall
| server>df -k /export/home
| Filesystem kbytes used avail capacity Mounted on
| /dev/md/dsk/d1 1111677 2278 1053816 1% /export/home
| server>growfs -M /export/home /dev/md/rdsk/d1
| Warning: 1 sector(s) in last cylinder unallocated
| /dev/md/rdsk/d1: 3499334 sectors in 3703 cylinders of 15 tracks, 63 sectors
| 1708.7MB in 116 cyl groups (32 c/g, 14.77MB/g, 3712 i/g)
| super-block backups (for fsck -F ufs -o b=#) at:
| 32, 30336, 60640, 90944, 121248, 151552, 181856, 212160, 242464, 272768,
| 303072, 333376, 363680, 393984, 424288, 454592, 483872, 514176, 544480,
| 574784, 605088, 635392, 665696, 696000, 726304, 756608, 786912, 817216,
| 847520, 877824, 908128, 938432, 967712, 998016, 1028320, 1058624, 1088928,
| 1119232, 1149536, 1179840, 1210144, 1240448, 1270752, 1301056, 1331360,
| 1361664, 1391968, 1422272, 1451552, 1481856, 1512160, 1542464, 1572768,
| 1603072, 1633376, 1663680, 1693984, 1724288, 1754592, 1784896, 1815200,
| 1845504, 1875808, 1906112, 1935392, 1965696, 1996000, 2026304, 2056608,
| 2086912, 2117216, 2147520, 2177824, 2208128, 2238432, 2268736, 2299040,
| 2329344, 2359648, 2389952, 2419232, 2449536, 2479840, 2510144, 2540448,
| 2570752, 2601056, 2631360, 2661664, 2691968, 2722272, 2752576, 2782880,
| 2813184, 2843488, 2873792, 2903072, 2933376, 2963680, 2993984, 3024288,
| 3054592, 3084896, 3115200, 3145504, 3175808, 3206112, 3236416, 3266720,
| 3297024, 3327328, 3357632, 3386912, 3417216, 3447520, 3477824,
| server>df -k /export/home
| Filesystem kbytes used avail capacity Mounted on
| /dev/md/dsk/d1 1693969 2277 1636109 1% /export/home
| server>unshare /export/home
| server>umount /export/home
| server>metarename d1 d7
| d1: has been renamed to d7
| server>metainit d1 -t d7 c0t0d0s5
| d1: Trans is setup
| server>mount /export/home
| server>shareall
Client is OK.
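
For checking "Client is OK" on more than one client or mount point, a small
probe like the following can help. It is a sketch of my own (check_mount is
not a standard command); it only tests whether the directory can be listed,
which is what fails when the handle has gone stale.

```shell
#!/bin/sh
# check_mount DIR: try to list DIR and report the result.
# With a stale handle the ls fails; returns 1 in that case.
check_mount() {
    if ls "$1" >/dev/null 2>&1; then
        echo "$1: OK"
    else
        echo "$1: NOT accessible (stale handle?)"
        return 1
    fi
}

# Probe the mount point given as argument, /home/user by default.
check_mount "${1:-/home/user}" || echo "check the share on the server"
```

Note that on a hard mount the ls will simply hang while the server side is
unshared, so run it after the shareall.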

The original question(s):

----- Begin Included Message -----

Hi managers,

I'd like to move /export/home from one filesystem to another on my NFS server
with as little disturbance to the NFS clients (well, the users) as possible.
As it stands, I need to get every user to log out for at least 5 hours. Even
worse, arranging this takes at least a week.

My main problem is that the /home/user mounts become Stale NFS filehandles on
the clients if /export/home moves to a new device on the server.
(By move I mean that /export/home is /dev/dsk/c1t5d0s0, I copy it to say
/dev/md/dsk/d1, then edit vfstab to have /export/home point to /dev/md/dsk/d1
and reboot the server.)

Is there a way to do this move without the Stale NFS problem ?

The idea of trying something like this at all comes from the fact that you can
reboot/power-cycle (change hardware on) the server and the clients will go on
once the server has come back.

Basic steps are:
        unshare/share -o ro /export/home
        copy /export/home
        edit vfstab
        reboot server
(This has problems as per above.)

Advanced version, using DiskSuite (the tip was from SunSupport)
        1) - first put /export/home into a one-way mirror, ie.:
                metainit -f d21 1 1 c1t5d0s0
                metainit d2 -m d21
        2) - edit vfstab, /export/home is /dev/md/dsk/d2 now and reboot
        3) - attach the new home fsystem to the mirror
                metainit -f d22 1 1 d1
                metattach d2 d22
        4) - when the mirror resync is completed unshare, unmount home,
                  metaclear d2 mirror, edit vfstab, /export/home is
                  /dev/md/dsk/d1 now and reboot
But already at step 2 the client goes Stale, although /export/home has
physically "almost" not changed.

I hoped that if I preserved the inode structure of the fsystem I could change
the device unnoticed, but at step 2 the inodes are guaranteed to be the same
and still the clients detect the trick.
What do they detect ?
Can it be avoided ?
Are there other ways to do this ?

----- End Included Message -----

cya,
Imre

------------------------------------------------------------------------------
Imre Kolos e-mail: Imre.Kolos@eth.ericsson.se
UNIX support mgr. tel: +36 1 4377322
Ericsson Ltd. Hungary fax: +36 1 4377374
------------------------------------------------------------------------------


