*VERY LATE* SUMMARY: NFS Performance...

From: Brad L. Knowles (blknowle@aragorn.jdssc.dca.mil)
Date: Wed Aug 14 1991 - 05:16:41 CDT


[Sorry about being late with this summary, but the day before I tried to
post this the first time our mailhost freaked out on us, and I've been
scrambling since then to try to get my Sun SS2/GX set up as the new
mailhost. Only now do I have a /etc/sendmail.cf that I have some
confidence will work. -Brad]

    Here is some more information on the subject of NFS Performance, part
of a thread I started a while back but have not had time to post comments
from until now.

    Since this is getting dangerously close to a ``conversation'', and I
haven't officially received permission from the maintainer of this list
(just a couple months of silence on my request for permission to post this
stuff), I will make this the next to last word that *I* will post on the
subject, at least until someone else asks the same questions I did. The
*last* word will be the list of hosts that I got as a response from the
Archie server that have the nhfsstone and nfswatch benchmarking programs
(compress'ed and uuencode'd, of course).

    Thanks for all the help, and I hope this helps some of you.
| Brad Knowles | Internet: blknowle@frodo.jdssc.dca.mil |
| DISA/DSSO/JNSL | Ph: (703) 693-5849 Fax: (703) 693-7329 |
| The Pentagon, Room BE685 |-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-|
| Washington, D.C. 20301-7010 | Speaking from, *not* for DISA (nee DCA) |

               Getting the most performance out of Sun Servers
                            by Dr. Peter Oleinick

    [Re-printed from _The_Sun_Observer_, with the author's permission]

        [Graciously input by jaf@inference.com (Jose A. Fernandez)]

Two new products that improve the network file system (NFS) performance of
Sparcservers have been introduced in the Sun marketplace. The Sun
PrestoServer NFS Accelerator and the NC400 Network CoProcessor both
improve the speed and capacity of a Sun NFS server.

It might appear as if these products compete with each other because both
seem to accomplish the same thing. In fact, the opposite is true. The
material below explains how both products work and when to use either one
or both, and presents the results of the products' performance.

For this study, the objective is to improve the NFS performance of a
Sparcserver 470. This 4/470 has three Ethernets that connect workstations
to the server. Two IPI disks are connected via two IPI controllers.
Tracing NFS traffic between the clients and the server shows what appears
to be a fairly typical mixture of NFS operations.

The first step is to establish the baseline performance with a standard
benchmark. Using the nhfsstone benchmark and the default mix of
operations, a single Sparcstation 1 client achieved 102 NFS operations per
second (ops/sec) with an average response time of 37 milliseconds. During
the test, the workstation was busy about 40 percent of the time, and 30
percent of the server was utilized.

Something other than the lack of central processing unit (CPU) cycles
limits NFS performance. Table 1 may contain the answers.

             Table 1: One Workstation Running Nhfsstone Benchmark

                  Operation     Mix    msec/call    time %
                  null           0%        0.00      0.00%
                  getattr       13%        6.61      2.33%
                  setattr        1%      144.95      3.78%
                  root           0%        0.00      0.00%
                  lookup        34%        6.53      6.04%
                  readlink       8%        5.91      1.29%
                  read          22%       14.16      8.49%
                  wrcache        0%        0.00      0.00%
                  write         15%      170.53     69.85%
                  create         2%       63.30      3.44%
                  remove         1%      138.52      3.61%
                  rename         0%        0.00      0.00%
                  link           0%        0.00      0.00%
                  symlink        0%        0.00      0.00%
                  mkdir          0%        0.00      0.00%
                  rmdir          0%        0.00      0.00%
                  readdir        3%       12.14      0.97%
                  fsstat         1%        6.06      0.15%

          5000 calls      102.4 calls/second      36.76 msec/call

The mix of operations appears in the leftmost column of numbers. The
average milliseconds per call (msec/call) appears next, and the percentage
of the benchmark execution time for each operation is in the last column.
Write operations account for almost 70 percent of the execution time,
although only 15 percent of the operations are writes. Following the
80-20 rule, we should focus on the write and modify operations to improve
our overall performance.
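
The time percentages in Table 1 can be approximately recomputed from the
other two columns. The sketch below, in Python, weights each operation's
msec/call by its share of the mix. The names are illustrative only, and
because the mix percentages are rounded to whole numbers, the results
only approximate the table's own figures.

```python
# Rows of Table 1 with a nonzero share of the mix:
# operation -> (mix fraction, average msec/call)
TABLE_1 = {
    "getattr":  (0.13, 6.61),
    "setattr":  (0.01, 144.95),
    "lookup":   (0.34, 6.53),
    "readlink": (0.08, 5.91),
    "read":     (0.22, 14.16),
    "write":    (0.15, 170.53),
    "create":   (0.02, 63.30),
    "remove":   (0.01, 138.52),
    "readdir":  (0.03, 12.14),
    "fsstat":   (0.01, 6.06),
}


def weighted_msec(table):
    """Contribution of each operation to the overall average msec/call."""
    return {op: mix * msec for op, (mix, msec) in table.items()}


def time_shares(table):
    """Fraction of total benchmark time spent in each operation."""
    contrib = weighted_msec(table)
    total = sum(contrib.values())
    return {op: c / total for op, c in contrib.items()}


avg_msec = sum(weighted_msec(TABLE_1).values())
shares = time_shares(TABLE_1)

print(f"average msec/call: {avg_msec:.2f}")
for op, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{op:10s} {share:6.1%}")
```

Write comes out at a little under 70 percent of the total time, far ahead
of every other operation, which is the imbalance the text describes.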

Write and modify NFS operations are slow because NFS requires the server
to write the information safely to non-volatile storage before an
acknowledgement is returned to the client. This prevents client data loss
in the event of a server failure during a modify operation.

For a typical 8 kilobyte (KB) write operation to a medium-sized file,
three physical disk write I/O operations must be performed. Three disk
seeks, three rotational delays and three disk writes are performed to
write the data block, update the indirect block, and finally update the
inode. All of this takes a great deal of time.

The PrestoServer accelerates write and modify operations by caching these
operations to nonvolatile memory and efficiently scheduling data transfers
to disk. These operations now can occur at memory speeds rather than at
disk speeds, while the number of writes to disk is reduced by eliminating
redundant writes of inode and indirect blocks to disk.

For example, suppose the client copies a 1 megabyte (MB) file to the
disk. Each of the 128 updates to the inode and the 116 updates to the
indirect block will be cached rather than causing a separate disk I/O. A
total of 130 disk writes will occur instead of the normal 372 writes to
copy the file.
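
The write counts in this example can be reproduced with a little
arithmetic. The Python sketch below assumes 8 KB blocks, as in the text,
and assumes that the first 12 blocks of a file are addressed directly from
the inode; that figure is not stated in the article but is what makes 128
data blocks yield 116 indirect block updates. The function name is
illustrative only.

```python
BLOCK_SIZE = 8 * 1024   # 8 KB file system block, as in the text
DIRECT_BLOCKS = 12      # assumption: first 12 blocks need no indirect block


def disk_writes(file_bytes, cached):
    """Count physical disk writes needed to copy a file of file_bytes.

    Only valid while a single indirect block is enough, which holds
    comfortably for the 1 MB example.
    """
    data_blocks = -(-file_bytes // BLOCK_SIZE)  # ceiling division
    inode_updates = data_blocks                 # one inode update per block
    indirect_updates = max(0, data_blocks - DIRECT_BLOCKS)

    if cached:
        # Repeated updates to the same inode and indirect block are
        # absorbed by the cache and reach the disk once each.
        inode_writes = 1 if inode_updates else 0
        indirect_writes = 1 if indirect_updates else 0
    else:
        inode_writes = inode_updates
        indirect_writes = indirect_updates

    return data_blocks + inode_writes + indirect_writes


ONE_MB = 1024 * 1024
print(disk_writes(ONE_MB, cached=False))  # 128 + 128 + 116 = 372
print(disk_writes(ONE_MB, cached=True))   # 128 + 1 + 1 = 130
```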

When we study the performance results produced by a single client running
the nhfsstone benchmark, this time with a PrestoServer installed in the
server, we find the total throughput has increased from 102 ops/sec to 165
ops/sec, and average response time has been more than cut in half. The
msec/call figures for the write and modify operations have been
dramatically reduced, leading to a more balanced distribution of execution
time.

Let us examine the performance achieved when four workstations
simultaneously perform the benchmark, first on the standard 4/470, then on
one with the PrestoServer installed. More NFS throughput has been achieved
through the accelerated 4/470 because of the PrestoServer; however, now the
server is out of CPU cycles after reaching 285 NFS ops/sec. If we are to
achieve any higher levels of NFS throughput, something must reduce the
amount of CPU being consumed performing NFS operations.

NFS is the top layer of a deep protocol stack. Performing a data transfer
operation involves overhead to process the Ethernet packets from the
Datalink layer through IP, UDP, RPC/XDR, and NFS server. The file system
operation is then performed on the server, and a complete round trip is
made back down the protocol stack to return data and/or status to the
requesting workstation. In addition, machine interrupts at the datalink
level notify the CPU that a packet has arrived over the Ethernet, and
additional CPU is required to move data to and from the Ethernet
controller over the VMEbus.

The NC400 Network CoProcessor is an intelligent Ethernet controller with
its own processor, which is capable of executing the entire protocol stack
mentioned above. The coprocessor handles all of the processing, from the
Datalink level up to and including the NFS server, without involving the
Sparcserver 470.

When a ready-to-execute file system operation is assembled, a single
interrupt notifies the Sparcserver's processor, while the board DMAs the
data to and from main memory. For the return trip the reverse occurs,
reducing the Sparcserver's network computing requirements by more than 90
percent.

When examining the performance results from the four-client workload in
which the server is enhanced with a PrestoServer and two Network
CoProcessors to handle the workstation Ethernets, we find the server is
providing over 450 NFS ops/sec at an average response time of 32
milliseconds. Thus we have increased performance in two big jumps, from
165 ops/sec for the PrestoServer accelerated server, to 450 ops/sec for
the "twin-turbo" 4/470.

Throughput and response times for all three configurations show that
PrestoServer cuts average response times in half, and by doing so allows
the server to almost double its maximum NFS throughput capacity to 285
ops/sec. The addition of the Network CoProcessors extends the throughput
capacity limit to 450 ops/sec by offloading the NFS networking overhead
that consumes the server's CPU.

If we plot NFS throughput as a function of server utilization for the
three configurations -- NC400 and PrestoServer, PrestoServer only,
standard 4/470 -- we calculate that the standard server provides 3.27 NFS
operations per CPU percent utilized. However, the disk bottleneck
prevents the server from providing more than 165 ops/sec.

The PrestoServer accelerated server produces only 2.95 operations per CPU
percent utilized, but the disk bottleneck is now eliminated and the server
can reach almost 300 ops/sec. The "twin-turbo" server, with both the
PrestoServer and the Network CoProcessors, produces 5.37 NFS operations
per CPU percent utilized and can reach nearly 500 ops/sec. The
combination of both products eliminates the disk bottleneck and the
network overhead.
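
The operations-per-CPU-percent figures can be related to the throughput
numbers quoted earlier with a simple linear model. The Python sketch below
assumes that CPU use grows in direct proportion to throughput, which the
article does not claim; under that assumption the ceilings it prints are
upper bounds, not measurements. The function names are illustrative only.

```python
# NFS operations per CPU percent utilized, as quoted in the article.
OPS_PER_CPU_PERCENT = {
    "standard 4/470": 3.27,
    "PrestoServer only": 2.95,
    "PrestoServer + NC400": 5.37,
}


def cpu_percent_needed(ops_per_sec, ops_per_cpu_percent):
    """CPU utilization implied by a throughput, assuming linear scaling."""
    return ops_per_sec / ops_per_cpu_percent


def cpu_bound_ceiling(ops_per_cpu_percent):
    """Throughput at 100 percent CPU, assuming linear scaling."""
    return 100.0 * ops_per_cpu_percent


for name, rate in OPS_PER_CPU_PERCENT.items():
    print(f"{name:22s} CPU-bound ceiling: {cpu_bound_ceiling(rate):5.0f} ops/sec")

# Throughputs quoted in the article for the two accelerated configurations.
presto_cpu = cpu_percent_needed(285, OPS_PER_CPU_PERCENT["PrestoServer only"])
twin_cpu = cpu_percent_needed(450, OPS_PER_CPU_PERCENT["PrestoServer + NC400"])
print(f"PrestoServer only at 285 ops/sec: {presto_cpu:.1f}% CPU")
print(f"PrestoServer + NC400 at 450 ops/sec: {twin_cpu:.1f}% CPU")
```

Under this model the PrestoServer-only server needs roughly 97 percent of
its CPU to deliver 285 ops/sec, which agrees with the observation that it
runs out of CPU cycles at that point. The computed ceiling for the
combined configuration is higher than the "nearly 500" figure quoted
above, so the linear model should be read as optimistic.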

The products are recommended for use in nearly all server systems;
however, in some instances neither will help. The PrestoServer
accelerator will not improve performance if both of the following conditions
are true: (a) the volume of NFS traffic is not significant; and (b) write
or modify operations constitute less than five percent of the total NFS

Network CoProcessors will not enhance performance if the following
conditions are true: (a) the volume of NFS traffic is not significant; (b)
multiple subnets are not desired; and (c) the server is not used for both
NFS and other computing.
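
The two sets of conditions above can be written as simple checks. The
Python sketch below is one reading of them, not a sizing tool: the article
does not say how much NFS traffic counts as "significant", so that
judgment is passed in as a yes/no value, and the Network CoProcessor
conditions are read as all having to hold at once. The function and
parameter names are hypothetical.

```python
def presto_unlikely_to_help(nfs_traffic_significant, write_modify_fraction):
    """True when both PrestoServer conditions from the text hold.

    write_modify_fraction is the share of write or modify operations,
    expressed as a fraction between 0.0 and 1.0.
    """
    return (not nfs_traffic_significant) and write_modify_fraction < 0.05


def nc400_unlikely_to_help(nfs_traffic_significant,
                           multiple_subnets_desired,
                           shared_with_other_computing):
    """True when all three Network CoProcessor conditions hold."""
    return (not nfs_traffic_significant
            and not multiple_subnets_desired
            and not shared_with_other_computing)


# The benchmark mix in Table 1 has 15 percent writes alone, on a server
# with three Ethernets, so neither check fires for that configuration.
print(presto_unlikely_to_help(True, 0.15))
print(nc400_unlikely_to_help(True, True, False))
```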

[Late breaking news from Dr. Oleinick: ``... with some very minor
performance tweaking recommended by Sun in the paper produced by the Sun
Server Performance Group, "Tuning the SS-490 for Optimal NFS Performance"
by Varun Mehta and Rajiv Khemani, the 2 network configuration is able to
exceed 500 NFS ops/sec. with excellent response time.'' The paper was
presented at the June Technical SUG in Atlanta.

Also, it appears that the article as printed in _The_Sun_Observer_ left
out several figures and tables. The complete information on the subject
can *only* be obtained from Dr. Oleinick. -Brad Knowles]


Dr. Peter Oleinick, an electrical engineer, is director of technical
marketing for Interphase Corp. He holds a Ph.D. in Electrical
Engineering. He has held positions with Tandem and with Hewlett-Packard.
Dr. Oleinick can be reached at pno@omni.com.


Date: Sun, 23 Jun 91 23:20:58 EDT
From: stern@sunne.East.Sun.COM (Hal Stern - Consultant)
Message-Id: <9106240320.AA13528@sunne.East.Sun.COM>
To: blknowle@frodo.jdssc.dca.mil
Subject: Re: SUMMARY: Any good NFS performance monitoring software out there?
Status: R

there's a big difference between the legato (prestoserve) boards and the
interphase nc400 network processors. the nc400 speeds up the network, NFS
or not, even on one network. it provides better scaling over multiple
networks, since the VME ethernet interface you get from sun is pretty slow
(it's a 4 or 5 year old product).

the nc400 helps reduce interrupts and offloads the cpu; protocol
processing up to the NFS layer is done in hardware.

the presto board just accelerates disk writes. it's really pretty
ignorant of NFS -- it plugs into the device driver, and doesn't know one
thing about NFS. one of the DBMS vendors tried a benchmark using a presto
board to accelerate local writes and they were thrilled with the
performance. anything that does lots of sync disk writes (file creation,
dir modification, and of course NFS service) benefits.

