SUMMARY: (additional details) gigabit performance NFS weirdness

From: Adam and Christine Levin <levins_at_westnet.com>
Date: Thu Aug 08 2002 - 10:37:28 EDT
I got a few more comments from:
Nate Itkin <Nate-Itkin@ptdcs2.ra.intel.com>
Tristan Ball <tb@vsl.com.au>
"Broderick, Sean" <Sean.Broderick@irl.xerox.com>
Wanke Matthias <Matthias.Wanke@itellium.com>

After I posted my summary, we ran some more tests.

First, we found one very significant problem.  As you are probably
all-too-aware, Suns by default set every network card in the system (if
they're Sun cards) to the same MAC address.  This was the problem with UDP
-- the packets didn't know where to go, and the switch got confused.  We
gave each gigabit card its own ethernet address by incrementing the base
address:
ifconfig ge0 ether aa:bb:cc:dd:ee:ff+1
ifconfig ge1 ether aa:bb:cc:dd:ee:ff+2
where aa:bb:cc:dd:ee:ff is the ethernet address of the on-board 100Mbit
interface.  (The +1 and +2 are shorthand for the incremented address, not
literal ifconfig syntax.)
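
For anyone who would rather compute the incremented addresses than do the
hex arithmetic by hand, here is a minimal sketch.  The function name
mac_add and the sample address 08:00:20:aa:bb:ff are made up for
illustration.  It assumes a shell with 64-bit $(( )) arithmetic, such as
bash; the stock Bourne /bin/sh on Solaris does not have it.

```shell
# mac_add MAC N: print MAC with N added to its 48-bit value.
# Accepts octets with or without leading zeros (8:0:20:... also works).
mac_add() {
    mac=$1
    n=$2
    value=0
    old_ifs=$IFS
    IFS=:
    # Fold the six hex octets into one integer, most significant first.
    for octet in $mac; do
        value=$(( value * 256 + 0x$octet ))
    done
    IFS=$old_ifs
    # Add the offset and stay within 48 bits.
    value=$(( (value + n) & 0xffffffffffff ))
    printf '%02x:%02x:%02x:%02x:%02x:%02x\n' \
        $(( (value >> 40) & 255 )) $(( (value >> 32) & 255 )) \
        $(( (value >> 24) & 255 )) $(( (value >> 16) & 255 )) \
        $(( (value >> 8) & 255 )) $(( value & 255 ))
}

mac_add 08:00:20:aa:bb:ff 1   # prints 08:00:20:aa:bc:00
```

The result can then be handed to ifconfig, e.g.
ifconfig ge0 ether "$(mac_add 08:00:20:aa:bb:ff 1)"
Note that the carry out of the last octet is handled, which is the easy
thing to get wrong by hand.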

This solved the problem with UDP.  UDP is almost twice as fast as TCP for
our NFS application -- I'm now getting 40-45MB/sec reads on the Sun
machines, and I'm maxing out the gigabit cards on the Auspex, which in
this case is a very good thing.
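
To put those numbers next to the nominal 1000Mbps of a gigabit link, here
is a quick back-of-the-envelope conversion.  This is only a sketch: the
function names are made up, and it takes 1MB as 10^6 bytes, which matches
the 40MB/sec = 320Mbps arithmetic in the earlier summary.

```shell
# Convert a transfer rate in MB/sec to Mbps (8 bits per byte).
mb_to_mbps() {
    echo $(( $1 * 8 ))
}

# Whole-number percentage of a nominal 1000Mbps gigabit link.
pct_of_gige() {
    echo $(( $1 * 8 * 100 / 1000 ))
}

mb_to_mbps 45     # prints 360
pct_of_gige 45    # prints 36
```

By that arithmetic, 40-45MB/sec is 320-360Mbps, or roughly a third of the
raw line rate for a single Sun client.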

Additionally, someone finally sent me a URL to find "the new" TTCP:
http://www.leo.org/~elmar/nttcp/

Also, we have tweaked tcp_xmit_hiwat, tcp_recv_hiwat, udp_xmit_hiwat and
udp_recv_hiwat (all four set to 65535).  It makes a small difference in
speed.
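
For reference, the four settings can be written out as ndd commands.  The
sketch below only prints them, so they can be checked before being run as
root.  The helper name hiwat_cmds is made up, and the /dev/tcp and
/dev/udp driver paths are assumed to be the usual Solaris ones (the
original post only mentions /dev/udp).

```shell
# Print the ndd commands that set the TCP and UDP high-water marks
# to the given size.  Nothing is changed on the system.
hiwat_cmds() {
    size=$1
    for proto in tcp udp; do
        for dir in xmit recv; do
            echo "ndd -set /dev/$proto ${proto}_${dir}_hiwat $size"
        done
    done
}

hiwat_cmds 65535
```

As far as I know, ndd settings do not survive a reboot, so the printed
commands would normally also go into a startup script.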

Finally, regarding the idea that UltraSPARC II processors just aren't
powerful enough, I got one reply that someone has dual gigabit cards in a
4500 with 4x400MHz UltraSPARC II processors, and the throughput is
100MB/sec there.

Thanks again for all the info.  Solving the UDP problem has given us the
throughput we were looking for.  We will of course continue to tweak to
squeeze as much speed as possible out of these systems.

-Adam

On Wed, 7 Aug 2002, Adam and Christine Levin wrote:
> Ok, great huge thanks to the following for their extended email Q&A
> sessions:
> Bryan J. Smith <b.j.smith@ieee.org>
> "joe.fletcher@btclick.com" <joe.fletcher@btclick.com>
> David Foster <foster@dim.ucsd.edu>
> Kevin Buterbaugh <Kevin.Buterbaugh@lifeway.com>
>
> Many helpful comments ensued.  The problem has *not* been resolved.
>
> Some comments:
> 1) Even with good data transfer, there could be disk latency issues
> getting the data off the Auspex's disks.
>
> I don't see this as the problem, because three files to *one* Sun goes at
> 20MB/sec, but three files to *three* Suns go at 60MB/sec -- each Sun is
> pulling data at 20MB/sec.  If it were disk or file read latency issues,
> three files would take about the same amount of time regardless of how
> many clients were reading them (with proper nods to read caching, of
> course -- we're dealing with *different* files here).
>
> 2) An UltraSPARC II just can't push a GigE card very fast -- 8 way V880s
> are getting about 40MB/sec (320Mbps).
>
> Well, that may very well be, but the machines are *writing* at 35MB/sec --
> so they're able to send data out the GigE card faster than 20MB/sec.  The
> 900MHz Pentium III in the Auspex can push data out at 60MB/sec -- that's
> pretty darned good.
>
> 3) Jumbo frames!
>
> Ok, according to several white papers I've seen, jumbo frames could be a
> solution.  Apparently, using 9k frames instead of 1500 byte frames can
> increase throughput by a couple of hundred Mbps, and jumbo frames also
> lower the load on the CPU by at least a third, if not close to a half,
> which could solve #2 above, if that is indeed a problem.  However, Sun's
> Gigabit cards do not support jumbo frames.
>
> Also, I'm being told by Auspex that their tests have concluded that unless
> *all* of the equipment in the network chain is from the same manufacturer,
> jumbo frames are not reliable (they claim all equipment must be Alteon --
> cards and switches -- because that's what's in the Auspex).
>
> We are going to try to get some loaner cards from other manufacturers.
> Does anyone know the part number for Alteon's optical fibre Sun compatible
> PCI gigabit ethernet card?  Also, does anyone have experience with Antares
> cards?
>
> By the way, I've got two more Sun machines connected now, both running
> Solaris 8.  With five machines pulling data off the Auspex, it tops out at
> almost 70MB/sec -- that's 560Mbps, which is about the max for 1500 byte
> frames over gig-e, as far as I'm aware.  If I can get better performance
> out of the Suns, we'll be trunking the two gig-e interfaces on the Auspex
> together, and then we'll be golden.
>
> -Adam
>
> On Thu, 1 Aug 2002, Adam and Christine Levin wrote:
> > So we've got this brandy-nifty-new Auspex network attached monster, and
> > we're trying to squeeze performance out of it.
> >
> > Clients are three Enterprise E450s, 2GB memory, 4x400MHz CPU, using a Sun
> > GigabitEthernet 2.0 adapter.  Latest recommended patch cluster (105181-32)
> > and latest patches for the SUNWged drivers.  I've got /dev/udp tweaked via
> > ndd (recv/xmit hiwat are 64k).
> >
> > The filesystems are mirrored stripes -- no RAID 5.
> >
> > So I mount the filesystem read/write and I slam data onto it using
> > time dd if=/dev/zero of=foo bs=1024000 count=200
> > (Yes, I've tried varying the block size.)
> >
> > By default, NFS is TCP, and I get about 30MB/sec write (240Mbps).
> > Reading, I get 20MB/sec.  The 20MB/sec is *per Sun machine* -- each Sun is
> > reading simultaneously at 20MB/sec, so the Auspex is pumping out 60MB/sec
> > (480Mbps).  It almost seems like there's a hard receive limit or something
> > going on here.  The read test is:
> > time dd if=foo of=/dev/null bs=16384
> >
> > I've also tried of=/tmp/foo -- still 20MB/sec.
> >
> > When I mount proto=udp, I get a hair faster write times, but about the
> > same.  The read, however, is wonky.  Measuring the transfer rate each
> > second, I get:
> > 58800
> > 222306
> > 0
> > 2299388
> > 56349
> > 80234
> > 600234
> > 0
> > 500234
> > 0
> > ...
> >
> > There are two Alteon gigabit cards in the Auspex unit on different VLANs.
> > The default route on the Auspex is out the 100Mbit card (the gigabit cards
> > are on private, non-routable VLANs, the 100Mbit card is the management
> > address).  The primary gigabit card on the Auspex is on the same VLAN as
> > the gigabit cards on the Sun machines.
> >
> > The switch is a Cisco 6500.
> >
> > Any thoughts as to:
> > 1) Why I can only get 20MB/sec read when writing is faster?
> > 2) Why reading off the Auspex via UDP is wonky?
> >
> > Thanks much,
> > -Adam
> _______________________________________________
> sunmanagers mailing list
> sunmanagers@sunmanagers.org
> http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Thu Aug 8 10:40:00 2002
