Thanks to all (17) who responded.
The original problem:
A few bytes were being read incorrectly in large files accessed via NFS
through a 4/280 gateway. Further testing (after my post) showed that
accessing the files on the gateway itself (via ie1) also didn't work reliably.
Other network services through ie1 (rlogin, ftp, telnet) seemed OK.
The problem, as suggested by several people, was a bad ie1 ethernet board.
Using a 4/280 as a gateway for NFS traffic is (in general) a bad idea, which
I knew, but I expected poor performance, not mangled data.
Interesting things I learned:
1) UDP checksumming in the kernel is off by default. Because ethernet packets
have CRC checks in them, this should never cause a problem if the hardware
is working properly. In this case, turning UDP checksumming on caused SOME
of the errors to be caught, but not all. Because the errors caused by the
bad hardware were somewhat systematic, a simple checksum could not
catch them all. Turning on UDP checksumming is a good way to diagnose this
type of hardware problem, though. Turn it on (on all machines involved)
and use "netstat -s" to check if any UDP checksum errors are occurring.
There should not be any with working ethernet hardware.
2) TCP services such as rlogin, telnet, and ftp worked because TCP assumes that
everything below it is unreliable and does its own checksumming. Also, I
believe my problem only showed up under very heavy traffic, and
rlogin/telnet would not trigger it.
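To illustrate point 1, here is a minimal Python sketch of a 16-bit ones'
complement checksum of the kind UDP uses (written in the style of RFC 1071).
The corruption patterns are hypothetical, chosen only to show how systematic
errors can slip past a simple sum; the exact pattern produced by the bad board
is not known. The function and variable names are made up for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC])

# Hypothetical corruption: two 16-bit words swapped (sum is order-independent)
swapped = bytes([0x56, 0x78, 0x12, 0x34, 0x9A, 0xBC])

# Hypothetical corruption: one word off by +1, another off by -1
compensating = bytes([0x12, 0x35, 0x56, 0x77, 0x9A, 0xBC])

# A single flipped bit, which the checksum does catch
bit_flip = bytes([0x12, 0x34, 0x56, 0x79, 0x9A, 0xBC])

for name, payload in [("swapped", swapped),
                      ("compensating", compensating),
                      ("bit_flip", bit_flip)]:
    caught = internet_checksum(payload) != internet_checksum(original)
    print(f"{name}: {'caught' if caught else 'MISSED'}")
```

The swapped and compensating cases go undetected because the checksum is a
plain sum of words, so errors that cancel out or merely reorder data leave it
unchanged. Random single-bit damage is caught.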
University of Manitoba