SUMMARY: S2.2 network hangup

From: Sumner Hushing (hushing@gdwest.gd.com)
Date: Thu Sep 16 1993 - 15:41:02 CDT


Hi Sun Net Managers,

Thanks to all who responded! A recap is included below. It's not very
"summarized", but hopefully it's informative. Apologies for the
bandwidth hit.

I tried paraphrasing my original message a few times, but I kept
leaving out significant information, so here's the original:

>Hi Sun Net Managers,
>
>We're having trouble talking through our network with our new Sparc/10 and
>Sparc/LX Solaris 2.2 systems. Simple rlogin and ftp operations on small
>files work OK, but when we try to push any significant amount of information
>through the connection, there's a hangup.
>
>Our networking guru used a sniffer to be able to tell me that the packets
>originating from our 2.2 machines have the "Don't fragment" bit set,
>whereas packets from our SunOS 4.1.3 machines don't.
>
>It seems that the Cisco routers in the net have a lower max packet size
>than the Suns, and want to disassemble the big packets for transmission,
>then reassemble them at the other end, but won't because the set bit says
>not to. So instead the connection just hangs, because the big packets
>aren't getting through. The Suns have MTU 1500 set by default, and the
>problem seems to happen on packets bigger than 1200. We can reproduce
>the problem any time with "ping -s hostname 1300".
>
>I tried setting the MTU down to 1200 on the 2.2 machines, and lo and
>behold, the problem goes away. But then if a machine gets rebooted,
>the defaults take over again, and the trouble comes back, until I
>redo the MTU. This MTU fix doesn't seem like the right solution to me.
>I want the 2.2 machines to have the same packet format the 4.1.3 machines do.
>
>The fellow at the Sun help line seemed surprised, and referred me to the
>"mandatory" patch set, which I obtained through SunSolve online (a great
>service!), and installed last night, but there's no change in the symptoms.
>The fact that he was surprised that I was having this problem leads me to
>believe y'all did your installations differently than me, and it's just some
>silly pilot error during the install.
>
>Right?
>
>While I'm waiting for Sun to get back to me, anybody out there have
>an answer? Thanks in advance.

------------------------------------------------------------------------
Tim Smith <tgsmith@Sun.COM> pretty well covered the subject with the
following. It's fairly informative, so I included it verbatim:

>Is the cisco sending an ICMP Type 3 Code 4 message when it gets a
>packet it cannot fragment? ICMP Type 3 Code 4 means "destination
>unreachable, don't fragment (DF) bit set, and fragmentation required".
>
>The Suns are sending packets with the DF bit set in order to do IP
>Path MTU Discovery (RFC 1191). RFC 1191 specifies a method for hosts
>to determine the "Path MTU". The "Path MTU" is the largest MTU that
>can be used over a given path without any router in the path
>fragmenting packets. Using the Path MTU means your packets should get
>from point A to point B (which may be many router hops away) without
>having to be fragmented which should result in better performance. As
>of Solaris 2.0 Sun supports RFC 1191.
>
>RFC 1435 entitled, "IESG Advice from Experience with Path MTU
>Discovery" mentions a potential problem with an interaction between
>some commercial routers and systems implementing RFC 1191. The
>problem is that some routers can be configured to disable the sending of
>ICMP messages. The reason for disabling ICMP messages is that some
>older BSD hosts get very confused by them. The Path MTU algorithm
>depends on receiving ICMP Type 3 Code 4 messages from routers to
>work. If a router does not send the ICMP messages things will "fail in
>a silent and hard to diagnose way".
>
>Check with your network gurus (whoever configures the ciscos) and see
>if the ciscos are not sending ICMP Type 3 Code 4 messages. If the
>routers are not sending the ICMP messages you should try to find out
>why. It may be a bug or it may be they are configured to not send
>them. If the powers that be have disabled sending of ICMP messages on
>the ciscos you have two choices: you can ask them to reenable sending
>of ICMP messages by the routers or you can turn off the Path MTU
>discovery algorithm on the suns.
>
>It may be that the routers were configured not to send ICMP messages
>long ago and that there is no longer a need to suppress the sending of
>ICMP messages. Or it may be that someone configured the router to
>suppress ICMP messages because he heard that "ICMP messages may cause
>problems" and wanted to avoid a potential problem. So it may very
>well be perfectly safe to reenable the ICMP messages on your network.
>Or it may be that you still have some ancient BSD hosts around that
>will fall over dead if they see ICMP messages.
>
>IP Path MTU discovery is a good thing. In general you should try to
>use it. But if you have to turn it off, you can. There is a kernel
>variable named ip_path_mtu_discovery; setting the variable to 0 will
>disable the IP Path MTU discovery algorithm. You can put "set
>ip_path_mtu_discovery=0" in /etc/system and reboot and the IP Path MTU
>discovery algorithm will be disabled.
>
>Before you turn off the IP Path MTU discovery code *please* first try
>to get the routers fixed/reconfigured! If everyone used the optimal
>MTU and all of Van Jacobson's TCP tweaks the world of networking would
>be a happier place.
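
As a rough illustration of the mechanism Tim describes, here is a toy
shell model of Path MTU Discovery. It is a sketch under assumptions, not
real network code: the function name discover_pmtu and its arguments are
made up for this example, and it assumes each ICMP Type 3 Code 4 message
carries the next-hop MTU so the sender can shrink its estimate in one
step. It shows how the same path either converges or stalls depending on
whether the ICMP messages come back.

```shell
#!/bin/sh
# Toy model of RFC 1191 Path MTU Discovery, NOT real network code.
# discover_pmtu START_MTU ICMP_ENABLED LINK_MTU...
#   START_MTU     MTU of the sender's own interface (e.g. 1500)
#   ICMP_ENABLED  "yes" if hops report ICMP Type 3 Code 4, anything else if not
#   LINK_MTU...   MTU of each link along the path, in order
# Prints the discovered Path MTU, or "stalled" if a DF packet is dropped
# without any ICMP message reaching the sender.
discover_pmtu() {
    pmtu=$1
    icmp=$2
    shift 2
    for link_mtu in "$@"; do
        if [ "$pmtu" -gt "$link_mtu" ]; then
            # The DF packet is too big for this link and is dropped.
            if [ "$icmp" = yes ]; then
                # Assumption: the ICMP message carries the next-hop MTU,
                # so the sender lowers its estimate and resends.
                pmtu=$link_mtu
            else
                # No ICMP comes back: the sender keeps retrying at full size.
                echo stalled
                return 0
            fi
        fi
    done
    echo "$pmtu"
}

discover_pmtu 1500 yes 1500 1268 1500   # prints 1268
discover_pmtu 1500 no  1500 1268 1500   # prints stalled
```

The second call is the "fail in a silent and hard to diagnose way" case:
nothing tells the sender to use smaller packets, so the connection hangs.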

I left a message for our networking folks, asking if they could reprogram
our routers to supply this ICMP message. In the meantime, the following
suggestions came in.

------------------------------------------------------------------------
Stefan Mochnacki <stefan@centaur.astro.utoronto.ca> also pointed at the
routers rather than at a fix on the Sun end:

>We had the same sort of problem with PCROUTErs running a SLIP link...
>...The fix was to hack PCROUTE to increase the SLIP MTU to over 1500.
>I think you must get your CISCO's updated ASAP.

At this point, I was still waiting expectantly for a call back from the
networking group.

------------------------------------------------------------------------
Dan Stromberg <strombrg@hydra.acs.uci.edu> mentioned that his Cisco
routers weren't giving him any trouble.

This sent me asking more questions. About the same time, our
networking guru called back, and that's when I found out about the
lower MTU limit on our Apollo token ring. It turns out that one part of
our network is a token ring with an MTU of 1268. Larger packets
can't pass directly through that token ring; they need to be
fragmented into smaller packets for transmission, then reassembled on
the other end, but the DF bit says that's not allowed. The token ring
apparently doesn't know how to generate ICMP messages.
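
The numbers line up with the symptoms. The sketch below works through
the arithmetic, assuming the size given to ping counts only the ICMP
payload, that the 1268 figure is the IP MTU of the token ring, and that
the packets carry no IP options. The function names are made up for
illustration.

```shell
#!/bin/sh
# Size of the IP datagram carrying an ICMP echo with the given payload:
# payload + 8-byte ICMP header + 20-byte IP header (assumes no IP options).
echo_datagram_size() {
    echo $(( $1 + 8 + 20 ))
}

# fits_link PAYLOAD LINK_MTU: can the echo cross the link unfragmented?
fits_link() {
    if [ "$(echo_datagram_size "$1")" -le "$2" ]; then
        echo fits
    else
        echo too-big
    fi
}

fits_link 1300 1268   # prints too-big (1328 > 1268)
fits_link 1200 1268   # prints fits    (1228 <= 1268)
```

Under the same assumptions, a host whose interface MTU is set to 1200
never emits a datagram larger than 1200 bytes, which is below the 1268
limit, so nothing it sends needs fragmenting on the token ring.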

Dan also wanted to know how we tested for the DF bit. The answer is,
when the network guy came by, he brought a network sniffer which decoded
that info in plain English for us. I'll bet with a quick look at snoop
options, and some clever examination of the networking include files,
one could determine the same info.
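
For anyone decoding a capture by hand, here is a minimal sketch of the
check itself. It assumes you have already read the 16-bit
flags/fragment-offset word (bytes 6 and 7 of the IPv4 header, counting
from 0) off a hex dump from a sniffer; how you get that dump is left
open. The function name df_bit_set is made up for this example.

```shell
#!/bin/sh
# df_bit_set FIELD: FIELD is the 16-bit flags/fragment-offset word of an
# IPv4 header (bytes 6 and 7, counting from 0) as four hex digits.
# Bit 0x4000 is "Don't Fragment"; bit 0x2000 is "More Fragments".
df_bit_set() {
    word=$(( 0x$1 ))
    if [ $(( word & 0x4000 )) -ne 0 ]; then
        echo yes
    else
        echo no
    fi
}

df_bit_set 4000   # prints yes: DF set, fragment offset 0
df_bit_set 0000   # prints no:  DF clear
df_bit_set 2000   # prints no:  More Fragments only
```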

------------------------------------------------------------------------
Blaine McFadden <Blaine.McFadden@Corp.Sun.COM> (my Sun Support contact
for this problem) suggested turning off the DF bit with the following
command:

    ndd -set /dev/ip ip_path_mtu_discovery 0

I tried this, and tried Tim's "last ditch" suggestion of inserting the
following line into /etc/system:

    set ip_path_mtu_discovery=0

and rebooting. Neither method solved the problem. This really confused me.

What DID work was permanently setting the MTU down to 1200 on all my
2.2 machines, by inserting the following line in the network init file
/etc/rc2.d/S72inetsvc:

    /usr/sbin/ifconfig -au mtu 1200

This is a kludge at best, as far as I'm concerned, but I'll take what
works for now. If I need to send traffic through a network segment
with a smaller MTU than 1200, I'll have trouble again.

------------------------------------------------------------------------
Christopher Hoover <ch@lks.csi.com> wanted to know more about the
"mandatory" patch set. The answer is that Blaine told me about it just
recently, while we were working this problem, and I passed the list I
received on to Chris. The list, which may already be obsolete, contains:

100982-02 100985-04 100992-02 100999-22 101014-04 101018-02 101025-05
101031-01 101039-06 101090-01 101109-02

------------------------------------------------------------------------

Well, that's where we stand for now. Additional comments/suggestions
welcome.

+-----------------------+----------------------------+-----------------+
| Sumner K Hushing III  | GENERAL DYNAMICS           | This space      |
| hushing@gdwest.gd.COM | Space Systems Division     | intentionally   |
| 619-547-5791          | PO Box 85990, MZ 43-8660   | left blank      |
| 619-547-4542 (lab)    | San Diego, CA 92186-5990   |                 |
+-----------------------+----------------------------+-----------------+


