SUMMARY and update: hanging POP connections

From: Don Elrod <dre_at_sirius.utc.edu>
Date: Fri Mar 29 2002 - 16:39:16 EST

Thanks to:
Brian Dunbar,
Karl Vogel,
Martin Hepworth,
Matthew Boeckman,
Scott Hollatz, and
Tim Chipman
for their responses.

Suggestions included:
1.  upgrade to Solaris 8,
2.  replace sendmail with postfix or qmail,
3.  replace ipop3d with qpopper or Courier,
4.  turn on priority paging and check the system for signs that other 
system parameters should be altered,
5.  run virtual_adrian.se and Orca to monitor the system,
6.  split the smtp, pop, and web services among several servers and/or upgrade
the mail server hardware,
7.  use a sniffer to find out what's happening on the network side,
8.  use a tool such as MRTG (Multi Router Traffic Grapher at 
http://people.ee.ethz.ch/~oetiker/webtools/mrtg/) to monitor the network,
9.  use snoop to find out what's happening on ports 25 and 110 from the 
server side.
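As a rough companion to suggestion 9, a tally of TCP states on ports 25 and 110 gives a quick server-side picture before reaching for snoop. The sketch below is not something any respondent supplied. It assumes Solaris-style `netstat -an` output, in which the local address is the first column with the port after the last dot and the connection state is the last column. The function name `tally_states` and the sample addresses are made up for illustration.

```python
from collections import Counter

# Ports named in suggestion 9: 25 (smtp) and 110 (pop3).
WATCHED_PORTS = {"25", "110"}


def tally_states(lines, ports=WATCHED_PORTS):
    """Count (port, state) pairs in netstat-style text.

    Assumption: each connection line starts with a local address of the
    form host.port and ends with the TCP state. Header and separator
    lines are skipped because they never yield a watched port.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        port = fields[0].rsplit(".", 1)[-1]  # text after the last dot
        if port in ports:
            counts[(port, fields[-1])] += 1
    return counts


# Hypothetical sample, not output from the server discussed here.
SAMPLE = """\
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
      *.25                 *.*                0      0 24576      0 LISTEN
192.0.2.10.110       198.51.100.7.40112    8760      0 24820      0 ESTABLISHED
192.0.2.10.110       198.51.100.8.40200    8760      0 24820      0 CLOSE_WAIT
192.0.2.10.110       198.51.100.9.40311    8760      0 24820      0 CLOSE_WAIT
192.0.2.10.25        203.0.113.5.51234     8760      0 24820      0 TIME_WAIT
192.0.2.10.22        203.0.113.5.51300     8760      0 24820      0 ESTABLISHED
"""

for (port, state), n in sorted(tally_states(SAMPLE.splitlines()).items()):
    print(f"port {port:>3}  {state:<12} {n}")
```

Run periodically against real `netstat -an` output, a growing count of CLOSE_WAIT entries on port 110 would mean the remote side has closed while the local process has not, which would point more toward the server than the network.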

Karl Vogel also recommended the following sources of information:

http://Web.InfoAve.Net/~dsill/lwq.html, a qmail installation and usage guide.

http://www.sun.com/sun-on-net/performance/priority_paging.html

http://docs.sun.com/ab2/coll.709.2/SOLTUNEPARAMREF/,
Overview of Solaris System Tuning

http://sunsolve.sun.com/private-cgi/retrieve.pl?type=2&doc=stb/1442
White Papers/Tech Bulletins 1442
Delivering Performance on Sun: System Tuning
Greg Schmitz and Allan Esclamado
30-Apr-1999

http://www.carumba.com/talk/random/tuning-solaris-checkpoint.txt
Tuning Solaris for FireWall-1
Rob Thomas robt@cymru.com
14 Aug 2000

In addition, we referred the problem to Sun's sendmail group and, on 
request, sent them output from netstat and snoop commands.  They found that
1.  either clients or the router were resetting sendmail connections too 
quickly, and
2.  sendmail was taking a long time to respond due to network load.
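One way to watch for the slow responses described in the second finding is to time the service greeting separately from the TCP connect. The sketch below is my own illustration, not Sun's method. It relies only on the fact that SMTP and POP3 servers send a greeting line as soon as a connection is accepted. The function name `time_greeting` is hypothetical.

```python
import socket
import time


def time_greeting(host, port, timeout=30.0):
    """Return (connect_seconds, greeting_seconds, first_line).

    connect_seconds covers the TCP handshake only; greeting_seconds is
    the wait from the completed handshake until the first greeting line
    (for example "220 ..." from smtp or "+OK ..." from pop3) arrives.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.monotonic()
        data = b""
        # Read until at least one complete line has been received.
        while not data.endswith(b"\n"):
            chunk = sock.recv(512)
            if not chunk:  # peer closed before sending a full line
                break
            data += chunk
        greeted = time.monotonic()
    lines = data.decode("ascii", "replace").splitlines()
    first_line = lines[0] if lines else ""
    return connected - start, greeted - connected, first_line
```

Calling `time_greeting("localhost", 110)` on the server itself and then the same call from a client machine against the server's name gives two pairs of timings to compare. Roughly, a slow connect suggests trouble before the daemon is involved, while a fast connect followed by a slow greeting suggests the daemon is the one lagging. A timeout raises an exception, which is itself a useful data point to log.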

They also suggested that smtp, pop, and web services be put onto different 
servers so that the servers could be tuned to support their primary functions.

About the time I wrote my original e-mail message, one or more users used 
up 100% of the CPU time on the border router and caused the router to crash 
several times.  In response, Network Services placed limitations on the 
use of file sharing programs.  We have experienced relatively few problems 
with hung sendmail or POP connections since those limitations were put into 
place.

 >Dear Managers,

 >Over the last several weeks, we have been having problems off and on with
 >tcp connections hanging on our Sun faculty/staff mail and web server. POP
 >and sendmail seem to hang most often; but ftp and telnet connections are
 >also affected. If I truss one of the POP processes that has hung, truss
 >reports that it is sleeping. sendmail is writing far more "Broken pipe"
 >messages than usual to /var/adm/messages.

 >The IT staff has been asked to come up with suggestions on how to improve
 >performance by tuning, upgrading, or replacing the existing server hardware
 >and software. Various Sun and non-Sun options are under consideration. At
 >the moment, the favored ones seem to be to replace the Sun E3500 with two
 >or more Linux boxes and split up the Web and mail functions and to replace
 >sendmail with Novell Internet Messaging System. If any of you have
 >experience with running multi-user mail servers on Red Hat Linux or with
 >NIMS on any platform, I would appreciate hearing your opinions, pro and con.
 >There is some indication that the problem may really lie in the network
 >being periodically overwhelmed with traffic rather than with the
 >server. Is it possible to determine from the server side whether the
 >problem lies with the server or the network? Can the server be tuned to
 >compensate for overtaxed network hardware?

Don Elrod
University of Tennessee at Chattanooga
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Fri Mar 29 15:40:34 2002
