SUMMARY: sendmail 5.64 & SunOS 4.1

From: Ed Anselmo (anselmo-ed@cs.yale.edu)
Date: Wed Oct 10 1990 - 09:15:29 CDT


I reported problems with hung inbound SMTP connections to a machine
running SunOS 4.1 + patches and sendmail 5.64 + IDA. Sendmail is
running with frozen config files, and using a tried-and-true, albeit
ancient sendmail.cf file (not an IDA sendmail.cf) which runs fine on
over a dozen other mail machines here at Yale CS. There is a named
(bind 4.8.3) running on the machine, and libc.so has had the bind
4.8.3 resolver routines integrated into it (i.e., no NIS/YP).

The hung sendmail processes left no qf* or df* files in
/usr/spool/mqueue. "netstat" showed no smtp connections open. "pstat -s"
showed lots of swap space available. The hung sendmails came from a variety
of hosts (Suns, IBM-RTs, Vaxen, Sequent, Multimax, etc.)

I attached to the hung processes with trace -p, but apparently the
processes were just looping -- I never saw any output from "trace". I
never got around to looking at a core dump (SIGQUIT is ignored).

As suggested by Havard Eidnes <he@spurv.runit.sintef.no>, the solution
seems to be to use a different malloc. I linked sendmail with the GNU
malloc (from the GNU emacs distribution), patched for the localtime()
bug in SunOS 4.1 and have seen no hung sendmail processes in 3 days
now. Paul Pomes <paul@uxc.cso.uiuc.edu>, reports that replacing the
vendor-supplied malloc seems to have cured some problems he's seen
with sendmail 5.65 on a variety of architectures.

Some of the replies I received are included below (some are slightly
edited).

Many thanks to those who took the time to respond:

Steven Blair <sblair@synoptics.com>
aef@sjosu1.sinet.slb.com (Art Feather - SINet - 1.408.437.5373)
pjw@usna.navy.mil (Peter J. Welcher)
del@mlb.semi.harris.com (Don Lewis)
"Mark D. Baushke" <mdb@ESD.3Com.COM>
Havard Eidnes <he@spurv.runit.sintef.no>
"Paul Pomes, UofIllinois-CSO" <paul@uxc.cso.uiuc.edu>
allegra!mp@ucbvax.Berkeley.EDU (Mark Plotnick)

>>> inbox:275

Date: Thu, 04 Oct 90 09:45:42 PDT
From: Steven Blair <sblair@synoptics.com>
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

you might want to investigate turning on debug output. It could be a useful
thing to do. On another note, you could check your sendmail.cf by
running sendmail in the test mode, and see if some ruleset is being
mangled at data handling time. Perhaps the best thing might be stepping
back to 5.61_patches, or using something like Sabre on it whilst running.

good luck, we're staying on 5.61 for the time being. It could also
be a problem with DNS, or some weird permutation of permissions
problems...

scb

>>> inbox:278

Date: Thu, 04 Oct 90 13:10:07 EDT
From: pjw@usna.navy.mil (Peter J. Welcher)
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

This probably isn't related, but here goes:

I had some strange email problems that I finally traced down
to swap space. When I locked the screen (default configured
SPARC 1+), there was no longer enough memory/swap space.
Not only would sendmail keep crashing, but we'd get bouncing mail
messages. Since I'm postmaster, each such message would
generate more messages about failure to deliver it. This
would drive the user load up on the mailhost something fierce
(so I couldn't log on even), not to mention filling up
the /var partition, causing other interesting problems.

>>> inbox:279

Date: Thu, 04 Oct 90 13:20:11 EDT
From: del@mlb.semi.harris.com (Don Lewis)
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

If you can get a core file (with gcore or kill -something), you could
dig around with the debugger and find out where it is hung.

Don "Truck" Lewis Harris Semiconductor
Internet: del@mlb.semi.harris.com PO Box 883 MS 62A-028
Phone: (407) 729-5205 Melbourne, FL 32901

>>> inbox:285

Date: Thu, 04 Oct 90 22:42:53 PDT
From: "Mark D. Baushke" <mdb@ESD.3Com.COM>
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

I have been running 5.64+IDA-1.3.4 (+ a few fixes) on my SunOS 4.1
Sun-4/60 (local disk is a WREN 5, kernel is GENERIC -- yes I know I
should customize it) with frozen configuration file with no hangs yet.

I am also running a version of 5.64+IDA on another SunOS 4.1 SS1+
machine with two ethernets. It also has no problems. It is not running
NIS.

I build with 'make dsendmail' using the conf.h and Makefile given
after my .signature [DELETED - ed.]. The -lresolv library is stock
SunOS 4.1. The machine has a shared resolver based library, a caching
DNS server (stock SunOS in.named) and an xntpd process. The machine
also runs as an NIS client.

Note: I have been told that there are cache problems using the "-b"
switch for NIS to provide DNS names under SunOS 4.1. I would recommend
building and running a real nameserver over use of the NIS hosts map.

Could you tell me more about your environment? Are the incoming hangs
always from a local subnet? Are they always from a particular machine?

-- 
Mark D. Baushke
mdb@ESD.3Com.COM

--------------- additional posting about sendmail and SunOS 4.1 ---------------

From: jaysun@CS.CLEMSON.EDU
Newsgroups: ba3com.inet.sun.managers
Subject: Sun sendmail and SunOS 4.1
Date: 4 Oct 90 15:39:44 GMT

The following message was posted about IDA sendmail. This is NOT an
answer or response to that, as we are not supposed to do that on this
mailing list, but it is an addition which I thought should be posted...

We are running SunOS 4.1 and our mail server is a 4/280 running Sun's
sendmail. We are using NIS with the "-b" hook for DNS. I am seeing 1-2
hangs a week that put a large load on the server. It has always hung
while sending to the same user and site, but it does not hang every
time it's done. The ps output is very similar to below. (Not the same
site, of course.)

Does anyone have any insight on this problem? It never happened in OS
4.0.3, so would it be considered an OS problem?

Jay Williamson              Clemson University
Systems Manager             Computer Science Dept.
jaysun@cs.clemson.edu       South Palmetto Blvd.
(803) 656-5884              29634-1906

>>> inbox:291

Date: Fri, 05 Oct 90 08:39:33 PDT
From: Mark D. Baushke <mdb@ESD.3Com.COM>
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

It sounds very much like a problem we once had with sendmail swamping the nameserver with requests.

Please try

    renice -1 `cat /etc/named.pid`

to give the nameserver process a higher priority than sendmail. If
quasi-eli is not running a caching nameserver, I would suggest that you
set one up. Again, have it running at a higher priority than the
sendmail processes.

Please let me know if this helps,
-- Mark

>>> inbox:300

Date: Sat, 06 Oct 90 19:21:57 CDT
From: rickert@cs.niu.edu
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

This is all a long shot. I just reread the first message in which you mentioned the sendmail was stuck in the data state, but the remote sendmail thinks it has delivered.

What do you see with 'netstat'? Does it show the communication link apparently open? If so, could there be contributing networking problems?

If the link is closed, have you looked for queue files? One would hope
that at least the 'q' file is there, even if the 'd' isn't. If the queue
files are present, I would examine all header and envelope addresses in
the 'q' file and run them through 'sendmail -bt' to see if they are
causing problems such as rewrite loops. I don't think that is possible
with the distributed rulesets, but subtle bugs always creep in.

Do the stuck messages always come from the same source or the same relay?

-Neil Rickert

>>> inbox:307

Date: Sun, 07 Oct 90 15:27:44 BST
From: Havard Eidnes <he@spurv.runit.sintef.no>
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

This sounds like the SunOS 4.1 localtime bug. The sendmail is looping inside the default exit handler, apparently repeatedly calling itself. Installing sendmail with the GNU Emacs fixed-for-SunOS-4.1-localtime malloc.c, compiled with -DSUNOS_LOCALTIME_BUG will probably fix the problem. Note that you also need the GNU Emacs getpagesize.h file to compile GNU malloc.c.

- Havard

>>> inbox:317

Date: Mon, 08 Oct 90 09:52:59 PDT
From: Mark D. Baushke <mdb@ESD.3Com.COM>
Subject: Re: [Havard Eidnes: Re: sendmail 5.64 and SunOS 4.1]
________________

On Mon, 08 Oct 90 12:21:22 -0400, Ed Anselmo <anselmo-ed@CS.YALE.EDU> said:

Ed> I received this message yesterday. I recompiled sendmail 5.64 using
Ed> the GNU emacs malloc (appropriately patched). No hung sendmail's in
Ed> the past 24 hrs. but only time will tell....

Ed> -- Ed

Interesting...

The Sun malloc() is supposed to be allocating memory on 16 byte boundaries while the unfixed GNU malloc() allocated memory on 8 byte boundaries and the fixed GNU malloc() on 16 byte boundaries.

The localtime() bug is said to scribble on the ninth byte of an eight byte structure which is why fixing the GNU malloc() to use 16 byte boundaries OR specifying '#define SYSTEM_MALLOC' to fix the GNU Emacs problem worked.

I do not see how this could be your problem unless you were previously linking with the unfixed GNU Emacs malloc().

If it matters, my sendmail was built using the Sun C compiler with the -Bstatic option and the Sun libresolv.a (-lresolv) library.

-- Mark

>>> inbox:322

Date: Mon, 08 Oct 90 12:14:54 EDT
To: Mark D. Baushke <mdb@esd.3com.com>
From: "Paul Pomes, UofIllinois-CSO" <paul@uxc.cso.uiuc.edu>
Subject: Re: [Havard Eidnes: Re: sendmail 5.64 and SunOS 4.1]
________________

I've been having interesting results with the latest 5.65 sendmail on the various platforms at UIUC. Just about all the non-VAX machines have trouble using sendmail.fc files (either with -bd, -bt, or -bi) that disappear when I use the malloc.c from 4.3 BSD-tahoe. The IBM RS6000/540 has been the lone exception to that. It still dumps core with -bi, however a stack trace shows that it's still using the system supplied malloc. My other problems with AIX have me leaning towards the .308 Winchester solution but that's another story.

More details later.

/pbp

>>> inbox:342

Date: Mon, 08 Oct 90 09:52:23 EDT
From: allegra!mp@ucbvax.Berkeley.EDU (Mark Plotnick)
Subject: Re: sendmail 5.64 and SunOS 4.1
________________

could you run 'trace -p' and see where it's looping? actually, since you have source, you can probably use dbx to attach to it.


