SUMMARY: sendmail.mx and nameserver problem at boot-time

From: Roar Smith (lmdrsm@ludvig.ericsson.se)
Date: Fri Sep 25 1992 - 06:26:23 CDT


The problem I had with Sun's sendmail.mx and the nameserver (BIND) at
boot-time has been solved (solution and explanation follow).

The basic problem:
|> HW: Sun 4/390
|> OS: SunOS 4.1.1
|> IP: 192.66.7.1
|>
|> The host is our mailserver as well as NIS master and secondary
|> nameserver for our domain (ericsson.se). The sendmail daemon used is
|> the standard MX sendmail from Sun i.e. /usr/lib/sendmail.mx, and the
|> nameserver daemon is also the one provided by Sun.
|>
|> When I start the sendmail daemon manually it works just fine and will
|> present itself with the fully qualified hostname (i.e.
|> "ludvig.ericsson.se"), BUT at boot-time the sendmail daemon somehow
|> manages to overlook or miss the nameserver information and will
|> consequently present itself without the domainname (i.e. as "ludvig").
|> I believe this is because "sendmail.mx" cannot contact the nameserver
|> at boot-time.
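
The symptom (a greeting name without the domain part) is easy to test for in a script. Below is a minimal sketch, assuming a POSIX-style shell; the function name is_fqdn is made up for illustration and is not part of SunOS or sendmail. It only checks that the name contains a dot with non-empty parts on both sides, nothing more.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given name looks fully qualified,
# i.e. it contains at least one dot and no empty labels.
is_fqdn() {
    case "$1" in
        .*|*.|*..*) return 1 ;;   # leading/trailing dot or empty label
        *.*)        return 0 ;;   # at least one dot: treat as qualified
        *)          return 1 ;;   # bare hostname such as "ludvig"
    esac
}
```

For example, "is_fqdn ludvig.ericsson.se" succeeds while "is_fqdn ludvig" fails.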

I got responses from the following people:
        woods@claven.UCAR.EDU (Greg Woods)
        andersa@Riga.DoCS.UU.SE (Anders Andersson)
        rob@msc.cornell.edu (Rob Vaughn)
        ckd@eff.org (Christopher Davis)
        pgreen@aoc.nrao.edu (Philip Green)
        per@erix.ericsson.se (Per Hedeland)
        vato@csv.warwick.ac.uk (Ian Dickinson)
Thank you very much for your responses!

Some of the noteworthy points:

        andersa@Riga.DoCS.UU.SE (Anders Andersson) writes:
        |> Do you run NIS in conjunction with DNS (-b in
        |> /var/yp/Makefile)? If so, and you need some addresses before
        |> named has started up properly, you might enter them in your NIS
        |> hosts map as well.
        |>
        |> In /etc/rc.local (or corresponding files), named should be
        |> started before sendmail. It's my impression that named needs
        |> some time to configure properly, especially if you list a lot
        |> of zones in named.boot, and thus inserting a "sleep 30"
        |> (without trailing "&") in /etc/rc.local immediately before
        |> sendmail just might have impact.
        My comments:
        Yes I do run NIS in conjunction with DNS and use the "-b" flag in
        /var/yp/Makefile, and I also have listed the name/address of the
        nameserver in NIS.
        The second point about configuration time for "named" is spot on!
        This is basically THE problem that needed to be solved/circumvented.

        rob@msc.cornell.edu (Rob Vaughn) writes:
        |> The order in which you start the services is very important for a
        |> reboot to work. We do the following:
        |>
        |> 1. Very early in the rc.local script we use the 'hostname' command
        |> to give our machine a fully qualified host name. This isn't
        |> possible with all of our machines (although we'd like it)
        |> since FQDN seems to break things, but for the server it's a
        |> must.
        |>
        |> 2. The next thing we start up (again, *very* early in the script,
        |> right after config'ing network cards) is the name service,
        |> since it's needed by everything.
        |>
        |> 3. Much, much later in the script (near the end) we start sendmail
        |> (we use v5.6.5 + IDA) since this isn't needed by anything
        |> until after the entire system is up. We start news services
        |> at this time as well.
        |>
        |> Lastly, you can force sendmail to use a FQDN always (this is
        |> also recommended) by modifying the sendmail.cf file to make
        |> sure the $w variable is FQDN and the $j is set to the same
        |> thing (i.e. 'Dj$w')
        My comments:
        1. Giving the mailserver a FQDN with the "hostname" command would
           (probably, but I haven't tested it) make sendmail.mx work even
           at boot-time, but there are some problems involved in this that
           I don't have the time to tackle right now !
        2. Sounds reasonable - I will try it someday :-)
        3. This *might* be good enough, but I chose another solution.
        4. Forcing sendmail to use the FQDN. Well, in the Sun version this can
           only be done by setting the $w macro like this in sendmail.cf:
                   # Our domain is ericsson.se
                   Dmericsson.se
                   # Hard-code hostname to FQDN
                   Dwludvig.$m
           Now I *really* would hate to have to hard-code the hostname into
           the sendmail configuration; I would much rather get this
           information from DNS!

        ckd@eff.org (Christopher Davis) writes:
        |> Did you actually move or link sendmail.mx to sendmail? The
        |> boot startup will run /usr/lib/sendmail unless you change the
        |> rc files (which you don't want to do, since a lot of stuff will
        |> have /usr/lib/sendmail hardwired in other places). Of course
        |> if you start it by hand with /usr/lib/sendmail.mx -bd -q30m or
        |> the like it will work...
        Very good point, but I already thought of that (Ain't I clever :-)
        Since all our clients send mail through the mailserver with the local
        hostalias "mailhost" they shouldn't and don't run the MX version of
        sendmail, since that version is *not* able to find the host "mailhost"
        in DNS (and it doesn't look in the NIS map hosts.byname). But some of
        these clients are also diskless NFS clients of the mailserver - thus
        using exactly the same /usr/lib/sendmail as the mailserver. Now this is
        a problem! The solution is to let /usr/lib/sendmail be a shell-script
        which calls either /usr/lib/sendmail.mx or /usr/lib/sendmail.nomx
        depending on whether it is run on a client or on the host with alias
        "mailhost". I can mail you this script if you want to see how it can
        be done.
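
A minimal sketch of such a wrapper follows. It is not the actual script in use here, only an illustration of the idea. The MAILSERVER variable and the pick_sendmail function are assumptions made up for this example; the real script may decide which host it is running on in a different way (e.g. by looking up the "mailhost" alias).

```shell
#!/bin/sh
# Hypothetical wrapper sketch for /usr/lib/sendmail.
# MAILSERVER is an assumed setting: the short hostname of the mail server.
MAILSERVER=${MAILSERVER:-ludvig}

# Print the path of the sendmail binary that the given host should run.
pick_sendmail() {
    # Strip any domain part so "ludvig.ericsson.se" matches "ludvig".
    short=`echo "$1" | sed 's/\..*$//'`
    if [ "$short" = "$MAILSERVER" ]; then
        echo /usr/lib/sendmail.mx      # mail server: MX-aware binary
    else
        echo /usr/lib/sendmail.nomx    # client: plain binary
    fi
}

# In a real wrapper the last lines would be something like:
#   host=`hostname`
#   exec `pick_sendmail "$host"` "$@"
```

The selection logic is kept in a function so that it can be tried out by hand with different hostnames before the wrapper is installed as /usr/lib/sendmail.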
        
        per@erix.ericsson.se (Per Hedeland) writes:
        |> Quite correct, and all the pertinent information is in the
        |> above. The problem is caused by the combination of the first
        |> nameserver in /etc/resolv.conf being on the same host and being
        |> (secondary) server for a fairly large set of zones (some 9000
        |> records total), and problems/bugs with the "less-than-recent"
        |> version of named/resolver that Sun ships.
        |>
        |> Specifically, it is the initial gethostbyname(gethostname())
        |> lookup done by sendmail.mx that fails, and it does this because
        |> a) named loads the (backup) zone files *before* opening its
        |> socket (i.e. connections during the load time are refused
        |> rather than suspended/"timeouted") and b) the resolver will not
        |> try subsequent servers from /etc/resolv.conf if it gets a
        |> "connection refused" when trying e.g. the first one.
        |>
        |> Both of these problems are fixed in BIND 4.8.3, released some
        |> two years ago, I believe. There is actually a patch from Sun
        |> for the second problem (100465-01), but it is of little use in
        |> this case since it only contains a new version of res_send.o,
        |> which is of course statically linked into those SunOS programs
        |> (sendmail.mx and ypserv) that use it.
        My comments:
        I'll take your word for it :-)
        Seriously though, this *does* account for the race problem at startup.
        BTW: hooray, yeah yeah Sun - well done, only two years behind ;->

The solution:
Since the problem with Sun's version of "named" is Brain Dead(TM), my solution
is also pretty brain dead.
The BIND daemon "in.named" needs time to configure itself so let's give it
plenty of time before starting sendmail, like this:

        In /etc/rc.local the following lines:
        |> if [ -f /usr/lib/sendmail -a -f /etc/sendmail.cf ]; then
        |>         (cd /var/spool/mqueue; rm -f nf* lf*)
        |>         /usr/lib/sendmail -bd -q15m
        |>         echo -n ' sendmail'
        |> fi
        were changed to:
        |> if [ -f /usr/lib/sendmail -a -f /etc/sendmail.cf ]; then
        |>         (cd /var/spool/mqueue; rm -f nf* lf*)
        |>         (sleep 30; /usr/lib/sendmail -bd -q15m) &
        |>         echo -n ' sendmail'
        |> fi
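
A fixed delay is only a guess at how long "named" needs. A variation on the same idea is to retry a probe until the nameserver answers, and give up after a bounded number of attempts. The sketch below is untested on SunOS 4.1.1 and assumes a POSIX-style shell (it uses $(( )) arithmetic, which the old /bin/sh lacks). The names wait_for and probe_named are made up for illustration; probe_named stands for whatever command checks the local nameserver at your site.

```shell
#!/bin/sh
# Run a probe command until it succeeds, giving up after a fixed number
# of attempts. Usage: wait_for TRIES DELAY COMMAND [ARGS...]
wait_for() {
    tries=$1
    delay=$2
    shift; shift
    while [ "$tries" -gt 0 ]; do
        # Success: the probe answered, so stop waiting.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    # All attempts used up without a successful probe.
    return 1
}

# Hypothetical use in /etc/rc.local (probe_named is an assumed command):
#   (wait_for 30 2 probe_named; /usr/lib/sendmail -bd -q15m) &
```

As with the plain "sleep 30", sendmail is started in any case once the waiting is over, so a dead nameserver delays the daemon by about a minute at most instead of blocking it forever.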

The Real Solution(TM) would of course be to get a *proper* version of BIND, but
I invoke the System Manager's default answer: *I don't have the time right now!*

I'm new to the News community, but this system really *works*!
It really can help you solve your problems, so I guess that kinda makes up
for the *huge* amount of time I spend reading News :-)

      ######   Roar Smith, M.Sc.EE.         ***  Organization:
     #         Coordination, UNIX Network   ***  L.M. Ericsson A/S
UNIX #         Phone: +45 3388 3577         ***  Sluseholmen 8
    # #        FAX:   +45 3388 3134         ***  DK-1790 Kobenhavn V
     # #       MEMO:  LMD.LMDRSM            ***  Denmark
      #        Email: lmdrsm@ludvig.ericsson.se
+---------------------------------------------------------------------+
! Double-click (HERE) and you might already have won $5,000,000 !
+---------------------------------------------------------------------+


