SUMMARY: in.named losing cached information

From: Damon LaCaille (Damon_LaCaille@dgii.com)
Date: Fri Jun 26 1998 - 17:41:13 CDT


The original posting was:

===============================================================
Some of you may recall that about 2 weeks ago I put up a request for
information on why our primary and secondary name servers were rebooting
spontaneously; it turned out to be a memory issue. However, we're now having
additional problems with our DNS servers, as follows:

Our internal name server is a SparcServer 5 w/ 128MB of memory. Solaris
2.5.1, with revision 4.9.3p1 of in.named.

When I query an external site (say www.yahoo.com or something), it takes
a good 30-45 seconds to resolve the name. After that the answer is cached
and comes back immediately (non-authoritative answer). However, if I try to
access the same site a few hours later the same day, the cached information
on that site has already been lost.

I'm definitely no DNS guru here, but I've set up a DNS server before and
this hasn't happened unless the machine went down and it had to repopulate
its cache.

Thanks for any information in advance. This is of course making users
believe there is an issue with the network, when it's really a name
resolution issue.

Damon LaCaille
===============================================================

Well, it turns out that our internal name server has a "forwarders"
directive that forwards all external name queries to our external
name servers, which are supposed to have "super-caches". However, because
those machines had been rebooting constantly due to memory problems, they
never held any cached information and sometimes were not even available.
We have 3 external name servers; the one that hasn't gone down is the
one we've now pointed the "forwarders" directive to, and things are
blazingly fast.

Many thanks to the following people who e-mailed me great suggestions,
though it was Robin Brown who had it pegged. Indeed it was our forwarders
directive, which of course WOULD have been valid had the machine been up
and running properly. One note, however: I should have used something other
than www.yahoo.com as an example, as folks took me quite literally. I meant
it only as an example; we were experiencing the problem on all name server
requests. Thanks again to all who responded!

==========================================================================
ROBIN_BROWN@phl.com

This sounds like it may actually be a problem with your forwarders or that
part of the configuration. Can you fire up nslookup and attach to your
forwarder name server? Does this take a long time?
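
A rough way to run this check from a script is sketched below in Python: it
sends a single A-record query over UDP straight to one name server and times
the reply. The function names build_query and time_lookup, the query ID, and
the 5-second timeout are illustrative assumptions, not part of any tool
mentioned in this thread.

```python
import socket
import struct
import time


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record in class IN."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels)
    qname += b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)


def time_lookup(server, name, timeout=5.0):
    """Return the seconds `server` took to answer, or None if it did not."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts as well as unreachable-host errors.
            return None
        elapsed = time.monotonic() - start
    # Only accept a reply that echoes our query ID.
    if reply[:2] != packet[:2]:
        return None
    return elapsed
```

Calling time_lookup once per forwarder, for instance
time_lookup("192.0.2.1", "www.yahoo.com") with the placeholder address
replaced by a real forwarder, would show which ones answer slowly or not at
all. A result of None means no usable reply arrived within the timeout.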

==========================================================================
David Schiffrin <daves@adnc.com>

Yahoo and many other sites set the TTL on some DNS entries pretty
short, so that round-robin DNS can be used.

Check it out, 900-second expiry:
hera% dig www.yahoo.com any any
 
; <<>> DiG 2.2 <<>> www.yahoo.com any any
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6
;; flags: qr rd ra; Ques: 1, Ans: 1, Auth: 0, Addit: 0
;; QUESTIONS:
;; www.yahoo.com, type = ANY, class = ANY
 
;; ANSWERS:
www.yahoo.com. 900 CNAME www5.yahoo.com.
 
;; Total query time: 280 msec
;; FROM: hera to SERVER: default -- 205.216.138.127
;; WHEN: Thu Jun 25 10:37:21 1998
;; MSG SIZE sent: 31 rcvd: 59
 

I'd guess that multiple lookups will return different CNAMEs as well.

Hope this helps.

==========================================================================
bismark@alta.Jpl.Nasa.Gov (Bismark Espinoza)

It should not take that long. Put the two name servers in debug
mode and do the www.yahoo.com lookup. Find out why it takes so long to
resolve.

As far as the entry disappearing, look at the SOA refresh, expire and
minimum numbers for primary and secondary servers.

If it is a true caching problem, you can build a caching-only server.

==========================================================================
AJP13@chrysler.com

In your cache file there's a parameter which, if I remember correctly, is
called time-to-live. It is usually set to 99999999.

Example

server.domain.com 999999999 IN A 131.171.151.1

A.Pahl

==========================================================================
Don Lewis <Don.Lewis@tsc.tdk.com>

It looks like yahoo has configured their web server address to time out after
15 minutes (900 seconds).

# dig www.yahoo.com @ns.yahoo.com
 
; <<>> DiG 2.2 <<>> www.yahoo.com @ns.yahoo.com
; (1 server found)
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10
;; flags: qr aa rd ra; Ques: 1, Ans: 2, Auth: 6, Addit: 6
;; QUESTIONS:
;; www.yahoo.com, type = A, class = IN

;; ANSWERS:
www.yahoo.com. 900 CNAME www6.yahoo.com.
www6.yahoo.com. 900 A 204.71.177.71

;; AUTHORITY RECORDS:
yahoo.com. 172800 NS ns.yahoo.com.
yahoo.com. 172800 NS ns1.yahoo.com.
yahoo.com. 172800 NS av1.yahoo.com.
yahoo.com. 172800 NS ns2.dca.yahoo.com.
yahoo.com. 172800 NS ns.europe.yahoo.com.
yahoo.com. 172800 NS ns.yahoo.ca.

;; ADDITIONAL RECORDS:
ns.yahoo.com. 432000 A 204.71.177.33
ns1.yahoo.com. 432000 A 204.71.200.33
av1.yahoo.com. 21600 A 204.123.2.85
ns2.dca.yahoo.com. 900 A 209.143.200.34
ns.europe.yahoo.com. 21600 A 195.67.49.25
ns.yahoo.ca. 21600 A 206.222.66.41

;; Total query time: 80 msec
;; FROM: gatekeeper.tsc.tdk.com to SERVER: ns.yahoo.com 204.71.177.33
;; WHEN: Thu Jun 25 14:51:43 1998
;; MSG SIZE sent: 31 rcvd: 295
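
To make the TTL arithmetic concrete, here is a small Python sketch that reads
record lines in the DiG 2.2 layout shown above and lists the ones a cache
would have discarded after a given delay. The SAMPLE text is an excerpt of
the output above; the function names and the 3-hour delay are illustrative
assumptions.

```python
def parse_records(dig_text):
    """Collect (name, ttl, rtype, rdata) tuples from DiG 2.2 record lines."""
    records = []
    for line in dig_text.splitlines():
        fields = line.split()
        # Record lines read "name TTL TYPE RDATA"; ';' starts a comment line.
        if len(fields) == 4 and not line.startswith(";") and fields[1].isdigit():
            records.append((fields[0], int(fields[1]), fields[2], fields[3]))
    return records


def expired_after(records, elapsed_seconds):
    """Return the records whose TTL has run out after `elapsed_seconds`."""
    return [r for r in records if r[1] <= elapsed_seconds]


SAMPLE = """\
;; ANSWERS:
www.yahoo.com. 900 CNAME www6.yahoo.com.
www6.yahoo.com. 900 A 204.71.177.71

;; AUTHORITY RECORDS:
yahoo.com. 172800 NS ns.yahoo.com.

;; ADDITIONAL RECORDS:
ns.yahoo.com. 432000 A 204.71.177.33
av1.yahoo.com. 21600 A 204.123.2.85
"""

# Which of these would be gone from a cache three hours after the lookup?
for name, ttl, rtype, rdata in expired_after(parse_records(SAMPLE), 3 * 3600):
    print(f"{name} {rtype} expired (TTL {ttl}s = {ttl // 60} min)")
```

Only the two 900-second records for the web server are reported as expired.
The NS and name server address records, with TTLs of 6 hours or more, would
still be cached, so losing www.yahoo.com after a few hours is expected
behaviour rather than a fault.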

==========================================================================
Craig Whytock <cwhytock@cims.co.uk>

We have seen this when the primary server has gone down, the delay being the
time it takes to switch to the secondary server.

For example, /etc/nsswitch.conf specifies the search order for looking up
various things, including host addresses. Ours says

hosts: files dns

So initially it tries the local files in the /etc directory, then it tries
DNS. The lookup order for DNS is specified in /etc/resolv.conf.
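
A quick way to see the order a host will actually use is to read both files.
The Python sketch below does that on sample text; the function names, the
example.com domain, and the 192.0.2.x addresses are placeholders chosen for
illustration, not values from our systems.

```python
def hosts_sources(nsswitch_text):
    """Return the lookup sources on the 'hosts:' line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []


def nameservers(resolv_text):
    """Return the 'nameserver' addresses in the order they are listed."""
    servers = []
    for line in resolv_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Sample file contents (placeholders, see the note above).
NSSWITCH = "hosts: files dns\n"
RESOLV = "domain example.com\nnameserver 192.0.2.1\nnameserver 192.0.2.2\n"

print(hosts_sources(NSSWITCH))
print(nameservers(RESOLV))
```

On a real machine the two strings would be replaced by the contents of
/etc/nsswitch.conf and /etc/resolv.conf. If the first nameserver listed is
down, each lookup has to time out there before the next one is tried, which
is the kind of delay described above.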

> after that it caches it and it finds it immediately
> (non-authoritative answer).

Obviously, since it's cached.

> However, if I try to access this site the same
> day, but only a few hours later, it has already lost the cached information
> on that site.

It should save the info for the full refresh period, usually 8 hours, but it
can be set much lower. We sometimes run as low as 10 min if there are many
changes happening with external clients.

==========================================================================
darren@mailhost.onlinemagic.com

Why don't you upgrade to bind-8.1.2?

Maybe that will get rid of the problem.

==========================================================================
Karl Vogel <vogelke@c17mis.region2.wpafb.af.mil>

   Are you running the name service cache daemon (nscd)? If so, type
   "nscd -g" to see how big your host-table cache is, and how long before
   it expires. I use the following entries in /etc/nscd.conf:

        positive-time-to-live hosts 43200
        negative-time-to-live hosts 5
        suggested-size hosts 503
        keep-hot-count hosts 20
        old-data-ok hosts no
        check-files hosts yes

==========================================================================
Geoff Weller <g.s.weller@larc.nasa.gov>

The TTL (time to live) settings are probably too short. I don't have a
lot of specific info for you, but I remember this from setting up DNS in
a previous life. Check out the O'Reilly DNS and BIND book.

==========================================================================
Tim Carlson <tim@santafe.edu>

On the long time waiting, it almost sounds like you don't have your
primary DNS server in /etc/resolv.conf and it is timing out before
switching to something local.

Not sure about the cache problem. Are you sure that your nameserver isn't
restarting often? Have you checked your syslog for hints that in.named
is restarting itself?

==========================================================================

Rich Pieri <rich.pieri@prescienttech.com>

;; ANSWERS:
www.yahoo.com. 398 CNAME www4.yahoo.com.
www4.yahoo.com. 398 A 204.71.200.69

Given such a short TTL, it is to be expected.

==========================================================================


