Re: Very slow DNS lookup with NetworkManager and dnsmasq



First, thank you for your very quick response, Dan. It helped a lot - at
least in figuring out what the underlying causes could be.

Dan Williams <dcbw redhat com> writes:

>> 1.  The system seems to "forget" cached addresses, so that address
>>     lookup for a frequently used address - say www.google.com - often
>>     initiates a new search that again takes several seconds. This
>>     happens very often, several times in a day. What is causing this?
>>     Can it be related to DHCP lease time? (It is the only idea I have
>>     at the moment.)
>
> Does the system forget the cached addresses after a certain period of
> time?  dnsmasq may be restarted periodically when events like lease
> renewal happen, but that should not be very often.  This could be what
> you're seeing, though you should see indications in the logs when this
> happens.

Looking at /var/log/messages it seems that DHCP renewal time is about
one hour. This suggests that the loss of cached addresses might be due
to short lease times.

From the same file I can also see the nameservers used by dnsmasq, as
you suggested. Like I said earlier, my computer seems to get two types
of nameservers from DHCP: fast ones, with addresses starting with 10.8.,
and slow ones, starting with 10.13. Interestingly enough, while system
behaviour suggests that the slow ones are always contacted first, these
nameservers appear in different orders in different places:

- In the list of nameservers printed by dnsmasq in /var/log/messages,
  the fast nameservers starting with 10.8. are listed first, and the
  slow ones last.

- If I disable dnsmasq in NetworkManager.conf, and restart
  NetworkManager, the ordering is reversed: the slow ones starting with
  10.13 are at the top of the list in /etc/resolv.conf. This probably
  explains the very slow responses when dnsmasq is not in use.

- In /var/lib/dhclient/dhclient*lease the slow ones are listed first.
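To compare these orderings side by side, something like the following
Python sketch could be used. It is only an illustration: the function
names are my own, and it assumes the usual "nameserver <address>" lines
in resolv.conf and an "option domain-name-servers ...;" line in the
dhclient lease file.

```python
import re


def resolv_conf_nameservers(text):
    """Return nameserver addresses in the order they appear in resolv.conf text."""
    servers = []
    for line in text.splitlines():
        fields = line.split()
        # Only lines of the form "nameserver <address>" are of interest.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def lease_nameservers(text):
    """Return nameservers from the last 'option domain-name-servers' entry.

    A lease file may hold several leases; the last entry is assumed to be
    the most recent one.
    """
    matches = re.findall(r"option\s+domain-name-servers\s+([^;]+);", text)
    if not matches:
        return []
    return [addr.strip() for addr in matches[-1].split(",")]
```

Reading /etc/resolv.conf and the lease file and printing both lists
would show directly whether the two orderings agree.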

I wonder what the ordering rule is in different systems. The different
possible orderings might explain the observed difference between Linux
and Windows machines: if Windows machines order the servers differently,
they might usually get fast service (slow service only if the fast
servers are down). Or perhaps they simply work better with what seem to
me to be "slow" servers.

From /var/log/messages I also noticed that when running dnsmasq via the
dnsmasq option in NetworkManager.conf, my dnsmasq cache size is 150 even
though in the file /etc/dnsmasq.conf I have specified
cache-size=1500. Does running dnsmasq via NetworkManager bypass
configuration in dnsmasq.conf? Where should one set the configuration
then, for example, for cache size, or whether dnsmasq serves only as DNS
or also services DHCP requests?

> I used to have a branch of NM that wouldn't change DNS configuration
> if it hadn't actually changed, but that was long ago, and we'd want to
> redo that work.  It would likely help with some of this jitter with
> shorter DHCP leases.

That would be a nice feature.

> It's possible to tell dhclient to 'override' the DNS servers using a
> custom dhclient config file.

Ok, but since I use the same laptop in different networks, this might be
difficult. I would just like to reorder the nameservers when their
addresses match a certain pattern. I will search for ideas.
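The reordering I have in mind can be sketched in a few lines of Python.
This is only a sketch of the idea, not something dhclient or
NetworkManager provides: the function name is hypothetical, and the
10.8. prefix is simply the one that marks the fast servers in this
network.

```python
def prioritize_nameservers(servers, preferred_prefixes=("10.8.",)):
    """Move servers whose address starts with a preferred prefix to the front.

    sorted() is stable, so the order supplied by DHCP is preserved within
    the preferred group and within the remaining group.
    """
    prefixes = tuple(preferred_prefixes)
    # False sorts before True, so matching addresses come first.
    return sorted(servers, key=lambda addr: not addr.startswith(prefixes))
```

In networks where no address matches the prefix, the list comes back
unchanged, which is what I would want on a laptop that moves between
networks.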

> One other thing to do is to run wireshark on the machine and capture
> outgoing DNS requests and the replies that come back.

I already installed wireshark, but I don't think that it is necessary to
start analysing the requests yet, because at least at the moment we have
a decent theory about the nature of the problems.

1. DNS lookups are infuriatingly slow because dnsmasq contacts the slow
   servers first. Windows machines here might contact the fast servers
   first, and therefore do not face the same problem - or their
   interaction with the "slow" servers might just be different. The
   problem could be rectified by somehow (?) getting dnsmasq to sort the
   nameservers, or by getting dnsmasq to query all the servers and
   notice which are the ones that respond quickly (is this possible?).

2. dnsmasq forgets addresses because of short DHCP lease times. The
   situation could be improved by changing NetworkManager's way of
   handling lease renewal, or by trying to convince the local
   administration personnel to increase the lease time.
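To test the first point without wireshark, each server could be timed
directly. The following Python sketch does this under several
assumptions: the function names are hypothetical, the query is a plain
A-record lookup over UDP port 53, and the reply is not validated (any
answer counts as a response). It measures the servers from outside
dnsmasq; it does not change how dnsmasq itself picks a server.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def time_server(server, hostname="www.google.com", timeout=5.0):
    """Return seconds until the server replies, or None on timeout or error."""
    packet = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(packet, (server, 53))
            sock.recvfrom(512)
        except OSError:  # includes socket timeouts
            return None
        return time.monotonic() - start


def rank_servers(servers, timer=time_server):
    """Return servers sorted fastest first; servers that never reply go last."""
    timings = {server: timer(server) for server in servers}
    return sorted(servers, key=lambda s: (timings[s] is None, timings[s] or 0.0))
```

Calling rank_servers() with the addresses listed in /etc/resolv.conf
would show whether the 10.13. servers really are the slow ones. Because
of caching on the server side, repeated runs with different hostnames
would give a fairer picture.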

Thank you very much for your already very illuminating answers.

---
Jarmo Hurri


