[Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

Otto Moerbeek otto at drijf.net
Tue Sep 8 07:33:44 UTC 2020


On Tue, Sep 08, 2020 at 09:22:31AM +0200, Christian Degenkolb wrote:

> (send again, first answer was not send cc to the ML)
> 
> Hi,
> 
> sorry for not sending any configs. pdns_recursor runs more or less with the
> vanilla config with the following changes:
> 
> forward-zones-recurse=zen.spamhaus.org=1.1.1.1;1.0.0.1 (thats why I wanted
> to use the local recursor, as mentioned the server is located in the hetzner
> IP Range which apparently is blocked for the spamhaus DNSBL)
> loglevel=6
> log-common-errors=yes
> quiet=no
> root-nx-trust=no (found this as a solution for the SERVERFAIL but did not
> work)
> 
> and
> # rec_control set-carbon-server 37.252.122.50 rho-test (for the grafs)
> 
> 
> A trace for the same resolves from my last mail:
> 
>  $ time dig +trace pubs.vmware.com @127.0.0.1
> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> +trace pubs.vmware.com
> @127.0.0.1
> ;; global options: +cmd
> .                       86118   IN      NS      d.root-servers.net.
> .                       86118   IN      NS      c.root-servers.net.
> .                       86118   IN      NS      l.root-servers.net.
> .                       86118   IN      NS      b.root-servers.net.
> .                       86118   IN      NS      f.root-servers.net.
> .                       86118   IN      NS      m.root-servers.net.
> .                       86118   IN      NS      e.root-servers.net.
> .                       86118   IN      NS      a.root-servers.net.
> .                       86118   IN      NS      i.root-servers.net.
> .                       86118   IN      NS      k.root-servers.net.
> .                       86118   IN      NS      g.root-servers.net.
> .                       86118   IN      NS      h.root-servers.net.
> .                       86118   IN      NS      j.root-servers.net.
> .                       86118   IN      RRSIG   NS 8 0 518400 20200921050000
> 20200908040000 46594 .
> wgnBz8tKA9hjwIxmMQgTVwnZaiUpAB9a1+oC5T/syHzqNj1e5qhApLQN
> NLok43hu5Ykt8RFe/IiDZuYxIdyyzItwk
> 4QN8xNgsQsfhVfBbZ26bWRz
> fskquwnFn6Gmvq2qI6o42tsBxXUw09X4sNlNYI2zHB3sKaaMu0AbN9WI
> Pe14jpX/PwaP3m78+XqMy9CiKmuDon6g3BuyecPhCZL5Pa8ZPC7nrKfV
> pfyNSiPoBODsJE96UHGlOCJTFcbu/6Ia4ek3AGOJf+WC84HPrxLT
> riyk XHfbPl7EjTbFSPgT8D7jGBfVCTQU3JSfynv29VFAHWZu1gm5VJWNQGaw u5gatA==
> ;; Received 540 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
> 
> com.                    172800  IN      NS      a.gtld-servers.net.
> com.                    172800  IN      NS      b.gtld-servers.net.
> com.                    172800  IN      NS      c.gtld-servers.net.
> com.                    172800  IN      NS      d.gtld-servers.net.
> com.                    172800  IN      NS      e.gtld-servers.net.
> com.                    172800  IN      NS      f.gtld-servers.net.
> com.                    172800  IN      NS      g.gtld-servers.net.
> com.                    172800  IN      NS      h.gtld-servers.net.
> com.                    172800  IN      NS      i.gtld-servers.net.
> com.                    172800  IN      NS      j.gtld-servers.net.
> com.                    172800  IN      NS      k.gtld-servers.net.
> com.                    172800  IN      NS      l.gtld-servers.net.
> com.                    172800  IN      NS      m.gtld-servers.net.
> com.                    86400   IN      DS      30909 8 2
> E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
> com.                    86400   IN      RRSIG   DS 8 1 86400 20200921050000
> 20200908040000 46594 .
> zz85z6R/YUHxyW+ywA6zrgiYILjPo0i248M3wU+2XCRCneBH6yknQfjM
> LIcbo3vADVUlkJd0l4W2TLd7NPgC255hr2
> +ALojzzHa07jyFmE203Kdw
> ma7XL0C55TdFrCEMhARkZf4EncfJH9JH+fdWRWdMr0EQZd1A+FzMYemO
> o7/L/8ZYq4FOt0vz+zheAJNDveGii+QpXAoDyw4xt3HMUVM+40Z/VgD1
> tk9Y3K9e2wwRNISeHdlq21JFVA2SY/gDgPCzBtM1r9Yz7oFZ2ld5W
> AD0 P84GPEUMgUceAGofwxlV9+dSawhunskb+yVrpdjpizLageyJRWEu/F9A zDXxew==
> ;; Received 1175 bytes from 198.97.190.53#53(h.root-servers.net) in 5 ms
> 
> vmware.com.             172800  IN      NS      dns1.p05.nsone.net.
> vmware.com.             172800  IN      NS      dns2.p05.nsone.net.
> vmware.com.             172800  IN      NS      dns3.p05.nsone.net.
> vmware.com.             172800  IN      NS      dns4.p05.nsone.net.
> vmware.com.             172800  IN      NS      ns01.vmwdns.com.
> vmware.com.             172800  IN      NS      ns02.vmwdns.com.
> vmware.com.             172800  IN      NS      ns03.vmwdns.com.
> vmware.com.             172800  IN      NS      ns04.vmwdns.com.
> vmware.com.             86400   IN      DS      48553 13 2
> AA2C697F3990472642AF01509A18224828E403CA8608EC75D5C83002 CE21847E
> vmware.com.             86400   IN      RRSIG   DS 8 2 86400 20200915062203
> 20200908051203 24966 com.
> FA2xsJKvT2LLn5UEy7hAE7PaYmds7FBkQB0SGhm8riwJRKnxbHAY0tvv
> I1T/k0EzXJ4wy1J5qzNLMjhzFgPxEQB
> 6BwBfJm8qo8Cnzxm4YC5Ko1/9
> pDWooVBHoFfMmJgu14Dk+u1AcHobxH9pPs7az16cLK/3YeaFW3dCrIVQ
> NK2fZc0d/pc7CY0Zl1LjYQdTq+MsZiL2kbepEHD6A/4J6g==
> ;; Received 523 bytes from 2001:503:eea3::30#53(g.gtld-servers.net) in 6 ms
> 
> pubs.vmware.com.        30      IN      CNAME
> pubs.vmware.com.ds.edgekey.net.
> pubs.vmware.com.        30      IN      RRSIG   CNAME 13 3 30 20200909071011
> 20200907071011 12752 vmware.com.
> yTxj4OFvCx3flxtOFAFdkwAOpOAVNibgseFi5U5ekzYbdATw98xZqrDT
> tYs/n46iHFiLN4ql4Y3MS6U
> 16Qr6DQ==
> ;; Received 194 bytes from 45.54.11.1#53(ns01.vmwdns.com) in 11 ms
> 
> real0m32.149s
> user0m0.012s
> sys0m0.012s
> 
> But this looks normal to me. I don't know why the trace only shows 5, 6 and
> 11 ms but takes up to 32 seconds to finish.

Well, that is suspect, but see below.

> 
> Regarding your questions for the ipv6 connectivity. How can I test this?

Run pdns_recursor with the --trace option (or trace=yes in the config
file), do some queries and look at the results in the log file. Now
the recursor logs a lot in trace mode, so take your time trying to
understand what is going on. Members of this list can likely help if
you do not spot anything.

	-Otto

> 
> I did a
> 
> $ dig ipv6.google.com @127.0.0.1
> 
> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> ipv6.google.com @127.0.0.1
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9226
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
> 
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;ipv6.google.com.INA
> 
> ;; ANSWER SECTION:
> ipv6.google.com.86400   INCNAME   ipv6.l.google.com.
> 
> ;; AUTHORITY SECTION:
> l.google.com.   60INSOAns1.google.com. dns-admin.google.com. 330353109 900
> 900 1800 60
> 
> ;; Query time: 3087 msec
> ;; SERVER: 127.0.0.1#53(127.0.0.1)
> ;; WHEN: Tue Sep 08 09:12:50 CEST 2020
> ;; MSG SIZE  rcvd: 115
> 
> and
> 
> $ ping6 ipv6.google.com
> PING ipv6.google.com(fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e))
> 56 data bytes
> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
> icmp_seq=1 ttl=118 time=5.11 ms
> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
> icmp_seq=2 ttl=118 time=5.08 ms
> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
> icmp_seq=3 ttl=118 time=5.12 ms
> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
> icmp_seq=4 ttl=118 time=5.13 ms
> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
> icmp_seq=5 ttl=118 time=5.09 ms
> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
> icmp_seq=6 ttl=118 time=5.08 ms
> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
> icmp_seq=7 ttl=118 time=5.08 ms
> ^C
> --- ipv6.google.com ping statistics ---
> 7 packets transmitted, 7 received, 0% packet loss, time 24ms
> rtt min/avg/max/mdev = 5.075/5.096/5.133/0.043 ms
> 
> and it looks good.
> 
> regards
> Chris
> 
> 
> Am 2020-09-04 15:05, schrieb Otto Moerbeek:
> > On Wed, Sep 02, 2020 at 09:44:37AM +0200, Christian Degenkolb via
> > Pdns-users wrote:
> > 
> > > Hi,
> > > 
> > > I hope somebody on the ML can help me figure out what I'm doing wrong.
> > > I have a local pdns_recursor (version 4.1.11-1+deb10u1 from debian 10)
> > > runing and added it at the top of my /etc/resolve.conf as 127.0.0.1.
> > > 
> > > However I see some strange SERVERFAIL resolves happening and all in
> > > all a
> > > slow DNS system.
> > > 
> > > For example see the following two consecutive resolves and a direct
> > > request
> > > to the NS.
> > > The first one takes nearly 3 seconds vs 11 ms from the same system
> > > if I
> > > query the NS directly.
> > > 
> > > $ dig pubs.vmware.com @127.0.0.1
> > > 
> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
> > > @127.0.0.1
> > > ;; global options: +cmd
> > > ;; Got answer:
> > > ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4929
> > > ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
> > > 
> > > ;; OPT PSEUDOSECTION:
> > > ; EDNS: version: 0, flags:; udp: 4096
> > > ;; QUESTION SECTION:
> > > ;pubs.vmware.com.INA
> > > 
> > > ;; ANSWER SECTION:
> > > pubs.vmware.com.30INCNAME   pubs.vmware.com.ds.edgekey.net.
> > > pubs.vmware.com.ds.edgekey.net. 10 IN   CNAME
> > > e751.dscx.akamaiedge.net.
> > > 
> > > ;; Query time: 3009 msec
> > > ;; SERVER: 127.0.0.1#53(127.0.0.1)
> > > ;; WHEN: Wed Sep 02 09:19:04 CEST 2020
> > > ;; MSG SIZE  rcvd: 123
> > > 
> > > $ dig pubs.vmware.com @127.0.0.1
> > > 
> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
> > > @127.0.0.1
> > > ;; global options: +cmd
> > > ;; Got answer:
> > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1345
> > > ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
> > > 
> > > ;; OPT PSEUDOSECTION:
> > > ; EDNS: version: 0, flags:; udp: 4096
> > > ;; QUESTION SECTION:
> > > ;pubs.vmware.com.INA
> > > 
> > > ;; ANSWER SECTION:
> > > pubs.vmware.com.18INCNAME   pubs.vmware.com.ds.edgekey.net.
> > > pubs.vmware.com.ds.edgekey.net. 4 INCNAME   e751.dscx.akamaiedge.net.
> > > e751.dscx.akamaiedge.net. 16INA104.111.214.47
> > > 
> > > ;; Query time: 0 msec
> > > ;; SERVER: 127.0.0.1#53(127.0.0.1)
> > > ;; WHEN: Wed Sep 02 09:19:08 CEST 2020
> > > ;; MSG SIZE  rcvd: 139
> > > 
> > > $ dig pubs.vmware.com @ns03.vmwdns.com
> > > 
> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
> > > @ns03.vmwdns.com
> > > ;; global options: +cmd
> > > ;; Got answer:
> > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5509
> > > ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
> > > ;; WARNING: recursion requested but not available
> > > 
> > > ;; OPT PSEUDOSECTION:
> > > ; EDNS: version: 0, flags:; udp: 4096
> > > ;; QUESTION SECTION:
> > > ;pubs.vmware.com.INA
> > > 
> > > ;; ANSWER SECTION:
> > > pubs.vmware.com.30INCNAME   pubs.vmware.com.ds.edgekey.net.
> > > 
> > > ;; Query time: 11 msec
> > > ;; SERVER: 45.54.11.129#53(45.54.11.129)
> > > ;; WHEN: Wed Sep 02 09:34:42 CEST 2020
> > > ;; MSG SIZE  rcvd: 88
> > > 
> > > Also I have a number SERVFAIL in /var/log/syslog (pdns_recurser is
> > > currently
> > > running with loglevel=6).
> > > For example:
> > > 
> > > Sep  2 08:45:35 rho pdns_recursor[19311]: Sending SERVFAIL to
> > > 127.0.0.1
> > > during resolve of 'pubs.vmware.com' because: Too much time waiting for
> > > pubs.vmware.com.ds.edgekey.net|A, timeouts: 5,
> > > throttles: 1, queries: 6, 7991msec
> > > 
> > > # grep 'Too much time waiting for' /var/log/syslog | wc -l
> > > 184
> > > 
> > > As per
> > > https://blog.powerdns.com/2014/12/11/powerdns-graphing-as-a-service/
> > > I send the metrics to https://metronome1.powerdns.com/?server=pdns.rho-test.recursor&beginTime=-172800
> > > 
> > > Does anybody have an idea whats wrong? This seems way to slow for
> > > DNS and
> > > the SERVFAIL schouldn't happen this often.
> > > The server in question is running in a DC of the german Hoster
> > > hetzner.de.
> > > Besides the strange DNS I don't have any problems with the
> > > reliability of
> > > the network connection.
> > > 
> > > thanks
> > > Chris
> > > 
> > > _______________________________________________
> > > Pdns-users mailing list
> > > Pdns-users at mailman.powerdns.com
> > > https://mailman.powerdns.com/mailman/listinfo/pdns-users
> > 
> > You did not share any config or traces, so it's hard to tell. A wild
> > guess: It might be you enabled IPV6 but your IPV6 connectivity is bad.
> > 
> > 	-Otto


More information about the Pdns-users mailing list