[Pdns-users] Slow query and SERVERFAIL from local pdns_recursor

Christian Degenkolb christian+pdns at degenkolb.net
Thu Sep 10 13:40:54 UTC 2020


Hi Thomas,

what is a reasonably low value for udp-truncation-threshold? I tried
900 and 600 (as low as half the default value) but saw no
improvement.

Also I don't think this is a vmware.com problem since I have the same 
problem with multiple domains.

To illustrate, I took the tool dnsperf from
https://www.dns-oarc.net/tools/dnsperf and created a query file with the
list of 500 domains from https://moz.com/top500 (see
https://paste.ubuntu.com/p/DxGBqRvngv/).
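
For reference, a query file like this can be built from a plain list of
domains. A minimal sketch, assuming dnsperf's usual input format of one
"<name> <type>" pair per line; the helper name make_queryfile and the input
file name in the usage note are made up for illustration:

```shell
# Hypothetical helper: turn a plain list of domain names (one per line on
# stdin) into dnsperf query-file lines of the form "<name> <type>".
# The query type defaults to A and can be overridden as the first argument.
make_queryfile() {
    qtype="${1:-A}"
    # Strip carriage returns, skip blank lines and "#" comments,
    # then append the query type to each remaining name.
    tr -d '\r' | awk -v t="$qtype" 'NF && $1 !~ /^#/ { print $1, t }'
}
```

Usage would be something like
"make_queryfile < top500_domains.txt > queryfile_top500_clean".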

If I call the tool against my local resolver on a clean cache (even with
udp-truncation-threshold=600) I get the following output.

# rec_control wipe-cache $
wiped 4154 records, 8 negative records, 500 packets
# ./dnsperf -d queryfile_top500_clean
DNS Performance Testing Tool
Version 2.3.4

[Status] Command line: dnsperf -d queryfile_top500_clean
[Status] Sending queries (to 127.0.0.1)
[Status] Started at: Thu Sep 10 15:29:26 2020
[Status] Stopping after 1 run through file

<snip multiple lines like "[Timeout] Query timed out: msg id 0" and 
"Warning: received a response with an unexpected (maybe timed out) id: 
162">

[Status] Testing complete (end of file)

Statistics:

   Queries sent:         500
   Queries completed:    278 (55.60%)
   Queries lost:         222 (44.40%)

   Response codes:       NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)
   Average packet size:  request 29, response 56
   Run time (s):         16.455935
   Queries per second:   16.893601

   Average Latency (s):  1.313376 (min 0.000543, max 4.491949)
   Latency StdDev (s):   1.446709

# ./dnsperf -d queryfile_top500_clean
DNS Performance Testing Tool
Version 2.3.4

[Status] Command line: dnsperf -d queryfile_top500_clean
[Status] Sending queries (to 127.0.0.1)
[Status] Started at: Thu Sep 10 15:29:49 2020
[Status] Stopping after 1 run through file
[Status] Testing complete (end of file)

Statistics:

   Queries sent:         500
   Queries completed:    500 (100.00%)
   Queries lost:         0 (0.00%)

   Response codes:       NOERROR 281 (56.20%), SERVFAIL 219 (43.80%)
   Average packet size:  request 29, response 50
   Run time (s):         4.571526
   Queries per second:   109.372669

   Average Latency (s):  0.015253 (min 0.000054, max 4.556146)
   Latency StdDev (s):   0.244755

As I see it, far too many queries are lost on an empty cache, and the
SERVFAIL rate is far too high for this kind of domain, even on retries.
The SERVFAIL rate stays high on subsequent runs.
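
To compare runs without reading the full report each time, the relevant
numbers can be pulled out of the dnsperf report. A minimal sketch, assuming
the report layout shown above; dnsperf_summary is a made-up helper name and
not part of dnsperf:

```shell
# Hypothetical helper: summarise a dnsperf report read from stdin.
# Prints the number of queries sent, lost and answered with SERVFAIL,
# plus the share of queries that were either lost or failed.
dnsperf_summary() {
    awk '
        /Queries sent:/ { sent = $3 }
        /Queries lost:/ { lost = $3 }
        /Response codes:/ {
            # Fields look like: NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)
            for (i = 3; i < NF; i++)
                if ($i == "SERVFAIL") servfail = $(i + 1)
        }
        END {
            if (sent > 0)
                printf "sent=%d lost=%d servfail=%d lost_or_failed=%.2f%%\n",
                       sent, lost, servfail, 100 * (lost + servfail) / sent
        }'
}
```

Piping the first run above through it ("./dnsperf -d queryfile_top500_clean
| dnsperf_summary") would give
"sent=500 lost=222 servfail=69 lost_or_failed=58.20%".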

Whereas if I run it against 1.1.1.1 (or the hoster DNS server) I get the 
following output.

# ./dnsperf -d queryfile_top500_clean -s 1.1.1.1
DNS Performance Testing Tool
Version 2.3.4

[Status] Command line: dnsperf -d queryfile_top500_clean -s 1.1.1.1
[Status] Sending queries (to 1.1.1.1)
[Status] Started at: Thu Sep 10 15:33:24 2020
[Status] Stopping after 1 run through file
[Status] Testing complete (end of file)

Statistics:

   Queries sent:         500
   Queries completed:    500 (100.00%)
   Queries lost:         0 (0.00%)

   Response codes:       NOERROR 499 (99.80%), SERVFAIL 1 (0.20%)
   Average packet size:  request 29, response 77
   Run time (s):         0.882704
   Queries per second:   566.441299

   Average Latency (s):  0.013521 (min 0.005065, max 0.863349)
   Latency StdDev (s):   0.054510

A near perfect score.

Doesn't this mean the problem lies within the local resolver, since
dnsperf would make the same requests the local resolver would make to
the external DNS server?
Or at least that there is no uplink problem, but rather something local
to my server?
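
One difference remains between these tests and the recursor: the direct dig
checks were plain client queries, whereas the recursor asks the authoritative
servers itself and, as Thomas pointed out, with DNSSEC in play. A minimal
sketch for repeating the direct checks closer to that, using the standard dig
options +norecurse, +dnssec and +bufsize. The helper names and the 1232-byte
default are assumptions for illustration, not settings read from the recursor:

```shell
# Hypothetical helper: query a server roughly the way a DNSSEC-aware
# recursor might (no recursion desired, DO bit set, explicit EDNS buffer
# size). The 1232-byte default is an assumption; pass another size as $3.
probe_like_recursor() {
    server="$1"; name="$2"; bufsize="${3:-1232}"
    dig +norecurse +dnssec +bufsize="$bufsize" +tries=1 +time=2 \
        -t A "@$server" "$name"
}

# Hypothetical helper: reduce dig output (stdin) to one line of the form
# "status=<rcode> time_ms=<n> size=<bytes>".
dig_digest() {
    awk '
        /->>HEADER<<-/ {
            for (i = 1; i < NF; i++)
                if ($i == "status:") { s = $(i + 1); sub(/,$/, "", s) }
        }
        /^;; Query time:/ { t = $4 }
        /^;; MSG SIZE/    { m = $NF }
        END { printf "status=%s time_ms=%s size=%s\n", s, t, m }'
}
```

For example, "probe_like_recursor 45.54.11.1 pubs.vmware.com 600 | dig_digest"
would show whether the answer still arrives quickly with the DO bit set and a
small buffer size.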

regards
Chris





On 2020-09-09 10:05, Thomas Mieslinger via Pdns-users wrote:
> Hi Christian,
> 
> Hetzner might filter IP fragments. Please check whether your situation
> improves if you set udp-truncation-threshold to a reasonably low value.
> 
> By default pdns-recursor does DNSSEC. I would suggest setting
> +dnssec on your dig queries.
> 
> A possible workaround for the vmware.com problems is to add a negative
> trust anchor for vmware.com. in pdns config.
> 
> Cheers Thomas
> 
> On 9/8/20 2:16 PM, Christian Degenkolb via Pdns-users wrote:
>> Hi,
>> 
>> I set the trace=yes option in the recursor config and redid the tests
>> for pubs.vmware.com.
>> 
>> The log can be found here https://paste.debian.net/hidden/07526601/
>> 
>> I found two timeouts in the logs
>> 
>> Line 41:
>> Sep  8 10:21:54 rho pdns_recursor[25208]: [3] pubs.vmware.com: Resolved 'vmware.com' NS ns01.vmwdns.com to: 45.54.11.1
>> Sep  8 10:21:54 rho pdns_recursor[25208]: [3] pubs.vmware.com: Trying IP 45.54.11.1:53, asking 'pubs.vmware.com|A'
>> Sep  8 10:21:56 rho pdns_recursor[25208]: [3] pubs.vmware.com: timeout resolving after 1501.63msec
>> Sep  8 10:21:56 rho pdns_recursor[25208]: [3] pubs.vmware.com: Trying to resolve NS 'ns04.vmwdns.com' (2/8)
>> 
>> But a request to 45.54.11.1 for pubs.vmware.com comes back within
>> 11 msec.
>> 
>> $ dig -t A @45.54.11.1 pubs.vmware.com
>> 
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @45.54.11.1
>> pubs.vmware.com
>> ; (1 server found)
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24122
>> ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>> ;; WARNING: recursion requested but not available
>> 
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ;; QUESTION SECTION:
>> ;pubs.vmware.com.               IN      A
>> 
>> ;; ANSWER SECTION:
>> pubs.vmware.com.        30      IN      CNAME   pubs.vmware.com.ds.edgekey.net.
>> 
>> ;; Query time: 11 msec
>> ;; SERVER: 45.54.11.1#53(45.54.11.1)
>> ;; WHEN: Tue Sep 08 13:29:57 CEST 2020
>> ;; MSG SIZE  rcvd: 88
>> 
>> and a second timeout in line 159:
>> 
>> Sep  8 10:21:56 rho pdns_recursor[25208]: [3] e751.dscx.akamaiedge.net: Trying IP 2.16.106.23:53, asking 'e751.dscx.akamaiedge.net|A'
>> Sep  8 10:21:57 rho pdns_recursor[25208]: [3] e751.dscx.akamaiedge.net: timeout resolving after 1501.74msec
>> Sep  8 10:21:57 rho pdns_recursor[25208]: [3] e751.dscx.akamaiedge.net: Trying to resolve NS 'n3dscx.akamaiedge.net' (2/8)
>> 
>> Same picture here with a very good response time.
>> 
>> $ dig -t A @2.16.106.23 e751.dscx.akamaiedge.net
>> 
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @2.16.106.23
>> e751.dscx.akamaiedge.net
>> ; (1 server found)
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7947
>> ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>> ;; WARNING: recursion requested but not available
>> 
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ;; QUESTION SECTION:
>> ;e751.dscx.akamaiedge.net.      IN      A
>> 
>> ;; ANSWER SECTION:
>> e751.dscx.akamaiedge.net. 20    IN      A       104.111.214.47
>> 
>> ;; Query time: 5 msec
>> ;; SERVER: 2.16.106.23#53(2.16.106.23)
>> ;; WHEN: Tue Sep 08 13:31:32 CEST 2020
>> ;; MSG SIZE  rcvd: 69
>> 
>> 
>> To check that this is not a vmware.com problem I tested some more and
>> got the same timeouts.
>> 
>> 
>> One more example for
>> 
>> $ dig nameservers.dnscheck.co @127.0.0.1
>> 
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> nameservers.dnscheck.co
>> @127.0.0.1
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 23852
>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
>> 
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ;; QUESTION SECTION:
>> ;nameservers.dnscheck.co.       IN      A
>> 
>> ;; Query time: 3005 msec
>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>> ;; WHEN: Tue Sep 08 12:15:29 CEST 2020
>> ;; MSG SIZE  rcvd: 52
>> 
>> can be found here https://paste.debian.net/hidden/b48a78a2/.
>> 
>> This time there are multiple timeouts for the root name servers, for
>> example g.root-servers.net:
>> 
>> Sep  8 12:15:21 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: Resolved '.' NS g.root-servers.net to: 192.112.36.4
>> Sep  8 12:15:21 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: Trying IP 192.112.36.4:53, asking 'nameservers.dnscheck.co|A'
>> Sep  8 12:15:22 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: timeout resolving after 1501.63msec
>> Sep  8 12:15:22 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: Trying to resolve NS 'j.root-servers.net' (2/13)
>> 
>> Where a direct request via dig works like a charm.
>> 
>> $ dig -t A @192.112.36.4 nameservers.dnscheck.co
>> 
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @192.112.36.4
>> nameservers.dnscheck.co
>> ; (1 server found)
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18641
>> ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 13
>> ;; WARNING: recursion requested but not available
>> 
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ; COOKIE: ce9eaf15bb34977b41354b5f5f576c3841785bfba5901e93 (good)
>> ;; QUESTION SECTION:
>> ;nameservers.dnscheck.co.       IN      A
>> 
>> ;; AUTHORITY SECTION:
>> co.                     172800  IN      NS      ns5.cctld.co.
>> co.                     172800  IN      NS      ns1.cctld.co.
>> co.                     172800  IN      NS      ns6.cctld.co.
>> co.                     172800  IN      NS      ns4.cctld.co.
>> co.                     172800  IN      NS      ns3.cctld.co.
>> co.                     172800  IN      NS      ns2.cctld.co.
>> 
>> ;; ADDITIONAL SECTION:
>> ns1.cctld.co.           172800  IN      A       156.154.100.25
>> ns2.cctld.co.           172800  IN      A       156.154.101.25
>> ns3.cctld.co.           172800  IN      A       156.154.102.25
>> ns4.cctld.co.           172800  IN      A       156.154.103.25
>> ns5.cctld.co.           172800  IN      A       156.154.104.25
>> ns6.cctld.co.           172800  IN      A       156.154.105.25
>> ns1.cctld.co.           172800  IN      AAAA    2001:502:2eda::21
>> ns2.cctld.co.           172800  IN      AAAA    2001:502:ad09::21
>> ns3.cctld.co.           172800  IN      AAAA    2610:a1:1009::21
>> ns4.cctld.co.           172800  IN      AAAA    2610:a1:1010::21
>> ns5.cctld.co.           172800  IN      AAAA    2610:a1:1011::21
>> ns6.cctld.co.           172800  IN      AAAA    2610:a1:1012::21
>> 
>> ;; Query time: 16 msec
>> ;; SERVER: 192.112.36.4#53(192.112.36.4)
>> ;; WHEN: Tue Sep 08 13:34:20 CEST 2020
>> ;; MSG SIZE  rcvd: 458
>> 
>> 
>> Additionally, I get the resolved IPs in the trace logs (line 328,
>> apparently from the second worker thread) but not in the dig output.
>> 
>> Sep  8 12:15:33 rho pdns_recursor[25208]: [51] nameservers.dnscheck.co: answer is in: resolved to '52.48.61.155|A'
>> Sep  8 12:15:33 rho pdns_recursor[25208]: [51] nameservers.dnscheck.co: answer is in: resolved to '104.236.169.228|A'
>> Sep  8 12:15:33 rho pdns_recursor[25208]: [51] nameservers.dnscheck.co: answer is in: resolved to '104.131.72.189|A'
>> 
>> Is this a dig timeout? Or do I only get the response from the first
>> worker thread?
>> 
>> And now I'm more confused than before. The connection from and to the
>> server (SSH, etc.) is rock solid.
>> An iperf test shows the full gigabit connection is available.
>> The server is more or less idle and has 8 cores and 32GB RAM, serving
>> mostly as a docker host with some 20-30 containers (nextcloud,
>> mailcow, ...) running for personal usage by me and my family.
>> 
>> How can I check for problems with a large number of small connections?
>> But this shouldn't be that much for a single local recursor, should
>> it?
>> 
>> Also I don't see any network related messages in the kernel log or
>> anywhere else.
>> I'm not aware of any rate limits for the uplink to the provider.
>> 
>> regards
>> Chris
>> 
>> 
>> 
>> 
>> 
>> 
>> On 2020-09-08 09:33, Otto Moerbeek wrote:
>>> On Tue, Sep 08, 2020 at 09:22:31AM +0200, Christian Degenkolb wrote:
>>> 
>>>> (sent again, the first answer was not cc'd to the ML)
>>>> 
>>>> Hi,
>>>> 
>>>> sorry for not sending any configs. pdns_recursor runs more or less
>>>> with the vanilla config, with the following changes:
>>>> 
>>>> forward-zones-recurse=zen.spamhaus.org=1.1.1.1;1.0.0.1 (that's why I
>>>> wanted to use the local recursor; as mentioned, the server is located
>>>> in the Hetzner IP range, which apparently is blocked for the Spamhaus
>>>> DNSBL)
>>>> loglevel=6
>>>> log-common-errors=yes
>>>> quiet=no
>>>> root-nx-trust=no (found this as a solution for the SERVFAIL, but it
>>>> did not work)
>>>> 
>>>> and
>>>> # rec_control set-carbon-server 37.252.122.50 rho-test (for the graphs)
>>>> 
>>>> 
>>>> A trace for the same resolves from my last mail:
>>>> 
>>>>  $ time dig +trace pubs.vmware.com @127.0.0.1
>>>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> +trace pubs.vmware.com
>>>> @127.0.0.1
>>>> ;; global options: +cmd
>>>> .                       86118   IN      NS      d.root-servers.net.
>>>> .                       86118   IN      NS      c.root-servers.net.
>>>> .                       86118   IN      NS      l.root-servers.net.
>>>> .                       86118   IN      NS      b.root-servers.net.
>>>> .                       86118   IN      NS      f.root-servers.net.
>>>> .                       86118   IN      NS      m.root-servers.net.
>>>> .                       86118   IN      NS      e.root-servers.net.
>>>> .                       86118   IN      NS      a.root-servers.net.
>>>> .                       86118   IN      NS      i.root-servers.net.
>>>> .                       86118   IN      NS      k.root-servers.net.
>>>> .                       86118   IN      NS      g.root-servers.net.
>>>> .                       86118   IN      NS      h.root-servers.net.
>>>> .                       86118   IN      NS      j.root-servers.net.
>>>> .                       86118   IN      RRSIG   NS 8 0 518400
>>>> 20200921050000
>>>> 20200908040000 46594 .
>>>> wgnBz8tKA9hjwIxmMQgTVwnZaiUpAB9a1+oC5T/syHzqNj1e5qhApLQN
>>>> NLok43hu5Ykt8RFe/IiDZuYxIdyyzItwk
>>>> 4QN8xNgsQsfhVfBbZ26bWRz
>>>> fskquwnFn6Gmvq2qI6o42tsBxXUw09X4sNlNYI2zHB3sKaaMu0AbN9WI
>>>> Pe14jpX/PwaP3m78+XqMy9CiKmuDon6g3BuyecPhCZL5Pa8ZPC7nrKfV
>>>> pfyNSiPoBODsJE96UHGlOCJTFcbu/6Ia4ek3AGOJf+WC84HPrxLT
>>>> riyk XHfbPl7EjTbFSPgT8D7jGBfVCTQU3JSfynv29VFAHWZu1gm5VJWNQGaw 
>>>> u5gatA==
>>>> ;; Received 540 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
>>>> 
>>>> com.                    172800  IN      NS      a.gtld-servers.net.
>>>> com.                    172800  IN      NS      b.gtld-servers.net.
>>>> com.                    172800  IN      NS      c.gtld-servers.net.
>>>> com.                    172800  IN      NS      d.gtld-servers.net.
>>>> com.                    172800  IN      NS      e.gtld-servers.net.
>>>> com.                    172800  IN      NS      f.gtld-servers.net.
>>>> com.                    172800  IN      NS      g.gtld-servers.net.
>>>> com.                    172800  IN      NS      h.gtld-servers.net.
>>>> com.                    172800  IN      NS      i.gtld-servers.net.
>>>> com.                    172800  IN      NS      j.gtld-servers.net.
>>>> com.                    172800  IN      NS      k.gtld-servers.net.
>>>> com.                    172800  IN      NS      l.gtld-servers.net.
>>>> com.                    172800  IN      NS      m.gtld-servers.net.
>>>> com.                    86400   IN      DS      30909 8 2
>>>> E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
>>>> com.                    86400   IN      RRSIG   DS 8 1 86400
>>>> 20200921050000
>>>> 20200908040000 46594 .
>>>> zz85z6R/YUHxyW+ywA6zrgiYILjPo0i248M3wU+2XCRCneBH6yknQfjM
>>>> LIcbo3vADVUlkJd0l4W2TLd7NPgC255hr2
>>>> +ALojzzHa07jyFmE203Kdw
>>>> ma7XL0C55TdFrCEMhARkZf4EncfJH9JH+fdWRWdMr0EQZd1A+FzMYemO
>>>> o7/L/8ZYq4FOt0vz+zheAJNDveGii+QpXAoDyw4xt3HMUVM+40Z/VgD1
>>>> tk9Y3K9e2wwRNISeHdlq21JFVA2SY/gDgPCzBtM1r9Yz7oFZ2ld5W
>>>> AD0 P84GPEUMgUceAGofwxlV9+dSawhunskb+yVrpdjpizLageyJRWEu/F9A 
>>>> zDXxew==
>>>> ;; Received 1175 bytes from 198.97.190.53#53(h.root-servers.net) in 
>>>> 5 ms
>>>> 
>>>> vmware.com.             172800  IN      NS      dns1.p05.nsone.net.
>>>> vmware.com.             172800  IN      NS      dns2.p05.nsone.net.
>>>> vmware.com.             172800  IN      NS      dns3.p05.nsone.net.
>>>> vmware.com.             172800  IN      NS      dns4.p05.nsone.net.
>>>> vmware.com.             172800  IN      NS      ns01.vmwdns.com.
>>>> vmware.com.             172800  IN      NS      ns02.vmwdns.com.
>>>> vmware.com.             172800  IN      NS      ns03.vmwdns.com.
>>>> vmware.com.             172800  IN      NS      ns04.vmwdns.com.
>>>> vmware.com.             86400   IN      DS      48553 13 2
>>>> AA2C697F3990472642AF01509A18224828E403CA8608EC75D5C83002 CE21847E
>>>> vmware.com.             86400   IN      RRSIG   DS 8 2 86400
>>>> 20200915062203
>>>> 20200908051203 24966 com.
>>>> FA2xsJKvT2LLn5UEy7hAE7PaYmds7FBkQB0SGhm8riwJRKnxbHAY0tvv
>>>> I1T/k0EzXJ4wy1J5qzNLMjhzFgPxEQB
>>>> 6BwBfJm8qo8Cnzxm4YC5Ko1/9
>>>> pDWooVBHoFfMmJgu14Dk+u1AcHobxH9pPs7az16cLK/3YeaFW3dCrIVQ
>>>> NK2fZc0d/pc7CY0Zl1LjYQdTq+MsZiL2kbepEHD6A/4J6g==
>>>> ;; Received 523 bytes from 2001:503:eea3::30#53(g.gtld-servers.net)
>>>> in 6 ms
>>>> 
>>>> pubs.vmware.com.        30      IN      CNAME
>>>> pubs.vmware.com.ds.edgekey.net.
>>>> pubs.vmware.com.        30      IN      RRSIG   CNAME 13 3 30
>>>> 20200909071011
>>>> 20200907071011 12752 vmware.com.
>>>> yTxj4OFvCx3flxtOFAFdkwAOpOAVNibgseFi5U5ekzYbdATw98xZqrDT
>>>> tYs/n46iHFiLN4ql4Y3MS6U
>>>> 16Qr6DQ==
>>>> ;; Received 194 bytes from 45.54.11.1#53(ns01.vmwdns.com) in 11 ms
>>>> 
>>>> real    0m32.149s
>>>> user    0m0.012s
>>>> sys     0m0.012s
>>>> 
>>>> But this looks normal to me. I don't know why the trace only shows
>>>> 5, 6 and 11 ms but takes up to 32 seconds to finish.
>>> 
>>> Well, that is suspect, but see below.
>>> 
>>>> 
>>>> Regarding your question about IPv6 connectivity: how can I test
>>>> this?
>>> 
>>> Run pdns_recursor with the --trace option (or trace=yes in the config
>>> file), do some queries and look at the results in the log file. Now
>>> the recursor logs a lot in trace mode, so take your time trying to
>>> understand what is going on. Members of this list can likely help if
>>> you do not spot anything.
>>> 
>>>     -Otto
>>> 
>>>> 
>>>> I did a
>>>> 
>>>> $ dig ipv6.google.com @127.0.0.1
>>>> 
>>>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> ipv6.google.com 
>>>> @127.0.0.1
>>>> ;; global options: +cmd
>>>> ;; Got answer:
>>>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9226
>>>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
>>>> 
>>>> ;; OPT PSEUDOSECTION:
>>>> ; EDNS: version: 0, flags:; udp: 4096
>>>> ;; QUESTION SECTION:
>>>> ;ipv6.google.com.               IN      A
>>>> 
>>>> ;; ANSWER SECTION:
>>>> ipv6.google.com.        86400   IN      CNAME   ipv6.l.google.com.
>>>> 
>>>> ;; AUTHORITY SECTION:
>>>> l.google.com.           60      IN      SOA     ns1.google.com. dns-admin.google.com. 330353109 900 900 1800 60
>>>> 
>>>> ;; Query time: 3087 msec
>>>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>> ;; WHEN: Tue Sep 08 09:12:50 CEST 2020
>>>> ;; MSG SIZE  rcvd: 115
>>>> 
>>>> and
>>>> 
>>>> $ ping6 ipv6.google.com
>>>> PING ipv6.google.com(fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e)) 56 data bytes
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e): icmp_seq=1 ttl=118 time=5.11 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e): icmp_seq=2 ttl=118 time=5.08 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e): icmp_seq=3 ttl=118 time=5.12 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e): icmp_seq=4 ttl=118 time=5.13 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e): icmp_seq=5 ttl=118 time=5.09 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e): icmp_seq=6 ttl=118 time=5.08 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e): icmp_seq=7 ttl=118 time=5.08 ms
>>>> ^C
>>>> --- ipv6.google.com ping statistics ---
>>>> 7 packets transmitted, 7 received, 0% packet loss, time 24ms
>>>> rtt min/avg/max/mdev = 5.075/5.096/5.133/0.043 ms
>>>> 
>>>> and it looks good.
>>>> 
>>>> regards
>>>> Chris
>>>> 
>>>> 
>>>> On 2020-09-04 15:05, Otto Moerbeek wrote:
>>>> > On Wed, Sep 02, 2020 at 09:44:37AM +0200, Christian Degenkolb via
>>>> > Pdns-users wrote:
>>>> >
>>>> > > Hi,
>>>> > >
>>>> > > I hope somebody on the ML can help me figure out what I'm doing wrong.
>>>> > > I have a local pdns_recursor (version 4.1.11-1+deb10u1 from Debian 10)
>>>> > > running and added it at the top of my /etc/resolv.conf as 127.0.0.1.
>>>> > >
>>>> > > However, I see some strange SERVFAIL resolves happening and, all in
>>>> > > all, a slow DNS system.
>>>> > >
>>>> > > For example, see the following two consecutive resolves and a direct
>>>> > > request to the NS.
>>>> > > The first one takes nearly 3 seconds vs. 11 ms from the same system
>>>> > > if I query the NS directly.
>>>> > >
>>>> > > $ dig pubs.vmware.com @127.0.0.1
>>>> > >
>>>> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
>>>> > > @127.0.0.1
>>>> > > ;; global options: +cmd
>>>> > > ;; Got answer:
>>>> > > ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4929
>>>> > > ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
>>>> > >
>>>> > > ;; OPT PSEUDOSECTION:
>>>> > > ; EDNS: version: 0, flags:; udp: 4096
>>>> > > ;; QUESTION SECTION:
>>>> > > ;pubs.vmware.com.               IN      A
>>>> > >
>>>> > > ;; ANSWER SECTION:
>>>> > > pubs.vmware.com.        30      IN      CNAME   pubs.vmware.com.ds.edgekey.net.
>>>> > > pubs.vmware.com.ds.edgekey.net. 10 IN   CNAME   e751.dscx.akamaiedge.net.
>>>> > >
>>>> > > ;; Query time: 3009 msec
>>>> > > ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>> > > ;; WHEN: Wed Sep 02 09:19:04 CEST 2020
>>>> > > ;; MSG SIZE  rcvd: 123
>>>> > >
>>>> > > $ dig pubs.vmware.com @127.0.0.1
>>>> > >
>>>> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
>>>> > > @127.0.0.1
>>>> > > ;; global options: +cmd
>>>> > > ;; Got answer:
>>>> > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1345
>>>> > > ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
>>>> > >
>>>> > > ;; OPT PSEUDOSECTION:
>>>> > > ; EDNS: version: 0, flags:; udp: 4096
>>>> > > ;; QUESTION SECTION:
>>>> > > ;pubs.vmware.com.               IN      A
>>>> > >
>>>> > > ;; ANSWER SECTION:
>>>> > > pubs.vmware.com.        18      IN      CNAME   pubs.vmware.com.ds.edgekey.net.
>>>> > > pubs.vmware.com.ds.edgekey.net. 4 IN    CNAME   e751.dscx.akamaiedge.net.
>>>> > > e751.dscx.akamaiedge.net. 16    IN      A       104.111.214.47
>>>> > >
>>>> > > ;; Query time: 0 msec
>>>> > > ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>> > > ;; WHEN: Wed Sep 02 09:19:08 CEST 2020
>>>> > > ;; MSG SIZE  rcvd: 139
>>>> > >
>>>> > > $ dig pubs.vmware.com @ns03.vmwdns.com
>>>> > >
>>>> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
>>>> > > @ns03.vmwdns.com
>>>> > > ;; global options: +cmd
>>>> > > ;; Got answer:
>>>> > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5509
>>>> > > ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>>>> > > ;; WARNING: recursion requested but not available
>>>> > >
>>>> > > ;; OPT PSEUDOSECTION:
>>>> > > ; EDNS: version: 0, flags:; udp: 4096
>>>> > > ;; QUESTION SECTION:
>>>> > > ;pubs.vmware.com.               IN      A
>>>> > >
>>>> > > ;; ANSWER SECTION:
>>>> > > pubs.vmware.com.        30      IN      CNAME   pubs.vmware.com.ds.edgekey.net.
>>>> > >
>>>> > > ;; Query time: 11 msec
>>>> > > ;; SERVER: 45.54.11.129#53(45.54.11.129)
>>>> > > ;; WHEN: Wed Sep 02 09:34:42 CEST 2020
>>>> > > ;; MSG SIZE  rcvd: 88
>>>> > >
>>>> > > Also, I have a number of SERVFAILs in /var/log/syslog (pdns_recursor is
>>>> > > currently running with loglevel=6).
>>>> > > For example:
>>>> > >
>>>> > > Sep  2 08:45:35 rho pdns_recursor[19311]: Sending SERVFAIL to 127.0.0.1 during resolve of 'pubs.vmware.com' because: Too much time waiting for pubs.vmware.com.ds.edgekey.net|A, timeouts: 5, throttles: 1, queries: 6, 7991msec
>>>> > >
>>>> > > # grep 'Too much time waiting for' /var/log/syslog | wc -l
>>>> > > 184
>>>> > >
>>>> > > As per
>>>> > > https://blog.powerdns.com/2014/12/11/powerdns-graphing-as-a-service/
>>>> > > I send the metrics to
>>>> > > https://metronome1.powerdns.com/?server=pdns.rho-test.recursor&beginTime=-172800
>>>> > >
>>>> > > Does anybody have an idea what's wrong? This seems way too slow for
>>>> > > DNS, and the SERVFAIL shouldn't happen this often.
>>>> > > The server in question is running in a DC of the German hoster
>>>> > > hetzner.de.
>>>> > > Besides the strange DNS behaviour I don't have any problems with the
>>>> > > reliability of the network connection.
>>>> > >
>>>> > > thanks
>>>> > > Chris
>>>> > >
>>>> > > _______________________________________________
>>>> > > Pdns-users mailing list
>>>> > > Pdns-users at mailman.powerdns.com
>>>> > > https://mailman.powerdns.com/mailman/listinfo/pdns-users
>>>> >
>>>> > You did not share any config or traces, so it's hard to tell. A wild
>>>> > guess: It might be you enabled IPV6 but your IPV6 connectivity is bad.
>>>> >
>>>> >     -Otto