[Pdns-users] Slow query and SERVERFAIL from local pdns_recursor
Christian Degenkolb
christian+pdns at degenkolb.net
Thu Sep 10 13:40:54 UTC 2020
Hi Thomas,
what is a reasonably low value for udp-truncation-threshold? I tried
900 and 600 (as low as half the default value) but saw no
improvement.
Also, I don't think this is a vmware.com problem, since I have the same
problem with multiple domains.
To illustrate, I took the tool dnsperf from
https://www.dns-oarc.net/tools/dnsperf and created a query file from the
list of 500 domains at https://moz.com/top500; see
https://paste.ubuntu.com/p/DxGBqRvngv/
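For reference, a minimal sketch of how such a query file could be generated. It assumes the dnsperf input format of one "name type" pair per line; the helper name build_queryfile and the sample domains are made up for illustration and are not the actual list used here.

```python
def build_queryfile(domains, qtype="A"):
    """Return dnsperf input text: one 'name type' pair per line."""
    lines = []
    seen = set()
    for raw in domains:
        # Normalise: trim whitespace, lowercase, drop a trailing dot.
        name = raw.strip().lower().rstrip(".")
        if not name or name in seen:
            continue  # skip blank entries and duplicates
        seen.add(name)
        lines.append(f"{name} {qtype}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample input, not the real top-500 list.
    sample = ["google.com", "Vmware.com ", "", "google.com"]
    print(build_queryfile(sample), end="")
```

The output can be written to a file and passed to dnsperf with -d.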
If I run the tool against my local resolver with a clean cache (even with
udp-truncation-threshold=600) I get the following output.
# rec_control wipe-cache $
wiped 4154 records, 8 negative records, 500 packets
# ./dnsperf -d queryfile_top500_clean
DNS Performance Testing Tool
Version 2.3.4
[Status] Command line: dnsperf -d queryfile_top500_clean
[Status] Sending queries (to 127.0.0.1)
[Status] Started at: Thu Sep 10 15:29:26 2020
[Status] Stopping after 1 run through file
<snip multiple lines like "[Timeout] Query timed out: msg id 0" and
"Warning: received a response with an unexpected (maybe timed out) id:
162">
[Status] Testing complete (end of file)
Statistics:
Queries sent: 500
Queries completed: 278 (55.60%)
Queries lost: 222 (44.40%)
Response codes: NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)
Average packet size: request 29, response 56
Run time (s): 16.455935
Queries per second: 16.893601
Average Latency (s): 1.313376 (min 0.000543, max 4.491949)
Latency StdDev (s): 1.446709
# ./dnsperf -d queryfile_top500_clean
DNS Performance Testing Tool
Version 2.3.4
[Status] Command line: dnsperf -d queryfile_top500_clean
[Status] Sending queries (to 127.0.0.1)
[Status] Started at: Thu Sep 10 15:29:49 2020
[Status] Stopping after 1 run through file
[Status] Testing complete (end of file)
Statistics:
Queries sent: 500
Queries completed: 500 (100.00%)
Queries lost: 0 (0.00%)
Response codes: NOERROR 281 (56.20%), SERVFAIL 219 (43.80%)
Average packet size: request 29, response 50
Run time (s): 4.571526
Queries per second: 109.372669
Average Latency (s): 0.015253 (min 0.000054, max 4.556146)
Latency StdDev (s): 0.244755
As I see it, that is way too many queries lost with an empty cache, and a
way too high SERVFAIL rate for this kind of domain, even on retries.
The SERVFAIL rate stays high on subsequent runs.
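To put numbers on this, here is a small sketch that pulls the loss and SERVFAIL rates out of a dnsperf "Statistics" block. The regular expressions assume the output format of dnsperf 2.3.4 exactly as pasted above; the function name summarize is made up for illustration.

```python
import re


def summarize(stats_text):
    """Extract query counts and response codes from dnsperf statistics."""
    def count(label):
        m = re.search(rf"{label}:\s+(\d+)", stats_text)
        return int(m.group(1)) if m else 0

    sent = count("Queries sent")
    completed = count("Queries completed")
    lost = count("Queries lost")

    # "Response codes: NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)"
    rcodes = {}
    m = re.search(r"Response codes:\s+(.*)", stats_text)
    if m:
        for name, n in re.findall(r"([A-Z]+)\s+(\d+)\s+\(", m.group(1)):
            rcodes[name] = int(n)

    servfail = rcodes.get("SERVFAIL", 0)
    return {
        "sent": sent,
        "completed": completed,
        "lost": lost,
        "lost_pct": 100.0 * lost / sent if sent else 0.0,
        # SERVFAIL as a share of all queries sent, not only of those answered
        "servfail_pct_of_sent": 100.0 * servfail / sent if sent else 0.0,
        "rcodes": rcodes,
    }


if __name__ == "__main__":
    cold_cache_run = """Queries sent: 500
Queries completed: 278 (55.60%)
Queries lost: 222 (44.40%)
Response codes: NOERROR 209 (75.18%), SERVFAIL 69 (24.82%)"""
    print(summarize(cold_cache_run))
```

For the first run this counts 222 lost plus 69 SERVFAIL, so 291 of the 500 queries did not get a usable answer.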
Whereas if I run it against 1.1.1.1 (or the hoster DNS server) I get the
following output.
# ./dnsperf -d queryfile_top500_clean -s 1.1.1.1
DNS Performance Testing Tool
Version 2.3.4
[Status] Command line: dnsperf -d queryfile_top500_clean -s 1.1.1.1
[Status] Sending queries (to 1.1.1.1)
[Status] Started at: Thu Sep 10 15:33:24 2020
[Status] Stopping after 1 run through file
[Status] Testing complete (end of file)
Statistics:
Queries sent: 500
Queries completed: 500 (100.00%)
Queries lost: 0 (0.00%)
Response codes: NOERROR 499 (99.80%), SERVFAIL 1 (0.20%)
Average packet size: request 29, response 77
Run time (s): 0.882704
Queries per second: 566.441299
Average Latency (s): 0.013521 (min 0.005065, max 0.863349)
Latency StdDev (s): 0.054510
A near perfect score.
Doesn't this mean the problem lies within the local resolver, since
dnsperf would make the same requests the local resolver would make to
the external DNS server?
Or at least that there is no uplink problem, but rather something local
to my server?
regards
Chris
On 2020-09-09 10:05, Thomas Mieslinger via Pdns-users wrote:
> Hi Christian,
>
> Hetzner might filter IP fragments. Please check whether your situation
> improves if you set udp-truncation-threshold to a reasonably low value.
>
> By default pdns-recursor does DNSSEC. I would suggest adding
> +dnssec to your dig queries.
>
> A possible workaround for the vmware.com problems is to add a negative
> trust anchor for vmware.com. in pdns config.
>
> Cheers Thomas
>
> On 9/8/20 2:16 PM, Christian Degenkolb via Pdns-users wrote:
>> Hi,
>>
>> I set the trace=yes option in the recursor config and redid the tests
>> for pubs.vmware.com.
>>
>> The log can be found here https://paste.debian.net/hidden/07526601/
>>
>> I found two timeouts in the logs
>>
>> Line 41:
>> Sep 8 10:21:54 rho pdns_recursor[25208]: [3] pubs.vmware.com: Resolved 'vmware.com' NS ns01.vmwdns.com to: 45.54.11.1
>> Sep 8 10:21:54 rho pdns_recursor[25208]: [3] pubs.vmware.com: Trying IP 45.54.11.1:53, asking 'pubs.vmware.com|A'
>> Sep 8 10:21:56 rho pdns_recursor[25208]: [3] pubs.vmware.com: timeout resolving after 1501.63msec
>> Sep 8 10:21:56 rho pdns_recursor[25208]: [3] pubs.vmware.com: Trying to resolve NS 'ns04.vmwdns.com' (2/8)
>>
>> But a direct request to 45.54.11.1 for pubs.vmware.com comes back within
>> 11 msec.
>>
>> $ dig -t A @45.54.11.1 pubs.vmware.com
>>
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @45.54.11.1
>> pubs.vmware.com
>> ; (1 server found)
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24122
>> ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>> ;; WARNING: recursion requested but not available
>>
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ;; QUESTION SECTION:
>> ;pubs.vmware.com. IN A
>>
>> ;; ANSWER SECTION:
>> pubs.vmware.com. 30 IN CNAME pubs.vmware.com.ds.edgekey.net.
>>
>> ;; Query time: 11 msec
>> ;; SERVER: 45.54.11.1#53(45.54.11.1)
>> ;; WHEN: Tue Sep 08 13:29:57 CEST 2020
>> ;; MSG SIZE rcvd: 88
>>
>> and a second timeout in line 159:
>>
>> Sep 8 10:21:56 rho pdns_recursor[25208]: [3] e751.dscx.akamaiedge.net: Trying IP 2.16.106.23:53, asking 'e751.dscx.akamaiedge.net|A'
>> Sep 8 10:21:57 rho pdns_recursor[25208]: [3] e751.dscx.akamaiedge.net: timeout resolving after 1501.74msec
>> Sep 8 10:21:57 rho pdns_recursor[25208]: [3] e751.dscx.akamaiedge.net: Trying to resolve NS 'n3dscx.akamaiedge.net' (2/8)
>>
>> Same picture here with a very good response time.
>>
>> $ dig -t A @2.16.106.23 e751.dscx.akamaiedge.net
>>
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @2.16.106.23
>> e751.dscx.akamaiedge.net
>> ; (1 server found)
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7947
>> ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>> ;; WARNING: recursion requested but not available
>>
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ;; QUESTION SECTION:
>> ;e751.dscx.akamaiedge.net. IN A
>>
>> ;; ANSWER SECTION:
>> e751.dscx.akamaiedge.net. 20 IN A 104.111.214.47
>>
>> ;; Query time: 5 msec
>> ;; SERVER: 2.16.106.23#53(2.16.106.23)
>> ;; WHEN: Tue Sep 08 13:31:32 CEST 2020
>> ;; MSG SIZE rcvd: 69
>>
>>
>> To check that this is not a vmware.com problem I tested some more and
>> got the same timeouts.
>>
>>
>> One more example for
>>
>> $ dig nameservers.dnscheck.co @127.0.0.1
>>
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> nameservers.dnscheck.co
>> @127.0.0.1
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 23852
>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
>>
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ;; QUESTION SECTION:
>> ;nameservers.dnscheck.co. IN A
>>
>> ;; Query time: 3005 msec
>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>> ;; WHEN: Tue Sep 08 12:15:29 CEST 2020
>> ;; MSG SIZE rcvd: 52
>>
>> The trace log for this query can be found here: https://paste.debian.net/hidden/b48a78a2/
>>
>> This time there are multiple timeouts for the root name servers, for
>> example g.root-servers.net
>>
>> Sep 8 12:15:21 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: Resolved '.' NS g.root-servers.net to: 192.112.36.4
>> Sep 8 12:15:21 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: Trying IP 192.112.36.4:53, asking 'nameservers.dnscheck.co|A'
>> Sep 8 12:15:22 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: timeout resolving after 1501.63msec
>> Sep 8 12:15:22 rho pdns_recursor[25208]: [50] nameservers.dnscheck.co: Trying to resolve NS 'j.root-servers.net' (2/13)
>>
>> Where a direct request via dig works like a charm.
>>
>> $ dig -t A @192.112.36.4 nameservers.dnscheck.co
>>
>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> -t A @192.112.36.4
>> nameservers.dnscheck.co
>> ; (1 server found)
>> ;; global options: +cmd
>> ;; Got answer:
>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18641
>> ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 13
>> ;; WARNING: recursion requested but not available
>>
>> ;; OPT PSEUDOSECTION:
>> ; EDNS: version: 0, flags:; udp: 4096
>> ; COOKIE: ce9eaf15bb34977b41354b5f5f576c3841785bfba5901e93 (good)
>> ;; QUESTION SECTION:
>> ;nameservers.dnscheck.co. IN A
>>
>> ;; AUTHORITY SECTION:
>> co. 172800 IN NS ns5.cctld.co.
>> co. 172800 IN NS ns1.cctld.co.
>> co. 172800 IN NS ns6.cctld.co.
>> co. 172800 IN NS ns4.cctld.co.
>> co. 172800 IN NS ns3.cctld.co.
>> co. 172800 IN NS ns2.cctld.co.
>>
>> ;; ADDITIONAL SECTION:
>> ns1.cctld.co. 172800 IN A 156.154.100.25
>> ns2.cctld.co. 172800 IN A 156.154.101.25
>> ns3.cctld.co. 172800 IN A 156.154.102.25
>> ns4.cctld.co. 172800 IN A 156.154.103.25
>> ns5.cctld.co. 172800 IN A 156.154.104.25
>> ns6.cctld.co. 172800 IN A 156.154.105.25
>> ns1.cctld.co. 172800 IN AAAA 2001:502:2eda::21
>> ns2.cctld.co. 172800 IN AAAA 2001:502:ad09::21
>> ns3.cctld.co. 172800 IN AAAA 2610:a1:1009::21
>> ns4.cctld.co. 172800 IN AAAA 2610:a1:1010::21
>> ns5.cctld.co. 172800 IN AAAA 2610:a1:1011::21
>> ns6.cctld.co. 172800 IN AAAA 2610:a1:1012::21
>>
>> ;; Query time: 16 msec
>> ;; SERVER: 192.112.36.4#53(192.112.36.4)
>> ;; WHEN: Tue Sep 08 13:34:20 CEST 2020
>> ;; MSG SIZE rcvd: 458
>>
>>
>> Additionally, I get the resolved IPs in the trace logs (line 328,
>> apparently from the second worker thread) but not in the dig output.
>>
>> Sep 8 12:15:33 rho pdns_recursor[25208]: [51] nameservers.dnscheck.co: answer is in: resolved to '52.48.61.155|A'
>> Sep 8 12:15:33 rho pdns_recursor[25208]: [51] nameservers.dnscheck.co: answer is in: resolved to '104.236.169.228|A'
>> Sep 8 12:15:33 rho pdns_recursor[25208]: [51] nameservers.dnscheck.co: answer is in: resolved to '104.131.72.189|A'
>>
>> Is this a dig timeout? Or do I only get the response from the first
>> worker thread?
>>
>> And now I'm more confused than before. The connection from and to the
>> server (SSH, etc.) is rock solid.
>> An iperf test shows the full gigabit connection is available.
>> The server is more or less idle and has 8 cores and 32 GB RAM. It is mostly a
>> Docker host with some 20-30 containers (nextcloud, mailcow, ...) running
>> for personal use by me and my family.
>>
>> How can I check for problems with a large number of small connections?
>> But this shouldn't be that much for a single local recursor, should it?
>>
>> Also I don't see any network related messages in the kernel log or
>> anywhere else.
>> I'm not aware of any rate limits for the uplink to the provider.
>>
>> regards
>> Chris
>>
>>
>>
>>
>>
>>
>> On 2020-09-08 09:33, Otto Moerbeek wrote:
>>> On Tue, Sep 08, 2020 at 09:22:31AM +0200, Christian Degenkolb wrote:
>>>
>>>> (sent again; the first answer was not cc'd to the ML)
>>>>
>>>> Hi,
>>>>
>>>> sorry for not sending any configs. pdns_recursor runs more or less with the
>>>> vanilla config, with the following changes:
>>>>
>>>> forward-zones-recurse=zen.spamhaus.org=1.1.1.1;1.0.0.1 (that's why I wanted
>>>> to use the local recursor; as mentioned, the server is located in the Hetzner
>>>> IP range, which apparently is blocked for the Spamhaus DNSBL)
>>>> loglevel=6
>>>> log-common-errors=yes
>>>> quiet=no
>>>> root-nx-trust=no (found this as a suggested fix for the SERVFAIL, but it did
>>>> not work)
>>>>
>>>> and
>>>> # rec_control set-carbon-server 37.252.122.50 rho-test (for the graphs)
>>>>
>>>>
>>>> A trace for the same resolves from my last mail:
>>>>
>>>> $ time dig +trace pubs.vmware.com @127.0.0.1
>>>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> +trace pubs.vmware.com
>>>> @127.0.0.1
>>>> ;; global options: +cmd
>>>> . 86118 IN NS d.root-servers.net.
>>>> . 86118 IN NS c.root-servers.net.
>>>> . 86118 IN NS l.root-servers.net.
>>>> . 86118 IN NS b.root-servers.net.
>>>> . 86118 IN NS f.root-servers.net.
>>>> . 86118 IN NS m.root-servers.net.
>>>> . 86118 IN NS e.root-servers.net.
>>>> . 86118 IN NS a.root-servers.net.
>>>> . 86118 IN NS i.root-servers.net.
>>>> . 86118 IN NS k.root-servers.net.
>>>> . 86118 IN NS g.root-servers.net.
>>>> . 86118 IN NS h.root-servers.net.
>>>> . 86118 IN NS j.root-servers.net.
>>>> . 86118 IN RRSIG NS 8 0 518400
>>>> 20200921050000
>>>> 20200908040000 46594 .
>>>> wgnBz8tKA9hjwIxmMQgTVwnZaiUpAB9a1+oC5T/syHzqNj1e5qhApLQN
>>>> NLok43hu5Ykt8RFe/IiDZuYxIdyyzItwk
>>>> 4QN8xNgsQsfhVfBbZ26bWRz
>>>> fskquwnFn6Gmvq2qI6o42tsBxXUw09X4sNlNYI2zHB3sKaaMu0AbN9WI
>>>> Pe14jpX/PwaP3m78+XqMy9CiKmuDon6g3BuyecPhCZL5Pa8ZPC7nrKfV
>>>> pfyNSiPoBODsJE96UHGlOCJTFcbu/6Ia4ek3AGOJf+WC84HPrxLT
>>>> riyk XHfbPl7EjTbFSPgT8D7jGBfVCTQU3JSfynv29VFAHWZu1gm5VJWNQGaw
>>>> u5gatA==
>>>> ;; Received 540 bytes from 127.0.0.1#53(127.0.0.1) in 0 ms
>>>>
>>>> com. 172800 IN NS a.gtld-servers.net.
>>>> com. 172800 IN NS b.gtld-servers.net.
>>>> com. 172800 IN NS c.gtld-servers.net.
>>>> com. 172800 IN NS d.gtld-servers.net.
>>>> com. 172800 IN NS e.gtld-servers.net.
>>>> com. 172800 IN NS f.gtld-servers.net.
>>>> com. 172800 IN NS g.gtld-servers.net.
>>>> com. 172800 IN NS h.gtld-servers.net.
>>>> com. 172800 IN NS i.gtld-servers.net.
>>>> com. 172800 IN NS j.gtld-servers.net.
>>>> com. 172800 IN NS k.gtld-servers.net.
>>>> com. 172800 IN NS l.gtld-servers.net.
>>>> com. 172800 IN NS m.gtld-servers.net.
>>>> com. 86400 IN DS 30909 8 2
>>>> E2D3C916F6DEEAC73294E8268FB5885044A833FC5459588F4A9184CF C41A5766
>>>> com. 86400 IN RRSIG DS 8 1 86400
>>>> 20200921050000
>>>> 20200908040000 46594 .
>>>> zz85z6R/YUHxyW+ywA6zrgiYILjPo0i248M3wU+2XCRCneBH6yknQfjM
>>>> LIcbo3vADVUlkJd0l4W2TLd7NPgC255hr2
>>>> +ALojzzHa07jyFmE203Kdw
>>>> ma7XL0C55TdFrCEMhARkZf4EncfJH9JH+fdWRWdMr0EQZd1A+FzMYemO
>>>> o7/L/8ZYq4FOt0vz+zheAJNDveGii+QpXAoDyw4xt3HMUVM+40Z/VgD1
>>>> tk9Y3K9e2wwRNISeHdlq21JFVA2SY/gDgPCzBtM1r9Yz7oFZ2ld5W
>>>> AD0 P84GPEUMgUceAGofwxlV9+dSawhunskb+yVrpdjpizLageyJRWEu/F9A
>>>> zDXxew==
>>>> ;; Received 1175 bytes from 198.97.190.53#53(h.root-servers.net) in 5 ms
>>>>
>>>> vmware.com. 172800 IN NS dns1.p05.nsone.net.
>>>> vmware.com. 172800 IN NS dns2.p05.nsone.net.
>>>> vmware.com. 172800 IN NS dns3.p05.nsone.net.
>>>> vmware.com. 172800 IN NS dns4.p05.nsone.net.
>>>> vmware.com. 172800 IN NS ns01.vmwdns.com.
>>>> vmware.com. 172800 IN NS ns02.vmwdns.com.
>>>> vmware.com. 172800 IN NS ns03.vmwdns.com.
>>>> vmware.com. 172800 IN NS ns04.vmwdns.com.
>>>> vmware.com. 86400 IN DS 48553 13 2
>>>> AA2C697F3990472642AF01509A18224828E403CA8608EC75D5C83002 CE21847E
>>>> vmware.com. 86400 IN RRSIG DS 8 2 86400
>>>> 20200915062203
>>>> 20200908051203 24966 com.
>>>> FA2xsJKvT2LLn5UEy7hAE7PaYmds7FBkQB0SGhm8riwJRKnxbHAY0tvv
>>>> I1T/k0EzXJ4wy1J5qzNLMjhzFgPxEQB
>>>> 6BwBfJm8qo8Cnzxm4YC5Ko1/9
>>>> pDWooVBHoFfMmJgu14Dk+u1AcHobxH9pPs7az16cLK/3YeaFW3dCrIVQ
>>>> NK2fZc0d/pc7CY0Zl1LjYQdTq+MsZiL2kbepEHD6A/4J6g==
>>>> ;; Received 523 bytes from 2001:503:eea3::30#53(g.gtld-servers.net) in 6 ms
>>>>
>>>> pubs.vmware.com. 30 IN CNAME
>>>> pubs.vmware.com.ds.edgekey.net.
>>>> pubs.vmware.com. 30 IN RRSIG CNAME 13 3 30
>>>> 20200909071011
>>>> 20200907071011 12752 vmware.com.
>>>> yTxj4OFvCx3flxtOFAFdkwAOpOAVNibgseFi5U5ekzYbdATw98xZqrDT
>>>> tYs/n46iHFiLN4ql4Y3MS6U
>>>> 16Qr6DQ==
>>>> ;; Received 194 bytes from 45.54.11.1#53(ns01.vmwdns.com) in 11 ms
>>>>
>>>> real 0m32.149s
>>>> user 0m0.012s
>>>> sys 0m0.012s
>>>>
>>>> But this looks normal to me. I don't know why the trace only shows 5, 6 and
>>>> 11 ms for the individual steps but takes up to 32 seconds to finish.
>>>
>>> Well, that is suspect, but see below.
>>>
>>>>
>>>> Regarding your question about the IPv6 connectivity: how can I test
>>>> this?
>>>
>>> Run pdns_recursor with the --trace option (or trace=yes in the config
>>> file), do some queries and look at the results in the log file. Now
>>> the recursor logs a lot in trace mode, so take your time trying to
>>> understand what is going on. Members of this list can likely help if
>>> you do not spot anything.
>>>
>>> -Otto
>>>
>>>>
>>>> I did a
>>>>
>>>> $ dig ipv6.google.com @127.0.0.1
>>>>
>>>> ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> ipv6.google.com
>>>> @127.0.0.1
>>>> ;; global options: +cmd
>>>> ;; Got answer:
>>>> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9226
>>>> ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
>>>>
>>>> ;; OPT PSEUDOSECTION:
>>>> ; EDNS: version: 0, flags:; udp: 4096
>>>> ;; QUESTION SECTION:
>>>> ;ipv6.google.com. IN A
>>>>
>>>> ;; ANSWER SECTION:
>>>> ipv6.google.com. 86400 IN CNAME ipv6.l.google.com.
>>>>
>>>> ;; AUTHORITY SECTION:
>>>> l.google.com. 60 IN SOA ns1.google.com. dns-admin.google.com. 330353109 900 900 1800 60
>>>>
>>>> ;; Query time: 3087 msec
>>>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>> ;; WHEN: Tue Sep 08 09:12:50 CEST 2020
>>>> ;; MSG SIZE rcvd: 115
>>>>
>>>> and
>>>>
>>>> $ ping6 ipv6.google.com
>>>> PING ipv6.google.com(fra16s13-in-x0e.1e100.net
>>>> (2a00:1450:4001:819::200e))
>>>> 56 data bytes
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
>>>> icmp_seq=1 ttl=118 time=5.11 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
>>>> icmp_seq=2 ttl=118 time=5.08 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
>>>> icmp_seq=3 ttl=118 time=5.12 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
>>>> icmp_seq=4 ttl=118 time=5.13 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
>>>> icmp_seq=5 ttl=118 time=5.09 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
>>>> icmp_seq=6 ttl=118 time=5.08 ms
>>>> 64 bytes from fra16s13-in-x0e.1e100.net (2a00:1450:4001:819::200e):
>>>> icmp_seq=7 ttl=118 time=5.08 ms
>>>> ^C
>>>> --- ipv6.google.com ping statistics ---
>>>> 7 packets transmitted, 7 received, 0% packet loss, time 24ms
>>>> rtt min/avg/max/mdev = 5.075/5.096/5.133/0.043 ms
>>>>
>>>> and it looks good.
>>>>
>>>> regards
>>>> Chris
>>>>
>>>>
>>>> On 2020-09-04 15:05, Otto Moerbeek wrote:
>>>> > On Wed, Sep 02, 2020 at 09:44:37AM +0200, Christian Degenkolb via
>>>> > Pdns-users wrote:
>>>> >
>>>> > > Hi,
>>>> > >
>>>> > > I hope somebody on the ML can help me figure out what I'm doing wrong.
>>>> > > I have a local pdns_recursor (version 4.1.11-1+deb10u1 from Debian 10)
>>>> > > running and added it at the top of my /etc/resolv.conf as 127.0.0.1.
>>>> > >
>>>> > > However, I see some strange SERVFAIL responses and, all in all, a
>>>> > > slow DNS system.
>>>> > >
>>>> > > For example, see the following two consecutive lookups and a direct request
>>>> > > to the NS.
>>>> > > The first one takes about 3 seconds vs. 11 ms from the same system if I
>>>> > > query the NS directly.
>>>> > >
>>>> > > $ dig pubs.vmware.com @127.0.0.1
>>>> > >
>>>> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
>>>> > > @127.0.0.1
>>>> > > ;; global options: +cmd
>>>> > > ;; Got answer:
>>>> > > ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4929
>>>> > > ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
>>>> > >
>>>> > > ;; OPT PSEUDOSECTION:
>>>> > > ; EDNS: version: 0, flags:; udp: 4096
>>>> > > ;; QUESTION SECTION:
>>>> > > ;pubs.vmware.com. IN A
>>>> > >
>>>> > > ;; ANSWER SECTION:
>>>> > > pubs.vmware.com. 30 IN CNAME pubs.vmware.com.ds.edgekey.net.
>>>> > > pubs.vmware.com.ds.edgekey.net. 10 IN CNAME e751.dscx.akamaiedge.net.
>>>> > >
>>>> > > ;; Query time: 3009 msec
>>>> > > ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>> > > ;; WHEN: Wed Sep 02 09:19:04 CEST 2020
>>>> > > ;; MSG SIZE rcvd: 123
>>>> > >
>>>> > > $ dig pubs.vmware.com @127.0.0.1
>>>> > >
>>>> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
>>>> > > @127.0.0.1
>>>> > > ;; global options: +cmd
>>>> > > ;; Got answer:
>>>> > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1345
>>>> > > ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
>>>> > >
>>>> > > ;; OPT PSEUDOSECTION:
>>>> > > ; EDNS: version: 0, flags:; udp: 4096
>>>> > > ;; QUESTION SECTION:
>>>> > > ;pubs.vmware.com. IN A
>>>> > >
>>>> > > ;; ANSWER SECTION:
>>>> > > pubs.vmware.com. 18 IN CNAME pubs.vmware.com.ds.edgekey.net.
>>>> > > pubs.vmware.com.ds.edgekey.net. 4 IN CNAME e751.dscx.akamaiedge.net.
>>>> > > e751.dscx.akamaiedge.net. 16 IN A 104.111.214.47
>>>> > >
>>>> > > ;; Query time: 0 msec
>>>> > > ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>> > > ;; WHEN: Wed Sep 02 09:19:08 CEST 2020
>>>> > > ;; MSG SIZE rcvd: 139
>>>> > >
>>>> > > $ dig pubs.vmware.com @ns03.vmwdns.com
>>>> > >
>>>> > > ; <<>> DiG 9.11.5-P4-5.1+deb10u2-Debian <<>> pubs.vmware.com
>>>> > > @ns03.vmwdns.com
>>>> > > ;; global options: +cmd
>>>> > > ;; Got answer:
>>>> > > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5509
>>>> > > ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
>>>> > > ;; WARNING: recursion requested but not available
>>>> > >
>>>> > > ;; OPT PSEUDOSECTION:
>>>> > > ; EDNS: version: 0, flags:; udp: 4096
>>>> > > ;; QUESTION SECTION:
>>>> > > ;pubs.vmware.com. IN A
>>>> > >
>>>> > > ;; ANSWER SECTION:
>>>> > > pubs.vmware.com. 30 IN CNAME pubs.vmware.com.ds.edgekey.net.
>>>> > >
>>>> > > ;; Query time: 11 msec
>>>> > > ;; SERVER: 45.54.11.129#53(45.54.11.129)
>>>> > > ;; WHEN: Wed Sep 02 09:34:42 CEST 2020
>>>> > > ;; MSG SIZE rcvd: 88
>>>> > >
>>>> > > Also, I have a number of SERVFAIL entries in /var/log/syslog (pdns_recursor is
>>>> > > currently running with loglevel=6).
>>>> > > For example:
>>>> > >
>>>> > > Sep 2 08:45:35 rho pdns_recursor[19311]: Sending SERVFAIL to 127.0.0.1 during resolve of 'pubs.vmware.com' because: Too much time waiting for pubs.vmware.com.ds.edgekey.net|A, timeouts: 5, throttles: 1, queries: 6, 7991msec
>>>> > >
>>>> > > # grep 'Too much time waiting for' /var/log/syslog | wc -l
>>>> > > 184
>>>> > >
>>>> > > As per
>>>> > > https://blog.powerdns.com/2014/12/11/powerdns-graphing-as-a-service/
>>>> > > I send the metrics to
>>>> > > https://metronome1.powerdns.com/?server=pdns.rho-test.recursor&beginTime=-172800
>>>> > >
>>>> > > Does anybody have an idea what's wrong? This seems way too slow for DNS, and
>>>> > > the SERVFAIL shouldn't happen this often.
>>>> > > The server in question is running in a DC of the German hoster hetzner.de.
>>>> > > Besides the strange DNS behaviour, I don't have any problems with the
>>>> > > reliability of the network connection.
>>>> > >
>>>> > > thanks
>>>> > > Chris
>>>> > >
>>>> > > _______________________________________________
>>>> > > Pdns-users mailing list
>>>> > > Pdns-users at mailman.powerdns.com
>>>> > > https://mailman.powerdns.com/mailman/listinfo/pdns-users
>>>> >
>>>> > You did not share any config or traces, so it's hard to tell. A wild
>>>> > guess: it might be that you enabled IPv6 but your IPv6 connectivity is bad.
>>>> >
>>>> > -Otto