[Pdns-users] troubleshooting dnsdist -> recursor instability

Christoph cm at appliedprivacy.net
Sun Oct 23 23:46:52 UTC 2022


Hi,

We have the following setup running on Debian 11 bare-metal servers,
handling about 250 qps:

dnsdist -> recursor1 (running on the same machine as dnsdist/localhost)
	-> recursor2 (on a second machine on the same network)

The full config files are at the end of this email.

The log lines below should give you a feeling for how frequently
the dnsdist health check fails.

Is this expected, or is it unusual?
If it is unusual: what are the usual ways to track this issue down further?

Thanks!
Christoph

recursor1:

Oct 23 23:03:22 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:03:23 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:03:33 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:03:34 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:03:51 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:03:52 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:05:23 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:05:24 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:07:11 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:07:12 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:09:40 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:09:41 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:10:37 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:10:38 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:12:10 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:12:11 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:14:19 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:14:23 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:15:12 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:15:13 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:15:47 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:15:50 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:17:04 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:17:05 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:18:00 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:18:01 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:18:46 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:18:47 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:20:48 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:20:49 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:23:55 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:23:56 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:24:51 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:24:52 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:24:59 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:25:00 dnsdist: Marking downstream 127.0.0.1:54 as 'up'
Oct 23 23:25:35 dnsdist: Marking downstream 127.0.0.1:54 as 'down'
Oct 23 23:25:36 dnsdist: Marking downstream 127.0.0.1:54 as 'up'

recursor2:

Oct 23 23:00:51 dnsdist: Marking downstream 109.70.100.126:53 as 'down'
Oct 23 23:00:52 dnsdist: Marking downstream 109.70.100.126:53 as 'up'
Oct 23 23:02:52 dnsdist: Marking downstream 109.70.100.126:53 as 'down'
Oct 23 23:02:53 dnsdist: Marking downstream 109.70.100.126:53 as 'up'
Oct 23 23:31:59 dnsdist: Marking downstream 109.70.100.126:53 as 'down'
Oct 23 23:32:00 dnsdist: Marking downstream 109.70.100.126:53 as 'up'
Oct 23 23:37:40 dnsdist: Marking downstream 109.70.100.126:53 as 'down'
Oct 23 23:37:41 dnsdist: Marking downstream 109.70.100.126:53 as 'up'
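
To put numbers on the flapping shown above, the 'down'/'up' pairs can be
turned into outage durations per downstream. The following is only a
rough sketch: the function name summarize_flaps and the regular
expression are my own, and the pattern assumes the trimmed line format
used in this email (real syslog/journal lines usually also carry a
hostname and PID, so the pattern would need adjusting). The year is not
in the log lines, so it is passed in as an assumption.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the trimmed log format shown in this email (an assumption;
# adjust for hostname/PID fields in real syslog output).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}) dnsdist: "
    r"Marking downstream (?P<server>\S+) as '(?P<state>up|down)'$"
)


def summarize_flaps(lines, year=2022):
    """Return {server: [outage_seconds, ...]} from dnsdist up/down lines."""
    down_since = {}              # server -> timestamp of the last 'down'
    outages = defaultdict(list)  # server -> durations of completed outages
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore anything that is not an up/down marker
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        server = m["server"]
        if m["state"] == "down":
            down_since[server] = ts
        elif server in down_since:
            started = down_since.pop(server)
            outages[server].append((ts - started).total_seconds())
    return dict(outages)


# A few lines copied from the logs above.
sample = [
    "Oct 23 23:14:19 dnsdist: Marking downstream 127.0.0.1:54 as 'down'",
    "Oct 23 23:14:23 dnsdist: Marking downstream 127.0.0.1:54 as 'up'",
    "Oct 23 23:15:12 dnsdist: Marking downstream 127.0.0.1:54 as 'down'",
    "Oct 23 23:15:13 dnsdist: Marking downstream 127.0.0.1:54 as 'up'",
    "Oct 23 23:00:51 dnsdist: Marking downstream 109.70.100.126:53 as 'down'",
    "Oct 23 23:00:52 dnsdist: Marking downstream 109.70.100.126:53 as 'up'",
]

for server, durations in summarize_flaps(sample).items():
    print(f"{server}: {len(durations)} outages, longest {max(durations):.0f}s")
```

Fed with the complete log, this should make it easier to compare how
often and for how long each downstream is marked down.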


dnsdist.conf (1.7.2-1pdns.bullseye amd64)
---------------------
newServer({address="127.0.0.1:54", maxInFlight=1000})
newServer({address="109.70.100.126", maxInFlight=1000})
setServerPolicy(firstAvailable)

setLocal('127.0.0.1', { maxInFlight=1000} )
setACL('127.0.0.0/8')
pc = newPacketCache(50000, {maxTTL=86400, minTTL=3, 
temporaryFailureTTL=60, staleTTL=60, dontAge=false})
getPool(""):setCache(pc)
controlSocket('127.0.0.1:5199')
setConsoleACL('127.0.0.1/8')
setKey("---")
webserver("127.0.0.1:8083")
setWebserverConfig({password="---"})
---------------------


recursor.conf (4.7.3-1pdns.bullseye) for the recursor running on localhost.
The one at 109.70.100.126:53 has basically the same config, with other IPs:
---------------------
config-dir=/etc/powerdns
setuid=pdns
setgid=pdns

aggressive-nsec-cache-size=100000
allow-from=127.0.0.0/8,109.70.100.0/24
distributor-threads=1
dnssec=validate
dnssec-log-bogus=no
edns-padding-from=127.0.0.0/8,109.70.100.0/24
edns-padding-mode=padded-queries-only
extended-resolution-errors=yes
local-address=127.0.0.1:54,109.70.100.125:53,109.70.100.136:53
log-common-errors=no
log-rpz-changes=no
log-timestamp=yes
loglevel=3
max-busy-dot-probes=5
max-cache-entries=1000000
max-packetcache-entries=500000
minimum-ttl-override=5
new-domain-tracking=no
nothing-below-nxdomain=dnssec
pdns-distributes-queries=yes
qname-minimization=yes
query-local-address=109.70.100.125,2a03:e600:100::178
quiet=yes
refresh-on-ttl-perc=10
threads=8
version-string=PowerDNS Recursor
webserver=yes
webserver-address=127.0.0.1
webserver-allow-from=127.0.0.1,::1
webserver-loglevel=normal
webserver-password=---

lua-config-file=/etc/powerdns/config.lua

---------------------

A related GitHub feature request is scheduled for dnsdist 1.9:
https://github.com/PowerDNS/pdns/issues/12113



