[dnsdist] dnsdist 1.8.0 thread spinning
Otto Moerbeek
otto at drijf.net
Sat Jul 15 07:42:53 UTC 2023
On Fri, Jul 14, 2023 at 03:06:12PM -0500, Dustin Marquess via dnsdist wrote:
> So far we've had instances of dnsdist 1.8.0 with a thread stuck in a tight loop. OS versions vary widely, so I don't believe it's a glibc bug.
>
> Config on both is the same plain config:
>
> setLocal("127.0.0.1:53", {reusePort=true})
> addLocal("127.0.0.1:53", {reusePort=true})
> addLocal("127.0.0.1:53", {reusePort=true})
> addLocal("127.0.0.1:53", {reusePort=true})
> addACL('10.0.0.0/8')
> newServer({address="10.112.104.116", checkType="A", checkClass=DNSClass.IN, checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> newServer({address="10.112.106.177", checkType="A", checkClass=DNSClass.IN, checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> newServer({address="10.9.41.68", checkType="A", checkClass=DNSClass.IN, checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> setServerPolicy(firstAvailable)
>
> -- Tuning
> setRingBuffersSize(1000000, 100)
> setMaxTCPClientThreads(20)
>
> -- Caching
> -- We should make these tunables configurable
> pc = newPacketCache(100000, {maxTTL=86400, minTTL=0, temporaryFailureTTL=60, staleTTL=60, dontAge=false})
> getPool(""):setCache(pc)
>
> -- Don't try and hit the internet
> setSecurityPollSuffix("")
>
> In each case, a strace shows a bad recvfrom() call in a tight loop:
>
> [pid 2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> [pid 2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> [pid 2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> [pid 2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
>
> Obviously -1 is a bad fd! Restarting dnsdist seems to resolve it. The only idea I can come up with is that when dnsdist first starts, it's unable to contact the upstream DNS servers and that somehow causes the issue. When we restart it, it IS able to contact them, and so works fine.
>
> Any ideas?
>
> Thanks!
> -Dustin
This is likely https://github.com/PowerDNS/pdns/pull/12726
At the moment it is not marked for backporting to 1.8.x. I don't know
whether that is an omission.
-Otto
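
For readers who want to reproduce the syscall result seen in the strace output, here is a minimal sketch. It is not dnsdist code and says nothing about how the linked pull request addresses the problem. It assumes a POSIX system where the C library can be loaded with ctypes.CDLL(None). The names recv_once, receive_loop, and RETRYABLE are hypothetical, and the set of retryable errors is an illustrative choice. It shows that recvfrom() on descriptor -1 fails immediately with EBADF, and that a loop which retries on every error would never make progress, while one that stops on non-retryable errors exits after the first call.

```python
import ctypes
import errno

# Assumption: a POSIX system where the C library is reachable via CDLL(None).
libc = ctypes.CDLL(None, use_errno=True)
libc.recvfrom.restype = ctypes.c_ssize_t
libc.recvfrom.argtypes = [
    ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t,
    ctypes.c_int, ctypes.c_void_p, ctypes.c_void_p,
]

# Errors that a retry can plausibly clear (illustrative, not dnsdist's list).
RETRYABLE = {errno.EINTR, errno.EAGAIN, errno.EWOULDBLOCK}


def recv_once(fd, size=4368):
    """Call recvfrom(2) once; return (return_value, errno_value)."""
    buf = ctypes.create_string_buffer(size)
    ctypes.set_errno(0)
    n = libc.recvfrom(fd, buf, size, 0, None, None)
    return n, (ctypes.get_errno() if n < 0 else 0)


def receive_loop(fd, max_attempts=5):
    """Hypothetical receive loop that stops on non-retryable errors."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        n, err = recv_once(fd)
        if n >= 0:
            return "received", attempts
        if err not in RETRYABLE:
            # EBADF lands here: retrying cannot fix an invalid descriptor.
            return errno.errorcode[err], attempts
    return "gave up", attempts


if __name__ == "__main__":
    print(recv_once(-1))
    print(receive_loop(-1))
```

Because EBADF is not a transient condition, a caller that simply retries recvfrom() on the same descriptor will spin at full CPU, which matches the strace pattern in the report.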