[Pdns-users] troubleshooting dnsdist -> recursor instability

Thomas Mieslinger miesi at mail.com
Mon Oct 24 06:34:40 UTC 2022



On 24.10.22 at 01:46, Christoph via Pdns-users wrote:
> Hi,
>
> we have the following setup running on debian 11 bare metal servers and
> doing about 250 qps:

That is a rather low query rate.

> [..]
> Is this unexpected or not unusual?
> If unusual: what would be the usual ways to further track this issue down?

To me, this is unusual.

I'd use dnscap (tcpdump with a decent filter) on the dnsdist and
recursor machines. Check whether the check query goes out from dnsdist
and comes in to the recursor, then whether the reply goes out from the
recursor and comes in to dnsdist.

Review the nftables config on all machines. Maybe someone on your team
installed some hashlimit magic to avoid overload.

Look for a metric that tells you whether you hit the "max in flight"
limit. If you have long-running queries (taking 1000 ms in the
recursor), the in-flight limit can be reached quickly.
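
As a rough back-of-the-envelope check, Little's law (average outstanding
queries = arrival rate x time each query stays open) gives a feel for
how latency drives the in-flight count. The 250 qps figure is from the
original post; the latencies are only illustrative, and whether the
limit is actually hit depends on the configured value, which is not
stated in this thread.

```python
def expected_in_flight(qps: float, latency_s: float) -> float:
    """Little's law: average number of outstanding queries."""
    return qps * latency_s


# 250 qps comes from the original post; the latencies are made-up examples.
for latency_ms in (10, 100, 1000):
    outstanding = expected_in_flight(250, latency_ms / 1000)
    print(f"{latency_ms} ms per query -> about {outstanding:g} in flight")
```

This is an average over all traffic; bursts, or a small number of very
slow queries, can push the real number well above it.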

If your recursors are too slow, their caches might be too cold. You
could warm these caches with generated queries for . and all TLD
nameservers.
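
A minimal sketch of such a cache warmer, using only the Python standard
library. The function names (build_query, warm), the resolver address
and the name list are hypothetical; a real run would feed in the full
TLD list and probably use a proper DNS library instead of hand-built
packets.

```python
import random
import socket
import struct


def build_query(name: str, qtype: int = 2) -> bytes:
    """Build a minimal DNS query (qtype 2 = NS, class IN, RD flag set)."""
    header = struct.pack("!HHHHHH", random.randrange(65536), 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def warm(resolver: str, names, port: int = 53, timeout: float = 2.0) -> int:
    """Send one NS query per name over UDP; return the number of replies seen."""
    replies = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for name in names:
            sock.sendto(build_query(name), (resolver, port))
            try:
                sock.recvfrom(4096)  # reply content is ignored, we only warm the cache
                replies += 1
            except socket.timeout:
                pass  # slow or dropped; move on to the next name
    return replies
```

Calling something like warm("192.0.2.53", [".", "com", "net", "org"])
against each recursor (the address here is a documentation placeholder)
after a restart would prime the root and TLD NS records before real
traffic arrives.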

> [..]

Good luck.

Cheers

Thomas

