[dnsdist] backend drops metrics for TCP

Christoph cm at appliedprivacy.net
Wed Sep 13 05:30:14 UTC 2023


> This counter will always be 0 for TCP backends indeed, it is only 
> incremented when we give up waiting on a UDP response.

Thanks for confirming.

> The default timeout for TCP backends is set at 30s, while for UDP 
> responses it is at 2s. So it is very possible that dnsdist no longer 
> considers the response a timeout but the application now does. You might 
> try to tune the 'tcpRecvTimeout' on `newServer`. Note that this suggests 
> that the backend is slow to answer, so tuning dnsdist might not help at 
> all and investigating why the backend struggles with these queries might 
> be needed.
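
For reference, the tuning suggested above could look like the following
minimal sketch. The backend address and the 5-second value are placeholders
chosen for illustration, not values from this thread; `tcpOnly` is included
on the assumption that the backend is reached over TCP.

```lua
-- Sketch only: address and timeout value are hypothetical placeholders.
newServer({
  address = "192.0.2.53:53",  -- hypothetical backend address
  tcpOnly = true,             -- assumed: forward queries to this backend over TCP
  tcpRecvTimeout = 5,         -- seconds to wait on a TCP read (30 by default, per the quote above)
})
```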

I've switched back to using UDP.
Is there an easy way to log only the queries that time out (2s), and none 
of the others, so that I can investigate some examples further?

I looked at ERCodeRule and the DNSRCode constants, but the only RCode with 
"time" in it is DNSRCode.BADTIME:

https://dnsdist.org/rules-actions.html?highlight=addaction#ERCodeRule
https://dnsdist.org/reference/constants.html#dnsrcode

Yes, I'm also investigating the increased timeout rate on the backend 
Recursor side, and I'm in contact with Otto about it. So far, disabling 
aggressive NSEC caching has been the most effective workaround for that 
problem.

> Do you enable out-of-order processing, 
> via 'maxInFlight' on `newServer`? 

yes (1k)

> If so, are you sure that the backend
> actually supports it?

A while back you pointed out a problem in our Recursor config; since 
then, Recursor should work with the maxInFlight setting.
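
For context, the setting discussed here would be configured roughly as in
this sketch. Only the value of 1000 ("1k") comes from this message; the
backend address and the use of `tcpOnly` are assumptions for illustration.

```lua
-- Sketch only: address is a hypothetical placeholder.
newServer({
  address = "192.0.2.53:53",  -- hypothetical backend address
  tcpOnly = true,             -- assumed: out-of-order processing applies to TCP connections
  maxInFlight = 1000,         -- allow up to 1000 in-flight queries per backend connection
})
```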

best regards,
Christoph

