[dnsdist] dnsdist 1.7.4 Debian Bullseye vs 1.8.4 Bullseye
Aleš Rygl
ales at rygl.net
Mon Oct 9 10:32:41 UTC 2023
Hi
> On 05/10/2023 10:41, Aleš Rygl via dnsdist wrote:
>> Thanks for your response. After some deep documentation reading
>> and config tweaking I am nearly on the previous values regarding CPU
>> load, apart from latency, which is still higher (1.3ms -> 2.3ms). I
>> suspect the latency is now computed differently (I noticed a
>> new set of latency counters for TLS, TCP, etc.). The key
>> configuration parameter is setMaxTCPClientThreads(). Changing
>> anything else (cache shards, number of listeners, etc.) has nearly no
>> impact. We had 256 with 1.7.4; now it is 16. Going higher means a
>> rapid increase in CPU load, while going below 16 means dropped TCP
>> connections in showTCPStats(), where Queued hits Max Queued. Insane
>> values like 1024 kill the CPU. We have a physical server with 16
>> physical cores; the OS sees 32.
>
> OK, this is clearly unexpected. I mean, since 1.4.0 you should not be
> needing more TCP worker threads than the number of cores, since a
> single worker can handle a lot (easily thousands) of TCP connections,
> but having a larger value should not kill the CPU so I'm wondering if
> we are busy-looping somewhere. I have not been able to reproduce that
> so far, so I would be really interested in seeing the perf output if
> you can get it.
>
Update: after some testing I can say that dnsdist 1.7.4 on Bookworm has
the same issue as 1.8.1. The cause appears to be the one described here:
https://github.com/openssl/openssl/issues/17064. There is a safe
workaround: lowering setMaxTCPClientThreads(). Watch out for TCP
queueing; use showTCPStats() to check it. Improving TLS performance by
using a STEK file can help as well.
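
A minimal sketch of the workaround as a dnsdist configuration fragment.
The listen address and all file paths are placeholders, not taken from
our setup, and the value 16 simply matches the number of physical cores
on the server described above:

```lua
-- Sketch only: the address and file paths below are hypothetical placeholders.

-- Keep the number of TCP worker threads low (here: one per physical core)
-- instead of large values such as 256 or 1024.
setMaxTCPClientThreads(16)

-- DoT listener that loads its session ticket encryption keys (STEK) from a file.
addTLSLocal('192.0.2.1:853',
            '/etc/dnsdist/tls/cert.pem',
            '/etc/dnsdist/tls/key.pem',
            { ticketKeyFile = '/etc/dnsdist/tls/ticket.key' })
```

After a reload, running showTCPStats() in the dnsdist console shows
whether Queued gets close to Max Queued; if it does, the thread count is
probably too low for the load.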
I'd like to thank Remi for his excellent support.
Ales