[dnsdist] dnsdist 1.7.4 Debian Bullseye vs 1.8.4 Bullseye

Aleš Rygl ales at rygl.net
Mon Oct 9 10:32:41 UTC 2023


Hi
> On 05/10/2023 10:41, Aleš Rygl via dnsdist wrote:
>>      Thanks for your response. After some deep documentation reading 
>> and config tweaking I am nearly on the previous values regarding CPU 
>> load, apart from latency, which is still higher (1.3ms -> 2.3ms). I 
>> suspect a different way the latency is likely computed (I noticed a 
>> new set of latency counters for TLS, TCP, etc.) here.  The key 
>> configuration parameter is setMaxTCPClientThreads(). Changing 
>> anything else (cache shards, number of listeners, etc.) has nearly no 
>> impact. We had 256 with 1.7.4; now it is 16. Going up here means a 
>> rapid increase of CPU load; having less than 16 means dropping TCP 
>> connections in showTCPStats(), where Queued hits Max Queued. Insane 
>> values like 1024 kill the CPU. We have a physical server with 16 
>> phys. cores, OS sees 32 cores.
>
> OK, this is clearly unexpected. I mean, since 1.4.0 you should not be 
> needing more TCP worker threads than the number of cores, since a 
> single worker can handle a lot (easily thousands) of TCP connections, 
> but having a larger value should not kill the CPU so I'm wondering if 
> we are busy-looping somewhere. I have not been able to reproduce that 
> so far, so I would be really interested in seeing the perf output if 
> you can get it.
>
Update: after some testing I can say that dnsdist 1.7.4 on Bookworm has 
the same issue as 1.8.1. The cause appears to be this OpenSSL issue: 
https://github.com/openssl/openssl/issues/17064. There is a safe 
workaround: lowering setMaxTCPClientThreads(). Watch out for TCP 
queueing, though; check showTCPStats() to make sure Queued does not 
reach Max Queued. Improving TLS performance by using a STEK file can 
help as well.
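
For anyone landing here from a search, a minimal sketch of what this 
workaround could look like in a dnsdist configuration. The value 16 is 
the one mentioned earlier in this thread; the listen address, the 
certificate and key paths, the STEK path and the queue size are 
placeholders chosen for illustration, not values from the setup 
discussed here, and would need adapting.

```lua
-- Sketch only: address, file paths and queue size below are placeholders.

-- Keep the number of TCP worker threads low (at or below the core count).
-- 16 is the value reported in this thread for a 16-core machine.
setMaxTCPClientThreads(16)

-- Upper bound on connections waiting for a TCP worker thread.
-- 1000 is an illustrative value, not a recommendation.
setMaxTCPQueuedConnections(1000)

-- DoT listener that loads its session ticket encryption keys (STEK)
-- from a file, so that session tickets can be resumed consistently.
addTLSLocal('192.0.2.1:853',
            '/etc/dnsdist/tls/fullchain.pem',
            '/etc/dnsdist/tls/privkey.pem',
            { ticketKeyFile = '/etc/dnsdist/tls/ticket.key' })
```

After restarting with such a configuration, running showTCPStats() in 
the console under production load should show whether Queued stays 
below Max Queued; if it does not, the thread count is probably too low.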

I'd like to thank Remi for his excellent support.

Ales




