[dnsdist] dnsdist tuning for high qps on nxdomain ddos

Remi Gacogne remi.gacogne at powerdns.com
Mon May 6 08:41:38 UTC 2024


Hi!

On 03/05/2024 22:20, Jasper Aikema via dnsdist wrote:
> Currently we are stuck at a max of +/- 200k qps for nxdomain requests 
> and want to be able to serve +/- 300k qps per server.

200k QPS is fairly low based on what you describe. Would you mind 
sharing the whole configuration (redacting passwords and keys, of 
course), and telling us a bit more about the hardware dnsdist is running on?

> We have done the following:
> - added multiple (6x the amount of cores) addLocal listeners for IPv4 
> and IPv6, with the options reusePort=true and tcpFastOpenQueueSize=100
> - add multiple (2x the amount of cores) newServer to the backend, with
> the options tcpFastOpen=true and sockets=(2x the amount of cores)

Six times the number of cores is probably not a good idea. I usually 
advise keeping the number of threads roughly equal to the number of 
cores dedicated to dnsdist, so in your case the number of addLocal + 
the number of newServer + the number of TCP workers should ideally 
match the number of cores you have. If you need to overcommit the 
cores a bit that's fine, but keep it to something like twice the 
number of cores you have, not 10 times.
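
As a rough sketch of that budget, assuming a hypothetical host with 8 
cores dedicated to dnsdist: the addresses below are documentation 
placeholders, the 4 + 2 + 2 split is only an illustration, and the 
assumption is that each addLocal and each newServer accounts for one 
thread.

```lua
-- Hypothetical 8-core host; every address is a placeholder.
-- 4 listeners on the same address, which requires reusePort=true.
for _ = 1, 4 do
  addLocal("192.0.2.1:53", {reusePort=true})
end

-- 2 backends (assumed: one thread each).
newServer({address="192.0.2.10:53"})
newServer({address="192.0.2.11:53"})

-- Cap the TCP workers at 2, for a total of 4 + 2 + 2 = 8 threads.
setMaxTCPClientThreads(2)
```

The right split between listeners, backends and TCP workers depends on 
the traffic mix, so the numbers should be adjusted after measuring.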

> - setMaxTCPClientThreads(1000)

I'm pretty sure this does not make sense. I would first go with the 
default until you see that TCP/DoT connections are not being processed 
correctly.

> And the defaults like caching requests (which doesn't work for nxdomain) 
> and limit the amount of qps per ip (which also doesn't work for nxdomain 
> attack because they use public resolvers).

When you say it doesn't work for NXDomain, I'm assuming you mean it 
doesn't solve the problem of random sub-domain attacks, not that 
NXDomain responses are not properly cached or accounted for?

> When we simulate a nxdomain attack (with 200k qps and 500MBit of 
> traffic), we get a high load on the dnsdist server (50% CPU for dnsdist 
> and a lot of interrupts and context switches).

I expect lowering the number of threads will reduce the context switches 
a lot. If you are still not getting good QPS numbers, I would suggest 
checking if disabling the rules helps, to figure out the bottleneck. You 
might also want to take a look with "perf top -p <pid of dnsdist>" 
during the high load to see where the CPU time is spent.

> So my questions to you are:
> - how much qps are you able to push through dnsdist using a powerdns or 
> bind backend

It really depends on the hardware you have and the rules you are 
enabling, but it's quite common to see people pushing 400k+ QPS on a 
single DNSdist without a lot of fine tuning, and a fair amount of 
remaining head-room.

> - have I overlooked some tuning parameters, e.g. more kernel parameters 
> or some dnsdist parameters

I shared a few parameters a while ago: [1].

> - what is the best method of sending packets for a domain to a separate 
> backend, right now we use 'addAction("<domain>", 
> PoolAction("abuse"))', but is this the least CPU intensive one? Are there 
> better methods?

It's the best method and should be really cheap.
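
For reference, a minimal sketch of that setup. The backend address and 
the domain name are hypothetical placeholders; only the pool name 
"abuse" comes from your message.

```lua
-- Hypothetical backend dedicated to the attacked domain, placed in the "abuse" pool.
newServer({address="192.0.2.20:53", pool="abuse"})

-- Send queries for this placeholder domain and its sub-domains to that pool.
addAction("attacked.example.", PoolAction("abuse"))
```

Queries that do not match the rule keep going to the default pool.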

> I have seen eBPF socket filtering, but as far as I have seen that is 
> for dropping unwanted packets.

Correct. You could look into enabling AF_XDP / XSK [2] but I would 
recommend checking that you really cannot get the performance you want 
with normal processing first, as AF_XDP has some rough edges.

[1]: https://mailman.powerdns.com/pipermail/dnsdist/2023-January/001271.html
[2]: https://dnsdist.org/advanced/xsk.html

Best regards,
-- 
Remi Gacogne
PowerDNS B.V