[dnsdist] dnsdist tuning for high qps on nxdomain ddos
Remi Gacogne
remi.gacogne at powerdns.com
Mon May 6 08:41:38 UTC 2024
Hi!
On 03/05/2024 22:20, Jasper Aikema via dnsdist wrote:
> Currently we are stuck at a max of +/- 200k qps for nxdomain requests
> and want to be able to serve +/- 300k qps per server.
200k QPS is fairly low based on what you describe. Would you mind
sharing the whole configuration (redacting passwords and keys, of
course), and telling us a bit more about the hardware dnsdist is running on?
> We have done the following:
> - added multiple (6x the amount of cores) addLocal listeners for IPv4
> and IPv6, with the options reusePort=true and tcpFastOpenQueueSize=100
> - add multiple (2x the amount of cores) newServer to the backend, with
> the options tcpFastOpen=true and sockets=(2x the amount of cores)
6 times the amount of cores is probably not a good idea. I usually
advise to make it so that the number of threads is roughly equivalent to
the number of cores that are dedicated to dnsdist, so in your case the
number of addLocal + the number of newServer + the number of TCP workers
should ideally match the number of cores you have. If you need to
overcommit the cores a bit that's fine, but keep it to something like
twice the number of cores you have, not 10 times.
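
As a rough illustration of that thread budget, here is a minimal
configuration sketch. It assumes a machine with 8 cores dedicated to
dnsdist; the core count, the addresses and the split between listeners
and backends are made-up placeholders, not values from this thread.

```lua
-- Sketch only: assumes 8 cores dedicated to dnsdist.
-- All addresses below are documentation placeholders.

-- 4 listeners on the same address, sharing the port via SO_REUSEPORT
for i = 1, 4 do
  addLocal("192.0.2.1:53", {reusePort=true})
end

-- 2 backends
newServer({address="192.0.2.10:53"})
newServer({address="192.0.2.11:53"})

-- 4 listeners + 2 backends = 6, which leaves some room for the TCP
-- workers, left at their default here.
```

The exact split is a judgment call. The point is only that the total
stays close to the number of cores instead of being a multiple of it.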
> - setMaxTCPClientThreads(1000)
I'm pretty sure this does not make sense. I would first go with the
default, and only raise it if you see that TCP/DoT connections are not
processed correctly.
> And the defaults like caching requests (which doesn't work for nxdomain)
> and limit the amount of qps per ip (which also doesn't work for nxdomain
> attack because they use public resolvers).
When you say it doesn't work for NXDomain, I'm assuming you mean it
doesn't solve the problem of random sub-domain attacks, not that
NXDomain responses are not properly cached/accounted for?
> When we simulate a nxdomain attack (with 200k qps and 500MBit of
> traffic), we get a high load on the dnsdist server (50% CPU for dnsdist
> and a lot of interrupts and context switches).
I expect lowering the number of threads will reduce the context switches
a lot. If you are still not getting good QPS numbers, I would suggest
checking whether disabling the rules helps, to figure out the bottleneck.
You might also want to take a look with "perf top -p <pid of dnsdist>"
during the high load to see where the CPU time is spent.
> So my questions to you are:
> - how much qps are you able to push through dnsdist using a powerdns or
> bind backend
It really depends on the hardware you have and the rules you are
enabling, but it's quite common to see people pushing 400k+ QPS on a
single DNSdist without a lot of fine tuning, and a fair amount of
remaining head-room.
> - have I overlooked some tuning parameters, e.g. more kernel parameters
> or some dnsdist parameters
I shared a few parameters a while ago: [1].
> - what is the best method of sending packets for a domain to a separate
> backend, right now we use 'addAction("<domain>",
> PoolAction("abuse"))', but is this the least CPU intensive one? Are there
> better methods?
It's the best method and should be really cheap.
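
For reference, a minimal sketch of that setup could look like the
following. The domain name and the backend address are placeholders I
made up for the example; only the "abuse" pool name comes from your
message.

```lua
-- Sketch only: placeholder backend address and domain.

-- A backend that is a member of the "abuse" pool only
newServer({address="192.0.2.20:53", pool="abuse"})

-- Queries for this domain and anything below it go to the "abuse" pool
addAction("attacked.example.", PoolAction("abuse"))
```

Queries that do not match the rule continue through the remaining rules
and end up in the default pool, as usual.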
> I have seen eBPF socket filtering, but as far as I have seen that is
> for dropping unwanted packets.
Correct. You could look into enabling AF_XDP / XSK [2] but I would
recommend checking that you really cannot get the performance you want
with normal processing first, as AF_XDP has some rough edges.
[1]: https://mailman.powerdns.com/pipermail/dnsdist/2023-January/001271.html
[2]: https://dnsdist.org/advanced/xsk.html
Best regards,
--
Remi Gacogne
PowerDNS B.V