[dnsdist] dnsdist tuning for high qps on nxdomain ddos

Jasper Aikema jasper.aikema+dnsdist at gmail.com
Fri May 3 20:20:50 UTC 2024


Hi all,

We have been using dnsdist for a couple of years and it works pretty well for
us.

In our situation dnsdist is installed in front of a powerdns server which
hosts our domain names (250k) as an authoritative dns server.

dnsdist is used to block unwanted requests (e.g. when a domain is hit by a
ddos of nxdomain requests, it automatically creates a whitelist of the
names that exist and blocks all other traffic for that domain).

To make things even better (and handle more requests) we are reviewing
the setup and trying to make it perform better. We have switched to a bind
backend for the domains which are under ddos. As far as we have seen, it
is better at responding to nxdomain requests because it does not need a
database query.

Currently we are stuck at a maximum of roughly 200k qps for nxdomain requests
and want to be able to serve roughly 300k qps per server.

We have done the following:
- added multiple (6x the number of cores) addLocal listeners for IPv4 and
IPv6, with the options reusePort=true and tcpFastOpenQueueSize=100
- added multiple (2x the number of cores) newServer entries for the backend,
with the options tcpFastOpen=true and sockets=(2x the number of cores)
- set setMaxTCPClientThreads(1000)
- stopped using connection tracking in the firewall
- run dnsdist + powerdns on a single machine (8 cores and 16GB ram) and a
bind backend on two separate servers (also 8 cores and 16GB ram)
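
The listener and backend settings in the list above could be written in the
dnsdist Lua configuration roughly as follows. This is a minimal sketch under
assumptions, not the configuration actually in use: the core count of 8 comes
from the hardware described above, "6x the number of cores" is read as per
address family, and the listen addresses, the backend addresses (192.0.2.10
and 192.0.2.11, taken from the documentation range) and the alternation of
newServer entries over the two bind servers are hypothetical.

```lua
-- Sketch of a dnsdist configuration fragment; all addresses are placeholders.
local cores = 8                    -- from the hardware described above
local listeners = 6 * cores        -- "6x the number of cores" (assumed per family)
local backendEntries = 2 * cores   -- "2x the number of cores"

-- Every addLocal() call adds another listening socket on the same address;
-- reusePort=true (SO_REUSEPORT) is what allows the repeated bind.
for i = 1, listeners do
  addLocal('0.0.0.0:53', { reusePort = true, tcpFastOpenQueueSize = 100 })
  addLocal('[::]:53',    { reusePort = true, tcpFastOpenQueueSize = 100 })
end

-- Hypothetical addresses of the two bind servers behind the "abuse" pool.
local bindBackends = { '192.0.2.10:53', '192.0.2.11:53' }

-- Register the backend entries, alternating between the two bind servers.
for i = 1, backendEntries do
  newServer({
    address = bindBackends[(i - 1) % #bindBackends + 1],
    pool = 'abuse',
    tcpFastOpen = true,
    sockets = 2 * cores,
  })
end

setMaxTCPClientThreads(1000)
```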

We also use the defaults, like caching requests (which doesn't work for
nxdomain) and limiting the qps per ip (which also doesn't work for an
nxdomain attack, because the attackers use public resolvers).

During an attack we have the following dnsdist rules (the abuse pool is
connected to the bind backend and the all pool is the powerdns backend):
# dnsdist --client -e 'showRules()';
#   Name     Matches Rule                                                          Action
0              29558 qname in <attacked domain>.                                   to pool abuse
1               1346 (opcode==4) || (opcode==5) || (qtype==AXFR) || (qtype==IXFR)  set rcode 5
2              11961 IP (/32, /64) match for QPS over 100 burst 100                delay by 100 msec
3           44931079 All                                                           to pool all
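
For reference, a rule set that prints like the showRules() output above could
be declared along these lines. This is a sketch reconstructed from that
output, not the configuration itself: "attacked.example." is a placeholder
for <attacked domain>, and reading opcode 4 and 5 as NOTIFY and UPDATE and
rcode 5 as REFUSED is based on the standard DNS code points.

```lua
-- Rule 0: queries under the attacked domain go to the bind-backed pool.
-- 'attacked.example.' is a placeholder for <attacked domain>.
addAction('attacked.example.', PoolAction('abuse'))

-- Rule 1: answer NOTIFY (opcode 4), UPDATE (opcode 5), AXFR and IXFR
-- with rcode 5 (REFUSED).
addAction(
  OrRule({
    OpcodeRule(DNSOpcode.Notify),
    OpcodeRule(DNSOpcode.Update),
    QTypeRule(DNSQType.AXFR),
    QTypeRule(DNSQType.IXFR),
  }),
  RCodeAction(DNSRCode.REFUSED)
)

-- Rule 2: clients (grouped per /32 for IPv4 and /64 for IPv6) that exceed
-- 100 qps with a burst of 100 get their responses delayed by 100 msec.
addAction(MaxQPSIPRule(100, 32, 64, 100), DelayAction(100))

-- Rule 3: everything else goes to the powerdns-backed pool.
addAction(AllRule(), PoolAction('all'))
```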

When we simulate an nxdomain attack (with 200k qps and 500MBit of traffic),
we get a high load on the dnsdist server (50% CPU for dnsdist and a lot of
interrupts and context switches).

The network connection is not the bottleneck: it is able to do 10G of
traffic. If we only use bind as a backend (so skip dnsdist), it is also able
to serve those numbers.

So my questions to you are:
- how many qps are you able to push through dnsdist using a powerdns or
bind backend?
- have I overlooked some tuning parameters, e.g. more kernel parameters or
some dnsdist parameters?
- how can we get more insight into what dnsdist is doing, and which metrics
are most useful to us? I have seen the metrics (
https://dnsdist.org/statistics.html) and am keeping an eye on them.
- what is the best method of sending packets for a domain to a separate
backend? Right now we use 'addAction("<domain>", PoolAction("abuse"))',
but is this the least CPU intensive one? Are there better methods?
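
To make the routing rule in the last question concrete, the same kind of match
can be written with an explicit suffix-match set, sketched below. The
"qname in ..." text in the showRules() output suggests that a plain string
passed to addAction() is already handled as a suffix match, so this is the
same rule type spelled out, not a known faster alternative. The domain names
are placeholders, and whether one shared set measurably lowers CPU usage
compared to one rule per domain is untested here.

```lua
-- One suffix-match set holding every domain that is currently under attack.
-- Both names are placeholders.
local abuseDomains = newSuffixMatchNode()
abuseDomains:add(newDNSName('attacked.example.'))
abuseDomains:add(newDNSName('other-target.example.'))

-- A single rule sends all names under those domains to the "abuse" pool,
-- so the number of rules stays constant as domains are added to the set.
addAction(SuffixMatchNodeRule(abuseDomains), PoolAction('abuse'))
```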

I have seen eBPF socket filtering, but as far as I have seen that is for
dropping unwanted packets.

Please let me know if you have more questions, happy to answer them.

Kind regards,

Jasper Aikema