<div dir="ltr"><div dir="ltr"><div dir="ltr"><span class="gmail-im"><span><div>> 200k QPS is fairly low based on what you describe. Would you mind <br>


> sharing the whole configuration (redacting passwords and keys, of <br>


> course), and telling us a bit more about the hardware dnsdist is running on?</div><div><br></div></span><div>The


 server is a virtual machine (Ubuntu 22.04) on our VMware platform with 16 GB of memory and 8 cores (Intel Xeon 4214R @ 2.4 GHz). I have pasted the


 new config at the bottom of this message.<br></div></span><span><span class="gmail-im"><div></div><div><br></div><div>> 6 times the amount of cores is probably not a good idea. I usually <br>


> advise to make it so that the number of threads is roughly equivalent to <br>


> the number of cores that are dedicated to dnsdist, so in your case the <br>


> number of addLocal + the number of newServer + the number of TCP workers <br>


> should ideally match the number of cores you have. If you need to <br>


> overcommit the cores a bit that's fine, but keep it to something like <br>


> twice the number of cores you have, not 10 times.</div><div><br></div><div>> I'm pretty sure this does not make sense, I would first go with the <br>


> default until you see TCP/DoT connections are not processed correctly.</div><div><br></div></span></span><span class="gmail-im"><div>I


 did overcommit / try to tune, because I was getting a high number of 


udp-in-errors and also a high number of Drops in showServers().</div><div>If those issues are gone, I agree there should be no reason to overcommit.</div></span></div><div dir="ltr"><span><div><br></div><span class="gmail-im"><div>> When you say it doesn't work for NXDomain, I'm assuming you mean it <br>


> doesn't solve the problem of random sub-domains attacks, not that a <br>


> NXDomain is not properly cached/accounted?</div><div><br></div></span></span><span class="gmail-im"><div>Yes.


 That is indeed what I meant: the responses are being cached, but that is exactly why NXDOMAIN attacks work. The attackers request a large number of random sub-domains, so every query is a cache miss and caching does not make the server more responsive.<br></div><span><div><br></div><div>> I expect lowering the number of threads will reduce the context switches <br>


> a lot. If you are still not getting good QPS numbers, I would suggest <br>


> checking if disabling the rules help, to figure out the bottleneck. You <br>


> might also want to take a look with "perf top -p <pid of dnsdist>" <br>


> during the high load to see where the CPU time is spent.</div><div><br></div></span><div>I have updated the config and lowered the threads. But now I get a high number of udp-in-errors. The perf top command gives:</div><div><br></div><div>Samples:


 80K of event 'cpu-clock:pppH', 4000 Hz, Event count (approx.): 15028605853 lost: 0/0 drop: 0/0<br>
Overhead  Shared Object    Symbol<br>
   4.78%  [kernel]         [k] __lock_text_start<br>
   2.29%  [kernel]         [k] copy_user_generic_unrolled<br>
   2.29%  [kernel]         [k] copy_from_kernel_nofault<br>
   1.86%  [nf_conntrack]   [k] __nf_conntrack_find_get<br>
   1.81%  [kernel]         [k] __fget_files<br>
   1.42%  [kernel]         [k] _raw_spin_lock<br>
   1.39%  [vmxnet3]        [k] vmxnet3_poll_rx_only<br>
   1.34%  [kernel]         [k] finish_task_switch.isra.0<br>
   1.32%  [nf_tables]      [k] nft_do_chain<br>
   1.23%  libc.so.6        [.] cfree<br>
   1.08%  [kernel]         [k] __siphash_unaligned<br>
   1.07%  [kernel]         [k] syscall_enter_from_user_mode<br>
   1.05%  [kernel]         [k] memcg_slab_free_hook


  <br>   1.00%  [kernel]                                        [k] memset_orig                  <br></div><div><br></div><div>We have the following configuration:</div><div><br></div><div>setACL({'<a href="http://0.0.0.0/0" target="_blank">0.0.0.0/0</a>', '::/0'})<br>controlSocket("<a href="http://127.0.0.1:5900" target="_blank">127.0.0.1:5900</a>")<br>setKey("<pwd>")<br>webserver("<a href="http://127.0.0.1:8083" target="_blank">127.0.0.1:8083</a>")<br>setWebserverConfig({password=hashPassword("<pwd>")})<br>addLocal("<own IPv4>:53",{reusePort=true,tcpFastOpenQueueSize=100})<br>addLocal("<own IPv4>:53",{reusePort=true,tcpFastOpenQueueSize=100})<br>newServer({address="<a href="http://127.0.0.1:54" target="_blank">127.0.0.1:54</a>", pool="all"})<br>newServer({address="<a href="http://127.0.0.1:54" target="_blank">127.0.0.1:54</a>", pool="all"})<br>newServer({address="<bind server 1>:53", pool="abuse", tcpFastOpen=true, maxCheckFailures=5, sockets=16})<br>newServer({address="<bind server 2>:53", pool="abuse", tcpFastOpen=true, maxCheckFailures=5, sockets=16})<br>addAction(OrRule({OpcodeRule(DNSOpcode.Notify),


 OpcodeRule(DNSOpcode.Update), QTypeRule(DNSQType.AXFR), 


QTypeRule(DNSQType.IXFR)}), RCodeAction(DNSRCode.REFUSED))<br>addAction(AllRule(), PoolAction("all"))</div><div><br></div><div>We have removed the caching and the per-IP QPS limiter, because we are running the test attack from 4 servers.</div><div><br></div><div>Thanks in advance for any help you can give me.</div></span></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 6 May 2024 at 10:41, Remi Gacogne via dnsdist <<a href="mailto:dnsdist@mailman.powerdns.com">dnsdist@mailman.powerdns.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi!<br>


<br>


On 03/05/2024 22:20, Jasper Aikema via dnsdist wrote:<br>


> Currently we are stuck at a max of +/- 200k qps for nxdomain requests <br>


> and want to be able to serve +/- 300k qps per server.<br>


<br>


200k QPS is fairly low based on what you describe. Would you mind <br>


sharing the whole configuration (redacting passwords and keys, of <br>


course), and telling us a bit more about the hardware dnsdist is running on?<br>


<br>


> We have done the following:<br>


> - added multiple (6x the number of cores) addLocal listeners for IPv4 <br>
> and IPv6, with the options reusePort=true and tcpFastOpenQueueSize=100<br>
> - added multiple (2x the number of cores) newServer entries for the backend, with <br>
> the options tcpFastOpen=true and sockets=(2x the number of cores)<br>


<br>


6 times the amount of cores is probably not a good idea. I usually <br>


advise to make it so that the number of threads is roughly equivalent to <br>


the number of cores that are dedicated to dnsdist, so in your case the <br>


number of addLocal + the number of newServer + the number of TCP workers <br>


should ideally match the number of cores you have. If you need to <br>


overcommit the cores a bit that's fine, but keep it to something like <br>


twice the number of cores you have, not 10 times.<br>


<br>


> - setMaxTCPClientThreads(1000)<br>


I'm pretty sure this does not make sense, I would first go with the <br>


default until you see TCP/DoT connections are not processed correctly.<br>


<br>


> And the defaults like caching requests (which doesn't work for nxdomain) <br>


> and limit the amount of qps per ip (which also doesn't work for nxdomain <br>


> attack because they use public resolvers).<br>


<br>


When you say it doesn't work for NXDomain, I'm assuming you mean it <br>


doesn't solve the problem of random sub-domains attacks, not that a <br>


NXDomain is not properly cached/accounted?<br>


> When we simulate a nxdomain attack (with 200k qps and 500MBit of <br>


> traffic), we get a high load on the dnsdist server (50% CPU for dnsdist <br>


> and a lot of interrupts and context switches).<br>


<br>


I expect lowering the number of threads will reduce the context switches <br>


a lot. If you are still not getting good QPS numbers, I would suggest <br>


checking if disabling the rules help, to figure out the bottleneck. You <br>


might also want to take a look with "perf top -p <pid of dnsdist>" <br>


during the high load to see where the CPU time is spent.<br>


<br>


> So my questions to you are:<br>


> - how much qps are you able to push through dnsdist using a powerdns or <br>


> bind backend<br>


<br>


It really depends on the hardware you have and the rules you are <br>


enabling, but it's quite common to see people pushing 400k+ QPS on a <br>


single DNSdist without a lot of fine tuning, and a fair amount of <br>


remaining head-room.<br>


<br>


> - have I overlooked some tuning parameters, e.g. more kernel parameters <br>


> or some dnsdist parameters<br>


<br>


I shared a few parameters a while ago: [1].<br>


<br>


> - what is the best method of sending packets for a domain to a separate <br>
> backend? Right now we use 'addAction("<domain>", <br>
> PoolAction("abuse"))', but is this the least CPU intensive one? Are there <br>


> better methods?<br>


<br>


It's the best method and should be really cheap.<br>


<br>


> I have seen eBPF socket filtering, but as far as I have seen that is <br>
> for dropping unwanted packets.<br>


<br>


Correct. You could look into enabling AF_XDP / XSK [2] but I would <br>


recommend checking that you really cannot get the performance you want <br>


with normal processing first, as AF_XDP has some rough edges.<br>


<br>


[1]: <a href="https://mailman.powerdns.com/pipermail/dnsdist/2023-January/001271.html" rel="noreferrer" target="_blank">https://mailman.powerdns.com/pipermail/dnsdist/2023-January/001271.html</a><br>


[2]: <a href="https://dnsdist.org/advanced/xsk.html" rel="noreferrer" target="_blank">https://dnsdist.org/advanced/xsk.html</a><br>


<br>


Best regards,<br>


-- <br>


Remi Gacogne<br>


PowerDNS B.V<br>




</blockquote></div>
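<div><br></div><div>For reference, the thread-budget advice quoted above (number of addLocal + number of newServer + number of TCP workers roughly equal to the number of cores) could be sketched for an 8-core machine as follows. This is only an illustration: the 4 + 2 + 2 split and the 192.0.2.1 listen address are assumptions, not tested or recommended values.</div>

```lua
-- Sketch only: thread budget for an 8-core VM.
-- Assumed split: 4 listeners + 2 backend entries + 2 TCP workers = 8 threads.
-- 192.0.2.1 is a placeholder (documentation) address, not a real listener.

-- 4 listener threads sharing port 53 via SO_REUSEPORT
for _ = 1, 4 do
  addLocal("192.0.2.1:53", {reusePort=true})
end

-- 2 entries for the same backend, each with its own responder thread
for _ = 1, 2 do
  newServer({address="127.0.0.1:54", pool="all"})
end

-- cap the number of TCP worker threads
setMaxTCPClientThreads(2)

-- send everything to the "all" pool, as in the configuration above
addAction(AllRule(), PoolAction("all"))
```

<div>If udp-in-errors or Drops in showServers() reappear with a split like this, the counts can be raised gradually while staying below roughly twice the number of cores, as suggested in the quoted reply.</div>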