[dnsdist] dnsdist performance

Tue Jun 19 20:42:42 UTC 2018

Just a follow-up on this.

We upgraded to 1.3 and everything is fine.
We redid our tests, with the following results:
# procs   # listeners   kq/s   % no error
1         6             340    94%
1         9             380    98%
1         15            400    100%
1         15            500    99%
1         25            600    98%
3         6             500    99%
Above 550 Kqps we were at over 85% of interface usage and there were many
network errors.
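
For readers reproducing the table, here is a minimal sketch of what a
multi-listener setup can look like in a dnsdist configuration. It assumes the
listeners are created with repeated addLocal() calls sharing one port through
the reusePort option; the address and the listener count are placeholders, not
the values used in these tests.

```lua
-- Sketch only: several listeners on the same address and port.
-- "192.0.2.1:53" and the count of 6 are illustrative placeholders.
local listeners = 6
for _ = 1, listeners do
  -- reusePort = true asks the kernel to spread incoming packets
  -- across the sockets bound to this address (SO_REUSEPORT).
  addLocal("192.0.2.1:53", {reusePort = true})
end
```

Each addLocal() call adds one more listening socket, so the loop above
corresponds to a "6 listeners" row for a single process.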

One important thing we noticed is that in our production configuration we
have ~60 drop rules :
addAction('example.com', DropAction())

With these rules in place, performance dropped by almost 30%.
We are now moving those rules to the backend resolvers, because it doesn't
make sense to take this performance hit when almost all our queries are
answered from the cache (98% hit rate).
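
A possible alternative, not tested in this thread, is to collapse the
individual rules into a single rule backed by one suffix-match set, so that
each query is checked against one rule instead of ~60. The sketch below uses
hypothetical domain names, since the real list is not given here, and whether
it recovers the lost performance is an assumption that would need measuring.

```lua
-- Sketch only: one drop rule covering many names.
-- The names below are placeholders, not the production list.
local dropped = newSuffixMatchNode()
for _, name in ipairs({"example.com.", "example.net.", "example.org."}) do
  dropped:add(newDNSName(name))
end

-- A single rule: drop any query whose name falls under one of the suffixes.
addAction(SuffixMatchNodeRule(dropped), DropAction())
```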

Thanks again for this great software!

Nico

On Thu, Apr 6, 2017 at 6:33 PM Nico <nicomail at gmail.com> wrote:

> Hi Remi,
> Yes, the new version was almost 30% better in the full config test. Great!
>
> > So quite a noticeable gain but it looks like lock contention is still an
> > issue. I would like to understand why, if you don't mind answering a few
> > questions.
> >
> > - You mentioned having 32 cores, are they real cores or is it with
> > hyper-threading? Intel reports [1] only 8 real cores for the E5-2660, so
> > you should probably stick with at most 8 total threads per CPU
> > (listeners mostly in your case).
> You are right, this is with HT.
> CPU(s):                32
> Thread(s) per core:    2
> Core(s) per socket:    8
> Socket(s):             2
> NUMA node(s):          2
> Model name:            Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
> CPU MHz:               2194.783
> L1d cache:             32K
> L1i cache:             32K
> L2 cache:              256K
> L3 cache:              20480K
>
> Regarding the number of listeners, we ran the tests with different numbers
> of listeners: 1, 2, 4, 8 and 12.
> 1 listener was the worst, at 120 Kqps;
> the other configs were more or less the same, oscillating between 165 and
> 175 Kqps, with the 2- and 8-listener configs being the most stable.
>
>
> > - I'd be interested in the results of the dumpStats() and
> > cache:printStats() commands during your test, as well as a perf top,
> > ideally with a vanilla dnsdist and a dnsdist-concur.
> See attached file
>
> > - The cache cleaning algo might be a bit aggressive by default, you can
> > tweak it with:
> > setCacheCleaningDelay(30)
> > setCacheCleaningPercentage(20)
> Done, but no impact. (because of our test set)
>
> > - Exporting carbon data to our public metronome instance would be great
> > too, as it would immediately make a lot of metrics available to us. You
> > can do that with: carbonServer('37.252.122.50', '<yourname>', 30)
> Unfortunately we can't do that. This is on a closed network.
> We have our own carbon-graphite and check the stats there.
> We can send you any additional info you would like to get.
>
> > - Tuning the network buffer might also help:
> > net.core.rmem_max=33554432
> > net.core.wmem_max=33554432
> > net.core.rmem_default=16777216
> > net.core.wmem_default=16777216
> Already done with very similar values.
> Also tried kernel.sched_migration_cost_ns, but with no visible impact.
>
>
> > - Would you consider upgrading your kernel? There has been a lot of
> > improvements since 3.10.0, and we noticed huge performance increases in
> > the past just by upgrading to a 4.x one.
> I would like to do that, but we are required to use redhat....
> We've done some tests on a small Core 2 machine with 4 cores running 4.9,
> and we obtained almost the same results as on the "big one".
> This was a surprise.
> Trying to find a way (if security approves) to update redhat kernel.
>
>
> > Oh and if you didn't already, would you mind setting
> > setMaxUDPOutstanding() to 65535? Even at a 99% cache hit ratio, that
> > leaves quite a few requests going to the backend so we better be sure we
> > don't mess up these. The cache in dnsdist tries very hard not to degrade
> > performance, so we prefer skipping the cache and passing the query to a
> > backend rather than waiting for a cache lock, for example.
> Already done, also no difference.
> The queries we are sending are ~50, continuously repeating.
>
> Will keep testing. But I think this is all we can get for now.
> The optimum config now seems to be 3 processes with 6 or 8 listeners each.
> Will have to do some workarounds on the stats (aggregation rules on
> graphite?) and
> service control scripts.
>
> Thanks again!
>
>