[Pdns-users] test driving LMDB backend
Bart Mortelmans
powerdns at bart.bim.be
Mon May 6 08:02:24 UTC 2019
Hi,
I've been test-driving the new PowerDNS LMDB backend. Even though my
tests are very basic, I thought some of you might be interested in my
findings.
TL;DR: It's easy to set up (at least as a slave). In my basic set-up, it
could handle about 7 times the load the MySQL back-end could handle. And
it starts incredibly fast.
I used a cheap 1-CPU, 1 GB RAM VPS for my tests.
For this I compiled the latest source available from GitHub. If anybody
is interested in instructions on how to get this compiled on CentOS 7,
let me know.
To enable the LMDB backend, I simply put this in pdns.conf:
launch=lmdb
lmdb-filename=/var/pdns/pdns2.lmdb
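Since I was running it purely as a slave, the usual secondary settings go
next to that. Something like the following (the address is only a
placeholder, use your own master):

slave=yes
# placeholder: restrict NOTIFYs to your own master(s)
allow-notify-from=192.0.2.1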
After starting, PDNS uses about 60 MB of memory. As expected, this
stays the same as you load zones into the database; only the disk
cache (which the system takes care of) grows.
Once you actually start asking questions, PDNS memory usage does grow. I
guess at some point, the query cache and packet cache might end up
cannibalizing the disk cache.
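(If that ever becomes an issue, both caches can be capped in pdns.conf.
The values below are just example numbers, not a recommendation:)

max-cache-entries=500000
max-packet-cache-entries=500000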
I loaded it with about 25000 slave zones, mostly small zones, which I
guess is typical for shared DNS hosting (since that's where they come
from). The folder in which the LMDB database is kept only grew to 67 MB.
If you restart PDNS, it will be responding to requests again in less
than 1 second.
I actually found this out because systemd was restarting the service
every couple of minutes. It turns out I should have put "Type=simple" in
the .service file instead of "Type=notify".
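For anybody hitting the same thing, the relevant part of a working unit
file looks roughly like this (the ExecStart path is just an example for a
self-compiled binary; "Type=notify" only works when pdns_server is built
with systemd support):

[Service]
Type=simple
ExecStart=/usr/local/sbin/pdns_server --guardian=no --daemon=no --disable-syslog --write-pid=no
Restart=on-failure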
As a very basic test to make sure the zone transfers went okay, I
compared the output of "dig -t AXFR" from the test server and the master
server for 1000 of the domain names, and all of them were identical.
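Something along these lines does the job (the server names and the
domains.txt list are placeholders; the sort is only there so differences
in record order don't show up as differences):

while read zone; do
  diff <(dig @master.example.net "$zone" -t AXFR +nocmd +nostats | sort) \
       <(dig @test.example.net "$zone" -t AXFR +nocmd +nostats | sort) \
       > /dev/null || echo "AXFR mismatch for $zone"
done < domains.txt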
LMDB is meant to be quick, so I wanted to do some tests on the load it
could handle. I used dnsblast for this and kept an eye on the PowerDNS
Metronome service.
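(In case you want the same graphs: the authoritative server can send its
metrics to a Metronome instance via the carbon settings in pdns.conf; the
address and name below are placeholders:)

carbon-server=192.0.2.20
carbon-ourname=lmdb-test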
While dnsblast was running, I also had a simple script requesting
ever-changing random subdomains from one of the zones. If it didn't receive
the correct answer within 1 second, it would print an error.
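A sketch of what such a script can look like (the server address, zone
and expected answer are placeholders; this assumes a wildcard record so
the correct answer is known in advance):

#!/bin/bash
while true; do
  sub=$RANDOM$RANDOM
  # +time=1 +tries=1: give the server exactly one second to answer
  answer=$(dig @192.0.2.10 +time=1 +tries=1 +short "$sub.example.com" A)
  [ "$answer" = "203.0.113.7" ] || echo "wrong or missing answer for $sub.example.com"
  sleep 1
done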
dnsblast also requests mostly random subdomains (I changed it to request
subdomains from a domain name in my DB). It should be a reasonably good
test of what your DB can handle.
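For completeness, the invocation is along these lines (the address is a
placeholder, 500000 is the number of queries and 7500 the target rate; do
check dnsblast's own usage output, as the argument order may differ
between versions):

./dnsblast 192.0.2.10 500000 7500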
I could start dnsblast with 7500 requests per second before a line
appeared on the Metronome "DB queue" graph. All the requests from my
separate test-script were still being answered. When I went up to 10000
requests per second sent by dnsblast, some of the requests of my
test-script were not answered in time and the DB-queue went up to about
2000.
Once the first requests need to be queued for the DB, you should be near
the maximum of the sustained load you'll be able to handle. In this
case, this seemed to be around 7500 requests per second.
I did the same test with a similar set-up but using the MySQL back-end.
In that case, the DB-queue already had some requests in it at about
1000 requests per second, and from 1500 requests per second onwards not
all the requests from my separate script were being answered. Metronome
also started
showing figures in the "Timedout queries" graph.
In this case the maximum sustained load seemed to be around 1000
unique requests per second.
The numbers are there more for comparison than to show exactly what
this back-end can handle. Again: all this was on very basic machines,
1 CPU and 1 GB RAM. And dnsblast was running on the same machine, using
about 30% of that one CPU...
When at maximum load, it clearly was the CPU that was the bottleneck. So
if you want to be able to handle a bigger load, adding extra CPU power
should be the first priority.
What did confuse me was the reply ratio shown by dnsblast. From about
300 requests per second and up, the reply rate went below 100% and would
often be around 30% or even lower. This seemed to match the numbers
shown by Metronome as "UDP in-error/s". So it looked like most requests
were not being answered, but at the same time my separate script would
still get replies within less than 1 second to every single one of its
requests. Can anybody shed some light on what "UDP in-error/s" means?
This was consistent with both the MySQL back-end and LMDB.
Regards,
Bart Mortelmans