[Pdns-users] Recursor QPS ceiling

Robert Edmonds edmonds at mycre.ws
Mon Dec 22 18:17:53 UTC 2014

Morten Stevens wrote:
> 2014-12-18 0:23 GMT+01:00 Ciro Iriarte <cyruspy at gmail.com>:
> > Hi! Has anybody achieved more than 100k QPS on a recursor while hitting
> > the cache?
> >
> > I have only run two tests so far, and in both cases the ceiling was around
> > 100k QPS. The machine seems able to take more of a beating; maybe I just
> > need more clients, or there's some kind of logic limit.
> Hi,
> Here is my result compared with bind and unbound:
> 1) pdns-recursor 3.6.2: 169k QPS
> 2) unbound 1.5.1: 327k QPS
> 3) bind 9.9.4-P2: 251k QPS
> I am surprised that bind9 is much faster than pdns-recursor in my test...
> Test machine:
> - Fedora 20
> - Linux 3.17.7
> - Intel Xeon E5 2640 v3 (Haswell-EP)
> - 32 GB DDR4 ECC Memory
> Full dnsperf results:
> [snip]


I've occasionally done some casual benchmarking of various DNS servers,
and I've never been too impressed with dnsperf/resperf.  Those tools
always seem to produce results that are much lower than what they
"should" be.  I don't know why that is, as I haven't actually read the
source code, but I suspect either that it throttles itself too
punitively when packets are lost, or that a single box running
dnsperf/resperf can't generate enough query load to saturate the
receiver.

I have not tested PowerDNS specifically, but when I tested BIND and
Unbound recently, I came up with a completely synthetic query generator
setup: the 'trafgen' tool from the netsniff-ng suite always asked the
same question for an already cached RRset, and the offered query rate
was varied precisely on the sending side using the 'tc' facility in the
Linux kernel.  The notes and results from that benchmark
setup start on slide 30 in this slide deck:


(Note that the plots show responses/second rather than queries/second,
but looking at the raw TX/RX data, the actual query loss rate tends to
be very low until reaching the "maximum".  And I plot against CPU
utilization because I'm interested in efficiency.)
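
As a rough illustration of the responses/second versus queries/second
distinction above (a sketch only, not the script behind the slides; the
function names and the counter values in the usage note are hypothetical),
the loss rate can be derived from the sender's TX count and RX count over
one measurement interval:

```python
def loss_rate(queries_sent: int, responses_received: int) -> float:
    """Fraction of offered queries that never got a response."""
    if queries_sent <= 0:
        raise ValueError("queries_sent must be positive")
    # Clamp at zero in case the counters were sampled slightly out of step.
    lost = max(queries_sent - responses_received, 0)
    return lost / queries_sent


def responses_per_second(responses_received: int, interval_s: float) -> float:
    """Response rate over one measurement interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return responses_received / interval_s
```

With made-up counters of 1,000,000 queries sent and 990,000 responses
received over 10 seconds, this gives a 1% loss rate at 99,000
responses/second.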

Unbound performed very well in this benchmark scenario (and from my
understanding of PowerDNS's architecture, I would guess PowerDNS would
perform very competitively, too): it handled >800K QPS before dropping a
significant number of queries.  It probably would have ended up
saturating the network interface if I had had more than four cores in my
test system.  (Note that the query pattern is very unrealistic, though:
identical questions, 100% cache hit rate.)
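
To put the "saturating the network interface" remark in numbers, here is
a back-of-the-envelope sketch.  It assumes a gigabit link, IPv4/UDP, no
EDNS, and a hypothetical fixed question of "example.com A"; none of
those details come from the benchmark itself.  It builds one DNS query
and computes how many such frames per second fit on the wire:

```python
import struct

# Per-frame overheads in bytes (standard Ethernet / IPv4 / UDP values).
ETH_HEADER, FCS = 14, 4
PREAMBLE_SFD, INTERFRAME_GAP = 8, 12
IPV4_HEADER, UDP_HEADER = 20, 8
MIN_FRAME = 64  # minimum Ethernet frame size, FCS included


def build_query(name: str, qtype: int = 1, qid: int = 0x1234) -> bytes:
    """Build a minimal DNS query (one question, RD set, no EDNS)."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels)
    qname += b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS IN


def max_qps(dns_len: int, link_bps: int = 1_000_000_000) -> float:
    """Upper bound on queries/second the link can carry in one direction."""
    frame = max(ETH_HEADER + IPV4_HEADER + UDP_HEADER + dns_len + FCS,
                MIN_FRAME)
    wire_bytes = frame + PREAMBLE_SFD + INTERFRAME_GAP
    return link_bps / (wire_bytes * 8)


query = build_query("example.com")  # hypothetical question name
print(len(query), int(max_qps(len(query))))
```

Under those assumptions the query is 29 bytes and the link tops out at
roughly 1.3 million queries/second.  Responses are larger than queries,
so the return direction would reach its ceiling first.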

Also, when quoting a test system's hardware details, I would recommend
specifying the exact chipset used in its network interface.  I found
significant performance differences between three common Intel gigabit
NIC chipsets (I210, I217, and I350).  The lower-end chipsets introduced
premature bottlenecks, while the I350 was able to keep up with the rest
of the system and allowed the receiver to come close to saturating the
CPUs in the system.

Robert Edmonds
