[Pdns-users] UDP Connection Table Exhaustion?

Stefan Schmidt stefan.schmidt at freenet.ag
Mon Jul 6 09:28:15 UTC 2009


On Thu, Jul 02, 2009 at 06:04:03PM +0200, Sten Spans wrote:
>> Hey there,
>>
>> Does anyone have any tips and/or tricks for running a medium-scale DNS  
>> recursive resolver appropriate to my situation? Medium being bigger 
>> than "run it off a DSL router" but smaller than "get a server farm to 
>> do it"!

0- do _not_ run it on a DSL router ;)
10- get a server farm to do it. No, really: if you get a chance to run a
redundant recursive DNS setup, do it. So many services depend on this
core service today that the redundancy should be well worth it.
Also note that most stub resolvers (like libc's) take their time
when failing over to the next nameserver entry in resolv.conf; some even
stall for 30s by default.

> 1- make sure you configure a limit for max-cache-entries
>    otherwise it will keep growing and consume all memory.
>    100k - 500k should cover most regular servers.

If you're already using ~1GB, that's probably around 2.5 million entries.
1 million gives me a 78% hit rate, but this really depends on your
customer base.
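
As a rough sizing aid, the ratio above (about 1GB for about 2.5 million
entries) works out to roughly 400 bytes per cache entry. The sketch below
turns a memory budget into a candidate max-cache-entries value under that
assumption. The per-entry figure is only derived from the numbers in this
mail, not from the recursor documentation, and the function name is made up
for illustration.

```python
# Assumption: ~1 GB holds ~2.5 million entries, as estimated above,
# i.e. roughly 400 bytes per cache entry. Measure your own ratio.
BYTES_PER_ENTRY = 10**9 // 2_500_000


def max_cache_entries_for(budget_bytes, bytes_per_entry=BYTES_PER_ENTRY):
    """Return how many cache entries fit into the given memory budget."""
    if budget_bytes < 0 or bytes_per_entry <= 0:
        raise ValueError("budget must be >= 0 and entry size must be > 0")
    return budget_bytes // bytes_per_entry


if __name__ == "__main__":
    # Example: a 400 MB budget for the cache.
    print(max_cache_entries_for(400 * 10**6))
```

Treat the result as a starting point and watch the real memory use and hit
rate afterwards.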

> 2- don't load any iptables modules if at all possible,
>    the state tracking causes serious (performance) problems
>    on loaded servers.

I do run connection tracking, and I am running 4 recursor processes [1] on
a dual dual-core Opteron at more than 1000 qps each, without problems,
with these settings:
net.ipv4.netfilter.ip_conntrack_udp_timeout=60
net.ipv4.netfilter.ip_conntrack_udp_timeout_stream=120
net.ipv4.netfilter.ip_conntrack_max=1000000
kernel.printk=4 4 1 7
net.core.netdev_max_backlog=3000
vm.min_free_kbytes=8118
net.core.rmem_max=8388608
net.core.wmem_max=8388608
net.core.rmem_default=2097152
net.core.wmem_default=2097152
net.ipv4.tcp_syncookies=0

The default for ip_conntrack_max is simply not made for nameservers, so
it needs to be tuned.

[1] This is a load balancing setup ofc. I'm using linux IPVS+keepalived.
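
To get a feeling for how large the conntrack table has to be, one can
estimate the number of tracked flows as arrival rate times entry lifetime
(Little's law). The sketch below assumes the worst case, where every query
creates a new entry that lives for the full UDP timeout. The function name
and the headroom factor are illustrative, not something from the kernel or
from pdns.

```python
def conntrack_entries_needed(flows_per_second, udp_timeout_s, headroom=2.0):
    """Estimate steady-state conntrack entries as rate * lifetime * headroom.

    Assumes every query opens a new tracked flow that stays in the table
    for the whole UDP timeout (a pessimistic simplification).
    """
    if flows_per_second < 0 or udp_timeout_s < 0 or headroom < 1:
        raise ValueError("rate and timeout must be >= 0, headroom >= 1")
    return int(flows_per_second * udp_timeout_s * headroom)


if __name__ == "__main__":
    # 4 recursor processes at ~1000 qps each, 60 s UDP timeout (from above).
    print(conntrack_entries_needed(4 * 1000, 60, headroom=1.0))
    # Same load with the default 2x headroom.
    print(conntrack_entries_needed(4 * 1000, 60))
```

This only counts client-side flows. Outgoing queries to authoritative
servers on cache misses add entries too, so the real number is higher.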

> 3- make sure to explicitly configure incoming and outgoing ips,
>    preferably different ones. This enables later load-balancing / anycast
>    schemes and makes the traffic-flow predictable. If your network
>    layout is a bit more complicated (bgp, multiple gateways) then the
>    Linux arp_announce and arp_filter sysctls should be tweaked to
>    make sure that Linux selects the correct source-ips.

Yep, totally worth it.

> 4- use allow-from-file to configure the ranges that should be allowed
>    to use your nameservers.

That is _if_ you really have that many distinct IP ranges your clients
come from; I still use the 'regular' allow-from statement.
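
For illustration, this is roughly the decision such an access list makes
for each client address. It is a sketch using Python's ipaddress module,
not how the recursor implements it, and the ranges are documentation
prefixes picked as placeholders.

```python
import ipaddress

# Hypothetical client ranges (documentation prefixes), not from this thread.
ALLOW_FROM = ["192.0.2.0/24", "2001:db8::/32"]


def is_allowed(client_ip, ranges=ALLOW_FROM):
    """Return True if client_ip falls inside one of the allowed ranges."""
    addr = ipaddress.ip_address(client_ip)
    for cidr in ranges:
        net = ipaddress.ip_network(cidr)
        # Compare only within the same address family (IPv4 vs IPv6).
        if addr.version == net.version and addr in net:
            return True
    return False


if __name__ == "__main__":
    for ip in ("192.0.2.10", "198.51.100.1", "2001:db8::1"):
        print(ip, is_allowed(ip))
```

With only a handful of ranges, a list like this fits comfortably in a
single allow-from line.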

> This should allow you to handle 5-10k queries on reasonable hardware
> with a decent uplink (100mbit). Anything beyond that will require
> compilation with a recent compiler and system specific tuning,
> binding pdns to a specific cpu and the ethernet driver to another
> for example. This kind of tuning should only be done with careful
> measurements to test the effect of each change.
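
Binding a process to a CPU can be tried out from Python on Linux with
os.sched_setaffinity, as in the sketch below. The helper name is made up
for illustration, and the example pins the calling process rather than a
real pdns_recursor PID. Steering the ethernet driver's interrupts to another
CPU is a separate step that is not covered here.

```python
import os


def pin_to_cpu(pid, cpu):
    """Restrict process `pid` (0 means the calling process) to one CPU.

    Linux-only: os.sched_setaffinity is not available on other platforms.
    Returns the resulting affinity set so the caller can verify it.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # Pin ourselves to the lowest-numbered CPU we are allowed to run on.
    cpu = min(os.sched_getaffinity(0))
    print(pin_to_cpu(0, cpu))
```

As the quoted advice says, measure before and after; pinning can just as
easily make things worse.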

Also read the PERFORMANCE section of the README file from the recursor's
tarball. C++ profiling of the recursor works just fine for me.

	Stefan
-- 
- All right, Gamma shift, time to defend the Federation against gaseous
  anomalies.
Commander Janice Rand, "Flashback.", ST-Voyager 


