[Pdns-users] pdns-recursor flooded with bogus lookups, SERVFAILs ensue

russell nealis codemunkee at gmail.com
Sat Mar 29 19:48:15 UTC 2014


Hi Everyone,

I'm having some trouble with my recursor seemingly getting overloaded and
returning servfail messages for addresses it should otherwise return
successfully.

In a nutshell, I have a DNS recursor that I provide to customers on my
network; it limits queries to only the CIDR ranges I control. However, some of
my customers seem to be inadvertently running open recursors on the
Internet by using Windows AD, and when their machine can't resolve a bogus
name, they forward it along to my recursors.

The names being looked up are complete garbage (e.g., I'm watching tcpdump
right now and see requests for hckcj6aq71ae0f6c.net,
hckq6cj1682cz7hph3b98fypjw3lzoy.com, etc.). Each time one of these gets
looked up, it takes my recursor approximately 2 seconds to figure out that
it's not a real domain. After that it's cached, but since there is so much
junk coming in, my thread count shoots up to 1000 and plateaus there. I see
about 3000 reqs per sec on average in aggregate, but it can shoot up to 6000.
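
As a rough sanity check on those numbers, Little's law (lookups in flight =
arrival rate of uncached lookups x time per lookup) relates the thread
plateau to the rate of junk names. The sketch below is only a
back-of-the-envelope estimate: it assumes the 2-second resolution time and
the 1000-thread plateau are steady-state values, and the function names are
made up for illustration.

```python
def in_flight_lookups(uncached_per_sec, seconds_per_lookup):
    """Little's law: average number of lookups outstanding at once."""
    return uncached_per_sec * seconds_per_lookup


def sustainable_uncached_rate(max_in_flight, seconds_per_lookup):
    """Highest uncached-lookup rate that fits under a concurrency ceiling."""
    return max_in_flight / seconds_per_lookup


# With ~2 s per bogus name and a plateau of 1000 threads:
print(sustainable_uncached_rate(1000, 2.0))  # 500.0 uncached lookups per second
```

Under these assumptions, roughly 500 never-before-seen names per second
would be enough to hold the thread count at the plateau, which is well below
the 3000 reqs per sec aggregate.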

When I use dnstop I can pretty quickly establish what customers are giving
me trouble, and I can even see a pattern of certain bogus domains being hit
more frequently than others. If I put in forward entries on my server for
some of the most frequently hit bogus domains, things do calm down. I
suspect that's because it's no longer reaching out to the TLD DNS servers to
try to look up the bogus entry, and it more or less immediately returns a
SERVFAIL.

I understand the proper approach is to tell the customers to stop allowing
DNS recursion on the public internet, and I'm working on that. However, I
have thousands of customer machines and it's likely that this will crop up
again. So my questions are:

(1) Do you suspect this is a DNS amplification attack where my customers'
machines are getting abused? Or some other kind of attack (e.g., DNS cache
poisoning)?
(2) I've considered using iptables to slow down the query rate allowed by
the customers, but the documentation says I should be wary of using
iptables since the volume of traffic could quickly overwhelm it. I noticed
there is a throttle mechanism mentioned in the documentation, but I can't
determine whether that's something I can configure or if it's just built-in
logic.
(3) In general, what would you recommend to be proactive with something
like this? I'm thinking about writing some code to run dnstop and look for
customers that seem to be misconfigured, and then put in ACLs on my network
appliances to block their traffic to my recursors until they remedy their
machines; however, this seems heavy-handed.
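
To make the idea in (3) concrete, here is a minimal Python sketch of the
detection step only. It assumes the queries have already been extracted
(from tcpdump, dnstop, or similar) into (client_ip, qname) pairs; the
function name, the thresholds, and the "mostly unique names" heuristic are
all hypothetical choices for illustration, not anything dnstop or
pdns-recursor provides.

```python
from collections import Counter, defaultdict


def flag_noisy_clients(queries, min_queries=100, min_unique_ratio=0.8):
    """Return (client_ip, total, unique_ratio) for clients that send many
    queries of which most are for distinct names, busiest first.

    queries: iterable of (client_ip, qname) pairs.
    """
    totals = Counter()
    unique_names = defaultdict(set)
    for client_ip, qname in queries:
        totals[client_ip] += 1
        # Normalise case and the trailing root dot before comparing names.
        unique_names[client_ip].add(qname.lower().rstrip("."))

    flagged = []
    for client_ip, total in totals.items():
        ratio = len(unique_names[client_ip]) / total
        if total >= min_queries and ratio >= min_unique_ratio:
            flagged.append((client_ip, total, ratio))
    return sorted(flagged, key=lambda row: row[1], reverse=True)


# Example with made-up addresses from the documentation ranges:
sample = [("192.0.2.10", f"junk{i}.com") for i in range(200)]
sample += [("192.0.2.20", "www.example.com")] * 200
print(flag_noisy_clients(sample))
```

A client that repeats a small set of real names has a low unique ratio and
is not flagged, while one relaying a stream of random names is. The output
would still need a human or a policy decision before any ACL is applied.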

Thanks for your help, I appreciate it!

-Russ