[dnsdist] Clarification on weight in newServer option

Frank Even lists+powerdns.com at elitists.org
Fri Jul 28 08:30:45 UTC 2017

On Fri, Jul 28, 2017 at 1:02 AM, Remi Gacogne <remi.gacogne at powerdns.com> wrote:
> On 07/27/2017 07:53 PM, Frank Even wrote:
>> So, weight seems to be honored when traffic first arrives.  But if I
>> test by taking down the node with the higher weight, so that traffic
>> shifts to the nodes with lower weight, and then bring the heavier
>> node back into rotation, traffic does not seem to shift back to it.
>> Maybe it eventually does, but in 5 minutes or so of testing it did
>> not.  Even after removing the test traffic from the client and
>> re-applying it a couple of minutes later, traffic still went to the
>> lower-weighted node rather than the heavier one.  I guess the
>> question is: if left for longer, would it eventually start honoring
>> the weight?
> That's not expected, especially since we keep no state to do the
> load-balancing. Which policy are you using, wrandom or whashed?

Whatever the default is.  Is whashed the default?  Here's my config
(minus the ACLs and setKey); I kept it simple for testing:

# dnsdist -V
dnsdist 1.1.0 (Lua 5.1)
Enabled features: dnscrypt libsodium protobuf re2 systemd

-- listen on following address:port
-- backend resolvers
newServer({address="", order=0, weight=1000,
checkType="A", checkName="example.com", maxCheckFailures=5})
newServer({address="", weight=50, checkType="A",
checkName="example.com", maxCheckFailures=5, mustResolve=true})
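
For comparison, here is a minimal sketch of the same two backends with
the load-balancing policy set explicitly.  dnsdist's default policy is
leastOutstanding, which ranks servers by outstanding queries, then
order, then latency; as far as I know it does not consult weight at
all, and weight only matters for wrandom and whashed.  The addresses
below are placeholders from the documentation range (an assumption,
since the real ones are not shown in this post); the weights and check
settings are copied from the config above.

```lua
-- Placeholder backend addresses (RFC 5737 documentation range),
-- not the real servers from this thread.
newServer({address="192.0.2.1", weight=1000, checkType="A",
           checkName="example.com", maxCheckFailures=5, mustResolve=true})
newServer({address="192.0.2.2", weight=50, checkType="A",
           checkName="example.com", maxCheckFailures=5, mustResolve=true})

-- Select weighted random balancing so the weight values take effect.
-- Without this line dnsdist falls back to its default policy,
-- leastOutstanding.
setServerPolicy(wrandom)
```

With wrandom the first backend should receive roughly 1000/1050 of the
queries while both are up.  If a strict primary/fallback arrangement is
wanted instead, setServerPolicy(firstAvailable) combined with the order
option may be the closer fit.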

> How do you "take down" the node, and more importantly does dnsdist
> correctly mark it up when you bring it back?

I just shut down named on the backend node.  Yes, dnsdist does mark it
correctly as up/down/up, but traffic does not shift back to it until
dnsdist is restarted.  Admittedly this was not a long test: probably 5
minutes at most after bringing the other node back in, but in that
time traffic did not shift back.  To get the desired behavior in my
test environment I had to add the order option.  With that, traffic
flips right back when I bring the .74 node back online.

I was going to add a third backend node to do a little more
traffic-shift testing.  We have anycast environments throughout the
world, and I'm trying to come up with a strategy that doesn't blast
traffic all over the place until load or redundancy needs require it.
An initial implementation on a busy cluster in one part of the US
spread traffic consistently across multiple backend nodes around the
world.  I suppose that wouldn't have been so bad, relying on latency
and so on, until an app with numerous 60-second TTLs started hitting
this cluster.  I inadvertently caused them a bit of latency, so I'm
definitely going to need to tune our installation, and I want to make
sure I understand the traffic patterns so I can tune it properly for
our environment.
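
As a rough illustration of why a stateless weighted policy would be
expected to shift traffic back as soon as a backend is marked up again,
here is a minimal Lua sketch.  It is not dnsdist's code: pickWeighted,
the server names and the up flag are all hypothetical, and only the
1000 and 50 weights come from the config earlier in this message.

```lua
-- Hypothetical helper, NOT dnsdist's implementation: pick one server
-- by weight among those currently marked up.
-- `servers` is a list of {name=..., weight=..., up=...} tables.
-- `r` is a number in [0, 1), passed in so a choice can be reproduced.
local function pickWeighted(servers, r)
  local total = 0
  for _, s in ipairs(servers) do
    if s.up then total = total + s.weight end
  end
  if total == 0 then return nil end  -- no backend available

  local target = r * total
  local acc = 0
  local last = nil
  for _, s in ipairs(servers) do
    if s.up then
      acc = acc + s.weight
      last = s
      if target < acc then return s end
    end
  end
  return last  -- fallback for floating-point edge cases
end

-- Illustrative server list; names are made up for this sketch.
local servers = {
  {name="heavy", weight=1000, up=true},
  {name="light", weight=50,   up=true},
}

-- Each pick depends only on the current weights and up/down flags,
-- so nothing is remembered from earlier queries.
local counts = {heavy=0, light=0}
for i = 1, 10000 do
  local s = pickWeighted(servers, math.random())
  counts[s.name] = counts[s.name] + 1
end
print(counts.heavy, counts.light)
```

Under these assumptions, setting servers[1].up to false sends every
pick to "light", and setting it back to true restores the roughly
1000:50 split on the very next pick, since no per-client or per-query
history is kept.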

