[dnsdist] Clarification on weight in newServer option
Frank Even
lists+powerdns.com at elitists.org
Fri Jul 28 08:30:45 UTC 2017
On Fri, Jul 28, 2017 at 1:02 AM, Remi Gacogne <remi.gacogne at powerdns.com> wrote:
> On 07/27/2017 07:53 PM, Frank Even wrote:
>> So, weight seems to be honored on initial traffic receipt. But if I
>> test by taking down the node with the higher weight, so that traffic
>> shifts to the lower-weighted nodes, and then bring the heavier-weighted
>> node back into rotation, traffic does not seem to shift back to it.
>> Maybe it eventually does, but in 5 minutes or so of testing it did not.
>> Even after taking the test traffic away from the client and re-applying
>> it a couple of minutes later, traffic still went to the lesser-weighted
>> node, not the heavier-weighted one. I guess the question is: if left
>> alone for a longer length of time, would it eventually start honoring
>> the weight?
>
> That's not expected, especially since we keep no state to do the
> load-balancing. Which policy are you using, wrandom or whashed?
Whatever the default is. Is whashed the default? Here's my config (minus
the ACLs and setKey); I kept it simple for testing:
# dnsdist -V
dnsdist 1.1.0 (Lua 5.1)
Enabled features: dnscrypt libsodium protobuf re2 systemd
--
-- listen on following address:port
addLocal("10.36.181.95:53")
--
-- backend resolvers
--
newServer({address="10.36.191.74", order=0, weight=1000,
checkType="A", checkName="example.com", maxCheckFailures=5,
mustResolve=true})
newServer({address="10.36.191.75", weight=50, checkType="A",
checkName="example.com", maxCheckFailures=5, mustResolve=true})
--
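I haven't set a policy explicitly anywhere. My understanding (please
correct me if I'm off) is that wrandom picks a backend at random in
proportion to its weight, so with weights 1000 and 50 the .74 box should
see roughly 1000/1050, about 95%, of the queries, while whashed applies
the same weighting but hashes the query name so a given name sticks to
one backend. If it would help the test, I assume I could pin the policy
explicitly with something like:

-- assumption on my part: explicitly select the weighted-random policy
-- instead of relying on whatever the default is
setServerPolicy(wrandom)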
> How do you "take down" the node, and more importantly does dnsdist
> correctly mark it up when you bring it back?
I just shut down named on the backend node. Yes, dnsdist does mark it
correctly as up, then down, then up again. But traffic does not shift
back to it until dnsdist is restarted.
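For what it's worth, I believe the up/down state can be watched from the
dnsdist console with something like the following (the exact columns may
differ by version):

showServers()  -- lists each backend along with its current status, order and weight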
Granted, this was not a long test, probably 5 minutes at most after
bringing the other node back in, but in that time traffic did not shift
back. To get the desired behavior in my test environment I had to add
the order option; with that, traffic flips right back when I bring the
.74 node back online.
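For reference, the ordered variant I'm testing looks roughly like this;
the explicit order on .75 and the firstAvailable policy are just how I'd
sketch it, not necessarily what's required (my understanding is that
order-aware policies prefer the lowest-order backend that is up, so .75
would only take traffic while .74 is marked down):

-- sketch only: pick the first available backend, lowest order first
setServerPolicy(firstAvailable)
newServer({address="10.36.191.74", order=0, weight=1000, checkType="A", checkName="example.com", maxCheckFailures=5, mustResolve=true})
newServer({address="10.36.191.75", order=1, weight=50, checkType="A", checkName="example.com", maxCheckFailures=5, mustResolve=true})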
I was going to add a third backend node to do a little more
traffic-shift testing, as we have anycast environments throughout the
world and I'm trying to come up with a strategy that doesn't blast
traffic all over the place until load or redundancy needs require it.
An initial implementation on a busy cluster in one part of the US
consistently spread traffic across backend nodes around the world. I
suppose that wouldn't have been so bad, relying on latency and so on,
until we had an app with numerous 60-second TTLs hitting this cluster.
I inadvertently caused them a bit of latency, so I definitely need to
tune our installation, and I want to make sure I understand the traffic
patterns so I can tune it properly for our environment.
Thanks!