[Pdns-users] LUA records to load balance

Adam Carrgilson AdamC at arcing.co.uk
Fri Oct 27 16:19:27 UTC 2023


Hi there,



I’m looking for some assistance and advice, as I’m hoping to replace an
existing load balancing solution entirely with PowerDNS and Lua records.



I’ve put quite a bit of work into this, but I’m reaching some dead ends
that I hope someone experienced might be able to help me with.



What I’m load balancing is a geographically dispersed object storage
system, consisting of three sites; each site consists of eight nodes, and
any of these can answer a query.

The system can run with one of the three sites completely down, or with any
number of the constituent nodes down, and we take advantage of this when
undertaking maintenance.



In my current approach, I run a DNS-based weighted round robin across the
sites; it is weighted because each site stores a different capacity, and I
want to prefer the least heavily utilised site. The weighted round robin
takes into account the overall health of the nodes making up each site and
won’t respond with a site that has a total outage. The response should be a
list of the addresses of all healthy nodes, potentially eight IPs. There are
two health checks that probe each node, one on an HTTP endpoint and another
on HTTPS.
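
The selection behaviour described above can be modelled outside PowerDNS. The following is a minimal Python sketch, not PowerDNS code: the SITES table, the health flags and the function names are all hypothetical, and the weights simply mirror the ones used later in this post.

```python
import random

# Hypothetical inventory: site name -> (weight, {node address: is_healthy}).
# The health flags are made up purely for illustration.
SITES = {
    "caimito-a.internal": (110, {"10.10.0.1": True, "10.10.0.2": False, "10.10.0.3": True}),
    "caimito-b.internal": (50, {"10.10.1.1": False, "10.10.1.2": False, "10.10.1.3": False}),
    "caimito-c.internal": (175, {"10.10.2.1": True, "10.10.2.2": True, "10.10.2.3": True}),
}


def healthy_nodes(nodes):
    """Return the addresses whose health flag is True."""
    return [addr for addr, up in nodes.items() if up]


def pick_site(sites, rng=random):
    """Weighted random choice among sites with at least one healthy node.

    Returns (site_name, healthy_addresses), or (None, []) when every site
    is fully down.
    """
    candidates = [
        (name, weight)
        for name, (weight, nodes) in sites.items()
        if healthy_nodes(nodes)
    ]
    if not candidates:
        return None, []
    names, weights = zip(*candidates)
    chosen = rng.choices(names, weights=weights, k=1)[0]
    return chosen, healthy_nodes(sites[chosen][1])


if __name__ == "__main__":
    site, addresses = pick_site(SITES)
    print(site, addresses)
```

In this model a site with a total outage (caimito-b here) is never chosen, and the answer contains only the healthy addresses of the chosen site.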



I understand I can achieve the weighted round robin in PowerDNS with a LUA
record using the pickwrandom function.

That was the easy part. I also thought I could use a separate record to
store the weighting configuration.

What I’ve come up with there is:



caimito-weighting.internal 0    IN  LUA LUA   "weightinga={'110'}; weightingb={'50'}; weightingc={'175'}"

caimito.internal           0    IN  LUA CNAME "; include('caimito-weighting.internal'); return pickwrandom({ {weightinga,  'caimito-a.internal'}, {weightingb,  'caimito-b.internal'}, {weightingc, 'caimito-c.internal'} })"
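
As a sanity check on the intended distribution, the expected share of answers per site can be worked out from the weights. This is a small Python sketch rather than PowerDNS code; it assumes the three weights are the plain numbers 110, 50 and 175, and the names WEIGHTS and expected_shares are hypothetical.

```python
# Weights taken as plain numbers (an assumption; the record above wraps
# each value in a table).
WEIGHTS = {
    "caimito-a.internal": 110,
    "caimito-b.internal": 50,
    "caimito-c.internal": 175,
}


def expected_shares(weights):
    """Return each target's expected fraction of answers under a weighted random pick."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    for name, share in expected_shares(WEIGHTS).items():
        print(f"{name}: {share:.1%}")
```

With a total weight of 335, the three sites should receive roughly 33%, 15% and 52% of the answers respectively.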



I had that working before I shifted it to using variables as the weighting,
and I can’t understand why it’s failing me now. I’m also not sure how I
would approach a site having no healthy IP addresses available.



That was using records like this:

caimito-a.internal      0   IN  A   10.10.0.1

caimito-a.internal      0   IN  A   10.10.0.2

caimito-a.internal      0   IN  A   10.10.0.3

caimito-b.internal      0   IN  A   10.10.1.1

caimito-b.internal      0   IN  A   10.10.1.2

caimito-b.internal      0   IN  A   10.10.1.3

caimito-c.internal      0   IN  A   10.10.2.1

caimito-c.internal      0   IN  A   10.10.2.2

caimito-c.internal      0   IN  A   10.10.2.3



Whereas what I’d like to do is something more akin to:

caimito-a.internal      0   IN  LUA  A   "ifurlup('http://10.10.0.1:9020/?ping', {'10.10.0.1'}, {stringmatch='Data Node is Available'})"

caimito-a.internal      0   IN  LUA  A   "ifurlup('http://10.10.0.2:9020/?ping', {'10.10.0.2'}, {stringmatch='Data Node is Available'})"

caimito-a.internal      0   IN  LUA  A   "ifurlup('http://10.10.0.3:9020/?ping', {'10.10.0.3'}, {stringmatch='Data Node is Available'})"

caimito-b.internal      0   IN  LUA  A   "ifurlup('http://10.10.1.1:9020/?ping', {'10.10.1.1'}, {stringmatch='Data Node is Available'})"

caimito-b.internal      0   IN  LUA  A   "ifurlup('http://10.10.1.2:9020/?ping', {'10.10.1.2'}, {stringmatch='Data Node is Available'})"

caimito-b.internal      0   IN  LUA  A   "ifurlup('http://10.10.1.3:9020/?ping', {'10.10.1.3'}, {stringmatch='Data Node is Available'})"

caimito-c.internal      0   IN  LUA  A   "ifurlup('http://10.10.2.1:9020/?ping', {'10.10.2.1'}, {stringmatch='Data Node is Available'})"

caimito-c.internal      0   IN  LUA  A   "ifurlup('http://10.10.2.2:9020/?ping', {'10.10.2.2'}, {stringmatch='Data Node is Available'})"

caimito-c.internal      0   IN  LUA  A   "ifurlup('http://10.10.2.3:9020/?ping', {'10.10.2.3'}, {stringmatch='Data Node is Available'})"



Only I’d somehow need to run a logical AND across two different ifurlup
checks:

"if (ifurlup('http://10.10.0.1:9020/?ping', {'10.10.0.1'}, {stringmatch='Data Node is Available'})) and (ifurlup('https://10.10.0.1:9021/?ping', {'10.10.0.1'}, {stringmatch='Data Node is Available'}))"



However, I understand that the ifurlup health checks don’t work that way,
because they fall back to returning a random result when all checks fail,
whereas I want no result on a failure.
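
To make the desired semantics concrete, here is a minimal Python sketch of the "both checks must pass, otherwise no answer" behaviour. It is a model only and does not use any PowerDNS API; the probe results and the strict_answer name are hypothetical.

```python
def strict_answer(probes):
    """Return only the addresses whose HTTP and HTTPS checks both passed.

    probes maps an address to a (http_ok, https_ok) pair. When no address
    passes both checks the result is an empty list, with no fallback.
    """
    return [
        addr
        for addr, (http_ok, https_ok) in probes.items()
        if http_ok and https_ok
    ]


if __name__ == "__main__":
    # Made-up probe results for one site.
    site_a = {
        "10.10.0.1": (True, True),    # both endpoints answer
        "10.10.0.2": (True, False),   # HTTPS check fails
        "10.10.0.3": (False, False),  # node is down
    }
    print(strict_answer(site_a))
```

The point of the sketch is the empty result: a site where every node fails at least one check yields no addresses at all, rather than a randomly chosen one.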

I spotted the feature request to ‘add a true/false flag to LUA test
functions’ (https://github.com/PowerDNS/pdns/issues/7468) that might aid me
here, but it wasn’t implemented.



I also noticed that the original design of the backupSelector included a
fallback of ‘none’ (https://github.com/PowerDNS/pdns/pull/6894), which
might also have been useful in my case, but I understand that it was removed
before the merge.



If anyone can offer suggestions or pointers on how I might best approach
any of this, I’m all ears.



Thanks in advance.

Adam.