[dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'

Rais Ahmed rais.ahmed at tes.com.pk
Thu Mar 31 09:00:33 UTC 2022


Hi,

Thanks...!

We have achieved stability by switching the backend from BIND to pdns-recursor. The problem seems to have been abnormal behavior in BIND.

It would be great if anyone could guide us on best practices for achieving stable operation at maximum utilization. For example, with 4 vCPUs, can we configure 6-8 listener addresses (addLocal) and 6-8 backend servers (newServer), or is the recommendation to go with 4 addLocal and 4 newServer entries? We are planning for a 50K QPS load.

 Example:

Hardware:
4 CPUs
4GB RAM 

DNSDist Config:

---- Listen addresses
addLocal('192.168.1.1:53', { reusePort=true })
addLocal('192.168.1.1:53', { reusePort=true })
addLocal('192.168.1.1:53', { reusePort=true })
addLocal('192.168.1.1:53', { reusePort=true })
addLocal('192.168.1.1:53', { reusePort=true })
addLocal('192.168.1.1:53', { reusePort=true })
addLocal('192.168.1.1:53', { reusePort=true })
addLocal('192.168.1.1:53', { reusePort=true })

---- Back-end server
newServer({address='192.168.0.1', maxCheckFailures=5, checkInterval=5, qps=1000, order=1})
newServer({address='192.168.0.1', maxCheckFailures=5, checkInterval=5, qps=1000, order=2})
newServer({address='192.168.0.2', maxCheckFailures=5, checkInterval=5, qps=1000, order=3})
newServer({address='192.168.0.2', maxCheckFailures=5, checkInterval=5, qps=1000, order=4})
newServer({address='192.168.0.2', maxCheckFailures=5, checkInterval=5, qps=1000, order=5})
newServer({address='192.168.0.2', maxCheckFailures=5, checkInterval=5, qps=1000, order=6})

---- Server Load Balancing Policy
setServerPolicy(leastOutstanding)
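
For comparison, here is a minimal sketch of the 4-entry variant of the question, assuming one listener per vCPU and two newServer() entries per backend. The addresses are the placeholder ones from the example above, and the entry counts are illustrative assumptions, not a tested recommendation. The qps and order settings are left out on purpose: as far as the dnsdist documentation describes it, qps is only honored by the firstAvailable policy, so it would not cap a backend under leastOutstanding.

```lua
-- Sketch only: counts and addresses are assumptions, not a tested recommendation.

-- One UDP listener per vCPU (4 vCPUs assumed). With reusePort=true each
-- addLocal() entry binds its own socket on the same address and port.
for i = 1, 4 do
  addLocal('192.168.1.1:53', { reusePort=true })
end

-- Two newServer() entries per backend (4 in total), so responses from each
-- backend are handled by more than one thread.
for _, ip in ipairs({'192.168.0.1', '192.168.0.2'}) do
  for i = 1, 2 do
    newServer({address=ip, maxCheckFailures=5, checkInterval=5})
  end
end

-- Send each query to the backend with the fewest outstanding queries.
setServerPolicy(leastOutstanding)
```

Because the dnsdist configuration file is Lua, the loops are equivalent to writing the addLocal() and newServer() lines out one by one.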


Regards,
Rais 

-----Original Message-----
From: Klaus Darilion <klaus.darilion at nic.at> 
Sent: Thursday, March 24, 2022 12:38 PM
To: Rais Ahmed <rais.ahmed at tes.com.pk>; dnsdist at mailman.powerdns.com
Subject: AW: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'

Have you tested how many qps your backend is capable of handling? First test your backend's performance so you know how many qps a single backend can handle. I guess 500k qps might be difficult to achieve with BIND. If you need more performance, switch the backend to NSD or Knot.

regards
Klaus

> -----Original Message-----
> From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of
> Rais Ahmed via dnsdist
> Sent: Wednesday, March 23, 2022 22:02
> To: dnsdist at mailman.powerdns.com
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
> 
> Hi,
> Thanks for the reply!
> 
> We have configured setMaxUDPOutstanding(65535), but we are still seeing
> the backends marked down. The logs frequently show the following:
> 
> Timeout while waiting for the health check response from backend 192.168.1.1:53
> Timeout while waiting for the health check response from backend 192.168.1.2:53
> 
> Please have a look at the dnsdist configuration below and help us find
> the misconfiguration (16 listeners and 8+8 backends were added to match
> the available vCPUs, 2 sockets x 8 cores):
> 
> controlSocket('127.0.0.1:5199')
> setKey("")
> 
> ---- Listen addresses
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true })
> 
> ---- Back-end server
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=1})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=2})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=3})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=4})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=5})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=6})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=7})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=8})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=9})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=10})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=11})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=12})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=13})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=14})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=15})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=16})
> 
> setMaxUDPOutstanding(65535)
> 
> ---- Server Load Balancing Policy
> setServerPolicy(leastOutstanding)
> 
> ---- Web-server
> webserver('192.168.0.1:8083')
> setWebserverConfig({acl='192.168.0.0/24', password='Secret'})
> 
> ---- Customers Policy
> customerACLs={'192.168.1.0/24'}
> setACL(customerACLs)
> 
> pc = newPacketCache(300000, {maxTTL=86400, minTTL=0, 
> temporaryFailureTTL=60, staleTTL=60, dontAge=false})
> getPool(""):setCache(pc)
> 
> setVerboseHealthChecks(true)
> 
> Server specs are as follows:
> dnsdist LB server: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with 16
> multiqueues.
> Backend bind9 servers: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with
> 16 multiqueues.
> 
> We are trying to handle 500K qps (we will increase the hardware specs if
> required), or at least 100K qps with the above specs.
> 
> 
> Regards,
> Rais
> 
> -----Original Message-----
> From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of 
> dnsdist-request at mailman.powerdns.com
> Sent: Wednesday, March 23, 2022 5:00 PM
> To: dnsdist at mailman.powerdns.com
> Subject: dnsdist Digest, Vol 79, Issue 3
> 
> Message: 1
> Date: Tue, 22 Mar 2022 23:00:25 +0000
> From: Rais Ahmed <rais.ahmed at tes.com.pk>
> To: "dnsdist at mailman.powerdns.com" <dnsdist at mailman.powerdns.com>
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
> Message-ID: <PAXPR08MB70737E4E1CCEFC4A7F61E1E6A0179 at PAXPR08MB7073.eurprd08.prod.outlook.com>
> 
> Content-Type: text/plain; charset="us-ascii"
> 
> Hi,
> 
> We have configured a dnsdist instance to handle around 500k QPS, but we
> are seeing the downstreams go down frequently once QPS rises above 25k.
> Below are the logs we found that relate to the issue.
> 
> dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
> dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
> 
> ------------------------------
> 
> Message: 2
> Date: Wed, 23 Mar 2022 10:32:22 +0100
> From: Remi Gacogne <remi.gacogne at powerdns.com>
> To: Rais Ahmed <rais.ahmed at tes.com.pk>, "dnsdist at mailman.powerdns.com"
> 	<dnsdist at mailman.powerdns.com>
> Subject: Re: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as
> 	'down'
> Message-ID: <5a95cbeb-7c82-9bc1-0b4c-8726f814432e at powerdns.com>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> 
> Hi,
> 
>  > We have configured a dnsdist instance to handle around 500k QPS, but we
>  > are seeing the downstreams go down frequently once QPS rises above 25k.
>  > Below are the logs we found that relate to the issue.
>  >
>  > dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
>  >
>  > dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
> 
> You might be able to get more information about why the health-checks 
> are failing by adding setVerboseHealthChecks(true) to your configuration.
> 
> It usually happens because the backend is overwhelmed and needs to be 
> tuned to handle the load, but it might also be caused by a network 
> issue, like a link reaching its maximum capacity, or by dnsdist itself 
> being overwhelmed and needing tuning (like increasing the number of
> newServer() directives, see [1]).
> 
> [1]: https://dnsdist.org/advanced/tuning.html#udp-and-incoming-dns-over-https
> 
> Best regards,
> --
> Remi Gacogne
> PowerDNS.COM BV - https://www.powerdns.com/
> 
> 
> _______________________________________________
> dnsdist mailing list
> dnsdist at mailman.powerdns.com
> https://mailman.powerdns.com/mailman/listinfo/dnsdist