[dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'

Rais Ahmed rais.ahmed at tes.com.pk
Thu Mar 24 10:11:26 UTC 2022


Hi,

Thanks for the guidance!

We are testing multiple scenarios, with and without kernel tuning. We observed UDP packet receive errors on both backend servers, but not a single UDP error on the dnsdist LB server.

Tested with resperf at up to 15K QPS:
resperf -s 192.168.0.1 -R -d queryfile-example-10million-201202 -C 100 -c 300 -r 0 -m 15000 -q 200000

Backend 1: 192.168.1.1 (without Kernel tuning):
netstat -su
IcmpMsg:
    InType3: 2229
    InType8: 6
    InType11: 194
    OutType0: 6
    OutType3: 762
Udp:
    1634847 packets received
    843 packets to unknown port received.
    193891 packet receive errors
    1859642 packets sent
    193891 receive buffer errors
    0 send buffer errors
UdpLite:
IpExt:
    InOctets: 580762744
    OutOctets: 237368675
    InNoECTPkts: 1995692
    InECT0Pkts: 27

Backend 2: 192.168.1.2 (with Kernel Tuning):
netstat -su
IcmpMsg:
    InType3: 19177
    InType8: 5802
    InType11: 2645
    OutType0: 5802
    OutType3: 5122
Udp:
    10798358 packets received
    6846 packets to unknown port received.
    4815377 packet receive errors
    11949871 packets sent
    4815377 receive buffer errors
    0 send buffer errors
UdpLite:
IpExt:
    InNoRoutes: 11
    InOctets: 3312682950
    OutOctets: 1741771756
    InNoECTPkts: 16355120
    InECT1Pkts: 72
    InECT0Pkts: 92
    InCEPkts: 4
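
To put the two "receive buffer errors" counters in proportion, here is a minimal Python sketch that turns the Udp: section of the netstat -su output above into a drop ratio. It assumes that "packets received" counts only datagrams actually delivered to a socket and that "packet receive errors" counts the ones dropped, so the two are disjoint. The function name udp_drop_ratio is made up for this illustration. The counters are cumulative since boot, so the two backends are not strictly comparable unless they were reset or sampled before and after the same test run.

```python
import re


def udp_drop_ratio(netstat_output: str) -> float:
    """Return dropped / (delivered + dropped) from `netstat -su` text.

    Assumption: 'packets received' and 'packet receive errors' are
    disjoint counters (delivered vs. dropped datagrams).
    """
    received = int(re.search(r"(\d+) packets received", netstat_output).group(1))
    errors = int(re.search(r"(\d+) packet receive errors", netstat_output).group(1))
    return errors / (received + errors)


# Figures copied from the two outputs quoted in this message.
backend1 = """Udp:
    1634847 packets received
    843 packets to unknown port received.
    193891 packet receive errors
"""
backend2 = """Udp:
    10798358 packets received
    6846 packets to unknown port received.
    4815377 packet receive errors
"""

print(f"backend 1: {udp_drop_ratio(backend1):.1%}")  # backend 1: 10.6%
print(f"backend 2: {udp_drop_ratio(backend2):.1%}")  # backend 2: 30.8%
```

In both cases every receive error is a receive buffer error, which means the socket buffer filled up before the application read from it.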

Kernel tuning configured in /etc/rc.local:

ethtool -L eth0 combined 16
echo 52428800 > /proc/sys/net/netfilter/nf_conntrack_max
sysctl -w net.core.rmem_max=33554432
sysctl -w net.core.wmem_max=33554432
sysctl -w net.core.rmem_default=16777216
sysctl -w net.core.wmem_default=16777216
sysctl -w net.core.netdev_max_backlog=65536
sysctl -w net.core.somaxconn=1024
ulimit -n 16000
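
One thing worth checking with these settings: net.core.rmem_default only applies to sockets created after the value is changed, and a buffer size that an application requests itself through SO_RCVBUF is capped by net.core.rmem_max. The Python sketch below opens a fresh UDP socket and reports the receive buffer size the kernel gives it, which is one way to confirm that the default really took effect on a given host. The function name is hypothetical, and the sketch says nothing about what buffer size the DNS server process itself ends up with; that depends on when it was started and whether it sets its own value, which this thread does not show.

```python
import socket


def default_udp_rcvbuf_bytes() -> int:
    """Return the receive buffer size of a newly created UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    # On Linux this is expected to reflect net.core.rmem_default
    # for sockets created after the sysctl was applied.
    print(default_udp_rcvbuf_bytes())
```

If the printed value matches the configured default but the backend still reports receive buffer errors, the daemon was probably started before the change or is simply not reading from the socket fast enough.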

The network configuration and specs are the same on all three servers. Are we doing something wrong?


Regards,
Rais 

-----Original Message-----
From: Klaus Darilion <klaus.darilion at nic.at> 
Sent: Thursday, March 24, 2022 12:38 PM
To: Rais Ahmed <rais.ahmed at tes.com.pk>; dnsdist at mailman.powerdns.com
Subject: RE: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'

Have you tested how many qps your backend is capable of handling? First test your backend's performance directly to find out how many qps a single backend can handle. I guess 500k qps might be difficult to achieve with BIND. If you need more performance, switch the backend to NSD or Knot.
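
As a rough illustration of querying a backend directly, bypassing dnsdist, here is a small Python sketch that sends A queries one at a time over UDP and reports how many were answered and the mean round-trip time. It is a sanity probe only, not a load generator; resperf or dnsperf pointed straight at the backend address remains the right tool for measuring qps. The function names, the query name and the default timeout are all assumptions made for this example.

```python
import socket
import struct
import time


def build_query(name: str, qid: int) -> bytes:
    """Encode a minimal DNS query (type A, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, name: str, count: int = 100,
          timeout: float = 1.0, port: int = 53):
    """Send `count` queries sequentially; return (answered, mean_rtt_seconds)."""
    answered, total_rtt = 0, 0.0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for i in range(count):
            qid = i & 0xFFFF
            start = time.monotonic()
            deadline = start + timeout
            sock.sendto(build_query(name, qid), (server, port))
            while True:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break  # no matching answer within the timeout
                sock.settimeout(remaining)
                try:
                    data, _ = sock.recvfrom(4096)
                except socket.timeout:
                    break
                # Ignore late answers that belong to earlier queries.
                if data[:2] == struct.pack("!H", qid):
                    answered += 1
                    total_rtt += time.monotonic() - start
                    break
    return answered, (total_rtt / answered if answered else None)


if __name__ == "__main__":
    # 192.168.1.1 is the first backend from this thread; the name is a placeholder.
    print(probe("192.168.1.1", "example.com", count=100))
```

Unanswered queries while the backend is under resperf load would point at the backend or the network path rather than at dnsdist.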

regards
Klaus

> -----Original Message-----
> From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of 
> Rais Ahmed via dnsdist
> Sent: Wednesday, 23 March 2022 22:02
> To: dnsdist at mailman.powerdns.com
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
> 
> Hi,
> Thanks for the reply!
> 
> We have configured setMaxUDPOutstanding(65535), but the backends are 
> still being marked down. The logs frequently show the following:
> 
> Timeout while waiting for the health check response from backend
> 192.168.1.1:53
> Timeout while waiting for the health check response from backend
> 192.168.1.2:53
> 
> Please have a look at the dnsdist configuration below and help us find 
> the misconfiguration (16 listeners and 8+8 backends, added to match the 
> available vCPUs: 2 sockets x 8 cores):
> 
> controlSocket('127.0.0.1:5199')
> setKey("")
> 
> ---- Listen addresses
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true }) 
> addLocal('192.168.0.1:53', { reusePort=true })
> 
> ---- Back-end server
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=1})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=2})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=3})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=4})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=5})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=6})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=7})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=8})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=9})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=10})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=11})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=12})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=13})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=14})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=15})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=16})
> 
> setMaxUDPOutstanding(65535)
> 
> ---- Server Load Balancing Policy
> setServerPolicy(leastOutstanding)
> 
> ---- Web-server
> webserver('192.168.0.1:8083')
> setWebserverConfig({acl='192.168.0.0/24', password='Secret'})
> 
> ---- Customers Policy
> customerACLs={'192.168.1.0/24'}
> setACL(customerACLs)
> 
> pc = newPacketCache(300000, {maxTTL=86400, minTTL=0, 
> temporaryFailureTTL=60, staleTTL=60, dontAge=false})
> getPool(""):setCache(pc)
> 
> setVerboseHealthChecks(true)
> 
> Server specs are as below:
> dnsdist LB server: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with 16 
> multiqueues.
> Backend bind9 servers: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with 
> 16 multiqueues.
> 
> We are trying to handle 500K qps (we will increase the hardware specs if 
> required), or at least 100K qps with the above specs.
> 
> 
> Regards,
> Rais
> 
> -----Original Message-----
> From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of 
> dnsdist-request at mailman.powerdns.com
> Sent: Wednesday, March 23, 2022 5:00 PM
> To: dnsdist at mailman.powerdns.com
> Subject: dnsdist Digest, Vol 79, Issue 3
> 
> Today's Topics:
> 
>    1. dnsdist[29321]: Marking downstream IP:53 as 'down' (Rais Ahmed)
>    2. Re: dnsdist[29321]: Marking downstream IP:53 as 'down'
>       (Remi Gacogne)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Tue, 22 Mar 2022 23:00:25 +0000
> From: Rais Ahmed <rais.ahmed at tes.com.pk>
> To: "dnsdist at mailman.powerdns.com" <dnsdist at mailman.powerdns.com>
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
> Message-ID:
> 	<PAXPR08MB70737E4E1CCEFC4A7F61E1E6A0179 at PAXPR08MB7073.eurprd08.prod.outlook.com>
> 
> Content-Type: text/plain; charset="us-ascii"
> 
> Hi,
> 
> We have configured a dnsdist instance to handle around 500k QPS, but we 
> are seeing the downstream servers marked down frequently once QPS goes 
> above 25k. Below are the logs we found relating to the issue.
> 
> dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
> dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
> 
> ------------------------------
> 
> Message: 2
> Date: Wed, 23 Mar 2022 10:32:22 +0100
> From: Remi Gacogne <remi.gacogne at powerdns.com>
> To: Rais Ahmed <rais.ahmed at tes.com.pk>, "dnsdist at mailman.powerdns.com"
> 	<dnsdist at mailman.powerdns.com>
> Subject: Re: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as
> 	'down'
> Message-ID: <5a95cbeb-7c82-9bc1-0b4c-8726f814432e at powerdns.com>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> 
> Hi,
> 
>  > We have configured a dnsdist instance to handle around 500k QPS, but we
>  > are seeing the downstream servers marked down frequently once QPS goes
>  > above 25k. Below are the logs we found relating to the issue.
>  >
>  > dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
>  >
>  > dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
> 
> You might be able to get more information about why the health-checks 
> are failing by adding setVerboseHealthChecks(true) to your configuration.
> 
> It usually happens because the backend is overwhelmed and needs to be 
> tuned to handle the load, but it might also be caused by a network 
> issue, like a link reaching its maximum capacity, or by dnsdist itself 
> being overwhelmed and needing tuning (like increasing the number of
> newServer() directives, see [1]).
> 
> [1]: https://dnsdist.org/advanced/tuning.html#udp-and-incoming-dns-over-https
> 
> Best regards,
> --
> Remi Gacogne
> PowerDNS.COM BV - https://www.powerdns.com/
> 
> 
> ------------------------------
> 
> Subject: Digest Footer
> 
> _______________________________________________
> dnsdist mailing list
> dnsdist at mailman.powerdns.com
> https://mailman.powerdns.com/mailman/listinfo/dnsdist

