[dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Rais Ahmed
rais.ahmed at tes.com.pk
Thu Mar 24 10:11:26 UTC 2022
Hi,
Thanks for the guidance...!
We are testing multiple scenarios, with and without kernel tuning. We observed UDP receive errors on both backend servers (and not a single UDP error on the dnsdist LB server).
Tested with resperf at 15K QPS:
resperf -s 192.168.0.1 -R -d queryfile-example-10million-201202 -C 100 -c 300 -r 0 -m 15000 -q 200000
Backend 1: 192.168.1.1 (without Kernel tuning):
netstat -su
IcmpMsg:
    InType3: 2229
    InType8: 6
    InType11: 194
    OutType0: 6
    OutType3: 762
Udp:
    1634847 packets received
    843 packets to unknown port received.
    193891 packet receive errors
    1859642 packets sent
    193891 receive buffer errors
    0 send buffer errors
UdpLite:
IpExt:
    InOctets: 580762744
    OutOctets: 237368675
    InNoECTPkts: 1995692
    InECT0Pkts: 27
Backend 2: 192.168.1.2 (with Kernel Tuning):
netstat -su
IcmpMsg:
    InType3: 19177
    InType8: 5802
    InType11: 2645
    OutType0: 5802
    OutType3: 5122
Udp:
    10798358 packets received
    6846 packets to unknown port received.
    4815377 packet receive errors
    11949871 packets sent
    4815377 receive buffer errors
    0 send buffer errors
UdpLite:
IpExt:
    InNoRoutes: 11
    InOctets: 3312682950
    OutOctets: 1741771756
    InNoECTPkts: 16355120
    InECT1Pkts: 72
    InECT0Pkts: 92
    InCEPkts: 4
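The two sets of counters above are cumulative since boot and the backends have clearly seen different amounts of traffic, so the absolute error counts are hard to compare directly. A rough sketch of how the drop ratio could be compared instead is below. It assumes that "packets received" counts only datagrams actually delivered to the application, so that arrivals are roughly received plus receive buffer errors; the dictionary layout and the function name are only illustrative.

```python
# Counters copied from the two `netstat -su` outputs above.
backends = {
    "192.168.1.1 (without kernel tuning)": {"received": 1634847, "rcvbuf_errors": 193891},
    "192.168.1.2 (with kernel tuning)": {"received": 10798358, "rcvbuf_errors": 4815377},
}


def drop_ratio(received, rcvbuf_errors):
    """Fraction of arriving datagrams dropped because the socket buffer was full.

    Assumption: delivered and dropped datagrams are counted separately,
    so total arrivals = received + rcvbuf_errors.
    """
    return rcvbuf_errors / (received + rcvbuf_errors)


for name, counters in backends.items():
    print(f"{name}: {drop_ratio(**counters):.1%} of UDP datagrams dropped")
```

With these numbers the ratio comes out at roughly 10.6% for the first backend and 30.8% for the second. Since the counters cover the whole uptime, a cleaner comparison would probably be to take the difference between a `netstat -su` snapshot before and after each resperf run.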
Kernel tuning configured in /etc/rc.local:
ethtool -L eth0 combined 16
echo 52428800 > /proc/sys/net/netfilter/nf_conntrack_max
sysctl -w net.core.rmem_max=33554432
sysctl -w net.core.wmem_max=33554432
sysctl -w net.core.rmem_default=16777216
sysctl -w net.core.wmem_default=16777216
sysctl -w net.core.netdev_max_backlog=65536
sysctl -w net.core.somaxconn=1024
ulimit -n 16000
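One way to check whether the buffer settings above are actually reaching a socket is to ask the kernel what a freshly created UDP socket gets. On Linux, net.core.rmem_default is only applied when a socket is created and the application does not set SO_RCVBUF itself, while net.core.rmem_max caps what an application may request. The sketch below is a generic check, not something taken from BIND or dnsdist; the function name is made up for illustration.

```python
import socket


def effective_rcvbuf(requested=None):
    """Return the receive buffer size the kernel reports for a new UDP socket.

    If `requested` is given, SO_RCVBUF is set first, which mimics an
    application asking for its own buffer size (capped by rmem_max on Linux).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if requested is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    finally:
        sock.close()


if __name__ == "__main__":
    # Should reflect net.core.rmem_default if the sysctl is in effect.
    print("default receive buffer:", effective_rcvbuf())
    # Linux reports about double the granted value, capped by rmem_max.
    print("after requesting 32 MiB:", effective_rcvbuf(33554432))
```

This only shows what a new socket gets at the moment it is run. A daemon that was started before /etc/rc.local applied the sysctls might still be using sockets created with the old default, so restarting the DNS server after the tuning is in place may be worth testing.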
The network configuration and specs are the same on all three servers. Are we doing something wrong?
Regards,
Rais
-----Original Message-----
From: Klaus Darilion <klaus.darilion at nic.at>
Sent: Thursday, March 24, 2022 12:38 PM
To: Rais Ahmed <rais.ahmed at tes.com.pk>; dnsdist at mailman.powerdns.com
Subject: RE: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Have you tested how many qps your backend is capable of handling? First test your backend's performance on its own, so you know how many qps a single backend can handle. I guess 500k qps might be difficult to achieve with BIND. If you need more performance, switch the backend to NSD or Knot.
regards
Klaus
> -----Original Message-----
> From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of
> Rais Ahmed via dnsdist
> Sent: Wednesday, 23 March 2022 22:02
> To: dnsdist at mailman.powerdns.com
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
>
> Hi,
> Thanks for the reply...!
>
> We have configured setMaxUDPOutstanding(65535), but we still see the
> backends being marked down. The logs frequently show the following:
>
> Timeout while waiting for the health check response from backend
> 192.168.1.1:53
> Timeout while waiting for the health check response from backend
> 192.168.1.2:53
>
> Please have a look at the dnsdist configuration below and help us find
> the misconfiguration. 16 listeners and 8+8 backends were added to match
> the available vCPUs (2 sockets x 8 cores):
>
> controlSocket('127.0.0.1:5199')
> setKey("")
>
> ---- Listen addresses
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
>
> ---- Back-end server
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=1})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=2})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=3})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=4})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=5})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=6})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=7})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=8})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=9})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=10})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=11})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=12})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=13})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=14})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=15})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=16})
>
> setMaxUDPOutstanding(65535)
>
> ---- Server Load Balancing Policy
> setServerPolicy(leastOutstanding)
>
> ---- Web-server
> webserver('192.168.0.1:8083')
> setWebserverConfig({acl='192.168.0.0/24', password='Secret'})
>
> ---- Customers Policy
> customerACLs={'192.168.1.0/24'}
> setACL(customerACLs)
>
> pc = newPacketCache(300000, {maxTTL=86400, minTTL=0,
> temporaryFailureTTL=60, staleTTL=60, dontAge=false})
> getPool(""):setCache(pc)
>
> setVerboseHealthChecks(true)
>
> Server specs are as below:
> dnsdist LB server: 16 vCPUs, 16 GB RAM, virtio NIC (10G) with 16
> multiqueues.
> Backend bind9 servers: 16 vCPUs, 16 GB RAM, virtio NIC (10G) with
> 16 multiqueues.
>
> We are trying to handle 500K qps (we will increase the hardware specs if
> required), or at least 100K qps with the above specs.
>
>
> Regards,
> Rais
>
> -----Original Message-----
> From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of
> dnsdist-request at mailman.powerdns.com
> Sent: Wednesday, March 23, 2022 5:00 PM
> To: dnsdist at mailman.powerdns.com
> Subject: dnsdist Digest, Vol 79, Issue 3
>
> Today's Topics:
>
> 1. dnsdist[29321]: Marking downstream IP:53 as 'down' (Rais Ahmed)
> 2. Re: dnsdist[29321]: Marking downstream IP:53 as 'down'
> (Remi Gacogne)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 22 Mar 2022 23:00:25 +0000
> From: Rais Ahmed <rais.ahmed at tes.com.pk>
> To: "dnsdist at mailman.powerdns.com" <dnsdist at mailman.powerdns.com>
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
> Message-ID:
> <PAXPR08MB70737E4E1CCEFC4A7F61E1E6A0179 at PAXPR08MB7073.eurprd08.prod.outlook.com>
>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> We have configured a dnsdist instance to handle around 500k QPS, but
> the downstream servers are frequently marked down once the QPS goes
> above 25k. Below are the logs we found related to the issue.
>
> dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
> dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
>
> ------------------------------
>
> Message: 2
> Date: Wed, 23 Mar 2022 10:32:22 +0100
> From: Remi Gacogne <remi.gacogne at powerdns.com>
> To: Rais Ahmed <rais.ahmed at tes.com.pk>, "dnsdist at mailman.powerdns.com"
> <dnsdist at mailman.powerdns.com>
> Subject: Re: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as
> 'down'
> Message-ID: <5a95cbeb-7c82-9bc1-0b4c-8726f814432e at powerdns.com>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Hi,
>
> > We have configured dnsdist instance to handle around 500k QPS, but we
> > are seeing downstream down frequently once QPS reached above 25k. below
> > are the logs which we found to relative issue.
> >
> > dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
> >
> > dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
>
> You might be able to get more information about why the health-checks
> are failing by adding setVerboseHealthChecks(true) to your configuration.
>
> It usually happens because the backend is overwhelmed and needs to be
> tuned to handle the load, but it might also be caused by a network
> issue, like a link reaching its maximum capacity, or by dnsdist itself
> being overwhelmed and needing tuning (like increasing the number of
> newServer() directives, see [1]).
>
> [1]:
> https://dnsdist.org/advanced/tuning.html#udp-and-incoming-dns-over-https
>
> Best regards,
> --
> Remi Gacogne
> PowerDNS.COM BV - https://www.powerdns.com/
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> dnsdist mailing list
> dnsdist at mailman.powerdns.com
> https://mailman.powerdns.com/mailman/listinfo/dnsdist
>
>
> ------------------------------
>
> End of dnsdist Digest, Vol 79, Issue 3
> **************************************