[dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Klaus Darilion
klaus.darilion at nic.at
Thu Mar 24 11:13:23 UTC 2022
Indeed that might be a problem. We use (ferm syntax):
table raw {
# We want NOTRACK for incoming DNS queries and the corresponding
# outgoing answers. Outgoing DNS queries should still be tracked
# so that the corresponding answers are allowed back in.
chain PREROUTING {
proto (udp tcp) dport 53 NOTRACK;
}
chain OUTPUT {
proto (udp tcp) sport 53 NOTRACK;
}
}
Same for IPv4 and IPv6 in our case.
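For readers not using ferm, a sketch of what those rules expand to in plain iptables syntax (raw table, NOTRACK target); verify against your own ruleset before deploying:

```shell
# Equivalent of the ferm rules above: skip conntrack for incoming DNS
# queries and the answers we send, but keep tracking outgoing queries
# so their replies are still allowed back in by the filter rules.
iptables -t raw -A PREROUTING -p udp --dport 53 -j NOTRACK
iptables -t raw -A PREROUTING -p tcp --dport 53 -j NOTRACK
iptables -t raw -A OUTPUT -p udp --sport 53 -j NOTRACK
iptables -t raw -A OUTPUT -p tcp --sport 53 -j NOTRACK
# Repeat with ip6tables for IPv6.
```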
regards
Klaus
From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of Rasto Rickardt via dnsdist
Sent: Thursday, 24 March 2022 11:36
To: dnsdist at mailman.powerdns.com
Subject: Re: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Hello Rais,
I noticed that you are increasing nf_conntrack_max. I am not sure how the backend servers are connected,
but I suggest not using connection tracking/NAT at all. You could, for example, use a dedicated interface for backend
management and another one to connect to dnsdist.
r.
On 24/03/2022 11:11, Rais Ahmed via dnsdist wrote:
Hi,
Thanks for the guidance...!
We are testing multiple scenarios, with and without kernel tuning. We observed UDP packet errors on both backend servers (not a single UDP error on the dnsdist LB server).
Tested with resperf at 15K QPS:
resperf -s 192.168.0.1 -R -d queryfile-example-10million-201202 -C 100 -c 300 -r 0 -m 15000 -q 200000
Backend 1: 192.168.1.1 (without Kernel tuning):
netstat -su
IcmpMsg:
InType3: 2229
InType8: 6
InType11: 194
OutType0: 6
OutType3: 762
Udp:
1634847 packets received
843 packets to unknown port received.
193891 packet receive errors
1859642 packets sent
193891 receive buffer errors
0 send buffer errors
UdpLite:
IpExt:
InOctets: 580762744
OutOctets: 237368675
InNoECTPkts: 1995692
InECT0Pkts: 27
Backend 2: 192.168.1.2 (with Kernel Tuning):
netstat -su
IcmpMsg:
InType3: 19177
InType8: 5802
InType11: 2645
OutType0: 5802
OutType3: 5122
Udp:
10798358 packets received
6846 packets to unknown port received.
4815377 packet receive errors
11949871 packets sent
4815377 receive buffer errors
0 send buffer errors
UdpLite:
IpExt:
InNoRoutes: 11
InOctets: 3312682950
OutOctets: 1741771756
InNoECTPkts: 16355120
InECT1Pkts: 72
InECT0Pkts: 92
InCEPkts: 4
Kernel Tuning configured in /etc/rc.local
ethtool -L eth0 combined 16
echo 52428800 > /proc/sys/net/netfilter/nf_conntrack_max
sysctl -w net.core.rmem_max=33554432
sysctl -w net.core.wmem_max=33554432
sysctl -w net.core.rmem_default=16777216
sysctl -w net.core.wmem_default=16777216
sysctl -w net.core.netdev_max_backlog=65536
sysctl -w net.core.somaxconn=1024
ulimit -n 16000
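Since both backends report large "receive buffer errors" counts, one useful check is whether that counter is still growing while a test runs. A minimal sketch (Linux-only; it parses the two "Udp:" lines of /proc/net/snmp, which is where netstat -su gets these numbers):

```shell
# Sketch: read the kernel's cumulative UDP receive-buffer-error counter
# (the RcvbufErrors column of the "Udp:" lines in /proc/net/snmp, the
# same value netstat -su reports as "receive buffer errors").
rcvbuf_errors() {
  awk '/^Udp:/ { if (!seen) { split($0, hdr); seen = 1 } else split($0, val) }
       END { for (i in hdr) if (hdr[i] == "RcvbufErrors") print val[i] }' /proc/net/snmp
}

# Sample twice, 5 seconds apart; a growing counter during a load test
# means the resolver processes still cannot drain their sockets fast
# enough, so raising dnsdist-side limits alone will not help.
a=$(rcvbuf_errors); sleep 5; b=$(rcvbuf_errors)
echo "RcvbufErrors delta over 5s: $((b - a))"
```

If the delta stays at zero under load while health checks still time out, the bottleneck is elsewhere (conntrack, NIC queues, or dnsdist itself).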
Network config/specs are the same on all three servers; are we doing something wrong?
Regards,
Rais
-----Original Message-----
From: Klaus Darilion <klaus.darilion at nic.at>
Sent: Thursday, March 24, 2022 12:38 PM
To: Rais Ahmed <rais.ahmed at tes.com.pk>; dnsdist at mailman.powerdns.com
Subject: RE: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Have you tested how many QPS your backend is capable of handling? First test your backend's performance so you know how many QPS a single backend can handle. I guess 500k QPS might be difficult to achieve with BIND. If you need more performance, switch the backend to NSD or Knot.
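One way to get that baseline, as a sketch: run the same resperf load directly against a single backend, bypassing dnsdist, reusing the flags from the earlier test (the query file name is the poster's own):

```shell
# Baseline sketch: drive resperf straight at one backend (no dnsdist in
# the path) to learn that backend's own QPS ceiling first. Flags mirror
# the test quoted earlier in the thread; raise -m stepwise until the
# response rate stops following the query rate.
resperf -s 192.168.1.1 -R -d queryfile-example-10million-201202 \
        -C 100 -c 300 -r 0 -m 15000 -q 200000
```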
regards
Klaus
-----Original Message-----
From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of
Rais Ahmed via dnsdist
Sent: Wednesday, 23 March 2022 22:02
To: dnsdist at mailman.powerdns.com
Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Hi,
Thanks for reply...!
We have configured setMaxUDPOutstanding(65535) and we are still seeing
backends marked down; the logs frequently show the following.
Timeout while waiting for the health check response from backend
192.168.1.1:53
Timeout while waiting for the health check response from backend
192.168.1.2:53
Please have a look at the dnsdist configuration below and help us find
any misconfiguration (16 listeners and 8+8 backend entries, added to
match the available vCPUs
(2 sockets x 8 cores)):
controlSocket('127.0.0.1:5199')
setKey("")
---- Listen addresses
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
addLocal('192.168.0.1:53', { reusePort=true })
---- Back-end server
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=1})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=2})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=3})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=4})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=5})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=6})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=7})
newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=8})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=9})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=10})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=11})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=12})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=13})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=14})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=15})
newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=16})
setMaxUDPOutstanding(65535)
---- Server Load Balancing Policy
setServerPolicy(leastOutstanding)
---- Web-server
webserver('192.168.0.1:8083')
setWebserverConfig({acl='192.168.0.0/24', password='Secret'})
---- Customers Policy
customerACLs={'192.168.1.0/24'}
setACL(customerACLs)
pc = newPacketCache(300000, {maxTTL=86400, minTTL=0, temporaryFailureTTL=60, staleTTL=60, dontAge=false})
getPool(""):setCache(pc)
setVerboseHealthChecks(true)
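As an aside, not a fix: the sixteen hand-copied newServer() stanzas above are error-prone to maintain. A sketch of generating them from a loop instead (shell shown here; a plain Lua for-loop directly inside the dnsdist configuration would work just as well):

```shell
# Sketch: emit one newServer() stanza per worker, 8 copies per backend
# (matching the 8+8 layout above), instead of hand-copying the lines.
gen_newservers() {
  order=0
  for addr in 192.168.1.1 192.168.1.2; do
    i=1
    while [ "$i" -le 8 ]; do
      order=$((order + 1))
      echo "newServer({address='$addr', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=$order})"
      i=$((i + 1))
    done
  done
}
gen_newservers
```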
Server specs are as below:
dnsdist LB server: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with 16
multiqueues.
Backend BIND 9 servers: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with
16 multiqueues.
We are trying to handle 500K QPS (we will increase the hardware specs if
required), or at least 100K QPS with the above specs.
Regards,
Rais
-----Original Message-----
From: dnsdist <dnsdist-bounces at mailman.powerdns.com> On Behalf Of
dnsdist-request at mailman.powerdns.com
Sent: Wednesday, March 23, 2022 5:00 PM
To: dnsdist at mailman.powerdns.com
Subject: dnsdist Digest, Vol 79, Issue 3
Send dnsdist mailing list submissions to
dnsdist at mailman.powerdns.com
To subscribe or unsubscribe via the World Wide Web, visit
https://mailman.powerdns.com/mailman/listinfo/dnsdist
or, via email, send a message with subject or body 'help' to
dnsdist-request at mailman.powerdns.com
You can reach the person managing the list at
dnsdist-owner at mailman.powerdns.com
When replying, please edit your Subject line so it is more specific than "Re:
Contents of dnsdist digest..."
Today's Topics:
1. dnsdist[29321]: Marking downstream IP:53 as 'down' (Rais Ahmed)
2. Re: dnsdist[29321]: Marking downstream IP:53 as 'down'
(Remi Gacogne)
----------------------------------------------------------------------
Message: 1
Date: Tue, 22 Mar 2022 23:00:25 +0000
From: Rais Ahmed <rais.ahmed at tes.com.pk>
To: dnsdist at mailman.powerdns.com
Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
Message-ID:
<PAXPR08MB70737E4E1CCEFC4A7F61E1E6A0179 at PAXPR08MB7073.eurprd08.prod.outlook.com>
Content-Type: text/plain; charset="us-ascii"
Hi,
We have configured a dnsdist instance to handle around 500k QPS, but we
are seeing the downstreams marked down frequently once QPS rises above 25k.
Below are the log lines we found related to the issue.
dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
------------------------------
Message: 2
Date: Wed, 23 Mar 2022 10:32:22 +0100
From: Remi Gacogne <remi.gacogne at powerdns.com>
To: Rais Ahmed <rais.ahmed at tes.com.pk>, dnsdist at mailman.powerdns.com
Subject: Re: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as
'down'
Message-ID: <5a95cbeb-7c82-9bc1-0b4c-8726f814432e at powerdns.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi,
> We have configured dnsdist instance to handle around 500k QPS, but we
> are seeing downstream down frequently once QPS reached above 25k.
> below are the logs which we found to relative issue.
>
> dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
>
> dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
You might be able to get more information about why the health-checks
are failing by adding setVerboseHealthChecks(true) to your configuration.
It usually happens because the backend is overwhelmed and needs to be
tuned to handle the load, but it might also be caused by a network
issue, like a link reaching its maximum capacity, or by dnsdist itself
being overwhelmed and needing tuning (like increasing the number of
newServer() directives, see [1]).
[1]: https://dnsdist.org/advanced/tuning.html#udp-and-incoming-dns-over-https
Best regards,
--
Remi Gacogne
PowerDNS.COM BV - https://www.powerdns.com/
------------------------------
Subject: Digest Footer
_______________________________________________
dnsdist mailing list
dnsdist at mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/dnsdist
------------------------------
End of dnsdist Digest, Vol 79, Issue 3
**************************************