[dnsdist] Erratic CPU Usage

Jahanzeb Arshad jahanzeb at nayatel.com
Thu Dec 24 09:28:11 UTC 2020


The configuration file follows. The strace output is available here:

https://drive.google.com/file/d/1lRBNr6PB83zdbfMVtSKbzZ5ZKipy140f/view?usp=sharing


controlSocket('127.0.0.1:5199')
setConsoleACL('127.0.0.1/32')
setKey("KEY")

-- Add Listeners, one entry for each CPU Core for v4 and v6
addLocal("0.0.0.0:53", {reusePort=true}) -- Listen on IPv4, port 53
addLocal("0.0.0.0:53", {reusePort=true}) 
addLocal("0.0.0.0:53", {reusePort=true}) 
addLocal("0.0.0.0:53", {reusePort=true}) 
addLocal("[::]:53", {reusePort=true}) -- Listen to IPv6 port 53
addLocal("[::]:53", {reusePort=true})
addLocal("[::]:53", {reusePort=true})
addLocal("[::]:53", {reusePort=true})

-- Backend Servers, adding multiple UDP threads for backends to handle
-- more load
newServer({name="Caching-048-3-1", address="10.12.48.3",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})
newServer({name="Caching-048-3-2", address="10.12.48.3",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})
newServer({name="Caching-048-4-1", address="10.12.48.4",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})
newServer({name="Caching-048-4-2", address="10.12.48.4",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})
newServer({name="Caching-188-3-1", address="10.12.188.3",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})
newServer({name="Caching-188-3-2", address="10.12.188.3",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})
newServer({name="Caching-188-4-1", address="10.12.188.4",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})
newServer({name="Caching-188-4-2", address="10.12.188.4",
maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500})

-- Allow client IPs to connect
addACL("100.64.0.0/10")
addACL("2001:d000::/32")

-- Limit each /32 IPv4 and /128 IPv6 client: drop above 100 QPS (burst 20).
-- The commented-out rule would delay queries above 50 QPS (burst 10) by 500 ms.
-- addAction(MaxQPSIPRule(50,32,128,10), DelayAction(500))
addAction(MaxQPSIPRule(100,32,128,20), DropAction())

setServerPolicy(leastOutstanding)

-- Enable DNS query caching
-- 300K entries with avg 512B would take about 150MB of RAM
pc = newPacketCache(300000, {maxTTL=86400, minTTL=0,
temporaryFailureTTL=60, staleTTL=60, dontAge=false})
getPool(""):setCache(pc)

-- Enable Webserver
webserver("0.0.0.0:8083", "PASSWORD", "API-KEY", {}, "192.168.19.0/24")

-- Graphing Statistics
carbonServer("37.252.122.50", "NTL-DNSLB-GD-01", 30, "dnsdist", "main")
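
Since the configuration file is Lua, the repeated addLocal and newServer
entries above could also be generated with loops. This is only a minimal
sketch, assuming the same four listeners per address family and the same
backend settings as above; the names listenersPerFamily and backends are
illustrative and not part of dnsdist, and it has not been tested against
1.5.1.

```lua
-- Sketch only: loop-based equivalent of the explicit entries above.
-- listenersPerFamily and backends are hypothetical helper names.

-- One listener per CPU core for each address family (4 cores assumed).
local listenersPerFamily = 4
for _ = 1, listenersPerFamily do
  addLocal("0.0.0.0:53", {reusePort=true}) -- IPv4, port 53
  addLocal("[::]:53", {reusePort=true})    -- IPv6, port 53
end

-- Each backend address is registered twice to get two UDP threads.
local backends = {
  {label="048-3", address="10.12.48.3"},
  {label="048-4", address="10.12.48.4"},
  {label="188-3", address="10.12.188.3"},
  {label="188-4", address="10.12.188.4"},
}
for _, b in ipairs(backends) do
  for i = 1, 2 do
    newServer({
      name=string.format("Caching-%s-%d", b.label, i), -- e.g. Caching-048-3-1
      address=b.address,
      maxCheckFailures=3, checkInterval=5, rise=3, weight=5, qps=2500
    })
  end
end
```

This would replace the explicit addLocal and newServer lines rather than
be added alongside them, otherwise every listener and backend would be
declared twice.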

Regards

Jahanzeb

On Thu, 2020-12-24 at 09:56 +0100, Remi Gacogne via dnsdist wrote:
> Hi,
> 
> On 12/24/20 7:25 AM, Jahanzeb Arshad via dnsdist wrote:
> > We have deployed two instances of dnsdist v1.5.1 on CentOS 7.9.
> > After running for 7-8 days, both machines start showing an erratic
> > CPU usage pattern. The CPU usage jumps to 40%, then drops to 0, and
> > the servers keep doing this. Restarting the process fixes it. We
> > need some help to identify and fix this issue.
> 
> Would you mind posting the dnsdist configuration you are using, after
> redacting passwords and API keys, of course?
> 
> It would also be very helpful if you could do a "strace -f -p <pid of
> dnsdist>" for a few seconds after the process has started acting out.
> Getting a backtrace might also help, which can be done by attaching a
> debugger via "gdb -p <pid of dnsdist>" then issuing "thread apply all
> bt full".
> 
> Best regards,