[Pdns-users] Recursor 3.4-pre : rec_control get cache-bytes taking about 5sec after few hours running

GAVARRET, David david.gavarret at sfr.com
Fri Jan 20 10:38:22 UTC 2012


Hello,

We have just upgraded some more production servers to version 3.4-pre of pdns_recursor.
Over time (in our case, after about 18 hours), reading the statistics with "rec_control get-all" takes longer and longer, up to 5 seconds. Those 5 seconds also appear to be a timeout: once it is reached, the command fails with the following error:
" Fatal: Unable to receive message over control channel: Success " 

In the log file, the message is a bit longer:
" Error dealing with control socket request: Unable to send message over control channel '/var/run/powerdns//lsockcaf2V2': No such file or directory "

With version 3.3, statistics are always returned within 10 ms, even after hundreds of days of uptime.
The settings are identical on the servers running 3.3 and on those running 3.4. CPU usage and load average are also about the same.

It seems that the "cache-bytes" statistic is the one that takes so long. Here is the time measured to fetch each statistic individually with "rec_control get":

$ for id in `sudo rec_control --socket-dir=/var/run/powerdns/ get-all | cut -f 1`; do echo "$id : " ; time sudo rec_control --socket-dir=/var/run/powerdns/ get $id ; done
all-outqueries :
41460979

real    0m0.007s
user    0m0.000s
sys     0m0.000s
dlg-only-drops :
0

real    0m0.006s
user    0m0.004s
sys     0m0.000s
dont-outqueries :
151500

real    0m0.005s
user    0m0.004s
sys     0m0.000s
max-mthread-stack :
36264

real    0m0.006s
user    0m0.004s
sys     0m0.004s
outgoing-timeouts :
1776957

real    0m0.005s
user    0m0.004s
sys     0m0.000s
tcp-outqueries :
49713

real    0m0.006s
user    0m0.004s
sys     0m0.004s
throttled-out :
418136

real    0m0.006s
user    0m0.004s
sys     0m0.000s
throttled-outqueries :
418136

real    0m0.006s
user    0m0.004s
sys     0m0.004s
unreachables :
226341

real    0m0.006s
user    0m0.004s
sys     0m0.000s
answers-slow :
899924

real    0m0.006s
user    0m0.004s
sys     0m0.004s
answers0-1 :
17180424

real    0m0.005s
user    0m0.004s
sys     0m0.000s
answers1-10 :
8097358

real    0m0.005s
user    0m0.004s
sys     0m0.004s
answers10-100 :
12337003

real    0m0.005s
user    0m0.004s
sys     0m0.004s
answers100-1000 :
8438958

real    0m0.006s
user    0m0.004s
sys     0m0.000s
case-mismatches :
0

real    0m0.005s
user    0m0.000s
sys     0m0.008s
chain-resends :
150757

real    0m0.005s
user    0m0.004s
sys     0m0.000s
client-parse-errors :
23334

real    0m0.006s
user    0m0.004s
sys     0m0.000s
edns-ping-matches :
0

real    0m0.006s
user    0m0.004s
sys     0m0.000s
edns-ping-mismatches :
0

real    0m0.006s
user    0m0.000s
sys     0m0.004s
ipv6-outqueries :
0

real    0m0.006s
user    0m0.000s
sys     0m0.004s
no-packet-error :
146612428

real    0m0.006s
user    0m0.000s
sys     0m0.004s
noedns-outqueries :
41506921

real    0m0.006s
user    0m0.000s
sys     0m0.008s
noerror-answers :
251917785

real    0m0.006s
user    0m0.004s
sys     0m0.000s
noping-outqueries :
0

real    0m0.006s
user    0m0.004s
sys     0m0.000s
nsset-invalidations :
84551

real    0m0.006s
user    0m0.000s
sys     0m0.008s
nxdomain-answers :
12293232

real    0m0.006s
user    0m0.000s
sys     0m0.008s
over-capacity-drops :
0

real    0m0.006s
user    0m0.004s
sys     0m0.000s
qa-latency :
26

real    0m0.006s
user    0m0.000s
sys     0m0.012s
questions :
267200911

real    0m0.006s
user    0m0.000s
sys     0m0.004s
resource-limits :
1

real    0m0.006s
user    0m0.000s
sys     0m0.008s
server-parse-errors :
2

real    0m0.006s
user    0m0.000s
sys     0m0.004s
servfail-answers :
2964254

real    0m0.006s
user    0m0.004s
sys     0m0.008s
spoof-prevents :
0

real    0m0.005s
user    0m0.004s
sys     0m0.000s
tcp-client-overflow :
0

real    0m0.006s
user    0m0.000s
sys     0m0.004s
tcp-questions :
11471

real    0m0.005s
user    0m0.004s
sys     0m0.004s
unauthorized-tcp :
0

real    0m0.006s
user    0m0.000s
sys     0m0.004s
unauthorized-udp :
0

real    0m0.005s
user    0m0.000s
sys     0m0.008s
unexpected-packets :
183889

real    0m0.006s
user    0m0.004s
sys     0m0.000s
cache-bytes :
1467740916

real    0m4.875s <<<<<<
user    0m0.000s
sys     0m0.008s
cache-entries :
12811541

real    0m0.009s
user    0m0.004s
sys     0m0.000s
cache-hits :
16743327

real    0m0.005s
user    0m0.004s
sys     0m0.008s
cache-misses :
30214506

real    0m0.006s
user    0m0.004s
sys     0m0.000s
concurrent-queries :
111

real    0m0.006s
user    0m0.004s
sys     0m0.000s
malloc-bytes :
0

real    0m0.006s
user    0m0.004s
sys     0m0.000s
negcache-entries :
1600262

real    0m0.007s
user    0m0.000s
sys     0m0.008s
nsspeeds-entries :
48068

real    0m0.006s
user    0m0.004s
sys     0m0.000s
packetcache-bytes :
53881659

real    0m0.119s
user    0m0.000s
sys     0m0.004s
packetcache-entries :
502387

real    0m0.006s
user    0m0.000s
sys     0m0.004s
packetcache-hits :
220244614

real    0m0.005s
user    0m0.000s
sys     0m0.004s
packetcache-misses :
46970437

real    0m0.006s
user    0m0.004s
sys     0m0.000s
sys-msec :
8463024

real    0m0.005s
user    0m0.000s
sys     0m0.004s
tcp-clients :
0

real    0m0.005s
user    0m0.004s
sys     0m0.000s
throttle-entries :
14957

real    0m0.005s
user    0m0.004s
sys     0m0.000s
uptime :
64397

real    0m0.005s
user    0m0.000s
sys     0m0.004s
user-msec :
20958557

real    0m0.005s
user    0m0.008s
sys     0m0.000s
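
To rank the timings above instead of reading them one by one, the transcript can be piped through a small filter. This is only a minimal sketch: the function name slowest_stats is hypothetical, and it assumes the output has exactly the shape shown above (a "name :" line followed later by a "real 0mX.XXXs" line, as printed by bash's time keyword).

```shell
# Hypothetical helper: reads the transcript of the timing loop on stdin and
# prints "<seconds> <statistic>" lines, slowest first.
slowest_stats() {
    LC_ALL=C awk '
        # A line such as "cache-bytes : " names the statistic being timed.
        /^[a-z0-9-]+ :[[:space:]]*$/ { name = $1; next }
        # A line such as "real    0m4.875s <<<<<<" carries the wall-clock time.
        $1 == "real" && name != "" {
            split($2, t, "m")        # t[1] = minutes, t[2] = e.g. "4.875s"
            sub(/s$/, "", t[2])      # drop the trailing "s"
            printf "%.3f %s\n", t[1] * 60 + t[2], name
            name = ""
        }
    ' | LC_ALL=C sort -rn
}

# Possible usage (bash writes the "time" report to stderr, hence the 2>&1):
#   { for id in `sudo rec_control --socket-dir=/var/run/powerdns/ get-all | cut -f 1`; do
#         echo "$id : "; time sudo rec_control --socket-dir=/var/run/powerdns/ get $id
#     done; } 2>&1 | slowest_stats | head -n 5
```

On the transcript above, the first line printed should be the cache-bytes entry, with packetcache-bytes next.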



Here is our recursor.conf file:

setuid=20100
setgid=20100
socket-owner=pdns
socket-group=pdns
socket-mode=770
socket-dir=/var/run/powerdns
allow-from-file=/etc/powerdns/dns-resolver-allow-from
forward-zones-file=/etc/powerdns/dns-resolver-forward-zones
local-address=...
max-cache-entries=16000000
stack-size=250000
threads=4
logging-facility=0
version-string=3.4-pre

I can provide any other information if needed,

Kind Regards,

-- 
David Gavarret


