[Pdns-users] PowerDNS issues
Andrey Sedletsky
asedletsky at spd-mgts.ru
Fri Sep 10 09:07:37 UTC 2021
> Hi there, and have a good day!
> This is Andrey Sedletsky, writing on behalf of PJSC MGTS (Moscow City
> Telephone Network).
>
> We are using your recursive DNS server (the open-source PowerDNS
> Recursor) and have several questions for you.
> One of our clients contacted us because the domain name "cm.taxi"
> could not be resolved.
> The request trace on the server shows that PowerDNS Recursor discards
> the answer from the authoritative server because the AA
> (Authoritative Answer) flag is not set.
>
> Sep 04 01:47:38 a975-icache02 pdns_recursor[2575]: Removing record
> 'cm.taxi|A|91.231.114.19' in the answer section without the AA bit set
> received from cm.taxi
> Sep 04 01:47:38 a975-icache02 pdns_recursor[2575]: Removing record
> 'cm.taxi|A|91.231.114.18' in the answer section without the AA bit set
> received from cm.taxi
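
For reference, the AA flag mentioned in the log lines above is a single bit in the 16-bit flags word of the DNS message header (RFC 1035, section 4.1.1). The following is a minimal Python sketch, not taken from PowerDNS itself, of how that bit can be read from a raw response. The function name `header_flags` and the sample header are made up for illustration.

```python
import struct

# Bit masks within the 16-bit flags word of the DNS header (RFC 1035, 4.1.1).
QR_MASK = 0x8000     # 1 = response, 0 = query
AA_MASK = 0x0400     # Authoritative Answer
RCODE_MASK = 0x000F  # response code (0 = NOERROR, 2 = SERVFAIL, ...)


def header_flags(message: bytes) -> dict:
    """Decode QR, AA and RCODE from a raw DNS message (hypothetical helper)."""
    if len(message) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # The first two 16-bit big-endian fields are the message ID and the flags.
    _msg_id, flags = struct.unpack("!HH", message[:4])
    return {
        "qr": bool(flags & QR_MASK),
        "aa": bool(flags & AA_MASK),
        "rcode": flags & RCODE_MASK,
    }


# Illustrative header only: a response (QR=1) with RD and RA set but AA clear,
# carrying two answer records, similar in shape to the case in the log.
sample = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 2, 0, 0)
print(header_flags(sample))
```

A response decoded this way with `aa` false and a non-empty answer section is the kind of message the log lines above complain about.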
>
> The full log can be found in the attachment, there is also a dump file
> illustrating the problem.
> So, our first question: is this normal behavior for PowerDNS
> Recursor, and can it be changed (globally or for specific zones)?
>
>
> Also, not long ago we had an issue after restarting the
> pdns-recursor process. After the restart (around 11 am), the number of
> SERVFAIL responses sent to clients began to increase.
> The load on the server at that moment was about 300 thousand requests
> per second.
> By the evening (around 22:00), SERVFAIL responses were approaching
> 30 percent of the total number of requests,
> and the call center began to receive a large number of complaints
> from subscribers who could not resolve domain names.
> By that time, the load had grown to 400 thousand requests per second
> (the usual value for that time of day).
> Switching to a backup server with an identical configuration (hardware
> and software) did not solve the problem; it was reproduced on the
> backup server too. Another restart did not help either.
> In the end, the problem was solved by reducing the parameter
> max-threads from 16 to 8.
> This raises several questions.
> What could be the reason for this behavior (before the problem
> occurred, the server had been working normally for several months at
> the same load and with the same configuration)?
> What tests should be performed to identify bottlenecks in the system
> and in pdns-recursor itself?
> What metrics should be monitored to prevent such situations from
> occurring?
> The attachment also contains a screenshot illustrating the
> situation at that time.
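
As one possible starting point for the monitoring question, the SERVFAIL share can be derived from two snapshots of the Recursor's counters. The sketch below assumes the snapshots are text in the name/value form printed by `rec_control get-all` and that the counters `questions` and `servfail-answers` are present in it. The helper names and the sample numbers are hypothetical.

```python
def parse_counters(text: str) -> dict[str, int]:
    """Parse 'name value' lines (whitespace-separated) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly a name followed by an integer.
        if len(parts) == 2 and parts[1].lstrip("-").isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def servfail_ratio(before: dict[str, int], after: dict[str, int]) -> float:
    """Share of SERVFAIL answers among questions between two snapshots."""
    questions = after["questions"] - before["questions"]
    servfails = after["servfail-answers"] - before["servfail-answers"]
    if questions <= 0:
        # No traffic, or counters were reset by a restart: nothing to report.
        return 0.0
    return servfails / questions


# Made-up snapshots for illustration, not data from the incident.
earlier = parse_counters("questions\t1000\nservfail-answers\t10\n")
later = parse_counters("questions\t2000\nservfail-answers\t310\n")
print(f"SERVFAIL share: {servfail_ratio(earlier, later):.1%}")
```

Computing the ratio over an interval rather than over lifetime totals makes a sudden rise visible even on a server that has been running for months.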
>
> One last question.
> Our company would like to have commercial support for your product. Is
> this possible and, if so, what do we need to do to arrange it?
Below is the link to the attachments:
https://cloud.mail.ru/public/3y53/RzaP6z2a6
>
> Additional information:
>
> >rec_control version
> 4.3.6
> > less /etc/oracle-release
> Oracle Linux Server release 8.4
> >2 CPUs (28 cores, 56 threads)
> >128 GB RAM
>
>
> PDNS was installed from EPEL Repo
> grep -i process recursor.conf
> # dnssec DNSSEC mode: off/process-no-validate (default)/process/log-fail/validate
> # dnssec=process-no-validate
>
>
>
> Best Regards,
> Andrey