[Pdns-users] recursor fail to resolve

Sergio P Cesar sergio at winc.net
Mon May 4 13:19:08 UTC 2020

Thank you Remi,
I appreciate your reply.
I am not sure I have a leg to stand on with them; when I contacted them I got the old standard reply:
"No one else is having problems, only you."
I do wonder what the recursor does on transient failures, since a reply is never guaranteed. A packet may get dropped by any router in the path; an overloaded router or a bandwidth controller could just drop packets...
I am just grasping at straws here, as I am not good at this...
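
To make the question about transient failures concrete, here is a minimal sketch of the retry behaviour I have in mind. It is a simulation only and makes no claim about how the recursor or bind actually work: FlakyServer, resolve_with_retries, the server names and the address are all made up for illustration, and no real DNS packets are sent.

```python
from typing import Optional, Sequence


class FlakyServer:
    """Simulated authoritative server that drops the first query it receives.

    Hypothetical stand-in for a server that ignores the first packet of a
    new query and answers every one after that.
    """

    def __init__(self, name: str, answer: str) -> None:
        self.name = name
        self.answer = answer
        self.seen_first = False

    def query(self, qname: str) -> str:
        if not self.seen_first:
            # First packet is silently dropped; the client sees a timeout.
            self.seen_first = True
            raise TimeoutError(f"{self.name} dropped first query for {qname}")
        return self.answer


def resolve_with_retries(
    qname: str, servers: Sequence[FlakyServer], rounds: int = 2
) -> Optional[str]:
    """Try every server once per round; return the first answer received.

    Returns None when every attempt timed out (the caller would then
    report something like SERVFAIL).
    """
    for _ in range(rounds):
        for server in servers:
            try:
                return server.query(qname)
            except TimeoutError:
                continue  # move on to the next server, or the next round
    return None
```

With one round, both simulated servers drop their first packet and the lookup fails; with a second round, the first server answers. The price of more rounds is a longer wait before a failure can be reported when the servers really are down.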


On May 4, 2020 7:41:09 AM CDT, Remi Gacogne via Pdns-users <pdns-users at mailman.powerdns.com> wrote:
>On 5/1/20 10:31 PM, Sergio Cesar via Pdns-users wrote:
>> Thus the question remains: what do I need to change in the recursor
>> configuration to make it work as bind does and resolve even though it
>> looks like an issue at their end?
>I don't know how bind resolves this, but we are doing the right thing
>here: we get a delegation to two NS (mail1.alestra.net.mx. and
>dns.alestra.net.mx.) for s-s.mx. from the mx. zone, and both of these
>servers fail to respond to the first request we send to them. There is
>nothing else to try but return a SERVFAIL. This zone is broken and
>needs to be fixed.
>> On 5/1/2020 12:22 PM, Aki Tuomi wrote:
>>> Can you try with 'dig' instead? Also the logs seem truncated.
>>> I'm getting SERVFAIL intermittently too, which suggests a problem at
>>> their end. Their servers seem unresponsive sometimes, especially if
>>> you try
>>> dig s-s.mx @mail2.alestra.net.mx.
>>> dig s-s.mx @dns.alestra.net.mx.
>>> and wait some time (like 10 seconds) in between.
>>> Aki
>>>> On 05/01/2020 7:17 PM Sergio Cesar <sergio at winc.net> wrote:
>>>>   root at ns1:~# host s-s.mx
>>>> Host s-s.mx not found: 2(SERVFAIL)
>>>> root at ns1:~# cat /var/log/syslog | grep s-s.mx
>>>> May  1 09:42:51 ns1 pdns_server[16452]: Remote wants
>>>> 's-s/mx.winc.net|A', do = 1, bufsize = 1232 (4096): packetcache
>>>> May  1 11:08:43 ns1 pdns_recursor[22995]: 3 [38702/1] question for
>>>> 's-s.mx|A' from
>>>> May  1 11:08:46 ns1 pdns_recursor[22995]: 3 [38702/1] answer to
>>>> 's-s.m |A': 0 answers, 1 additional, took 5 packets, 3106.89 netw
>>>> 3110.29 tot ms, 0 throttled, 2 timeouts, 0 tcp connections, rcode=2
>>>> May  1 12:14:25 ns1 pdns_recursor[22995]: 3 [39863/1] question for
>>>> 's-s.mx|A' from
>>>> May  1 12:14:28 ns1 pdns_recursor[22995]: 3 [39863/1] answer to
>>>> 's-s.m |A': 0 answers, 0 additional, took 2 packets, 3006.53 netw
>>>> 3010.36 tot ms, 0 throttled, 2 timeouts, 0 tcp connections, rcode=2
>>>> On 5/1/2020 12:12 PM, Aki Tuomi wrote:
>>>>> Next step, try to resolve s-s.mx and check your logs. Like
>>>>> /var/log/syslog?
>>>>> Aki
>>>>>> On 05/01/2020 7:09 PM Sergio Cesar <sergio at winc.net> wrote:
>>>>>>    Thank you for the reply.
>>>>>> Here it is, not sure what that means.
>>>>>> The recursor is running on the same server as the PDNS, with a
>>>>>> different
>>>>>> IP address, if that makes a difference.
>>>>>> root at ns1:~# rec_control trace-regex s-s.mx
>>>>>> ok
>>>>>> ok
>>>>>> ok
>>>>>> On 5/1/2020 11:37 AM, Aki Tuomi wrote:
>>>>>>>> On 05/01/2020 6:31 PM Sergio P Cesar via Pdns-users
>>>>>>>> <pdns-users at mailman.powerdns.com> wrote:
>>>>>>>>     I am new to pdns. I just installed a resolver 4.3.0-rc2 to
>>>>>>>> learn, and all
>>>>>>>> seems to work, but I stumbled into an issue I can't resolve.
>>>>>>>> My mailserver failed to deliver email to a few domains. In
>>>>>>>> tracking it down I
>>>>>>>> found that their DNS drops the first packet of every new
>>>>>>>> query but
>>>>>>>> responds fine to a second query and every one after that. After
>>>>>>>> a timeout (5 minutes)
>>>>>>>> it drops the first packet again.
>>>>>>>> I was expecting the recursor to query the 2nd and 3rd server in
>>>>>>>> their
>>>>>>>> list but it does not look like it is doing that.
>>>>>>>> It seems like it is caching the failure and does not query
>>>>>>>> at all
>>>>>>>> for a while.
>>>>>>>> I set packetcache-servfail-ttl=0 and now it looks like
>>>>>>>> the 3rd
>>>>>>>> query attempt works, as the far-end server now responds.
>>>>>>>> I am not sure this is the correct setting, or whether I will
>>>>>>>> see adverse effects from setting
>>>>>>>> this to 0.
>>>>>>>> Perhaps I have not set something else that will tell the
>>>>>>>> recursor to try
>>>>>>>> the next server if the first one fails to respond or send a
>>>>>>>> packet
>>>>>>>> or a retry.
>>>>>>>> I used bind to test and it gets a response on the first try. I
>>>>>>>> did not
>>>>>>>> try to trace the packets from a bind query.
>>>>>>>> Thanks
>>>>>>> Try `rec_control trace-regex domain.com` and post that. Without
>>>>>>> censoring the results.
>>>>>>> Aki
>Remi Gacogne
>PowerDNS.COM BV - https://www.powerdns.com/
