<div dir="ltr">Thanks David!<div>That answers my question. I didn't know how the cache works.</div><div><br></div><div>This messed up our debugging, because the Android queries were different from our CLI testing.</div><div><br></div><div>The names going bad is not really the problem; it's just a misconfiguration of the domain</div><div>itself, which gets negatively cached.</div><div><br></div><div>Thanks again.</div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, Aug 10, 2018 at 11:27 PM David <<a href="mailto:opendak@shaw.ca">opendak@shaw.ca</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 2018-08-10 3:03 PM, Nico wrote:<br>

> I need some help, if possible, to understand a strange situation.<br>

> Unfortunately we can't give a method to reproduce it, but we have some <br>

> hard data.<br>

> <br>

> We have a couple of dnsdist servers. Half 1.1.0 and half 1.3.2, moving <br>

> from old to new.<br>

> The 1.1.0 are still getting most of the traffic and the problem happens <br>

> there.<br>

> The user base is 100% mobile, and we serve more than 200kqps<br>

> <br>

> We received complaints about domain names that do exist failing to resolve.<br>

> The first time we ignored it, the second time we ran some checks, the third time more checks.<br>

> The problem gets solved by expunging the cache.<br>

> <br>

> All fine BUT, during our checks we noticed inconsistent behavior of the <br>

> cache regarding these names.<br>

> Android chrome access to page ->  fails.<br>

> AndroDNS (dns tools) query standard ->  empty answer<br>

>                 query over TCP -> correct answer<br>

>                 query with DO -> correct answer<br>

>                 query with CD -> correct answer<br>

> Checking from Linux:<br>

> host command: -> empty answer<br>

>          host over TCP -> correct answer<br>

> dig command -> correct answer<br>

> <br>

> When the cache is cleared, all works OK.<br>

> We assume that there is some situation with the domain which creates wrong <br>

> cached entries,<br>

> but why do we get different answers over UDP than over TCP?<br>

> the query flags are exactly the same (0x0100)<br>

> <br>

> And why the difference between host and dig (the only difference at <br>

> packet level is the AD bit set by dig, 0x0100 vs 0x0120)?<br>

> <br>

<br>
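For reference, the two flag words quoted above can be decoded with a short Python sketch. The bit masks are the standard DNS header flag positions; the helper name decode_flags is only illustrative.<br>

```python
# Standard DNS header flag bit masks (RD, AD, CD and friends).
FLAG_BITS = {
    "QR": 0x8000,
    "AA": 0x0400,
    "TC": 0x0200,
    "RD": 0x0100,
    "RA": 0x0080,
    "AD": 0x0020,
    "CD": 0x0010,
}


def decode_flags(value):
    """Return the names of the flag bits set in a 16-bit DNS flags word."""
    return [name for name, mask in FLAG_BITS.items() if value & mask]


print(decode_flags(0x0100))  # ['RD']        (host)
print(decode_flags(0x0120))  # ['RD', 'AD']  (dig)
```

So both tools ask for recursion, and dig additionally sets AD.<br>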

The packet caches in both dnsdist and PowerDNS Recursor look at the full <br>

packet/request details, minus the ID. As a result, AD vs. no-AD is a <br>

different packet cache entry and would store separate responses from <br>

your downstream. The same is true for EDNS options, and also for TCP vs <br>

UDP queries.<br>
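<br>

A rough Python sketch of that idea follows. It is a toy model, not the actual dnsdist or Recursor code: build_query and cache_key are made-up helpers, and hashing the raw bytes is only an assumption about how such a key could be formed.<br>

```python
import hashlib
import struct


def build_query(query_id, flags, qname, qtype=1, qclass=1):
    """Build a minimal DNS query packet (header plus one question, no EDNS)."""
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, qclass)


def cache_key(packet, over_tcp):
    """Toy cache key: every byte after the 2-byte ID, plus the transport."""
    digest = hashlib.sha256(packet[2:]).hexdigest()
    return (digest, over_tcp)


host_udp = cache_key(build_query(0x1111, 0x0100, "example.com"), over_tcp=False)
host_again = cache_key(build_query(0x2222, 0x0100, "example.com"), over_tcp=False)
dig_udp = cache_key(build_query(0x3333, 0x0120, "example.com"), over_tcp=False)
host_tcp = cache_key(build_query(0x4444, 0x0100, "example.com"), over_tcp=True)

print(host_udp == host_again)  # True: only the ID differs
print(host_udp == dig_udp)     # False: the AD bit differs
print(host_udp == host_tcp)    # False: the transport differs
```

In this model a bad answer cached for the plain UDP query would keep being served to clients sending that exact query, while the AD, DO, CD and TCP variants each get their own entry.<br>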

<br>

What do your downstream servers say about these names when they go <br>

bad? Can you dump the cache out there and inspect it?<br>

<br>

<br>

> <br>

> If anybody can help a little.....<br>

> <br>

> Thanks!!<br>

> <br>

> <br>

> <br>

> <br>

> <br>

> _______________________________________________<br>

> dnsdist mailing list<br>

> <a href="mailto:dnsdist@mailman.powerdns.com" target="_blank">dnsdist@mailman.powerdns.com</a><br>

> <a href="https://mailman.powerdns.com/mailman/listinfo/dnsdist" rel="noreferrer" target="_blank">https://mailman.powerdns.com/mailman/listinfo/dnsdist</a><br>

> <br>

<br>


</blockquote></div>