<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
On 09/03/2022 07:08, Daniel Miller via Pdns-users wrote:<br>
<blockquote type="cite"
cite="mid:AMFES.1067ab8ca6.em20c2d7a3-adf7-4bf8-be94-0e72f2f070db@028e5525.com">
<div>Anyway, after all that - when I make a change to a domain
record using pdnsutil or an external tool using the API - the
changes are immediately applied to the zone but are not
immediately visible through the recursor. To make that happen I
need to either flush the cache or just restart the recursor.</div>
<div><br>
</div>
<div>This is an issue when creating/updating ACME challenge
records - I haven't been able to totally automate the process. I
need to introduce lengthy delays, try manually applying the
changes, restart the servers, whatever.</div>
</blockquote>
<p>Recursor caching doesn't really make sense as an explanation of
the problem you're seeing with ACME.<br>
</p>
<p>1. LetsEncrypt will be talking to your authoritative server, not
your recursor.</p>
<p>2. Even if it were talking to the recursor, it would be querying
_acme-challenge.somedomain TXT. Unless that same query had been made
recently, the answer wouldn't be in the recursor's cache.<br>
</p>
<p>If you're hitting a caching problem here, it has nothing to do with
the recursor. It would be either the packet cache or the query cache
in pdns-authoritative. See:
<a class="moz-txt-link-freetext" href="https://doc.powerdns.com/authoritative/performance.html#packet-cache">https://doc.powerdns.com/authoritative/performance.html#packet-cache</a></p>
<p>If LetsEncrypt had queried _acme-challenge.somedomain TXT a few
seconds before you changed the zone, and then again afterwards, it
could see the old data from one of those caches. However, that
shouldn't be happening: you should be inserting the TXT record
*before* LetsEncrypt does the query. Therefore, although you can
disable those caches, you shouldn't really need to do so.</p>
<p>The most likely problem I can think of is that your authoritative
zones are replicated, and there's some delay in updates to the
primary getting replicated to the secondaries. Remember that
LetsEncrypt could query *any* of your auth nameservers with equal
probability.</p>
<p>One solution is to ensure that notifies are working properly, and
then insert a short (say 5 second) delay in your ACME process, after
adding the TXT record, to give replication to the secondaries time
to complete.<br>
</p>
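<p>Instead of a fixed delay, the ACME hook could poll every
authoritative nameserver until all of them serve the new TXT value.
The following is only a minimal sketch of that idea, not a tested
integration: wait_for_txt and its lookup parameter are hypothetical
names, and lookup stands for whatever function you supply to query
one specific nameserver directly (not via the recursor) and return
the set of TXT strings it answers with.</p>

```python
import time
from typing import Callable, Iterable, Set


def wait_for_txt(
    nameservers: Iterable[str],
    lookup: Callable[[str], Set[str]],
    expected: str,
    timeout: float = 30.0,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once every nameserver serves `expected`, False on timeout.

    `lookup` is a hypothetical, caller-supplied function that asks one
    nameserver for the challenge TXT record and returns its values.
    """
    pending = set(nameservers)
    deadline = clock() + timeout
    while True:
        # Keep only the servers that still do not serve the expected value.
        pending = {ns for ns in pending if expected not in lookup(ns)}
        if not pending:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

<p>Servers that already answer correctly are dropped from the pending
set, so only lagging secondaries keep being polled. The hook would
tell LetsEncrypt to validate only after this returns True.</p>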
<p>Another solution is to get LetsEncrypt to talk to a single
instance, by putting a single NS record wherever you need:</p>
<p>_acme-challenge.www.example.com. NS ns-primary.example.com.</p>
<p>If you wish, this approach also lets you have a completely
separate authoritative server, dedicated to handling ACME
challenges. That in turn can be something that accepts dynamic
updates, without having to allow dynamic updates on your main
infrastructure.</p>
<p>If you need to debug this further, I suggest you capture the data
between LetsEncrypt and your authoritative servers, with query
logging or at worst using tcpdump, to work out what's going on.</p>
<p><br>
</p>
<blockquote type="cite"
cite="mid:AMFES.1067ab8ca6.em20c2d7a3-adf7-4bf8-be94-0e72f2f070db@028e5525.com">
<div><br>
</div>
<div>is there a way to make changes in the auth server immediately
visible in the recursor?</div>
</blockquote>
<p>You mean, clients using your local recursor are querying local
zones and seeing stale data? That's a completely different matter:
that's just standard recursor caching, and it's how the DNS is
designed.</p>
<p>You can avoid that by setting a low TTL on the records in your
zone and, for negative caching, a low "minimum" value in the SOA
record. In the extreme, you could set those to zero, and then the
recursor would go to the authoritative server for every query - but
something like 60 seconds is more system friendly. You might as
well get *some* benefit from the recursor cache.<br>
</p>
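<p>To illustrate why the TTL bounds how long stale data is visible,
here is a toy model of a TTL-based cache. It is a simplified sketch
and not how the PowerDNS recursor is implemented: TtlCache, fetch and
clock are hypothetical names, with fetch standing in for a query to
the authoritative server that returns a (value, ttl) pair.</p>

```python
from typing import Callable, Dict, Tuple


class TtlCache:
    """Toy resolver cache: an answer is reused until its TTL runs out."""

    def __init__(
        self,
        fetch: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float],
    ) -> None:
        self._fetch = fetch  # hypothetical stand-in for an authoritative query
        self._clock = clock  # injectable so the behaviour can be simulated
        self._entries: Dict[str, Tuple[float, str]] = {}

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        # Serve from cache only while the stored expiry time is in the future.
        if cached is not None and cached[0] > now:
            return cached[1]
        value, ttl = self._fetch(name)
        self._entries[name] = (now + ttl, value)
        return value
```

<p>With a TTL of 60, a change made on the authoritative side stays
invisible through this cache for up to 60 seconds after the last
fetch. With a TTL of 0 the entry is already expired when stored, so
every call goes back to fetch.</p>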
<p>Or else, whenever you bump the auth zone, you can flush the
corresponding recursor zone - but that's a step you'd have to do
yourself.<br>
</p>
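<p>That flush step can be scripted. The sketch below assumes the
recursor runs on the same host and that rec_control is on the PATH
and allowed to talk to its control socket. It uses rec_control's
wipe-cache command, where a trailing "$" on the name asks for the
whole subtree to be wiped. The function names are hypothetical.</p>

```python
import subprocess
from typing import List


def wipe_cache_command(zone: str, subtree: bool = True) -> List[str]:
    """Build the rec_control invocation that wipes `zone` from the cache."""
    name = zone.rstrip(".")
    if subtree:
        # A trailing "$" tells wipe-cache to wipe everything under the name.
        name += "$"
    return ["rec_control", "wipe-cache", name]


def flush_recursor_zone(zone: str) -> None:
    """Run the wipe; raises CalledProcessError if rec_control fails."""
    subprocess.run(wipe_cache_command(zone), check=True)
```

<p>Whatever script updates the auth zone would call
flush_recursor_zone("example.com") right after the change.</p>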
<p><br>
</p>
</body>
</html>