[Pdns-users] Immediate update visibility
Brian Candler
b.candler at pobox.com
Wed Mar 9 07:37:45 UTC 2022
On 09/03/2022 07:08, Daniel Miller via Pdns-users wrote:
> Anyway, after all that - when I make a change to a domain record using
> pdnsutil or an external tool using the API - the changes are
> immediately applied to the zone but are not immediately visible
> through the recursor. To make that happen I need to either flush the
> cache or just restart the recursor.
>
> This is an issue when creating/updating ACME challenge records - I
> haven't been able to totally automate the process. I need to introduce
> lengthy delays, try manually applying the changes, restart the
> servers, whatever.
That doesn't really make sense as an explanation of whatever problem you
see.
1. LetsEncrypt will be talking to your authoritative server, not your
recursor.
2. Even if it were talking to the recursor, it would be querying
_acme-challenge.somedomain TXT. Unless that query had been made
recently, it won't be in the recursor's cache.
If you're hitting a caching problem here, it's not to do with the
recursor, but either the packet cache or the query cache in
pdns-authoritative. See:
https://doc.powerdns.com/authoritative/performance.html#packet-cache
If LetsEncrypt had queried _acme-challenge.somedomain TXT a few seconds
before you had changed the zone, and then again afterwards, it could see
the old data. However, that shouldn't be happening: you should be
inserting the TXT record *before* LetsEncrypt does the query. Therefore,
although you can disable those caches, you shouldn't really need to do so.
The most likely problem I can think of is that your authoritative zones
are replicated, and there's some delay in updates to the primary getting
replicated to the secondaries. Remember that LetsEncrypt could query
*any* of your auth nameservers with equal probability.
One solution is to ensure that notifies are working properly, and then
insert a short (say 5 second) delay in your ACME process, after adding
the TXT record, to give replication to the secondaries time to complete.
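Rather than a fixed delay, you could poll each authoritative server
until they all return the new challenge value. This is only a rough
sketch: the function name and the `lookup` callable are my own
invention, not part of PowerDNS or any ACME client. `lookup(server)` is
assumed to return the set of TXT strings that one specific nameserver
currently serves for the challenge name; you would have to implement it
yourself, for example with a DNS library or by wrapping dig.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_txt_everywhere(
    servers: Iterable[str],
    lookup: Callable[[str], Set[str]],  # hypothetical: TXT values seen on one server
    expected: str,
    timeout: float = 30.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once every server's lookup() result contains `expected`,
    or False if the timeout expires first."""
    pending = set(servers)
    deadline = clock() + timeout
    while True:
        # Re-check only the servers that have not yet shown the new value.
        pending = {s for s in pending if expected not in lookup(s)}
        if not pending:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

You would call this after inserting the TXT record and before telling
the ACME server to validate, passing the addresses of all the
nameservers listed in the zone's NS records.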
Another solution is to get LetsEncrypt to talk to a single instance, by
putting a single NS record wherever you need:
    _acme-challenge.www.example.com. NS ns-primary.example.com.
If you wish, this approach also lets you have a completely separate
authoritative server, dedicated to handling ACME challenges. That in
turn can be something that accepts dynamic updates, without having to
allow dynamic updates on your main infrastructure.
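If you have several hostnames to delegate this way, the records are
easy to generate. A minimal sketch, assuming you just want zone-file
style lines to paste in or feed to your own tooling; the function name
is made up for illustration:

```python
def acme_delegations(hostnames, target):
    """Build one NS record line per hostname, delegating its
    _acme-challenge label to the single nameserver `target`."""
    target = target.rstrip(".") + "."  # make the target fully qualified
    lines = []
    for host in hostnames:
        owner = "_acme-challenge." + host.rstrip(".") + "."
        lines.append(f"{owner} NS {target}")
    return lines


for line in acme_delegations(["www.example.com"], "ns-primary.example.com"):
    print(line)
```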
If you need to debug this further, I suggest you capture the traffic
between LetsEncrypt and your authoritative servers, with query logging
or at worst using tcpdump, to work out which server is being queried
and what it answers.
>
> is there a way to make changes in the auth server immediately visible
> in the recursor?
You mean, clients using your local recursor are querying local zones and
seeing stale data? That's a completely different matter: that's just
standard recursor caching, and it's how the DNS is designed.
You can avoid that by setting a low TTL on the records in your zone and,
for negative caching, a low "minimum" value in the SOA record.
In the extreme, you'd set those to zero, and then the recursor would
cache nothing and go to the authoritative server for every query - but
something like 60 seconds is more system friendly. You might as well get
*some* benefit from the recursor cache.
Or else, whenever you update the auth zone, you can flush that zone from
the recursor's cache - but that's a step you'd have to do yourself.
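That flush step can be scripted. A minimal sketch, assuming rec_control
is on the PATH of the machine running the recursor and that the script
has permission to talk to its control socket; the two function names are
my own. It relies on "rec_control wipe-cache", where a trailing "$" on
the name wipes that name and everything below it.

```python
import subprocess


def wipe_cache_command(zone: str, subtree: bool = True) -> list:
    """Build the rec_control command line that wipes `zone` from the cache."""
    name = zone.rstrip(".")
    # A trailing "$" tells rec_control to wipe the name and everything below it.
    return ["rec_control", "wipe-cache", name + ("$" if subtree else "")]


def flush_recursor_zone(zone: str) -> None:
    """Run the wipe; raises CalledProcessError if rec_control fails."""
    subprocess.run(wipe_cache_command(zone), check=True)
```

Call flush_recursor_zone("example.com") right after the pdnsutil or API
change that modifies the zone.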