[Pdns-users] Records going missing in 3.4.4
moseleymark at gmail.com
Fri May 1 18:13:22 UTC 2015
This is going to be necessarily vague because I'm not even 50% sure what's
going on. Setup is Ubuntu Precise 64-bit, temporarily running 3.4.4. We
were (and are) running 3.4.2 previously. The scenario below was painful
enough that we didn't try 3.4.3 yet, but we can.
We've got a pretty big database of records (180M+) and mountains of legacy
garbage in it -- which made upgrading from 2.9.x lots of fun.
Of all the things I cleaned up, one thing I *didn't* clean up was a lot of
records with trailing dots in the content field (for NS/MX/CNAME records).
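To make "trailing dots in the content field" concrete, here is a minimal Python sketch of the check I have in mind. The helper name and the set of types are illustrative assumptions (nothing from PowerDNS itself); it only looks at a record's type and content strings.

```python
# Hypothetical helper, not part of PowerDNS: flags hostname-valued
# records whose content ends in a dot.
HOSTNAME_TYPES = {"NS", "MX", "CNAME"}  # assumption: the types mentioned above


def has_trailing_dot(rtype, content):
    """Return True if this record's content carries a trailing dot."""
    if rtype.upper() not in HOSTNAME_TYPES:
        return False
    # A bare "." is the root name rather than a stray dot, so skip it.
    return content != "." and content.endswith(".")


rows = [
    ("CNAME", "www.example.com."),
    ("NS", "ns1.example.com"),
    ("TXT", "some text."),
]
flagged = [r for r in rows if has_trailing_dot(*r)]
```

Only the CNAME row is flagged here; the TXT row is ignored because its content is not a hostname.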
When we upgraded from 3.4.2 to 3.4.4, within a few hours we noticed a weird
phenomenon: records that existed would suddenly start returning NXDOMAIN. A
restart of pdns would get them back and then some time later (5 mins, 15
mins, whatever), they'd go NXDOMAIN. If you left it long enough, they'd
eventually return correctly for a while and then eventually later go back
to NXDOMAIN. And I repeatedly verified that the records were still in the
database.
One additional interesting piece of info is that after we reverted back to
3.4.2, it'd take an hour or more before things went back to normal (and our
TTLs are typically 1 hour). That is, even after moving back to 3.4.2, we'd
lose those records periodically, on and off -- but after an hour, all was
back to normal.
I *think* it's related to trailing dots on records, esp NS and CNAMEs. But
I've been completely unable to replicate the issue. In some cases, the
missing record had CNAMEs pointing to it (and the CNAME 'content' had a
trailing dot). In other cases, it seemed like the trailing dots on NS
records in a domain were at issue. But again, I'm not even remotely positive
about that. Nothing would show up in the powerdns logs either (though that
was with debugging at the 'normal' level).
I've stared at pcaps of the DNS traffic and of the mysql queries, trying
to correlate them with when records would disappear, without much luck.
Initially I did see some mysql queries where the trailing dot version was
actually getting used in the powerdns query, which is what put me on the
trail of trailing dots in the first place.
We're in the middle of a big cleanup to eradicate these trailing dots and
are back on 3.4.2 for the time being till we can get it done. But I was
curious if a) this was a known issue; or b) anyone's seen it before, since
the trailing dots part could be a red herring.
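For reference, the cleanup amounts to roughly the following. This is a sketch only: it assumes the usual generic SQL backend layout (a `records` table with `name`, `type` and `content` columns) and uses an in-memory SQLite database so it can be run standalone; our real data is in mysql, and with 180M+ rows the update would need to be done in batches rather than in one statement. The function name is made up for illustration.

```python
import sqlite3


def clean_trailing_dots(conn):
    """Strip one trailing dot from NS/MX/CNAME content; return rows changed."""
    cur = conn.execute(
        "UPDATE records "
        "SET content = substr(content, 1, length(content) - 1) "
        "WHERE type IN ('NS', 'MX', 'CNAME') "
        "AND content LIKE '%.' AND content <> '.'"
    )
    conn.commit()
    return cur.rowcount


# Toy data standing in for the real table (assumed column names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (name TEXT, type TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("example.com", "NS", "ns1.example.com."),
        ("www.example.com", "CNAME", "example.com."),
        ("example.com", "MX", "mail.example.com"),
        ("example.com", "TXT", "ends with a dot."),
    ],
)
changed = clean_trailing_dots(conn)
contents = [row[0] for row in conn.execute("SELECT content FROM records ORDER BY rowid")]
```

The TXT row is left alone on purpose, since only hostname-valued content is in question.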