[Pdns-users] Slave AXFR not working 100% at high rates
Martijn Grendelman
martijn at pocos.nl
Tue Jul 18 15:38:58 UTC 2006
Hi,
First off, I apologize for the length of this message.
I run three PowerDNS servers, one master and two slaves, all with their
own MySQL backend. The slaves know the master as 'supermaster' and this
appears to work.
Yesterday, I added a whole bunch (about 250) of domains to the master,
and the slaves were notified for each of them. However, the AXFR of those
domains didn't work entirely as it should have, and I can't quite
explain what I see here:
On the master:
Jul 17 17:02:12 ilsia051 pdns[2152]: Queued notification of domain
'alkmaarrulez.nl' to 62.69.177.12
Jul 17 17:02:12 ilsia051 pdns[2152]: Queued notification of domain
'alkmaarrulez.nl' to 62.69.184.65
Jul 17 17:02:12 ilsia051 pdns[2152]: Queued notification of domain
'alkmaarrulez.nl' to 80.79.42.110
Jul 17 17:02:24 ilsia051 pdns[2152]: Received unsuccesful notification
report for 'alkmaarrulez.nl' from 62.69.177.12, rcode: 4
On one of the slaves:
Jul 17 17:02:23 ilsia251 pdns[19405]: Received NOTIFY for
alkmaarrulez.nl from 62.69.177.12 for which we are not authoritative
Jul 17 17:02:25 ilsia251 pdns[19405]: Created new slave zone
'alkmaarrulez.nl' from supermaster 62.69.177.12, queued axfr
Jul 17 17:02:25 ilsia251 pdns[949]: gmysql Connection succesful
Jul 17 17:02:25 ilsia251 pdns[949]: AXFR started for 'alkmaarrulez.nl',
transaction started
Jul 17 17:02:25 ilsia251 pdns[949]: AXFR done for 'alkmaarrulez.nl',
zone committed
It's the last line in the log file on the master that I don't
understand. What does 'rcode: 4' mean here, beyond the literal "not
implemented"?
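For reference, this is how I read the number: a small sketch that maps the
basic DNS response codes (RFC 1035, section 4.1.1) to their conventional
mnemonics. The names RCODE_NAMES and rcode_name are my own, for
illustration only; they are not part of PowerDNS.

```python
# Basic DNS response codes from RFC 1035, section 4.1.1,
# with the mnemonics commonly used for them.
RCODE_NAMES = {
    0: "NOERROR",   # no error condition
    1: "FORMERR",   # format error
    2: "SERVFAIL",  # server failure
    3: "NXDOMAIN",  # name error (name does not exist)
    4: "NOTIMP",    # not implemented
    5: "REFUSED",   # refused for policy reasons
}


def rcode_name(rcode: int) -> str:
    """Return the mnemonic for an RCODE, or a placeholder if it is not listed."""
    return RCODE_NAMES.get(rcode, f"UNKNOWN({rcode})")


if __name__ == "__main__":
    print(rcode_name(4))
```

This only decodes the number. It does not explain why the slave answered
the NOTIFY that way.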
The problem is that the zone transfers really don't work 100%. In this
particular case, all the new domains were queued for notification at a
really high rate. It seems like either the slave or the master just
couldn't keep up. In the end, records were simply missing on the slaves.
Today, I updated the serials for _all_ 1589 domains on the master, and
another "notify storm" was triggered (well, that's what I wanted). After
everything was calm again, I looked at the number of records in the
databases:
          domains   records
master       1589     22822
slave1       1589     20770
slave2       1589     20780
Still missing records!
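To put a number on the gap, here is a minimal sketch that computes the
shortfall per slave from the counts above. The function name
missing_records is just an illustration; the counts are the ones I
listed.

```python
# Record counts per server, taken from the table in this message.
counts = {"master": 22822, "slave1": 20770, "slave2": 20780}


def missing_records(counts, reference="master"):
    """Return how many records each server lacks compared to the reference."""
    expected = counts[reference]
    return {
        host: expected - n
        for host, n in counts.items()
        if host != reference
    }


if __name__ == "__main__":
    for host, gap in missing_records(counts).items():
        print(f"{host}: {gap} records missing")
```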
A log extract regarding this action:
On the master:
Jul 18 16:27:02 ilsia051 pdns[32722]: AXFR of domain
'wageningenrulez.nl' initiated by 62.69.184.65
Jul 18 16:27:02 ilsia051 pdns[32722]: gmysql Connection succesful
Jul 18 16:27:02 ilsia051 pdns[32722]: AXFR of domain
'wageningenrulez.nl' to 62.69.184.65 finished
Jul 18 16:27:03 ilsia051 pdns[2152]: Received unsuccesful notification
report for 'wageningenrulez.nl' from 62.69.177.12, rcode: 4
On the slave:
Jul 18 16:27:02 ilsia251 pdns[949]: AXFR started for
'wageningenrulez.nl', transaction started
Jul 18 16:27:02 ilsia251 pdns[949]: AXFR done for 'wageningenrulez.nl',
zone committed
...and nothing else.
Now, with a script, I explicitly queued notifies for all domains (with
pdns_control) at 1 second intervals. Many domains on the slaves were
already up to date, but many others were not, and AXFRs were started
for those.
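The approach can be sketched roughly as follows. This is not my actual
script, just a minimal illustration that assumes pdns_control is on the
PATH and that the list of domain names has already been obtained
elsewhere (for example from the master's database). The function name
notify_all and its parameters are hypothetical.

```python
import subprocess
import time


def notify_all(domains, delay=1.0, run=subprocess.run, sleep=time.sleep):
    """Queue a NOTIFY for each domain via 'pdns_control notify', pausing between calls.

    Returns the list of domains for which pdns_control exited non-zero.
    The 'run' and 'sleep' parameters exist only so the loop can be tested
    without a running PowerDNS instance.
    """
    domains = list(domains)
    failed = []
    for index, domain in enumerate(domains):
        result = run(
            ["pdns_control", "notify", domain],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            failed.append(domain)
        # Throttle: wait before the next notify, but not after the last one.
        if index < len(domains) - 1:
            sleep(delay)
    return failed


if __name__ == "__main__":
    # Example domains from the logs in this message.
    print(notify_all(["alkmaarrulez.nl", "wageningenrulez.nl"], delay=1.0))
```

The delay is the only important part here: it spaces the notifies out
instead of firing them all at once.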
After this had finished, the record count on all three servers was
22822, as it should have been all along.
Can anyone shed some light on this?
Best regards,
Martijn Grendelman