[Pdns-users] PowerDNS authoritative server random timeouts
Netsons - Federico Chiacchiaretta
f.chiacchiaretta at netsons.com
Tue Sep 17 14:09:32 UTC 2019
Hi,
we have a PowerDNS cluster of authoritative servers running on 4 nodes:
OS: CentOS 7.6.1810 (fully updated)
Version: pdns-4.1.13-1pdns.el7.x86_64
Backend: mysql - MariaDB-server-10.1.41-1.el7.centos.x86_64
Backend is configured with 1 master and 3 slaves.
We run recurring checks (every 30s) to verify that the DNS servers are
responding, and these checks time out at random.
Checks are performed from both:
* an external tool (Pingdom) with a timeout of 30s
* a bash script on each node, which runs dig against the public IP
address of that node (default timeout of 5 seconds).
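A variant of the per-node check that probes UDP and TCP separately might help narrow things down, since dig queries over UDP by default while the errors logged by pdns_server concern TCP. The following is only a rough sketch, not the script we actually run: the function names (build_query, probe_udp, probe_tcp, run_checks) are hypothetical, and it assumes an SOA query for a zone served by the node is a reasonable liveness test.

```python
import socket
import struct
import time


def build_query(name, qtype=6, query_id=0x1234):
    """Build a minimal DNS query packet (QTYPE 6 = SOA, class IN)."""
    # Header: ID, flags (all zero, RD not set), QDCOUNT=1, other counts 0
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def probe_udp(server, packet, timeout=5.0):
    """Send one query over UDP; return elapsed seconds, or None on failure."""
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            sock.recvfrom(4096)
        except OSError:  # includes socket timeouts
            return None
    return time.monotonic() - start


def probe_tcp(server, packet, timeout=5.0):
    """Send one query over TCP; return elapsed seconds, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((server, 53), timeout=timeout) as sock:
            # DNS over TCP prefixes every message with a 2-byte length
            sock.sendall(struct.pack("!H", len(packet)) + packet)
            # Receiving any data is treated as a sign of life here
            if not sock.recv(2):
                return None
    except OSError:
        return None
    return time.monotonic() - start


def run_checks(server, zone):
    """Probe both transports and print one timestamped line per result."""
    packet = build_query(zone)
    for label, probe in (("udp", probe_udp), ("tcp", probe_tcp)):
        elapsed = probe(server, packet)
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        result = "timeout" if elapsed is None else f"{elapsed * 1000:.1f} ms"
        print(f"{stamp} {server} {label} {result}")
```

Calling run_checks with the node's public IP and one of its zones every 30 seconds would give timestamped results per transport, which are easier to line up with the server logs than dig's plain timeout message.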
When a timeout occurs, it affects only one check mechanism (Pingdom
or the script), never both simultaneously.
Output from our script is simply:
";; connection timed out; no servers could be reached"
Logs from pdns.service report a lot of these messages:
set 17 06:00:00 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:14 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:29 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:34 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:34 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:34 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:34 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:34 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
set 17 06:00:34 dns4.netsons.net pdns_server[10277]: TCP Connection
Thread died because of network error: Timeout reading data
but these messages do not coincide with the timeouts seen by our checks
(though I'd still like to understand why they get logged).
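To compare the two more systematically, the log messages could be counted per second and set against the timestamps of the failed checks. A rough sketch follows, assuming each journal entry arrives on a single line (the wrapping above comes from the mail client) and uses the same layout as the excerpt; count_tcp_timeouts and bursts are hypothetical helper names.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above, for example:
# "set 17 06:00:34 dns4.netsons.net pdns_server[10277]: TCP Connection Thread died ..."
LINE_RE = re.compile(
    r"^\S+\s+\d+\s+(\d{2}:\d{2}:\d{2})\s+\S+\s+pdns_server\[\d+\]:\s+(.*)$"
)


def count_tcp_timeouts(lines):
    """Count 'Timeout reading data' messages per HH:MM:SS timestamp."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "Timeout reading data" in match.group(2):
            counts[match.group(1)] += 1
    return counts


def bursts(counts, threshold):
    """Return (timestamp, count) pairs with at least `threshold` messages."""
    return sorted((ts, n) for ts, n in counts.items() if n >= threshold)
```

The output of `journalctl -u pdns.service` can be passed in line by line, for instance as count_tcp_timeouts(sys.stdin). On the excerpt above this reports six messages at 06:00:34 and one each at 06:00:00, 06:00:14 and 06:00:29.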
Do you have any hints about what I can check to troubleshoot the issue
further?
Thanks.
Best,
--
Federico Chiacchiaretta
System Administrator
Netsons S.r.l. - https://www.netsons.com