[Pdns-users] Issue with communications hanging, version 2.9.17-13sarge1
Dave Taylor
davetaylor at frontiernet.net
Thu Oct 20 20:35:29 UTC 2005
Following is more information that may be relevant to the issue that we are
experiencing.
The process that checks for slave domains that need to be refreshed
(CommunicatorClass::slaveRefresh()) seems to just stop working after a
certain period of time.
An strace -p on this process shows the following over and over when working:
recvfrom(11, 0xbf5ff364, 1500, 0, 0xbf5ff944, 0xbf5ff204) = -1 EAGAIN (Resource temporarily unavailable)
time(NULL) = 1129831123
time(NULL) = 1129831123
rt_sigprocmask(SIG_BLOCK, [CHLD], [RTMIN], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
nanosleep({1, 0}, {1, 0})
Once it has stopped working, strace shows only this (once, not over and over):
recvfrom(11,
(the file descriptor for socket 11 shows: "pdns_serv 19540 pdns 11u
IPv4 912515 UDP *:10006")
And that is it. I can still force a retrieve with pdns_control and the
command will return, but nothing happens.
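One possible reading of the two traces, offered only as a hypothesis: the repeated "-1 EAGAIN" is what a non-blocking recvfrom() returns when no datagram is queued, while a lone unfinished "recvfrom(11," is what strace prints for a call that is waiting inside the kernel. The minimal Python sketch below shows the two behaviours on a throwaway UDP socket. The loopback address, the ephemeral port and the 0.2 second timeout are illustrative assumptions and are not taken from pdns.

```python
import errno
import socket


def poll_once(sock):
    """Try one read on a non-blocking socket; return None if nothing is queued."""
    try:
        data, _addr = sock.recvfrom(1500)  # 1500 matches the buffer size in the trace
        return data
    except BlockingIOError as exc:
        # This is the condition strace reports as "-1 EAGAIN".
        assert exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
        return None


sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))  # ephemeral port, a stand-in for *:10006

# Non-blocking mode: the call returns at once, so a loop can sleep and retry.
sock.setblocking(False)
result_nonblocking = poll_once(sock)

# Waiting mode: the same call sits in recvfrom until a datagram arrives.
# A short timeout bounds the wait here so the sketch terminates.
sock.settimeout(0.2)
try:
    sock.recvfrom(1500)
    timed_out = False
except socket.timeout:
    timed_out = True

sock.close()
```

If the hung process really is sitting in a blocking read, nothing in this sketch explains why the socket would have changed mode; it only illustrates what each strace pattern corresponds to.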
When it's working properly, pdns will:
1) make its connection to the mysql db
2) make a connection to the master server of the slave zone (as shown here
in the strace)
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 16
fcntl64(16, F_GETFL) = 0x2 (flags O_RDWR)
fcntl64(16, F_SETFL, O_RDWR|O_NONBLOCK) = 0
connect(16, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("x.x.x.x")}, 16) = -1 EINPROGRESS (Operation now in progress)
select(17, [16], [16], NULL, {10, 0}) = 1 (out [16], left {10, 0})
getsockopt(16, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
fcntl64(16, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl64(16, F_SETFL, O_RDWR) = 0
writev(16, [{"\0\33", 2}, {"\16\351\0\0\0\1\0\0\0\0\0\0\00510nbc\3com\0\0\374\0\1", 27}], 2) = 29
 * 2 bytes in buffer 0
 | 00000  00 1b                                             ..                |
 * 27 bytes in buffer 1
 | 00000  0e e9 00 00 00 01 00 00  00 00 00 00 05 31 30 6e  ........ .....10n |
 | 00010  62 63 03 63 6f 6d 00 00  fc 00 01                 bc.com.. ...      |
select(17, [16], NULL, NULL, {10, 0}) = 1 (in [16], left {9, 990000})
3) query the local DB for the zone information.
4) query the master for its information.
5) compare info and update as needed.
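For anyone reading the writev() call in the trace, the two buffers can be decoded by hand. The sketch below is only an illustration of the standard DNS wire format (RFC 1035) applied to the bytes from the hex dump; the function name decode_query is made up for this example and is not part of pdns.

```python
import struct

# Bytes copied from the two writev() buffers in the strace output.
length_prefix = bytes.fromhex("00 1b")
message = bytes.fromhex(
    "0e e9 00 00 00 01 00 00 00 00 00 00 05 31 30 6e "
    "62 63 03 63 6f 6d 00 00 fc 00 01"
)


def decode_query(msg):
    """Decode the header and the first question of a DNS query (no compression)."""
    ident, flags, qdcount, _an, _ns, _ar = struct.unpack("!6H", msg[:12])
    labels = []
    pos = 12
    while msg[pos] != 0:  # each label is a length byte followed by that many bytes
        size = msg[pos]
        labels.append(msg[pos + 1:pos + 1 + size].decode("ascii"))
        pos += 1 + size
    qtype, qclass = struct.unpack("!2H", msg[pos + 1:pos + 5])
    return {
        "id": ident,
        "flags": flags,
        "qdcount": qdcount,
        "qname": ".".join(labels),
        "qtype": qtype,
        "qclass": qclass,
    }


# DNS over TCP prefixes each message with its length as a 2-byte big-endian integer.
(tcp_length,) = struct.unpack("!H", length_prefix)
query = decode_query(message)
print(tcp_length, query)
```

Read this way, the first buffer is the TCP length prefix (27) and the second is a single question for 10nbc.com with QTYPE 252, which RFC 1035 assigns to AXFR, and QCLASS 1 (IN).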
When this isn't working, step 2 above is not happening and therefore steps 4
and 5 never happen. It simply shuts down the mysql connection and closes.
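For reference, the syscalls in step 2 (set O_NONBLOCK, connect returning EINPROGRESS, select with a 10 second timeout, check SO_ERROR, clear O_NONBLOCK) are the usual pattern for a TCP connect with a timeout. A minimal Python sketch of that pattern follows. It is an approximation of what the trace shows, not the pdns source, and the function name connect_with_timeout is hypothetical.

```python
import errno
import select
import socket


def connect_with_timeout(host, port, timeout=10.0):
    """Non-blocking connect bounded by select(), then back to blocking mode."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)  # fcntl64(F_SETFL, O_RDWR|O_NONBLOCK) in the trace
    err = sock.connect_ex((host, port))
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, "connect failed immediately")

    # The trace waits on both the read set and the write set for up to 10 seconds.
    readable, writable, _ = select.select([sock], [sock], [], timeout)
    if not readable and not writable:
        sock.close()
        raise TimeoutError("connect timed out")

    # getsockopt(SO_ERROR) reports whether the connect actually succeeded.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, "connect failed")

    sock.setblocking(True)  # fcntl64(F_SETFL, O_RDWR) in the trace
    return sock


# Example use (the address is a placeholder, as in the trace):
# sock = connect_with_timeout("x.x.x.x", 53)
```

In the failing case none of these calls appear in the trace at all, so the sketch only documents what the healthy path looks like.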
I realize that this is what is happening after it has already stopped
working. I have not been able to pinpoint what might be making it stop
working.
Is there other information that may be helpful in figuring this out? I
would be glad to gather more info, but I'm not 100% sure of where to go from
here.