[Pdns-users] pdns_recursor stops getting queries on Solaris 10 sparc

Alex Kiernan alex.kiernan at gmail.com
Fri Sep 14 13:11:49 UTC 2007


On 14/09/2007, bert hubert <bert.hubert at netherlabs.nl> wrote:
> On Fri, Sep 14, 2007 at 01:51:19PM +0100, Alex Kiernan wrote:
>
> > Prepare to be surprised... I added instrumention so the code looked like this:
>
> Very cool you discovered this!
>
> This is pretty amazing. However, what I don't get is how this actually
> causes problems - we should get ret=0 on the next call to run() and hence
> port_getn.
>
> Or am I missing something?
>

I think the state machine in portfs says you've been told about these
events (edge vs. level) and doesn't bother telling you again until you
reset it by doing something with the event (which we'll never do).

I've just changed the code so it looks like this, which I'm testing
with now (whilst leaving all the extra logging in):

  if(ret < 0) {
    if(errno!=EINTR && errno!=ETIME)
      throw FDMultiplexerException("completion port_getn returned error: "+stringerror());
    // EINTR and ETIME are not really errors
    if (errno==EINTR)
      return 0;
    // on ETIME we fall through: numevents may still be non-zero
  }
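
To make the intent of that change easier to check in isolation, here is a
minimal sketch of the same decision pulled out into a standalone function.
The name eventsToProcess and its parameters are hypothetical and not part of
pdns; it only assumes, as the logs in this thread suggest, that a call can
fail with ETIME while still having updated numevents.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not pdns code): given the result of a
// port_getn()-style call, return how many events the caller should process.
//   ret       - return value of the call
//   err       - errno captured immediately after the call
//   numevents - the in/out event count as updated by the call
unsigned int eventsToProcess(int ret, int err, unsigned int numevents)
{
  if (ret < 0) {
    // Anything other than EINTR or ETIME is treated as a real error.
    if (err != EINTR && err != ETIME)
      throw std::runtime_error(std::string("port_getn returned error: ") +
                               std::strerror(err));
    // Interrupted by a signal: nothing to process this time round.
    if (err == EINTR)
      return 0;
    // ETIME: fall through, events may have been delivered anyway.
  }
  return numevents;
}
```

With the values from the log, eventsToProcess(-1, ETIME, 2) hands back 2
events rather than dropping them, while a plain timeout with numevents=0
still yields nothing to do.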


> What happens after these lines:
>

It just gets timeouts on every call and returns no events (~two
timeouts/s as we've a 500ms timeout):

Sep 14 12:35:35 1:ret=-1,errno=62,numevents=0
Sep 14 12:35:35 2:ret=-1,errno=62,numevents=0
Sep 14 12:35:35 1:ret=-1,errno=62,numevents=0
Sep 14 12:35:35 2:ret=-1,errno=62,numevents=0
Sep 14 12:35:36 1:ret=-1,errno=62,numevents=0
Sep 14 12:35:36 2:ret=-1,errno=62,numevents=0
Sep 14 12:35:36 1:ret=-1,errno=62,numevents=0
Sep 14 12:35:36 2:ret=-1,errno=62,numevents=0
Sep 14 12:35:37 1:ret=-1,errno=62,numevents=0
Sep 14 12:35:37 2:ret=-1,errno=62,numevents=0
Sep 14 12:35:37 1:ret=-1,errno=62,numevents=0
Sep 14 12:35:37 2:ret=-1,errno=62,numevents=0
...

> > type messages, then it all goes silent as I get the workload pushed up
> > - then, when it all goes wrong, I see:
> >
> > Sep 14 12:35:34 1:ret=-1,errno=62,numevents=2
> > Sep 14 12:35:34 2:ret=-1,errno=62,numevents=2
>
> ? Does this loop? It is odd to see multiple timeouts in 1 second..
>

It's just the two tests I added either side of the gettimeofday call
(1: before, 2: after), just in case errno was getting stamped on (which
it isn't).

-- 
Alex Kiernan

