[Pdns-users] pdns_recursor stops getting queries on Solaris 10 sparc

Alex Kiernan alex.kiernan at gmail.com
Mon Sep 17 10:10:50 UTC 2007


On 14/09/2007, Alex Kiernan <alex.kiernan at gmail.com> wrote:
> On 14/09/2007, Juergen Georgi <georgi at belwue.de> wrote:
> > Hello,
> >
> > today the same blackout happened here on our Sun-Fire-V240 with
> > Solaris 10: No answers on the main interface, only on 127.0.0.1.
> > Same truss output:
> >
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > ...
> >
> > This happened a couple of times before in test settings, see
> >
> > http://mailman.powerdns.com/pipermail/pdns-users/2006-December/004008.html
> >
> > This time it hit one of our production servers.
> >
> > I hope your patch will prove to be the cure.
> >
>
> For reference, this is the patch against 3.1.4 I've just applied to
> our tree and have just set building:
>
> Index: pdns-recursor/portsmplexer.cc
> ===================================================================
> RCS file: /cvsroot/upstream/pdns-recursor/portsmplexer.cc,v
> retrieving revision 1.1.1.1
> diff -u -r1.1.1.1 portsmplexer.cc
> --- pdns-recursor/portsmplexer.cc       12 Nov 2006 16:56:13 -0000      1.1.1.1
> +++ pdns-recursor/portsmplexer.cc       14 Sep 2007 13:28:58 -0000
> @@ -91,10 +91,15 @@
>
>   gettimeofday(now,0);
>
> -  if(ret < 0 && errno!=EINTR && errno!=ETIME)
> -    throw FDMultiplexerException("completion port_getn returned error: "+stringerror());
> +  if(ret < 0) {
> +    if(errno!=EINTR && errno!=ETIME)
> +      throw FDMultiplexerException("completion port_getn returned error: "+stringerror());
> +    // EINTR and ETIME are not really errors
> +    if(errno==EINTR)
> +      return 0;
> +  }
>
> -  if((ret < 0 && errno==ETIME) || numevents==0) // nothing
> +  if(!numevents) // nothing
>     return 0;
>
>   d_inrun=true;
>
> Shout if it gets mangled in transit and I'll stick it on a website somewhere.
>

I looked at this some more and decided the logic wasn't entirely
obvious (it enters the first if clause for ETIME gratuitously), so I
rewrote it and added some comments. The patch against 3.1.4 is here:

http://www.alexk1.demon.co.uk/patch-3.1.4

And against the current trunk:

http://www.alexk1.demon.co.uk/patch-trunk
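
For anyone who can't reach those links, here is a rough sketch of the
decision the patch is trying to make. It is not the code in either
patch. The helper name classifyPortGetn and the PollOutcome enum are
made up for illustration, and it throws std::runtime_error rather than
FDMultiplexerException so that it builds on its own. The assumption, as
the quoted diff implies, is that port_getn can fail with ETIME while
still handing back events, so only numevents decides whether there is
work to do.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical result type: what the caller should do after port_getn returns.
enum class PollOutcome { Nothing, ProcessEvents };

// Hypothetical helper mirroring the control flow of the quoted patch.
// ret       - return value of port_getn
// err       - errno captured immediately after the call
// numevents - event count reported back through the nget argument
PollOutcome classifyPortGetn(int ret, int err, unsigned int numevents)
{
  if (ret < 0) {
    // Interrupted by a signal: nothing to process this round.
    if (err == EINTR)
      return PollOutcome::Nothing;

    // Anything other than a timeout is a genuine failure.
    if (err != ETIME)
      throw std::runtime_error(
          std::string("completion port_getn returned error: ") + std::strerror(err));

    // ETIME: fall through, because events may still have been delivered
    // (assumption taken from the patch, see above).
  }

  return numevents == 0 ? PollOutcome::Nothing : PollOutcome::ProcessEvents;
}
```

With this shape the ETIME case never passes through the error branch,
and a timeout that still delivered events gets processed instead of
being dropped.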

-- 
Alex Kiernan