[Pdns-users] pdns_recursor stops getting queries on Solaris 10 sparc

Alex Kiernan alex.kiernan at gmail.com
Mon Oct 8 06:14:17 UTC 2007


On 17/09/2007, Alex Kiernan <alex.kiernan at gmail.com> wrote:
> On 14/09/2007, Alex Kiernan <alex.kiernan at gmail.com> wrote:
> > On 14/09/2007, Juergen Georgi <georgi at belwue.de> wrote:
> > > Hello,
> > >
> > > today the same blackout happened here on our Sun-Fire-V240 with
> > > Solaris 10: No answers on the main interface, only on 127.0.0.1.
> > > Same truss output:
> > >
> > > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730)   = 0 [62]
> > > ...
> > >
> > > This happened a couple of times before in test settings, see
> > >
> > > http://mailman.powerdns.com/pipermail/pdns-users/2006-December/004008.html
> > >
> > > This time it hit one of our production servers.
> > >
> > > I hope your patch will prove to be the cure.
> > >
> >
> > For reference, this is the patch against 3.1.4 I've just applied to
> > our tree and have just set building:
> >
> > Index: pdns-recursor/portsmplexer.cc
> > ===================================================================
> > RCS file: /cvsroot/upstream/pdns-recursor/portsmplexer.cc,v
> > retrieving revision 1.1.1.1
> > diff -u -r1.1.1.1 portsmplexer.cc
> > --- pdns-recursor/portsmplexer.cc       12 Nov 2006 16:56:13 -0000      1.1.1.1
> > +++ pdns-recursor/portsmplexer.cc       14 Sep 2007 13:28:58 -0000
> > @@ -91,10 +91,15 @@
> >
> >   gettimeofday(now,0);
> >
> > -  if(ret < 0 && errno!=EINTR && errno!=ETIME)
> > -    throw FDMultiplexerException("completion port_getn returned error: "+stringerror());
> > +  if(ret < 0) {
> > +    if(errno!=EINTR && errno!=ETIME)
> > +      throw FDMultiplexerException("completion port_getn returned error: "+stringerror());
> > +    // EINTR and ETIME are not really errors
> > +    if(errno==EINTR)
> > +      return 0;
> > +  }
> >
> > -  if((ret < 0 && errno==ETIME) || numevents==0) // nothing
> > +  if(!numevents) // nothing
> >     return 0;
> >
> >   d_inrun=true;
> >
> > Shout if it gets mangled in transit and I'll stick it on a website somewhere.
> >
>
> I looked at this some more and decided the logic wasn't entirely
> obvious (entering the first if clause for ETIME gratuitously), so I
> rewrote it and added some comments. The patch against 3.1.4 is here:
>
> http://www.alexk1.demon.co.uk/patch-3.1.4
>
> And against the current trunk:
>
> http://www.alexk1.demon.co.uk/patch-trunk
>

I now have 3 weeks of uptime with this patch and no problems.
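
For anyone reading this in the archive, the control flow of the diff quoted
above can be sketched as a standalone function. This is only an illustration
of my reading of that diff, not the code in portsmplexer.cc: the names
classifyPortGetn and PollOutcome are made up for the example, and
std::runtime_error stands in for FDMultiplexerException. As far as the diff
shows, the key change is that ETIME on its own no longer causes an early
return; the event count is checked instead.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical names for illustration only; not part of pdns-recursor.
enum class PollOutcome { Nothing, ProcessEvents };

// ret       : return value of a port_getn()-style call
// err       : errno captured right after the call
// numevents : number of events the call reported
PollOutcome classifyPortGetn(int ret, int err, unsigned int numevents)
{
  if (ret < 0) {
    // Anything other than EINTR or ETIME is a real error.
    if (err != EINTR && err != ETIME)
      throw std::runtime_error(
          std::string("completion port_getn returned error: ") + std::strerror(err));
    // Interrupted by a signal: nothing to do on this pass.
    if (err == EINTR)
      return PollOutcome::Nothing;
    // ETIME: fall through and let the event count decide.
  }

  if (numevents == 0)
    return PollOutcome::Nothing;

  return PollOutcome::ProcessEvents;
}
```

With this shape, a call such as classifyPortGetn(-1, ETIME, 2) asks the
caller to process the two events, whereas the pre-patch condition in the
diff returned 0 whenever ret was negative and errno was ETIME.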

-- 
Alex Kiernan

