[Pdns-users] pdns_recursor stops getting queries on Solaris 10 sparc
Alex Kiernan
alex.kiernan at gmail.com
Mon Sep 17 10:10:50 UTC 2007
On 14/09/2007, Alex Kiernan <alex.kiernan at gmail.com> wrote:
> On 14/09/2007, Juergen Georgi <georgi at belwue.de> wrote:
> > Hello,
> >
> > today the same blackout happened here on our Sun-Fire-V240 with
> > Solaris 10: No answers on the main interface, only on 127.0.0.1.
> > Same truss output:
> >
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730) = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730) = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730) = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730) = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730) = 0 [62]
> > port_getn(9, 0x0012B4E0, 1024, 1, 0xFFBEF730) = 0 [62]
> > ...
> >
> > This happened a couple of times before in test settings, see
> >
> > http://mailman.powerdns.com/pipermail/pdns-users/2006-December/004008.html
> >
> > This time it hit one of our production servers.
> >
> > I hope your patch will prove to be the cure.
> >
>
> For reference, this is the patch against 3.1.4 I've just applied to
> our tree and have just set building:
>
> Index: pdns-recursor/portsmplexer.cc
> ===================================================================
> RCS file: /cvsroot/upstream/pdns-recursor/portsmplexer.cc,v
> retrieving revision 1.1.1.1
> diff -u -r1.1.1.1 portsmplexer.cc
> --- pdns-recursor/portsmplexer.cc 12 Nov 2006 16:56:13 -0000 1.1.1.1
> +++ pdns-recursor/portsmplexer.cc 14 Sep 2007 13:28:58 -0000
> @@ -91,10 +91,15 @@
>
> gettimeofday(now,0);
>
> - if(ret < 0 && errno!=EINTR && errno!=ETIME)
> - throw FDMultiplexerException("completion port_getn returned error: "+stringerror());
> + if(ret < 0) {
> + if(errno!=EINTR && errno!=ETIME)
> + throw FDMultiplexerException("completion port_getn returned error: "+stringerror());
> + // EINTR and ETIME are not really errors
> + if(errno==EINTR)
> + return 0;
> + }
>
> - if((ret < 0 && errno==ETIME) || numevents==0) // nothing
> + if(!numevents) // nothing
> return 0;
>
> d_inrun=true;
>
> Shout if it gets mangled in transit and I'll stick it on a website somewhere.
>
I looked at this some more and decided the logic wasn't entirely
obvious (it entered the first if clause for ETIME gratuitously), so I
rewrote it and added some comments. The patch against 3.1.4 is here:
http://www.alexk1.demon.co.uk/patch-3.1.4
And against the current trunk:
http://www.alexk1.demon.co.uk/patch-trunk
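
For anyone who can't fetch those, here is a rough sketch of the decision the quoted diff makes after port_getn() returns. It is not the text of either linked patch. The helper name afterPortGetn and the enum are made up for illustration, and std::runtime_error with std::strerror stands in for FDMultiplexerException and stringerror() so the snippet builds without the recursor tree. As I read the diff, the behavioural change is that ETIME no longer forces an early return: numevents alone decides whether there is anything to dispatch.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of pdns-recursor. It mirrors the checks the
// diff performs right after port_getn():
//   ret       - return value of port_getn()
//   err       - errno saved immediately after the call
//   numevents - event count port_getn() wrote back
enum class AfterPortGetn { ReturnZero, DispatchEvents };

AfterPortGetn afterPortGetn(int ret, int err, unsigned int numevents)
{
  if (ret < 0) {
    // EINTR is not really an error: report "nothing handled" to the caller.
    if (err == EINTR)
      return AfterPortGetn::ReturnZero;

    // Anything other than ETIME is a genuine failure.
    // (std::runtime_error is a stand-in for FDMultiplexerException.)
    if (err != ETIME)
      throw std::runtime_error(
          std::string("completion port_getn returned error: ") +
          std::strerror(err));

    // ETIME: fall through and let numevents decide, as the diff does.
  }

  // No events means nothing to do; otherwise the caller dispatches them.
  return numevents ? AfterPortGetn::DispatchEvents
                   : AfterPortGetn::ReturnZero;
}
```

With this shape, afterPortGetn(-1, ETIME, 3) asks the caller to dispatch, whereas the pre-patch condition would have returned 0 for the same inputs.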
--
Alex Kiernan