[Pdns-users] Problem with pdns-recursor on Solaris 10 x86

Alex Kiernan alex.kiernan at gmail.com
Fri Nov 14 06:30:09 UTC 2008


I've found a problem with pdns-recursor on Solaris 10 x86 (32 bit
userland on 64 bit kernel). It smells like it ought to be something
like this bug:

http://bugs.opensolaris.org/view_bug.do?bug_id=6268715

On Solaris 10 x86 (32 bit), you see failures under load (but not
with single queries) in which port_getn returns with ret ==
0xfece4ce5 (though I suspect this value changes randomly) and
errno == 0. I know this from instrumenting the code immediately after
the port_getn call with:

  int e = errno;

and then:

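      char c[100];
      /* ret in hex, errno saved right after port_getn, errno now */
      snprintf(c, sizeof c, "%x/%d/%d\n", ret, e, errno);
      /* creat() takes a permission mode, not open() flags */
      write(creat("/tmp/edebug", 0644), c, strlen(c));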

just before the throw.
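
As a minimal sketch of what that instrumentation checks, assuming (per the port_getn man page) that the call only ever returns 0 on success or -1 with errno set: any other value, such as the 0xfece4ce5 above, is bogus. The names classify_port_getn and format_debug_line are made up for illustration and are not in the pdns-recursor source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of a port_getn() return value. */
enum port_result { PORT_OK, PORT_ERROR, PORT_BOGUS };

/* Assumption: port_getn() is documented to return 0 or -1 only,
 * so anything else means the return path is corrupted. */
static enum port_result classify_port_getn(int ret)
{
    if (ret == 0)
        return PORT_OK;      /* success */
    if (ret == -1)
        return PORT_ERROR;   /* genuine failure, errno should be set */
    return PORT_BOGUS;       /* e.g. 0xfece4ce5 with errno == 0 */
}

/* Build the same "ret/saved errno/current errno" line as the
 * instrumentation, without overrunning the buffer. */
static int format_debug_line(char *buf, size_t len,
                             int ret, int saved_errno, int cur_errno)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%x/%d/%d\n",
                    (unsigned)ret, saved_errno, cur_errno);
}
```

A caller could run classify_port_getn on the raw return value and only write the debug line in the PORT_BOGUS case, which keeps the log quiet for ordinary errors.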

If you build 64 bit the problem goes away.

The problem appears *only* on Intel, not AMD, and seems to be down to
the optimised libc which Solaris picks on a per-platform basis. On AMD
(libc_hwcap2), the function entry to _portfs looks like this:

_portfs:                        movl   $0xb6,%eax
_portfs+5:                      syscall
_portfs+7:                      jb     -0x84677 <__cerror>
_portfs+0xd:                    ret

on Intel (libc_hwcap1) it's this:

_portfs:                        call   +0x5     <_portfs+5>
_portfs+5:                      popl   %edx
_portfs+6:                      movl   $0xb6,%eax
_portfs+0xb:                    movl   %esp,%ecx
_portfs+0xd:                    addl   $0x10,%edx
_portfs+0x13:                   sysenter
_portfs+0x15:                   jb     -0x847d5 <__cerror>
_portfs+0x1b:                   ret

I'm going to try and get our support folks to open a ticket with Sun.

-- 
Alex Kiernan

