[Pdns-users] powerdns segfaulting?

Daniel Greene - (mt) Media Temple daniel at mediatemple.net
Thu Jul 31 16:29:27 UTC 2008


Very strange: we just had one of our 3-node pdns clusters go down, each box with a different issue. One had segfaulted; one was hung on a futex call (per strace, though we were in a rush to get the cluster back up, so I was unable to look further at the time); and on the third the process had simply vanished.

version: PowerDNS 2.9.21 (C) 2001-2006 PowerDNS.COM BV (Apr  9 2008, 10:37:43, gcc 4.2.3 (Ubuntu 4.2.3-2ubuntu7))

Relevant pdns.conf settings for the 'questions waiting' message:
max-queue-length=20000
queue-limit=1500
distributor-threads=10
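
For reference, a minimal Python sketch of how these settings relate to the "questions waiting" respawn message in the log further down. This is not PowerDNS code: the helper names `parse_conf` and `would_respawn` are hypothetical, and the strict "greater than" comparison is an assumption that merely agrees with the logged values (20018 waiting against a limit of 20000).

```python
# Sketch only: parse a pdns.conf-style fragment and compare a queue depth
# against max-queue-length. The helper names are hypothetical.

def parse_conf(text):
    """Return key -> value for 'key=value' lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def would_respawn(queue_depth, settings):
    """True if queue_depth exceeds max-queue-length (assumed strict comparison)."""
    return queue_depth > int(settings["max-queue-length"])


CONF = """\
max-queue-length=20000
queue-limit=1500
distributor-threads=10
"""

settings = parse_conf(CONF)
print(would_respawn(20018, settings))  # prints True
```
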

1 - Has anyone seen a similar signal 11 issue, or does anyone have suggestions for what to look into immediately if this happens? I apologize again; we were in a crunch to get things back up.

Jul 31 07:30:00 pdns12 pdns[8315]: 20018 questions waiting for database attention. Limit is 20000, respawning
Jul 31 07:30:00 pdns12 pdns[15887]: Our pdns instance exited with code 1
Jul 31 07:30:00 pdns12 pdns[15887]: Respawning
Jul 31 07:30:01 pdns12 pdns[15887]: Got a signal 11, attempting to print trace: 
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server [0x479610]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libc.so.6 [0x7f36b737f100]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libc.so.6(fgets+0x44) [0x7f36b73b0514]
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server [0x478a7b]
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server(_ZN11DynListener11theListenerEv+0x501) [0x48a621]
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server(_ZN11DynListener17theListenerHelperEPv+0x9) [0x48b669]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libpthread.so.0 [0x7f36b76b53f7]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libc.so.6(clone+0x6d) [0x7f36b7424b2d]
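
To catch this quickly next time, something like the following could scan syslog for the two events above. It is a rough Python sketch: the regular expressions are written only against the message formats shown in this log, and the function name `scan` and the event labels are my own, not anything PowerDNS provides.

```python
import re

# Patterns matched against the log lines quoted above; other pdns versions
# may word these messages differently.
QUEUE_RE = re.compile(
    r"(\d+) questions waiting for database attention\. Limit is (\d+)"
)
SIGNAL_RE = re.compile(r"Got a signal (\d+)")
PID_RE = re.compile(r"pdns\[(\d+)\]")


def scan(lines):
    """Return (pid, kind, detail) tuples for queue-overflow and signal events."""
    events = []
    for line in lines:
        pid_match = PID_RE.search(line)
        if pid_match is None:
            continue
        pid = int(pid_match.group(1))
        m = QUEUE_RE.search(line)
        if m:
            detail = (int(m.group(1)), int(m.group(2)))
            events.append((pid, "queue_overflow", detail))
            continue
        m = SIGNAL_RE.search(line)
        if m:
            events.append((pid, "signal", int(m.group(1))))
    return events


SAMPLE = [
    "Jul 31 07:30:00 pdns12 pdns[8315]: 20018 questions waiting for database attention. Limit is 20000, respawning",
    "Jul 31 07:30:00 pdns12 pdns[15887]: Our pdns instance exited with code 1",
    "Jul 31 07:30:01 pdns12 pdns[15887]: Got a signal 11, attempting to print trace: ",
]

for event in scan(SAMPLE):
    print(event)
```

Run against the excerpt, this reports the queue overflow under pid 8315 and the signal 11 under pid 15887, i.e. the two messages come from different processes.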

2 - Should we up the number of distributor threads?

* The load/memory usage remained fairly constant on the servers in question (per sar)

* mysql -e "show status like 'Max%'";
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 16    | 
+----------------------+-------+

which is quite low

* qsize-q still peaked above 20k

Everything I've read indicates that only a relatively low number of distributor threads is necessary, but in this case I suspect that raising it may help?
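
A back-of-the-envelope way to reason about question 2, as a Python sketch. It assumes each distributor thread has at most one backend query in flight, that the backend is the bottleneck, and that adding threads does not slow individual queries down; none of that is confirmed here. The latency and arrival-rate figures are hypothetical placeholders, not measurements from these servers.

```python
def max_throughput(threads, backend_latency_s):
    """Upper bound on queries/second with one in-flight query per thread."""
    return threads / backend_latency_s


def seconds_to_fill(queue_limit, arrival_qps, threads, backend_latency_s):
    """Seconds until the queue reaches queue_limit, or None if it does not grow."""
    surplus = arrival_qps - max_throughput(threads, backend_latency_s)
    if surplus <= 0:
        return None
    return queue_limit / surplus


# Hypothetical inputs: 5 ms per backend query, 3000 queries/s arriving.
LATENCY = 0.005
ARRIVAL = 3000

for threads in (10, 16, 32):
    print(threads,
          max_throughput(threads, LATENCY),
          seconds_to_fill(20000, ARRIVAL, threads, LATENCY))
```

With real per-query latency and query rate plugged in, this would show whether 10 threads can keep up at all or whether the queue is bound to reach the 20000 limit.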

Thanks in advance for any insight you guys may have.

Daniel

