[Pdns-users] PDNS-Recursor Segfaults

Imre Gergely gimre at narancs.net
Tue May 20 21:16:22 UTC 2014


Backtrace attached.

[root at c605 pdns-recursor]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 3873
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 8192
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 3873
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited


On 05/21/2014 12:00 AM, Aki Tuomi wrote:
> Can you install the debuginfo package and run it with gdb to get stack trace? Also, can 
> you give us ulimit -a?
>
> Aki
>
> On Tue, May 20, 2014 at 11:41:21PM +0300, Imre Gergely wrote:
>> [root at c605 ~]# cat /etc/pdns-recursor/recursor.conf |grep -v "^#" | grep -v "^$"
>> setuid=pdns-recursor
>> setgid=pdns-recursor
>> daemon=no
>> local-address=127.0.0.1
>> threads=1
>>
>> [root at c605 ~]# strace /usr/sbin/pdns_recursor > /tmp/strace-pdns-recursor.txt 2>&1
>> Segmentation fault
>> [root at c605 ~]#
>>
>> [root at c605 ~]# ip a |grep inet |wc
>>    4576   27451  233679
>>
>> Attached. If this is not what you had in mind, please let me know.
>>
>> On 05/20/2014 11:31 PM, bert hubert wrote:
>>> Imre,
>>>
>>> Can you strace the startup with threads=1?
>>>
>>>     Bert
>>>
>>> On May 20, 2014 10:25 PM, Imre Gergely <gimre at narancs.net> wrote:
>>>> Hi
>>>>
>>>> I did manage to reproduce this in a VM. Installed a CentOS 6.5, and recursor 3.5.3 from EPEL. Then I did this:
>>>>
>>>> for i in `seq 1 16`; do for j in `seq 1 254`; do ip a a 10.0.$i.$j/16 dev eth0; done; done
>>>>
>>>> Then I started the recursor, everything went just fine, did a bunch of digs, no problems.
>>>>
>>>> Then I added some more IPs:
>>>>
>>>> for i in `seq 17 32`; do for j in `seq 1 254`; do ip a a 10.0.$i.$j/16 dev eth0; done; done
>>>>
>>>> And then init.d/pdns-recursor restart:
>>>>
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: PowerDNS recursor 3.5.3 (C) 2001-2013 PowerDNS.COM BV (Feb 10 2014, 17:26:52, gcc 4.4.7 20120313 (Red Hat 4.4.7-4)) starting up
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Operating in 32 bits mode
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Reading random entropy from '/dev/urandom'
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Only allowing queries from: 127.0.0.0/8, 10.0.0.0/8, 100.64.0.0/10, 169.254.0.0/16, 192.168.0.0/16, 172.16.0.0/12, ::1/128, fe80::/10
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Will not send queries to: 127.0.0.0/8, 10.0.0.0/8, 100.64.0.0/10, 169.254.0.0/16, 192.168.0.0/16, 172.16.0.0/12, ::1/128, fe80::/10, 0.0.0.0, ::
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: NOT using IPv6 for outgoing queries - set 'query-local-address6=::' to enable
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Inserting rfc 1918 private space zones
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Listening for UDP queries on 127.0.0.1:53
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Enabled TCP data-ready filter for (slight) DoS protection
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Listening for TCP queries on 127.0.0.1:53
>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Calling daemonize, going to background
>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Set effective group id to 499
>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Set effective user id to 498
>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Launching 2 threads
>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Done priming cache with root hints
>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Enabled 'epoll' multiplexer
>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Done priming cache with root hints
>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Refreshed . records
>>>> May 20 23:18:25 c605 kernel: pdns_recursor[21345]: segfault at ffff01d4 ip 080b1626 sp b6397890 error 4 in pdns_recursor[8048000+112000]
>>>>
>>>> [root at c605 ~]# ip a |grep inet |wc
>>>>    4322   25927  220579
>>>> [root at c605 ~]# /etc/init.d/pdns-recursor start
>>>> Starting pdns-recursor:                                    [  OK  ]  <-- starts OK
>>>> [root at c605 ~]# /etc/init.d/pdns-recursor stop
>>>> Stopping pdns-recursor:                                    [  OK  ]
>>>>
>>>> Adding one more /24:
>>>>
>>>> [root at c605 ~]# for j in `seq 1 254`; do ip a a 10.0.18.$j/16 dev eth0; done
>>>> [root at c605 ~]# /etc/init.d/pdns-recursor start
>>>> Starting pdns-recursor:                                    [  OK  ]
>>>> [root at c605 ~]# ip a |grep inet |wc
>>>>    4576   27451  233679
>>>>
>>>> It says it starts, but it doesn't, just segfaults.
>>>>
>>>> [root at c605 ~]# file /bin/bash
>>>> /bin/bash: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
>>>>
>>>>
>>>> On 05/20/2014 10:58 PM, James Baer wrote:
>>>>> Hi All - I'm experiencing an issue and am unsure whether it is a bug or something I need to adjust on my systems.
>>>>>
>>>>> I have 2 servers, both running pdns_recursor (3.5.3) on CentOS 6.5, installed from the EPEL repository. The recursor is only listening on localhost on each system.
>>>>>
>>>>> I am experiencing somewhat random crashes of the recursor with the following error: 
>>>>>
>>>>> kernel: pdns_recursor[21993]: segfault at 200001fc8 ip 0000000000472780 sp 00007f3f9c03f690 error 4 in pdns_recursor[400000+111000] 
>>>>>
>>>>> Both servers have a large number of IP addresses bound to them, in the range of 3-4k. I was able to replicate the segfaults on one of the servers by adding additional IP addresses. When I got to around 4k IP addresses the recursor would not even start; it segfaulted right away. I was able to get it to start again by removing some IP addresses, so I know it has something to do with how many addresses I have bound to the server.
>>>>>
>>>>> Does anybody have any ideas what I can do to correct this problem? I really don't see a reason why the recursor would care how many IP addresses I have on a system.
>>>>>
>>>>> thank you 
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________ 
>>>>> Pdns-users mailing list 
>>>>> Pdns-users at mailman.powerdns.com 
>>>>> http://mailman.powerdns.com/mailman/listinfo/pdns-users 
>>>>>
>>>> -- 
>>>>
>>>> Imre Gergely
>>>>
>>>> http://havaz.net
>>>>
>>>> gpg --keyserver subkeys.pgp.net --recv-keys 0x34525305
>>>>
>> -- 
>> Imre Gergely
>> http://havaz.net
>> gpg --keyserver subkeys.pgp.net --recv-keys 0x34525305
>>
>

-- 
Imre Gergely
http://havaz.net
gpg --keyserver subkeys.pgp.net --recv-keys 0x34525305

-------------- next part --------------
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/pdns_recursor...Reading symbols from /usr/lib/debug/usr/sbin/pdns_recursor.debug...done.
done.
(gdb) run
Starting program: /usr/sbin/pdns_recursor 
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x003b629e in make_request () from /lib/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.i686 libgcc-4.4.7-4.el6.i686 libstdc++-4.4.7-4.el6.i686 lua-5.1.4-4.1.el6.i686
(gdb) backtrace
#0  0x003b629e in make_request () from /lib/libc.so.6
#1  0x003b64e9 in __check_pf () from /lib/libc.so.6
#2  0x003754af in getaddrinfo () from /lib/libc.so.6
#3  0x0807af9b in makeIPv6sockaddr (addr="2001:503:ba3e::2:30", ret=0xb7ff5d20) at misc.cc:717
#4  0x080ba3d5 in ComboAddress (rr=...) at iputils.hh:122
#5  DNSRR2String (rr=...) at recursor_cache.cc:75
#6  0x080bca00 in MemRecursorCache::replace (this=0x81655c8, now=1400620340, qname="a.root-servers.net.", qt=..., content=std::set with 1 elements = {...}, auth=true) at recursor_cache.cc:249
#7  0x0805a2b2 in SyncRes::doResolveAt (this=0xb7ff6b60, nameservers=std::set with 13 elements = {...}, auth=".", flawedNSSet=false, qname=".", qtype=..., ret=std::vector of length 0, capacity 0, depth=0, beenthere=std::set with 1 elements = {...}) at syncres.cc:1043
#8  0x08056472 in SyncRes::doResolve (this=0xb7ff6b60, qname=".", qtype=..., ret=std::vector of length 0, capacity 0, depth=0, beenthere=std::set with 1 elements = {...}) at syncres.cc:440
#9  0x08062847 in SyncRes::beginResolve (this=0xb7ff6b60, qname=".", qtype=..., qclass=1, ret=std::vector of length 0, capacity 0) at syncres.cc:126
#10 0x08095cdc in houseKeeping () at pdns_recursor.cc:1179
#11 0x080ac2be in MTasker<PacketID, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >::threadWrapper (self1=0, self2=135687208, tf=0x8095bc0 <houseKeeping(void*)>, tid=0, val1=0, val2=0) at mtasker.cc:380
#12 0x002e9b9b in makecontext () from /lib/libc.so.6
#13 0x00000000 in ?? ()
(gdb) quit
A debugging session is active.

	Inferior 1 [process 5896] will be killed.

Quit anyway? (y or n) 
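
For anyone following the trace: frames #0 to #3 show makeIPv6sockaddr() handing the literal "2001:503:ba3e::2:30" to getaddrinfo(), which then enters __check_pf() and make_request() in glibc before the SIGSEGV. Below is a rough Python sketch of the distinction, not the recursor's code: inet_pton() only parses the text, while getaddrinfo() is the libc entry point the backtrace goes through. The names parse_literal and resolve_literal are made up for illustration, and the sketch assumes a Unix-like host where Python exposes socket.inet_pton. Whether avoiding getaddrinfo() is the right fix for the recursor is not something this thread establishes.

```python
import ipaddress
import socket

ROOT_AAAA = "2001:503:ba3e::2:30"  # literal taken from frame #3 of the backtrace


def parse_literal(text: str) -> bytes:
    """Parse an IPv6 literal with inet_pton(); pure text parsing, returns 16 bytes."""
    return socket.inet_pton(socket.AF_INET6, text)


def resolve_literal(text: str, port: int = 53) -> bytes:
    """Parse the same literal through getaddrinfo(), the libc call in the backtrace.

    AI_NUMERICHOST forbids DNS lookups, so only the numeric path is exercised.
    """
    infos = socket.getaddrinfo(
        text,
        port,
        family=socket.AF_INET6,
        type=socket.SOCK_DGRAM,
        flags=socket.AI_NUMERICHOST,
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the host.
    host = infos[0][4][0]
    return ipaddress.IPv6Address(host).packed


if __name__ == "__main__":
    # Hypothetical check: print the 16 parsed bytes as hex.
    print(parse_literal(ROOT_AAAA).hex())
```

Running resolve_literal(ROOT_AAAA) on the affected box with the same number of bound addresses might help tell whether the crash follows getaddrinfo() itself. Note that Python calls it on the main thread stack, whereas frames #11 and #12 show the recursor calling it from a makecontext()-based MTasker context, so a clean run here would not rule the glibc path out.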

