[Pdns-users] PDNS-Recursor Segfaults
    Imre Gergely 
    gimre at narancs.net
       
    Tue May 20 21:54:32 UTC 2014
    
    
  
Nothing changes if I set query-local-address; it still crashes.
The recursor is not used by anything; this is just a test VM. With
fewer IPs the recursor does start normally, but I just noticed that it
still doesn't work: it crashed at the first query, with the following:
[root@c605 ~]# ip a |grep inet |wc
   4322   25927  220579
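
Note that `grep inet` matches both `inet` and `inet6` lines, so the
figure above counts IPv4 and IPv6 addresses together (loopback
included). A minimal sketch for splitting the count by family;
`count_inet` is a hypothetical helper name, not part of iproute2:

```shell
# Hypothetical helper: read `ip a` output on stdin and count addresses per family.
count_inet() {
    awk '$1 == "inet"  { v4++ }
         $1 == "inet6" { v6++ }
         END { printf "ipv4=%d ipv6=%d\n", v4, v6 }'
}

# Usage on a live system:
#   ip a | count_inet
```
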
Program received signal SIGSEGV, Segmentation fault.
0x0031cfb4 in malloc_consolidate (av=0x4403a0) at malloc.c:5212
5212                if (!nextinuse) {
Missing separate debuginfos, use: debuginfo-install
libgcc-4.4.7-4.el6.i686 libstdc++-4.4.7-4.el6.i686 lua-5.1.4-4.1.el6.i686
(gdb) backtrace
#0  0x0031cfb4 in malloc_consolidate (av=0x4403a0) at malloc.c:5212
#1  0x0031fccd in _int_malloc (av=0x4403a0, bytes=103728) at malloc.c:4406
#2  0x0032102e in __libc_malloc (bytes=103728) at malloc.c:3664
#3  0x003b639b in make_request (fd=17, pid=6491, seen_ipv4=0x819c2eb,
seen_ipv6=0x819c2ea, in6ai=0x819c2e0, in6ailen=0x819c2dc) at
../sysdeps/unix/sysv/linux/check_pf.c:215
#4  0x003b6509 in __check_pf (seen_ipv4=0x819c2eb, seen_ipv6=0x819c2ea,
in6ai=0x819c2e0, in6ailen=0x819c2dc) at
../sysdeps/unix/sysv/linux/check_pf.c:272
#5  0x003754cf in getaddrinfo (name=0x819e2a4 "2001:503:a83e::2:30",
service=<value optimized out>, hints=0x819c324, pai=0x819c344) at
../sysdeps/posix/getaddrinfo.c:2315
#6  0x0807af9b in makeIPv6sockaddr (addr="2001:503:a83e::2:30",
ret=0x819c3c0) at misc.cc:717
#7  0x080ba3d5 in ComboAddress (rr=...) at iputils.hh:122
#8  DNSRR2String (rr=...) at recursor_cache.cc:75
#9  0x080bca00 in MemRecursorCache::replace (this=0x81655b0,
now=1400622756, qname=
, qt=..., content=std::set with 1 elements = {...}, auth=false) at
recursor_cache.cc:249
#10 0x0805a2b2 in SyncRes::doResolveAt (this=0x819d26c,
nameservers=std::set with 13 elements = {...}, auth=".",
flawedNSSet=false, qname="www.narancs.net.", qtype=..., ret=std::vector
of length 0, capacity 0, depth=0, beenthere=std::set with 1 elements =
{...}) at syncres.cc:1043
#11 0x08056472 in SyncRes::doResolve (this=0x819d26c,
qname="www.narancs.net.", qtype=..., ret=std::vector of length 0,
capacity 0, depth=0, beenthere=std::set with 1 elements = {...}) at
syncres.cc:440
#12 0x08062847 in SyncRes::beginResolve (this=0x819d26c,
qname="www.narancs.net.", qtype=..., qclass=1, ret=std::vector of length
0, capacity 0) at syncres.cc:126
#13 0x0808f4f8 in startDoResolve (p=0x8166bc8) at pdns_recursor.cc:536
#14 0x080ac2be in MTasker<PacketID, std::basic_string<char,
std::char_traits<char>, std::allocator<char> > >::threadWrapper
(self1=0, self2=135687232, tf=0x808eb40 <startDoResolve(void*)>, tid=1,
val1=0, val2=135687112) at mtasker.cc:380
#15 0x002e9b9b in makecontext () at
../sysdeps/unix/sysv/linux/i386/makecontext.S:88
#16 0x00000000 in ?? ()
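
For readability, the call chain can be condensed to one line per frame.
This is a rough sketch, assuming the backtrace is piped in or saved to a
file; `frames` is a hypothetical helper, and it keeps only the first
token of each function name, so templated names such as the MTasker
frame come out truncated:

```shell
# Hypothetical helper: reduce a gdb backtrace on stdin to "frame function" pairs.
# Leading mail quote markers ("> ") are stripped; wrapped continuation lines are skipped.
frames() {
    sed 's/^[> ]*//' | awk '/^#[0-9]+ / {
        fn = ($3 == "in") ? $4 : $2   # frames without an address have the name in $2
        print $1, fn
    }'
}

# Usage:
#   frames < backtrace.txt
```

Read this way, the top of the trace above is malloc_consolidate,
_int_malloc, __libc_malloc, make_request, __check_pf, getaddrinfo,
makeIPv6sockaddr: the fault is inside the glibc allocator, reached from
the interface check that getaddrinfo performs.
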
On 05/21/2014 12:40 AM, Aki Tuomi wrote:
> Thank you, this is pretty much what happens. Just need to figure
> out why it crashes. No obvious reason stands out. What I don't
> understand is why it's using make_request for a numeric host, and
> why it only happens when you increase the number of local
> addresses.
>
> Is the recursor used by localhost? Does anything change if you
> set query-local-address to some IP address?
>
> Aki
>
> On Wed, May 21, 2014 at 12:16:22AM +0300, Imre Gergely wrote:
>> backtrace attached.
>>
>> [root@c605 pdns-recursor]# ulimit -a
>> core file size          (blocks, -c) 0
>> data seg size           (kbytes, -d) unlimited
>> scheduling priority             (-e) 0
>> file size               (blocks, -f) unlimited
>> pending signals                 (-i) 3873
>> max locked memory       (kbytes, -l) 64
>> max memory size         (kbytes, -m) unlimited
>> open files                      (-n) 8192
>> pipe size            (512 bytes, -p) 8
>> POSIX message queues     (bytes, -q) 819200
>> real-time priority              (-r) 0
>> stack size              (kbytes, -s) 10240
>> cpu time               (seconds, -t) unlimited
>> max user processes              (-u) 3873
>> virtual memory          (kbytes, -v) unlimited
>> file locks                      (-x) unlimited
>>
>>
>> On 05/21/2014 12:00 AM, Aki Tuomi wrote:
>>> Can you install the debuginfo package and run it with gdb to get stack trace? Also, can 
>>> you give us ulimit -a?
>>>
>>> Aki
>>>
>>> On Tue, May 20, 2014 at 11:41:21PM +0300, Imre Gergely wrote:
>>>> [root@c605 ~]# cat /etc/pdns-recursor/recursor.conf |grep -v "^#" | grep -v "^$"
>>>> setuid=pdns-recursor
>>>> setgid=pdns-recursor
>>>> daemon=no
>>>> local-address=127.0.0.1
>>>> threads=1
>>>>
>>>> [root@c605 ~]# strace /usr/sbin/pdns_recursor > /tmp/strace-pdns-recursor.txt 2>&1
>>>> Segmentation fault
>>>> [root@c605 ~]#
>>>>
>>>> [root@c605 ~]# ip a |grep inet |wc
>>>>    4576   27451  233679
>>>>
>>>> Attached. If this is not what you had in mind, please let me know.
>>>>
>>>> On 05/20/2014 11:31 PM, bert hubert wrote:
>>>>> Imre,
>>>>>
>>>>> Can you strace the startup with threads=1?
>>>>>
>>>>>     Bert
>>>>>
>>>>> On May 20, 2014 10:25 PM, Imre Gergely <gimre at narancs.net> wrote:
>>>>>> Hi
>>>>>>
>>>>>> I did manage to reproduce this in a VM. Installed a CentOS 6.5, and recursor 3.5.3 from EPEL. Then I did this:
>>>>>>
>>>>>> for i in `seq 1 16`; do for j in `seq 1 254`; do ip a a 10.0.$i.$j/16 dev eth0; done; done
>>>>>>
>>>>>> Then I started the recursor, everything went just fine, did a bunch of digs, no problems.
>>>>>>
>>>>>> Then I added some more IPs:
>>>>>>
>>>>>> for i in `seq 17 32`; do for j in `seq 1 254`; do ip a a 10.0.$i.$j/16 dev eth0; done; done
>>>>>>
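
The two loops above can be folded into one parameterized helper. This is
only a sketch: `add_range` is a hypothetical name, the interface and
prefix length are taken from the commands above, and by default it only
prints the commands instead of running them.

```shell
# Hypothetical helper: emit (or run) `ip addr add` for 10.0.<first..last>.<1..254>/16.
# Set RUN=1 to actually execute; the default is a dry run that prints each command.
add_range() {
    first=$1; last=$2; dev=${3:-eth0}
    for i in $(seq "$first" "$last"); do
        for j in $(seq 1 254); do
            if [ "${RUN:-0}" = 1 ]; then
                ip addr add "10.0.$i.$j/16" dev "$dev"
            else
                echo "ip addr add 10.0.$i.$j/16 dev $dev"
            fi
        done
    done
}

# Dry run of the first batch (16 * 254 = 4064 addresses):
#   add_range 1 16
```
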
>>>>>> And then init.d/pdns-recursor restart:
>>>>>>
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: PowerDNS recursor 3.5.3 (C) 2001-2013 PowerDNS.COM BV (Feb 10 2014, 17:26:52, gcc 4.4.7 20120313 (Red Hat 4.4.7-4))
>>>>>>  starting up
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: PowerDNS comes with ABSOLUTELY NO WARRANTY. This is free software, and you are welcome to redistribute it according to the terms of the GPL version 2.
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Operating in 32 bits mode
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Reading random entropy from '/dev/urandom'
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Only allowing queries from: 127.0.0.0/8, 10.0.0.0/8, 100.64.0.0/10, 169.254.0.0/16, 192.168.0.0/16, 172.16.0.0/12, ::1/128, fe80::/10
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Will not send queries to: 127.0.0.0/8, 10.0.0.0/8, 100.64.0.0/10, 169.254.0.0/16, 192.168.0.0/16, 172.16.0.0/12, ::1/128, fe80::/10, 0.0.0.0, ::
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: NOT using IPv6 for outgoing queries - set 'query-local-address6=::' to enable
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Inserting rfc 1918 private space zones
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Listening for UDP queries on 127.0.0.1:53
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Enabled TCP data-ready filter for (slight) DoS protection
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Listening for TCP queries on 127.0.0.1:53
>>>>>> May 20 23:18:24 c605 pdns_recursor[21341]: Calling daemonize, going to background
>>>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Set effective group id to 499
>>>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Set effective user id to 498
>>>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Launching 2 threads
>>>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Done priming cache with root hints
>>>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Enabled 'epoll' multiplexer
>>>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Done priming cache with root hints
>>>>>> May 20 23:18:24 c605 pdns_recursor[21342]: Refreshed . records
>>>>>> May 20 23:18:25 c605 kernel: pdns_recursor[21345]: segfault at ffff01d4 ip 080b1626 sp b6397890 error 4 in pdns_recursor[8048000+112000]
>>>>>>
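
The `error 4` in the kernel line is the x86 page-fault error code. A
small sketch for decoding its three low bits; `decode_pf` is a
hypothetical helper and ignores the higher bits:

```shell
# Hypothetical helper: decode the low three bits of an x86 page-fault error code
# as printed in kernel "segfault at ... error N" messages.
decode_pf() {
    code=$1
    if [ $((code & 4)) -ne 0 ]; then mode="user-mode"; else mode="kernel-mode"; fi
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    echo "$mode $access, $cause"
}

# Usage:
#   decode_pf 4    # prints: user-mode read, page not present
```

So the process tried to read from an address (ffff01d4) that was not
mapped at all.
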
>>>>>> [root@c605 ~]# ip a |grep inet |wc
>>>>>>    4322   25927  220579
>>>>>> [root@c605 ~]# /etc/init.d/pdns-recursor start
>>>>>> Starting pdns-recursor:                                    [  OK  ]  <-- starts OK
>>>>>> [root@c605 ~]# /etc/init.d/pdns-recursor stop
>>>>>> Stopping pdns-recursor:                                    [  OK  ]
>>>>>>
>>>>>> Adding one more /24:
>>>>>>
>>>>>> [root@c605 ~]# for j in `seq 1 254`; do ip a a 10.0.18.$j/16 dev eth0; done
>>>>>> [root@c605 ~]# /etc/init.d/pdns-recursor start
>>>>>> Starting pdns-recursor:                                    [  OK  ]
>>>>>> [root@c605 ~]# ip a |grep inet |wc
>>>>>>    4576   27451  233679
>>>>>>
>>>>>> It says it starts, but it doesn't, just segfaults.
>>>>>>
>>>>>> [root@c605 ~]# file /bin/bash
>>>>>> /bin/bash: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
>>>>>>
>>>>>>
>>>>>> On 05/20/2014 10:58 PM, James Baer wrote:
>>>>>>> Hi All - I'm experiencing an issue, and I am unsure whether it is a bug or just something I need to adjust on my systems to account for. 
>>>>>>>
>>>>>>> I have 2 servers, both running pdns_recursor (3.5.3) on CentOS 6.5, installed from the EPEL repository. The recursor is only listening on localhost on each system. 
>>>>>>>
>>>>>>> I am experiencing somewhat random crashes of the recursor with the following error: 
>>>>>>>
>>>>>>> kernel: pdns_recursor[21993]: segfault at 200001fc8 ip 0000000000472780 sp 00007f3f9c03f690 error 4 in pdns_recursor[400000+111000] 
>>>>>>>
>>>>>>> Both servers have a large number of IP addresses bound to them, in the range of 3-4k. I was able to replicate the segfaults on one of the servers by adding additional IP addresses. When I got to around 4k IP addresses the recursor would not even start; it just segfaulted right away. I was able to get it to start again by removing some IP addresses, so I know it has something to do with how many addresses I have bound to the server. 
>>>>>>>
>>>>>>> Does anybody have any ideas what I can do to correct this problem? I really don't see a reason why the recursor would care how many IP addresses I have on a system. 
>>>>>>>
>>>>>>> thank you 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________ 
>>>>>>> Pdns-users mailing list 
>>>>>>> Pdns-users at mailman.powerdns.com 
>>>>>>> http://mailman.powerdns.com/mailman/listinfo/pdns-users 
>>>>>>>
>>>>>> -- 
>>>>>>
>>>>>> Imre Gergely
>>>>>>
>>>>>> http://havaz.net
>>>>>>
>>>>>> gpg --keyserver subkeys.pgp.net --recv-keys 0x34525305
>>>>>>
>>>> -- 
>>>> Imre Gergely
>>>> http://havaz.net
>>>> gpg --keyserver subkeys.pgp.net --recv-keys 0x34525305
>>>>
>>>> _______________________________________________
>>>> Pdns-users mailing list
>>>> Pdns-users at mailman.powerdns.com
>>>> http://mailman.powerdns.com/mailman/listinfo/pdns-users
>> -- 
>> Imre Gergely
>> http://havaz.net
>> gpg --keyserver subkeys.pgp.net --recv-keys 0x34525305
>>
>> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
>> Copyright (C) 2010 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "i686-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from /usr/sbin/pdns_recursor...Reading symbols from /usr/lib/debug/usr/sbin/pdns_recursor.debug...done.
>> done.
>> (gdb) run
>> Starting program: /usr/sbin/pdns_recursor 
>> [Thread debugging using libthread_db enabled]
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x003b629e in make_request () from /lib/libc.so.6
>> Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.i686 libgcc-4.4.7-4.el6.i686 libstdc++-4.4.7-4.el6.i686 lua-5.1.4-4.1.el6.i686
>> (gdb) backtrace
>> #0  0x003b629e in make_request () from /lib/libc.so.6
>> #1  0x003b64e9 in __check_pf () from /lib/libc.so.6
>> #2  0x003754af in getaddrinfo () from /lib/libc.so.6
>> #3  0x0807af9b in makeIPv6sockaddr (addr="2001:503:ba3e::2:30", ret=0xb7ff5d20) at misc.cc:717
>> #4  0x080ba3d5 in ComboAddress (rr=...) at iputils.hh:122
>> #5  DNSRR2String (rr=...) at recursor_cache.cc:75
>> #6  0x080bca00 in MemRecursorCache::replace (this=0x81655c8, now=1400620340, qname="a.root-servers.net.", qt=..., content=std::set with 1 elements = {...}, auth=true) at recursor_cache.cc:249
>> #7  0x0805a2b2 in SyncRes::doResolveAt (this=0xb7ff6b60, nameservers=std::set with 13 elements = {...}, auth=".", flawedNSSet=false, qname=".", qtype=..., ret=std::vector of length 0, capacity 0, depth=0, beenthere=std::set with 1 elements = {...}) at syncres.cc:1043
>> #8  0x08056472 in SyncRes::doResolve (this=0xb7ff6b60, qname=".", qtype=..., ret=std::vector of length 0, capacity 0, depth=0, beenthere=std::set with 1 elements = {...}) at syncres.cc:440
>> #9  0x08062847 in SyncRes::beginResolve (this=0xb7ff6b60, qname=".", qtype=..., qclass=1, ret=std::vector of length 0, capacity 0) at syncres.cc:126
>> #10 0x08095cdc in houseKeeping () at pdns_recursor.cc:1179
>> #11 0x080ac2be in MTasker<PacketID, std::basic_string<char, std::char_traits<char>, std::allocator<char> > >::threadWrapper (self1=0, self2=135687208, tf=0x8095bc0 <houseKeeping(void*)>, tid=0, val1=0, val2=0) at mtasker.cc:380
>> #12 0x002e9b9b in makecontext () from /lib/libc.so.6
>> #13 0x00000000 in ?? ()
>> (gdb) quit
>> A debugging session is active.
>>
>> 	Inferior 1 [process 5896] will be killed.
>>
>> Quit anyway? (y or n) 
-- 
Imre Gergely
http://havaz.net
gpg --keyserver subkeys.pgp.net --recv-keys 0x34525305
    
    