[Pdns-users] CPU consumption of pdns_recursor

Nejedlo, Mark Mark.Nejedlo at tdstelecom.com
Tue Apr 6 16:25:44 UTC 2021


On Tuesday, April 6, 2021 10:04 AM, Remi Gacogne wrote:
> On 4/6/21 4:18 PM, Nejedlo, Mark via Pdns-users wrote:
> > Would additional distributor threads really cause additional worker
> > CPU usage?
> 
> That could happen if they have to fight for the incoming socket. Do you
> have reuseport=yes in your configuration?

Either I'm deeply misunderstanding something (quite possible), or we may be talking about different things.  If I understand correctly, thundering herd problems should only show up on the distributor threads, but my distributors are not very busy.  It is the workers doing the actual DNS processing that show the high CPU, and I would think the distributors address the workers individually, not via a shared port (but maybe I'm wrong?).

Dropping the distributors to one, which I'm planning to do anyway, will eliminate the problem on the front end socket.  If the workers do share the connection to the distributors, adding reuseport isn't hard.
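
For reference, here is my rough understanding of what reuseport buys at the socket level, as a minimal Python sketch rather than anything taken from the recursor's code.  It assumes Linux (SO_REUSEPORT is not available everywhere), and the helper name bind_udp_reuseport is made up for illustration.  With the option set, each listener gets its own socket on the same address and port and the kernel spreads incoming datagrams between them, instead of several threads contending for one shared socket.

```python
import socket


def bind_udp_reuseport(host: str, port: int) -> socket.socket:
    """Bind a UDP socket with SO_REUSEPORT set (hypothetical helper, Linux assumed)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The option must be set before bind() on every socket sharing the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock


# The first listener lets the kernel pick a free port.
first = bind_udp_reuseport("127.0.0.1", 0)
port = first.getsockname()[1]

# A second listener binds the very same address and port; without
# SO_REUSEPORT on both sockets this bind would fail with "Address already in use".
second = bind_udp_reuseport("127.0.0.1", port)
same_port = second.getsockname()[1] == port

first.close()
second.close()
```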

> > Does the maintenance function block the worker while it's running?
> 
> Yes.

That's unfortunate.  While I don't think timeouts are the source of my problems, I'll still have to think about how to address this just for general health of the service.

> >> I see that XPF is enabled between dnsdist and the recursor, which
> >> likely kills the recursor's packet cache. That might explain the bad
> >> performance results.
> >
> > Even with a short edns-subnet-whitelist?
> 
> I'm afraid so, yes, and since some of your responses depend on the
> client IP (EDNS Client Subnet is enabled for some domains) you can't
> really enable the packet cache in dnsdist, unless you know for sure that
> only a few domains are using EDNS Client Subnet, and that there is no
> CNAME to them from other domains. Then you could perhaps enable the
> packet cache in dnsdist and disable it for these domains only.

Sounds like I'll need to do some more detailed learning about the packet caches and how they interact.

> Do you really need XPF, by the way? You are passing the initial client IP in
> EDNS Client Subnet already, so that might be enough?

This is probably a misunderstanding on my part.  I was under the impression that useClientSubnet=true told dnsdist that it needed to pass the client IP, and addXPF/proxy protocol told it how to do so.  If I'm wrong, dropping XPF is easy enough.  Although, it sounds like I also want to drop useClientSubnet in favor of the proxy protocol.

> > Both 4.4/5 and proxy protocol were on my radar, but my priority was to
> > address the CPU usage.  If there are performance gains to be had in
> > upgrading, I can certainly do that.  Is 4.5GA likely to happen soon?
> 
> The proxy protocol adds a header outside of the DNS payload, so it would
> not kill your packet cache. If you get rid of EDNS Client Subnet and XPF
> between dnsdist and the recursor, you should get much better
> performance.
> Even if you need to keep EDNS Client Subnet between dnsdist and the
> recursor, you could then try enabling dnsdist's packet cache with the
> EDNS zero scope feature [1] which lets dnsdist know when it can cache an
> answer for all clients.

This probably goes back to my confusion on the previous point.  I need client IP aware responses, not specifically useClientSubnet between dnsdist and pdns_recursor.  Proxy protocol should be fine.
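
To check my own understanding of why the location of the client IP matters, here is a toy model in Python.  It is only a sketch under loud assumptions: ToyPacketCache, resolve, the "|client=" marker and the "PROXY" header line are all invented for illustration and do not reflect the real wire formats or the recursor's actual cache key.  The only point is that a cache keyed on the query payload never hits when every payload carries a different client address, while a header that is stripped before the lookup leaves the payloads identical.

```python
import hashlib


class ToyPacketCache:
    """Toy cache keyed on a hash of the raw query payload (illustration only)."""

    def __init__(self):
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, payload: bytes, resolve):
        key = hashlib.sha256(payload).digest()
        if key in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = resolve(payload)
        return self.entries[key]


def resolve(payload: bytes) -> bytes:
    # Stand-in for the expensive recursive resolution.
    return b"answer-for:" + payload


QUESTION = b"www.example.com/A"
clients = ["192.0.2.%d" % i for i in range(1, 6)]

# Case 1: the client address travels inside the payload, so every
# query hashes differently and nothing is ever served from the cache.
in_payload = ToyPacketCache()
for ip in clients:
    in_payload.lookup(QUESTION + b"|client=" + ip.encode(), resolve)

# Case 2: the client address travels in a header in front of the payload.
# The header is stripped before the lookup, so the payloads are identical.
outside = ToyPacketCache()
for ip in clients:
    packet = b"PROXY " + ip.encode() + b"\n" + QUESTION
    _header, payload = packet.split(b"\n", 1)
    outside.lookup(payload, resolve)

print(in_payload.hits, in_payload.misses)  # 0 5
print(outside.hits, outside.misses)        # 4 1
```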

Thanks,
Mark 


