[Pdns-users] CPU Usage Regression in Recursor 4.9.1?

Otto Moerbeek otto at drijf.net
Mon Sep 4 09:34:23 UTC 2023


On Mon, Sep 04, 2023 at 10:49:23AM +0200, Otto Moerbeek via Pdns-users wrote:

> On Mon, Sep 04, 2023 at 10:30:38AM +0200, Christoph via Pdns-users wrote:
> 
> > 
> > > Thanks, recursor is now running with aggressive-nsec-cache-size=0
> > > and I'll report my findings after a few days.
> > 
> > Already after less than a day I can say that this setting mitigates
> > the problem, thank you very much!
> > The CPU usage is significantly lower and stopped growing after 12 hours at a
> > lower level than without the setting. Also the drop rate is back to a usual
> > level.
> > 
> > 'Timeout while waiting for the health check response from backend'
> > event counts also got reduced drastically:
> > 
> >    4529 2023-09-03
> >      65 2023-09-04 (10 hours only)
> > 
> > 
> > Is this related to this 4.9.1 changelog entry?
> > > Replace data in the aggressive cache if new data becomes available.
> > > 
> > > References: #13106, pull request 13161
> 
> Yes, that is very likely. I do not understand the issue completely
> yet; in my testing the changes do not cause any significant change in
> CPU time. But I'm on it.
> 
> 	-Otto

Would it be possible to give me some stats on the aggressive cache on the node(s)
showing the issue? Specifically, the values over time of

aggressive-nsec-cache-entries
aggressive-nsec-cache-nsec-hits
aggressive-nsec-cache-nsec-wc-hits
aggressive-nsec-cache-nsec3-hits
aggressive-nsec-cache-nsec3-wc-hits

The first is the most interesting.
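
If you do not already graph these, a minimal sketch of sampling them
periodically with `rec_control get` could look like the following. It
assumes rec_control is on the PATH and can reach the recursor's control
socket, and that it prints one value per line in the order the names
were given. The helper names and the 60 second interval are only
illustrative.

```python
import subprocess
import time

# The five metrics asked for above.
STATS = [
    "aggressive-nsec-cache-entries",
    "aggressive-nsec-cache-nsec-hits",
    "aggressive-nsec-cache-nsec-wc-hits",
    "aggressive-nsec-cache-nsec3-hits",
    "aggressive-nsec-cache-nsec3-wc-hits",
]


def parse_values(names, output):
    """Pair each requested name with the value at the same position.

    Assumption: one whitespace-separated value per requested name, in order.
    Values are kept as strings so unexpected output is not silently lost.
    """
    values = output.split()
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return dict(zip(names, values))


def sample(names=STATS):
    """Run rec_control once and return a {name: value} dict."""
    result = subprocess.run(
        ["rec_control", "get", *names],
        capture_output=True, text=True, check=True,
    )
    return parse_values(names, result.stdout)


def format_line(timestamp, values):
    """Render one sample as a single line: '<epoch> name=value ...'."""
    pairs = " ".join(f"{name}={value}" for name, value in values.items())
    return f"{timestamp} {pairs}"


def run(interval=60):
    """Print one line per sample until interrupted."""
    while True:
        print(format_line(int(time.time()), sample()), flush=True)
        time.sleep(interval)
```

Calling run() and redirecting the output to a file for a day or so
would give the kind of time series asked for.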

	-Otto



