[Pdns-users] CPU Usage Regression in Recursor 4.9.1?
Otto Moerbeek
otto at drijf.net
Mon Sep 4 09:34:23 UTC 2023
On Mon, Sep 04, 2023 at 10:49:23AM +0200, Otto Moerbeek via Pdns-users wrote:
> On Mon, Sep 04, 2023 at 10:30:38AM +0200, Christoph via Pdns-users wrote:
>
> >
> > > Thanks, recursor is now running with aggressive-nsec-cache-size=0
> > > and I'll report my findings after a few days.
> >
> > Already after less than a day I can say that this setting mitigates
> > the problem, thank you very much!
> > The CPU usage is significantly lower and stopped growing after 12 hours at a
> > lower level than without the setting. Also the drop rate is back to its usual
> > level.
> >
> > 'Timeout while waiting for the health check response from backend'
> > event counts also got reduced drastically:
> >
> > 4529 2023-09-03
> > 65 2023-09-04 (10 hours only)
> >
> >
> > Is this related to this 4.9.1 changelog entry?
> > > Replace data in the aggressive cache if new data becomes available.
> > >
> > > References: #13106, pull request 13161
>
> Yes, that is very likely. I do not understand the issue completely
> yet; in my testing the changes do not cause any significant change in
> CPU time. But I'm on it.
>
> -Otto
Would it be possible to give me some stats on the aggressive cache on the node(s)
showing the issue? Specifically, the values over time of
aggressive-nsec-cache-entries
aggressive-nsec-cache-nsec-hits
aggressive-nsec-cache-nsec-wc-hits
aggressive-nsec-cache-nsec3-hits
aggressive-nsec-cache-nsec3-wc-hits
The first is the most interesting.
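
One possible way to collect these values over time is a small polling
script. The following is only a sketch, assuming that rec_control is on
the PATH, can reach the running recursor, and that "rec_control get"
prints one value per line in the order the names were given. The
5-minute interval, the CSV layout and the helper names (run_rec_control,
sample, main) are illustrative choices, not anything prescribed by the
recursor.

```python
import subprocess
import time
from datetime import datetime, timezone

# The statistics asked for above; the first one is the most interesting.
METRICS = [
    "aggressive-nsec-cache-entries",
    "aggressive-nsec-cache-nsec-hits",
    "aggressive-nsec-cache-nsec-wc-hits",
    "aggressive-nsec-cache-nsec3-hits",
    "aggressive-nsec-cache-nsec3-wc-hits",
]


def run_rec_control(metrics):
    """Return the raw output of `rec_control get <metrics...>`.

    Assumption: one value per line, in the order requested.
    """
    result = subprocess.run(
        ["rec_control", "get", *metrics],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def sample(fetch=run_rec_control, now=None):
    """Build one CSV line: a UTC timestamp followed by the metric values."""
    now = now or datetime.now(timezone.utc)
    values = fetch(METRICS).split()
    if len(values) != len(METRICS):
        raise ValueError(
            f"expected {len(METRICS)} values, got {len(values)}: {values!r}"
        )
    return ",".join([now.strftime("%Y-%m-%dT%H:%M:%SZ"), *values])


def main(interval_seconds=300):
    """Print a CSV header, then one sample per interval until interrupted."""
    print(",".join(["timestamp", *METRICS]), flush=True)
    while True:
        print(sample(), flush=True)
        time.sleep(interval_seconds)
```

Calling main() and redirecting the output to a file yields a CSV that
can be plotted or pasted into a reply. The fetch parameter of sample()
exists only so the parsing can be tried without a running recursor.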
-Otto