[Pdns-users] MySQL/MariaDb Scaling

Thomas Mieslinger miesi at mail.com
Mon Jun 7 07:20:42 UTC 2021


Hi Frank,

thanks for pointing out the possible speedup from using the LMDB backend
instead of the g*sql backends.

Is anyone doing LMDB with millions of zones? How do you keep them in
sync with the master? Is there also a simple way, like MySQL replication?
Is it feasible to have slave servers which check the SOA of millions
of zones on a master DNS server?
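
As a rough way to size that last question, here is a back-of-the-envelope
sketch. It assumes each slave polls every zone's SOA once per refresh
interval and ignores NOTIFY, which lets the master signal changes instead
of relying on polling alone. The function name, zone counts and refresh
values are made up for illustration, not taken from a real deployment.

```python
def soa_check_rate(zones: int, refresh_seconds: int) -> float:
    """Average SOA queries per second that one slave sends to the master
    if it polls every zone exactly once per refresh interval."""
    if refresh_seconds <= 0:
        raise ValueError("refresh_seconds must be positive")
    return zones / refresh_seconds


if __name__ == "__main__":
    # Assumed example figures only.
    for zones in (1_000_000, 5_000_000):
        for refresh in (3600, 86400):
            rate = soa_check_rate(zones, refresh)
            print(f"{zones} zones, refresh {refresh}s: "
                  f"{rate:.1f} SOA checks/s per slave")
```

The master sees this rate multiplied by the number of slaves, so the
refresh interval is the main knob in such a setup.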

Cheers Thomas

On 04.06.21 at 10:32, Frank Louwers wrote:
> Hi,
>
> As Thomas said: your setup looks sane, and if it currently works for you, there's no need to change anything.
>
> If you do have zones that are getting hit by a random-subdomain-lookup attack, I would recommend having a separate NS with a BIND or LMDB backend ready to serve only those domains. You'll be able to withstand the attacks for longer (LMDB/BIND lookups for random records are much quicker than on MySQL), and you also separate the domain under attack from your other domains, so the blast radius is greatly reduced.
>
> Kind Regards,
>
> Frank
>
>> On Jun 4, 2021, at 8:48 AM, Thomas Mieslinger via Pdns-users <pdns-users at mailman.powerdns.com> wrote:
>>
>> Hi,
>>
>> as it seems to work for you, why change? It sounds like you are using all
>> the modern technologies that are available.
>>
>> I run a setup for millions of zones which was designed when dnsdist was
>> not yet written.
>>
>> Instead of separating dnsdist, pdns authoritative and mariadb on
>> separate vms, I run mariadb, pdns and quagga on hardware.
>>
>> The mariadbs are slaves to one failover master pair. Everything is
>> replicated with mysql replication.
>>
>> Whenever I want to scale, I buy servers, send them to the locations
>> (typically racks in CoLos), get BGP to the routers up and running, and
>> serve requests.
>>
>> These servers test themselves to see whether pdns is able to answer and
>> then announce (or not) the pdns instance IPs via BGP.
>>
>> Just my two cents.
>>
>> Thomas
>>
>>
>> On 02.06.21 at 18:49, Mailing Lists via Pdns-users wrote:
>>> Hi all
>>>
>>> I run a reasonably sized PowerDNS setup (high millions of domains across
>>> a few instances). So far the way I have been scaling it is working fine,
>>> but I would like to get some additional suggestions in case I missed
>>> something. When we need extra capacity, currently it's a matter of adding
>>> a dnsdist server for the front end or PowerDNS with MariaDB for the backend.
>>>
>>> Dnsdist answers a large number of queries from cache, which reduces the
>>> load nicely, but every now and then we get an attack which punches
>>> through the caching with random subdomains and causes a high
>>> load on the PowerDNS auth servers. If that occurs, our strategy has been
>>> to add the domain to a predefined suffix match group on dnsdist which
>>> applies stricter rate limiting, which works well enough. We use other
>>> rules to limit QPS from prefixes of certain sizes, which does help
>>> sometimes, but the latest attacks seem to come entirely from spoofed IPs
>>> that are not in any particularly easy-to-limit prefix.
>>>
>>> The setup we use is:
>>>
>>> * 2 sets of MariaDB "master" VMs (2 clusters in 2 geographically
>>> separated locations) which are active/active and replicate from/to each
>>> other. All write queries are directed to these.
>>>
>>> * 3 PowerDNS "delayed slave" auth VMs geographically distributed, each
>>> of which has its own MariaDB install which acts as a read only slave to
>>> the master servers. These servers are configured with a replication
>>> delay for DR purposes, they do not normally get any traffic.
>>>
>>> * Multiple PowerDNS auth VMs geographically distributed (in at least
>>> pairs) with the same setup as the delayed slave servers. They do not
>>> have any replication delay configured and they are the servers that
>>> receive traffic from dnsdist normally.
>>>
>>> * Multiple dnsdist servers in geographically distributed areas. Queries
>>> are preferably sent to the local auth servers if they are available; if
>>> not, then to remote auth servers if they are available, followed by the
>>> delayed DR servers. For stability, the IPs dnsdist listens on for
>>> queries are bound to a loopback adapter and advertised to the rest of
>>> the network with BGP.
>>>
>>> The servers are all on SSDs except 2 (waiting for a hardware refresh...),
>>> with a reasonable amount of RAM and CPU resources. During the attacks
>>> the biggest bottleneck seems to be the DB. I plan on doing some
>>> simulated benchmarks directly on the DB to see what numbers I am getting
>>> without the overhead of PowerDNS parsing the request, generating the query,
>>> waiting for the answer, etc.
>>>
>>> I would be curious whether there is already a tool which could perform the
>>> test I mentioned above or whether I will end up having to write one. If I do
>>> write one, my goal would be to run a test, change a setting (in MariaDB or
>>> PowerDNS) and repeat.
>>>
>>> Also, if you know of any other relevant OS-related or MariaDB-related
>>> tuning that would help, I would be happy to run additional
>>> benchmarks to see what the impact would be and publish them later.
>>>
>>> _______________________________________________
>>> Pdns-users mailing list
>>> Pdns-users at mailman.powerdns.com
>>> https://mailman.powerdns.com/mailman/listinfo/pdns-users
>>>
>

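On the benchmarking idea in the original post (run a test, change a
setting, repeat), a minimal harness might look like the sketch below. The
`lookup` callable is a hypothetical stand-in for whatever runs the
backend's record query directly against MariaDB; here it is a plain
dictionary lookup so the example runs on its own. The random labels mimic
the cache-busting random-subdomain pattern described in the thread, and
all names in the code are assumptions for illustration.

```python
import random
import string
import time
from typing import Callable, List, Optional


def random_subdomains(zone: str, count: int, label_len: int = 12,
                      seed: Optional[int] = None) -> List[str]:
    """Generate `count` random single-label subdomains of `zone`."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    return [
        "".join(rng.choice(alphabet) for _ in range(label_len)) + "." + zone
        for _ in range(count)
    ]


def benchmark(lookup: Callable[[str], object], names: List[str]) -> float:
    """Call `lookup` once per name and return lookups per second."""
    start = time.perf_counter()
    for name in names:
        lookup(name)
    elapsed = time.perf_counter() - start
    return len(names) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # In-memory stand-in for the database; replace `records.get` with a
    # function that issues the real query to measure the DB itself.
    records = {"www.example.com": "192.0.2.1"}
    names = random_subdomains("example.com", 10_000, seed=1)
    qps = benchmark(records.get, names)
    print(f"{qps:.0f} lookups/s against the in-memory stand-in")
```

Using a fixed seed keeps the name set identical between runs, so a change
in the result reflects the changed setting rather than different input.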
