[Pdns-users] Re: Domains with binary (e.g. UTF-8) labels

Stephane Bortzmeyer bortzmeyer at nic.fr
Sat Dec 16 21:37:05 UTC 2006


On Sat, Dec 16, 2006 at 10:17:23PM +0100,
 bert hubert <bert.hubert at netherlabs.nl> wrote 
 a message of 29 lines which said:

> To encode utf-8 domains so that they work, use 'IDN'.

IDN is mandatory for host names, but it should not be required for
domain names that do not name hosts.
 
> Read for example paragraph 3.5 of RFC 1035, which contains: "The
> labels must follow the rules for ARPANET host names."

That sentence is in section 2.3.1, and it is stated only as a
*preference*, explicitly marked as such. RFC 2181 (section 11) makes
it very clear that the DNS is 8-bit clean:

   The DNS itself places only one restriction on the particular labels
   that can be used to identify resource records.  That one restriction
   relates to the length of the label and the full name.  The length of
   any one label is limited to between 1 and 63 octets.  A full domain
   name is limited to 255 octets (including the separators).  The zero
   length full name is defined as representing the root of the DNS tree,
   and is typically written and displayed as ".".  Those restrictions
   aside, any binary string whatever can be used as the label of any
   resource record.  Similarly, any binary string can serve as the value
   of any record that includes a domain name as some or all of its value
   (SOA, NS, MX, PTR, CNAME, and any others that may be added).
   Implementations of the DNS protocols must not place any restrictions
   on the labels that can be used.  In particular, DNS servers must not
   refuse to serve a zone because it contains labels that might not be
   acceptable to some DNS client programs.  A DNS server may be
   configurable to issue warnings when loading, or even to refuse to
   load, a primary zone containing labels that might be considered
   questionable, however this should not happen by default.
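
To illustrate how little the DNS itself checks, here is a minimal
Python sketch of the length rules quoted above. The function name
check_name is hypothetical, and counting the 255-octet limit in wire
format (one length octet per label plus the root octet) is my
assumption, taken from RFC 1035 rather than from the text above. It is
not how any particular name server implements the check.

```python
MAX_LABEL = 63   # octets per label (RFC 2181)
MAX_NAME = 255   # octets per full name, counted here in wire format


def check_name(labels):
    """Apply only the length limits to a list of labels given as bytes.

    The content of each label is deliberately not inspected: any
    binary string is acceptable as far as the DNS itself is concerned.
    An empty list stands for the root.
    """
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL:
            return False
    # One length octet in front of each label, plus the final root octet.
    wire_length = sum(1 + len(label) for label in labels) + 1
    return wire_length <= MAX_NAME
```

With this, check_name(["café".encode("utf-8"), b"example"]) and
check_name([b"\x00\xff", b"example"]) are both accepted, while a
64-octet label is refused.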

IMHO, PowerDNS is deeply wrong here.

> Even if we would support arbitrary values, things are unlikely to work as
> intended. IDN was invented for a reason.

Not this one. BIND and NSD work fine with 8-bit labels. IDN was
invented for two reasons:

* Most domain names contain host names, and host names do indeed have
the restriction (RFC 1123). That is also why all the domain
registries I know of prevent the registration of non-LDH labels (LDH =
letters/digits/hyphen).

* The most important problem with Unicode in domain names is not
whether 8-bit labels work or not (they work with BIND and NSD). It is
the *canonicalization*. ASCII labels have only one canonicalization
rule, and a very simple one ("case does not matter"). For Unicode,
things are more complicated: you need a much more elaborate
canonicalization algorithm, and the IETF thought it should live only
in the applications, not in the DNS servers.
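
Both points can be sketched in a few lines of Python. The function
names are hypothetical, and unicode_canonical (NFKC plus case folding,
from the standard unicodedata module) is only a stand-in to show the
kind of work involved. It is not the actual algorithm IDN specifies.

```python
import string
import unicodedata

# Octet values allowed in a host name label: letters, digits, hyphen.
LDH = frozenset((string.ascii_letters + string.digits + "-").encode("ascii"))


def is_ldh_label(label: bytes) -> bool:
    """True if the label is LDH-only and has no leading or trailing hyphen."""
    return (
        1 <= len(label) <= 63
        and all(octet in LDH for octet in label)
        and not label.startswith(b"-")
        and not label.endswith(b"-")
    )


def ascii_canonical(label: bytes) -> bytes:
    """The whole ASCII rule: fold A-Z to a-z, leave every other octet alone."""
    return bytes(o + 32 if 0x41 <= o <= 0x5A else o for o in label)


def unicode_canonical(label: str) -> str:
    """Illustration only: Unicode normalization (NFKC) followed by case folding."""
    return unicodedata.normalize("NFKC", label).casefold()


# Two spellings of the same visible label, "café":
composed = "caf\u00e9"       # e-acute as a single code point
decomposed = "cafe\u0301"    # "e" followed by a combining acute accent

# As raw UTF-8 octets they are two different labels for a DNS server...
assert composed.encode("utf-8") != decomposed.encode("utf-8")
# ...and they only compare equal once they have been normalized.
assert unicode_canonical(composed) == unicode_canonical(decomposed)
```

A server that compares octets, with at most the ASCII case rule
applied, treats the two spellings as different names. That is the
part which was left to the applications.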


