[Pdns-dev] Re: [Pdns-users] Re: Re: Domains with binary (e.g. UTF-8) labels

bert hubert bert.hubert at netherlabs.nl
Wed Dec 20 10:06:02 CET 2006


On Wed, Dec 20, 2006 at 09:45:09AM +0100, Benny Amorsen wrote:
> That won't happen. In UTF-8, all multibyte characters have the high
> bit set in every byte.

If I understand correctly, what the new RFCs promise us is that DNS is
binary safe, including dots within a label, with the proviso that queries
for labels containing [a-z] characters also match labels containing [A-Z] in
that same place.

So, a query for "\x00\x01\x02Q\x03" would match the label
"\x00\x01\x02q\x03".
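As a minimal sketch of that matching rule (the function names fold_label and
labels_match are made up for illustration and are not PowerDNS code), only
the octets 0x41-0x5A (A-Z) are folded and every other octet is compared as-is:

```python
def fold_label(label: bytes) -> bytes:
    """Lowercase only the octets A-Z (0x41-0x5A); leave all others untouched."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def labels_match(a: bytes, b: bytes) -> bool:
    """Compare two labels the way a case-insensitive, binary-safe DNS would."""
    return fold_label(a) == fold_label(b)


query = b"\x00\x01\x02Q\x03"
stored = b"\x00\x01\x02q\x03"
print(labels_match(query, stored))
```

Note that the dot is just another octet here: b"a.B" and b"A.b" would compare
equal as single labels.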

Which in general means that, even if the world conformed to those newer
RFCs, you cannot store *arbitrary* binary labels in DNS, since some of
them might be different to you, but equal to DNS (they only differ in the
'case bit' of one or more [a-zA-Z] octets).

This restriction however does allow for safe transport of UTF-8 through DNS,
as seen from RFC 2181 and the case sensitivity one mentioned.

A UTF-8 octet is either "7-bit clean", in which case case sensitivity is,
well, not a real problem, or it has the high bit set, in which case the
octet is outside of the [a-zA-Z] range.
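A rough way to check this argument, assuming the same A-Z folding rule as
described above (dns_fold, ascii_fold and survives_dns_folding are
hypothetical helper names): fold the encoded octets the way DNS would, decode
them again, and see whether anything other than plain ASCII letters changed.
The UTF-16 line is only there as a contrast, to show an encoding that does
not have this property.

```python
def dns_fold(octets: bytes) -> bytes:
    """Fold A-Z to a-z octet by octet, as case-insensitive DNS matching would."""
    return bytes(b | 0x20 if 0x41 <= b <= 0x5A else b for b in octets)


def ascii_fold(text: str) -> str:
    """Lowercase only the ASCII letters A-Z of a string."""
    return "".join(chr(ord(c) | 0x20) if "A" <= c <= "Z" else c for c in text)


def survives_dns_folding(text: str) -> bool:
    """True if DNS folding of the UTF-8 octets only touched ASCII letters."""
    folded = dns_fold(text.encode("utf-8"))
    return folded.decode("utf-8") == ascii_fold(text)


# Multibyte UTF-8 characters consist solely of octets >= 0x80, so they pass.
print(survives_dns_folding("Bücher-ÉCOLE-例え"))

# Contrast: in UTF-16-BE, U+0141 is the octets 0x01 0x41, and 0x41 is 'A'.
# Folding turns it into 0x01 0x61, which is a different character (U+0161).
print(dns_fold("Ł".encode("utf-16-be")).decode("utf-16-be"))
```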

However, for a reality check, do realise that the DNSSEC people decided not
to take advantage of binary labels but to use base32 encoding, even though
packet size is of concern to them. They might've used base204 encoding,
which would've led to much smaller packets.
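To put a number on the base32 cost, here is a small illustration. It assumes
a 20-octet SHA-1 digest as the value to carry in a label and uses the
standard base32 alphabet from Python's base64 module; it does not reproduce
the actual DNSSEC hashing procedure, only the size difference and the fact
that base32 text survives case folding.

```python
import base64
import hashlib

# Illustrative input only; not how DNSSEC derives its hashed names.
digest = hashlib.sha1(b"example.com").digest()
encoded = base64.b32encode(digest)

print(len(digest))   # octets needed for a raw binary label
print(len(encoded))  # octets needed for the base32 label

# The base32 alphabet is letters and digits, so a resolver that changes the
# case along the way does no harm: the value still decodes to the same digest.
roundtrip = base64.b32decode(encoded.lower(), casefold=True)
print(roundtrip == digest)
```

Base32 turns every 5 octets into 8 characters, so the raw 20 octets become
32 characters in the label.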

End to end UTF-8 DNS w/o IDN is not around the corner, and not just because
of PowerDNS.

	Bert

-- 
http://www.PowerDNS.com      Open source, database driven DNS Software 
http://netherlabs.nl              Open and Closed source services

