INTERNET-DRAFT Eric A. Hall Document: draft-hall-dns-datatypes-00.txt June 2002 Expires: December 2002 Domain Name Data-Types Status of this Memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. 1. Abstract This document defines syntax and structural rules for a namespace of internationalized domain names, and also clarifies the syntax and structural rules for the existing DNS namespace. Furthermore, this document defines syntax and structural rules for specific types of labels and domain names, and also defines usage rules for specific resource records within the domain name system. This document specifically does not describe any mechanisms for interacting with these namespaces, domain names or resource records, but instead focuses exclusively on the syntax and structural rules. INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Table of Contents 1. Abstract.................................................1 2. Introduction.............................................3 3. Background and Overview..................................3 4. The Namespaces...........................................7 4.1. The Class IN Hierarchy.................................7 4.2. The DNS Namespace......................................9 4.2.1. Length Restrictions in the DNS Namespace.............9 4.2.2. Characters Restrictions in the DNS Namespace........10 4.2.3. The DNS Namespace Escape Syntax.....................11 4.3. The Internationalized Namespace.......................13 4.3.1. Length Restrictions in the i18n Namespace...........14 4.3.2. Character Restrictions in the i18n Namespace........15 5. The DNS Data-Types......................................16 5.1. Syntax Validation.....................................16 5.2. Defining New Data-Types...............................17 5.3. The Root Label and Domain Name........................18 5.4. The Hostname Labels and Domain Names..................19 5.4.1. Legacy Hostnames....................................19 5.4.2. Internationalized Hostnames.........................20 5.5. The Octet Label and Domain Name.......................21 5.6. The Mailbox Labels and Domain Names...................22 5.6.1. Legacy Mailboxes....................................23 5.6.2. Internationalized Mailboxes.........................24 5.7. The Service Locator Labels and Domain Names...........24 5.7.1. Legacy Service Locators.............................24 5.7.2. Internationalized Service Locators..................25 6. Resource Records and Query Types........................25 6.1. Resource Records......................................25 6.2. Query Types...........................................34 7. Security Considerations.................................35 8. IANA Considerations.....................................35 9. References..............................................36 10. Acknowledgements........................................38 11. Author's Address........................................39 Hall I-D Expires: December 2002 [page 2] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 2. Introduction The IDN working group has been developing mechanisms for supporting and interacting with internationalized domain names, although a prerequisite to the completion of any such work is the description of the internationalized namespace itself. During this work, it has also been determined that certain clarifications to the existing DNS namespace are also necessary. Encodings, protocols and other mechanisms for accessing domain names and resource records within the internationalized namespace are purposefully not described in this document. Discussion of this document and related work items is currently being held on the "idn@ops.ietf.org" mailing list. To join the list, send a message to with the single word "subscribe" in the body of the message. Subsequent versions of this draft will be brought to DNSEXT for standards-track development, and will be discussed on the "namedroppers@ops.ietf.org" mailing list. To join that list, send a message to with the single word of "subscribe" in the body of the message. 3. Background and Overview The Internet (and the ARPANET before it) has had a formal namespace of network resources since RFC 608 [RFC608]. Over the years, however, the syntax rules associated with the global namespace have been changed, with various updates and clarifications being provided in RFC 810 [RFC810], RFC 882 [RFC882], RFC 952 [RFC952], RFC 1034 [RFC1034], RFC 1123 [RFC1123] and RFC 2181 [RFC2181], with each revision expanding upon the namespace syntax to accommodate a more flexible usage model. The original namespace of network resources defined in [RFC608] used a flat HOSTS.TXT database as a simple list of systems and their network addresses, using a limited subset from the seven-bit US-ASCII charset [ASCII] for the system names. Essentially, the database format was the namespace, with all network services using this one-dimensional namespace for the purpose of specifying systems by name, regardless of whether these hostnames were used for intra-protocol services or for subsequent lookup operations. Hall I-D Expires: December 2002 [page 3] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 The format of the underlying database (and thus the namespace) was redefined by [RFC810] to reflect the coexistence of ARPANET and Internet networks and nodes, redefined again by [RFC952] to allow for multi-label hostnames, and updated in [RFC1123] to slightly expand the allowable character repertoire. Throughout these revisions, the syntax of the namespace was changed somewhat, although it continued to be one-dimensional in nature, reflecting the limitations of the underlying HOSTS.TXT database file. Once the Domain Name System (DNS) specifications were published (first in [RFC882] and RFC 883 [RFC883], then later in [RFC1034] and RFC 1035 [RFC1035]), the database of network resources and the hostname syntax separated into distinct entities. Although DNS uses an eight-bit syntax internally and allows any of the eight- bit codepoint values to be used for any purposes, most applications and protocols restrict their usage of the namespace to well-known data-types which only use subsets of the available namespace. This has resulted in a distinctly layered namespace, where applications and protocols use the data-type subsets, while the DNS itself uses the full range of characters. For example, [RFC1034] states that "the old rules for HOSTS.TXT should be followed" for domain names which reference host systems, and most of the application protocols have followed this advice. For all practical purposes, this means that the legacy hostname syntax is implicitly a strong data-type with formal syntax rules, even though it only represents a subset of the global DNS namespace. Meanwhile, a variety of protocol-specific data-types have also been defined for network resources which are not hosts, and these data-types have also been implemented by applications which work with those kinds of resources. In the end, the DNS namespace essentially exists as two separate layers, with the namespace at large being defined by the underlying DNS service, but with applications and protocols using the data-types and syntax rules which reflect their usage. Moving forward, the IDN working group has developed an internationalized namespace which uses characters from the Universal Character Set (UCS) [ISO-10646] (a.k.a. Unicode [UNICODE])]. Note that the UCS (and thus the namespace) only defines characters and their logical codepoint values, while external codecs are required to encode the canonical UCS characters into sequences which are suitable for specific environments. As such, the canonical UCS characters cannot be Hall I-D Expires: December 2002 [page 4] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 exchanged in their raw form, but instead can be only exchanged in their encoded form. Furthermore, there are no characters in the UCS character repertoire for "octet value 0xHH", so the internationalized namespace cannot directly support the eight-bit values used in the DNS namespace. For these reasons, the internationalized and DNS namespaces have to be defined and managed separately, with the internationalized namespace representing logical UCS characters, and with the DNS namespace representing raw and uninterpreted eight-bit values. However, these namespaces can coexist within the class IN database hierarchy as long as the underlying domain names are encoded in a compatible and consistent form. At that point, the only substantive difference between the two namespaces is in the canonical characters which are used by the presentation-layer namespaces at large. Cumulatively, this means that the internationalized namespace operates at three distinct layers. First of all, applications and protocols have to choose an internationalized data-type which is capable of supporting the characters they need for their domain names, while the protocols also have to choose the encoding formats they will use for the domain names that they exchange. Meanwhile, the end-system applications may need to use another encoding format whenever they map these domain names to the available lookup service(s). | HOSTS.TXT | DNS Names | IDNs | --------------+---------------+---------------+---------------+ Application | database | protocol | data-type | Presentation | subset | subset | specific | | | | (UCS range) | --------------+---------------+---------------+---------------+ Protocol | database | data-type | protocol | Transfer | subset | specific | encoding | | | (7- or 8-bit) | (7- or 8-bit) | --------------+---------------+---------------+---------------+ Subsequent | HOSTS.TXT | 8-bit DNS | 8-bit DNS | Lookups | subset | or HOSTS.TXT | or HOSTS.TXT | | | subset | subset | --------------+---------------+---------------+---------------+ Figure 1: Logical namespace layers and their representations. The different models which have been described in this section are illustrated in Figure 1 above. As can be seen, the hostname syntax Hall I-D Expires: December 2002 [page 5] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 defined for use with the HOSTS.TXT database essentially provided a monolithic and one-dimensional namespace for applications to use, while the DNS namespace provides multiple data-types for applications and protocols to use, with these data-types representing logical subsets of the underlying eight-bit namespace. Finally, the internationalized namespace also uses logical data-types for the applications and protocols, but also requires that protocols define encoding formats for the domain names they use internally, and for the domain names they pass to the underlying lookup services. Collectively, there are a variety of different data elements discussed above which require definitions or clarifications. Originally, this document was only meant to provide definitions for the internationalized namespace and its associated data-types, although several issues with the DNS namespace and data-types have also been encountered which require clarifications and definitions of their own. In particular, since the internationalized namespace is unable to use the eight-bit codepoint values from the DNS namespace, the legacy hostname data-type must be defined with an explicit syntax and its usage must be restricted to specific scenarios in order for those domain names to be accessible from the internationalized namespace. Subsequently, this requires that DNS resource records also be redefined to use the data-types which are appropriate to the data they represent, thereby ensuring that host resources in the common hierarchy are always accessible to both namespaces. On the surface, some of this work may appear to be a reversal of existing standards, since the DNS specifications and the clarifications made in [RFC2181] explicitly allow eight-bit codepoint values to be used with any domain name. In truth, however, this redefinition is a codification of existing practices and recommendations. Specifically, this document encourages applications to define their own data-types and syntax rules when needed but also requires that the common hostname syntax be supported in those places where hosts are specifically referenced, which is essentially a restatement of the [RFC1034] requirements. The only real difference here is that this document also requires that resource records be explicitly restricted to use the appropriate data-types, rather than being allowed to diverge at will. While this is a reversal of policy as defined in [RFC2181], this is necessary in order to ensure basic interoperability across the different namespaces, and is also necessary in order to Hall I-D Expires: December 2002 [page 6] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 prevent basic interoperability problems from developing due to fragmentation of the class IN hierarchy. 4. The Namespaces Conceptually, the DNS namespace and the internationalized namespace are separate, in that they allow for different ranges of characters, with the namespaces containing different identifiers. However, both of the namespaces reside in the class IN hierarchy and share a common root in that class, and are effectively the same namespace when the identifiers have been encoded into a compatible and consistent form. Applications and protocols which specifically utilize any of the data-types defined in this document MUST conform to the syntax rules associated with the parent namespace for that data-type. For example, if an application specifically claims to support the internationalized hostname data-type then that application MUST conform to the requirements associated with the internationalized namespace at large, while applications which claim to conform to the legacy mailbox data-type MUST conform to the requirements of the DNS namespace. If a protocol supports some other external namespace (such as LDAP directories), then the syntax rules for that protocol SHOULD define specific handling rules which clearly state how that protocol will use each of the namespaces defined here. 4.1. The Class IN Hierarchy The IN class use a hierarchical structure, where each domain name is represented by a series of labels, and where the entire sequence of labels represents a globally-unique domain name in the hierarchy. Essentially, the IN class hierarchy represents the database portion of the DNS, and therefore represents the storage, transfer and processing services which are used to construct the DNS namespace. Note that the namespace syntax (such as length and character restrictions) is discussed separately in section 4.2. Also note that alternative classes may have their own structural rules which are different from those used in the IN class. When a domain name from the IN class is used in the DNS, the constituent labels are typically treated as binary sequences (but not always), with each label being prefaced by a length indicator. Hall I-D Expires: December 2002 [page 7] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 The labels are ordered from right-to-left, with the least-specific label at the right edge of the domain name, and with the most- specific labels at the left edge. The right-most label will have a length indicator with the value of zero (it is truly a null label), and represents the root of the class IN hierarchy which is by definition the least-specific label in the hierarchy. When a domain name is used for lookups, the entire sequence of labels act as a lookup key against a globally-distributed hierarchy of database partitions and their leaf-nodes. Some or all of the labels will identify a specific database partition in the hierarchy, while any remaining labels will identify a particular leaf-node within a partition. As a lookup is processed, the input domain name is matched against the contents of the current partition, with the results of this comparison operation either being a referral to another partition, an answer, or an error. If a referral is returned, the matching process is restarted at the referenced partition, with this process repeating until either an answer or an error is returned. Domain names from the DNS namespace which are written out in longhand form are usually written as character sequences, with the labels typically being separated by a Full-Stop character (0x2E) from [ASCII]. When used with longhand domain names in the internationalized namespace, the separator mark is either a trailing Full-Stop (U+002E), an Ideographic Full-Stop (U+3002), an Ideographic Full-Width Full-Stop (U+FF0E), or an Ideographic Half- Width Full-Stop (U+FF61) from the UCS. When any of the ideographic forms are used, they MUST be converted to the traditional Full- Stop character when the domain name labels are normalized, and MUST NOT be exchanged with other applications or protocols in their provided form. Although the separator frequently appears to represent the length indicators in the domain name system, this is not always true. For example, domain names which are written out in longhand form do not typically use a Full-Stop character at the beginning of the domain name to represent the length indicator from the first label, nor do they typically provide a trailing Full-Stop character to represent the root of the hierarchy. Outside of the domain name system, domain names are typically treated as simple identifiers, with no database context being implied. Humans normally treat domain names as simple identifiers of named network resources, without concern for the leaf-node or partitions which may be referenced. Meanwhile, most applications Hall I-D Expires: December 2002 [page 8] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 and protocols also treat domain names as simple identifiers, although many of them apply syntactical analysis to the domain name before performing any additional processing (even in these situations, however, the domain name will not normally be analyzed for database context). Note that a one-to-one match between labels, partitions and leaf- nodes is neither required nor implied. Multiple labels may be used to refer to a partition or a leaf-node, as desired. In most cases, the specific database context of a domain name cannot be determined without querying DNS directly. 4.2. The DNS Namespace These rules represent the allowable syntax of all domain names within the IN class hierarchy as defined by [RFC1034] and [RFC1035]. Alternative classes may define their own namespaces and rules. Subsets of these rules are defined for specific labels and domain names in section 5. The rules provided in this section specifically apply to the DNS namespace at large, and do not define any formal data-types. 4.2.1. Length Restrictions in the DNS Namespace A label from the DNS namespace is restricted to a minimum of one octet and a maximum of 63 octets, inclusive. A domain name from the DNS namespace is restricted to a minimum of one octet and a maximum of 255 octets, inclusive. Any number of labels may be provided in a domain name, but the maximum length restriction MUST NOT be exceeded. Note that many delegation bodies have defined their own minimum length rules for their zones. For example, the historic generic top-level domains (such as com, net and org) require a minimum of two characters for all immediate delegations, while some of the newer generic TLDs have three- and four-character minimums. Since these rules only affect the delegations within those zones (and not the subordinate delegations from the child zones), this usage is not in conflict with any of the other rules defined in this document, and is expressly allowed. Whenever a domain name is written in longhand form, it SHOULD be restricted to a maximum length which allows a direct conversion to Hall I-D Expires: December 2002 [page 9] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 the DNS format. In particular, a longhand domain name SHOULD allow a length indicator to be added to the first label in the DNS domain name, and SHOULD allow a length indicator to be added to the end of the domain name (representing the root domain), if necessary. In the common case, a longhand domain name will only allow for 253 octets of data so that it can be directly converted to the DNS format. If a longhand domain name uses the escape syntax described in section 4.2.3, the allowable length of the longhand domain name MAY be extended to accommodate the escape sequence, since a multi- byte escape sequence will generally collapse to a single octet in the DNS message. Applications and protocols which need to specify the root domain explicitly MUST allow a single Full-Stop character to be specified in the longhand domain name for this purpose. Other application- or protocol-specific syntaxes MAY also be supported for this purpose, if necessary. Note that this usage is not required to be supported unless the application needs to explicitly reference the root domain for some purpose, but since the root domain is not addressable as a host system, this is not a common scenario. 4.2.2. Characters Restrictions in the DNS Namespace The lower seven-bit range of values (0x00 through 0x7F) from the DNS namespace MUST be interpreted as characters from [ASCII]. The eight-bit range of values (0x80 through 0xFF) are defined in [RFC1034] as opaque octets, with no default character assignments. Therefore, eight-bit values from the DNS namespace MUST NOT be interpreted as any specific charset, characters or encoding, and SHOULD NOT be rendered as such unless the protocol in use has defined a specific data-type which explicitly states otherwise. When a domain name from the DNS namespace is stored, transferred or compared, the capitalization of the [ASCII] characters in that domain name MUST be preserved as they were provided to the current operation. Secondarily, whenever two domain names from the DNS namespace are compared, the [ASCII] characters in the domain names MUST be treated as case-neutral for the purposes of comparison. Note that these rules combine such that the capitalization of the input domain name will be preserved across a search operation. For example, the search input of "A.example.COM" MUST match the stored Hall I-D Expires: December 2002 [page 10] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 domain name of "a.EXAMPLE.com", and the capitalization of the input domain name MUST be used for the resulting output. However, if the queried domain name referenced "HOST.c.EXAMPLE.net", that domain name MUST be provided in its original capitalization. In those scenarios where the input and output domain names are different but they exist in the same branch of the class IN hierarchy, the secondary references to the common domains MUST inherit the capitalization of the input domain name. For example, if the search input of "A.example.COM" matches with "a.EXAMPLE.com" but that domain name only exists as an alias for the domain name of "HOST.b.EXAMPLE.com", then the output domain name MUST be provided as "HOST.b.example.COM", with the capitalization of the input domain name being used to construct the overlapping domain names in the output. These rules guarantee that the output from the compression algorithm defined in [RFC1035] is always valid. Unless a protocol specifically states otherwise, these rules MUST be followed for all applications and protocols which use domain names and labels from the DNS namespace as specific data. 4.2.3. The DNS Namespace Escape Syntax Although DNS uses raw eight-bit codepoint values, less than half of the codepoint values have defined character equivalents from [ASCII] which can be rendered, which means that most of the codepoint values cannot be written in a longhand domain name which supports those values. For example, the octet label data-type supports 256 possible codepoint values (0x00 through 0xFF), while the mailbox label data-type supports all 128 of the seven-bit character codes defined in [ASCII] (0x00 through 0x7F), but only the printable subset of characters from [ASCII] have defined character representations (0x21 through 0x7E). As such, longhand domain names which use these data-types are generally restricted to the printable subset. Furthermore, some of these domain names make use of characters which are "confusing" to applications and/or their resolvers. For example, many email addresses make use of the Full-Stop character within the local-part element, although this character can easily be misinterpreted as a label separator rather than an embedded Full-Stop character. Hall I-D Expires: December 2002 [page 11] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 [RFC1035] defined a syntax for escaping these characters within the zone database, but it does not require (nor imply) that this mechanism should be supported in other applications, with the result being that most of these applications do not adequately support the necessary syntax. This document corrects this shortcoming by requiring that any application which supports the octet label or mailbox label data-types MUST allow longhand domain names to use the escaping syntax defined herein. Note that these syntax rules only apply to the octet label and mailbox label data-types. The remaining data-types have tighter character ranges, and do not contain characters which require escaping. Furthermore, applications MUST NOT allow these other data-types to use the escaping syntax whatsoever, as this could result in unexpected characters being inserted into the label or domain name, thereby triggering unexpected failures in other applications or systems. The escape syntax uses the Reverse-Solidus (0x5C) character as an escape flag, with this flag preceding a printable [ASCII] character or a three-digit decimal value for a specific codepoint value. For example, if a label contains an embedded Full-Stop character, that character may be escaped as either "\." or "\046" (where "46" is the decimal value of the Full-Stop character's codepoint value from [ASCII]). This escape syntax MUST be used to encapsulate Full-Stop (0x2E), Reverse-Solidus (0x5C), Double-Quote (0x22), a non-printing character from [ASCII] (0x00 through 0x20, or 0x7F) or any of the eight-bit codepoint values (0x80 through 0xFF) whenever one of these domain names is written in longhand form. Protocols which exchange the octet or mailbox data-types as textual data MUST support the use of this escaping syntax within that data. Other application- or protocol-specific syntaxes MAY also be supported for this purpose, if necessary. However, DNS messages MUST NOT contain the escape sequences, and MUST always use the raw octet value of the escaped character. As such, the escape syntax MUST be interpreted by an application or a resolver (depending on the resolver's capabilities) before these characters are passed into DNS. Hall I-D Expires: December 2002 [page 12] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 4.3. The Internationalized Namespace These rules represent the allowable syntax of all domain names within the internationalized IN class namespace. Alternative classes may define their own namespaces and rules. Subsets of these rules are defined for specific labels and domain names in section 5. Conversely, the rules provided in this section apply to the internationalized namespace at large, and do not define any formal data-types. Note that the internationalized namespace is a logical namespace, and does not exist in the same way that the DNS namespace exists. Instead of being a direct mapping to the underlying database, the internationalized namespace is defined as a range of UCS character codes which may be accessed or represented with any of several different encoding mechanisms. Mechanisms which have been discussed for this purpose include codecs that convert the UCS character codes into seven-bit [ASCII] sequences compatible with the legacy hostname syntax, UCS transfer encodings such as UTF-8, and legacy charsets which can be mapped to the UCS repertoire. Any of these mechanisms (or any others) can be used to represent, store, transfer and compare domain names in the internationalized namespace, as is necessary for the application or protocol at hand. For example, an internationalized protocol may use UTF-8 domain names as protocol data or arguments, although it may be necessary to convert a domain name into a hostname-compatible encoding whenever a lookup operation is performed. In this scenario, the logical internationalized namespace will be accessed through two different mechanisms, although the canonical domain name will still be represented as the UCS characters. Similarly, conversion between two or more access mechanisms will likely require an intermediate conversion to UCS first, and in this regard, the canonical UCS characters will represent the logical namespace. Note that this document does not define or describe any codecs or namespace-access mechanisms. However, these different mechanisms have affected the structure and syntax rules of the internationalized namespace at large. Hall I-D Expires: December 2002 [page 13] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 4.3.1. Length Restrictions in the i18n Namespace For the purposes of defining boundary conditions, a label in the internationalized namespace is restricted to a minimum of one UCS character and a maximum of 63 UCS characters, while a domain name in the internationalized namespace is restricted to a maximum of 255 UCS characters, inclusive. Note that this rule specifically refers to the canonical UCS characters, rather than any encoded form. Encoding will often result in labels and domain names with a fewer or greater number of octets, depending on the encoding algorithm in use. For example, an application which uses UCS-4 to represent internationalized domain names will require 32 bits for each UCS character, while an application which uses UTF-8 can require up to 48 bits for each character. The canonical UCS characters will frequently require conversion into a transfer encoding which will be subject to its own length requirements. For example, the ASCII-compatible encoding mechanism defined in [PUNYCODE] and [IDNA] is subject to the length restrictions inherent in the DNS namespace. Although all encodings have their own requirements, [IDNA] is the only encoding which is known to have hard limits beyond its control. As such, labels in the internationalized namespace MUST be restricted to lengths which can be encoded in their [IDNA] encoded form, with the resulting sequence being limited to the DNS namespace restrictions defined in section 4.2.1. This rule MUST be enforced regardless of any other codecs or access mechanisms which may be available that offer larger sizes. These rules combine so that an application can restrict the input of a label or domain name (such as in a form) to a certain number of characters, but once the internationalized domain name has been provided and normalized into its canonical form, any subsequent verification of that domain name MUST ensure that the [IDNA] encoded form of that domain name will comply with the DNS namespace limits, which is a maximum of 63 octets for a label, and 255 octets for a domain name. As with the DNS namespace, domain names written in longhand form MUST leave room for any omitted length indicators when these boundary conditions are tested. In particular, this includes the length indicator at the beginning of the first label and the length indicator at the end of the domain name, both of which are frequently omitted from longhand domain names. Hall I-D Expires: December 2002 [page 14] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 4.3.2. Character Restrictions in the i18n Namespace The logical internationalized namespace uses the entire UCS, including character codes which are currently unassigned. Currently the UCS occupies a 21-bit range of character code values, containing tens of thousands of assigned characters, and hundreds of thousands of unassigned characters, although this repertoire is expected to grow in size over time. Internationalized label and domain name data-types MUST declare their own specific ranges of supportable UCS characters, and MUST also define any normalization, case-conversion, and any other transformations which are needed for that data-type. The guidelines and rules for the development of these internationalized domain name data-types are provided in STRINGPREP [STRINGPREP]. In the normal scenario, the data-type rules will result in labels and domain names containing case-specific, strongly-normalized UCS characters. As a result, the internationalized namespace is case- specific, meaning that all storage, transfer, comparison and conversion operations MUST always preserve the capitalization and normalization of the data as it is processed. Since the domain name provided to the original application can be significantly different from the domain name which is subsequently passed to the underlying protocol, the original domain name MUST NOT be provided to any other applications or protocols if at all possible. Note that certain situations are unavoidable, such as a user copying the hostname from a manually-entered URL into an email message, where the original sequence will not reflect any subsequent normalization. However, this type of bleed-over MUST NOT occur if it can be presented by an application with simple measures. If a label ONLY contains character codes from the seven-bit [ASCII] range of characters (U+0000 through U+007F), then that label MUST be treated as a legacy label from the DNS namespace for the purposes of comparison. In other words, labels which only contain seven-bit [ASCII] MUST be compared as case-neutral sequences, while all other labels MUST be compared as case-exact sequences. In all situations, the output capitalization MUST reflect the capitalization of the input domain name (since these labels will have been capitalized and normalized according to Hall I-D Expires: December 2002 [page 15] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 their domain name syntax rules, the answer data will also be provided in the appropriate form if this rule is always followed). 5. The DNS Data-Types As part of the efforts towards strongly defining strict data- types, this document defines syntax rules for different kinds of domain names and labels. Some of the domain name data-types will only contain labels of a single data-type, while some domain name data-types may contain multiple label data-types. For example, a legacy hostname domain name can only contain legacy hostname labels, while an internationalized service locator domain name can contain a mix of different label types. When one of the internationalized data-types is used with an internationalized protocol, one or more encoding syntaxes MUST be specified by the underlying protocol before the data can be exchanged in a meaningful form. Note that RFC 2277 [RFC2277] states that the preferred encoding for internationalized protocol data is UTF-8. When one of the internationalized data-types is used with a legacy protocol which only has explicit support for [ASCII], the internationalized data-type MUST be encoded into an ASCII- compatible form before the data can exchanged. The [IDNA] specification describes one such mechanism, and is the preferred encoding form whenever legacy protocols are required to be used. 5.1. Syntax Validation Each label in a domain name has specific syntax rules which reflect on the data provided in that label. Therefore, applications which validate domain names against a particular data-type SHOULD apply the appropriate syntax rules to each label, as well as validating the domain name in its entirety. While it may appear that this model is overtly ambiguous, the process of determining the appropriate syntax can be fairly simple if the data-types are consistently enforced. For example, most of the domain name data-types use relatively simple sequences of label data-types, and most applications only support a single domain name data-type for any particular protocol usage. Thus, it is a simple matter to determine if the domain name is valid by comparing it to the syntax rules for the expected usage. Hall I-D Expires: December 2002 [page 16] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 In those cases where the application or protocol allows multiple kinds of domain name to be used, it is still possible to apply some fairly simple lexical analysis to determine the domain name which is in use (a service location label is different from a hostname label, while mailbox labels and octet labels may contain specific escape sequences, and so forth). In those cases where multiple data-types are supported and the domain name data-type cannot be determined (this may happen with an application such as "dig" or "nslookup", for example), then the application MAY choose to simply validate the domain name against the relevant namespace and leave it at that. However, this level of detachment SHOULD NOT be the default behavior; applications SHOULD attempt to validate the labels and domain names to the best of their ability using the available syntax rules. Specifically, the syntax of a label SHOULD be validated whenever a resource record is added to the replication master for a zone, or whenever an application first creates a domain name for use within that application or its associated network service. These rules are specifically designed to avoid garbage-in, garbage-out syndrome. Subsequent applications and protocol end-points MAY perform syntax validation of any domain names, and specific application protocols MAY require verification by the application end-points, although this will not be required if the participating end-points have performed the necessary validation. 5.2. Defining New Data-Types Application protocols MAY define any new label or domain name data-types which are needed, although these data-types MUST conform to the rules which govern the controlling namespace, as described in section 4. Application protocols SHOULD reuse one of the hostname syntaxes if at all possible, since these syntaxes have the widest deployment, and this will facilitate faster adoption of the protocol. If a protocol needs to support a broader syntax for uses other than referring to hostnames in the DNS or internationalized namespaces, then the protocol SHOULD indicate which operations or external syntaxes require specific exceptions, and document those exception syntaxes separately if possible. For example, if a protocol is capable of using DNS and LDAP equally, this SHOULD be stated explicitly when the protocol-specific data-types are Hall I-D Expires: December 2002 [page 17] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 defined, and a subset data-type which applies specifically to DNS lookups SHOULD also be defined. 5.3. The Root Label and Domain Name Whenever the DNS message format is used to transport a domain name, [RFC1034] requires the root domain to be specified. However, most applications and protocols do not require or even allow the root domain to be specified as part of the domain names they use internally. In these cases, the applications and protocols will typically leave it up to the local resolver to fully-qualify the domain name as provided, although this process can fail and cause unexpected domain name to be used (see RFC 1535 [RFC1535] for an example and a discussion of this problem). There are two separate considerations here. First is that an application may need to fully-qualify the domain name in order to prevent misinterpretation. Secondarily, some applications and protocols require that the root domain be provided as the complete domain name, or as part of a domain name (such as a service locator domain name associated with the root zone). The root label is defined to suit both of those purposes, where this is needed. It can be used to explicitly terminate a fully-qualified domain name, and it can be used to explicitly represent the root domain as a standalone entity. In a multi-label domain name, the root label is represented by a trailing separator mark. The root label MAY be used at the end of any domain name, regardless of its data-type. In those cases where the root label is provided by itself, the separator mark will specifically represent the root domain of the class IN hierarchy. Note that the root domain is not currently defined as a host (it does not have an IP address), so it cannot currently be used as a connection identifier. Application protocols SHOULD support the use of a standalone root label as an explicit domain name, although this is specifically not required. Where a protocol defines this usage, it SHOULD NOT be a mandatory requirement for all implementations. Other application- or protocol-specific syntaxes MAY also be supported for this purpose, if necessary. Applications MUST follow the protocol-specific guidelines on this subject. Hall I-D Expires: December 2002 [page 18] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 5.4. The Hostname Labels and Domain Names Hostnames identify specific systems and zone partitions by name, and are the most widely used of all the data-types. Hostnames are used as the owner name and data values of almost all the common resource records, are used as the connection identifier for almost all protocol-specific syntaxes and generalized applications, and are used for storing system names in local hosts databases, among other purposes. Since hostnames have such a broad number of uses and potential storage formats, they also have the strictest syntax rules. A hostname is not valid unless each label and the entire domain validate successfully. 5.4.1. Legacy Hostnames A legacy hostname domain name is a sequence of one or more legacy hostname labels, with an optional root label at the end. Applications which use legacy "hostnames" as specific data-types MUST validate the hostname domain name and label sequences separately and cumulatively. Note that the syntax for legacy hostnames has undergone many subtle and varied shifts over the years, with multiple updates and revisions allowing for slightly different syntaxes. This document unifies these definitions and clarifies some ambiguities, and is to be considered the definitive reference on the definition of a valid legacy hostname label and domain name. A legacy hostname label may only contain the Hyphen (0x2D), the numerals "0" through "9" (0x30 through 0x39), the uppercase letters "A" through "Z" (0x41 through 0x5A), and the lowercase letters "a" through "z" (0x61 through 0x7A) from [ASCII]. The first and last character in a hostname label MUST NOT be a Hyphen character, but any other character sequence is valid within the confines of a hostname label. A legacy hostname domain name is a sequence of one or more legacy hostname labels. However, at least one of the labels MUST contain at least one alphabetic character (a domain name which consists entirely of numeric values has the potential to be confused with an IP address, and this rule prevents this ambiguity). A legacy hostname domain name MAY contain an optional root label at the end Hall I-D Expires: December 2002 [page 19] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 of the domain name, although this usage MUST be explicitly allowed by the application protocol in use. With the exception of the optional root label at the end of a domain name, a legacy hostname domain name MUST NOT contain any other label data-types. If any of the labels do not validate as legacy hostname labels, or if the entire domain name does not validate as a legacy hostname domain name, then the entire domain name MUST be rejected for use as a legacy hostname domain name. The length rules associated with the DNS namespace are specifically adopted as the length rules for the legacy hostname label and domain name data-types. This definition updates the definitions provided in [RFC952] and [RFC1123], which had set these lengths at different values. As such, any system which implements a HOSTS.TXT database (or a local equivalent, such as the "/etc/hosts" file on traditional UNIX systems) MUST conform to the length restrictions defined in section 4.2.1. The syntax rules defined above are somewhat tighter than the syntax allowed in [RFC2181]. However, no standards-track network services have defined hostname syntax rules to use the allowable syntax from [RFC2181]. Instead, almost all application protocols and network services use stricter rules which are highly similar to those defined here. Applications and protocols which need to support internationalized hostnames MUST use the syntax defined in section 5.4.2. Applications and protocols which need to support eight-bit octets in their domain names MUST either use the octet label syntax described in section 5.5 or define a new syntax specifically for use with that network service. In either event, new resource records are also likely to be required. 5.4.2. Internationalized Hostnames An internationalized hostname domain name is a sequence of one or more internationalized hostname labels, with an optional root label at the end. Applications MUST validate the domain name and label sequences separately and cumulatively. The syntax for internationalized hostname labels is defined in NAMEPREP [NAMEPREP], while this document defines the syntax for internationalized hostname domain names. Hall I-D Expires: December 2002 [page 20] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 The international hostname label rules defined in NAMEPREP require that a label be lowercased and normalized with Unicode normalization form KC prior to use. Due to the massive number of characters which are available for use with internationalized hostname labels, this document cannot summarize the entire set. Note that an internationalized hostname label which only contains seven-bit codepoint values from the [ASCII] range MUST also validate as a legacy hostname label, using the rules described in section 5.4.1. This is necessary in order for a label to be reusable in both namespaces. An internationalized hostname domain name is a sequence of internationalized hostname labels, with the additional requirement that at least one of the labels MUST contain a non-numeric character. An internationalized hostname domain name MAY contain an optional root label at the end of the domain name, although this usage MUST be explicitly allowed by the application protocol in use. With the exception of the optional root label at the end of a domain name, an internationalized hostname domain name MUST NOT contain any other label data-types. If any of the labels do not validate as internationalized hostname labels, or if the entire domain name does not validate as an internationalized hostname domain name, then the entire domain name MUST be rejected for use as an internationalized hostname domain name. 5.5. The Octet Label and Domain Name The DNS namespace allows labels to contain eight-bit codepoint values, although no standardized representation or interpretation of these values is defined. While [RFC2181] allows these codepoint values to be used with any domain name or label, this document restricts the usage of these values to the octet label data-type. A octet label essentially provides a direct pass-thru mapping to the underlying DNS namespace. No additional restrictions or interpretations are defined. Multiple octet labels may be used in conjunction with multiple legacy hostname labels (with an optional root label at the end of the end of the domain name) to form an octet domain name. Hall I-D Expires: December 2002 [page 21] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 The UCS character repertoire does not provide any mechanisms for specifying raw octet values, but instead only identifies characters and their codepoint values. As such, eight-bit codepoint values are not accessible to applications which use the internationalized namespace. Instead, those applications will be required to use the DNS namespace directly whenever an octet domain name contains eight-bit codes. However, if the domain name only contains seven-bit characters, then that label can be accessed from the internationalized namespace. For this reason, applications and protocols SHOULD give preference to the range of characters defined for legacy hostname labels, as this allows the domain name to be accessed from the largest number of sources. However, applications MUST allow the full eight-bit range of values to be specified if the octet domain name data-type is required for the protocol at hand, with the caveat that these labels will be inaccessible from the internationalized namespace. If a octet label contains a Full-Stop (0x2E), Reverse-Solidus (0x5C), Double-Quote (0x22), a non-printing character from [ASCII] (0x00 through 0x20, or 0x7F) or any of the eight-bit codepoint values (0x80 through 0xFF), then those characters MUST be escaped (using the syntax rules provided in section 4.2.3) whenever the domain name is written in longhand form. Any number of octet labels may be assigned to a leaf-node in the DNS namespace. However, zone delegations use NS and SOA resource records which use hostname labels, so a fully-qualified domain name outside of the root zone MUST contain at least one legacy hostname label. Since this usage allows for a variable number of octet labels, and since applications outside of the domain name system cannot determine the database context of any given label, this can result in some ambiguity. However, no standards-track network services outside of the DNS currently require the use of octets, so this is fairly narrow area. 5.6. The Mailbox Labels and Domain Names A variety of resource records make use of mailbox label and domain name data-types in order to encapsulate email addresses into the domain name system (this model allows email addresses to use the DNS compression service). Although there are no known application protocols outside of DNS which use this data-type, the label type still has to be defined for use with DNS, and is therefore defined with its own syntax in this document. Hall I-D Expires: December 2002 [page 22] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 5.6.1. Legacy Mailboxes The legacy mailbox domain name consists of a single legacy mailbox label followed by one or more legacy hostname labels. In this model, the legacy mailbox label represents the local-part element from an RFC 2822 [RFC2822] email address, while the legacy hostname labels represent the mail domain element from an [RFC2822] email address. When a legacy mailbox domain name is expanded and mapped to an [RFC2822] email address, the legacy mailbox label goes on the left of the "@" separator, while the hostname labels go on the right of the "@" separator. The local-part element is defined in [RFC2822] as being either a dot-atom or a quoted-string. The dot-atom syntax allows for a relatively complete set of [ASCII] punctuation, numbers and alphabetic characters, while the quoted-string syntax allows for nearly all of the other characters from [ASCII] (certain control characters are globally prohibited in [RFC2822], and these apply to the quoted-string syntax as well). In order to accommodate as many email addresses as possible, these characters are defined as valid for a legacy mailbox label as well. However, if a legacy mailbox label contains a Full-Stop (0x2E), Reverse-Solidus (0x5C), Double-Quote (0x22), or any non- printing character from [ASCII] (0x00 through 0x20, or 0x7F), then those characters MUST be escaped using the syntax rules provided in section 4.2.3 whenever the domain name is written out in longhand form. Note that [RFC2821] defines the maximum length of a local-part element as 64 characters, although the maximum length of a legacy label is 63 characters. As a result, not all local-parts can be supported by the legacy mailbox label. Also note that [RFC2822] also defines the syntax of a "mail domain" as the dot-atom data-type, which allows for a larger subset of [ASCII] characters than the legacy hostname data-type allows. However, [RFC2821] also requires that mail domains be queried through DNS with MX and A resource records, both of which specify host systems. As such, the dot-atom syntax has never been usable with the legacy hostname data-type for the purpose of mail routing. The requirements for stronger data-types and syntax checks defined in this document do not affect this fundamental conflict other than to highlight its presence. Hall I-D Expires: December 2002 [page 23] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 5.6.2. Internationalized Mailboxes Legacy mailbox labels MAY be used with internationalized hostname labels to form an internationalized mailbox domain name. In this model, the legacy mailbox label represents a legacy local-part, while the internationalized hostname labels represent an international mail domain. However, note that an internationalized local-part syntax has not yet been defined, and until such a time, an internationalized mailbox label syntax cannot be defined. 5.7. The Service Locator Labels and Domain Names The service locator label and domain name syntax is used to provide service-specific redirection functions for a particular domain name. This usage is specific to DNS, so it does not have general applicability outside of the domain name system, although the label type still has to be supported, and is therefore defined with its own syntax in this document. 5.7.1. Legacy Service Locators The service locator label and domain name syntax is defined in RFC 2782 [RFC2782]. In summary, a legacy service locator domain name consists of two legacy service locator labels which uniquely identify a specific network service, with the remainder of the domain name containing hostname and/or root labels. The service locator labels are identical to the legacy hostname label syntax, with the additional requirement that a service locator label MUST begin with an Underscore (0x5F) character (this usage prevents the SRV resource record's owner domain name from colliding with other owner domain names). In this model, a registered network service is assigned a _service._proto label sequence, with this sequence being appended to the left of a legacy hostname domain name. Application clients can then issue queries for the fully-qualified service locator domain name, with the resulting answer data providing indicating which hosts offer that service on behalf of the queried domain name. Hall I-D Expires: December 2002 [page 24] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Note that zone delegation requires the use of legacy hostname labels, so a fully-qualified domain name for an SRV resource record associated with a domain name outside of the root zone MUST contain at least one legacy hostname label. However, an application can also query for an SRV resource record associated with the root domain itself, and in that scenario, the fully- qualified domain name would be "_service._proto.", with the trailing separator mark explicitly representing the root domain. 5.7.2. Internationalized Service Locators Service locator labels MAY be used with internationalized hostname labels. In this model, the legacy service locator labels represent a known service associated with an international domain. Note that [RFC2277] says that protocol identifiers do not need to be internationalized. As such, there is no requirement foreseen to allow non-ASCII characters in the service locator label syntax. 6. Resource Records and Query Types This section describes the domain name data-types in use with all of the registered resource records and query-types. These definitions include the valid owner domain names for a particular kind of resource record, and also include the valid domain names for the resource record data sections. Note that each of these rules only use domain name data-types, while those data-types are defined by constituent sets of label data-types. 6.1. Resource Records Resource records may be provided in any section of a DNS message. When they are provided in the question section as the query question, they only have owner names. When they are provided in any other section, they have owner names and resource record data. All resource records have owner name syntax rules, while those resource records which also provide domain names in resource record data also have syntax rules for those domain names. All new resource records MUST be defined with syntax rules appropriate to that resource record. Hall I-D Expires: December 2002 [page 25] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: IPv4 Address Resource Record Mnemonic: A Resource Record Code: 1 Defined In: [RFC1035] Owner Name: Hostname Resource Record Data: (no domain name data-types) Resource Record Name: Name Server Resource Record Mnemonic: NS Resource Record Code: 2 Defined In: [RFC1035] Owner Name: Hostname (delegated zone partition) Resource Record Data: Hostname (authoritative server) Note: All domain delegations MUST use the hostname data-type. Resource Record Name: Mail Destination Resource Record Mnemonic: MD Resource Record Code: 3 Defined In: [RFC1035] Owner Name: Hostname (mail domain) Resource Record Data: Hostname (delivery server) Note: Obsoleted and deprecated by RFC1035 in favor of MX Resource Record Name: Mail Forwarder Resource Record Mnemonic: MF Resource Record Code: 4 Defined In: [RFC1035] Owner Name: Hostname (mail domain) Resource Record Data: Hostname (relay server) Note: Obsoleted and deprecated by RFC1035 in favor of MX Resource Record Name: Canonical Name Resource Record Mnemonic: CNAME Resource Record Code: 5 Defined In: [RFC1035] Owner Name: inherited from target owner name Resource Record Data: inherited from target owner name Note: The owner domain name and resource record data are both inherited from the target of the CNAME resource record. For example, if a CNAME resource record references an A resource record, then the owner name and the resource record data both use the Hostname domain name data-type. However, if a CNAME resource record references an SRV resource record, then the owner name and the resource record data both use the Service Locator domain name data-type. Hall I-D Expires: December 2002 [page 26] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: Start-of-Authority Resource Record Mnemonic: SOA Resource Record Code: 6 Defined In: [RFC1035] Owner Name: Hostname Resource Record Data: multiple fields (see below) MNAME: Hostname (replication master server) RNAME: Mailbox (administrator's email address) SERIAL: (no domain name data-types) REFRESH: (no domain name data-types) RETRY: (no domain name data-types) EXPIRE: (no domain name data-types) MINIMUM: (no domain name data-types) Resource Record Name: Mailbox Resource Record Mnemonic: MB Resource Record Code: 7 Defined In: [RFC1035] Owner Name: Mailbox (recipient email address) Resource Record Data: Hostname (delivery server) Resource Record Name: Mail Group Resource Record Mnemonic: MG Resource Record Code: 8 Defined In: [RFC1035] Owner Name: Mailbox (original email address) Resource Record Data: Mailbox (expanded email address) Resource Record Name: Mail Rename Resource Record Mnemonic: MR Resource Record Code: 9 Defined In: [RFC1035] Owner Name: Mailbox (original email address) Resource Record Data: Mailbox (new email address) Resource Record Name: Null Resource Record Mnemonic: NULL Resource Record Code: 10 Defined In: [RFC1035] Owner Name: Legacy Octets Resource Record Data: (no domain name data-types) Hall I-D Expires: December 2002 [page 27] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: Well-Known Services Resource Record Mnemonic: WKS Resource Record Code: 11 Defined In: [RFC1035] Owner Name: Hostname Resource Record Data: (no domain name data-types) Resource Record Name: Pointer Resource Record Mnemonic: PTR Resource Record Code: 12 Defined In: [RFC1035] Owner Name: inherited from target owner name Resource Record Data: inherited from target owner name Note: Although PTR resource records are most often used to provide reverse-lookup mappings, the data can be used for any domain name which needs to point to another domain name. As such, the owner name and the resource record data must both inherit the domain name data-type in use with the destination. Resource Record Name: Host Information Resource Record Mnemonic: HINFO Resource Record Code: 13 Defined In: [RFC1035] Owner Name: Hostname Resource Record Data: (no domain name data-types) Resource Record Name: Mail List Information Resource Record Mnemonic: MINFO Resource Record Code: 14 Defined In: [RFC1035] Owner Name: Mailbox (mailing list primary address) Resource Record Data: multiple fields (see below) RMAILBOX: Mailbox / Root (responsible party address) EMAILBOX: Mailbox / Root (error-handler mailbox) Note: MINFO defines an application-specific interpretation for the root domain in the resource record as an alternative to the Mailbox data-type. Hall I-D Expires: December 2002 [page 28] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: Mail Exchange Resource Record Mnemonic: MX Resource Record Code: 15 Defined In: [RFC1035] Owner Name: Hostname (mail domain) Resource Record Data: multiple fields (see below) PREFERENCE: (no domain name data-types) DESTINATION: Hostname (mail server) Resource Record Name: Text Resource Record Mnemonic: TXT Resource Record Code: 16 Defined In: [RFC1035] Owner Name: Legacy Octets Resource Record Data: (no domain name data-types) Note: The TXT resource record is commonly used as a proving ground for new resource records, and this must continue to be supported. Resource Record Name: Responsible Person Resource Record Mnemonic: RP Resource Record Code: 17 Defined In: RFC 1183 [RFC1183] Owner Name: Hostname Resource Record Data: multiple fields (see below) RMAILBOX: Mailbox (responsible person contact address) DETAILS: Legacy Octets (pointer to TXT record) Note: Binding this resource record to the hostname data-type may artificially limits its usefulness, although it results in greater predictability and consistency in the internationalized namespace. Resource Record Name: AFS Database Entry Resource Record Mnemonic: AFSDB Resource Record Code: 18 Defined In: [RFC1183] Owner Name: Hostname Resource Record Data: multiple fields (see below) PREFERENCE: (no domain name data-types) DESTINATION: Hostname (AFS server) Hall I-D Expires: December 2002 [page 29] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: X.25 Number Resource Record Mnemonic: X.25 Resource Record Code: 19 Defined In: [RFC1183] Owner Name: Hostname (host) Resource Record Data: (no domain name data-types) Resource Record Name: ISDN Number Resource Record Mnemonic: ISDN Resource Record Code: 20 Defined In: [RFC1183] Owner Name: Hostname (host) Resource Record Data: (no domain name data-types) Resource Record Name: Route-Through Resource Record Mnemonic: RT Resource Record Code: 21 Defined In: [RFC1183] Owner Name: Hostname (host) Resource Record Data: multiple fields (see below) PREFERENCE: (no domain name data-types) DESTINATION: Hostname (next-hop host) Resource Record Name: OSI NSAP Address Resource Record Mnemonic: NSAP Resource Record Code: 22 Defined In: RFC 1706 [RFC1706] Owner Name: Hostname Resource Record Data: (no domain name data-types) Resource Record Name: OSI NSAP Pointer Resource Record Mnemonic: NSAP-PTR Resource Record Code: 23 Defined In: [RFC1706] Owner Name: Hostname (hexadecimal NSAP address) Resource Record Data: Hostname (host) Resource Record Name: Signature Resource Record Mnemonic: SIG Resource Record Code: 24 Defined In: RFC 2535 [RFC2535] and RFC 2931 [RFC2931] Owner Name: Hostname (?) Resource Record Data: (no domain name data-types) Hall I-D Expires: December 2002 [page 30] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: Public Key Resource Record Mnemonic: KEY Resource Record Code: 25 Defined In: [RFC2535] Owner Name: Legacy Octets Resource Record Data: (no domain name data-types) Note: The owner name must be treated as unstructured, since the KEY resource record may be bound to any domain name. Resource Record Name: Pointer to X.400 Mapping Resource Record Mnemonic: PX Resource Record Code: 26 Defined In: RFC 2163 [RFC2163] Owner Name: Hostname (encoded X.400 mail domain) Resource Record Data: multiple fields (see below) RFC822-MAIL: Hostname (mail domain) X400-MAIL: Hostname (mail domain) Resource Record Name: Geographical Position Resource Record Mnemonic: GPOS Resource Record Code: 27 Defined In: RFC 1712 [RFC1712] Owner Name: Hostname Resource Record Data: (no domain name data-types) Resource Record Name: IPv6 Simple Address Resource Record Mnemonic: AAAA Resource Record Code: 28 Defined In: RFC 1886 [RFC1886] Owner Name: Hostname Resource Record Data: (no domain name data-types) Resource Record Name: Location Resource Record Mnemonic: LOC Resource Record Code: 29 Defined In: RFC 1876 [RFC1876] Owner Name: Hostname Resource Record Data: (no domain name data-types) Hall I-D Expires: December 2002 [page 31] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: Next Record Resource Record Mnemonic: NXT Resource Record Code: 30 Defined In: [RFC2535] Owner Name: Legacy Octets Resource Record Data: multiple fields (see below) NEXT-OWNER: Legacy Octets (the next domain name) TYPE: (no domain name data-types) Note: The owner name must be treated as unstructured, since the NXT resource record may be bound to any domain name. Resource Record Name: Service Locator Resource Record Mnemonic: SRV Resource Record Code: 33 Defined In: [RFC2782] Owner Name: Service Locator Resource Record Data: multiple fields (see below) PRIORITY: (no domain name data-types) WEIGHT: (no domain name data-types) PORT: (no domain name data-types) TARGET: Hostname (target server) Resource Record Name: ATM Address Resource Record Mnemonic: ATMA Resource Record Code: 34 Defined In: N/A (see ATM Forum standards) Owner Name: Hostname Resource Record Data: (no domain name data-types) Resource Record Name: Naming Authority Pointer Resource Record Mnemonic: NAPTR Resource Record Code: 35 Defined In: RFC 2915 [RFC2915] Owner Name: Hostname Resource Record Data: multiple fields (see below) ORDER: (no domain name data-types) PREFERENCE: (no domain name data-types) FLAGS: (no domain name data-types) SERVICE: (no domain name data-types) REGEXPS: (no domain name data-types) REPLACEMENT: Hostname / Service Locator (see notes) Note: The domain name provided in the REPLACEMENT sub-field can reference a NAPTR, SRV or A resource record by its owner name, depending on the value of the FLAGS sub-field. Hall I-D Expires: December 2002 [page 32] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: Key Exchange Resource Record Mnemonic: KX Resource Record Code: 36 Defined In: RFC 2230 [RFC2230] Owner Name: Legacy Octets (any domain name) Resource Record Data: multiple fields (see below) PREFERENCE: (no domain name data-types) DESTINATION: Hostname (key server) Note: The owner name must be treated as unstructured, since the KX resource record may be bound to any domain name. Resource Record Name: Certificate Resource Record Mnemonic: CERT Resource Record Code: 37 Defined In: RFC 2538 [RFC2538] Owner Name: Hostname Resource Record Data: (no domain name data-types) Note: The owner name must be treated as unstructured, since the CERT resource record may be bound to any domain name. Resource Record Name: IPv6 Complex Address Resource Record Mnemonic: A6 Resource Record Code: 38 Defined In: RFC 2874 [RFC2874] Owner Name: Hostname (host) Resource Record Data: (no domain name data-types) Resource Record Name: Domain Name Redirection Resource Record Mnemonic: DNAME Resource Record Code: 39 Defined In: RFC 2672 [RFC2672] Owner Name: Hostname Resource Record Data: Hostname Resource Record Name: Extended Option Resource Record Mnemonic: OPT Resource Record Code: 41 Defined In: RFC 2671 [RFC2671] Owner Name: Root Resource Record Data: (no domain name data-types) Hall I-D Expires: December 2002 [page 33] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Resource Record Name: Transaction Key Resource Record Mnemonic: TKEY Resource Record Code: 249 Defined In: RFC 2930 [RFC2930] Owner Name: Legacy Octets Resource Record Data: (no domain name data-types) Note: The owner name must be treated as unstructured, since the TKEY resource record may be bound to any domain name. Resource Record Name: Transaction Signature Resource Record Mnemonic: TSIG Resource Record Code: 250 Defined In: RFC 2845 [RFC2845] Owner Name: Legacy Octets Resource Record Data: (no domain name data-types) Note: The owner name must be treated as unstructured, since the TSIG resource record may be bound to any domain name. 6.2. Query Types Apart from the resource records defined in section 6.1 above, there are also a handful of query types. Query types are only provided in the question section of a DNS message, and do not have resource record data. However, their owner names have domain name data-types which require standardization. All new query-types MUST be defined with syntax rules appropriate to that query-type. Query Name: Incremental Transfer Query Mnemonic: IXFR Query Code: 251 Defined In: RFC 1995 [RFC1995] Owner Name: Hostname Query Name: Zone Transfer Query Mnemonic: AXFR Query Code: 252 Defined In: [RFC1035] Owner Name: Hostname Hall I-D Expires: December 2002 [page 34] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 Query Name: Mailbox Records Query Mnemonic: MAILB Query Code: 253 Defined In: [RFC1035] Owner Name: Mailbox Note: This query-type requests all of the Mailbox, Mail Group, Mail Rename and Mail List resource records associated with an email address. Query Name: Mail Transfer Records Query Mnemonic: MAILA Query Code: 254 Defined In: [RFC1035] Owner Name: Hostname Note: Obsoleted and deprecated by RFC1035 in favor of MX, but used to request all of the Mail Forwarder and Mail Destination resource records associated with a mail domain. Query Name: All Records Query Mnemonic: "*" or ALL Query Code: 255 Defined In: [RFC1035] Owner Name: Hostname 7. Security Considerations This document does not change any on-the-wire formats, and therefore does not introduce any new security risks within the affected protocols. However, it is the author's hope that by defining strict syntaxes for domain names and labels that overall security can be improved as a result of higher predictability and better development practices. 8. IANA Considerations This document requires the use of an EDNS extended label type identification code. This document uses the b000011 ELT code. Hall I-D Expires: December 2002 [page 35] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 9. References [RFC608] RFC 608, "HOST NAMES ON-LINE", M. Kudlick, January 1974. [RFC810] RFC 810, "DoD INTERNET HOST TABLE SPECIFICATION", E. Feinler et al, March 1982. [RFC882] RFC 882, "DOMAIN NAMES - CONCEPTS and FACILITIES", P. Mockapetris, November 1983. [RFC952] RFC 952, "DOD INTERNET HOST TABLE SPECIFICATION", K. Harrenstien et al, October 1985. [RFC1034] RFC 1034, "DOMAIN NAMES - CONCEPTS and FACILITIES", P. Mockapetris, November 1987. [RFC1123] RFC 1123, "Requirements for Internet Hosts -- Application and Support", R. Braden, October 1989. [RFC2181] RFC 2181, "Clarifications to the DNS Specification", R. Elz et al, July 1997. [ASCII] "ANSI X3.4-1968. USA Standard Code for Information Interchange", ANSI. [RFC883] RFC 883, "DOMAIN NAMES - IMPLEMENTATION AND SPECIFICATION", P. Mockapetris, November 1983. [RFC1035] RFC 1035, "DOMAIN NAMES - IMPLEMENTATION AND SPECIFICATION", P. Mockapetris, November 1987. [ISO-10646] "ISO/IEC 10646-1:2000. International Standard -- Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane" and "Part 2: Supplementary Planes", ISO. [UNICODE] "The Unicode Consortium, The Unicode Standard, Version 3.0", Addison-Wesley: Reading, MA, 2000. Update to version 3.1, 2001. Update to version 3.2, 2002. [PUNYCODE] Internet-Draft, "Punycode:An encoding of Unicode for use with IDNA", A. Costello, May 2002. Hall I-D Expires: December 2002 [page 36] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 [IDNA] Internet-Draft, "Internationalizing Domain Names In Applications (IDNA)", P. Faltstrom et al, May 2002. [STRINGPREP] Internet-Draft, "Preparation of Internationalized Strings", P. Hoffman et al, May 2002. [NAMEPREP] Internet-Draft, "Nameprep: A Stringprep Profile for Internationalized Domain Names", P. Hoffman et al, May 2002. [RFC1535] RFC 1535, "A Security Problem and Proposed Correction With Widely Deployed DNS Software", E. Gavron, October 1993. [RFC2821] RFC 2821, "Simple Mail Transfer Protocol", J. Klensin, April 2001. [RFC2822] RFC 2822, "Internet Message Format", P. Resnick, April 2001. [RFC2782] RFC 2782, "A DNS RR for specifying the location of services (DNS SRV)", A. Gulbrandsen et al, February 2000. [RFC2277] RFC 2277, "IETF Policy on Character Sets and Languages", H. Alvestrand, January 1998. [RFC1183] RFC 1183, "New DNS RR Definitions", C. Everhart et al, October 1990. [RFC1706] RFC 1706, "DNS NSAP Resource Records", B. Manning et al, October 1994. [RFC2535] RFC 2535, "Domain Name System Security Extensions", D. Eastlake, March 1999. [RFC2931] RFC 2931, "DNS Request and Transaction Signatures ( SIG(0)s )", D. Eastlake, September 2000. [RFC2163] RFC 2163, "Using the Internet DNS to Distribute MIXER Conformant Global Address Mapping (MCGAM)", C. Allocchio, January 1998. [RFC1712] RFC 1712, "DNS Encoding of Geographical Location", C. Farrell et al, November 1994. Hall I-D Expires: December 2002 [page 37] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 [RFC1886] RFC 1886, "DNS Extensions to support IP version 6", S. Thomson et al, December 1995. [RFC1876] RFC 1876, "A Means for Expressing Location Information in the Domain Name System", C. Davis et al, January 1996. [RFC2915] RFC 2915, "The Naming Authority Pointer (NAPTR) DNS Resource Record", M. Mealling et al, September 2000. [RFC2230] RFC 2230, "Key Exchange Delegation Record for the DNS", R. Atkinson, November 1997. [RFC2538] RFC 2538, "Storing Certificates in the Domain Name System (DNS)", D. Eastlake et al, March 1999. [RFC2874] RFC 2874, "DNS Extensions to Support IPv6 Address Aggregation and Renumbering", M. Crawford et al, July 2000. [RFC2672] RFC 2672, "Non-Terminal DNS Name Redirection", M. Crawford, August 1999. [RFC2671] RFC 2671, "Extension Mechanisms for DNS (EDNS0)", P. Vixie, August 1999. [RFC2930] RFC 2930, "Secret Key Establishment for DNS (TKEY RR)", D. Eastlake, September 2000. [RFC2845] RFC 2845, "Secret Key Transaction Authentication for DNS (TSIG)", P. Vixie et al, May 2000. [RFC1995] RFC 1995, "Incremental Zone Transfer in DNS", M. Ohta, August 1996. 10. Acknowledgements The author made multiple attempts at avoiding this work. David Hopwood and Mark Andrews are credited with arguing that it needed to be done, and John Klensin is credited with providing helpful feedback on how it should be done. Hall I-D Expires: December 2002 [page 38] INTERNET-DRAFT draft-hall-dns-datatypes-00.txt June 2002 11. Author's Address Eric A. Hall ehall@ehsco.com Hall I-D Expires: December 2002 [page 39]