W3C

XHTML™ 2.0

W3C Working Draft 27 May 2005

This version:
http://www.w3.org/TR/2005/WD-xhtml2-20050527
Latest version:
http://www.w3.org/TR/xhtml2
Previous version:
http://www.w3.org/TR/2004/WD-xhtml2-20040722
Diff-marked version:
xhtml2-diff.html
Editors:
Jonny Axelsson, Opera Software
Mark Birbeck, x-port.net
Micah Dubinko, Invited Expert
Beth Epperson, Websense
Masayasu Ishikawa, W3C
Shane McCarron, Applied Testing and Technology
Ann Navarro, WebGeek, Inc.
Steven Pemberton, CWI (HTML Working Group Chair)

This document is also available in these non-normative formats: Single XHTML file, PostScript version, PDF version, ZIP archive, and Gzip'd TAR archive.


Abstract

XHTML 2 is a general-purpose markup language designed for representing documents for a wide range of purposes across the World Wide Web. To this end it does not attempt to be all things to all people, supplying every possible markup idiom, but to supply a generally useful set of elements.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is the seventh public Working Draft of this specification. It should in no way be considered stable, and should not be normatively referenced for any purposes whatsoever. This version includes an early implementation of XHTML 2.0 in RELAX NG [ RELAXNG ], but does not include the implementations in DTD or XML Schema form. Those will be included in subsequent versions, once the content of this language stabilizes.

Formal issues and error reports on this specification shall be submitted to www-html-editor@w3.org (archive). It is inappropriate to send discussion email to this address. Public discussion may take place on www-html@w3.org (archive). To subscribe, send an email to www-html-request@w3.org with the word subscribe in the subject line.

This document has been produced by the W3C HTML Working Group (members only) as part of the W3C HTML Activity. The goals of the HTML Working Group are discussed in the HTML Working Group charter.

This document was produced under the 24 January 2002 CPP as amended by the W3C Patent Policy Transition Procedure. The Working Group maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than "work in progress."

List of Issues

  1. [PR #7723] On introducing a fallback attribute
  2. [PR #7336] Identifying XHTML version in absence of DTDs
  3. [PR #7442] xml:id
  4. [PR #7661] [XHTML2] Constraining attribute relationship
  5. Need a normative definition for the version attribute
  6. [PR #7665] [XHTML2] Proposal: block[@kind] element
  7. [PR #7714] Concerning XHTML 2.0's blockquote and blockcode
  8. [PR #7662] Navigation Lists
  9. [PR #7663] [XHTML2] 11.3. The ol , and ul elements
  10. [PR #7664] WD-xhtml2-20040722: Some navigation list requirements (IMHO)
  11. [PR #670] Entity management: do we still need it?
  12. [PR #671] Character entities: do we still need them?

1. Introduction

This section is informative.

1.1. What is XHTML 2?

XHTML 2 is a general-purpose markup language designed for representing documents for a wide range of purposes across the World Wide Web. To this end it does not attempt to be all things to all people, supplying every possible markup idiom, but to supply a generally useful set of elements, with the possibility of extension using the class and role attributes on the span and div elements in combination with style sheets, and attributes from the metadata attributes collection.

1.1.1. Design Aims

In designing XHTML 2, a number of design aims were kept in mind to help direct the design. These included:

1.1.2. Backwards compatibility

Because earlier versions of HTML were special-purpose languages, it was necessary to ensure a level of backwards compatibility with new versions so that new documents would still be usable in older browsers. However, thanks to XML and style sheets, such strict element-wise backwards compatibility is no longer necessary, since an XML-based browser (which, at the time of writing, means more than 95% of browsers in use) can process new markup languages without having to be updated. Much of XHTML 2 already works in existing browsers, but not all of it: just as when forms and tables were added to HTML, and people had to wait for new versions of browsers before being able to use the new facilities, some parts of XHTML 2, principally XForms and XML Events, still require user agents that understand that functionality.

1.1.3. XHTML 2 and Presentation

The very first version of HTML was designed to represent the structure of a document, not its presentation. Even though presentation-oriented elements were later added to the language by browser manufacturers, HTML is at heart a document structuring language. XHTML 2 takes HTML back to these roots, by removing all presentation elements, and subordinating all presentation to style sheets. This gives greater flexibility, greater accessibility, more device independence, and more powerful presentation possibilities, since style sheets can do more than the presentational elements of HTML ever did.

1.1.4. XHTML 2 and Linking

The original versions of HTML relied upon built-in knowledge on the part of User Agents and other document processors. While much of this knowledge had to do with presentation (see above), the bulk of the remainder had to do with the relationships between documents — so called "linking".

A variety of W3C and other efforts, most notably [ XLINK ], attempted to create a grammar for defining the characteristics of linking. Unfortunately, these grammars all fall short of the requirements of XHTML. The community is continuing in its efforts to create a comprehensive grammar that describes link characteristics.

The HTML Working Group has determined that such a grammar, while generally useful, is not required for the definition of XHTML 2. Instead, this document is explicit in the characteristics of the elements and attributes that are used to connect to other resources. The Working Group has taken this course because 1) the problem with XHTML 2 is well bounded, 2) the general solution is slow in coming, and 3) it will be easier for implementors to support and users to rely upon.

1.2. Major Differences with XHTML 1

XHTML 2 is designed to be recognizable to the HTML and XHTML 1 author, while correcting errors and insufficiencies identified in earlier versions of the HTML family, and taking the opportunity to make improvements.

The most visible changes are the following:

1.3. What are the XHTML 2 Modules?

XHTML 2 is a member of the XHTML Family of markup languages. It is an XHTML Host Language as defined in XHTML Modularization. As such, it is made up of a set of XHTML Modules that together describe the elements and attributes of the language, and their content model. XHTML 2 updates many of the modules defined in XHTML Modularization 1.0 [ XHTMLMOD ], and includes the updated versions of all those modules and their semantics. XHTML 2 also uses modules from Ruby [ RUBY ], XML Events [ XMLEVENTS ], and XForms [ XFORMS ].

The modules defined in this specification are largely extensions of the modules defined in XHTML Modularization 1.0. This specification also defines the semantics of the modules it includes. This means that, unlike earlier versions of XHTML, which relied upon the semantics defined in HTML 4 [ HTML4 ], all of the semantics for XHTML 2 are defined either in this specification or in the specifications that it normatively references.

Even though the XHTML 2 modules are defined in this specification, they are available for use in other XHTML family markup languages. Over time, it is possible that the modules defined in this specification will migrate into the XHTML Modularization specification.

2. Terms and Definitions

This section is normative.

While some terms are defined in place, the following definitions are used throughout this document. Familiarity with the W3C XML 1.0 Recommendation [ XML ] is highly recommended.

abstract module
a unit of document type specification corresponding to a distinct type of content, corresponding to a markup construct reflecting this distinct type.
content model
the declared markup structure allowed within instances of an element type. XML 1.0 differentiates two types: elements containing only element content (no character data) and mixed content (elements that may contain character data optionally interspersed with child elements). The latter are characterized by a content specification beginning with the "#PCDATA" string (denoting character data).
deprecated
a feature marked as deprecated is in the process of being removed from this recommendation. Portable documents should not use features marked as deprecated.
document model
the effective structure and constraints of a given document type. The document model constitutes the abstract representation of the physical or semantic structures of a class of documents.
document type
a class of documents sharing a common abstract structure. The ISO 8879 [ SGML ] definition is as follows: "a class of documents having similar characteristics; for example, journal, article, technical manual, or memo. (4.102)"
document type definition (DTD)
a formal, machine-readable expression of the XML structure and syntax rules to which a document instance of a specific document type must conform; the schema type used in XML 1.0 to validate conformance of a document instance to its declared document type. The same markup model may be expressed by a variety of DTDs.
driver
a generally short file used to declare and instantiate the modules of a DTD. A good rule of thumb is that a DTD driver contains no markup declarations that comprise any part of the document model itself.
element
an instance of an element type.
element type
the definition of an element, that is, a container for a distinct semantic class of document content.
entity
an entity is a logical or physical storage unit containing document content. Entities may be composed of parseable XML markup or character data, or unparsed (i.e., non-XML, possibly non-textual) content. Entity content may be either defined entirely within the document entity ("internal entities") or external to the document entity ("external entities"). In parsed entities, the replacement text may include references to other entities.
entity reference
a mnemonic string used as a reference to the content of a declared entity (e.g., "&amp;" for "&", "&lt;" for "<", "&copy;" for "©".)
facilities
Facilities are elements, attributes, and the semantics associated with those elements and attributes.
focusable
Elements are considered "focusable" if they are visible (e.g., have the equivalent of the [ CSS2 ] property of "display" with a value other than none), are not disabled (see [ XFORMS ]), and either 1) have an href attribute or 2) are considered a form control as defined in [ XFORMS ].
generic identifier
the name identifying the element type of an element. Also, element type name.
hybrid document
A hybrid document is a document that uses more than one XML namespace. Hybrid documents may be defined as documents that contain elements or attributes from hybrid document types.
instantiate
to replace an entity reference with an instance of its declared content.
markup declaration
a syntactical construct within a DTD declaring an entity or defining a markup structure. Within XML DTDs, there are four specific types: entity declaration defines the binding between a mnemonic symbol and its replacement content; element declaration constrains which element types may occur as descendants within an element (see also content model); attribute definition list declaration defines the set of attributes for a given element type, and may also establish type constraints and default values; notation declaration defines the binding between a notation name and an external identifier referencing the format of an unparsed entity.
markup model
the markup vocabulary (i.e., the gamut of element and attribute names, notations, etc.) and grammar (i.e., the prescribed use of that vocabulary) as defined by a document type definition (i.e., a schema). The markup model is the concrete representation in markup syntax of the document model, and may be defined with varying levels of strict conformity. The same document model may be expressed by a variety of markup models.
module
an abstract unit within a document model expressed as a DTD fragment, used to consolidate markup declarations to increase the flexibility, modifiability, reuse and understanding of specific logical or semantic structures.
modularization
an implementation of a modularization model; the process of composing or de-composing a DTD by dividing its markup declarations into units or groups to support specific goals. Modules may or may not exist as separate file entities (i.e., the physical and logical structures of a DTD may mirror each other, but there is no such requirement).
modularization model
the abstract design of the document type definition (DTD) in support of the modularization goals, such as reuse, extensibility, expressiveness, ease of documentation, code size, consistency and intuitiveness of use. It is important to note that a modularization model is only orthogonally related to the document model it describes, so that two very different modularization models may describe the same document type.
parameter entity
an entity whose scope of use is within the document prolog (i.e., the external subset/DTD or internal subset). Parameter entities are disallowed within the document instance.
parent document type
A parent document type of a hybrid document is the document type of the root element.
tag
descriptive markup delimiting the start and end (including its generic identifier and any attributes) of an element.

3. Conformance Definition

This section is normative.

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [ RFC2119 ].

3.1. Document Conformance

In this document, the use of the word 'schema' refers to any definition of the syntax of XHTML 2, regardless of the definition language used.

3.1.1. Strictly Conforming Documents

A strictly conforming XHTML 2.0 document is a document that requires only the facilities described as mandatory in this specification. Such a document must meet all the following criteria:

  1. The document must conform to the constraints expressed in the schemas in Appendix B - XHTML 2.0 RELAX NG Definition , Appendix D - XHTML 2.0 Schema and Appendix F - XHTML 2.0 Document Type Definition .

  2. The local part of the root element of the document must be html .

  3. The start tag of the root element of the document must explicitly contain an xmlns declaration for the XHTML 2.0 namespace [ XMLNS ]. The namespace URI for XHTML 2.0 is defined to be http://www.w3.org/2002/06/xhtml2/ .

    The start tag must also contain an xsi:schemaLocation attribute. The schema location for XHTML 2.0 is defined to be http://www.w3.org/MarkUp/SCHEMA/xhtml2.xsd .

    Sample root element

    <html xmlns="http://www.w3.org/2002/06/xhtml2/" xml:lang="en"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.w3.org/2002/06/xhtml2/
                              http://www.w3.org/MarkUp/SCHEMA/xhtml2.xsd">
    
  4. There should be a DOCTYPE declaration in the document prior to the root element. If present, the public identifier included in the DOCTYPE declaration must reference the DTD found in Appendix F using its Public Identifier. The system identifier may be modified appropriately.

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN"
        "http://www.w3.org/MarkUp/DTD/xhtml2.dtd">
    
    

Example of an XHTML 2.0 document

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" 
                    href="http://www.w3.org/MarkUp/style/xhtml2.css"?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 2.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml2.dtd">
<html xmlns="http://www.w3.org/2002/06/xhtml2/" xml:lang="en"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2002/06/xhtml2/
                          http://www.w3.org/MarkUp/SCHEMA/xhtml2.xsd">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>Moved to <a href="http://example.org/">example.org</a>.</p>

  </body>
</html>

Note that in this example, the XML declaration is included. An XML declaration like the one above is not required in all XML documents. XHTML document authors should use XML declarations in all their documents. XHTML document authors must use an XML declaration when the character encoding of the document is other than the default UTF-8 or UTF-16 and no encoding is specified by a higher-level protocol.
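The root element criteria above (items 2 and 3) can be checked mechanically. The following is a minimal sketch in Python, not a validator: it uses the standard xml.etree.ElementTree module, the function name check_root_element is hypothetical, and it assumes the document is supplied as bytes. ElementTree does not expose xmlns declarations as attributes, so the sketch tests the namespace the root element ends up in rather than whether the declaration appears explicitly on the start tag.

```python
import xml.etree.ElementTree as ET

# Values taken from the conformance criteria in this section.
XHTML2_NS = "http://www.w3.org/2002/06/xhtml2/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XHTML2_XSD = "http://www.w3.org/MarkUp/SCHEMA/xhtml2.xsd"


def check_root_element(document):
    """Return a list of problems found on the root element (empty if none).

    Hypothetical helper: it covers only the root-element criteria and does
    not validate the document against any schema.
    """
    problems = []
    root = ET.fromstring(document)

    # ElementTree reports a namespaced tag as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
    else:
        namespace, local_name = "", root.tag

    if local_name != "html":
        problems.append("local part of the root element is not 'html'")
    if namespace != XHTML2_NS:
        problems.append("root element is not in the XHTML 2.0 namespace")

    location = root.get("{%s}schemaLocation" % XSI_NS)
    if location is None:
        problems.append("xsi:schemaLocation is missing")
    else:
        # The value is a white-space-separated list of
        # (namespace URI, schema location) pairs.
        tokens = location.split()
        pairs = dict(zip(tokens[0::2], tokens[1::2]))
        if pairs.get(XHTML2_NS) != XHTML2_XSD:
            problems.append("xsi:schemaLocation does not name the XHTML 2.0 schema")
    return problems
```

An empty list means the root element satisfies the checks sketched here; it says nothing about the remaining criteria.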

3.2. XHTML Family User Agent Conformance

A conforming user agent must meet all of the following criteria:

  1. The user agent must parse and evaluate an XHTML 2 document for well-formedness. If the user agent claims to be a validating user agent, it must also validate documents against a referenced schema according to [ XML ].

  2. When the user agent claims to support facilities defined within this specification or required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.

  3. A user agent must only recognize attributes of type ID (e.g., the id attribute on most XHTML 2 elements) as fragment identifiers.

  4. If a user agent encounters an element it does not recognize, it must continue to process the content of that element.

  5. If a user agent encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).

  6. If a user agent encounters an attribute value it doesn't recognize, it must use the default attribute value.

  7. When rendering content, user agents that encounter characters or character entity references that are recognized but not renderable should display the document in such a way that it is obvious to the user that normal rendering has not taken place.

  8. White space must be handled according to the rules of [ XML ]. All XHTML 2 elements preserve whitespace.

    The user agent must use the definition from CSS for processing white space characters [ CSS3-TEXT ].

  9. In the absence of a style-sheet, including user agents that do not process style sheets, the default visual presentation should be as if the user agent used the CSS style sheet specified in Appendix H.
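Criteria 5 and 6 above describe how unrecognized attributes and attribute values are handled. The following is a minimal sketch of that behavior in Python. The function name effective_attributes and the shape of the known table (attribute name mapped to a set of allowed values, or None for free text, plus a default) are assumptions made for illustration, and the table contents shown are examples only, not definitions from this specification.

```python
def effective_attributes(specified, known):
    """Apply the error-handling rules for attributes to one element.

    specified: dict of attribute name -> value as written in the document.
    known:     hypothetical table, name -> (allowed values or None, default).
    """
    effective = {}
    for name, (allowed, default) in known.items():
        value = specified.get(name, default)
        # Criterion 6: an unrecognized value is replaced by the default.
        if allowed is not None and value not in allowed:
            value = default
        if value is not None:
            effective[name] = value
    # Criterion 5: attributes absent from `known` are never copied over,
    # so the whole attribute specification is ignored.
    return effective


# Illustrative table only; the values here are assumptions.
EXAMPLE_KNOWN = {
    "dir": ({"ltr", "rtl"}, "ltr"),
    "title": (None, None),
}

print(effective_attributes({"dir": "sideways", "blink": "yes"}, EXAMPLE_KNOWN))
```

The sketch also fills in the default when an attribute is absent, which is a simplifying assumption rather than something the criteria above require.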

3.3. Issues

On introducing a fallback attribute PR #7723
State: Open
Resolution: None
User: None

Notes:

4. The XHTML 2.0 Document Type

This section is normative.

The XHTML 2.0 document type is a fully functional document type with rich semantics. It is a collection of XHTML-conforming modules (most of which are defined in this specification). The Modules and their elements are listed here for information purposes, but the definitions in their base documents should be considered authoritative. In the on-line version of this document, the module names in the list below link into the definitions of the modules within the relevant version of the authoritative specification.

Document Module
body, head, html, title
Structural Module
address, blockcode, blockquote, div, h, h1, h2, h3, h4, h5, h6, p, pre, section, separator
Text Module
abbr, cite, code, dfn, em, kbd, l, quote, samp, span, strong, sub, sup, var
Hypertext Module
a
List Module
dl, dt, dd, label, nl, ol, ul, li
Core Attributes Module
class , id , and title attributes
Hypertext Attributes Module
href , hreftype , cite , target , rel , rev , access , nextfocus , prevfocus , and xml:base attributes
Internationalization Attribute Module
xml:lang attribute
Bi-directional Text Module
dir attribute
Edit Attributes Module
edit and datetime attributes
Embedding Attributes Module
src and type attributes
Handler Module
handler
Image Map Attributes Module
usemap, ismap, shape , and coords attributes
Media Attribute Module
media attribute
Metainformation Attributes Module
about , content , datatype , property , rel , resource , restype , and rev attributes
Metainformation Module
meta , link
Object Module
object, param, standby
Style Attribute Module
style attribute
Stylesheet Module
style element
Tables Module
caption, col, colgroup, summary, table, tbody, td, tfoot, th, thead, tr

XHTML 2.0 also uses the following externally defined modules:

Ruby Annotation Module [ RUBY ]
ruby, rbc, rtc, rb, rt, rp
XForms Module [ XFORMS ]
action, alert, bind, case, choices, copy, delete, dispatch, extension, filename, group, help, hint, input, insert, instance, item, itemset, label, load, mediatype, message, model, output, range, rebuild, recalculate, refresh, repeat, reset, revalidate, secret, select, select1, send, setfocus, setindex, setvalue, submission, submit, switch, textarea, toggle, trigger, upload , and value elements, and repeat-model, repeat-bind, repeat-nodeset, repeat-startindex , and repeat-number attributes
XML Events Module [ XMLEVENTS ]
listener element, and defaultAction, event, handler, observer, phase, propagate, and target attributes in the [ XMLEVENTS ] namespace

An implementation of this document type as a RELAX NG grammar is defined in Appendix B , as an XML Schema in Appendix D , and as a DTD in Appendix F .

4.1. Issues

Identifying XHTML version in absence of DTDs PR #7336
State: Suspended
Resolution: Defer
User: None

Notes:
BAE F2F: for the present DTDs are required for entity resolution. This is a tricky issue, and the working group needs to resolve it quickly. We are asking for input from the Hypertext Coordination Group and others in our quest to sort it out.

5. Module Definition Conventions

This section is normative.

This document defines a variety of XHTML modules and the semantics of those modules. This section describes the conventions used in those module definitions.

5.1. Module Structure

Each module in this document is structured in the following way:

5.2. Abstract Module Definitions

An abstract module is a definition of an XHTML module using prose text and some informal markup conventions. While such a definition is not generally useful in the machine processing of document types, it is critical in helping people understand what is contained in a module. This section defines the way in which XHTML abstract modules are defined. An XHTML-conforming module is not required to provide an abstract module definition. However, anyone developing an XHTML module is encouraged to provide an abstraction to ease in the use of that module.

5.3. Syntactic Conventions

The abstract modules are not defined in a formal grammar. However, the definitions do adhere to the following syntactic conventions. These conventions are similar to those of XML DTDs, and should be familiar to XML DTD authors. Each discrete syntactic element can be combined with others to make more complex expressions that conform to the algebra defined here.

element name
When an element is included in a content model, its explicit name will be listed.
content set
Some modules define lists of explicit element names called content sets . When a content set is included in a content model, its name will be listed.
expr ?
Zero or one instances of expr are permitted.
expr +
One or more instances of expr are required.
expr *
Zero or more instances of expr are permitted.
a , b
Expression a is required, followed by expression b .
a | b
Either expression a or expression b is required.
a - b
Expression a is permitted, omitting elements in expression b.
parentheses
When an expression is contained within parentheses, evaluation of any subexpressions within the parentheses takes place before evaluation of expressions outside of the parentheses (starting at the deepest level of nesting first).
extending pre-defined elements
In some instances, a module adds attributes to an element. In these instances, the element name is followed by an ampersand ( & ).
defining required attributes
When an element requires the definition of an attribute, that attribute name is followed by an asterisk ( * ).
defining the type of attribute values
When a module defines the type of an attribute value, it does so by listing the type in parentheses after the attribute name.
defining the legal values of attributes
When a module defines the legal values for an attribute, it does so by listing the explicit legal values (enclosed in quotation marks), separated by vertical bars ( | ), inside of parentheses following the attribute name. If the attribute has a default value, that value is followed by an asterisk ( * ). If the attribute has a fixed value, the attribute name is followed by an equals sign ( = ) and the fixed value enclosed in quotation marks.
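The content model operators listed above ( ? , + , * , the comma, the vertical bar, and parentheses) map closely onto regular expression syntax. The following Python sketch illustrates this by translating a content model into a regular expression over a sequence of child element names. The function names are hypothetical, content sets and PCDATA are treated as plain names rather than expanded, and the a - b exclusion form is rejected because it has no direct regular expression equivalent.

```python
import re


def content_model_to_regex(model):
    """Translate an abstract content model into a compiled regular expression.

    Each element name becomes a group matching that name followed by one
    space, so a list of children can be matched as a single string.
    """
    if re.search(r"\s-\s", model):
        raise ValueError("the 'a - b' exclusion form is not handled by this sketch")
    pieces = []
    for token in re.findall(r"[A-Za-z_][\w:.-]*|[(),|?+*]", model):
        if token == ",":
            continue  # sequence is plain concatenation in a regular expression
        elif token == "(":
            pieces.append("(?:")
        elif token in ")|?+*":
            pieces.append(token)  # same meaning in both notations
        else:
            pieces.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(pieces))


def children_match(model, child_names):
    """Return True if the ordered child element names satisfy the model."""
    subject = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(subject) is not None


# Content model of the head element as given in the Document Module.
HEAD_MODEL = "title , ( access | handler | link | ev:listener | model | meta | style ) *"
print(children_match(HEAD_MODEL, ["title", "meta", "link"]))
print(children_match(HEAD_MODEL, ["meta"]))
```

The first call reports a match and the second does not, since title is required as the first child.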

5.4. Content Types

Abstract module definitions define minimal, atomic content models for each module. These minimal content models reference the elements in the module itself. They may also reference elements in other modules upon which the abstract module depends. Finally, the content model in many cases requires that text be permitted as content to one or more elements. In these cases, the symbol used for text is PCDATA. This is a term, defined in the XML 1.0 Recommendation, that refers to parsed character data. A content type can also be defined as EMPTY, meaning the element has no content in its minimal content model.

5.5. Attribute Types

In some instances, it is necessary to define the types of attribute values or the explicit set of permitted values for attributes. The following attribute types (defined in the XML 1.0 Recommendation) are used in the definitions of the abstract modules:

Attribute Type Definition
CDATA Character data
ID A document-unique identifier
IDREF A reference to a document-unique identifier
IDREFS A space-separated list of references to document-unique identifiers
NMTOKEN A name composed of only name tokens as defined in XML 1.0 [ XML ].
NMTOKENS One or more white space separated NMTOKEN values
NUMBER Sequence of one or more digits ([0-9])

In addition to these pre-defined data types, XHTML Modularization defines the following data types and their semantics (as appropriate):

Data type Description
Character A single character, as per section 2.2 of [ XML ].
Encodings A comma-separated list of 'charset's with optional q parameters, as defined in section 14.2 of [ RFC2616 ] as the field value of the Accept-Charset request header.
ContentTypes Attributes of this type identify the allowable content type(s) of an associated URI (usually a value of another attribute on the same element). At its most general, it is a comma-separated list of media ranges with optional accept parameters, as defined in section 14.1 of [ RFC2616 ] as the field value of the accept request header.

In its simplest case, this is just a media type, such as "image/png" or "application/xml", but it may also contain asterisks, such as "image/*" or "*/*", or lists of acceptable media types, such as "image/png, image/gif, image/jpeg".

The user agent must combine this list with its own list of acceptable media types by taking the intersection, and then use the resulting list as the field value of the accept request header when requesting the resource using HTTP.

For instance, if the attribute specifies the value "image/png, image/gif, image/jpeg", but the user agent does not accept images of type "image/gif" then the resultant accept header would contain "image/png, image/jpeg".

A user agent must imitate similar behavior when using other methods than HTTP. For instance, when accessing files in a local filestore, <p src="logo" type="image/png, image/jpeg"> might cause the user agent first to look for a file logo.png, and then for logo.jpg.

If a value for the content type is not given, "*/*" must be used for its value.

For the current list of registered content types, please consult [ MIMETYPES ].

Coordinates Comma-separated list of Lengths used in defining areas.
Datetime Date and time information, as defined by the type dateTime in [ XMLSCHEMA ] except that the timezone part is required.
HrefTarget Name used as destination for results of certain actions, with legal values as defined by NMTOKEN .
LanguageCode A language code. The values should conform to [ RFC3066 ] or its successors.
LanguageCodes A comma-separated list of language ranges with optional q parameters, as defined in section 14.4 of [ RFC2616 ] as the field value of the Accept-Language request header. Individual language codes should conform to [ RFC3066 ] or its successors.
Length Either a number, representing a number of pixels, or a percentage, representing a percentage of the available horizontal or vertical space. Thus, the value "50%" means half of the available space.
LocationPath A location path as defined in [ XPATH ].
MediaDesc

A comma-separated list of media descriptors as described by [ CSS2 ]. The default is all .

Number One or more digits
QName An [ XMLNS ]-qualified name. See QName for a formal definition.
Text A character string.
URI An Internationalized Resource Identifier Reference, as defined by [ IRI ].
URIs A space-separated list of URIs as defined above.
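The ContentTypes data type above requires the user agent to intersect the attribute's list of media ranges with its own list before building the accept request header. The following Python sketch shows one way this could be done. The function names are hypothetical, accept parameters such as q values are discarded for simplicity, and the result keeps the order of the attribute's list; none of these choices is mandated by the text above.

```python
def _split(media_range):
    """Split 'type/subtype;params' into a lowercased (type, subtype) pair."""
    essence = media_range.split(";")[0].strip()
    kind, _, subtype = essence.partition("/")
    return kind.lower(), subtype.lower()


def _meet(a, b):
    """Return the most specific range covered by both a and b, or None."""
    (a_type, a_sub), (b_type, b_sub) = _split(a), _split(b)
    if a_type == "*":
        kind = b_type
    elif b_type == "*" or a_type == b_type:
        kind = a_type
    else:
        return None
    if a_sub == "*":
        subtype = b_sub
    elif b_sub == "*" or a_sub == b_sub:
        subtype = a_sub
    else:
        return None
    return kind + "/" + subtype


def accept_header(attribute_value, user_agent_accepts):
    """Intersect the attribute's media ranges with those of the user agent.

    A missing attribute value is treated as "*/*", as required above.
    """
    requested = [r for r in (attribute_value or "*/*").split(",") if r.strip()]
    result = []
    for wanted in requested:
        for supported in user_agent_accepts:
            common = _meet(wanted, supported)
            if common is not None and common not in result:
                result.append(common)
    return ", ".join(result)


# The example from the text: the user agent does not accept image/gif.
print(accept_header("image/png, image/gif, image/jpeg",
                    ["image/png", "image/jpeg", "text/html"]))
```

For the example in the text this yields "image/png, image/jpeg".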

6. XHTML Attribute Collections

This section is normative.

Many of the modules in this document define the required attributes for their elements. The elements in those modules may also reference zero or more attribute collections. Attribute collections are defined in their own modules, but the meta collection "Common" is defined in this section. The table below summarizes the attribute collections available.

Collection Module Description
Core Core Attributes Module Basic attributes used to identify and classify elements and their content.
I18N Internationalization Attribute Module Attribute to identify the language of an element and its contents.
Bi-directional Bi-directional Text Collection Attributes used to manage bi-directional text.
Edit Edit Attributes Module Attributes used to annotate when and how an element's content was edited.
Embedding Embedding Attributes Module Attributes used to embed content from other resources within the current element.
Events XML Events Module Attributes that allow associating of events and event processing with an element and its contents.
Forms XForms Module Attributes that provide a mechanism for repeating table rows within a form.
Hypertext Hypertext Attributes Module Attributes that designate characteristics of links within and among documents.
Map Image Map Attributes Module Attributes for defining and referencing client-side image maps.
Media Media Attribute Module Attribute for performing element selection based upon media type as defined in MediaDesc
Metainformation Metainformation Attributes Attributes that allow associating of elements with metainformation about those elements
Role Role Attribute Module Attribute for the specification of the "role" of an element.
Style Style Attribute Module Attribute for associating style information with an element and its contents.
Common Attribute Collections Module A meta-collection of all the other collections, including the Core, Bi-directional, Events, Edit, Embedding, Forms, Hypertext, I18N, Map, Media, Metainformation, Role, and Style attribute collections.

Implementation: RELAX NG

Each of the attributes defined in an XHTML attribute collection is available when its corresponding module is included in an XHTML Host Language or an XHTML Integration Set (see [ XHTMLMOD ]). In such a situation, the attributes are available for use on elements that are NOT in the XHTML namespace when they are referenced using their namespace-qualified identifier (e.g., xhtml:id ). The semantics of the attributes remain the same regardless of whether they are referenced using their qualified identifier or not. If both the qualified and non-qualified identifier for an attribute are used on the same XHTML namespace element, the behavior is unspecified.
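The following non-normative sketch illustrates the namespace-qualified form described above. It uses Python's standard xml.etree.ElementTree module to read an xhtml:id attribute from an element in a host vocabulary. The http://example.org/inventory namespace, its item element, and the helper name qualified_attribute are assumptions made purely for illustration; only the XHTML 2 namespace URI is taken from this document.

```python
import xml.etree.ElementTree as ET

XHTML2_NS = "http://www.w3.org/2002/06/xhtml2/"

# Hypothetical host vocabulary; only the xhtml:id attribute comes from XHTML 2.
DOC = """<inv:item xmlns:inv="http://example.org/inventory"
          xmlns:xhtml="http://www.w3.org/2002/06/xhtml2/"
          xhtml:id="widget-1">Widget</inv:item>"""


def qualified_attribute(element, local_name):
    """Return the value of the XHTML-namespace attribute, or None if absent."""
    return element.get("{%s}%s" % (XHTML2_NS, local_name))


item = ET.fromstring(DOC)
item_id = qualified_attribute(item, "id")
```

A namespace-aware parser exposes the attribute under its expanded name, so the lookup does not depend on which prefix the author chose.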

6.1. Issues

xml:id PR #7442
State: Suspended
Resolution: Defer
User: None

Notes:
If xml:id becomes a stable document in time for use in this document, we will migrate to its use.

[XHTML2] Constraining attribute relationship PR #7661
State: Suspended
Resolution: Defer
User: None

Notes:

7. XHTML Document Module

This section is normative.

The Document Module defines the major structural elements for XHTML. These elements effectively act as the basis for the content model of many XHTML family document types. The elements and attributes included in this module are:

Elements Attributes Content Model
html Common , version ( CDATA ), xmlns ( URI = "http://www.w3.org/2002/06/xhtml2/"), xsi:schemaLocation ( URIs = "http://www.w3.org/2002/06/xhtml2/ http://www.w3.org/MarkUp/SCHEMA/xhtml2.xsd") head , body
head Common title , ( access | handler | link | ev:listener | model | meta | style ) *
title Common PCDATA*
body Common ( Heading | Structural | List )*

This module is the basic structural definition for XHTML content. The html element acts as the root element for all XHTML Family Document Types.

Note that the value of the xmlns declaration is defined to be "http://www.w3.org/2002/06/xhtml2/". Also note that because the xmlns declaration is treated specially by XML namespace-aware parsers [ XMLNS ], it is legal to have it present as an attribute of each element. However, any time the xmlns declaration is used in the context of an XHTML module, whether with a prefix or not, the value of the declaration must be http://www.w3.org/2002/06/xhtml2/ .
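As a non-normative illustration, the namespace requirement above can be checked with a namespace-aware parser, which folds the xmlns declaration into each element's expanded name. The sketch below uses Python's standard xml.etree.ElementTree module; the function names and the two sample documents are assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET

XHTML2_NS = "http://www.w3.org/2002/06/xhtml2/"


def in_xhtml2_namespace(element):
    """True if the element's expanded name lies in the XHTML 2 namespace."""
    return element.tag.startswith("{" + XHTML2_NS + "}")


def root_is_xhtml2_html(source):
    """Check that the document element is html in the XHTML 2 namespace."""
    root = ET.fromstring(source)
    return root.tag == "{%s}html" % XHTML2_NS


# Sample documents (illustrative only).
GOOD = ('<html xmlns="http://www.w3.org/2002/06/xhtml2/" xml:lang="en">'
        '<head><title>My Life</title></head><body/></html>')
BAD = '<html xmlns="http://www.w3.org/1999/xhtml"><head/><body/></html>'
```

Because the default namespace is inherited, the head and title elements in GOOD are also in the XHTML 2 namespace even though they carry no xmlns attribute of their own.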

Implementation: RELAX NG

7.1. The html element

The html element is the root element for all XHTML Family Document Types. The xml:lang attribute is required on this element.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.
version = CDATA
The value of this attribute specifies which XHTML Family document type governs the current document. The format of this attribute value is unspecified. However, all values beginning with the character sequence xhtml are reserved for use by XHTML Family Document Types.

Need a normative definition for the version attribute

The version attribute needs a machine processable format so that document processors can reliably determine that the document is an XHTML Family conforming document.
xsi:schemaLocation = URIs
This attribute allows the specification of a location where an XML Schema [ XMLSCHEMA ] for the document can be found. The syntax of this attribute is defined in xsi_schemaLocation . The behavior of this attribute in XHTML documents is defined in Strictly Conforming Documents .

7.2. The head element

The head element contains information about the current document, such as its title, that is not considered document content. By default the head is not displayed; however, a style sheet can override this for special-purpose use. User agents may nevertheless make information in the head available to users through other mechanisms.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

Example

<head>
    <title>My Life</title>
</head>

7.3. The title element

Every XHTML document must have a title element in the head section.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

The title element is used to identify the document. Since documents are often consulted out of context, authors should provide context-rich titles. Thus, instead of a title such as "Introduction", which doesn't provide much contextual background, authors should supply a title such as "Introduction to Medieval Bee-Keeping" instead.

For reasons of accessibility, user agents must always make the content of the title element available to users. The mechanism for doing so depends on the user agent (e.g., as a caption, spoken).

Example

<title>A study of population dynamics</title>

The title of a document is metadata about the document, and so a title like <title>About W3C</title> is equivalent to <meta about="" property="title">About W3C</meta> .
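The requirement that every document have a title element in its head can be checked mechanically. The following non-normative sketch, written against Python's standard xml.etree.ElementTree module, returns the title text or None when the title is missing. The function name document_title and the sample documents are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/2002/06/xhtml2/"}


def document_title(source):
    """Return the text of head/title, or None if the document has no title."""
    root = ET.fromstring(source)
    title = root.find("x:head/x:title", NS)
    if title is None:
        return None
    # itertext() also collects text inside any child elements of title.
    return "".join(title.itertext()).strip()


# Sample documents (illustrative only).
WITH_TITLE = ('<html xmlns="http://www.w3.org/2002/06/xhtml2/" xml:lang="en">'
              '<head><title>A study of population dynamics</title></head>'
              '<body/></html>')
WITHOUT_TITLE = ('<html xmlns="http://www.w3.org/2002/06/xhtml2/" xml:lang="en">'
                 '<head/><body/></html>')
```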

7.4. The body element

The body of a document contains the document's content. The content may be processed by a user agent in a variety of ways. For example, a visual browser can present it as text, images, colors, graphics, etc.; an audio user agent may speak the same content; and a search engine may create an index prioritized according to properties of the text.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

Example

<body id="theBody">
    <p>A paragraph</p>
</body>

8. XHTML Structural Module

This section is normative.

This module defines all of the basic text container elements, attributes, and their content models that are structural in nature.

Element Attributes Content Model
address Common (PCDATA | Text )*
blockcode Common (PCDATA | Text | Heading | Structural | List )*
blockquote Common (PCDATA | Text | Heading | Structural | List )*
div Common (PCDATA | Flow )*
h Common (PCDATA | Text )*
h1 Common (PCDATA | Text )*
h2 Common (PCDATA | Text )*
h3 Common (PCDATA | Text )*
h4 Common (PCDATA | Text )*
h5 Common (PCDATA | Text )*
h6 Common (PCDATA | Text )*
p Common (PCDATA | Text | List | blockcode | blockquote | pre | table )*
pre Common (PCDATA | Text )*
section Common (PCDATA | Flow )*
separator Common EMPTY

The content model for this module defines some content sets:

Heading
h | h1 | h2 | h3 | h4 | h5 | h6
Structural
address | blockcode | blockquote | div | List | p | pre | handler | section | separator | table
Flow
Heading | Structural | Text

Implementation: RELAX NG

8.1. The address element

The address element may be used by authors to supply contact information for a document or a major part of a document such as a form.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

Example

<address href="mailto:webmaster@example.net">Webmaster</address>

8.2. The blockcode element

This element indicates that its contents are a block of "code" (see the code element). This element is similar to the pre element, in that whitespace in the enclosed text has semantic relevance. As a result, the default value of the layout attribute is relevant.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

Example of a code fragment:

<blockcode class="Perl">
sub squareFn {
    my $var = shift;
    return $var * $var ;
}
</blockcode>

Here is how this might be rendered:

sub squareFn {
    my $var = shift;
    return $var * $var ;
}

8.3. The blockquote element

This element designates a block of quoted text.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

An excerpt from 'The Two Towers', by J.R.R. Tolkien, as a blockquote

<blockquote cite="http://www.example.com/tolkien/twotowers.html">
<p>They went in single file, running like hounds on a strong scent,
and an eager light was in their eyes. Nearly due west the broad
swath of the marching Orcs tramped its ugly slot; the sweet grass
of Rohan had been bruised and blackened as they passed.</p>
</blockquote>

8.4. The div element

The div element, in conjunction with the id , class and role attributes, offers a generic mechanism for adding extra structure to documents. This element defines no presentational idioms on the content. Thus, authors may use this element in conjunction with style sheets , the xml:lang attribute, etc., to tailor XHTML to their own needs and tastes.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

For example, suppose you wish to make a presentation in XHTML, where each slide is enclosed in a separate element. You could use a div element, with a class of slide :

div with a class of slide

<body>
    <h>The meaning of life</h>
    <p>By Huntington B. Snark</p>
    <div class="slide">
        <h>What do I mean by "life"?</h>
        <p>....</p>
    </div>
    <div class="slide">
        <h>What do I mean by "mean"?</h>
        ...
    </div>
    ...
</body>

8.5. The heading elements

A heading element briefly describes the topic of the section it introduces. Heading information may be used by user agents, for example, to construct a table of contents for a document automatically.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

There are two styles of headings in XHTML: the numbered versions h1 , h2 etc., and the structured version h , which is used in combination with the section element.

There are six levels of numbered headings in XHTML with h1 as the most important and h6 as the least.

Structured headings use the single h element in combination with the section element to indicate the structure of the document; the nesting depth of the sections indicates the importance of each heading. The heading for a section is the h element that is a child of that section element.

Example

<body>
<h>This is a top level heading</h>
<p>....</p>
<section>
    <p>....</p>
    <h>This is a second-level heading</h>
    <p>....</p>
    <h>This is another second-level heading</h>
    <p>....</p>
</section>
<section>
    <p>....</p>
    <h>This is another second-level heading</h>
    <p>....</p>
    <section>
        <h>This is a third-level heading</h>
        <p>....</p>
    </section>
</section>
</body>

Sample style sheet for section levels

h {font-family: sans-serif; font-weight: bold; font-size: 200%}
section h {font-size: 150%} /* A second-level heading */
section section h {font-size: 120%} /* A third-level heading */
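The relationship between section nesting and heading importance can also be computed directly. The following non-normative sketch, using Python's standard xml.etree.ElementTree module, assigns each h element a level equal to one plus the number of enclosing section elements. The function name structured_heading_levels and the sample document are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2002/06/xhtml2/}"


def structured_heading_levels(element, depth=1):
    """Yield (level, text) for each h element, in document order.

    The level is 1 plus the number of enclosing section elements.
    """
    for child in element:
        if child.tag == NS + "h":
            yield depth, "".join(child.itertext())
        elif child.tag == NS + "section":
            # Entering a section makes its headings one level less important.
            yield from structured_heading_levels(child, depth + 1)
        else:
            # Other containers (div, p, ...) do not change the level.
            yield from structured_heading_levels(child, depth)


# Sample document (illustrative only).
DOC = """<body xmlns="http://www.w3.org/2002/06/xhtml2/">
<h>Top</h>
<section>
  <h>Second</h>
  <section><h>Third</h></section>
</section>
</body>"""

levels = list(structured_heading_levels(ET.fromstring(DOC)))
```

A user agent could use such a list to build a table of contents, much as the selectors in the sample style sheet use the same nesting to choose a font size.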

Numbered sections and references
XHTML does not itself cause section numbers to be generated from headings. Style sheet languages such as CSS, however, allow authors to control the generation of section numbers.

Skipping heading levels is considered bad practice. The series h1 h2 h1 is acceptable, while h1 h3 h1 is not, since the heading level h2 has been skipped.
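A checker for skipped levels might be sketched as follows. This is a non-normative Python illustration; the function name skipped_heading_levels is an assumption. So is the choice to treat the start of the document as level 0, which means a document that opens with h2 is also reported.

```python
def skipped_heading_levels(levels):
    """Return the positions at which a numbered heading skips a level.

    `levels` is the sequence of heading numbers in document order,
    for example [1, 2, 1] for h1 h2 h1.
    """
    skipped = []
    previous = 0  # assumption: treat the start of the document as level 0
    for index, level in enumerate(levels):
        # Going deeper by more than one level skips a heading level;
        # returning to any shallower level is always acceptable.
        if level > previous + 1:
            skipped.append(index)
        previous = level
    return skipped
```

With this sketch, the series h1 h2 h1 yields no positions, while h1 h3 h1 reports the position of the h3.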

8.6. The p element

The p element represents a paragraph.

In comparison with earlier versions of HTML, where a paragraph could only contain inline text, XHTML 2's paragraphs represent the conceptual idea of a paragraph, and so may contain lists, blockquotes, pre elements and tables as well as inline text. Note, however, that they may not contain directly nested p elements.

Attributes

The Common collection
A collection of other attribute collections, including: Bi-directional, Core, Edit, Embedding, Events, Forms, Hypertext, I18N, Map, and Metainformation.

Example

<p>Payment options include:
<ul>
<li>cash</li>
<li>credit card</li>
<li>luncheon vouchers.</li>
</ul>
</p>
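The restriction on directly nested p elements can be illustrated with a small, non-normative Python sketch based on the standard xml.etree.ElementTree module. The function name directly_nested_paragraphs and the sample fragments are assumptions made for this example. A p inside a blockquote inside a p is not reported, since it is not a direct child.

```python
import xml.etree.ElementTree as ET

P = "{http://www.w3.org/2002/06/xhtml2/}p"


def directly_nested_paragraphs(root):
    """Return every p element whose parent is itself a p element."""
    return [child
            for parent in root.iter(P)  # includes root if it is a p
            for child in parent
            if child.tag == P]


# Sample fragments (illustrative only).
LIST_IN_P = ('<p xmlns="http://www.w3.org/2002/06/xhtml2/">Payment options include:'
             '<ul><li>cash</li><li>credit card</li></ul></p>')
P_IN_P = '<p xmlns="http://www.w3.org/2002/06/xhtml2/"><p>inner</p></p>'
P_IN_QUOTE = ('<p xmlns="http://www.w3.org/2002/06/xhtml2/">'
              '<blockquote><p>quoted</p></blockquote></p>')
```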

8.7. The pre element

The pre element indicates that whitespace in the enclosed text has semantic relevance. As such, the default value of the layout attribute is relevant.

Note that all elements in the XHTML family preserve their whitespace in the document, which is only removed on rendering, via a style sheet, according to the rules of CSS [ CSS3-TEXT ]. This means that in principle any elements may preserve or collapse whitespace on rendering, under control of a