Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions



XML Syntax Rules

April 5, 2002

In this section, we explain the various syntactical rules of XML. Documents that follow these rules are called well-formed, but not necessarily valid, as we'll see. If your document breaks any of these rules, it will be rejected by most, if not all, XML parsers.

Well-Formedness

The minimal requirement for an XML document is that it be well-formed, meaning that it adheres to a small number of syntax rules, which are summarized in Table 3-1 and explained in the following sections. However, a document can abide by all these rules and still be invalid. To be valid, a document must both be well-formed and adhere to the constraints imposed by a DTD or XML Schema.


    Table 3-1 XML Syntax Rules (Well-Formedness Constraints)
  • The document must have a consistent, well-defined structure.
  • All attribute values must be quoted (single or double quotes).
  • White space in content, including line breaks, is significant.
  • All start tags must have corresponding end tags (exception: empty elements).
  • The root element must contain all others, which must nest properly by start/end tag pairing.
  • Elements must not overlap; they may be nested, however. (This is also technically true for HTML. Browsers ignore overlapping in HTML, but not in XML.)
  • Each element except the root element must have exactly one parent element that contains it.
  • Element and attribute names are case-sensitive: Price and PRICE are different elements.
  • Keywords such as DOCTYPE and ENTITY must always appear in uppercase; similarly for other DTD keywords such as ELEMENT and ATTLIST.
  • Elements without content are called empty elements; an empty-element tag must end in "/>".
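
Several of the rules in Table 3-1 can be tried out directly. The following is a minimal sketch, assuming Python's standard xml.parsers.expat module as the parser; the helper name check_well_formed and the sample documents are invented for illustration and are not part of the book's CD.

```python
from xml.parsers import expat


def check_well_formed(document):
    """Return None if the document is well-formed, else the parser's message."""
    parser = expat.ParserCreate()  # non-validating, no namespace processing
    try:
        parser.Parse(document, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        return str(err)
    return None


# Hypothetical sample documents, one per rule being exercised.
SAMPLES = {
    "well-formed": '<order id="1"><item>pen</item><note/></order>',
    "unquoted attribute": "<order id=1></order>",
    "missing end tag": "<order><item>pen</order>",
    "overlapping elements": "<b><i>text</b></i>",
    "case mismatch": "<Price>10</price>",
    "two root elements": "<a/><b/>",
}

for label, doc in SAMPLES.items():
    error = check_well_formed(doc)
    print(label, "->", "OK" if error is None else "rejected: " + error)
```

Only the first sample should be accepted. The exact error wording depends on the parser, so treat the messages as diagnostic hints rather than a stable interface.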

Legal XML Name Characters

An XML Name (sometimes called simply a Name) is a token that

  • begins with a letter, underscore, or colon (but not other punctuation)
  • continues with letters, digits, hyphens, underscores, colons, or full stops [periods], known as name characters.

Names beginning with the string "xml", or any string which would match (('X'|'x')('M'|'m')('L'|'l')), are reserved.

Element and attribute names must be valid XML Names. (Attribute values need not be.) An NMTOKEN (name token) is any mixture of name characters (letters, digits, hyphens, underscores, colons, and periods).
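
The Name and NMTOKEN definitions above translate naturally into patterns. The sketch below is an approximation, assuming ASCII-only input: real XML also permits many non-ASCII letters, which these patterns deliberately ignore. The function names is_name, is_nmtoken, and is_reserved are hypothetical.

```python
import re

# ASCII-only approximation (an assumption): XML itself allows many more letters.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")
NMTOKEN_RE = re.compile(r"[A-Za-z0-9._:\-]+")


def is_name(token):
    """True if token starts with a letter, underscore, or colon and
    continues with name characters only."""
    return NAME_RE.fullmatch(token) is not None


def is_nmtoken(token):
    """True if token is any non-empty mixture of name characters."""
    return NMTOKEN_RE.fullmatch(token) is not None


def is_reserved(token):
    """True if token begins with 'xml' in any mixture of case."""
    return token[:3].lower() == "xml"


for candidate in ["discounted-price", "_price", "7price", "xml-price"]:
    print(candidate, is_name(candidate), is_nmtoken(candidate), is_reserved(candidate))
```

Note that 7price fails as a Name but passes as an NMTOKEN, because the restriction on the first character applies only to Names.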


  • Note:
    The Namespaces in XML Recommendation assigns a meaning to names that contain colon characters. Therefore, authors should not use the colon in XML names except for namespace purposes (e.g., xsl:template).

Listing 3-2 illustrates a number of legal XML Names, followed by three that should be avoided but may or may not be identified as illegal, depending on the XML parser you use, and four that are definitely illegal. (This is file name-tests.xml on the CD; you can try this with your favorite parser, or with one of the ones provided on the CD.)

Listing 3-2 Legal, Illegal, and Questionable XML Names

<?xml version = "1.0" encoding = "UTF-8" standalone = "yes"?>
<Test>
<!-- legal -->
    <price />
    <Price />
    <pRice />
    <_price />
    <subtotal07 />
    <discounted-price />
    <discounted_price />
    <discounted.price />
    <discountedPrice />
    <DiscountedPrice />
    <DISCOUNTEDprice />
    <kbs:DiscountedPrice />
    <xlink:role />
    <xsl:apply-templates />
<!-- discouraged -->
    <xml-price />
    <xml:price />
    <discounted:price />
<!-- illegal -->
    <7price />
    <-price />
    <.price />
    <discounted price />
</Test>

From the legal examples, we see that any mixture of uppercase and lowercase is fine, as are digits and the punctuation characters named in the definition, provided the first character is a letter, underscore, or colon.

Since the last three examples in the first group use a colon, they are assumed to be elements in the namespaces identified by the prefixes "kbs", "xlink", and "xsl". Of these, the last two refer to W3C-specified namespaces; xlink:role is an attribute defined by the XLink specification and xsl:apply-templates is an element defined by the XSLT specification. The "kbs" prefix refers to a hypothetical namespace, which I could have declared (but didn't), since namespaces do not come only from the W3C. (See chapter 5 for a thorough discussion of namespaces.)
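
Whether the undeclared "kbs" prefix causes trouble depends on whether the parser performs namespace processing. The following sketch, assuming Python's xml.parsers.expat, parses the same element with namespace processing off and on; the helper name parses and the URI http://example.com/kbs are hypothetical.

```python
from xml.parsers import expat


def parses(document, namespace_aware):
    """Return True if expat accepts the document in the chosen mode."""
    if namespace_aware:
        # Supplying a separator switches on expat's namespace processing.
        parser = expat.ParserCreate(namespace_separator=" ")
    else:
        parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except expat.ExpatError:
        return False
    return True


UNDECLARED = "<Test><kbs:DiscountedPrice /></Test>"
# The namespace URI below is made up for illustration.
DECLARED = '<Test xmlns:kbs="http://example.com/kbs"><kbs:DiscountedPrice /></Test>'

print("undeclared, plain parser:    ", parses(UNDECLARED, False))
print("undeclared, namespace-aware: ", parses(UNDECLARED, True))
print("declared, namespace-aware:   ", parses(DECLARED, True))
```

Without namespace processing the colon is just another name character, so the undeclared prefix passes; with it, the prefix must be bound by an xmlns:kbs declaration.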

The three debatable examples are xml-price, xml:price, and discounted:price. The first two begin with the reserved letters "xml"; you shouldn't use them, but most parsers won't reject them. The discounted:price example uses a colon, which is frowned upon if "discounted" is not meant to be a prefix associated with a declared namespace.

The four illegal cases are much clearer. The first three, 7price, -price, and .price, are illegal because the initial character is not a letter, underscore, or colon. The fourth example is illegal because a space character cannot occur in an XML Name. Most parsers will treat it as an element named discounted with an attribute named price that is missing its required equals sign and value.
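
If you don't have the CD at hand, the three groups from Listing 3-2 can be checked one name at a time. This is a sketch under stated assumptions: it uses Python's xml.parsers.expat without namespace processing, and the helper element_name_accepted is a hypothetical name. Other parsers may treat the discouraged group differently.

```python
from xml.parsers import expat


def element_name_accepted(tag_body):
    """Wrap tag_body in an empty-element tag inside <Test> and try to parse it."""
    parser = expat.ParserCreate()  # no namespace processing
    try:
        parser.Parse("<Test><" + tag_body + " /></Test>", True)
    except expat.ExpatError:
        return False
    return True


# A selection of names taken from Listing 3-2.
LEGAL = ["price", "Price", "_price", "subtotal07", "discounted-price",
         "discounted_price", "discounted.price", "kbs:DiscountedPrice"]
DISCOURAGED = ["xml-price", "xml:price", "discounted:price"]
ILLEGAL = ["7price", "-price", ".price", "discounted price"]

for group, names in (("legal", LEGAL), ("discouraged", DISCOURAGED), ("illegal", ILLEGAL)):
    for name in names:
        verdict = "accepted" if element_name_accepted(name) else "rejected"
        print(group.ljust(12), name.ljust(22), verdict)
```

Testing each name in its own small document matters here: a parser stops at the first well-formedness error, so a single file containing all four illegal names would only ever report the first one.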


Note: XML Names and NMTOKENS apply to elements, attributes, processing instructions, and many other constructs where an identifier is required, so it's important to understand what is and what is not legal.

