Summary
April 12, 2002
This chapter covered XML syntax rules and basic parsing concepts.
-
We were introduced to fundamental XML terminology, such as
element, attribute, tag, and content.
-
XML document structure was discussed, including the XML prolog,
consisting of the XML declaration and the document type
declaration, both of which are optional but desirable.
-
Names of elements, attributes, and many other XML identifiers are
required to conform to the definition of an XML Name.
-
An XML Name consists of a leading letter, underscore, or colon,
followed by name characters (letters, digits, hyphens,
underscores, colons, or periods).
-
XML is case-sensitive. Although there is no universal convention
concerning use of uppercase or lowercase when developing your own
language, one recommendation is to use UpperCamelCase for elements
and lowerCamelCase for attributes, a convention used in SOAP.
-
We learned the difference between markup and character data; all
text that isn't markup is character data.
-
We covered most of the types of markup, including start and end
tags, empty element tags, entity references, character references,
comments, CDATA sections, document type declarations, processing
instructions, and XML declarations.
-
The minimal requirement for an XML document is that it be
well-formed, meaning that it adheres to a number of XML syntax
rules.
-
Although well-formedness is a prerequisite for validity, a
document can be valid only if it also conforms to the constraints
imposed by a DTD or XML Schema.
-
Most modern parsers can be toggled between two states: validating
and nonvalidating. Validation mode is crucial during development.
In a production environment, however, it may be desirable (under
certain circumstances) to disable validation for efficiency.
-
Event-based (e.g., SAX) and tree-based (e.g., DOM) parsing were
briefly contrasted.
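Several of the points above can be sketched with Python's standard library. The block below is only an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the name check is an ASCII-only approximation of an XML Name (the real definition also admits many non-ASCII letters), and the underlying expat parser is nonvalidating, so it reports well-formedness errors but does not check a document against a DTD or XML Schema.

```python
import re
import xml.sax
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# ASCII-only approximation of an XML Name: a leading letter, underscore,
# or colon, followed by letters, digits, hyphens, underscores, colons,
# or periods. (Assumption: non-ASCII name characters are ignored here.)
_SIMPLE_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*\Z")


def is_simple_xml_name(text):
    """Return True if text matches the simplified XML Name pattern."""
    return _SIMPLE_NAME.match(text) is not None


def is_well_formed(document):
    """Return True if the document parses without a syntax error."""
    try:
        minidom.parseString(document)
    except ExpatError:
        return False
    return True


class _ElementCounter(xml.sax.ContentHandler):
    """Event-based handler: reacts to each start tag as it is read."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def count_elements_sax(document):
    """Event-based (SAX) parsing: no tree is kept in memory."""
    handler = _ElementCounter()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.count


def count_elements_dom(document):
    """Tree-based (DOM) parsing: build the whole tree, then query it."""
    tree = minidom.parseString(document)
    return len(tree.getElementsByTagName("*"))
```

For a small document such as `<Order id="7"><Item/><Item/></Order>`, both counting functions should agree. The difference is that the SAX version processes events as they arrive, while the DOM version holds the complete tree before answering.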
Links to further explore this topic are part of the larger
collection of For Further Exploration links in the XML Family of
Specifications area.
Well-Formed or Toast?
XML Family of Specifications: A Practical Guide
Nick Danger and Betty Jo Bialowsky are fictional characters
appearing in Firesign Theatre comedy routines.
Visit http://www.firesigntheatre.com/ for
audio and video samples of Firesign Theatre. Many of their recordings
are available at https://www.lodestone-media.com/firesign.html.