Web Developer's Virtual Library: Encyclopedia of Web Design Tutorials, Articles and Discussions



White Space Is Significant

April 5, 2002

White space consists of one or more space characters, tabs, carriage returns, or line feeds (denoted as #x20, #x9, #xD, and #xA, respectively). In the XML 1.0 Recommendation, white space is symbolized in production rules by a capital "S", with the following definition (see http://www.w3.org/TR/REC-xml#sec-common-syn and http://www.w3.org/TR/REC-xml#sec-white-space ):

S ::= (#x20 | #x9 | #xD | #xA)+

In contrast to HTML, in which a sequence of white space characters (newlines included) is collapsed into a single space, XML treats all white space in content as significant: the parser passes it on to the application rather than collapsing it. This means that the following two examples are not equivalent:

<Publication>
  <Published>1992</Published>
  <Publisher>Harmony Books</Publisher>
</Publication>

<Publication>
  <Published>1992</Published>
  <Publisher>Harmony
Books</Publisher>
</Publication>

By default, an XML parser reports different content for the Publisher element in the two examples, because in the second one the string "Harmony Books" contains a newline between the two words instead of a space. The application that invokes the parser can consider the white space important, ignore it (i.e., strip it), or inform the parser that it wants white space normalized (collapsed as in HTML).
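The difference between the two Publication examples can be observed directly. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is part of the original text); the helper names publisher_text and normalize are hypothetical. The normalize function shows one way an application might collapse white space itself, using exactly the four characters from the S production.

```python
import re
import xml.etree.ElementTree as ET

# The two Publication examples from the text, as Python strings.
ONE_LINE = """<Publication>
  <Published>1992</Published>
  <Publisher>Harmony Books</Publisher>
</Publication>"""

TWO_LINES = """<Publication>
  <Published>1992</Published>
  <Publisher>Harmony
Books</Publisher>
</Publication>"""

# One or more of the characters allowed by S: space, tab, CR, LF.
XML_WHITE_SPACE = re.compile(r"[\x20\x09\x0D\x0A]+")


def publisher_text(xml_source):
    """Return the Publisher content exactly as the parser reports it."""
    return ET.fromstring(xml_source).find("Publisher").text


def normalize(text):
    """Collapse each run of XML white space to one space and trim the ends."""
    return XML_WHITE_SPACE.sub(" ", text).strip(" ")


if __name__ == "__main__":
    print(repr(publisher_text(ONE_LINE)))
    print(repr(publisher_text(TWO_LINES)))
    print(repr(normalize(publisher_text(TWO_LINES))))
```

The parser hands back the newline untouched; whether to keep, strip, or collapse it is the application's decision.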

Comments

Comments in XML look just like they do in HTML. They begin with the character sequence "<!--" and end with the sequence "-->". The parser ignores what appears between them, except to verify that the comment is well-formed.

<Publication>
  <Published>1992</Published>
  <!-- This appears to be the second edition. -->
  <Publisher>Harmony Books</Publisher>
</Publication>

In XML, however, there are several restrictions regarding comments:

  • Comments cannot contain the double hyphen combination "--" anywhere except as part of the comment's start and end delimiters. Thus, this comment is illegal: <!-- illegal comment --->
  • Comments cannot be nested. This means you need to take care when commenting out a section that already contains comments.
  • Comments cannot precede the XML declaration because that part of the prolog, when present, must appear at the very start of the document.
  • Comments are not permitted in a start or end tag. They can appear only between tags (as if they were content) or surrounding tags.
  • Comments may be used to cause the parser to ignore blocks of elements, provided that the result, once the commented-out block is effectively removed by the parser, is still well-formed XML.
  • Parsers are not required to make comments available to the application, so don't use them to pass data to an application; use processing instructions, discussed next.
  • Comments are also permitted in the DTD, as discussed in chapter 4.
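Several of the restrictions above can be checked with a parser. The following is a rough sketch, assuming Python's standard xml.etree.ElementTree and xml.dom.minidom modules (an assumption, not something the original text uses); is_well_formed is a hypothetical helper. It also illustrates that one API may drop comments while another keeps them, which is why comments are a poor channel for application data.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def is_well_formed(xml_source):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_source)
        return True
    except ET.ParseError:
        return False


LEGAL = (
    "<Publication><!-- This appears to be the second edition. -->"
    "<Publisher>Harmony Books</Publisher></Publication>"
)
# "--" inside the comment body (here as part of "--->").
DOUBLE_HYPHEN = "<Publication><!-- illegal comment ---></Publication>"
# An attempt to nest one comment inside another.
NESTED = "<Publication><!-- outer <!-- inner --> still outer --></Publication>"
# A comment inside a start tag.
INSIDE_TAG = "<Publication <!-- not allowed here --> ></Publication>"
# A comment ahead of the XML declaration.
BEFORE_DECLARATION = "<!-- too early --><?xml version='1.0'?><Publication/>"

if __name__ == "__main__":
    for name in ("LEGAL", "DOUBLE_HYPHEN", "NESTED",
                 "INSIDE_TAG", "BEFORE_DECLARATION"):
        print(name, is_well_formed(globals()[name]))

    # ElementTree's default parser silently drops the comment ...
    print(len(ET.fromstring(LEGAL)))
    # ... whereas minidom keeps it as a node of its own.
    first = minidom.parseString(LEGAL).documentElement.firstChild
    print(first.nodeType == first.COMMENT_NODE)
```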

Processing Instructions

Processing instructions (often abbreviated as PIs) are directives intended for an application other than the XML parser. Unlike comments, processing instructions must be passed on to the application by the parser. The general syntax for a PI is:

<?targetApplication applicationData ?>

Here targetApplication is the name (any XML Name) of the application that should receive the instruction, and applicationData is any arbitrary string that doesn't contain the end delimiter. Often applicationData consists of name/value pairs that resemble attributes with values, but there is no requirement concerning the format. Aside from the delimiters "<?" and "?>", which must appear exactly as shown, the only restrictions are that there can be no space between the initial question mark and the target, and that the target itself cannot be "xml" in any combination of upper and lower case. Some examples follow.

<?xml-stylesheet type="text/xsl" href="foo.xsl" ?>
<?MortgageRateHandler rate="7%" period="30 years" ?>
<?javaApp class="MortgageRateHandler" ?>
<?javaApp This is the data for the MortgageRateHandler, folks! ?>
<?acroread file="mortgageRates.pdf" ?>

Processing instructions are not part of the actual structure of the document, so they may appear almost anywhere, except before the XML declaration or in a CDATA section. The parser's responsibility is merely to pass the PI and its data on to the application. Since the same XML document could be processed by multiple applications, it is entirely possible that some applications will ignore a given PI and just pass it down the chain. In that case, the processing instruction will be acted upon only by the application for which it is intended, that is, the one to which it has meaning.
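As a hedged illustration of how a parser hands processing instructions to an application, the sketch below assumes Python's standard xml.dom.minidom module; the Rates element and the collect_pis and for_target helpers are hypothetical names invented for the example. Each application keeps only the PIs addressed to its own target and ignores the rest.

```python
from xml.dom import minidom

SOURCE = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="foo.xsl" ?>
<Rates>
  <?MortgageRateHandler rate="7%" period="30 years" ?>
  <?javaApp This is the data for the MortgageRateHandler, folks! ?>
</Rates>"""


def collect_pis(node):
    """Return (target, data) pairs for every PI at or below node, in document order."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.PROCESSING_INSTRUCTION_NODE:
            found.append((child.target, child.data))
        found.extend(collect_pis(child))
    return found


def for_target(pis, target):
    """Keep only the data strings addressed to one application."""
    return [data for pi_target, data in pis if pi_target == target]


if __name__ == "__main__":
    pis = collect_pis(minidom.parseString(SOURCE))
    for target, data in pis:
        print(target, "->", repr(data))
    print(for_target(pis, "javaApp"))
```

Note that the XML declaration on the first line does not show up among the collected PIs, and that the data arrives as one uninterpreted string; parsing any name/value pairs inside it is left to the receiving application.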

Although an XML declaration looks like a processing instruction because it is wrapped in the delimiters "<?" and "?>", it is not considered a PI. It is simply the XML declaration, one-of-a-kind markup that is optional but, when present, must be the very first thing in the document.

The target portion of the processing instruction can be a notation (defined in chapter 4). For example:

<!NOTATION AcrobatReader SYSTEM "/usr/local/bin/acroread">

The corresponding PI would be:

<?AcrobatReader file="Readme.pdf" size="75%" ?>
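One way an application might connect such a PI to its notation is sketched below, again assuming Python's standard xml.dom.minidom module. The Docs document element and the helper_for function are hypothetical, added only so that the declaration and the PI from the text form a complete document.

```python
from xml.dom import minidom

SOURCE = """<?xml version="1.0"?>
<!DOCTYPE Docs [
  <!NOTATION AcrobatReader SYSTEM "/usr/local/bin/acroread">
]>
<Docs><?AcrobatReader file="Readme.pdf" size="75%" ?></Docs>"""


def helper_for(doc, pi):
    """Return the system identifier of the notation named by the PI target, or None."""
    if doc.doctype is None:
        return None
    notation = doc.doctype.notations.getNamedItem(pi.target)
    return notation.systemId if notation is not None else None


if __name__ == "__main__":
    doc = minidom.parseString(SOURCE)
    pi = doc.documentElement.firstChild  # the only child of Docs is the PI
    print(pi.target, "->", helper_for(doc, pi))
```

The parser only reports the notation and the PI; deciding to look one up from the other, and what to do with the resulting path, remains the application's job.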

XML Family of Specifications: A Practical Guide