To Validate or Not To Validate
April 12, 1999
My experimentation with the official IE5 release and the
discovery that the DTD wasn't used in validating the document
led me to post a message to the
XSL developer's list.
(My post would have been more appropriate for the
XML-dev list.)
My original post,
Why Doesn't IE5 use the DTD to Validate?,
questioned whether not using the DTD to validate was a bug in IE5,
or an intentional change in behavior from the earlier IE5 Beta 2
release which I had used for several months. IE5b2 detected when
a document failed to validate according to the rules defined in
a DTD. In my view, this was the correct behavior; the result was
that the document was not displayed by the browser -- instead,
the browser displayed an informative message indicating exactly
what the first validation error was. This was extremely useful
for an author developing a DTD and for an author referencing
someone else's DTD.
A
reply from Microsoft's Jonathan Marsh
indicated that this behavior change was by design. He maintained
that IE5's XML parser is a validating parser, but that it does not
validate by default. It uses
"two properties set through DOM extensions to control
DTD handling:"
- validateOnParse determines whether validation
errors are presented to the user.
- resolveExternals determines whether the DTD or
XML Schema is loaded and datatypes, default values, etc.
are honored.
However, the IE5 default is validateOnParse=false
primarily because otherwise invalid XML documents wouldn't be
displayed by the browser. A document author must therefore use
a script language such as JavaScript to toggle this flag if
validation of the document according to the DTD is desired.
Marsh stated Microsoft's
reasoning for this decision and provided an example script.
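Marsh's script is not reproduced here. As a rough illustration of the approach, the sketch below sets the two properties named above before loading a document and then inspects the parser's error object. It is a minimal sketch, not Microsoft's code: the helper name loadChecked and the createDom factory argument are hypothetical, introduced only so the logic can be read apart from the browser. Under IE5 the factory would presumably be something like function () { return new ActiveXObject("Microsoft.XMLDOM"); }.

```javascript
// Hypothetical helper: load an XML document with DTD validation switched on.
// `createDom` is an assumed factory that returns an MSXML-style DOM object.
function loadChecked(createDom, url) {
  var doc = createDom();
  doc.async = false;            // load synchronously so parseError is set on return
  doc.resolveExternals = true;  // load the external DTD or schema
  doc.validateOnParse = true;   // report validity errors as well as well-formedness errors
  doc.load(url);

  var err = doc.parseError;
  if (err && err.errorCode !== 0) {
    // The parser stopped at the first error it found.
    return { ok: false, message: err.reason + " (line " + err.line + ")" };
  }
  return { ok: true, message: "" };
}
```

A page author could call loadChecked with the IE5 factory and display message when ok is false, which approximates the error report that IE5 Beta 2 showed by default.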
I questioned this default behavior, stating
reasons why I felt validation should be the default.
I also quoted from the
Conformance section of the XML spec
and
Tim Bray's Annotated XML spec.
This in turn led to
a flurry
of around 50 postings from noted XML/SGML/CSS experts such as
James Clark, Simon St. Laurent, Paul Prescod, Chris Lilley, Didier
PH Martin (and many others),
a spin-off thread started by Simon entitled
XML is broken,
and cross-posts to
XML-dev
by
Paul and Simon
[no, not Paul Simon :-o] entitled
Is validity an option? and
Between raw and cooked II: Are? DTDs are just for
validation [sic].
It became obvious to most participants that there was a
significant amount of disagreement (at least initially)
concerning some
fundamental XML questions:
- What precisely is a validating parser obligated to do?
- What type of parsing behavior can be legitimately called a
validating parser?
- What specific aspect of a DTD (e.g., the inclusion of
ELEMENTS, not just entities) should signal to the parser
that it must report validation errors to the client?
- When is well-formedness sufficient and validation overkill?
- Should a web browser with an XML parser require the document
author to enable validation via scripting, should validation
be the default, or should the end user be able to toggle
validation (and if so, how)?
- What is the desired behavior of a parser in different client
situations (browsing vs. EDI vs. databases, etc.)?
- What does the XML spec say specifically about these issues
vs. what do most people infer?
Interested readers with a lot of free time on their hands are
encouraged to read these threads for the various opinions.
If a consensus is reached, I'll follow this up on the
WDVL XML
News area and in this spot.
Meanwhile, you can read my
summary of the numerous posts.
For the record,
Microsoft
says:
"When directly browsing XML documents, Internet Explorer 5 loads the
specified Document Type Definition (DTD) or XML Schema, but does not report
validation errors."
Just as this article went to our WDVL editors, Simon St. Laurent started a
thread called
XPDL (was Re: XML is broken)
in which he points to a proposal he wrote in response to this validation
controversy, entitled
XML Processing Description Language (XPDL). Check it out!