XML and Java: XML Parsers in Java
January 12, 1999
XML parsers are distinguished
by two pairs of traits:
- whether they are validating (they check documents against a DTD) or
non-validating (they check only for well-formedness, with no DTD checking);
- whether they are lightweight and therefore intended for use
in applets, or are best suited for full-fledged
applications.
We note that DXP and AElfred are best for applets, while XML for
Java and XP are best for applications. See also the
Definitions section of Part 1 of the XML and Java article.
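To make the validating/non-validating distinction concrete, here is a minimal sketch. It assumes the JAXP classes bundled with current JDKs (javax.xml.parsers) rather than any of the parsers listed below, and the class name ValidationDemo and the sample document are illustrative only. The document is well-formed but breaks its own DTD, so only the validating configuration should report an error.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidationDemo {

    // Well-formed, but invalid: the DTD requires a <title> inside <book>.
    static final String DOC =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE book [\n"
      + "  <!ELEMENT book (title)>\n"
      + "  <!ELEMENT title (#PCDATA)>\n"
      + "]>\n"
      + "<book></book>";

    /** Parses DOC and returns the validity errors the parser reported. */
    public static List<String> parse(boolean validating) throws Exception {
        // Assumption: the JDK's default JAXP implementation is used.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating);
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Validity violations arrive here as recoverable errors.
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            // Well-formedness violations are fatal and abort the parse.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(DOC)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("non-validating errors: " + parse(false));
        System.out.println("validating errors:     " + parse(true));
    }
}
```

A non-validating parser still has to reject documents that are not well-formed; the flag only controls whether the DTD's constraints are enforced.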
XML Parser in Java (IBM) - validating
URL:
http://www.alphaworks.ibm.com/formula/xml
According to IBM, "XML for Java is a validating XML parser
written in 100% pure Java. The package (com.ibm.xml.parser)
contains classes and methods for parsing, generating, manipulating,
and validating XML documents. XML for Java is believed to be the
most robust XML processor currently available and conforms
most closely to the XML 1.0 Recommendation." IBM has released XML
for Java as a parser toolkit with the rights for developers to
re-distribute it within commercial products. XML for Java is also
known as XML4J; support for the DOM Level 1 Specification [01 Oct 1998]
was added in version 1.1.4. XML4J version 1.1.9 (and later) also
supports the W3C Proposed Recommendation for Namespaces in
XML and has been tested under Linux.
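As a rough illustration of what DOM Level 1 support gives a developer, the sketch below parses a small document and then manipulates the resulting tree. It is an assumption-laden example: it bootstraps the parser through the JAXP classes in current JDKs instead of the com.ibm.xml.parser package, and the class name DomDemo, the method addParser, and the element names are hypothetical. Only the tree calls themselves (createElement, setAttribute, appendChild, getElementsByTagName) come from DOM Level 1 Core.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomDemo {

    /** Parses xml, appends one <parser> element to the root, returns the new count. */
    public static int addParser(String xml, String name) throws Exception {
        // Assumption: JAXP bootstrapping, not the XML4J-specific API.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Generate a new element and attach it to the document element.
        Element added = doc.createElement("parser");
        added.setAttribute("name", name);
        doc.getDocumentElement().appendChild(added);

        // Count every <parser> element now present in the tree.
        NodeList parsers = doc.getElementsByTagName("parser");
        return parsers.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<parsers><parser name=\"XP\"/><parser name=\"Lark\"/></parsers>";
        System.out.println(addParser(xml, "AElfred"));
    }
}
```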
Java Project X Parser - validating and non-validating
URL:
http://developer.javasoft.com/developer/earlyAccess/xml/
The parsers from JavaSoft included in the Early Access release
of Java Project X (formerly the "XML Library") support
XML 1.0, SAX 1.0, DOM Level 1, and the XML Namespaces
proposed recommendation. The Release Notes state that "There are two separate parsers,
sharing almost all the same code. The validating parser is slightly
slower since it performs additional error checking." According
to the Java Project X FAQ, "In Sun's testing using JDK 1.1.6,
the validating parser (doing lots of error testing) was significantly
faster than the majority of the non-validating parsers tested, and
all other validating parsers. Of course, Sun's non-validating parser
is faster still."
Silfide XML Parser (SXP) - validating
URL:
http://www.loria.fr/projets/XSilfide/EN/sxp/
The Silfide XML Parser (SXP) is a parser and a complete XML API
in Java. It is part of XSilfide, a client/server based environment.
XSilfide includes SIL, the Silfide Interface Language, among other
things. "The SIL DTD is organized using modules, gathering (1)
the encoding of the user workspace (2) the encoding of the user
informations (3) the extended query language and (4) the encoding
of the queries result set."
DataChannel - Microsoft XML Parser for Java (XJ2) - validating
URL:
http://www.datachannel.com/xml/developers/parser.shtml
The DataChannel - Microsoft XML Parser for Java (DCXML) will be
the new XML parser included with Internet Explorer. [Netscape 5
will use expat, a C XML parser from James Clark.]
"This XML Java parser was announced
at the XML Developer's conference in Montreal, Canada on August
20, 1998. The XML Java technology is co-developed by Microsoft
and DataChannel and allows you to take your existing server-side
application and parse the data on the server. It also
allows for Multiple platforms functionality." According to the
press release:
REDMOND, Wash. - Aug. 20, 1998 - Microsoft Corp. and DataChannel Inc.
today announced that they have collaborated to deliver XML technology,
specifically, an enhanced XML parser written in the Java language.
Microsoft selected DataChannel for this effort because of the company's
expertise in both XML and the Java language.
The goal of the collaboration is to develop and deliver XML technology
that will allow developers to write XML-enabled applications on
multiple platforms, taking advantage of Microsoft XML functionality.
An early beta version of the XML parser written in Java will be
downloadable from
http://www.datachannel.com/xml.html and
http://www.microsoft.com/xml/ by the end of August.
XJ2 was formerly called "DataChannel XML Parser (DXP)".
According to the December 21, 1998 press release for Beta 2,
"[t]his release brings
the promise of XSL and XSL pattern matching capabilities to a
Java-based XML parser for the first time. This parser release
includes significant enhancements from the Beta 1 version of the
parser including: a validating XML engine, XSL support, and
transformations of data." Major enhancements include: direct
viewing of XML, additional functionality of the XML engine, XSL
support, XQL Querying of XML Data, XQL transformations of data,
and server-side XML, XSL, and XQL. Note: XQL is the
XML Query Language, a submission from Microsoft, Texcel,
and webMethods to the W3C.
See also the DataChannel XML Resources page.
Larval (Tim Bray) - validating
URL:
http://www.textuality.com/Lark/
Larval is Tim Bray's validating XML processor built on the same
code base as Lark (below). "Larval is a full validating XML
processor; it reports violations of validity constraints, but does
not apply draconian error handling to them."
Lark (Tim Bray) - non-validating
URL:
http://www.textuality.com/Lark/
Lark is a non-validating Java XML processor by Tim Bray, one of
the authors of the W3C XML spec. It implements all of the XML
1.0 Recommendation and reports violations of well-formedness.
XP (James Clark) - non-validating
URL:
http://www.jclark.com/xml/xp/index.html
James Clark's XML Parser (XP) in Java comes with javadoc
documentation.
"XP is an XML 1.0 parser written in Java. It is fully conforming:
it detects all non well-formed documents. It is currently not a
validating XML processor. However it can parse all external
entities: external DTD subsets, external parameter entities and
external general entities." XP is a high-performance parser intended
for use with Java applications rather than applets. It includes
a SAX driver implementation. (Clark has also developed SP,
a free, object-oriented toolkit for SGML parsing and entity
management; SP can parse XML and can convert SGML to XML.)
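Several parsers in this list ship a SAX driver, so it may help to see what the event-driven style looks like. The following is only a sketch: it uses the SAX2 DefaultHandler and the JAXP SAXParserFactory found in current JDKs, whereas the drivers described here target SAX 1.0, and the class name SaxDemo and the method elementNames are hypothetical. The handler receives a callback for each start tag and records the element name, without ever building a tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxDemo {

    /** Returns the element names of xml in the order their start tags occur. */
    public static List<String> elementNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();

        // The parser calls startElement once per start tag it encounters.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                names.add(qName);
            }
        };

        // Assumption: the JDK's default SAX parser stands in for XP or AElfred.
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<a><b/><c/></a>"));
    }
}
```

Because events are delivered as the input is read, this style keeps memory use low, which is one reason small parsers aimed at applets tend to offer it.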
AElfred (Microstar) - non-validating
URL:
http://www.microstar.com/aelfred.html
Demo URL:
http://www.microstar.com/aelfred/browser-test.html
According to Microstar, AElfred is "a small, fast, DTD-aware
Java-based XML parser, especially suitable for use in Java
applets. We've designed AElfred for Java programmers who want
to add XML support to their applets and applications without
doubling their size: AElfred consists of only two core class files,
with a total size of about 26K, and requires very little memory to
run. There is also a complete SAX (Simple API for XML) driver
available in this distribution for interoperability."
Note that Microstar also has a commercial XML authoring tool called
Near & Far Designer,
which has the distinction of being one of the few graphical
DTD environments that are XML-enabled. It is not covered in our
XML Editors section because it is not written in Java.
HTML Enabled XML Parser (HEX) - non-validating
URL:
http://www-uk.hpl.hp.com/people/ak/java/hex.html
HEX is the HTML Enabled XML Parser. It is a "simple, 100% Java,
non-validating XML parser with some hooks for more-or-less
correct parsing of most HTML pages. It doesn't understand either
SGML or XML DTD's but the parser API allows the application to
control its operation in ways that facilitate HTML parsing."
HEX includes an implementation of SAX. HEX also implements the
Java binding for the DOM core level one as per the March 1998
Working Draft.
Hubick's XML Analyzer (HXA) - non-validating
URL:
http://www.hubick.com/software/HXA/
Hubick's XML Analyzer "is a pure Java tool built upon a low level
XML parser (HXP) which breaks an XML file down into it's constituent
productions for analysis. HXA allows one to examine the production
hierarchy for any character in an XML document or document fragment.
For easy reference HXA also provides links from each production in
the analysis to its corresponding section in the XML specification."