Tags
March 29, 1999
By and large, tags make up the majority of XML markup.
A tag is pretty much anything between a < sign and a > sign
that is not inside a comment or a CDATA section (we'll discuss
these in a bit). In short, it is pretty much the same as an
HTML tag.
The rules governing tags are a little more complex than those
governing character data. Let's take a look at them....
Gimme Something to Work With
For one, all well-formed XML documents must have at least one
element! (More precisely, exactly one outermost "root" element
must contain all the others.)
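To see this rule in action, here is a minimal sketch that hands a few strings to the XML parser in Python's standard library. The helper name is_well_formed and the GREETING element are made up for illustration, and other parsers may word their complaints differently.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the standard-library parser accepts the text as XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# GREETING is an invented element name used only for this example.
print(is_well_formed("<GREETING>Hello</GREETING>"))  # a single element
print(is_well_formed("Hello"))                        # character data with no element
print(is_well_formed(""))                             # nothing at all
```

Only the first string should be accepted, since it is the only one that contains an element.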
Watch Your Case
Also, care must be taken to ensure that you maintain case
within a tag set.
In other words, the tags <HELLO> and <hello> are not
equivalent, as they would be in HTML.
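A quick way to convince yourself of this is to feed a parser a start tag and an end tag that differ only in case. The sketch below assumes Python's standard-library parser; the variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

# Start and end tags agree in case, so this parses.
matched = ET.fromstring("<HELLO>hi</HELLO>")
print(matched.tag)

# Same letters, different case: the parser treats these as two different tags.
try:
    ET.fromstring("<HELLO>hi</hello>")
    case_mismatch_accepted = True
except ET.ParseError as err:
    case_mismatch_accepted = False
    print("rejected:", err)
```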
End Your Tags Right
Further, besides being spelled and capitalized the same way
as their start tag counterparts, end tags must include an
initial forward slash "/". Thus in most cases, a start tag
of <HELLO> should be closed with a </HELLO>.
I say "in most cases" because in certain circumstances you can
bypass the end tag. Specifically, if you need to use a tag
that has no content, you may use a single start tag with a
trailing forward slash, such as:
<HR/>
The "<HR/>" case is called an "Empty Element", empty
because it has no content. Since HR is already an HTML element,
I suggest inventing one, like <IMAGE/>. The point is
that Empty Elements often will have attributes that give them
greater usefulness. (Think of the IMG element in HTML. Even the
HR element now has several attributes.)
- Ken Sall
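As a rough illustration of empty elements, the sketch below parses the short form and the equivalent start tag/end tag pair and compares what the parser reports. The src attribute and its value "logo.gif" are invented for the example.

```python
import xml.etree.ElementTree as ET

# The src attribute and "logo.gif" are hypothetical, chosen only for illustration.
short_form = ET.fromstring('<IMAGE src="logo.gif"/>')
long_form = ET.fromstring('<IMAGE src="logo.gif"></IMAGE>')

for element in (short_form, long_form):
    # Tag name, attributes, text content, and number of child elements.
    print(element.tag, element.attrib, element.text, len(element))
```

Both forms should come out the same: an IMAGE element that carries an attribute but has no text and no children.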
Nest Properly
Also, note that XML elements may contain other elements but
the nesting of elements must be correct. Thus the following
example is wrong:
<CONTACT>
<NAME>Frank Foo
<EMAIL>frank@foo.com
</CONTACT></NAME></EMAIL>
Instead, it should be:
<CONTACT>
<NAME>Frank Foo</NAME>
<EMAIL>frank@foo.com</EMAIL>
</CONTACT>
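The two CONTACT examples above can be checked mechanically. This is only a sketch using Python's standard-library parser; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

badly_nested = """<CONTACT>
<NAME>Frank Foo
<EMAIL>frank@foo.com
</CONTACT></NAME></EMAIL>"""

properly_nested = """<CONTACT>
<NAME>Frank Foo</NAME>
<EMAIL>frank@foo.com</EMAIL>
</CONTACT>"""

# The first version closes CONTACT while NAME and EMAIL are still open.
try:
    ET.fromstring(badly_nested)
    bad_accepted = True
except ET.ParseError as err:
    bad_accepted = False
    print("rejected:", err)

# The second version parses, and CONTACT ends up with two child elements.
contact = ET.fromstring(properly_nested)
children = [(child.tag, child.text) for child in contact]
print(children)
```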
Name Your Tags Legally
Tag names must begin with either a letter, an underscore (_)
or a colon (:), followed by some combination of letters, digits,
periods (.), colons, underscores, or hyphens (-) but no white
space. The one exception is that no tag name should begin with
any form of "xml" (in any mix of upper and lower case), since
such names are reserved. It is also a good idea not to use a
colon as the first character in a tag name even though it is
legal. Using a colon first could be confusing.
Further, though the XML 1.0 standard allows names of any
length, actual XML processors may limit the length of markup
names.
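The naming rule can be approximated with a regular expression. The sketch below is deliberately simplified: it only knows about ASCII letters and digits, whereas XML 1.0 also permits many non-ASCII letters, and the function name is_legal_name is made up for this example.

```python
import re

# First character: a letter, underscore, or colon.
# Remaining characters: letters, digits, periods, colons, underscores, hyphens.
# ASCII-only simplification; real XML names may contain other letters too.
NAME_PATTERN = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")

def is_legal_name(name):
    """Apply the simplified naming rule described in the text."""
    if name.lower().startswith("xml"):
        return False  # names starting with any form of "xml" are off limits
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["CONTACT", "_private", "first-name.v2",
                  "2nd_NAME", "FIRST NAME", "XmlData"]:
    print(candidate, is_legal_name(candidate))
```

The first three candidates pass; the last three fail because of a leading digit, embedded white space, and a leading "xml" respectively.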
Define Valid Attributes
Finally, tags may specify any number of supporting attributes.
Each attribute is a name/value pair joined by an equal sign (=),
with the value delimited by quotation marks, and no attribute
name may appear more than once in the same tag. For example:
<SHOE style = "spectator" coloring = "black_and_white">
Unlike HTML, XML specifies that values MUST be delimited with
quotation marks (either single or double).
In this case, style and coloring are attributes
of the SHOE tag; "spectator" is the value of the style
attribute and "black_and_white" is the value of the coloring
attribute.
Attribute names follow the same conventions as tag names
(valid characters, case sensitivity, etc). Values, on the
other hand, may include white space and punctuation, and
may include entity references when necessary.
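Here is a small sketch of how a parser treats the attribute rules. The SHOE tag is written as an empty element so that it forms a complete document on its own; the helper name parses and the second style value "loafer" are invented for the example.

```python
import xml.etree.ElementTree as ET

# White space around the equal sign is allowed.
shoe = ET.fromstring('<SHOE style = "spectator" coloring = "black_and_white"/>')
print(shoe.attrib)

def parses(text):
    """Return True if the standard-library parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(parses('<SHOE style="spectator" style="loafer"/>'))  # same attribute twice
print(parses('<SHOE style=spectator/>'))                   # value without quotation marks
print(parses("<SHOE style='spectator'/>"))                  # single quotes instead of double
```

The duplicate attribute and the unquoted value should both be rejected, while the single-quoted value is accepted.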
Note that values are not typed. That is, they are all
considered to be strings. Thus if you were to process the tag
<ROOM_SIZE RADIUS = "10" DEPTH = "13">
you would have to convert "10" and "13" to their numeric values
outside of the XML environment.
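The conversion might look something like this in practice. This is only a sketch: the ROOM_SIZE tag is written as an empty element so it can be parsed by itself, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

room = ET.fromstring('<ROOM_SIZE RADIUS = "10" DEPTH = "13"/>')

# The parser hands attribute values back as plain strings.
radius_text = room.get("RADIUS")
print(type(radius_text).__name__)

# Converting to numbers is the application's job, not the parser's.
radius = int(radius_text)
depth = int(room.get("DEPTH"))
print(radius + depth)
```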