Metatags and Other <Head>aches
July 15, 2002
At the beginning of most HTML pages is a <Head> section,
hidden from the user but highly visible to browsers and other
bits of Internet machinery such as search engine spiders and
caching systems. It's full of meta tags and other sorely-abused
bits of HTML.
If you're a developer using a WYSIWYG editor, it's easy to
neglect the <Head> section and hope that it takes care of
itself, when in reality a little more work here might give you
more of a return than spending extra time on the <Body>
section.
This is especially true of framed pages. If you complete the
<Head> section only for the frameset, and not for the
individual pages, you may be wasting potential.
Here we'll look at the common elements contained within the
<Head> section, most of which are META tags, starting
with the most common and interesting, and finishing up with a
few dull ones you may see used even though they aren't
particularly beneficial.
Introduction
Metatags divide into two types: those identified by an
http-equiv attribute, and those identified by a name attribute.
Both types carry a content attribute, written content=.
Examples:
<meta name="resource-type" content="document">
<meta content="text/html; charset=ISO-2022-JP" http-equiv=Content-Type>
The order of material inside a meta tag is not critical. Also,
as with other HTML tags, developers often play fast and loose
with capitalisation and quote marks. Using quote marks is the
'proper' way to do things, but you'll find them missing from
some popular sites, and their tags still seem to work. If the
content value is in two parts, separated by a semicolon and a
space, you should definitely use quote marks: an unquoted value
ends at the first space, so the second part is lost and the tag
may fail.
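To see what a machine might make of a quoted and an unquoted tag, here is a minimal sketch using the HTML parser from Python's standard library. It stands in for "some reader of the page" only; real browsers and spiders have their own parsers and may behave differently. The names MetaCollector and parse_meta are made up for this illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects the attributes of every <meta> tag it sees."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names and attribute names for us,
        # so <META ...> and <meta ...> arrive here looking the same.
        if tag == "meta":
            self.metas.append(dict(attrs))


def parse_meta(html):
    """Hypothetical helper: returns one dict of attributes per <meta> tag."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.metas


# Same tag twice: different attribute order, different capitalisation,
# and the second one has no quote marks around the two-part content value.
quoted = parse_meta(
    '<META http-equiv=Content-Type content="text/html; charset=ISO-2022-JP">'
)[0]
unquoted = parse_meta(
    "<meta content=text/html; charset=ISO-2022-JP http-equiv=Content-Type>"
)[0]

print(quoted["content"])    # the whole two-part value
print(unquoted["content"])  # cut short at the first space
```

With this parser, attribute order and capitalisation make no difference, but the unquoted content value stops at the space, and the charset part is read as a separate attribute instead.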
The order of metatags within the <Head> section is also
no big deal. You might decide to put the more generic kind of
tags, like Content-Type, first, but machines reading the
<Head> information don't really care either way.
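As a rough illustration of why tag order doesn't matter, the sketch below reads two <Head> sections that list the same tags in a different order and with different capitalisation, then compares what was found. It uses Python's standard-library parser as a stand-in for a spider; HeadReader and read_head are hypothetical names, and the two sample heads are invented for the example.

```python
from html.parser import HTMLParser


class HeadReader(HTMLParser):
    """Sorts <meta> tags into the two kinds, keyed by lowercase identifier."""

    def __init__(self):
        super().__init__()
        self.named = {}       # tags identified by name=
        self.http_equiv = {}  # tags identified by http-equiv=

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        content = attributes.get("content", "")
        if "http-equiv" in attributes:
            self.http_equiv[attributes["http-equiv"].lower()] = content
        elif "name" in attributes:
            self.named[attributes["name"].lower()] = content


def read_head(html):
    """Hypothetical helper: returns (named tags, http-equiv tags)."""
    reader = HeadReader()
    reader.feed(html)
    reader.close()
    return reader.named, reader.http_equiv


head_a = """<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta name="description" content="Free Internet fiction">
</head>"""

head_b = """<HEAD>
<META content="Free Internet fiction" name=DESCRIPTION>
<META content="text/html; charset=ISO-8859-1" http-equiv=Content-Type>
</HEAD>"""

# Both heads yield the same information once the tags are keyed by identifier.
print(read_head(head_a) == read_head(head_b))
```

Because the reader looks tags up by their name or http-equiv value, it ends up with the same result whichever tag comes first.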
Regular stuff
Content-Type
<META content="text/html; charset=ISO-2022-JP" http-equiv=Content-Type>
The main use of this tag is to specify unusual charactersets.
For example, if you are about to use Japanese or Arabic
characters in the main body of the page, this is the place to
say so. It gives browsers a chance to preload the characterset
or tell the user that it's missing.
Confusingly, specifying the characterset doesn't specify a set
of characters; it specifies a kind of character encoding. This
is a detail that isn't important unless you run into problems
with charactersets, and then it's worth remembering.
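The difference between characters and their encoding can be sketched in a few lines of Python. This is only an illustration of the idea, not of what any particular browser does: the sample text is an assumption, and the "wrong" declaration is simulated by decoding the same bytes with a different codec.

```python
text = "日本語"  # sample text, chosen for illustration

# The bytes that would travel over the wire for a page saved as Shift_JIS.
raw = text.encode("shift_jis")

# A reader told the right encoding recovers the original characters.
declared_correctly = raw.decode("shift_jis")

# A reader assuming ISO-8859-1 gets no error (every byte is valid there),
# just the wrong characters.
declared_wrongly = raw.decode("iso-8859-1")

print(declared_correctly == text)
print(declared_wrongly == text)
print(len(text), "characters,", len(raw), "bytes")
```

The bytes are identical in both cases; only the declared encoding decides which characters they turn into.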
Note that this tag is not a 100% reliable way of getting
non-Western characters to display in all browsers, especially
on operating systems other than Windows.
The default characterset is ISO-8859-1. In theory, this should
always be specified too, but many sites don't bother, possibly
because the Content-Type tag has been known to generate its own
glitches, for example in old versions of Netscape and on
non-Windows systems, where it can occasionally cause pages to
fail entirely.
The default charset is sometimes replaced with a Windows
characterset, charset=windows-1252. Also common are ISO-8859-5,
which supports Cyrillic; EUC-JP for Japanese; SHIFT_JIS for
Japanese/Kanji; GB2312 for Chinese (People's Republic of
China); BIG5 for Chinese (Taiwan); and UTF-8, which truly
demonstrates that this is all about encoding, because it covers
the characters of all these scripts but uses a different number
of bytes (one to four) depending on the character.
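A short Python sketch can make the byte counts concrete. The sample characters and the helper can_encode are assumptions picked for this illustration; the codec names are the ones Python's standard library uses for the charsets mentioned above.

```python
# One sample character per script (chosen for illustration).
samples = ["A", "é", "Ж", "日"]

# UTF-8 uses a different number of bytes depending on the character.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
print(utf8_lengths)


def can_encode(ch, charset):
    """Hypothetical helper: True if the charset has a byte sequence for ch."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The Cyrillic letter fits ISO-8859-5 but not the ISO-8859-1 default.
print(can_encode("Ж", "iso-8859-5"), can_encode("Ж", "iso-8859-1"))

# The euro sign exists in windows-1252 but not in ISO-8859-1,
# one reason the two are not interchangeable.
print(can_encode("€", "windows-1252"), can_encode("€", "iso-8859-1"))
```

The single-byte charsets each cover only their own repertoire, while UTF-8 covers all four samples at the cost of a variable number of bytes.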
It can get quite complicated. If you want to get your feet
dirty, try the W3C pages on document representation, where you
can be sure to get mud on your boots.
Description and Keywords
<META content="Free Internet fiction" name=DESCRIPTION>
<META content="free book novel fiction epublishing ebook online publishing" name=KEYWORDS>
These two tags may (or may not) be used by search engines when
indexing your site and determining its position in their
rankings. They're also the two tags that generate the most
nonsense from ranking experts. It's true, there was a time when
careful choice of description and keywords could have a
dramatic effect on the ranking of a page - but that time has
passed. The system has been so abused that some search engines
ignore these tags altogether.
Here are some brief guidelines:
- Don't spend a long time working out what to put in these
  tags. That time would be better spent getting your keywords
  and key expressions into the first hundred words of main text
  on your page.
- Short lists of keywords (say, 10) are more effective than
  long lists.
- Avoid repetition of keywords, but don't worry if the
  occasional repeat creeps in.
- Try to avoid keywords that don't appear in the main text.
- Some experts separate keywords with commas and some use
  spaces. Nobody is sure that one system is definitely better
  than the other.
- The description is very likely to appear in search engine
  results, but may not contribute to your ranking, so write it
  for human readers and don't worry too much about what the
  search engine spiders will make of it.
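Some of these guidelines are mechanical enough to check with a script. The following Python sketch is one possible way to do it, not a tool any search engine uses: the function name check_keywords, the limit of 10, and the sample page text are all assumptions made for the illustration.

```python
import re


def check_keywords(keywords, body_text, max_keywords=10):
    """Hypothetical helper: returns a list of warnings for a keywords value."""
    # Accept either commas or spaces as separators, since both are in use.
    words = [w.lower() for w in re.split(r"[,\s]+", keywords) if w]
    body_words = set(re.findall(r"[a-z0-9]+", body_text.lower()))

    warnings = []
    if len(words) > max_keywords:
        warnings.append(
            f"{len(words)} keywords; consider trimming to about {max_keywords}"
        )
    repeats = sorted({w for w in words if words.count(w) > 1})
    if repeats:
        warnings.append("repeated: " + ", ".join(repeats))
    # dict.fromkeys drops duplicates while keeping the original order.
    missing = [w for w in dict.fromkeys(words) if w not in body_words]
    if missing:
        warnings.append("not in main text: " + ", ".join(missing))
    return warnings


keywords = "free book novel fiction epublishing ebook online publishing"
body = "Free fiction: read a novel online as an ebook or book."
for warning in check_keywords(keywords, body):
    print(warning)
```

The check is deliberately crude (it compares whole words only), which fits the advice not to spend long on these tags.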
Title
<TITLE>Ants and Bees</TITLE>
This isn't a metatag but it does go inside the <Head>
section. It's the title of your page. Remember the title is
often analysed by search engines as part of the indexing and
ranking process, so it's a good idea for it to contain
keywords. Your title will usually be shown in search engine
results and in Favorites/Bookmarks. In Windows it will also
appear on a taskbar button at the bottom of the screen, where it
may be massively cut to fit. It's therefore worth getting some
useful words in right at the beginning, rather than "Welcome to
ZZZZZ.com", which might be polite but won't be very useful when
it shows up at the bottom of the screen as "Welcome".
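The effect of truncation is easy to simulate. In this Python sketch the function name taskbar_label and the width of 8 characters are purely hypothetical; the real cut-off depends on the screen, the font, and how many windows are open.

```python
def taskbar_label(title, width=8):
    """Hypothetical helper: cuts a title to a fixed number of characters."""
    return title[:width].rstrip()


for title in ["Welcome to ZZZZZ.com", "Ants and Bees"]:
    print(repr(taskbar_label(title)))
```

The title that starts with its subject still says something after the cut; the polite one does not.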