WeblogTrix: Web Creation: Into the Tags

Attributes and their values
As mentioned, attributes are a means of adding extra information to elements in an XML document.
Attributes always specify some value, so you’ll always see attributes and values together in the form of
attribute-name="attribute-value". Generally in XML, each element can have an unlimited number
of attributes, though in practice each specific application of XML has a certain set of valid attributes
and valid values for those attributes. Part of learning any dialect of XML is learning which elements can
have which attributes and what the allowed values for those attributes are.
Attributes and their values are specified by adding their name and value inside the opening tag for the
element to which they apply. For instance, you can include the date this playlist was created as an
attribute of the <playlist> tag. If your playlist was created on May 28, 2006, you could add this data
to the <playlist> tag as follows:
<playlist created="May 28, 2006">

Alternatively, if you wanted to specify the author of the playlist, you might add that data like so:
<playlist author="johndoe">
To include both the author of the playlist and the date it was created, you simply include multiple
attributes in the same element:
<playlist author="johndoe" created="May 28, 2006">
This brings up an interesting question: what data should be marked up as an element, and what data
should be marked up as an attribute/value pair? You could just as easily have created an <author> element
containing johndoe inside the <playlist> element, or a <created> element in the same
way. Although there is no real right or wrong answer here, it’s generally considered best practice to
use attributes and values for metadata (or data about data) and to use elements for everything else.
In other words, if it’s data that’s meant to be seen by the end user, it’s best to mark it up in an element.
If it’s data that describes some other data in the document, it’s best to use an attribute/value
pair. This is often a confusing topic, so don’t think you have to get it perfect right away. Oftentimes
you just have to step back and take a look at how you’re intending to use the data and determine the
best solution for your markup based on its application.
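To make the trade-off concrete, here is a sketch of the same playlist data marked up both ways. The two options are separate documents shown side by side, and the song content is a hypothetical placeholder rather than part of the earlier example.

```xml
<!-- Option 1: author and creation date as attributes (metadata about the playlist) -->
<playlist author="johndoe" created="May 28, 2006">
  <song>
    <title>Example Song</title>
  </song>
</playlist>

<!-- Option 2: the same data as child elements -->
<playlist>
  <author>johndoe</author>
  <created>May 28, 2006</created>
  <song>
    <title>Example Song</title>
  </song>
</playlist>
```

Both forms are well-formed XML. Following the guideline above, Option 1 is the more conventional choice if the author and date only describe the playlist, while Option 2 makes sense if they are meant to be shown to the end user.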
Empty elements
There’s one other way you might encode information inside an XML document, and that is through
the use of an empty element. Empty elements are just like regular elements except that they don’t
contain any content, neither text nor other elements, inside them. Such elements are well suited for
things such as embedding pointers to other documents or objects (such as pictures) inside an XML
document or storing important “yes/no” (Boolean) values about the document.
For example, if you wanted to use your playlist document in an automated CD-burning application,
you might embed an empty element called <burn> inside the <playlist> element. It could have
an attribute called format whose value indicates the kind of CD format to use when burning
the mix:
<burn format="music" />
Notice that empty elements, like their nonempty counterparts, are both opened and closed. In the
case of empty elements, however, a single tag does both jobs: the element is closed by including
a forward slash right before the right angle bracket.
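As a quick illustration, the two forms below mean the same thing to an XML parser. This is only a sketch; the format value is reused from the example above.

```xml
<!-- Self-closing form: a single tag both opens and closes the element -->
<burn format="music" />

<!-- Equivalent long form: an opening tag followed immediately by a closing tag -->
<burn format="music"></burn>
```

The self-closing form is shorter and makes it obvious at a glance that the element is meant to be empty.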
Document types
When marking up a document, you must conform to some kind of standard of what elements, attributes,
and values are allowed to appear in the document, as well as where they are allowed to appear and
what content they are allowed to contain. If web developers didn’t conform to these standards, every
website in the world might use a different set of elements, attributes, and values, and no web browser
(much less any human!) would be guaranteed to be able to interpret the content on the web page
correctly. For example, where you used a <playlist> element to enclose your favorite mix in the previous
example, someone else might have used a <favorite-mix> element.

Luckily for us, the standards that web pages and most other kinds of XML documents use are published
publicly and are known as document types. An XML document declares itself as a specific type of
document by including a special <!DOCTYPE> declaration, placed before the root element, that refers
to a specific Document Type Definition (DTD). A Document Type Definition, in turn, is itself a
marked-up document that defines the rules, or grammar, of the specific document type.
DTDs are intended to be read by machines, not humans, so they can get pretty complicated. The
important thing for you as a web developer to understand is that it’s the document type that determines
exactly how you can use elements, attributes, and values to build your web page. In practice,
there are only a couple of different DTDs that web pages actually reference, so you’ll need to
become familiar with the rules of only those types.
As a brief exercise, let’s figure out what elements, attributes, and values are allowed in the simple earlier
example. First, you need to allow for the existence of the following elements: <playlist>, <burn>,
<song>, <title>, <artist>, <album>, and <released>. Next, you need to specify that the attributes
called created and author are permitted to be attached only to the <playlist> element and that the
format attribute can be attached to the <burn> element. You can say that any values are valid for
these attributes, or you can say that only “date” values are permitted for the created attribute and
only “text” values are permitted for the author and format attributes. If you do that, you’ll also have
to explicitly define what “date values” and “text values” are.
Document Type Definitions are useful not only so you can understand how to formulate your markup
properly but also so that your markup can actually be checked by a computer. Software programs
called validators can read a document of markup and its corresponding DTD, and they can alert you
to any inconsistencies between the two. Using your example, a validator could alert you if you accidentally
mistyped and created a <spong> element or tried to specify the author attribute on a <title>
element instead of <playlist>.
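The exercise above can be sketched as an actual DTD. Treat this as a minimal sketch under stated assumptions: the file name playlist.dtd, the content model (an optional <burn> followed by one or more <song> elements, each holding one <title>, <artist>, <album>, and <released>), and the choice of which attributes are optional are illustrative guesses, not part of the original example. Note also that DTD attribute types are limited: CDATA accepts any text, so restricting created to real dates would need a richer schema language such as XML Schema.

```xml
<!-- playlist.dtd: a hypothetical DTD for the playlist example -->

<!-- A playlist holds an optional burn instruction, then one or more songs (assumed) -->
<!ELEMENT playlist (burn?, song+)>
<!-- created and author are declared only here, so they are invalid on any other element -->
<!ATTLIST playlist
  author  CDATA #IMPLIED
  created CDATA #IMPLIED>

<!-- burn is an empty element carrying a format attribute -->
<!ELEMENT burn EMPTY>
<!ATTLIST burn format CDATA #REQUIRED>

<!-- Each song contains exactly these four children, in this order (assumed) -->
<!ELEMENT song (title, artist, album, released)>
<!ELEMENT title    (#PCDATA)>
<!ELEMENT artist   (#PCDATA)>
<!ELEMENT album    (#PCDATA)>
<!ELEMENT released (#PCDATA)>
```

A playlist document could then refer to this file with a declaration such as <!DOCTYPE playlist SYSTEM "playlist.dtd"> placed before the <playlist> element. A validating parser reading both would reject a <spong> element, because no such element is declared, and an author attribute on <title>, because <title> declares no attributes.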
Starting with XHTML
There are several flavors of both HTML and XHTML that have been specified over the years. As the
benefits of a standards-compliant Web began to be more widely accepted and the transition from
HTML to XML-based XHTML began, XHTML 1.0 was published with several document types. The two
you will encounter most often are XHTML 1.0 Transitional and XHTML 1.0 Strict. As the name
implies, the Transitional document type (or doctype for short) was meant to be more flexible and
forgiving than its Strict counterpart and was intended to be an intermediate step for developers
who were used to the looser grammar and syntax of HTML.
Because this transition should in most cases be over already and, more importantly, because you most
likely do not fit into the aforementioned category of developers, you will simply jump into learning
XHTML 1.0 Strict.
Document shell
When creating a new XHTML document, it’s easiest to start with a basic template because there are
several elements, attributes, and values you’ll always need, and most of them are difficult to remember.
First, create a new folder on your computer called html, and inside it create a new blank document
saved as index.html. Open index.html, and inside it, put the following code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title>New Document</title>
</head>
<body>
</body>
</html>
Before moving on, let’s examine what this markup actually means, piece by piece.
This is the code to identify the doctype of this document:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
This says that you’re using the XHTML 1.0 Strict definition published by the W3C and that the DTD is
available online at http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd. That’s it. It seems complicated,
but it’s going to be the same in every XHTML 1.0 Strict document and can simply be copied and
pasted in place every time.
This is the opening tag of the <html> element:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
Doctype excluded, everything else in the document will be inside this “root” element. The attributes
and values here specify the XML namespace (xmlns) as XHTML, the XML language as English, and the
document language also as English. Like the <!DOCTYPE> declaration, the opening of the <html> element
can also be copied and pasted to every (English) XHTML document. (For XHTML documents
whose content is not English, simply change the values of both the xml:lang and lang attributes.)
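For instance, a document whose content is in French might open as sketched below. The choice of French and its language code fr is only an illustration; the namespace value stays exactly the same, and xml:lang and lang are set to the same code.

```xml
<!-- Opening tag of the root element for a French-language XHTML document -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">
```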
The head
This is the <head> of the document:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title>New Document</title>
</head>
A large portion of information goes into the <head>, much more than you have right
now. The important distinction here is that none of the information in the <head> of the document actually displays in the web browser window (except for the information in the <title>). The browser will, however, read some of the information and use it in a
wide variety of ways.

The first element you see inside the <head> element is the (empty) <meta> element. As mentioned
earlier, metadata is data about data, and this element is used to provide information about the
document itself. In this case, it says that the content inside is text/html and all the
text is encoded using the UTF-8 character set.

Is this document XHTML, or is it HTML? Even though the doctype specifies that this is an
XHTML document, the <meta> element is telling the browser that it’s an HTML document.
So, who does the browser listen to? In reality, most of the time the browser doesn’t listen
to either of these and instead trusts the web server to set the appropriate HTTP headers.
If those headers are set, most browsers disregard both the doctype declaration and any value set in a
<meta> element and parse the markup as whatever the server says it is. Since most web
servers are configured to incorrectly tell browsers that XHTML documents are really
HTML documents, the designers of XHTML ensured that the markup was entirely compatible
with HTML to avoid problems that may arise from this inconsistency.

The next element is quite possibly one of the most important in the document: the <title> element.
It is simply a place to name the content of your document. The content inside the <title> element is
normally shown in the title bar of the web browser’s window or on the browser tab where the page is
being viewed. To change the value of the <title> element, simply change New Document to whatever
is most accurate and descriptive of your page content.
The title is also commonly used as the headline in search results from search engines such as Google
and Yahoo. When people are searching for content that is on your page and search engines find and
show your page as a result, the title of your page is often the only thing they have to decide whether
your page is what they want. If you intend your page to be found by people using search engines, pay
close attention to the <title> element.
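As a small sketch of the difference a descriptive title makes, compare the template’s placeholder with a more specific one. The page and site names here are hypothetical.

```xml
<!-- Placeholder from the template: tells readers and search engines nothing -->
<title>New Document</title>

<!-- A more descriptive alternative (hypothetical page and site names) -->
<title>Summer Road Trip Playlist | John Doe’s Music Blog</title>
```

Only one <title> element belongs in a given document; the two are shown together here purely for comparison.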
