This Year’s Document Object Model

When web standards are being discussed, more often than not it’s CSS that takes centre stage. Yet the Document Object Model is a working standard approved by the World Wide Web Consortium that is correctly implemented in most modern browsers.

The Document Object Model, or DOM for short, is an API: an Application Programming Interface. Essentially, it provides a means whereby programming languages can "talk" to HTML and XML documents. Usually the programming language that does the talking is JavaScript.

JavaScript first appeared in Netscape Navigator 2 and, by the third generation of browsers from Netscape and Microsoft, was supported by both (Microsoft’s implementation was called JScript). At that time, Netscape were very much the market leaders. Microsoft, with its relatively new Internet Explorer, was struggling to catch up.

JavaScript was capable of manipulating aspects of the browser itself. It offered a means of creating and changing the properties of browser windows by providing a Browser Object Model. The language also allowed limited access to some of the elements of the document contained within the browser window, such as forms and images. This was a very simple type of Document Object Model.

In those early, carefree days, the most common usage of JavaScript was to create rollover effects and client-side form validation.

When version 4 browsers appeared, the DOM really hit the fan.

The browser wars of the late nineties were a dark, dark time for web development. We are still suffering from the fallout to this day.

The first salvo was fired by Netscape in June 1997. From their dominant market position, the company launched their Netscape 4 browser touting all sorts of new enhancements. Microsoft launched Internet Explorer 4 in October of the same year, also promising myriad benefits.

Both browsers offered more JavaScript functionality, urging us developers to try out the latest buzzword: DHTML.

DHTML means Dynamic HTML. It isn’t actually a single technology. It’s the marriage of HTML, CSS and JavaScript. The basic thinking goes like this:

  1. Use HTML to mark up your document into elements.
  2. Use CSS to apply styles to those elements.
  3. Use JavaScript to manipulate and change those styles.

Using DHTML, things like animation suddenly became possible. Let’s say you marked up an element of your document by giving it the unique ID "mything":

<div id="mything">The Thing</div>

Using CSS, you can position "mything" wherever you like in the browser window:

#mything {
   position: absolute;
   top: 50px;
   left: 100px;
}

Using JavaScript, you can then update the position by changing the "top" or "left" style associated with "mything".
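
As a rough sketch of that last step, a function along these lines could nudge the element across the window. The name moveElement and its parameters are assumptions made for illustration, not part of any standard. The element is passed in as an argument so that the sketch doesn’t depend on how a particular browser looks it up.

```javascript
// Hypothetical helper: shift a positioned element by dx and dy pixels.
function moveElement(element, dx, dy) {
  // Style values are strings such as "100px", so parse out the number first.
  // Positions declared only in a stylesheet don't show up in element.style,
  // so a missing value is treated as 0 here (an assumption of this sketch).
  var left = parseInt(element.style.left, 10) || 0;
  var top = parseInt(element.style.top, 10) || 0;
  element.style.left = (left + dx) + 'px';
  element.style.top = (top + dy) + 'px';
  return element;
}
```

Calling a function like this repeatedly on a timer is the essence of DHTML animation.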

That was the theory anyway. Unfortunately, although they were both promoting the same ends, the two battling browsers used entirely different means.

Netscape’s Document Object Model required the use of proprietary elements called layers. These layers could be given unique IDs and addressed in JavaScript like this:

document.layers['mything']

Internet Explorer used this syntax:

document.all['mything']

The differences didn’t stop there. To manipulate the styles attached to an element, different syntaxes were also required.

Here’s how you retrieve the left position of a layer in Netscape 4:

var xpos = document.layers['mything'].left;

In Internet Explorer 4 it would be:

var xpos = document.all['mything'].style.posLeft;

Even the simplest operations required ridiculous amounts of forking and browser-sniffing.
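
To give a flavour of that forking, here is a sketch of a function that reads the left position in either browser, using only the two proprietary syntaxes shown above. The name getLeftPosition is made up for illustration, and the document object is passed in as a parameter (doc) purely so that the sketch can be tried outside a version 4 browser.

```javascript
// Hypothetical helper illustrating the forking that version 4 browsers required.
function getLeftPosition(doc, id) {
  if (doc.layers) {
    // Netscape 4: a layer exposes its position directly.
    return doc.layers[id].left;
  }
  if (doc.all) {
    // Internet Explorer 4: the position lives on the style object.
    return doc.all[id].style.posLeft;
  }
  // Neither proprietary DOM is available.
  return null;
}
```

Every other property needed the same treatment, which is why scripts of the time grew so unwieldy.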

DHTML promised a fantastic world of possibilities. But anybody who actually attempted to use it discovered a world of pain instead.

All of which is a real shame, because all of the incompatibilities could have been avoided if the browser makers had simply paid attention to the already existing standards.

While Netscape and Microsoft were locked in battle using their proprietary DOMs as weapons, a standardised Document Object Model already existed and had been adopted by the World Wide Web Consortium. Despite the fact that Netscape and Microsoft both belonged to the W3C, they simply ignored the standard.

This wasn’t just a DOM problem: the browser makers also implemented non-standard CSS and even made up some HTML tags and attributes.

Going back to our example, let’s see how it would be handled by the standardised Document Object Model. We have an element (a <div>) with a unique ID ("mything") and we want to find out what left position has been set through its style property (a value declared only in a stylesheet won’t be reported this way):

var xpos = document.getElementById('mything').style.left;

At first glance, that doesn’t look any better than either of the proprietary DOMs. However, the standardised DOM is far, far more powerful.

As well as document.getElementById, the DOM also provides document.getElementsByTagName. This means that JavaScript no longer has to restrict itself to altering the styles of uniquely identified elements. Using the DOM, it’s possible to create, manipulate and destroy any element in a document.
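
As a minimal sketch of what that opens up, the function below changes the colour of every element of a given type in one go. The name colourAll and its parameters are assumptions made for the example; it relies only on getElementsByTagName and the style property.

```javascript
// Hypothetical helper: apply a text colour to every element with the given tag name.
function colourAll(doc, tagName, colour) {
  var elements = doc.getElementsByTagName(tagName);
  for (var i = 0; i < elements.length; i++) {
    elements[i].style.color = colour;
  }
  // Report how many elements were changed.
  return elements.length;
}

// In a browser this might be called as: colourAll(document, 'p', 'red');
```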

The power that this offers is nothing short of astounding.

The browser wars were eventually won by Internet Explorer. Ironically, the battles waged with competing DOMs and proprietary mark-up turned out to be completely pointless. Microsoft were in a winning position because they made the Windows operating system, pre-installed on most PCs. By bundling their browser in with their operating system, their victory was guaranteed.

The people hit hardest by the browser wars were web developers who attempted to get their CSS and JavaScript working across both browsers. DHTML became a dirty word.

A backlash formed against the proprietary stance of the browser manufacturers. The Web Standards Project was formed to pressure browser makers into following the W3C standards.

Whether or not the WaSP had any great influence on Netscape or Microsoft isn’t clear, but for whatever reason, the browser makers began to follow the standards.

Microsoft Internet Explorer 5 shipped with support for the DOM. It also maintained backwards compatibility with the old, proprietary IE DOM.

Netscape, meanwhile, decided to make a clean break. They built their next browser version from scratch. The standardised DOM was supported and the old Netscape DOM wasn’t. They also decided to completely skip the number 5 and release the browser as Netscape 6. Apart from the name, it has next to nothing in common with earlier Netscape browsers.

In the meantime, other browser manufacturers have appeared on the scene and, for the most part, they have learned from the lessons of the past. When Apple debuted its Safari web browser there was no question that it would follow the standards.

Both Microsoft and Netscape have continued to improve their support for standards in subsequent releases, although Internet Explorer has stagnated at version 6.

The stagnated development of Internet Explorer notwithstanding, life has improved greatly for web developers. Instead of coding for specific browsers, forking code and writing browser-sniffing scripts, we are in a position to write once, publish everywhere. As long as we stick to the standards, we can be sure that our scripts will work in the majority of browsers in use today.

Let’s take a closer look at the standardised DOM.

The W3C describes the DOM as:

A platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents. The document can be further processed and the results of that processing can be incorporated back into the presented page.

The key word here is "dynamic". It’s always been possible to change the style and contents of a page after it’s been submitted for server-side processing, but the DOM offers a way to alter the style and contents of a page without the need for a page refresh.

However, as with CSS, there are good practices to be observed when using the DOM. The principles of graceful degradation and progressive enhancement still apply.

Whether you’re applying CSS, JavaScript or both, the starting point must be a semantically well marked-up document.

Let’s remind ourselves of an adage that’s been floating around since the start of the web:

Content Is King

Without content, there’s nothing to style and nothing to script. Of course, you can’t simply write out your nice content and put it on the web… Well, you could, but that would be a text file rather than a web page. To create a web page, your content needs to be wrapped up in HTML (or, better yet, its cleaner, leaner version: XHTML).

So let’s just revise our adage:

Well Marked-Up Content Is King

Mark-up describes what the elements of a document are. It doesn’t say anything about how those elements should look or how they should behave.

The mark-up says this is a level one heading, this is a paragraph, this is another paragraph. They are all pieces of content, but we distinguish between them using tags like <h1> and <p>.

We can distinguish further by applying classes and IDs to elements. Let’s say we want to distinguish between the introductory paragraph and every other paragraph. We could add an ID attribute and give it a descriptive value like "intro". Effectively, it’s a way of saying this is the introductory paragraph, which is more descriptive than simply saying this is a paragraph.

By applying this logic to other page elements (lists, forms, links and so on), we create a well marked-up document. We could upload it to a web server, and any user agent that understands HTML or XHTML would be able to display it.

We are now free to style this document in any way we wish by applying a presentation layer on top of the document. In an external file, we can use CSS to say things like display the level one heading in red 14pt Arial or display the paragraph with the ID "intro" in bold.

It’s also possible to apply a behaviour layer. Again, an external file is the best way to do this. The language we use to alter the behaviour of the document is JavaScript. The Document Object Model gives us a way of addressing any element in a document. Once we can interface with a page element like this, we can get information about it, change its value or add event handlers.

Let’s take a step back and look at how not to add style and behaviour.

It’s possible to mix styles right in with the mark-up like this:

<h1 style="font-size: 15pt; color: red">My headline</h1>

As well as increasing the weight of the document and making it less readable, this makes the job of altering styles at a later date much, much trickier. It’s far simpler to have an external stylesheet that declares:

h1 {
 font-size: 15pt;
 color: red;
}

The mark-up would simply be:

<h1>My headline</h1>

Similarly, it’s possible to add JavaScript right in with the mark-up:

<input type="button" id="clickbutton" value="click me!" onclick="alert('you clicked me!'); return false" />

It makes more sense to add the behaviour in an external file like this:

function clickMessage() {
 alert('you clicked me!');
 return false;
}
document.getElementById('clickbutton').onclick = clickMessage;

The mark-up would then be:

<input type="button" id="clickbutton" value="click me!" />

Using the Document Object Model, the JavaScript finds the page element with the ID "clickbutton" and adds a function to be executed when that element is clicked. Moving the event handler to an external script like this is a good example of unobtrusive JavaScript.

One caveat about adding the behaviour to your document is that it can only be done once the document has loaded (otherwise the DOM has nothing to talk to: the document is empty). This means adding functions to the window.onload event handler:

function prepareButton() {
 document.getElementById('clickbutton').onclick = clickMessage;
}

window.onload = prepareButton;

Also, it’s a good idea to make sure that the browser currently executing the JavaScript is capable of understanding the DOM. A simple check for the existence of getElementById should suffice:

function prepareButton() {
 if (!document.getElementById) return false;
 document.getElementById('clickbutton').onclick = clickMessage;
}

window.onload = prepareButton;
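
Bear in mind that assigning to window.onload replaces whatever function was stored there before, so two scripts that both do it will clash. A common workaround is a small helper that chains the functions together. The sketch below is one way of doing that; the name addLoadEvent is an assumption, and the window object is passed in as a parameter (win) so that the example doesn’t depend on a browser being present.

```javascript
// Hypothetical helper: queue func to run on load without discarding
// any handler that has already been assigned to win.onload.
function addLoadEvent(win, func) {
  var oldOnload = win.onload;
  if (typeof oldOnload !== 'function') {
    // Nothing assigned yet, so func becomes the handler.
    win.onload = func;
  } else {
    // Wrap the existing handler and the new function together.
    win.onload = function () {
      oldOnload();
      func();
    };
  }
}

// In a browser this might be called as: addLoadEvent(window, prepareButton);
```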

Applying the presentation and behaviour layers can be done with a few short lines in the <head> of the document. For CSS add:

<link rel="stylesheet" type="text/css" media="screen" href="path/to/stylesheet.css" />

For JavaScript use:

<script type="text/javascript" src="path/to/javascript.js"></script>

If you were to strip away the presentation and behaviour layers, you’d still be left with a perfectly good web page. The well marked-up content would still be king… albeit a king with no clothes.

By applying the principle of graceful degradation along with the separation of content, style and behaviour, it’s possible to create easily maintainable scripts that can enhance your documents without being required by them.

Let your imagination run riot. If you can think of a nifty way of adding some behaviour to a web page and you can think of the process to achieve it with the DOM, then it’s simply a matter of translating your idea into JavaScript. Here are a few examples of DOM applications written out in English:

Have certain links open in a new browser window simply by giving those links a class like "popup".
  As soon as the page loads, loop through all the links on the page. If a link has the class "popup", change its behaviour so that when the link is clicked, it opens in a new browser window.
Automatically refresh a webcam image every 30 seconds.
  Give the image a unique ID like "webcam". Write a function that updates the src attribute of the image. Have this function call itself every 30 seconds.
Make the default placeholder text in <input type="text"> disappear when the user brings the element into focus.
  When the page loads, loop through all the <input> elements in the page. Check to see if the type attribute is set to "text" and that the value attribute isn’t empty. Make a note of the default value. When the user brings the <input> into focus, check the current value against the value that was there when the page first loaded. If they are the same, change the value attribute to be blank.
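
The first of these could be sketched roughly as follows. The function name preparePopups is an assumption, and the document and window objects are passed in as parameters (doc and win) so that the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: make every link with the class "popup" open in a new window.
// doc and win stand in for the browser's document and window objects.
function preparePopups(doc, win) {
  // Bail out quietly if the browser doesn't understand the DOM.
  if (!doc.getElementsByTagName) return false;
  var links = doc.getElementsByTagName('a');
  for (var i = 0; i < links.length; i++) {
    // className can hold several space-separated classes, so test for a whole word.
    if (/(^|\s)popup(\s|$)/.test(links[i].className)) {
      links[i].onclick = function () {
        // "this" is the link that was clicked.
        win.open(this.href);
        // Returning false stops the link from also loading in the current window.
        return false;
      };
    }
  }
  return true;
}

// In a browser this might be run on load as: preparePopups(document, window);
```

If JavaScript isn’t available, the links still work as ordinary links, which is graceful degradation in action.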

Those are just some simple examples that would require the use of DOM methods like getElementById, getElementsByTagName and getAttribute to "talk" to the web page. Just imagine the possibilities when you start using createElement, appendChild and other methods that allow you to actually add page elements on the fly.
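
To hint at what adding elements on the fly looks like, here is a small sketch that builds a new paragraph and attaches it to an existing element. The name appendParagraph and its parameters are assumptions for the example. It also uses createTextNode, another standard DOM method that isn’t mentioned above.

```javascript
// Hypothetical helper: build a new paragraph and add it to the end of a parent element.
function appendParagraph(doc, parent, text) {
  var para = doc.createElement('p');
  // Text lives in its own node, which is appended to the paragraph.
  para.appendChild(doc.createTextNode(text));
  parent.appendChild(para);
  return para;
}

// In a browser this might be called as:
// appendParagraph(document, document.body, 'Added on the fly');
```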

We have a web standard with great, untapped potential. DOM is the new CSS.