This is a repost for archival purposes of a mini-rant I wrote for another forum. Enjoy its rant-y goodness.

In the beginning, there was SGML, the Standard Generalized Markup Language. It is an international standard, very complex and rigidly structured, and it's actually a tool for describing markup languages (a metalanguage).

HTML is kind of a bastard son of SGML, but it carries some of the genes. The idea of a Document Type Definition (which seems not to get used much) comes from SGML, and the concept of structural markup is from SGML as well. The markup was intended to describe the function of the text and leave the user agent (usually a browser) free to render it according to local user preferences.

Well, that's not what happened at all. Users subverted the structural purpose of tags and used them for layout control. For example, there's an <EM></EM> tag that is supposed to indicate emphasized text, which may be rendered in bold in a graphical browser but in all caps on a text terminal. But no one uses that; they reach for the <B></B> tag instead. The browser companies threw gasoline onto the fire with more tags for manipulating layout: first <TABLE>, <BLINK></BLINK>, and body background images, then <FONT></FONT>, table background images, and GIF transparency (which got abused to do layout).

Before long, everyone was whining about how HTML was a crappy layout tool. And it is, you know. If you've ever worked in a real page layout tool, trying to do artistic things with HTML is very painful. But that didn't stop anyone from trying. The next wave included applets, embedded multimedia gizmos, and animated GIFs, which just made everything tackier and less functional. The original idea of structural markup appeared to be dead in the water.

Enter Cascading Style Sheets. Some smart people said, "Remember, the original concept of markup was to separate the data from its presentation." And CSS is a great tool for that. You can completely decouple what the data is and means from how it's presented. That's important. Not only do you have better tools with CSS to do real layout, but you can do real layout without severely mangling the text you are presenting. I know you've all seen pages where the ratio of markup to content is amazingly high. Tools like Dreamweaver just keep wrapping tags as you make changes, so you see stuff all the time that looks like this:

<FONT size="2"><FONT color="red"><FONT face="Arial, sans-serif"><FONT size="4"><FONT color="white"><B>data</B></FONT></FONT></FONT></FONT></FONT>

And the brilliant "designers" who produced pages like that were billing $100/hr. Dreamweaver lets them get more done in a billable hour, but only by producing code that is hugely bloated. In the above example, our genius "designer" changed her mind just once about the color and once about the size of that text. It can get a lot worse, and I've seen it (and had to maintain it).

So what's so bad about font tags? They have no relationship to the structure of the data, just to the presentation. Compare this to a CSS approach. I want to do exactly what the genius designer does, and make my text big bold red Arial. Let's say it's a warning notice on my page somewhere. Thanks to CSS1 and HTML4, all I need to do is:

<SPAN class="warning">data</SPAN>

and have an appropriate selector and rule in my properly linked stylesheet (a separate document, which the client can cache for efficiency):

.warning {
  font: bold large Arial, sans-serif;
  color: red;
}

Now we're getting somewhere. This is a real advance in technology. I can streamline the actual markup (33 bytes versus 145), the style info is stored (and transmitted and cached) separately from the content, I get a self-documenting structural markup tag, I can reuse the "warning" selector freely throughout my whole site, and I can update all the elements to which I have applied the selector with a single stylesheet change.

Pay attention to that last one. I said, "I can update all the elements to which I have applied the selector with a single stylesheet change." Who cares? You do. Creating cool websites is fun, but maintaining them is anti-fun. It's like Kryptonite. CSS is the antidote. If you do your design job well, embrace functional markup, and observe rigorous separation of style and content, maintaining your site will be almost as much fun as building it was. Or if you do this professionally, the person who follows in your tracks and has to maintain your code will sing songs of joy instead of firebombing your home.
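
The byte counts quoted above (33 versus 145) are easy to check. What follows is a minimal sketch in Python, chosen purely for illustration since nothing in CSS or HTML depends on it. The two strings are copied from the snippets earlier in this post; the helper names markup_bytes and markup_to_content_ratio are made up for this example.

```python
# Both snippets are copied verbatim from the examples earlier in the post.
FONT_SOUP = (
    '<FONT size="2"><FONT color="red"><FONT face="Arial, sans-serif">'
    '<FONT size="4"><FONT color="white"><B>data</B>'
    '</FONT></FONT></FONT></FONT></FONT>'
)
SPAN_MARKUP = '<SPAN class="warning">data</SPAN>'


def markup_bytes(markup: str) -> int:
    """Return the size of the markup in bytes when encoded as ASCII."""
    return len(markup.encode("ascii"))


def markup_to_content_ratio(markup: str, content: str) -> float:
    """Return how many bytes are sent per byte of actual content."""
    return markup_bytes(markup) / markup_bytes(content)


if __name__ == "__main__":
    # Sizes of the two ways of marking up the same four-byte word.
    print(markup_bytes(FONT_SOUP), markup_bytes(SPAN_MARKUP))
    # Bytes on the wire per byte of content for the nested FONT version.
    print(markup_to_content_ratio(FONT_SOUP, "data"))
```

Run as a script, this prints 145 33 and then 36.25: the nested FONT version spends more than 36 bytes for every byte of content, and the stylesheet rule that replaces it is downloaded once and cached.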

OK, what about XHTML?

Remember SGML, our granddaddy? He's a randy type and gave birth recently to another offspring called XML. XML was created around the idea that machine-readable, functional markup (sounds a lot like the original goals of HTML) was a worthwhile thing, but that HTML was too flabby for the job while SGML was too complex to be useful. Hence, XML. It is a metalanguage like its parent, in that you can use XML to define your own markup language. That's useful, especially in business-to-business situations where you need to update inventory with a trading partner. Once you've adopted an agreed-on markup language (an XML DTD or schema), your systems can talk to the partner's systems in a direct way about your inventory. It's a universal data interchange format of sorts.

Well, XHTML is a redefinition of HTML4 as an XML DTD. Not only can you impress your friends at parties by saying that, but it has some benefits. HTML has always had a pretty well-defined syntax, but the user agents haven't insisted on syntactic correctness, and all sorts of hilarity (not) ensued as things like DHTML came into common use. Browsers had to deal with an increasingly wacky Document Object Model, a scripting framework to let ambitious programmers and designers manipulate that DOM, and several versions of HTML syntax. The kicker is that they all had code to try to "do the right thing" when faced with malformed, syntactically incorrect (like nested <A> tags), and generally bozotic HTML (about 90% of all markup fits this description).

XHTML changes that. Sort of. An XHTML document is required to be a well-formed XML document, which means no syntax errors allowed. Some weirdness about empty versus non-empty containers was resolved, which brought about the new syntax for empty tags (like <br />). It's a good thing. You can choose to look at it like HTML4 with some syntax ambiguities clarified if you like. But the real benefit is still a way off, over the horizon. As I said, XHTML documents (if they validate) are well-formed XML documents. We could eventually see the advent of sophisticated new web robots that do useful things like scour the web to find you the best price on a motherboard, or locate an eye doctor near you.
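
What "well-formed" means in practice can be shown with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is an illustrative choice and not something this post prescribes. The function name is_well_formed and the three sample fragments are hypothetical. Note that this checks only well-formedness; it does not validate against the XHTML DTD the way a real validator would.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


# Hypothetical samples, made up for this illustration.
HTML_STYLE = "<p>line one<br>line two</p>"     # bare <br>, as HTML4 allows
XHTML_STYLE = "<p>line one<br />line two</p>"  # empty-element syntax
NESTING_ERROR = "<p><b><i>oops</b></i></p>"    # overlapping tags

if __name__ == "__main__":
    for sample in (HTML_STYLE, XHTML_STYLE, NESTING_ERROR):
        print(is_well_formed(sample), sample)
```

Only the second sample passes. An XML parser does not try to "do the right thing" with the bare <br> or the overlapping tags; it simply rejects them, which is exactly the strictness XHTML inherits.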

Those things are still in the future, but they represent the original promise of the web (no, it really wasn't pr0n). Getting on the XHTML bandwagon now is healthy because it advances the state of the design art toward that ultimate goal of a semantically marked up web. Future XML user agents will be able to use your present XHTML code.
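
To make the "future XML user agent" idea a little more concrete, here is a toy sketch of a program reading XHTML as plain XML and pulling out everything marked with a given class. It is an assumption-laden illustration, not a real robot: the page content is invented, find_by_class is a made-up helper, and Python's xml.etree.ElementTree is just one convenient XML parser.

```python
import xml.etree.ElementTree as ET

# An invented page for illustration; any well-formed XHTML would do.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <p>Routine text.</p>
    <p><span class="warning">Back up your data first.</span></p>
    <p><span class="warning">Do not unplug the drive.</span></p>
  </body>
</html>"""


def find_by_class(xhtml: str, class_name: str) -> list:
    """Return the text of every element whose class attribute lists class_name."""
    root = ET.fromstring(xhtml)  # raises ParseError if the page is not well-formed
    hits = []
    for element in root.iter():
        # The class attribute may hold several space-separated names.
        if class_name in element.get("class", "").split():
            hits.append("".join(element.itertext()).strip())
    return hits


if __name__ == "__main__":
    for text in find_by_class(PAGE, "warning"):
        print(text)
```

Because the page is well-formed and the markup says what the text is rather than how it looks, a generic XML tool can find the warnings without knowing anything about fonts or colors. Try that with five nested FONT tags.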

How do you take advantage of it? What do you do differently to make your HTML4 document into an XHTML document? Will it break current browsers? The answer is here. It's not too hard to convert from HTML to XHTML, and the guidelines linked above will get you over most of the hurdles. Use a validator. Correct syntax is important. Validate even if you use a tool to generate your code. It's an eye-opening experience; for one thing, you may not be so impressed with your tool. Be sure you declare an appropriate DOCTYPE, because the validator needs to know which standard to validate your document against.

text, scripts and images copyright © 2001-2011 . All rights reserved.