HTML I've never seen before

Improv_Geek · February 11, 2005, 4:40pm

I’ve been doing web design for a while and my dad asked me to do some work for him but the code is unlike anything I’ve seen before. Anyone recognize it and know where I can learn more about it?

The code has parts such as <O:P> and inside image tags it has v:shapes as an attribute.

Anyone?

Duckster · February 11, 2005, 4:44pm

It may be the page was created in Microsoft Publisher.

Bill_H · February 11, 2005, 5:45pm

Or with Microsoft Word.

ouryL · February 11, 2005, 5:48pm

I concur. I have seen programs which clean these extraneous tags off.

aldiboronti · February 11, 2005, 6:01pm

Yep, it’s Publisher, as you can see from this page.

ChordedZither · February 11, 2005, 6:01pm

The part in front of the colon is a namespace prefix. In XML (and newer versions of HTML are incorporating more and more general XML features - some are even well-formed XML) a namespace is used to prevent clashes between elements that might have the same name but convey different meanings. In your example, <O:P> names a <P> element that might have nothing to do with the HTML paragraph element.

You can get some clue about the origin of these namespaces by looking for the definition of each prefix. Somewhere up above the place where <O:P> and <v:shapes> are being used, you should find an element with attributes
xmlns:O="…some long string…"
and, not necessarily in the same element,
xmlns:v="…some other long string…"
The strings are URIs that represent a unique identifier for the set of tags being associated with that prefix (“O” or “v”). The prefixes themselves are arbitrary, but the corresponding URIs uniquely identify the set of tags being used. In many cases, you can figure out the general purpose of that set of tags by examining those URIs. Although it’s not required, some people use “real” URLs for the URIs and place documentation for the tag set at that web location.
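To make the point about arbitrary prefixes concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The URI in it is an assumption: it is the one Office-generated pages commonly declare for the "o" prefix, so check the xmlns attributes in your own file. The two sample documents are made up and well-formed XML, which real Word or Publisher output often is not.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the URI Office-generated pages commonly declare for "o".
# Check the xmlns:... attributes in your own file for the real values.
OFFICE_NS = "urn:schemas-microsoft-com:office:office"

# Two hypothetical documents: same URI, different prefixes.
DOC_A = f'<html xmlns:o="{OFFICE_NS}"><body><o:p>x</o:p><p>y</p></body></html>'
DOC_B = f'<html xmlns:office="{OFFICE_NS}"><body><office:p>x</office:p><p>y</p></body></html>'


def declared_prefixes(xml_text):
    """Map each declared namespace prefix to its URI."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    return {prefix: uri
            for _, (prefix, uri) in ET.iterparse(source, events=("start-ns",))}


def qualified_tags(xml_text):
    """Return every element name in document order, expanded to {uri}local form."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


prefixes_a = declared_prefixes(DOC_A)
tags_a = qualified_tags(DOC_A)
tags_b = qualified_tags(DOC_B)
```

The parser throws the prefix away and keeps the URI, so tags_a and tags_b come out identical even though one document says "o:p" and the other "office:p". The plain <p> has no URI attached, which is how it stays distinct from the prefixed one.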

If you’re going to work with this style of markup, you should start with some basic tutorials on XML.

Cerowyn · February 11, 2005, 6:26pm

To add to what ChordedZither posted, the tags are often embedded in HTML to enable “round-tripping” documents from other applications. For instance, Word infamously buries a huge volume of extra tags and formatting information when it saves a document in HTML format. However, that allows Word to subsequently re-load the document and retain almost all of the extra information that would otherwise have been lost in a strict HTML page.

mhendo · February 11, 2005, 8:57pm

A problem, of course, is that it is often a fucking nightmare to view an HTML page created by a Microsoft program in anything except Microsoft’s IE browser.

gotpasswords · February 11, 2005, 10:07pm

Dreamweaver does a fine job of stripping out all that crap and making normal HTML out of it.

I’m sure other apps do this as well - you know it’s bad when an app has a built-in “Clean up Word HTML” item on its menu - not even as an option or plugin.
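For a rough idea of what such a cleanup involves, here is a naive Python sketch. It is not what Dreamweaver or any other tool actually does; it is just a regex pass, under the assumption that every tag or attribute with a "prefix:name" form (like the <O:P> and v:shapes from the first post) can be dropped. The function name and the sample markup are made up for illustration.

```python
import re

# Opening, closing, or self-closing tags with a prefixed name: <o:p>, </O:P>, <v:shape ... />
PREFIXED_TAG = re.compile(r"</?[A-Za-z][\w.-]*:[\w.-]+(?:\s[^<>]*)?/?>")

# Prefixed attributes with a value: v:shapes="...", xmlns:o="..."
PREFIXED_ATTR = re.compile(
    r"""\s+[A-Za-z][\w.-]*:[\w.-]+\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)"""
)


def strip_prefixed_markup(html):
    """Remove prefixed tags first, then prefixed attributes on the remaining tags."""
    html = PREFIXED_TAG.sub("", html)
    return PREFIXED_ATTR.sub("", html)


# Hypothetical sample in the style described in the first post.
sample = '<p>Hello<O:P></O:P></p><img src="a.gif" v:shapes="_x0000_i1025">'
cleaned = strip_prefixed_markup(sample)
```

This only removes the markup itself and keeps any text between the prefixed tags. Regexes are a blunt instrument for HTML, so treat it as a starting point and compare the result against the original page before trusting it.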
