I’ve been doing web design for a while and my dad asked me to do some work for him but the code is unlike anything I’ve seen before. Anyone recognize it and know where I can learn more about it?
The code has parts such as <O:P> and inside image tags it has v:shapes as an attribute.
The part in front of the colon is a namespace prefix. In XML (and newer versions of HTML are incorporating more and more general XML features - some are even well-formed XML) a namespace is used to prevent clashes between elements that might have the same name but convey different meanings. In your example, <O:P> names a <P> element that might have nothing to do with the HTML paragraph element.
You can get some clue about the origin of these namespaces by looking for the definition of each prefix. Somewhere above the place where <O:P> and v:shapes are being used, you should find an element with the attribute
xmlns:O="…some long string…"
and, not necessarily in the same element,
xmlns:v="…some other long string…"
The strings are URIs that represent a unique identifier for the set of tags being associated with that prefix (“O” or “v”). The prefixes themselves are arbitrary, but the corresponding URIs uniquely identify the set of tags being used. In many cases, you can figure out the general purpose of that set of tags by examining those URIs. Although it’s not required, some people use “real” URLs for the URIs and place documentation for the tag set at that web location.
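If you want to see the prefix-to-URI mapping for yourself, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample document and both URIs are made up for illustration (they are not the real strings from your dad's files), and the approach assumes the markup is well-formed XML, which HTML saved from an office application often is not.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample. The URIs below are invented for
# illustration; substitute whatever strings appear in the real file.
SAMPLE = """<html xmlns:O="urn:example:office-tags" xmlns:v="urn:example:vector-shapes">
  <body>
    <p>Hello<O:P></O:P></p>
    <img src="a.gif" v:shapes="shape1" />
  </body>
</html>"""


def list_namespaces(xml_text):
    """Return a dict mapping each declared prefix to its URI."""
    namespaces = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events fire once per xmlns:prefix="uri" declaration.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        namespaces[prefix] = uri
    return namespaces


def qualified_names(xml_text):
    """Return every element name with its prefix expanded to {uri}name."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]


if __name__ == "__main__":
    print(list_namespaces(SAMPLE))
    print(qualified_names(SAMPLE))
```

The second function shows why the prefix itself is arbitrary: the parser replaces "O:" with the full URI in braces, so two documents that use different prefixes for the same URI end up with identical element names.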
If you’re going to work with this style of markup, you should start with some basic tutorials on XML.
To add to what ChordedZither posted, the tags are often embedded in HTML to enable “round-tripping” documents from other applications. For instance, Word infamously buries a huge volume of extra tags and formatting information when it saves a document in HTML format. However, that allows Word to subsequently re-load the document and retain almost all of the extra information that would otherwise have been lost in a strict HTML page.
A problem, of course, is that it is often a fucking nightmare to view an HTML page created by a Microsoft program in anything except Microsoft’s IE browser.
Dreamweaver does a fine job of stripping out all that crap and making normal HTML out of it.
I’m sure other apps do this as well - you know it’s bad when an app has a built-in “Clean up Word HTML” item on the menu - not even as an option or plugin.
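For anyone without one of those apps, the basic idea of the cleanup can be approximated in a few lines of Python. This is only a rough sketch, not what Dreamweaver or any other tool actually does: it assumes that "anything with a colon in its name" is the extra markup, drops those tags and attributes (including the xmlns:… declarations), and keeps the text inside them. Comments, doctypes and embedded styles are simply not re-emitted, so a real cleanup would need more care. The class and function names are made up for this example.

```python
from html import escape
from html.parser import HTMLParser


class NamespaceStripper(HTMLParser):
    """Re-emit HTML, skipping tags and attributes whose names contain a colon."""

    def __init__(self):
        # Keep entity and character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _render(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if ":" in name:
                continue  # e.g. v:shapes or xmlns:o
            if value is None:
                parts.append(name)
            else:
                # The parser hands us unescaped values, so escape them again.
                parts.append(f'{name}="{escape(value, quote=True)}"')
        return "<" + " ".join(parts) + closing + ">"

    def handle_starttag(self, tag, attrs):
        if ":" not in tag:
            self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if ":" not in tag:
            self.out.append(self._render(tag, attrs, " /"))

    def handle_endtag(self, tag):
        if ":" not in tag:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_namespaced(html_text):
    """Return html_text without namespaced tags and attributes."""
    parser = NamespaceStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(strip_namespaced('<p class="x">Hi<o:p></o:p> there</p>'))
```

Note that stripping this markup throws away exactly the information the original application needs for round-tripping, so keep a copy of the original file if it may ever need to be reopened there.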