You’ve gotten some very good information, but I thought I could add a bit of detail to sum it all up for you.
SGML (Standard Generalized Markup Language) is the daddy of them all. It is a “meta-language”, which basically means that it is used to describe other languages. HTML is one of those other languages. HTML isn’t written in SGML in a programming sense. Instead it’s “defined” using SGML.
If you look at XHTML as cleaned-up (the official term is “well-formed”) HTML, you cover about 99% of the bases. One of the main goals was to make HTML more portable. For those who haven’t downloaded a browser lately: they are getting rather massive, which makes them difficult to implement well on cell phones, PDAs, etc. One thing that adds to the complexity of browsers is the need to read badly formed HTML documents. XHTML is an attempt to counter this.
All start tags in XHTML must have end tags. If you want to start a paragraph, you damn well better end it somewhere. This makes the parsing code much smaller. When the code runs across a <p>, it can safely keep parsing the text as a paragraph until it runs across a </p>. It doesn’t have to look for numerous other tags, some of which would also indicate that the designer had intended for the paragraph to end. As stated earlier, all element and attribute names must also be lowercase. You don’t have to include logic to handle both <B> and <b>; you know it will always be the latter. To anyone with programming experience, this will start looking very helpful.
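To make that concrete, here is a minimal sketch in Python using the standard library's strict XML parser. It is not how a browser works internally; it only shows how little code a parser needs when it is allowed to reject anything that isn't well-formed. The function name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # A strict parser simply gives up; no guessing what the author meant.
        return False


# Every <p> is closed, so a strict parser accepts this.
closed = "<div><p>First paragraph.</p><p>Second paragraph.</p></div>"

# Old-style HTML: the paragraphs are never closed.
unclosed = "<div><p>First paragraph.<p>Second paragraph.</div>"

# Names are case-sensitive, so <B> is not closed by </b>.
mixed_case = "<div><B>bold</B></div>".replace("</B>", "</b>")

print(is_well_formed(closed))      # True
print(is_well_formed(unclosed))    # False
print(is_well_formed(mixed_case))  # False
```

All of the "figure out what the designer meant" logic that a forgiving HTML parser needs is replaced here by a single error path.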
The easiest way to describe XML is as an SGML replacement that is still understandable by a layman. I do this kind of stuff for a living, but SGML is still scary as hell. XML contains a grammar that defines documents. You could even define HTML in XML (which is basically what XHTML is doing). mattk is correct when stating that there are not a lot of standard definitions out there yet, but they are slowly being created. Once this happens, you can expect XML to take off like a rocket.
An example of how this might work in the real world:
If the real estate industry had a standard that defined several tags, such as [numbedrooms][/numbedrooms], [stories][/stories], [imgfront][/imgfront], etc., any application that wanted to display home information could do so, regardless of which realtor’s database contained the actual listing.
Currently, most users (displayers) of foreign data must deal with the conversion process on their end. If the source changes, tough luck. XML standardization puts the onus on the source of the data. If someone doesn’t comply, software applications in that industry will just ignore them.
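Here is a rough sketch of the real estate idea in Python. The tag names numbedrooms, stories, and imgfront come from the example above (written with angle brackets, as real XML would be); everything else, including the listing wrapper element, the file names, the sample data, and the summarize function, is invented for illustration and is not any actual industry standard.

```python
import xml.etree.ElementTree as ET

# Tags every compliant listing is assumed to carry (from the example above).
REQUIRED_TAGS = ("numbedrooms", "stories", "imgfront")

# Two hypothetical realtors publishing the same tags in different layouts.
REALTOR_A = """<listing>
  <numbedrooms>3</numbedrooms>
  <stories>2</stories>
  <imgfront>front-a.jpg</imgfront>
</listing>"""

REALTOR_B = (
    "<listing><imgfront>house.png</imgfront>"
    "<stories>1</stories><numbedrooms>4</numbedrooms></listing>"
)

# A source that invented its own tag names instead of following the standard.
NONCOMPLIANT = "<listing><beds>3</beds><floors>2</floors></listing>"


def summarize(xml_text):
    """Return a display string for a compliant listing, or None to ignore it."""
    try:
        listing = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # not even well-formed
    values = {tag: listing.findtext(tag) for tag in REQUIRED_TAGS}
    if any(value is None for value in values.values()):
        return None  # a required tag is missing: the source's problem, not ours
    return (
        f"bedrooms: {values['numbedrooms']}, "
        f"stories: {values['stories']}, "
        f"front photo: {values['imgfront']}"
    )


for source in (REALTOR_A, REALTOR_B, NONCOMPLIANT):
    print(summarize(source))
```

The same few lines handle both realtors without any per-source conversion code, and the non-compliant listing is simply skipped, which is the "onus on the source" point in practice.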
As for learning XHTML, if you already know HTML, then you just have to unlearn and learn a few things and you’re done. If you don’t know HTML, you can still become fairly adept in a few weeks, using any online tutorial or book. There are many tricks that you can pick up over time, but the basics are a breeze.
As for lucrative skills in web publishing, a combination of (X)HTML, Java, and XML will definitely work. Other possibilities are ColdFusion, ASP, JavaScript, etc. The market for those who only know HTML is a bit flooded, so it’s not as lucrative as it once was.