What is XML?

I was at www.snopes.com and they have an option on their “What’s New?” page to view it in “XML”. I clicked it and was sent to something that resembled my (admittedly ignorant) view of computer coding. What’s XML?

In the future: Google works wonders. :wink:

Put basically:
XML stands for Extensible Markup Language. It is a markup language much like HTML. However, unlike HTML, XML was designed to describe data, not to display it, if that makes any sense.

Although if you add an .xsl file (an Extensible Stylesheet Language stylesheet) to an XML document, then it displays quite nicely.

In HTML, which we all know and love, bits of text are placed inside “mark-up” tags which control how the text is displayed in a web browser …

XML is “extensible”, in that the standard allows you to make up your own tags … once you have a document in XML format, you can write (or use) software that interprets the contents of those tags any way you want it to. You can certainly get some nice effects by running your XML document with an XSL stylesheet and displaying the results in a browser … or, there are all sorts of other programs out there that will take XML docs and do nifty things with them. (You can feed an XML document directly into some kinds of database. Or use one to describe or control the output of another program - you should see what a Visio diagram looks like, converted to XML. Or maybe you shouldn’t - it makes for quite a big XML doc.)
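To make the "make up your own tags" point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The <recipe> and <ingredient> tags, their attributes, and the shopping_list function are all invented for illustration; they are not part of any standard. The XML only records the structure, and the program decides what the tags mean.

```python
import xml.etree.ElementTree as ET

# A made-up document: none of these tag names come from any standard.
RECIPE_XML = """<recipe name="Pancakes">
  <ingredient amount="2" unit="cups">flour</ingredient>
  <ingredient amount="2" unit="whole">eggs</ingredient>
  <ingredient amount="1.5" unit="cups">milk</ingredient>
</recipe>"""


def shopping_list(xml_text):
    """Interpret each <ingredient> tag as one line of a shopping list."""
    root = ET.fromstring(xml_text)
    return [
        f"{item.get('amount')} {item.get('unit')} {item.text}"
        for item in root.findall("ingredient")
    ]


lines = shopping_list(RECIPE_XML)
print("\n".join(lines))
```

A different program could read the very same document and do something else with it, such as totalling quantities or loading the rows into a database.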

In addition to the uses listed by Steve Wright, XML is also becoming the basis for many communication protocols. Many new programs like chat clients use XML for messaging formats, and most web services are based on exchanging XML over HTTP. Older protocols such as SMTP and HTTP are based on header/body formats using delimited headers, and they are not as flexible or easy to parse as XML.

As was mentioned regarding Visio files, XML is becoming the default file storage format for many apps. Even Microsoft is moving away from proprietary formats for things like Word documents to use XML which is more portable.

XML was over-hyped as the next huge thing and it’s never going to live up to what breathless but clueless zealots said about it, but it is a very flexible method for storing and transmitting data. The advantage of XML is that it is very easy to process. Most modern programming languages include XML parsers which allow the programmer to parse the XML either sequentially or based on content (essentially a search, allowing the programmer to very quickly find specific kinds of tags and load their data while ignoring the rest of the document).
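The two parsing styles mentioned above can be sketched with Python's standard xml.etree.ElementTree module. The <catalog> document and both function names are made up for illustration. The first function walks the document in order with iterparse; the second loads it and searches for the tags it cares about, ignoring the <cd> entry.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this example.
CATALOG_XML = """<catalog>
  <book id="b1"><title>XML Basics</title><price>20</price></book>
  <cd id="c1"><title>Greatest Hits</title><price>12</price></cd>
  <book id="b2"><title>Parsing in Practice</title><price>35</price></book>
</catalog>"""


def titles_sequential(xml_text):
    """Read the document front to back, reacting as each element closes."""
    titles = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        # By the time a <book> closes, its children have been parsed.
        if elem.tag == "book":
            titles.append(elem.findtext("title"))
    return titles


def titles_by_search(xml_text):
    """Load the whole tree, then search for just the tags of interest."""
    root = ET.fromstring(xml_text)
    return [title.text for title in root.findall("./book/title")]


print(titles_sequential(CATALOG_XML))
print(titles_by_search(CATALOG_XML))
```

Both calls return the same book titles here. The sequential style tends to suit very large documents, since nothing forces you to keep the whole tree in memory, while the search style is usually shorter to write.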

You can have data encoded with XML on YOUR computer and MY computer can read and display that data. It’s kind of like having a distributed database over the web. I’m not way off track here am I?

That’s interesting. I was talking to someone at my university a year or so back. He knew stuff about web design, and when I told him I thought it would be cool to learn HTML, he told me that HTML would be pretty much dead before too long, and that anyone who wanted to learn anything should go straight to XML, as it was the Next Big Thing.

I ended up learning some HTML, and so far it’s all I’ve ever needed, although admittedly my web pages have never needed to be very complex.

In fact, Mozilla (and the newer Netscape browsers based on the Gecko engine) uses a “dialect” of XML to determine the way the user interface works. This makes these browsers highly customizable. There is more information on using XUL (XML User Interface Language) here. That shows the great advantage of XML: the fact that it is extensible. Basically, anything that can be described in words can be described in XML. A simple example of an XML document might look something like this:

<body>Don’t forget me this weekend!</body>

There are programs that parse XML documents, and by using CSS (Cascading Style Sheets) or XML Style Sheets (XSL; which are themselves written in XML) you can customize how they are displayed. You can do all kinds of neat stuff with XML, including object oriented databases.

The funny thing is that the XML specification is itself written in XML. :smiley:

He was essentially correct, but a little premature. HTML is basically a fixed set of tags in a syntax closely related to XML (both descend from SGML), and XHTML is that same set of tags rewritten to follow XML’s stricter rules. The new standards like XHTML will be more XMLish in how flexible they are (e.g. create your own tags instead of making do with the HTML standard) but the existing HTML standards are alive and well and everyone doing web pages uses them if they care about compatibility with old browsers. As the installed base of old browsers declines and most of the audience has browsers which support the new standards, we’ll be free to use more general XML. In any case, if all you care about is designing web pages then “learning XML” is overkill; you really just need to familiarize yourself with an existing HTML/XHTML tag set. When you say “learn XML”, I think of writing and using parsers, which you need if you’re going to write a browser, but not if you’re going to write a webpage displayed in a browser.

He was absolutely correct that XML is the Next Big Thing as far as HTML is concerned, and this is exactly the kind of problem XML is designed to handle (in the case of HTML, tagging data to describe its meaning and/or display). When I referred to the overhype, I was referring to some predictions that XML would obsolete everything from relational databases to sliced bread overnight.

Speaking of which…

Could I add my own XSL stylesheet easily to the Snopes data, for example? Or is that something that has to be done by whoever posted the data (in which case it doesn’t seem very different from posting the information in HTML)?

If so… how do I go about it??

Well, it would probably be easier if Snopes did it for you and then offered it as a service. eBay does it for you; see eBay’s developer’s area.

My Q …

When you are processing things and rendering things on a web page, aren’t you still limited to what basic HTML can do? I mean you could have a group of HTML statements generated by one XML statement, but in the end you are still bound by the capabilities of HTML. True?

Well, as I understand XML, (which isn’t that much, as I said,) it’s really meant for other things than displaying data. HTML and style sheets are great at displaying data - where XML shines is in doing other things to it… reading the information into a database. Converting it into a spreadsheet. Scanning it every day and flagging you with an email if some combination of criteria appear.
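As a rough sketch of the "converting it into a spreadsheet" idea, the Python below turns a small XML document into CSV text that a spreadsheet program could open. The <orders> document, its field names, and the orders_to_csv function are assumptions made up for this example; only the standard library modules are real.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical business data, invented for illustration.
ORDERS_XML = """<orders>
  <order id="1001"><customer>Acme</customer><total>250.00</total></order>
  <order id="1002"><customer>Globex</customer><total>99.50</total></order>
</orders>"""


def orders_to_csv(xml_text):
    """Flatten each <order> element into one CSV row."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "customer", "total"])
    for order in root.findall("order"):
        writer.writerow([
            order.get("id"),
            order.findtext("customer"),
            order.findtext("total"),
        ])
    return out.getvalue()


csv_text = orders_to_csv(ORDERS_XML)
print(csv_text)
```

Note that the conversion is done by the program, not by the XML; the XML just makes the fields easy to find.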

Also, with style sheets, two people can view the same data in very different ways, according to their own personal preferences. At least, so I have heard. :wink:

Once you have the XML, you can display it as you see fit. That’s the point of XML: It’s a simple, standard way to encode structured content, completely independent of how that content will be displayed later on. The specifics of displaying XML are completely dependent on what program you will be using, and each program is able to display it in a different way, as will become very clear in the next few paragraphs.

(However, I’d probably use an RSS viewer to view the snopes XML content. That way, it can be streamed across a viewer ticker-style and I can pick out the specific stories I want to read. Or not, depending on the software. Find a good RSS viewer here.)

For example, you could display the data as a webpage. In that case, yes, you would be bound by HTML, if you were indeed targeting browsers that can only deal with HTML input. You could also display the data in its own window in a graphical environment, where you’d only be limited by what the graphical system allows you to do.

Hell, you could display the data by translating it all into Morse code and broadcasting it to your nuclear submarines beneath the Arctic ice cap. It doesn’t really matter, and the XML file itself doesn’t know or care what it will end up being used for. All XML does is encode the structure in the information itself.
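Here is a small sketch of that independence between content and display, assuming a made-up <stories> document (not the real Snopes format). The same parsed data is rendered once as an HTML list and once as plain text; the function names are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Invented sample content; the tag names are not from any real feed.
STORIES_XML = """<stories>
  <story><headline>Cats &amp; Dogs</headline><status>false</status></story>
  <story><headline>Giant Squid</headline><status>true</status></story>
</stories>"""


def parse_stories(xml_text):
    """Pull (headline, status) pairs out of the XML, with no display logic."""
    root = ET.fromstring(xml_text)
    return [
        (story.findtext("headline"), story.findtext("status"))
        for story in root.findall("story")
    ]


def as_html(stories):
    """One possible display: an HTML bullet list for a browser."""
    items = "".join(
        f"<li>{html.escape(headline)} ({html.escape(status)})</li>"
        for headline, status in stories
    )
    return f"<ul>{items}</ul>"


def as_text(stories):
    """Another possible display: plain text for a terminal or an email."""
    return "\n".join(f"{headline.upper()}: {status}" for headline, status in stories)


stories = parse_stories(STORIES_XML)
print(as_html(stories))
print(as_text(stories))
```

Only as_html is bound by what HTML can do; as_text has nothing to do with HTML at all, and the XML is the same either way.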

In theory, you could create a programming language out of suitably named XML tags. The XML files that are your programs could be displayed to look like more traditional languages, turned into diagrams and other pieces of documentation, and compiled into machine code for efficient execution. The XML syntax is more than flexible enough to allow that.

The XML FAQ – I hope this is useful.

XML just describes data. It puts the data in a format where some meaning is attached to each bit of data. HTML is essentially an XML-like language where the only meaning attached to the data is how to display it. In a more general XML situation, you’d use tags that described what the data meant and then have something else render it as appropriate to different situations.

But XML doesn’t “do” anything. It doesn’t convert data to a spreadsheet or database or send you alerts when something changes. If the data you care about is formatted in XML, it might be easier to write applications that did those things and, in fact, because XML is general and standard, many of those applications already exist. But those features are done by things that use XML, not things that are XML. XML is just data.

The original intent of HTML was also to describe content, rather than describe how it would appear. For example, the H1 tag says “this text is a top-level heading”, but the browser decides how a top-level heading is supposed to look.

Problem is, web designers want to describe how their pages will appear. More and more tags like FONT were added, but HTML is getting back on track now that web designers can use CSS to describe the appearance instead of HTML.

I was generalizing and probably misspoke. I meant to say that XML is useful FOR such situations, not that it DOES such things. I really do know that XML doesn’t import information into a spreadsheet any more than HTML, by itself, displays information on the screen. :wink:

(Have been hoping to get a chance to use XML for transferring business data with some of the other companies my employer does business with, but they apparently wouldn’t know XML if it jumped up and waved a style sheet in their face. :wink: They insist on using such versatile :rolleyes: formats as ms-excel, arcane systems of plain text, and even… snrf PDF.)

More precisely, the applications that process XML can hook into standardized XML-parsing modules that know XML syntax but do not attempt to ascribe any semantics to what they read. This is the big win of XML: It cleanly separates syntax from semantics, and allows parsing to be handled completely by clean, standardized tools.

(In brief, “syntax” means how it looks; “semantics” means what it means. These terms come directly from the corresponding words applied to human languages, where syntax is a property of grammar and semantics is a property of word definitions.)
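A quick way to see the syntax/semantics split is to ask a standard parser whether a document is well-formed. In this sketch, using Python's xml.etree.ElementTree, the tag names <flibber> and <glorp> are deliberately meaningless and is_well_formed is a hypothetical helper. The parser accepts nonsense tags as long as they nest correctly, and rejects a document whose tags are closed in the wrong order.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Check syntax only: the parser neither knows nor cares what tags mean."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Meaningless tag names, but correctly nested: the syntax is fine.
good = is_well_formed("<flibber><glorp>42</glorp></flibber>")

# Same tags closed in the wrong order: a syntax error.
bad = is_well_formed("<flibber><glorp>42</flibber></glorp>")

print(good, bad)
```

Deciding what a <glorp> actually is remains the application's job.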

Well, I’d say grammar encompasses both syntax and semantics, since it defines the structure of whole sentences; the meaning of a sequence of words is not entirely defined by the individual meanings of those words, but by how they’re arranged. But this is just semantics. :slight_smile:

With regard to the Snopes XML feed and how it’s displayed, this is simplified because it is part of a pre-defined subset of XML known as RSS, designed for containing individual snippets of news. In other words, it consists solely of tags whose meanings are already defined. An RSS reader program will understand these tags and display them in a usable format, in just the same way as do email programs with email, and web browsers with HTML. If you want to use the Snopes XML feed, you can install an RSS reader (e.g. RSSreader) and copy into it the XML link from the Snopes homepage. What you’ll see is a bunch of the Snopes stories appearing, a little like unread emails in your inbox. You can preview them and follow the link to the main story. The reader lets you import multiple feeds, so you can see when any of the sites you’re interested in has been updated.
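For the curious, this is roughly what an RSS reader does under the hood. The sketch below parses a tiny made-up feed laid out the way RSS 2.0 feeds generally are (an <rss> root, one <channel>, and several <item> elements with <title> and <link>). The feed contents, the example.com URLs, and the list_items function are all invented; this is not the actual Snopes feed.

```python
import xml.etree.ElementTree as ET

# A made-up feed in the usual RSS 2.0 layout, for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/stories/1</link>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/stories/2</link>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>"""


def list_items(feed_text):
    """Return (title, link) for each item, like the inbox view in a reader."""
    channel = ET.fromstring(feed_text).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


items = list_items(SAMPLE_FEED)
for title, link in items:
    print(f"{title} -> {link}")
```

Because the tag names are fixed in advance, the same few lines would work on any feed that follows this layout, which is why one reader can handle feeds from many sites.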