Fight my ignorance of XML

So I’m an engineer by degree but a quasi-programmer by necessity. A business process that I am investigating requires proficiency in XML files.

I have the option of submitting daily market positions via XML documents. A quick study of the process shows me that I could use MS Excel 2007 to generate the XML files, so I’m interested in heading down that path. The market operators have already defined the schemas and other requirements, so I wouldn’t be inventing the process so much as mapping Excel to their schemas.

What I need to do now is find a source (book, online primer or forum) that will give me a beginner’s look at this technology but also help me explore the specifics of Excel as an XML file generator. I started to buy one of the “XML for Dummies” books, but I thought that first I’d ask the teeming masses for their guidance.

So what book, forum, article, etc. is the best for getting me up to speed?

Mods, if I’m in the wrong forum for this (is it GQ or Humble O) please move and accept my apologies.

I have this page bookmarked, which I often refer to — not just for XML, but as a general reference on HTML, CSS, and the like. There are also some tutorials that might be helpful.

An important thing to note about XML is that it isn’t a specific data format, but rather a framework for defining data formats. It isn’t good enough to say, “I stored the stuff in XML — now my work is done”. Both the reader and the writer have to agree on what the tags mean and what the structure is supposed to look like. Which elements are required and which are optional? Which elements contain other elements, if any? Do you want a hierarchical structure, or just a list? Or some of both? Do you put this bit of data in an attribute, or in an element’s content?
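To make the attribute-versus-content question concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The record and the tag names (position, date, product, volume) are invented for illustration and have nothing to do with any real market operator's schema. Both outputs are well-formed XML carrying the same data, but a reader expecting one layout will not find anything in the other.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the field names are made up for this example.
record = {"date": "2009-01-15", "product": "POWER", "volume": "250"}


def as_attributes(rec):
    """Store every field as an attribute of a single <position> element."""
    return ET.Element("position", rec)


def as_elements(rec):
    """Store every field as a child element of <position>."""
    root = ET.Element("position")
    for name, value in rec.items():
        ET.SubElement(root, name).text = value
    return root


attr_xml = ET.tostring(as_attributes(record), encoding="unicode")
elem_xml = ET.tostring(as_elements(record), encoding="unicode")

print(attr_xml)  # one element, data held in attributes
print(elem_xml)  # one element, data held in child elements
```

Which of the two is "right" is decided entirely by the schema the two sides have agreed on, not by XML itself.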

So although Excel will save data in XML form, that XML won’t automatically conform to what your operators are expecting — unless they purposely chose Excel’s native output as their desired input. (It might also be possible to program Excel to write whatever XML is required, but I can’t help you on that score.)

Excel outputs XML in its Office Open XML format, which is almost assuredly not what you want, since you mention that other people have already developed the schemas you’ll be using.

You could theoretically use XSLT to translate the Excel output to your own schema, but XSLT was designed by Satan’s minions (XML was only designed by the Angel of Death).

My recommendation would be to use something appropriate for the job, which Excel almost assuredly isn’t. If your source data is in the form of spreadsheets or CSV files, the best thing to do is probably to save them as CSVs and write a simple text-processing script to parse them and spit out XML in the appropriate format.
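As a rough idea of what such a script might look like, here is a sketch in Python using only the standard csv and xml.etree.ElementTree modules. The root and row tag names, the column headers and the sample data are all placeholders I made up; the real ones would have to come from the operators' schema, and the script assumes every CSV header is a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="positions", row_tag="position"):
    """Turn CSV text with a header row into an XML string.

    Each data row becomes one <row_tag> element, and each column becomes
    a child element named after its header. Tag names are placeholders.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(root, row_tag)
        for column, value in row.items():
            ET.SubElement(item, column).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample data standing in for a sheet saved as CSV.
sample = "date,product,volume\n2009-01-15,POWER,250\n2009-01-15,GAS,100\n"
xml_out = csv_to_xml(sample)
print(xml_out)
```

Building the tree with a library rather than gluing strings together means special characters such as & and < get escaped automatically. It does not check the result against a schema; that would be a separate validation step.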

Thanks for this link. Just a quick look through it answered some leading questions for me.

This is the part I’m actually very comfortable with. I have examples of macros that create XML files and I have done a lot of VBA programming so I don’t think I will have that big of a problem setting up macros to create XML files.

Also, friedo, thanks for your opinion. As I said above, I’m not concerned with Excel native file solutions. I have other reasons for sticking to Excel. My colleagues will also be using whatever application I create, and Excel is a kind of middle ground for our guerrilla programming method of development. It gives them the freedom to adjust reports and calculations without constantly running to me for a change in the application. Sometimes they crash my straw-man applications, but other times we manage to develop some slick time-saving, money-making processes.

Another option would be to write a very simple .NET application to generate the files. You can get the appropriate Express edition of Visual Studio for free, and download the Microsoft Office SDK (also for free). That would let you read an Excel spreadsheet and create the compliant XML output using pretty straightforward code.

I’ve done very similar things using C#, and it took just a couple of hours to develop, even counting the time spent learning the proper namespaces to use.