For those of you only interested in the technical questions, fell free to skip past the next two paragraphs.
The situation is that I’m using reports generated by a database which I have heavily restricted access to, being an end user. The problem being that the reports have to be gone through manually and double-checked against a third system, but the reports have an atrocious layout and provide way more information than I actually need. (The reports provide 40-50 cells in a row per instance, with between 1 and 222 instances per report and it’s all jumbled together with incomprehensible abbreviations and obfuscating context-sensitive placements - I only need 13 or so of the fields, all of which are either text or 4 to 8-digit numerical.) Bringing in direct outside help is also off the table, since the data is financially sensitive for my employer. (If someone needs it, I can make a simulacra, though.)
I’ve requested a simplification of the report system and it’s being considered, but the system is currently being rolled out piecemeal accross the country and that will take at least another year, so it’s way down there on the list. There’s nobody who’s not currently otherwise engaged to ask for help, so I’m bringing this to the Dope.
And now for the techy bit:
I fortunately stumbled on to a way of exporting the report(s) I need to a local XML file which I can manipulate. Vaguely remembering some similiar ad-hoc tech work from the past, I remember that Excel can read and treat XML files, so I imported it into Excel. This gave me a readable table that I could manipulate by (manually) removing columns I don’t need, changing their order and removing duplicates while still retaining the sort order.
So question number 1:
Is there any way I can retain this Excel table as a sort of template and then ask it to update only the columns I have retained from the XML file? (Which I can just overwrite each time I export the reports.)
Doing this by hand led to consistent occasions where if the comments field had gone over the character limit for the database, it would dump a second (or more) instance identical to the first one, only to fit in the rest of the text in the comment field. This is very undesirable, leading to…
Question number 2: Is there any way to ask Excel to check if the row-cell with the unique instance identifier is the exact same as another row-cell higher or lower in the column and if so, remove the duplicate fields and merge the two (or more) rows?
(I was thinking something like “IF B1=A1 THEN DELETE B1-B12, SELECT A1-13+B1-13 AND MERGE” but then that would screw up the third or more duplicate instances.)
Let it be said that I have only casual and cursory experience with both Excel and XML - or any other programming outside HTML and some miniscule amount of terminal/command line work from the bad old days. As in, I have a snowball’s change in hell of programming this myself in VB/XSD/etc, unless given explicit instructions. Of course, if the only solution is to teach myself XDS - which seems likely - then I’ll try, but I was kind of hoping for a clever ad-hoc solution; doesn’t have to be pretty, only has to work!
Thanks!