A couple of years ago I was in an improvisational webboard-based roleplaying group, and in a spate of insanity I agreed to take charge of the “historical archives” (i.e., all our old posts).
Many of these posts were made to guestbook-style webboards, which displayed them in reverse chronological order. The original archivist saved the posts in that order, so I’m left with somewhere around 40 to 50 files that I need to reorganize.
The posts look something like this:
Post 10 -
More stuff happens.
Post 9 -
Stuff happens.
etc.
Other than cut and paste, is there some way of reshuffling these files? And no, I can’t bring up the original page that this stuff resided on - it’s long since been deleted.
Anything with a decent regex engine should be able to split it up and reverse the order relatively easily. If you provide a more detailed spec about what the files actually look like I could whip up a quick Perl program to do it.
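In the meantime, here is a rough sketch of the split-and-reverse idea, in Python rather than Perl. It assumes each post starts with a header line exactly like the sample above ("Post 10 -" on a line of its own) and that the files are plain text; the names HEADER and reverse_posts are made up for illustration. If the real files look different, the pattern will need adjusting.

```python
import re

# Assumed header format, copied from the sample in the question: "Post 10 -"
# alone on a line. Change this pattern if the real files differ.
HEADER = re.compile(r"^Post \d+ -[ \t]*$", re.MULTILINE)


def reverse_posts(text):
    """Return text with its posts in the opposite order."""
    starts = [m.start() for m in HEADER.finditer(text)]
    if not starts:
        return text  # no headers found, leave the file alone

    preamble = text[:starts[0]]  # anything before the first post stays on top
    bounds = starts + [len(text)]
    posts = [text[a:b] for a, b in zip(bounds, bounds[1:])]

    # Make sure every post ends with a newline so posts do not run together
    # once they are reordered.
    posts = [p if p.endswith("\n") else p + "\n" for p in posts]
    return preamble + "".join(reversed(posts))


if __name__ == "__main__":
    sample = "Post 10 -\nMore stuff happens.\nPost 9 -\nStuff happens.\n"
    print(reverse_posts(sample), end="")
```

To run it over the real archive you would read each file, pass its contents through reverse_posts, and write the result to a new file, so the originals stay untouched until you have checked the output.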
What we really need to know is how these files are stored. Are they in a database (MS Access, SQL Server, Oracle, etc.)? Are they stored as ASCII text files? Are they handled as proprietary files by the message board software?
The answer on how to sort depends on the type of file.
If the .html files are all you have, then it could be somewhat complicated. If you could get the original administrator to do a database dump, then it would be easier.
Even if you just have the html…hmm…it should be possible to come up with something to parse all the subject lines. The dates are all in the “correct” TimeStamp format already. So we look through line by line, and use a combination of an HTML tag and the word “Subject” to identify the start of a new message…
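A minimal sketch of that line-by-line scan, in Python. The actual markup of the saved pages isn't known here, so the marker pattern (any tag followed directly by the word "Subject") is only a guess, and MESSAGE_START and split_messages are made-up names.

```python
import re

# Hypothetical marker: some HTML tag immediately followed by "Subject".
# The real pages may mark a new message differently.
MESSAGE_START = re.compile(r"<[^>]+>\s*Subject", re.IGNORECASE)


def split_messages(lines):
    """Group lines into a page header and a list of messages."""
    header, messages = [], []
    for line in lines:
        if MESSAGE_START.search(line):
            messages.append([line])      # marker found: start a new message
        elif messages:
            messages[-1].append(line)    # continuation of the current message
        else:
            header.append(line)          # everything before the first message
    return header, messages


if __name__ == "__main__":
    page = [
        "<html>",
        "<b>Subject:</b> Second post",
        "More stuff happens.",
        "<b>Subject:</b> First post",
        "Stuff happens.",
    ]
    header, messages = split_messages(page)
    # Reversing the message list puts the oldest message first.
    for line in header + [l for msg in reversed(messages) for l in msg]:
        print(line)
```

One thing to watch for: any page footer after the last message gets lumped in with that message, so it would need to be peeled off before reordering.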