About pruning old threads...

Two things to keep in mind, though, Joe.

First, everything that old was under the old UBB, so now the answers are jumbled as to time, the quoting and other formatting doesn’t work, smilies are replaced with long links and the threads are just generally messed up. It would be hard to extract pearls out of those oldies in any case.

Second, before we moved to VB we had a project to try to archive the best old GQ threads with an eye toward pruning the lesser-quality stuff. All the stuff I archived is apparently gone forever, a victim of the changeover to VB. So that means the best stuff is mostly already gone.

All that said, if you remember a particularly good thread from prior to the cut-off date which I might have missed during archiving, run a search for it (during the evening, please!) and bump it up.

That’s true, the UBB posts aren’t very readable since they got converted. So why not just delete all the pre-VB/UBB stuff - a few hundred thousand posts - and monitor the difference? Seems like a protocol that people would be happier with, and besides, we could measure how much of a speed improvement comes from dumping a modest percentage of the board’s storage (the old garbled stuff). If it doesn’t make much of a difference, it would at least be a good preliminary test/indicator as to whether dumping a larger load is necessary or worthwhile. No?

Oh dear…

::slapping self on forehead::

I thought everything prior to the beginning of THIS year was gonna be deleted…

::huddling in the “bad reading comprehension corner” with UncleBeer::

I am assuming this also goes for locked threads?

Esprix

kabbes said:

Actually, I said six months–I was tempted to say three months.

This is kind of my point, though–the same topics are being debated, largely by the same people with the addition of newer posters. Some topics are sort of unique to the time period, such as the election last year. Some topics recur, but fairly infrequently. But some topics are pretty much continual, ebbing and flowing but always ready to pop back out again–guns and abortion are two obvious ones in this category. Nobody on any side in these is going to stop arguing over them, and very few are going to stop using what amount to the same analogies that were used three months ago, and six months ago, and nine months ago, and last year, and back to the beginning of the board. I submit that if all the gun threads in GD were eliminated to January of 2001, there wouldn’t be a single pro- or anti- argument that wouldn’t reappear by the end of the year.

Not that this matters–as Lynn makes clear, they aren’t gonna do it anyway. Just as well–if they actually did something I recommended, then somebody might blame me for it.

Actually, I wish we COULD do it MysterEcks’ way…but that would take waaaaay more resources than we have.

Lynn

What bothers me is that the message board could be as rich a source for future historians as diaries and letters are for current historians. For it to be a good resource, however, all of the threads would have to be available without editing. If I were to pick and choose, I might get a bad pun thread, a “guy stuff” thread, maybe a flirting thread if I had been flirting with my sweetie. I would miss the threads about the Columbine massacre or McVeigh’s execution that were current to the event, various people’s “MPSIMS is my diary” threads (which give a picture of day-to-day life) and things like post parties and Pit threads (what we want and what we hate).

Like kookalouris, I would pay for a CD with a text version of the old threads. However, as manny pointed out, most of the old threads are unreadable anyway (sigh) and there is probably no easy way to dump a large number of threads to flat files. (Makes me wish I was a programmer for vB.)

[sub]oops, that last post was me, not rjk[/sub]

Well, first, I don’t think that the tiny amount of speed we might get from pruning that much is going to be worth it.
As the techs said, it won’t change very much.

I’d prune most all of the messages that are in About This Message Board, Comments on Cecil’s Columns, Comments on Staff Reports & the Pit instead of the other forums…

Due to the Great Software Switch, any thread last posted to before the cutoff date has been unlocked. Anything relocked has generally been bumped since the Switch.

Uncle Beer,

Thanks. I know I can save some threads (and have already). But I have always considered the SDMB like a kind of wonderful free ‘Library of Alexandria.’ Just 'cos I can make off with a few smoking scrolls before the place goes up in smoke don’t mean I won’t miss being able to look something up later I can’t imagine today.

Hey, rjk, nice to know great minds think alike. :)

Gerry

Forest and trees time. You guys truly have no clue what you’ve got and how valuable it really is. Thousands upon tens of thousands, perhaps even hundreds of thousands, of intellectual man-hours of the considerations of some of the more interesting and intelligent people of the age are held in your data sets.

If people in the future really wanted to know what reasonably intelligent people thought about things, your database is the black pearl of the orient. You’re going to be sorry if you destroy this resource, and if you aren’t, your descendants will be. Granted, there’s a lot of trash in the mix, but trying to parse out the “important” stuff is an exercise in futility. What’s seemingly trivial or unimportant to you could be gold to future psychologists, sociologists and historians.

Would it be possible to simply dump the old messages to an archive website where people could get at them if they wished to? There is no richer resource of data reflecting the temper and mood of the times than the dialogs that have gone on between SDMB members over the years. It would be an invaluable resource for future historians. Think of your grandkids if not yourselves. You can get 80 gig drives for about $220.00, which I imagine would hold at least a few years’ worth of SDMB info.

Wouldn’t it be great to have recorded village-square conversations between groups of people from the past? What did our ancestors really think about such and so, in all the richness and complexity inherent in ongoing group conversations, not the distilled and extracted hypotheses of historians guessing what they were up to?

Just think about it. Please.

First, let me say that it doesn’t surprise me that the techs are telling you that it will help to reduce your database size. Without knowing vBulletin’s schema and the queries involved, I can’t say for sure, but I can easily believe that the forum display query performance is going to be fairly sensitive to overall table size. And since we’ve been diligently reminding people to search over all articles when they search, those undoubtedly become full table scans. And we’ve also been exhorting them to search before posting …

(If the DB design is sensible, I’ve got to believe that a thread display will be fairly efficient, BTW. It is surely done in such a way that a query restricted on threadid will be making good use of a primary key or a secondary index. One thing you have to remember with DB queries is that how expensive a query is does not correspond to the size of the data in the result set - that’s a factor, but the way in which the data is stored and indexed is likely to be far more important.)
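To make the index point a bit more concrete, here is a rough sketch. It uses SQLite from Python as a stand-in, since I don't know vBulletin's actual schema; the `post` table, its columns and the index name are all made up for illustration. It asks the database how it would run a thread display query versus a search-everything query.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Hypothetical schema: NOT vBulletin's real one, just enough to show the idea.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post ("
    "postid INTEGER PRIMARY KEY, threadid INTEGER, username TEXT, body TEXT)"
)
conn.execute("CREATE INDEX idx_post_threadid ON post (threadid)")
conn.executemany(
    "INSERT INTO post (threadid, username, body) VALUES (?, ?, ?)",
    [(i % 500, f"poster{i % 50}", f"message number {i}") for i in range(10000)],
)

# Thread display: restricted on threadid, so the secondary index can be used.
thread_plan = query_plan(conn, "SELECT body FROM post WHERE threadid = ?", (42,))

# Search over all posts: a substring match has no index to lean on,
# so every row in the table has to be examined.
search_plan = query_plan(conn, "SELECT postid FROM post WHERE body LIKE ?", ("%gun%",))

print("thread display:", thread_plan)
print("full search:   ", search_plan)
```

The exact wording of the plans varies between SQLite versions, but the first should report a search using the index and the second a scan of the whole table. That is the sense in which the thread display cost barely depends on table size while the search cost grows with it.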

That said, it does appear that vBulletin supplies easy to use backup / import facilities, and I would like to add my voice to those urging that you dump the data somewhere before pruning. Even if you put it on removable media that you toss onto a shelf somewhere for now.

MIS shops with long running projects have been doing this sort of thing for years, and designing retrieval systems for large bodies of documents is non-trivial. To allow people acceptable online use of a “current” set of documents, you generally have multiple levels of archival. In the old days, that last level might have been “request that the operator mount the tape your old document is on, and …”.

If you just dump it somewhere, we’ll have it in raw form against the day someone figures out how to allow access to it.
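For what it's worth, the "dump it, then prune it" step can be sketched in a few lines. This is only an illustration under my own assumptions: SQLite standing in for the real database, a made-up `post` table with an ISO-format `posted` date column, and a hypothetical `archive_then_prune` helper. It is not vBulletin's backup facility.

```python
import io
import json
import sqlite3


def archive_then_prune(conn, cutoff, out):
    """Write posts dated before `cutoff` to `out` as JSON lines, then delete them.

    Returns the number of posts archived. `cutoff` is an ISO date string.
    """
    rows = conn.execute(
        "SELECT postid, threadid, username, posted, body FROM post "
        "WHERE posted < ? ORDER BY postid",
        (cutoff,),
    ).fetchall()
    for postid, threadid, username, posted, body in rows:
        record = {
            "postid": postid,
            "threadid": threadid,
            "username": username,
            "posted": posted,
            "body": body,
        }
        out.write(json.dumps(record) + "\n")
    # Only delete once everything has been written out.
    conn.execute("DELETE FROM post WHERE posted < ?", (cutoff,))
    conn.commit()
    return len(rows)


# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post ("
    "postid INTEGER PRIMARY KEY, threadid INTEGER, username TEXT, "
    "posted TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO post (threadid, username, posted, body) VALUES (?, ?, ?, ?)",
    [
        (1, "poster_a", "2000-05-01", "an old post"),
        (1, "poster_b", "2000-11-15", "an old reply"),
        (2, "poster_a", "2001-06-20", "a recent post"),
    ],
)

archive = io.StringIO()  # a real run would use a file on the removable media
archived = archive_then_prune(conn, "2001-01-01", archive)
remaining = conn.execute("SELECT COUNT(*) FROM post").fetchone()[0]
archived_lines = archive.getvalue().splitlines()
```

One record per line keeps the dump in raw but readable form, so whoever builds the retrieval side later does not need the original software to get at it.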

Would it not be possible to condense an entire thread into one post, with paragraph headings indicating authorship? Although the data size wouldn’t be much smaller, the index sizes certainly would be. Searches on thread titles would still work (and faster); searching by poster, though, would be lost on such threads.
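Roughly what I have in mind, as a sketch only: the `Post` record, the `condense_thread` helper and the heading format are all invented here, and a real version would have to work against whatever the board software actually stores.

```python
from collections import namedtuple

# Hypothetical post record; field names are assumptions for illustration.
Post = namedtuple("Post", ["threadid", "author", "body"])


def condense_thread(posts):
    """Merge the posts of one thread into a single body.

    Each original post becomes a section headed by its author's name, so
    authorship stays readable even though it is no longer a separate field.
    """
    sections = [f"[{post.author} wrote:]\n{post.body}" for post in posts]
    return "\n\n".join(sections)


thread = [
    Post(7, "Lynn", "We may prune the oldest threads."),
    Post(7, "Esprix", "Does that include locked threads?"),
    Post(7, "Lynn", "Yes, those too."),
]

condensed = condense_thread(thread)
print(condensed)
```

Three rows (and three sets of index entries) become one. The catch is the one already mentioned: the author names now live only inside the text, so a search by poster can no longer find these posts.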