Get rid of the archives?


Mudshark
11-10-2001, 12:46 AM
What if we got rid of all of the old forum archives? Only keep old posts for a year (2000/01)? I mean, who actually goes and reads the really old posts in there? Wouldn't that save a lot of space on whatever it is that runs the board? I don't know, maybe it is a bad idea and I should just be quiet.

Ma Parrot
11-10-2001, 05:47 PM
It's really an excellent suggestion, but they have many times refused such suggestions.

One good one was to delete everything in the pit over 2 weeks old. That would be no loss.

Things in MPSIMS, likewise, are not worth repeat looks after 2 weeks.

capybara
11-10-2001, 06:01 PM
I've often wondered this too-- of course GQ should be kept around in archives, as it is sort of the original point of the board, but is there really a need to keep posts about tonight's episode of Friends and how I just grazed my knuckles on a stove burner for future consideration?

The Pit and The Pendulum
11-10-2001, 06:39 PM
:( Buuut ... My finest posts are in the Pit!

Mudshark
11-10-2001, 09:25 PM
Originally posted by Ma Parrot
It's really an excellent suggestion, but they have many times refused such suggestions.

One good one was to delete everything in the pit over 2 weeks old. That would be no loss.

Things in MPSIMS, likewise, are not worth repeat looks after 2 weeks.

Why have they refused the idea?

I agree that everything in the Pit, MPSIMS, even IMHO over two weeks old should be removed.

AHunter3
11-10-2001, 09:32 PM
For what purpose? What's it to you if the old posts remain extant?

aftermarketeer
11-11-2001, 02:01 PM
The size of the board affects everything about it.
Maintenance is slower, disk usage is greater, the newest posts are farther out on the spindle for longer seek times, etc.

It reaaaly slows searches, even if you specify "last 30 days".

This is because the index is not sorted by date, but alphabetically.
The 30-day part only reduces the number of answers returned, not the work to locate them.

AHunter3
11-11-2001, 04:35 PM
OK, so we ARE back to talking about speed issues. (That wasn't apparent from the OP or the posts that followed).

I would be happy to see everything older than 30 days spooled to a different (read-only) database. That would speed up the hot & active database but would still preserve the old posts for searching and we could still link to them in new posts.

I would be very unhappy if the old posts were simply deleted, though.
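A rough sketch of the "spool old posts to a separate store" idea, in Python with an in-memory SQLite database. This is not how vBulletin or the board's MySQL setup works; the table names `posts` and `posts_archive`, the column names, and the function `archive_old_posts` are all made up for illustration, and the archive here is just a second table rather than a truly separate read-only database.

```python
import sqlite3
from datetime import datetime, timedelta

def archive_old_posts(conn, now, max_age_days=30):
    """Move posts older than max_age_days from `posts` to `posts_archive`.

    Both table names are hypothetical. Timestamps are ISO-8601 strings,
    which sort correctly as plain text.
    """
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    with conn:  # one transaction: copy the old rows, then delete them
        conn.execute(
            "INSERT INTO posts_archive SELECT * FROM posts WHERE posted_at < ?",
            (cutoff,),
        )
        moved = conn.execute(
            "DELETE FROM posts WHERE posted_at < ?", (cutoff,)
        ).rowcount
    return moved

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, posted_at TEXT, body TEXT)")
conn.execute("CREATE TABLE posts_archive (id INTEGER PRIMARY KEY, posted_at TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, "2001-09-01T12:00:00", "old thread"),
        (2, "2001-11-10T00:46:00", "recent thread"),
    ],
)

moved = archive_old_posts(conn, now=datetime(2001, 11, 11, 16, 35))
active_ids = conn.execute("SELECT id FROM posts").fetchall()
archived_ids = conn.execute("SELECT id FROM posts_archive").fetchall()
```

The old rows keep their ids, so a link to an archived post could still be resolved by looking in the archive when the active table has no match.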

Chronos
11-11-2001, 05:43 PM
Originally posted by aftermarketeer
This is because the index is not sorted by date, but alphabetically. The 30-day part only reduces the number of answers returned, not the work to locate them.
Just curious, but where do you get this information?

Fern Forest
11-11-2001, 08:15 PM
Seems like there might be 500 "test" posts which seem absolutely useless to me. Isn't that why we have a preview button? Toss all them too. They're really just MPSIMS stuff in ATMB.

Mudshark
11-11-2001, 09:51 PM
Originally posted by AHunter3
OK, so we ARE back to talking about speed issues. (That wasn't apparent from the OP or the posts that followed).



My question didn't really have anything to do with speed, it is just something that came up.

AHunter3
11-11-2001, 10:29 PM
Well, if it isn't for reasons pertaining to speed, it can't reasonably constitute so much as a hair up anyone's nether regions if the archives continue to exist.

And even if speed IS an issue, there are other ways of separating the currently active threads from the archives, such as the one I've already suggested here.

Put this digital pack-rat down for a resounding and emphatic "NO" to the proposal that we kill old threads.

Lynn Bodoni
11-11-2001, 11:33 PM
Originally posted by Osiris
Seems like there might be 500 "test" posts which seem absolutely useless to me. Isn't that why we have a preview button? Toss all them too. They're really just MPSIMS stuff in ATMB.

Every now and then I go back and delete old test threads. I haven't been doing it lately, because I haven't known just what we're gonna do with the message board. I'd say that I've deleted a couple of HUNDRED test threads in ATMB, and probably at least that many threads reporting problems. Back in the UBBS days, the board would go down just about every weekday in the middle of the afternoon, and people would report it in this forum.

KellyM
11-12-2001, 07:12 AM
Originally posted by aftermarketeer
The size of the board affects everything about it.
Maintenance is slower, disk usage is greater, the newest posts are farther out on the spindle for longer seek times, etc.

It reaaaly slows searches, even if you specify "last 30 days".

This is because the index is not sorted by date, but alphabetically.
The 30-day part only reduces the number of answers returned, not the work to locate them.
This seems highly unlikely. It is a relatively simple matter to index the "LastPost" field in the database engine, which allows the use of an index slice instead of a full scan.
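To illustrate the difference between an index slice and a full scan, here is a minimal Python sketch. It is not vBulletin or MySQL code: the sorted list merely stands in for an index on the "LastPost" field, and the function names and sample data are invented for the example.

```python
from bisect import bisect_left

def recent_by_full_scan(rows, cutoff):
    """Look at every row; the work grows with the total number of threads."""
    return [thread_id for day, thread_id in rows if day >= cutoff]

def recent_by_index_slice(sorted_rows, cutoff):
    """Binary-search to the cutoff, then read only the matching tail."""
    # (cutoff,) sorts before any (cutoff, thread_id), so this finds the
    # first row whose day is >= cutoff.
    start = bisect_left(sorted_rows, (cutoff,))
    return [thread_id for day, thread_id in sorted_rows[start:]]

# Hypothetical data: one thread per day for a year, as (day, thread_id),
# already sorted by day the way a date index would be.
rows = [(day, 1000 + day) for day in range(365)]
cutoff = 365 - 30  # "last 30 days"

scanned = recent_by_full_scan(rows, cutoff)
sliced = recent_by_index_slice(rows, cutoff)
```

Both functions return the same threads; the slice version just skips the 335 older rows instead of testing each one.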

Your supposition that "the newest posts are farther out on the spindle" lacks merit given that the Boards run on a Linux machine using MySQL as a database engine. Linux does not use the dumb first-free-block allocation algorithm that Windows uses, and the MySQL database engine's use of disk space is complex, to say the least. (Does vBulletin store the posts in the database engine, or just pointers to the file system? I've never used the software.)

Has anyone tried to graft glimpse onto vBulletin? That would really speed up searches. I'd be willing to help develop a custom system for the Boards provided the end result is open source.

Una Persson
11-12-2001, 07:33 AM
Originally posted by KellyM
(Does vBulletin store the posts in the database engine, or just pointers to the file system? I've never used the software.)
All dynamic content, such as posts, is stored in the database. The Posts.myd file becomes quite large.

postcards
11-12-2001, 07:34 AM
I say we dump stuff over a month old, or transfer it to a different site.

Is there really any good reason to keep 682 pages of General Questions around?

Or 609 pages of MPSIMS?

123 pages of ranting from The Pit?

54 pages of About This Message Board?

Etc., etc., etc.


All told, it adds up to a staggering 1,862 pages of spanners in the works, gumming up the speed of the board.

We don't need it.

Weed it!

astro
11-12-2001, 09:04 AM
IIRC the amount of space the old posts (all of them) take is trivial in terms of disk space. I think it was quoted as being a few gigs at best, and with fast 40 gig drives being sold for $150, I don't think this should be an issue re storage. The issue (I think) is how fast the DB engine and related systems can respond parsing through this quagmire.

I think the old posts are quite valuable and should have a special and separate place on the board with their own search engine front end that will not affect the speed of the main SDMB DB engine.

Doing this will cost time and money. The Reader is probably not inclined to indulge this board's needs and wants much further without seeing a return of some kind.

KellyM
11-12-2001, 09:15 AM
Originally posted by astro
IIRC the amount of space the old posts (all of them) take is trivial in terms of disk space. I think it was quoted as being a few gigs at best, and with fast 40 gig drives being sold for $150, I don't think this should be an issue re storage. The issue (I think) is how fast the DB engine and related systems can respond parsing through this quagmire.
In my professional opinion, they shouldn't make that much of a difference. I am quite familiar with the database engine they're using and it should not degrade significantly under that sort of usage, assuming that the proper indices have been set up on the tables in question and that the software avoids making unnecessarily expensive queries.

An upgrade to MySQL 3.23.23 or later and using the FULLTEXT index type might improve search performance dramatically, but this is probably nontrivial.

Colibri
11-12-2001, 09:19 AM
Sorry, the suggestion of "getting rid" of old threads in GQ (and CCC/CSR) is absurd. These are an immense resource of information. It would amount to an act of vandalism against a unique source of knowledge. The very suggestion turns my stomach.

I'd agree that some of GD, and most of MPSIMS and the Pit, is expendable. However, there's a lot that is valuable there as well - e.g. some of the race threads in GD provide a fuller discussion than anything in GQ. There are some very funny threads in MPSIMS and even the Pit that it would be a shame to lose. But sorting the wheat from the chaff would be very time consuming - a better solution would be to archive everything past a certain age: possibly older than two years for GQ and Comments, and three months for everything else.

Show_Biz
11-12-2001, 09:27 PM
Originally posted by Lynn Bodoni
I'd say that I've deleted a couple of HUNDRED test threads in ATMB, and probably at least that many threads reporting problems.
The total number of deleted threads is currently about 5,577, based on the main menu page saying "93116 threads in total" and the most recent thread being #98693.
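The estimate is just the gap between the highest thread number and the count of threads still listed. A quick check in Python, assuming thread numbers are handed out one at a time with no gaps (the variable names are only for illustration):

```python
latest_thread_id = 98693   # most recent thread number
threads_listed = 93116     # "threads in total" on the main menu page

# Every number up to the latest that is no longer listed is assumed deleted.
deleted_estimate = latest_thread_id - threads_listed
print(deleted_estimate)
```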

Mudshark
11-12-2001, 09:47 PM
The bandwidth costs a lot of money, right?

If you shorten the bandwidth, it will cost less money, right?

Getting rid of the archives will shorten the bandwidth, right?

If the bandwidth costs less money, maybe we don't need to pay to post?

What about the Cafe Society? Is that taking up space too?
We could always remove that forum (to save space) and let it go to MPSIMS.

Just an idea.

KellyM
11-12-2001, 10:00 PM
Originally posted by Mudshark
Getting rid of the archives will shorten the bandwidth, right?
Probably not, at least not very much. The archives are infrequently accessed and thus do not consume much bandwidth.

Chronos
11-12-2001, 11:05 PM
Removing the Cafe probably wouldn't decrease the load, it'd just redistribute it a bit. The posts people are making there would be made elsewhere, or at the least, the folks posting there would be posting something else elsewhere.

Doing something with old threads (be it deleting or archiving) might decrease the need for storage space, but space is cheap nowadays. It might potentially also improve speed some (or so I understand from various other folks on the board), but even then, it won't help the monetary situation at all.

Markxxx
11-12-2001, 11:15 PM
I would tend to disagree that no one reads old posts. I've been around for quite a while, and if I ask something and it gets no response, I often come back as much as a year later and bump it up, and it will get a lot of response.

I guess you could start a new thread, but I've bumped other people's threads too, and more often than not (though not always) they show new life.

TubaDiva
11-12-2001, 11:22 PM
Folks, archives are the least of our problems.

We're getting over a million page views a week . . . nearly all of it on fairly recent pages.

It's not that we have too much stuff, it's that we're accessed by, searched through, read and written to by so many people.

It's not such a bad problem to have. :)

Once upon a time, when we used other bulletin board software, the volume of posts was problematical. It's not so much of a deal with the vB, our current board system.

Thanks for your suggestions, but I'm afraid they're not useful solutions for this situation.

your humble TubaDiva
Administrator

scotth
11-13-2001, 12:52 AM
My understanding is that the big need is money for staff, bandwidth, and a bit more processing power.

Advertising doesn't seem to be doing the job. Might we be able to find a real sponsor or some grants? Might even be able to figure out a way to get Uncle Sam to kick in some bucks. It isn't as far-fetched as it might sound on first pass.