This has “dumb question” written all over it, and I’m sure I’m just over-thinking the issue, but here goes:
As I understand how these things work, there’s an index that is searched against. The obvious answer would seem to be that limits (all forums vs. one in particular, thread titles vs. entire thread, and post date) would speed up the search, causing there to be fewer indexed entries to be searched through.
But the more I think about it, the more I wonder, does the search engine go through the index and for each matching entry ask, “Is this in the forum Knead is looking in? Is the search term in the title or elsewhere in the thread? What date was this posted?” If so, that would seem to me to be a tremendous burden on the server.
Even if it just did a quick sort (sorting threads by date so that only those from after 1/1/2002 were queried, for example), it seems that each non-default selection would add a processing step to the search process.
Well, crap. It’s not a dumb question if there’s no question, is there?
So, how does the server handle searching with limiters like this, and which way is least taxing?
I have no idea how mySQL works. But here is how indexes work in an Oracle database, and I imagine mySQL does something similar:
let’s suppose I have a table POSTS with an index on a field USER#, and a field called TEXT that is not indexed.
I. Searching by USER#
get all records in POSTS with USER# = 8334
the database will read through a separate “file” called an index, do some kind of binary search for USER# 8334, get pointers to records in POSTS where USER# = 8334.
Since we do a binary search in the index for USER# first, this is “fast.”
II. Searching by word in TEXT
get all records in posts with TEXT contains “chocolate”
the database will read through every record in table POSTS and check the record to see if “chocolate” is in field TEXT. This is much slower.
III. Searching by USER# and word in TEXT
The database should do a binary search in the index to find all the records for USER#, and then read those records to find those that have “chocolate” in field TEXT. This is faster than II. and probably faster than I., because the database has to send less information over the internet to display on your home computer, and the page being displayed will have less data, so generating the web page will be quicker.
Now to answer the question “which is the most efficient combination of search criteria to use at the SDMB”, one would have to look at all possible combinations to figure out which are the least invasive. But without having access to the database, the source code, or the server, the question is impossible to answer.
General rule of thumb:
It seems certain to me that EXACT USERNAME is probably an indexed field (it gets translated to USERID in the search); an individual forum vs. all fora will probably cause the search to use an index; searching post subjects instead of post text will cause the comparison to be much faster; restricting the date for searches, even if the date field is not indexed, will cause less records to be sent back to your client machine.