I often use the advanced search at Google to search within domains (including my University) because I find it works better than the in-house engines. For example, I had about an hour to kill today and I thought I would read some of the sex questions that had been asked at this board over the years. I did a search for “sex” from the “in house” engine and got back an error message that my search term was too short (or long or just plain wrong). I figured I would overcome this by using Google and restricting the domain to www.straightdope.com/sdmb , but this didn’t work either. Is this an example of the so-called “deep web” which typical search engines cannot reach?
That’s not a valid URL. Try either www.straightdope.com or boards.straightdope.com.
Google is banned from the SDMB. The robots.txt file at http://boards.straightdope.com/ reads:
User-agent: *
Disallow: /
This blocks any bot – at least those that respect robots.txt, which includes Google, MSN, Yahoo and others – from crawling the site.
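For anyone who wants to see how a well-behaved crawler reads those two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The rules are pasted in as a string rather than fetched from the live site, and the helper name is_allowed and the sample user-agent strings are just my own picks for illustration, not anything the search engines publish in this form.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, embedded as text so nothing is fetched over the network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL.

    is_allowed is a hypothetical helper name used only for this example.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://boards.straightdope.com/"
    # Illustrative user-agent names; the wildcard rule applies to any of them.
    for agent in ("Googlebot", "msnbot", "SomeOtherBot"):
        print(agent, "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked")
```

Because the only rule is a wildcard user-agent with "Disallow: /", every agent is refused for every path. Keep in mind this only describes bots that choose to obey the file; robots.txt is a request, not an enforcement mechanism.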
Why, though? Bandwidth? Database queries resulting from bot visits? A possible flood of traffic from outside the SDMB as they find relevant search results pointing here?
Letting a search engine such as Google spider a site reduces capacity available for members and tends to slow or block access through system timeouts from overdemand.
Our resources have nearly always been compromised as is and we wished to reserve as much access for the community as possible.
Chicago Reader management also felt they should have more control over the content and wished to archive for themselves. They back up everything, so there’s no good reason to have it out there somewhere else, somewhere the Reader does not control. I should point out all this is copyrighted material and the Reader spends a great deal of time as is dissuading cyberthieves from lifting material wholesale.
You will also notice that none of the other search engines or archive.org is allowed to spider here either.
Our search engine is far from perfect, but it’s the best we have under the circumstances that prevail. If you are disaccommodated, we apologize.
your humble TubaDiva
Administrator
Roland, I have seen posters say several times that three-letter search values are too small for the search engine. That could be why “sex” gave you the answer it did. You may want to make it a larger word.
Where does that put my (and I think Google’s) dream of having every word ever published in every language archived at one easy-to-search location? Perhaps the as-yet-unwritten Patriot Act III will provide the legal foundation for such a plan to be implemented! In the interim I will have to try to think of words longer than three letters (there go my Boggle game scores).
Google search would be nice. In the meantime, appease yourself with *sex. It works.
That’s your answer to everything, isn’t it?
I’m a convert to *sexual appeasement.