Back off, inktomi, and learn some manners

The other day, the usage stats on my server went thru the roof. Logs showed that inkomisearch.com was responsible for all of it, using multiple IPs in the range 66.196.72.*.

What the fuck is a search engine doing spidering my sites all at once? I’m not against searching and linking, but don’t they know the rules of the web?

I know, I could use the robots.txt file to exclude certain pages, but I don’t want to eliminate them, just get them to show some respect. But after two days, I blocked their IPs in disgust. The .com site is not valid for web browsing anyway, so it’s 100% outgoing traffic.

I also noticed they were spidering the user list on my message board most of all. This smacks of email-collecting for spam purposes. Anyone else have any experience with inktomi? Are they going renegade now that Yahoo! bought them?

I fully expected this to be a thread about a poster named inktomi, complete with links.

Me too. Someone should register as inktomi and be proud of having been flamed at before even joining the SDMB.

For all we know, inktomi is spidering us right now and has already “registered” as a guest.

Hard to believe that someone with 7200+ posts is such a Net Newbie that he hasn’t heard of Inktomi as a search engine. It’s been around for quite a while, and predates googoo.

No offense, Rilchiam? :slight_smile:

By through the roof, do you mean it’s the first time someone looked at your site?

:wink:

I’ve been using the internet forever and a day, and have never seen the name “Inktomi”. Apparently it’s not as widespread as you think (as far as awareness goes), Musicat.

Been on the Internet since the early '90s - I thought they were a printer company when I saw the name. (Or a poster here.) Plus with the difference in spelling between the title and the first sentence, I couldn’t tell which was correct.

Stupid question: what does “spidering” mean?

Not a stupid question at all. A web spider or crawler (obviously derived from the medium we are in, a Web) is a program that continuously accesses web sites, looks for URLs, logs what it finds, then goes to each new URL and repeats the process; basically a search engine. A well-behaved spider:

- spaces out data requests over time to avoid overwhelming a single server all at once
- revisits each site, but not too often, to see what has changed
- respects entries in the robots.txt file that indicate what the webmaster doesn’t want them indexing

I’m surprised that Inktomi is such a new name to some. Inktomi has rented out their services for many years – I recall seeing “powered by Inktomi” on many search engine sites since 1995. I believe they were one of the earliest large ISPs.
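
For anyone curious what those good manners look like in practice, here is a rough Python sketch. It does no fetching at all; it only works out which pages a polite spider would request from a single site and how far apart. The robots.txt contents, the /memberlist path, the ExampleBot name and the plan_fetches function are all made up for illustration, and Crawl-delay is a non-standard robots.txt line that only some spiders honor. Revisit scheduling is left out.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; not from any real site.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /memberlist
"""


def plan_fetches(urls, robots_txt, user_agent="ExampleBot", default_delay=5.0):
    """Return (seconds_from_start, url) pairs for one site's URLs.

    URLs that robots.txt disallows are skipped, and the allowed ones are
    spaced out by the Crawl-delay value (or default_delay if none is set).
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    # crawl_delay() returns None when robots.txt gives no delay for this agent.
    delay = rules.crawl_delay(user_agent) or default_delay

    schedule = []
    when = 0.0
    for url in urls:
        if not rules.can_fetch(user_agent, url):
            continue  # the webmaster asked spiders to stay out of this page
        schedule.append((when, url))
        when += delay  # wait before the next request to the same server
    return schedule


if __name__ == "__main__":
    pages = [
        "http://example.org/",
        "http://example.org/memberlist.php",
        "http://example.org/forum/thread1",
    ]
    for seconds, url in plan_fetches(pages, ROBOTS_TXT):
        print(f"t+{seconds:>5.1f}s  GET {url}")
```

A real spider would sleep between requests rather than just build a timetable, and would load robots.txt from each site it visits, but the two checks are the same: ask before fetching, and leave a gap between hits.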

And, parlo americano, smilie noted, but “through the roof” means a doubling of traffic overnight, from a typical 900 sessions (or 3000 hits)/day to 2000 (6000 hits) several days in a row. Not close to the volume of Amazon, but this is a regional, non-profit site and probably the most visited one in the County I live in, more than the Chamber of Commerce. And it shares a single T1 line with 2000 more dial-up users (100 modems) and other web sites on the same server, so I hate to overload it.

Certainly not. But I really never have heard of it before.

And you did mean to type “google”, did you not? :wink:

Christ, man, you haven’t heard of googoo either?! :wink: