Deep Web Search Engines

Go_You_Big_Red_Fire_Engine · June 20, 2004, 11:18am

I tried googlin’ for one, and unsurprisingly enough, came up with nothing. I have heard about the “wonders” of the deep web, and want to try it out. Any recommendations, or is it not worth the trouble?

I considered putting this in IMHO, but I do want factual answers and statistics. Our engineering lecturer said something along the lines of “google only contains 3% of the available content on the web.” Verification?

alterego · June 20, 2004, 11:44am

I have a link to a deep-web website in the references sticky. There was a book written on it and they published a website alongside it. It really hasn’t been updated since 2001, so some of the search links might not be accurate; however, they accounted for this by simply providing links to the websites as well as the websites’ search engines. IMHO, I don’t recommend the book.

I also want to point out that by its nature the deep web has no search engines. See this quote from Wikipedia:

While you can be pointed towards websites that are members of the deep web, once they have been spidered by a search engine (search engines run by the very site notwithstanding) they are no longer part of the deep web!

In reference to your sig, “Choke a fish or kill a tree?”

alterego · June 20, 2004, 11:45am

Sorry - the website is http://www.invisible-web.net/

Go_You_Big_Red_Fire_Engine · June 20, 2004, 11:48am

That’s what googling turned up, but it didn’t help me much.

But your post does make so much sense. I feel really dumb now. :smack:

In response to my sig, I don’t use paper bags either? It’s all backpack stuffing for me.

Go_You_Big_Red_Fire_Engine · June 20, 2004, 11:50am

I take that back. I was at http://invisibleweb.com :smack:

alterego · June 20, 2004, 11:57am

That sucks. Damn those squatters!

Chronos · June 20, 2004, 7:08pm

There’s also the matter that most of the deep internet is specialized information. For instance, a researcher might have his computer hooked up to the Internet so he can run his simulations remotely, using a password. That computer is then part of the deep internet. But only people who have a reason to use it can access it.

alterego · June 20, 2004, 8:01pm

From everything I have read, including the book above, it only applies to publicly accessible web sites that a search engine could not possibly crawl because the pages are generated dynamically and simply do not exist in static form - thereby making themselves invisible or “deep”.

To requote Wikipedia:

A definition given by Berkeley seems to imply the same - pages that are searchable by the end user:

The “invisible web” is what you cannot retrieve (“see”) in the search results and other links contained in these types of tools.

* **Searchable Databases**. Most of the invisible web is made up of the contents of thousands of specialized searchable databases **that you can search via the Web**. The search results from many of these databases are delivered to you in web pages that are generated just in answer to your search. Such pages very often are not stored anywhere: it is easier and cheaper to dynamically generate the answer page for each query than to store all the possible pages containing all the possible answers to all the possible queries people could make to the database. Search engines cannot find or create these pages.

* **Excluded Pages**. There are some types of pages that search engine companies exclude by policy. There is no technical reason they could not include them if they wanted. It's a matter of selecting what and what not to include in databases that are already huge, expensive to operate, and whose search function is a low revenue producer.

Although it could be said that your researcher’s password-protected databases are searchable via the web, if one has certain credentials, it seems to go against the general “spirit” of the deep web. It should also be said that the entire reason the concept was named was not that there were password-protected databases out there, but that search engines simply didn’t have the capability of crawling them, whereas an end user at the site certainly could search them.
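To make the crawling point concrete, here is a rough toy sketch in Python. Everything in it is made up for illustration (the `STATIC_PAGES` site map, the `RECORDS` database, and the `search` and `crawl` helpers); it is not how any real search engine works, just a model of a spider that only follows links and therefore never reaches pages that exist only as answers to a query.

```python
from collections import deque

# Hypothetical toy site: each static page and the links it contains.
STATIC_PAGES = {
    "/": ["/about", "/search"],
    "/about": ["/"],
    "/search": [],  # a search form; it links to no result pages
}

# Hypothetical database sitting behind the search form.
RECORDS = ["deep web primer", "invisible web directory", "fire engine history"]


def search(query):
    """Build a result page on demand; nothing is stored ahead of time."""
    matches = [r for r in RECORDS if query.lower() in r.lower()]
    return "/search?q=" + query, matches


def crawl(start="/"):
    """Breadth-first crawl that only follows links found on pages."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in STATIC_PAGES.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


crawled = crawl()
url, matches = search("web")
print(sorted(crawled))   # the pages the spider reached by following links
print(url in crawled)    # False: the result page was never linked from anywhere
```

A person sitting at the `/search` form gets the result page right away, while the link-following spider never sees it, which is the sense of “deep” described above.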

Garfield226 · June 20, 2004, 8:10pm

So the specialized databases that you generally have to have a membership to use (usually through a university or business), Lexis-Nexis being the best known (IME), wouldn’t count?

That’s what I immediately thought of when I read alterego’s first reply.

alterego · June 20, 2004, 8:13pm

I personally don’t think they apply. I have access to hundreds of databases online through my university but they aren’t public, they are private. Yes they are ‘invisible’ or ‘deep’ but not for the same reasons that the term “Deep Web” was coined.

Maybe someone else has some ideas about it.

Chronos · June 20, 2004, 8:23pm

OK, then, a misunderstanding of terminology. Is there some term for the restricted portion of the Internet? Because it seems to me that that would be even larger than the public deep web.

Also, the quote

is perhaps a bit misleading, since the search results pages themselves are dynamically generated pages built from a database, and would thus be part of the deep web. So search engines can, in fact, create some of the deep web pages.

alterego · June 21, 2004, 8:26am

That’s an interesting idea. Technically speaking the exact pages that Google spits out do not exist in static form. On the other hand they are just dynamic representations of static data that we already have access to.

It still sounds like they apply, which probably makes places like archive.org with their petabyte boxes and Google some of the biggest sources of the deep web themselves.

“I am my greatest problem”

I don’t know of a term for what you ask. Probably “private web” vice “public web” would work, though that could get confused with a VPN.

Dewey_Finn · June 21, 2004, 2:49pm

FYI, there’s an article (free registration required) about this issue in today’s New York Times. It discusses how even academics tend to use only online sources to do research and how librarians are responding to this.
