Search Engine / HTML question

Duderdude2 · March 19, 2004, 6:58am

Is there a command I can place (perhaps in the header) that would prevent a search engine from cataloging my page (e.g. adding it to their search engine)? Thanks!

Q.E.D · March 19, 2004, 7:09am

I don’t believe there is; however, most (if not all) search engines will remove you if you specifically request it.

brianjedi · March 19, 2004, 7:15am

In the actual HTML, no.

However, you can put a robots.txt file in the root directory of the website (such as http://www.websitename.com/robots.txt) and that should do it. There are some spiders, however, that will ignore a robots.txt file.

For how to construct your robots.txt file, try this: http://www.robotstxt.org/wc/norobots.html .
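
A minimal sketch of what such a file might contain, written as a small JavaScript helper so the text can be generated and checked. The helper name `buildRobotsTxt` and the example directories are made up for illustration; `User-agent` and `Disallow` are the directives described at the link above. This only asks spiders to stay away, and, as noted, some will ignore it.

```javascript
// Hypothetical helper: builds the text of a robots.txt file that asks
// every robot to stay out of the given paths.
function buildRobotsTxt(disallowedPaths) {
  const lines = ['User-agent: *']; // "*" addresses all robots
  for (const path of disallowedPaths) {
    lines.push('Disallow: ' + path); // one Disallow line per path prefix
  }
  return lines.join('\n') + '\n';
}

// Ask robots to skip the whole site:
console.log(buildRobotsTxt(['/']));

// Ask robots to skip only two (made-up) directories:
console.log(buildRobotsTxt(['/private/', '/drafts/']));
```

The resulting text would be saved as robots.txt in the site's root directory, as described above.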

Cugel · March 19, 2004, 7:15am

http://www.google.com/webmasters/3.html
http://www.robotstxt.org/wc/norobots.html

brianjedi · March 19, 2004, 7:16am

Great minds think alike, eh, Cugel?

Duderdude2 · March 19, 2004, 7:18am

Well, here’s the problem. My website uses frames, though Google is cataloging every page but the main one for some bizarre reason. I really don’t mind that it’s indexing them, but it can be troublesome since visitors to said pages are effectively missing half of the site (because of the frames). I’ve remedied the problem somewhat by adding a link to the index on every document; but it still bothers me that Google neglects the home page.

Duderdude2 · March 19, 2004, 7:19am

Well damn, in the time it took me to write my last message, I got 3 f’ing replies, haha.

Anyways, thanks guys, I’ll look into those to see if they help.

Dancing_Fool · March 19, 2004, 7:43am

Not strictly true. As well as the robots.txt, there is a ROBOTS meta-tag that works on a page-by-page basis; however, many search engines ignore it. Google does not.

But that wouldn’t solve your problem, Duderdude2. The reason Google is doing what it does is discussed in the Google webmasters section. It’s because you are using frames. Google is not ignoring your home page - it is returning the URL of the page that actually has the information requested by the search. It doesn’t know that it should be part of a frameset. It will only return your homepage if that is the page that has the data or if it thinks your entire website matches the search query. If you tell the search engine not to look at your sub pages, then you will get no hits at all, because they will never be looked at.

DancingFool
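
For illustration, a sketch of the ROBOTS meta tag mentioned above, produced by a small JavaScript helper. The function name `robotsMetaTag` and its two parameters are hypothetical; the `noindex`/`nofollow` values are the standard ones. The tag itself goes in the `<head>` of each page it should apply to.

```javascript
// Hypothetical helper: returns the ROBOTS meta tag for a single page.
function robotsMetaTag(allowIndex, allowFollow) {
  const content = [
    allowIndex ? 'index' : 'noindex',    // may this page be listed in results?
    allowFollow ? 'follow' : 'nofollow', // may links on this page be followed?
  ].join(', ');
  return '<meta name="robots" content="' + content + '">';
}

// Keep this page out of the index, but let its links be followed:
console.log(robotsMetaTag(false, true));
// prints: <meta name="robots" content="noindex, follow">
```

As the post above points out, applying this to the sub-pages of a framed site would simply remove them from the results rather than make the home page show up instead.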

typhoon · March 19, 2004, 8:12am

Info on the ROBOTS meta tag.

Robots.txt is your best bet, though; if you find any search engines that aren’t adhering to it, ban them.

As for your specific problem: I don’t think blocking Google from accessing those pages is the answer, especially since it’s the single largest referrer on the 'net. Most users will enter your web site by finding an innermost page on a search engine; this is the nature of the internet and you should design accordingly instead of resisting it.

You could also try using JavaScript to force frames.

Mops · March 19, 2004, 8:19am

Possible ways to address your problem:

Don’t use frames [best, as frames are evil. I know of no commercially successful web site that uses frames.]
Include easily visible links to the frameset page on the framed page, so people who find the framed page in a search engine will be able to navigate to the frameset
Include Javascript code that, when the framed page is opened out of the frameset’s context, opens the frameset instead (in this case the frameset’s navigation must lead users easily to the desired content).

CurtC · March 19, 2004, 2:12pm

That, indeed, is the problem.

fezpp · March 19, 2004, 3:36pm

One way to help with this problem (but not solve it) would be to put appropriate keywords in a meta tag on the index page. This might mean that Google will pick up the index page rather than just the pages that actually hold the data.

DoubleJ · March 19, 2004, 4:25pm

Another thing you can do is use Javascript in the framed pages to make the page load correctly. If you feel like mucking around with things like query strings you can even make a specific page come up (like if Google indexed a content page) when you put the frames back in. But I’d make sure the frames are necessary first; they’re almost always more trouble than they’re worth.

Anyway, this should work. Haven’t tested it though.



<script type="text/javascript">
<!-- // hide the script from the few old browsers that can't handle script tags
if (window.top === window.self) { // true when the page is not inside a frame
    // replace() keeps this page out of the history, so Back doesn't bounce the visitor here again
    window.location.replace('/index.html');
}
//-->
</script>
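
The query-string idea mentioned above could be sketched roughly as follows. Everything named here is an assumption for illustration: the `page` parameter, the frame name `content`, the fallback `/home.html`, and all four function names. It also relies on `URLSearchParams`, which current browsers provide. The content page passes its own path to the frameset, and the frameset loads that path into its content frame.

```javascript
// Builds the frameset URL that remembers which content page was requested.
function framesetUrlFor(pagePath) {
  return '/index.html?page=' + encodeURIComponent(pagePath);
}

// Reads the requested page back out of the frameset's query string.
// Only same-site absolute paths are accepted, so the parameter cannot be
// used to load somebody else's site into the frame.
function pageFromQuery(search, fallback) {
  const page = new URLSearchParams(search).get('page');
  const isLocalPath = page !== null &&
    page.startsWith('/') && !page.startsWith('//') && !page.includes('\\');
  return isLocalPath ? page : fallback;
}

// Call in every content page, e.g. restoreFrames(window).
function restoreFrames(win) {
  if (win.top === win.self) { // opened outside the frameset
    win.location.replace(framesetUrlFor(win.location.pathname));
  }
}

// Call from the frameset's onload handler, e.g. showRequestedPage(window).
// Assumes the content frame is declared as <frame name="content" ...>.
function showRequestedPage(win) {
  const target = pageFromQuery(win.location.search, '/home.html');
  win.frames['content'].location.replace(target);
}
```

A visitor arriving from a search engine at `/docs/page2.html` would then be sent to `/index.html?page=%2Fdocs%2Fpage2.html`, and the frameset would show that page in its content frame instead of the default one.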

Duckster · March 19, 2004, 4:56pm

If all search engines followed the rules of the game, yes. However, many do not, and their spiders will search every area of your web site, even if you use a robots.txt file and use the META tags to disallow searching. It only takes one search engine to find things you do not want seen and the game is over. It may take time, but it will happen.

If you don’t want something searched, don’t put it on the web.

Tim_T-Bonham.net · March 20, 2004, 12:03am

I agree with the other posters that frames :mad: are the source of your problem. Dump them as soon as you can. This search engine difficulty is only one of the problems they cause. Another example: if an impressed visitor to your website wants to send an enthusiastic email to their friends saying “you’ve got to look at this great website!”, it’s really hard for them to cut-and-paste your URL into their email. So you lose that word-of-mouth recommendation, which is one of the most useful parts of the web.
Real professional web designers mostly stopped using frames circa 1999.
