I’m new to this website building thing, so there’s a bit of a learning cliff. It probably wasn’t the best choice, but I used Yahoo for hosting and Yahoo Site Builder* to build the site, which is for somebody else. This person does not want her website to be picked up by the search engines. So, following instructions, I placed this code in the head section of every page:
<meta name="robots" content="noindex">
My question is, does this reliably work? Will none of the content be indexed? What about page titles and other header content?
*That’s why I’m asking here–Yahoo’s help center isn’t very – um, helpful.
You can do that plus you can also add a robots.txt file.
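For anyone wondering what such a robots.txt might contain, here is a rough sketch rather than a definitive recipe. The two file contents, the page name /contact.html, and the user agent ExampleBot are made up for illustration; Python’s standard urllib.robotparser is used only to show how a well-behaved crawler would read the rules. One caveat worth knowing: a crawler that obeys a Disallow rule never fetches the page, so it cannot see a noindex meta tag on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of the whole site.
HIDE_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Hypothetical variant asking crawlers to skip a single page only.
HIDE_CONTACT_ONLY = """\
User-agent: *
Disallow: /contact.html
"""


def may_crawl(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a rule-abiding crawler would be allowed to fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(may_crawl(HIDE_EVERYTHING, "/index.html"))
    print(may_crawl(HIDE_CONTACT_ONLY, "/index.html"))
    print(may_crawl(HIDE_CONTACT_ONLY, "/contact.html"))
```

The file itself is just the text between the triple quotes, saved as robots.txt at the top level of the site. It is only a request: nothing in it forces a robot to comply.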
Also sign up with Google Webmaster Tools to make sure they get your robots.txt right away. I think you can also specifically delete pages from their index if any do get indexed. You can do the same using Yahoo’s tools.
I don’t know if there are any unscrupulous spiders/robots out there that would ignore the nofollow and noindex rules and robots.txt. But if something does pick the site up and Google or Yahoo then gets links via the unscrupulous spiders, you can use those tools to remove them.
There are. Email address-harvesting spambots love to look at robots.txt for pages that may have email addresses on them.
But the major search engines will respect the rules in robots.txt and your meta tags (as long as they’re written properly).
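To check the "written properly" part, something like the following could be run against a saved copy of a page. It is only a sketch: the class and function names are invented here, and it uses Python’s standard html.parser, which is not necessarily how any search engine parses pages. It does show one common pitfall: typographic (curly) quotes are not attribute delimiters in HTML, so they end up as part of the attribute value and the tag no longer names "robots".

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attributes without a value arrive as None; treat them as empty.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag containing noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    finder.close()
    return "noindex" in finder.directives


# Straight quotes: parsed as intended.
GOOD = '<head><meta name="robots" content="noindex"></head>'
# Curly quotes, e.g. pasted from a word processor: the name is not "robots".
CURLY = '<head><meta name=“robots”’ content=“noindex”></head>'

if __name__ == "__main__":
    print(has_noindex(GOOD))
    print(has_noindex(CURLY))
```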
If I understand the implications of this–if I happen to build another website that I don’t want to hide from the search engines, I should still hide the contact page then, right?
Really it’s just best to not post email addresses directly to the web. Either present them as a graphic or use a contact form that doesn’t have the email address in the html.
It’s best to use a PHP or Perl script (there are tons of free ones online if you Google around) to make a contact form.
You can also Google around for known robots that refuse to obey the noindex meta tag and you can exclude those IP addresses.
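As a rough idea of what excluding IP addresses means in practice, here is a small sketch. The addresses are reserved documentation ranges, not real offenders, and the function name is made up; on a real host this would more likely be done in the web server’s configuration than in a script, which is an assumption about the setup.

```python
import ipaddress

# Placeholder blocklist using documentation-only ranges (not real robots).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # a whole range
    ipaddress.ip_network("198.51.100.17/32"),  # a single address
]


def is_blocked(remote_addr: str) -> bool:
    """Return True if the visitor's address falls inside a blocked network."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a valid IP address, so it cannot match any entry in the list.
        return False
    return any(ip in network for network in BLOCKED_NETWORKS)


if __name__ == "__main__":
    print(is_blocked("192.0.2.44"))
    print(is_blocked("203.0.113.5"))
```

Keep in mind that robots can change addresses, so a list like this needs upkeep and will never be complete.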
Also beware of fakes. Many email bots send a standard message threatening legal action, hoping that you’ll reply so they can get your address.
The best way I’ve found is to create a PHP form; that way your email address stays hidden behind the server. But make up a weird email address and change it every month. Since the PHP contact form will not show any email address, you can change it whenever you need to.
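The idea behind such a form can be sketched as follows. This is written in Python purely for illustration (the posts above suggest PHP or Perl), and every name and address in it is a placeholder. The point is that the owner’s address lives only in server-side code and never appears in the HTML sent to visitors. Actually sending the message (for example with smtplib) is left out.

```python
from email.message import EmailMessage

# Placeholders: these live on the server only and are never put in the page.
OWNER_ADDRESS = "owner@example.com"
FORM_SENDER = "webform@example.com"


def _one_line(text: str) -> str:
    """Collapse all whitespace, including line breaks, into single spaces."""
    return " ".join(text.split())


def build_contact_message(visitor_name: str, visitor_email: str, body: str,
                          owner: str = OWNER_ADDRESS) -> EmailMessage:
    """Turn submitted form fields into an email addressed to the site owner."""
    reply_to = _one_line(visitor_email)
    # Reject anything that is not a single token containing "@"; this also
    # stops attempts to smuggle extra headers in via line breaks.
    if "@" not in reply_to or " " in reply_to:
        raise ValueError("visitor email does not look valid")

    msg = EmailMessage()
    msg["From"] = FORM_SENDER
    msg["To"] = owner
    msg["Reply-To"] = reply_to
    msg["Subject"] = "Contact form: " + _one_line(visitor_name)
    msg.set_content(body)
    return msg


if __name__ == "__main__":
    print(build_contact_message("Ann", "ann@example.org", "Hello there"))
```

Changing the address every month, as suggested above, then only means editing OWNER_ADDRESS on the server; the form page itself stays the same.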
I’m curious about this decision (in a non-challenging way) - she does realise that her site will still be accessible to the general public, if they know where it is or follow an inbound link (which probably will be indexed) from some other site?
Why would anyone want a website, but not want people to find it? I’m genuinely puzzled.