Why when I use a search engine for small, obscure sites – and big ones, too – does the little box for search terms seem to “remember” words and phrases that I’ve entered in other places? For example, after I use Google, other search boxes will display those search terms. This site, Straight Dope, even displayed the user name that I use at the New York Times, for heaven’s sake. With all the recent news about how Google is going to track my every keystroke, I find this especially disturbing.
Thank you for your insights.
Signed: Just an English Major.
It’s your browser that’s tracking your keystrokes. Check out this guide for turning it off in MSIE. Or switch to Firefox, which has AutoComplete turned off by default, and is a better browser (IMHO).
As well as “how to turn this off”, the OP is specifically asking why auto-complete entries from one site should show up when filling out a form on a completely different site.
I’m pretty sure this is because Internet Explorer (and probably other browsers) only match on the actual name of the text input field, not the domain, when “autocomplete” = “on”. Thus, the Straight Dope might have a text input box called “username” and the NYT might happen to have the same field name on their HTML form; the browser just compares the names and offers the same autocomplete entries in both cases.
Here’s an MSDN page for the terminally curious: http://msdn.microsoft.com/workshop/author/forms/autocomplete_ovr.asp
Note that it’s a naive match on element name - it doesn’t say the domain (NYT, Straight Dope) has to match.
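To make the name-only matching described above concrete, here is a minimal sketch in Python. It is a toy model, not how any browser actually stores form history; the class `NaiveAutoComplete`, its methods, and the sample user name are all made up for illustration.

```python
from collections import defaultdict


class NaiveAutoComplete:
    """Toy model of form history keyed only by the input field's name."""

    def __init__(self):
        # field name -> values typed so far; the site is deliberately NOT part of the key
        self._history = defaultdict(list)

    def remember(self, site, field_name, value):
        # 'site' is accepted but ignored, which is the whole point of the sketch
        if value not in self._history[field_name]:
            self._history[field_name].append(value)

    def suggest(self, site, field_name, prefix=""):
        # Suggestions depend on the field name and typed prefix, never on the site
        return [v for v in self._history[field_name] if v.startswith(prefix)]


store = NaiveAutoComplete()
store.remember("nytimes.com", "username", "english_major")

# A different site with a field that happens to share the name gets the same entry
print(store.suggest("straightdope.com", "username"))
```

Because the site never enters the lookup key, anything typed into a field called "username" on one site is offered on every other site that uses the same field name. Nothing is sent anywhere; the list lives on the user's own machine.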
Hopefully, you can be reassured that Google doesn’t track your searches - they’re recorded by your web browser, and if you don’t like that, it’s easily disabled, as per the instructions in Jurph’s link.
Oh, yes, they do. Or rather, they can, if you sign up for Personalized Searches. Once you do that, Google keeps track of your searches and uses this stored information to better target your search results in the future.
As QED said, if you’re logged in this definitely happens. But even if you’re not logged in:
I’m not sure how that is “non-personally identifying”. It’s like saying your fingerprints are non-personally identifying simply because they don’t have your name directly encoded in them. In fact, your ISP can take that IP address that Google has along with the timestamp and then determine who you are.
What about aggregating this information over time? Check out Karen’s Cookie Viewer. By default, Google sets its cookies to last as long as possible:
My current Google cookies expire on 1/18/2038. That’s circa 11,791 days, 2 hours, 12 minutes, 23 seconds from now (the same date for each of googleblog.blogspot.com, google.com and blogger.com).
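The arithmetic behind a figure like that can be checked with a few lines of Python. This is only a sketch: the "now" below is a hypothetical date chosen because it reproduces the quoted day count, and treating the expiry as midnight UTC is also an assumption. The last lines show the largest time a signed 32-bit Unix timestamp can hold, which may be why the expiry sits where it does.

```python
from datetime import datetime, timezone

# Expiry date quoted in the post; midnight UTC is an assumption
expiry = datetime(2038, 1, 18, tzinfo=timezone.utc)

# Hypothetical "now", chosen only for illustration
now = datetime(2005, 10, 7, tzinfo=timezone.utc)

# Whole days between the two dates
days_left = (expiry - now).days

# Largest moment representable as a signed 32-bit count of seconds since 1970
max_32bit = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)

print(days_left)              # 11791
print(max_32bit.isoformat())  # 2038-01-19T03:14:07+00:00
```

The quoted expiry falls about a day short of that 32-bit ceiling, so "as long as possible" is close to literal on systems that store times that way.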
Google also uses JavaScript to silently (as in, without refreshing the display) rewrite URLs so that you are sent through a forwarding URL on the Google domain, which tells them which link you clicked.
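The forwarding trick is easy to picture as a URL transformation. The sketch below models only that transformation, in Python rather than in the browser-side JavaScript a real page would use. The endpoint `https://search.example/url` and the parameter name `q` are hypothetical, not Google's actual redirect format.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical forwarding endpoint on the search engine's own domain
REDIRECT_ENDPOINT = "https://search.example/url"


def wrap_link(target_url):
    """Rewrite a result link so the click passes through the forwarder first."""
    return REDIRECT_ENDPOINT + "?" + urlencode({"q": target_url})


def unwrap_link(redirect_url):
    """What the forwarder does on its side: read the real destination back out."""
    query = parse_qs(urlparse(redirect_url).query)
    return query["q"][0]


wrapped = wrap_link("http://www.straightdope.com/")
print(wrapped)
print(unwrap_link(wrapped))
```

The server behind the forwarding URL sees the destination in the request, can log it, and then redirects the browser to the real page, so the user usually notices nothing.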
It comes down to whether or not you trust Google. John Battelle’s recent book, The Search, has a great section that delves into this topic. If you want to protect yourself and only want to use Google’s vanilla search product, disable cookies and JavaScript in your browser. Further, apply the guidelines in this EFF document for blogging anonymously to your searching.
I feed G as much data as I can, but that’s just me.
Actually, there is an Internet almost-standard for this. The idea was to make things easy for the end user: most sites would use the same internal names for the same fields, so the poor user could browse to a strange page on a strange site and find that his browser had already filled out his name, address, phone number, etc. All the user had to do was click [submit]. See RFC 3505: Electronic Commerce Modeling Language (ECML): Version 2 Requirements and http://www.rfc-archive.org/getrfc.php?rfc=3106.
From an end-user perspective it’s a little scary because, as the OP indicates, he/she can’t tell whether the data originated from the server or from the client, or whether it’s already been transmitted to the server without their permission.
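A rough sketch of the shared-field-name idea, in Python: a profile kept on the client is matched against whatever field names a form declares. The field names follow the `Ecom_...` style of RFC 3106, but treat the exact spellings as an assumption to check against the RFC. The profile values and the `autofill` helper are invented for illustration.

```python
# Profile stored on the user's own machine, keyed by ECML-style field names
# (spellings are an assumption; check them against RFC 3106)
profile = {
    "Ecom_ShipTo_Postal_Name_First": "Pat",
    "Ecom_ShipTo_Postal_Name_Last": "Example",
    "Ecom_ShipTo_Postal_City": "Chicago",
    "Ecom_ShipTo_Online_Email": "pat@example.com",
}


def autofill(form_field_names, profile):
    """Return values for the fields the profile knows; skip anything else."""
    return {name: profile[name] for name in form_field_names if name in profile}


# A form on a site the user has never visited before
form_on_strange_site = [
    "Ecom_ShipTo_Postal_Name_First",
    "Ecom_ShipTo_Postal_City",
    "coupon_code",  # site-specific field, not in the profile
]

filled = autofill(form_on_strange_site, profile)
print(filled)
```

In this model the values are filled in on the client and reach the server only when the form is submitted, which is exactly the distinction the user cannot see from looking at the page.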
Oh crud… Forgot all about their “personalized” searches and information harvesting. :smack: And yes, if you let them, they will quite happily absorb a lot of information. I just discovered that if you use the Google Toolbar’s spell check, they get that info, and likewise if you have the PageRank feature enabled. The AutoLink and Translation features also send your info to the mother ship.
I was only referring to whatever appears in a drop-down box, earlier.