In this [thread=396958]MPSIMS thread[/thread] about googling for your Social Security number, there are posts warning you NOT to type your private info into a google search.
Can someone please explain this to me? What is the “referrer string”?
I imagine that it works like this:
If I search for my Social Security number, google finds a hit (say, a university), and then I click on it, google remembers that there is a logical connection between my number and a web page called, say, “famous_students_who won a Nobel prize but still owe money to the university.html”
So now google remembers a “referrer string” that shows a relevant connection between that web page and my Social Security number. I suppose that means that the next time I search for my SSN, google will give a higher rank to the university page. Higher than, say, a page from the county jail web site called “famous criminals who have escaped.html” (I am assuming that my SSN appeared on both pages.)
But my number was already listed on those web pages, otherwise google wouldn’t have found it when I searched. So my “confidential” information is still open to the whole world. What harm does it do for me to search for it?
What am I not understanding?
By “referer string”, they mean the HTTP Referer header. Any time you click a link on any web page, not just google, your browser sends along a header called “Referer” with the URL of the page the link is on. In the case of a google search page, that will indeed include the search keys in the URL. The Referer header is there because it serves a useful purpose, although it is open to some minor abuse. It lets the site know that your request came via a link from somewhere. This is how sites prevent direct links to their graphics, for instance. If you want to avoid the header, right-click on the link, select Properties, copy the URL, and paste it into your browser's address bar.
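To make that concrete, here is a rough Python sketch of the request a browser would build when you click a search result, assuming it passes along the full URL of the results page as described above. The function name, the target URL, and the number are all made up for illustration, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_click_request(search_terms, target_url):
    """Build (but do not send) the request a browser would make after a
    click on a search result. Hypothetical helper for illustration only."""
    # The results page URL carries the search terms in its query string.
    search_page = "https://www.google.com/search?" + urlencode({"q": search_terms})
    # The browser names that page in the Referer header of the next request.
    return Request(target_url, headers={"Referer": search_page})


# Placeholder values, not real data.
req = build_click_request("123-45-6789", "http://www.example.edu/alumni.html")
print(req.get_header("Referer"))
```

Whatever you typed into the search box ends up inside that header, which is why the site you land on gets to see it.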
A Referer tells the web server you are visiting how you got there.
Why is this important …
Take Slashdot.org - a tech news and comment site. As soon as an item with a web link is posted, that site will be hammered into oblivion by thousands of geeks. However, a smart web admin will detect lots of hits with a slashdot.org referer, and divert them to a static cache or “sod off slashdot” message to survive the slashdotting. Digging and Farking should be dealt with in the same way.
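A minimal sketch of that kind of diversion, in Python. The host list and the "static-cache" / "live" labels are my own assumptions, and a real site would more likely do this in its web server configuration than in application code.

```python
from urllib.parse import urlparse

# Assumed list of sites known for sending sudden floods of visitors.
FLOOD_SOURCES = {"slashdot.org", "digg.com", "fark.com"}


def route_request(referer):
    """Return "static-cache" for visitors arriving from a flood source,
    and "live" for everyone else (including requests with no Referer)."""
    if not referer:
        return "live"
    host = (urlparse(referer).hostname or "").lower()
    for source in FLOOD_SOURCES:
        # Match the bare domain and any subdomain of it.
        if host == source or host.endswith("." + source):
            return "static-cache"
    return "live"


print(route_request("http://slashdot.org/article.pl"))
print(route_request("http://www.example.com/links.html"))
```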
In the case of Google, a canny webmaster will be really interested to know not just that the visitor came from Google, but what search took them there. This might inform them about adwords to purchase, or let them modify the content of the page to better catch particular searches, or show them that they have been Googlebombed.
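For illustration, pulling the search terms back out of a Google referer could look something like this. It assumes the terms travel in a `q` query parameter, as in the usual search URL; the function name and the sample URL are invented.

```python
from urllib.parse import urlparse, parse_qs


def search_terms_from_referer(referer):
    """Return the search terms found in a Google referer URL, or None if
    the referer is not from Google or carries no "q" parameter."""
    parts = urlparse(referer)
    host = (parts.hostname or "").lower()
    if host != "google.com" and not host.endswith(".google.com"):
        return None
    # parse_qs decodes "+" and %-escapes back into the text the user typed.
    values = parse_qs(parts.query).get("q")
    return values[0] if values else None


print(search_terms_from_referer("http://www.google.com/search?q=nobel+prize+debtors&hl=en"))
```

A webmaster would typically run something like this over the server logs to see which searches are bringing people in.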
In your case, if you make a Google search on your SSN, not only does Google now know your SSN, and may use it to modify its index (probably temporarily), but any site you click on will also receive your SSN as a referer, and your current IP address. Now, they already had your SSN, or maybe they just had another search term in the string. But now they have more data on you, and could now put it all together to do something nefarious.
That may not protect you, however. There is research going on regarding the ability to identify people solely from their search patterns. For example, if you pulled all the searches I’ve ever made, you’d find lots of searches that use my city name, when I’m searching for restaurants and the like. Then you might find more searches for my area of the city. Then searches for my kid’s school. I’m sure you could easily figure out if I was male or female, what kind of vehicle I drive, what my hobbies are, etc. Eventually, you could probably zero in on me. You might determine that I owned an airplane, and even the registration number. Then it’s an easy lookup in the aircraft database to find out who I am. If I never searched by registration, you could probably pull up all the aircraft owners in my city, cross reference the names to the kids in my daughter’s school, and find me that way.
And once you know who I am, you could use all my Google searches to put together a profile of me that’s deadly accurate.
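The cross-referencing step described above is basically a set intersection. Here is a toy Python sketch; every name and list in it is invented purely for illustration, and real records would need much fuzzier matching than exact surnames.

```python
def cross_reference(*candidate_lists):
    """Return the names that appear in every one of the given lists."""
    if not candidate_lists:
        return set()
    remaining = set(candidate_lists[0])
    for names in candidate_lists[1:]:
        # Each extra clue throws out everyone who does not match it.
        remaining &= set(names)
    return remaining


# Entirely made-up placeholder data.
aircraft_owners_in_city = {"Sample", "Example", "Placeholder"}
surnames_at_school = {"Example", "Madeup", "Fictional"}

print(cross_reference(aircraft_owners_in_city, surnames_at_school))
```

Two fairly large lists can shrink to a single name once they are combined, which is the whole worry.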
Specifically, the research is about how not to be able to identify people from their search patterns, i.e. how to obscure the information in a way that will make it impossible (or at least very difficult) to trace back to a particular individual. The folks who work at Google, Yahoo!, etc, don’t want anyone to be able to identify you any more than you do.
Then again, I’m sure the government is funding projects to be able to identify people using searches. It’s somewhat adversarial. Most research in that area, though, is about obscuring that information.