My awesome Google search moment

I was reviewing medications for a study I’m working on, and a study subject had taken ibuprofen for “…and elaride pain,” as it was recorded in the database. “Elaride?” I asked myself, and off I went to Google. Googling produced a link to a Medical Lexicon in Google Books. I pulled it up, and there’s this old lexicon with what looks like “elaride” highlighted near the binding side of the page. On closer examination it’s not “elaride” but “clavicle,” which makes perfect sense in my case because another section of the patient’s file mentions clavicle pain. I guess data entry just typed “elaride” since that’s what the handwriting looked like, even though it doesn’t seem to exist as a body part.

So Google took “elaride” and image-matched it to “clavicle” in an old dictionary, because the bent page made “clavicle” look sort of like “elaride”? That’s pretty mind-boggling!

Cool.

Hmm, wonder if someone came up with that in a CAPTCHA?

It took me a few minutes to see how the two were so similar, but when I did I realized something: if Google misread it using software, in much the same way that you or I might have misread it, that is some scary-advanced software!

I’m voting for a CAPTCHA as well, specifically reCAPTCHA.

That’s pretty dumb software. Advanced software would, like a human brain (normally), recognize that “elaride” isn’t a word and fiddle about with the image until it saw “clavicle”.

So now that highlighted image is going to show up on a login page somewhere and the person has to answer “clavicle”? And then that will contribute to the digitizing of the medical lexicon?

More likely, it scanned that book with crappy OCR and matched the text.

Something along those lines. It works well. Look up Luis von Ahn or watch his TED Talk. He’s one of the brains behind CAPTCHA, reCAPTCHA and Duolingo.

Except they’d have to type something like “elaride” because that’s what Google thinks it is.

Or, quite likely, Google sent the word to reCAPTCHA, and that’s what people thought it looked like. That’s one of the flaws of the system: no context.

I’ll bet they keep track of the different answers for a reCAPTCHA and that there are multiple answers, including both ‘clavicle’ and ‘elaride’. If they’re really good, they’ll allow an erroneous word to go extinct if it doesn’t get enough reinforcement from searches or appearances in other texts.

Of course, if they did, we’ve just completely screwed that up.
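
For what it’s worth, here is a rough sketch of the “keep track of the different answers” idea from a couple of posts up. I have no idea how reCAPTCHA actually does this; the function names (`consensus`, `prune`) and the thresholds are made up purely for illustration. The idea is to tally what people typed for one word image, accept a reading only when enough people agree, and drop readings that get too little support.

```python
from collections import Counter

def consensus(answers, min_votes=3, min_share=0.6):
    """Return the agreed reading for one word image, or None if no clear winner.

    min_votes and min_share are hypothetical thresholds, not real reCAPTCHA values.
    """
    counts = Counter(a.strip().lower() for a in answers)
    if not counts:
        return None
    word, votes = counts.most_common(1)[0]
    total = sum(counts.values())
    if votes >= min_votes and votes / total >= min_share:
        return word
    return None

def prune(counts, min_share=0.2):
    """Drop readings whose share of the votes is below min_share (the 'extinction' idea)."""
    total = sum(counts.values())
    return Counter({w: n for w, n in counts.items() if n / total >= min_share})

# Four people read "clavicle", one reads "elaride".
print(consensus(["clavicle", "Clavicle", "clavicle", "clavicle", "elaride"]))
# A rare misreading drops out once it falls below the share threshold.
print(prune(Counter({"clavicle": 9, "elaride": 1})))
```

With only two answers that disagree, `consensus` returns `None`, which is roughly the “no context” problem: without more votes or a dictionary to check against, the system can’t tell which reading is right.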