How do "click here if you're not a robot" buttons from reCAPTCHA work?

Free advertising.

For what it’s worth, I think there is data associated with the mouse click that goes into the algorithm, but I think it’s probably ignored 99.9% of the time. In almost all cases, the algorithm is going to have your browser fingerprint pegged as a bot or a non-bot based on all the previous data it’s collected on both you and the scammers. The mouse movement stuff is probably only used when it doesn’t have enough information to make a clear determination.
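To make that guess concrete, here is a minimal sketch of a "only look at the mouse when unsure" decision. Everything in it is an assumption for illustration: the function name `classify`, the two scores (0.0 = surely human, 1.0 = surely bot), and the thresholds are hypothetical, and none of it is reCAPTCHA's actual logic.

```python
from typing import Optional


def classify(prior_bot_score: float,
             mouse_bot_score: Optional[float],
             low: float = 0.2,
             high: float = 0.8) -> str:
    """Return 'human', 'bot', or 'challenge'.

    prior_bot_score: hypothetical score from fingerprint/history data.
    mouse_bot_score: hypothetical score from mouse movement, or None
                     if there was no mouse data (e.g. a phone tap).
    """
    # Clear-cut cases: the mouse signal is never consulted.
    if prior_bot_score <= low:
        return "human"
    if prior_bot_score >= high:
        return "bot"

    # Gray zone: fall back to the mouse signal if there is one.
    if mouse_bot_score is not None and mouse_bot_score < 0.5:
        return "human"

    # Still unsure: show the image puzzle.
    return "challenge"
```

With these made-up thresholds, `classify(0.05, 0.99)` comes back `"human"` because the prior alone settles it, while `classify(0.5, None)` comes back `"challenge"`.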

More accurately, the web page can include script (programming) that runs inside your browser and can record anything about you that your browser is willing to share with the script. That can include the cookies we’re all familiar with, but also vastly more than that, including things like mouse & keyboard actions.

Many people use their mouse pointer kind of like a finger sliding across a printed page while reading, so mouse movements and especially mouse lingers over specific images or verbiage on a page can be used as a half-assed form of eye tracking. Your mousing (often) gives away what you’re paying attention to on the page vs what you’re ignoring.
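As a rough illustration of the "mouse lingers" idea, the sketch below adds up how long the pointer sat over each region of a page. It assumes the script has already recorded `(seconds, x, y)` pointer samples and knows the rectangles of the regions it cares about; the names `dwell_times`, `Sample`, and `Rect` are made up for this example and are not from any real tracking library.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]      # (seconds, x, y)
Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


def dwell_times(samples: List[Sample],
                regions: Dict[str, Rect]) -> Dict[str, float]:
    """Total seconds the pointer spent inside each named region.

    The time between two consecutive samples is credited to the
    region that contained the pointer at the earlier sample.
    """
    totals = {name: 0.0 for name in regions}
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        for name, (left, top, right, bottom) in regions.items():
            if left <= x < right and top <= y < bottom:
                totals[name] += t1 - t0
                break  # credit at most one region per interval
    return totals


samples = [(0.0, 10, 10), (1.5, 10, 12), (2.0, 200, 10), (2.5, 200, 10)]
regions = {"ad": (0, 0, 100, 100), "article": (150, 0, 300, 100)}
print(dwell_times(samples, regions))  # {'ad': 2.0, 'article': 0.5}
```

A real script would also have to deal with scrolling, overlapping elements, and long idle gaps, none of which this sketch handles.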

The script may just use the info gathered from your PC right then and there, locally on your PC, to make the human/bot decision and locally select whether or not you see the enabled [Save] or [Post] or whatever button, with the server never being the wiser about any details.

OTOH, there is nothing to prevent the script from “phoning home” (or phoning any 3rd party, such as Google’s ad metrics department) with 100% of the details of what it can get the browser to reveal about you, your current actions, and your lifetime history.

And of course once that data is up on the server, those systems can try to piece together what they’ve learned about you from this single page visit together with whatever they’ve already learned about you from every page on every website you’ve ever hit that uses the same tracking tech, or other cooperating ecosystems of tracking tech.

The WWW is many things. But it is utterly positively NOT anonymous. Unless you disable so much stuff that most modern websites simply refuse to function at all.

Sleep well. A hundred Big Brothers are tracking your every move. :wink:

It’s possible that they are using it for a machine learning AI that would look at images from camera traps to count lions by gender for wildlife management.

It’s also possible that they are using your responses to create more CAPTCHA tests.

I suspect that AI, if nothing else, overthinks those reCAPTCHA images less than I do. What counts as a streetlight? Do I click the square if only a tiny sliver of streetlight intrudes? Do I count the pole? What about the wire hanging down between two of the lights? Auuuuuuughhhh!

So many of the reCAPTCHA tests asked the user to identify things like streetlights, crosswalks, buses, etc., that it was suggested that Google was using us to help train its AI for Waymo (its self-driving car initiative), though I never heard a confirmation or debunking of this theory.

I do wonder if for some of those “maybe” images, it’s not important if you get it right or not, but how long it takes you to make a choice.

Because Google was ‘detecting unusual traffic from your computer network,’ I just had to click the captcha box from my phone. No mouse movement there.

I hadn’t thought of anything like that but good idea. It just seemed oddly specific compared to finding all the traffic lights or interpreting a fuzzy house address like it usually is.

When they do do the images, no image is ever sent to just one person. And even a highly simplistic (i.e., not Google) AI would know enough to put little or no decision weight on an image that has a lot of disagreement from the users. In other words, if it’s hard to tell if the image contains a traffic light or not, then it’ll probably accept your answer regardless of whether you do or don’t click that one.
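Here is a minimal sketch of that "ignore the contested tiles" idea, assuming each tile comes with a list of earlier users' yes/no votes. The names `tile_weight` and `grade` and the 0.6 cutoff are hypothetical choices for the example, not anything Google has documented.

```python
from typing import Dict, List


def tile_weight(votes: List[bool]) -> float:
    """0.0 for an even split, 1.0 for a unanimous vote."""
    if not votes:
        return 0.0
    yes_fraction = sum(votes) / len(votes)
    return abs(2 * yes_fraction - 1)


def grade(answer: Dict[str, bool],
          history: Dict[str, List[bool]],
          min_weight: float = 0.6) -> bool:
    """Pass the user if they match the majority on every tile
    where earlier users mostly agreed; skip contested tiles."""
    for tile, votes in history.items():
        if tile_weight(votes) < min_weight:
            continue  # contested tile: either answer is accepted
        majority_says_yes = sum(votes) * 2 > len(votes)
        if answer.get(tile, False) != majority_says_yes:
            return False
    return True


history = {
    "a": [True] * 10,                 # everyone saw a traffic light
    "b": [False] * 10,                # nobody did
    "c": [True] * 5 + [False] * 5,    # the tiny-sliver tile
}
print(grade({"a": True, "b": False, "c": True}, history))   # True
print(grade({"a": True, "b": False, "c": False}, history))  # True
print(grade({"a": False, "b": False, "c": True}, history))  # False
```

Tile "c" is the one where half the users clicked and half didn't, so clicking it or not makes no difference to the result.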

I found a “Not a Robot” thing today on the Walmart site.

I had to click and hold on a button until it completely filled from left to right. Like a “Loading” progress bar. I’d never run across that before.

The attackers don’t know the algorithm. Most of the processing happens server-side. And the part that runs client-side runs inside an obfuscated virtual machine (itself implemented in JavaScript).

Given enough time, attackers can still figure this stuff out, but it takes time, and in the meantime Google can tweak their algorithm further. It’s an arms race, as always.

It’s entirely possible that Google themselves don’t know if the algorithm uses the mouse movements. It’s part of the dataset that gets fed in, but it’s likely that the algorithm is itself trained with deep learning. Who knows exactly what it’s taking into account?