Google has a new game for recruiting unpaid labor to train its AIs, and it’s actually pretty fun. You’re given a noun, and then you have 20 seconds to draw a sketch of it, while Google’s AI (which presumably hasn’t seen the word) tries to figure out what you’re drawing. After six drawings, it shows you your score, lets you see what the AI’s top matches were, and shows what other people drew for the same word.
I suspect that the dictionary is fairly limited-- I’ve gotten a number of repeats, and there’s no way it got something as abstract as “animal migration” unless that was already on some sort of shortlist (I only got as far as two double-curve birds-in-flight before it guessed). But they’re probably constantly expanding it, and it’s off to a good start so far.
Doesn’t seem to work for me. 0 out of 6. I’m not gonna claim I’m a great artist, and drawing with the mouse is kinda awkward, but I don’t think my drawings were all that terrible compared to the samples:
I got 5 out of 6, including some ridiculously fast (like “rake”, “stethoscope” and “blueberry”). The one it didn’t get was “binoculars”, which it thought looked more like a kangaroo…
EDIT: wow, there are a lot of very weird-looking drawings for “panda”.
It did pretty well for me: six out of six for three sets, except it was convinced my tennis racquet was a guitar.
I wonder if part of the secret is that you generally quit drawing when it guesses right, but keep adding details when it’s wrong, like how my drawing went from ‘cat’ to ‘lion’ to ‘tiger’ (correct) once I added a single stripe on its body.
@Dr.Strangelove, that looks like a perfectly respectable snail, to me. Maybe that’s a new word, and it’s still learning about it?
One that puzzled me: I got “Hammer”, and drew a crude but passable claw hammer. But it didn’t recognize it, and I tried to think of what I could add to make it more hammer-ish. Finally I put a nail in its path, and that was enough for it… but none of the other hammer pictures it showed me included a nail.
I also got one for “Compass”, and wasn’t sure which kind it wanted, so I drew a circle-scriber. But it turns out that all of the compasses it recognizes are direction-finders.
I’m pretty sure that theory is right. For example, why did it guess ‘sea turtle’ when I drew one, rather than simply ‘turtle’? I’d done nothing to indicate the sea.
I think I have a theory as to what’s going on. Needs more testing. I tried again on my home system, and again got 0/6. But just now I’m at work, having had to do some maintenance, and tried it again–easily 6/6, with most of the guesses coming before I was done.
At home, I tried two different browsers, so I know it’s not that. There’s one significant difference: my home system has a 144 Hz display instead of 60 Hz (which is what my work display uses, and the most common refresh rate).
Why is that relevant (maybe)? Because I know it’s working from vector data. One of the preview-size drawings rendered some lines I had drawn that didn’t show up in the original drawing, so it must be recreating the image from the cursor data instead of simply saving the bitmap.
I believe it’s taking the speed of drawing into account, as well as the ordering of the lines. Similar-looking shapes may be drawn differently, depending on the dominant shape present, and it’s probably picking up on that. So why would 144 Hz make a difference? Well, it would be sampling the cursor position at 2.4 times the rate. That’s going to make me look like a speed demon in comparison to its samples, and probably throw off its calculations.
Actually it’ll make me look like a slowpoke. In fact it may not even have all the relevant information. If it tags each point as a 1/60 s “tick”, then given the 20 s limit it should only look at the first 1200 ticks. But I’ll have 2880 ticks. Could be it just completely ignores the ones over the limit, leaving me with an incomplete drawing.
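If that’s really the problem, the fix on their end would just be to resample strokes by time rather than by tick. A minimal sketch of the idea (purely hypothetical; I have no idea what format Google actually captures the cursor data in):

```python
# Hypothetical stroke resampling: each stroke is a list of (x, y) points,
# one per display refresh. A model assuming 60 samples/second will see a
# 144 Hz stroke as 2.4x longer than it really was -- a 20 s drawing becomes
# 2880 points instead of 1200, so truncating at 1200 "ticks" drops over half of it.

def resample_to_60hz(points, source_hz):
    """Downsample a list of (x, y) points captured at source_hz to 60 Hz."""
    step = source_hz / 60.0
    resampled = []
    i = 0.0
    while int(i) < len(points):
        resampled.append(points[int(i)])
        i += step
    return resampled

# A 20-second stroke captured at 144 Hz has 2880 points...
stroke_144hz = [(t, t) for t in range(2880)]
# ...but after resampling it's back to the 1200 points a 60 Hz capture would give.
print(len(resample_to_60hz(stroke_144hz, 144)))  # 1200
```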
Interesting take on the classic Pictionary game; thanks for sharing. On my first try I got 3 out of 6. The neural net thought my lobster more resembled a dragon, but somehow my “mailbox” drawing, which in retrospect looked exactly like a wooden mallet, was accepted.
Got 5 out of 6 on the second try, but again some questionable decisions were made by the neural net - I felt I made a pretty damn good drawing of a swan which got rejected, while it accepted a picture of a Styrofoam cup as a “bucket”.
My final thought is that for the more abstract/difficult words, 20 seconds is too short a time limit.
I played this a bunch, years ago. I found it very interesting that while I might have been training it in some ways, it was also training me. I learned how to quickly get it to recognize my drawings if they were previously unsuccessful: by changing my drawings. That doesn’t do much to train the AI to recognize my sort of drawing.
Like the compass thing. If you’re training it, you keep drawing a scribing compass. If you’re being trained, you switch to a direction compass.
Another thing I wonder about: It’s pretty quick to get “mug”, but its next closest matches are always “coffee cup” and “cup”. Well, that’s not surprising, to a human… but does the AI “know” that those three concepts are all basically the same thing?
Also, how many wrong guesses is it allowed to make? If it’s unlimited, then the simple strategy for the AI would be to just guess every word in its dictionary. I’m sure Google would have thought of this and prevented it, but on the other hand, it IS allowed to make some number of wrong guesses before the right one.
I expect that the NN is continuously mapping your scribble into a high-dimensional embedding space where the point associated with the mean scribble for each possible guess is known already. The system just continuously guesses the closest “guess” point to the current “scribble” point.
The fact that “mug” and “coffee cup” are similar is encoded in the fact that the points in the embedding space associated with those guesses are close together, so if a scribble embeds close to one, it naturally is also close to the other.
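Something like this, in toy form (the vectors here are made up for illustration; the real embedding would come out of the network itself, not be hand-written):

```python
import numpy as np

# Toy nearest-neighbor guessing in a tiny 3-D "embedding space".
# Guess points that represent similar concepts sit close together.
guess_embeddings = {
    "mug":        np.array([0.90, 0.10, 0.00]),
    "coffee cup": np.array([0.85, 0.15, 0.05]),  # deliberately close to "mug"
    "cup":        np.array([0.80, 0.20, 0.00]),
    "kangaroo":   np.array([0.00, 0.10, 0.95]),
}

def rank_guesses(scribble_embedding):
    """Return guesses ordered by distance from the current scribble's embedding."""
    return sorted(guess_embeddings,
                  key=lambda g: np.linalg.norm(guess_embeddings[g] - scribble_embedding))

# A scribble that lands near the "mug" point is automatically also near
# "coffee cup" and "cup", which is why they show up together as top matches.
print(rank_guesses(np.array([0.88, 0.12, 0.02])))
# ['mug', 'coffee cup', 'cup', 'kangaroo']
```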
I’m doing the opposite: trying to “elude” the machine intelligence, stumping the computer with a drawing that a human would instantly recognize as the object in question.
I’ve found two techniques. One has an obvious explanation. The other, less so.
The first technique is to draw the object in three dimensions. This works because most people don’t do that, so the AI has no comparison point. A good example is “camera” — if you draw it from a fairly oblique angle, with the protruding lens pointed away, you can produce something a person would easily recognize, but that the computer can’t process, because nobody else drew it that way.
The other technique is more mysterious: draw the object piecemeal, in an unusual order. Example — “fish.” I drew the eye, then I went over and drew the tailfin, then I put the mouth/lip shape in front of the eye, then I added the gill line, then the dorsal fin, then the pectoral fin, and then, finally, I added the oval of the body to unify the disconnected bits. The final drawing was clearly a classic fish sketch, and was very similar to the comparative drawings displayed afterward, but for some reason the computer couldn’t figure out what I was doing. This may suggest something interesting about the logic being applied.
The model being used is probably a sequence classification model rather than a traditional computer vision model. So the input is a representation of the way you moved your mouse rather than the overall view of the image. So by moving your mouse in a “weird” way, even if the final result is the same, you screw up the recognition. Kind of like writing a sentence with all of the right words, but in the wrong order.
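If it is a sequence model, it might look roughly like this toy sketch (hypothetical; I don’t know Google’s actual architecture, and the numbers are placeholders):

```python
import torch
import torch.nn as nn

# Hypothetical stroke-sequence classifier: the input is the sequence of pen
# movements -- e.g. (dx, dy, pen_lifted) per step -- so the ORDER you draw
# things in matters, unlike a bitmap classifier that only sees the final image.
class SketchClassifier(nn.Module):
    def __init__(self, num_classes, hidden_size=128):
        super().__init__()
        self.rnn = nn.LSTM(input_size=3, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, strokes):              # strokes: (batch, steps, 3)
        _, (h_n, _) = self.rnn(strokes)      # h_n: final hidden state per layer
        return self.head(h_n[-1])            # logits over the guess dictionary

# Drawing the same fish in a weird order produces a very different input
# sequence, even though the finished picture would be identical:
model = SketchClassifier(num_classes=345)    # dictionary size is a placeholder
normal_order = torch.randn(1, 200, 3)
weird_order = normal_order[:, torch.randperm(200), :]
print(model(normal_order).argmax(), model(weird_order).argmax())  # can differ
```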
I learned I can’t draw. I must have drawn 30 or 40 objects and it didn’t guess one. It didn’t even guess “line”. Then, ridiculously, it asked me to draw “yoga” and I closed the tab.