05-18-2016, 04:31 PM
JR Brown
Originally Posted by Mangetout:
Actually, it's a bit weird - I think there must be some essential detail missing from the write-up, because it sounds like they:

1. Built and trained a system for correlating brain scan data to specific movie clips from a small library, allowing them to:
2. Determine which clip is being watched, and having done so:
3. Construct a shitty interpolated computer rendering of the known clip.

Unless I'm missing something, the visually striking output from step 3 is not actually a rendering of anything inside the head, just a re-rendering of a known piece of footage. It can't be that lame can it?
Close but not quite.
1. They recorded brain activity from volunteers while each one watched thousands of short video clips.
2. Then, for a given person, they took the brain activity pattern corresponding to a tiny segment of a selected clip and found the 100 clip-segments that induced the most similar activity pattern in that person.
3. Then, for each segment of the original clip, they averaged its 100 "most similar" clip-segments and strung the averages together to make the "video reconstruction" of the starting clip.

In other words, they are finding videos that induce a similar response in a specific individual, and then trying to pick out what visual features the similarities correspond to by blurring those videos together.
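
To make the three steps concrete, here is a rough sketch in Python with NumPy. It is not the researchers' actual code: the function names, the array layout (one row of activity per clip-segment, one image per clip-segment), and the use of correlation as the "similarity" measure are all my own assumptions for illustration.

```python
import numpy as np

def reconstruct_segment(target_activity, library_activity, library_frames, k=100):
    """Average the k library frames whose evoked activity best matches the target.

    target_activity:  (n_voxels,) activity for one segment of the clip to reconstruct
    library_activity: (n_segments, n_voxels) activity for every known clip-segment
    library_frames:   (n_segments, height, width) one image per known clip-segment
    All names and shapes here are hypothetical.
    """
    # Similarity measure is an assumption: Pearson correlation between patterns.
    t = target_activity - target_activity.mean()
    lib = library_activity - library_activity.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(lib, axis=1) * np.linalg.norm(t) + 1e-12
    similarity = (lib @ t) / denom

    # Step 2: indices of the k most similar clip-segments for this person.
    top = np.argsort(similarity)[-k:]

    # Step 3: blur them together by averaging pixel values.
    return library_frames[top].mean(axis=0), top

def reconstruct_clip(clip_activity, library_activity, library_frames, k=100):
    """String the per-segment averages together into a 'video reconstruction'."""
    return np.stack([
        reconstruct_segment(a, library_activity, library_frames, k)[0]
        for a in clip_activity
    ])
```

The averaging is why the output looks like a smeary blob: anything the 100 matches have in common (a person-shaped region, a band of text) survives, and everything else washes out.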

From comparing the reconstructions to the originals, you can see that certain things can be picked out pretty reliably once you know an individual's activity patterns. In particular, the responses to images of people are reliably similar to those for other images of people in similar poses, and text brings up other text in similar positions. People and text are probably the two most readily identifiable things: there are specific brain regions largely dedicated to face recognition, and text activates language areas, so once you know the individual person's activity map those signals are fairly distinctive.

Beyond that, simple patterns with strong contrast (the horizontal lines and central blobs) bring up other similar patterns, and so on. So if we could record from someone without the giant machine, we could probably tell whether they are looking at a person, at text, or at something with a strong horizontal/vertical/diagonal/center-surround structure. But so far we are nowhere near being able to tell exactly what the person is looking at, and of course the algorithm needs to be trained on a large amount of data where the activity is matched to known visual input.

Last edited by JR Brown; 05-18-2016 at 04:36 PM.