01-27-2018, 12:31 PM
SamuelA
Join Date: Feb 2017
Posts: 3,903
Originally Posted by Tripler
Pick any one. Go from there. . .

The machines you describe are dependent on the data fed to them. For example, if the Soviets/Russians decide to feed bad data into the machine, then you'll have bad output.

I have no idea what you are talking about in context to the current conversation. You've leapt from plumbing to neural analysis of potato chip bags without any reason. Responding to your one idea though, a human must program that machine to '"know" a bag of chips crumples.' A human, with all of his/her emotional imperfections, will program that machine. And you, as a computer programmer, must appreciate that.

Can we at least agree to hate the Soviets?
Ok, so let's say the classifier says the environment is state S0. That's what the classifier thinks is true "right now". S0 is just a tuple of several matrices, some for position, some for geometry, some for color, some for velocity, etc.
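To make that concrete, here's a rough sketch of what such a state tuple could look like. The field layout (one row per object, three numbers per row) and the names State and s0 are just my assumptions for illustration, and plain nested lists stand in for real matrices.

```python
from typing import List, NamedTuple

Matrix = List[List[float]]  # stand-in for a real matrix type


class State(NamedTuple):
    """One snapshot of the environment; the field layout is assumed."""
    position: Matrix  # one row per object: x, y, z
    geometry: Matrix  # one row per object: width, height, depth
    color: Matrix     # one row per object: r, g, b
    velocity: Matrix  # one row per object: vx, vy, vz


# A single stationary red ball as the classifier's current belief.
s0 = State(
    position=[[0.5, 0.2, 0.0]],
    geometry=[[0.1, 0.1, 0.1]],
    color=[[1.0, 0.0, 0.0]],
    velocity=[[0.0, 0.0, 0.0]],
)
```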

The simulator/predictor is a neural network, such that Predictor_Convolve(S0) = the predicted state at time t0 + dt, where t0 is the moment S0 was observed. That is, it's making the prediction that after a small amount of time dt, there will be a new state.

You can obviously keep re-running the predictor and the predicted states are going to become increasingly uncertain for moving objects and stay pretty firm for stationary objects.
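A toy illustration of why that happens, under an assumption I'm making up for the example: each predicted step adds independent position noise proportional to how uncertain the object's velocity is. The function name and the numbers are hypothetical.

```python
def rollout_variance(pos_var, vel_var, dt, steps):
    """Track position variance over repeated predictions.

    Assumes each step adds independent noise of vel_var * dt**2,
    a deliberately crude model chosen only for illustration.
    """
    history = []
    for _ in range(steps):
        pos_var += vel_var * dt * dt
        history.append(pos_var)
    return history


# A moving object has uncertain velocity; a stationary one does not.
moving = rollout_variance(pos_var=0.01, vel_var=4.0, dt=0.1, steps=5)
stationary = rollout_variance(pos_var=0.01, vel_var=0.0, dt=0.1, steps=5)
```

The moving object's variance grows with every re-run of the predictor, while the stationary object's stays where it started.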

The key trick is that after dt actually passes, you feed what the environment actually did back to the predictor. And you adjust its matrix of numbers in a way that will cause it to give more accurate predictions next time.
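Here's a minimal sketch of that feedback loop. To keep it short I've swapped the neural network for a tiny linear model trained with plain gradient steps, so treat it as an illustration of the idea and not the real thing. The state here is just [position, velocity], and environment_step is a made-up stand-in for the real world.

```python
import random


def predict(weights, state):
    """Linear stand-in for the predictor: next_state = weights * state."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def update(weights, state, observed, lr=0.1):
    """Nudge the weights toward what actually happened; return squared error."""
    predicted = predict(weights, state)
    errors = [p - o for p, o in zip(predicted, observed)]
    for i, err in enumerate(errors):
        for j, x in enumerate(state):
            weights[i][j] -= lr * err * x  # gradient step on squared error
    return sum(e * e for e in errors)


def environment_step(state, dt=0.1):
    """Hypothetical ground truth: position advances by velocity * dt."""
    pos, vel = state
    return [pos + vel * dt, vel]


random.seed(0)
weights = [[0.0, 0.0], [0.0, 0.0]]  # the predictor starts out knowing nothing
losses = []
for _ in range(2000):
    s = [random.uniform(-1, 1), random.uniform(-1, 1)]
    losses.append(update(weights, s, environment_step(s)))
```

After enough rounds of feedback the weights settle near [[1, 0.1], [0, 1]], which is exactly the rule the toy environment follows.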

Then the other key component of this system is a planner. This is a system that guesses possible paths that might accomplish your goal. So if it's "shove the red ball to the left touching nothing else", the "goal" is just a matrix of numbers that contains a shift to the red ball position. The planner will come up with possible guesses as to sequences of robotic arm motions that might accomplish what you want.

The planner's guesses get optimized by comparing them to what the predictor thinks will happen.

And then the system picks the best path and does it. It uses the results from that path to update the planner.
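A bare-bones version of the guess-and-check part might look like this. It's a random-search planner in one dimension, which is my simplification: the predictor is a hand-written stand-in, the ball only moves left or right, and the step where the planner learns from results is left out entirely.

```python
import random


def predictor(position, action):
    """Hypothetical stand-in for the learned predictor: a push shifts the ball."""
    return position + action


def rollout(position, actions):
    """Ask the predictor where the ball ends up after a sequence of pushes."""
    for a in actions:
        position = predictor(position, a)
    return position


def plan(start, goal, rng, n_candidates=500, horizon=5):
    """Guess many push sequences and keep the one predicted to land closest."""
    best_actions, best_cost = None, float("inf")
    for _ in range(n_candidates):
        actions = [rng.uniform(-0.2, 0.2) for _ in range(horizon)]
        cost = abs(rollout(start, actions) - goal)
        if cost < best_cost:
            best_actions, best_cost = actions, cost
    return best_actions, best_cost


rng = random.Random(0)
# Goal: shove the ball 0.3 units to the left of where it started.
actions, cost = plan(start=0.0, goal=-0.3, rng=rng)
```

A real planner would propose smarter guesses over time instead of sampling blindly, which is where the update from actual results comes in.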

Given enough data, the planner has "machine intuition".

This is where this starts to really work. These algorithms need not be even a tiny fraction as good as human brains. But if you can give them the collective experience of a million separate robots working for 1 year, that's a million years of experience. Or maybe 1,000 real robots and 999,000 simulated robots. Either way, this vast pool of data will mean that the predictor has truly "seen everything". The planner has tried many, many strategies and knows, for a given configuration, what types of things are actually going to work.

This is why you get superhuman performance. Your machine has far more experience doing what it does than any human alive. Also, it always does its best. At all times, it's faithfully working out the optimal answer from the data it has. It never gets tired or angry or bored.

You can see how this type of algorithm slowly gains on humans. You could build one that knows how to fight jets in a dogfight. It has millions of years of experience in aircraft simulators and a smaller amount of real flight time. So it's always going to be calculating the path that optimizes its chance of victory, but doing so using expected value calculated from the sum of all the outcomes that typically happen in a given scenario.
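As a toy version of that expected value calculation: tally how each option has turned out in past runs of the same scenario and pick the one with the best average. The maneuver names and the win/loss records below are invented purely for illustration.

```python
def win_rate(outcomes):
    """Expected value of a maneuver: average of past results (1 = win, 0 = loss)."""
    return sum(outcomes) / len(outcomes)


def best_maneuver(history):
    """Pick the maneuver with the highest observed win rate."""
    return max(history, key=lambda name: win_rate(history[name]))


# Made-up outcome records for one scenario, from simulated and real flights.
history = {
    "climb": [1, 0, 1, 1],
    "dive": [0, 0, 1, 0],
    "turn": [1, 1, 0, 0],
}

choice = best_maneuver(history)
```

With a handful of records this is just a guess, but the same averaging over millions of recorded outcomes is what the "machine intuition" above amounts to.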

Last edited by SamuelA; 01-27-2018 at 12:35 PM.