The information about what the Jaws poster looks like is stored, just like information about what the Statue of Liberty or the Mona Lisa looks like. The actual .png of the poster, or the photos of the Statue of Liberty or Mona Lisa, are not.
Babale is correct that the original image is not stored. Rather the information about what makes up the poster is trained into the AI. Just like I could explain to someone how to draw the poster without needing to display an original copy. You ask it to draw a new copy of the poster but you can’t just reverse engineer the original out like flipping through a photo catalog because it’s not there.
Just as PennyLane.mp3 doesn’t actually store the song Penny Lane by the Beatles, just some information about the frequency and amplitude that allows my MP3 player to approximate what the song sounds like.
Great post, just quoting the bottom line. It’s basically a Turing test, including all its quibbles and muddles. And your point is correct–once humans can no longer reliably distinguish between human sapience and device functionality, we generally won’t make that distinction.
And yet, you’ll find that the file size of the models generated after all that training, which you can put on a brand new computer unconnected to the Internet with no access to any of these images, is far too small to contain even a tiny fraction of the original images.
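To put rough numbers on the file-size point, here is a back-of-the-envelope sketch. Every figure in it is a hypothetical round number chosen for illustration (a 4 GB model file, 2 billion training images, 100 KB per image); none of them comes from this thread or from a measurement of any particular model.

```python
# All three figures are ASSUMED round numbers for illustration only.
MODEL_BYTES = 4 * 10**9          # hypothetical model file: 4 GB
TRAINING_IMAGES = 2 * 10**9      # hypothetical training set: 2 billion images
AVG_IMAGE_BYTES = 100 * 10**3    # hypothetical average image size: 100 KB

# How many bytes of the model file would each training image get
# if the file were split evenly between them?
bytes_per_image = MODEL_BYTES / TRAINING_IMAGES

# How large would the training set be if stored as ordinary image files?
dataset_bytes = TRAINING_IMAGES * AVG_IMAGE_BYTES

# What fraction of the dataset's size is the model file?
model_fraction = MODEL_BYTES / dataset_bytes

print(f"bytes of model per training image: {bytes_per_image}")
print(f"dataset size in TB: {dataset_bytes / 10**12}")
print(f"model size as a fraction of the dataset: {model_fraction}")
```

Under those assumed figures the model has a couple of bytes per training image to work with, which is not enough to hold even one pixel of each. That is an average, though, so by itself it says nothing about how much information a model retains about any one heavily repeated image.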
A good artist might be able to draw an even better C3PO and Jaws posed shark from scratch. That doesn’t mean that they have the images stored in their brain like a JPG; it means that they have enough of an understanding of what makes a robot look like C3PO and what pose makes a shark look like the one from Jaws to create a similar image from scratch.
You are very confident, but also very wrong.
None of the patterns generated in the Game of Life are encoded as 1s and 0s in the file that defined the rules of the Game, and yet, if you run it with any given initial state, you will always get those patterns back again.
Image generators are much more like a very complicated Game of Life than like a library of pictures.
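For anyone who hasn’t played with it, here is a minimal sketch of the Game of Life in Python. The function names (`step`, `run`) and the `GLIDER` constant are just names picked for this example. The rules fit in a few lines and contain no pictures of anything, yet the same starting cells always produce the same pattern.

```python
from collections import Counter

def step(live):
    """Advance one generation. `live` is a set of (x, y) live cells."""
    # Count live neighbours for every cell next to at least one live cell.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next turn if it has exactly 3 live neighbours,
    # or if it is alive now and has exactly 2.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

def run(live, generations):
    """Apply the rules `generations` times to the starting cells."""
    for _ in range(generations):
        live = step(live)
    return live

# The classic five-cell "glider" as a starting state.
GLIDER = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

after_four = run(GLIDER, 4)
# After four generations the same shape reappears one cell over diagonally.
print(after_four == {(x + 1, y + 1) for (x, y) in GLIDER})
```

The glider’s shape is nowhere in `step`; it comes out of the rules plus the five starting cells, and it comes out the same way every time.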
No, not really. More like how my brain doesn’t have a copy of Happybirthday.mp3 in it for someone to extract but I can still sing it or make parody versions or create a song in a similar style.
Penny Lane by the Beatles IS the frequency and amplitude information, just like an image is a pattern of pixels.
An AI Song Generator doesn’t store the frequency and wave information. Instead it has a bunch of metadata about what Beatles songs are like, and what songs that match various keywords that people also use to describe Penny Lane are like, and it can use all of that information to put together something that’s very reminiscent of Penny Lane; but it cannot play back Penny Lane, the way an mp3 can.
No it is not because there are no limitations future AIs are going to face that they do not face today. We can trivially list the ways computer programs can do things humans cannot do:
They can be trivially cloned to produce a perfect copy
They do not get tired
They do not age
They can do the same tasks with absolute consistency
They can be adjusted at will
They can communicate their direct inner workings with each other
Can we deliberately create a computer experience that is none of those things? Sure, but it would be a choice and a weird choice and thus we must look to the creator of that choice as to why they made it. With humans, we don’t have a choice, that’s, depending on your theology, something left up to god, evolution or a cruel quirk of the universe.
It’s arguable that if we ever advance to the state of humans being immortal, trivially cloneable, ESP capable and 100% rational, humans would be unable to create art as well because what is there left to make art about?
Or to take another angle, the only reason “murder” is an ethical violation is because of the unique irreplaceability of human life. If we developed the technology of trivial cloning, the concept of murder might disappear. You might create a clone of yourself to go fetch a beer from the fridge and deliver it to you, then kill that clone once it has delivered it; doing so would have no more moral weight than typing “kill -9 emacs” into your console.
If you had someone in that universe who, as a matter of choice, decided they would create no clones of themselves, you arguably still couldn’t murder them because they were the ones who chose to be a big dumb idiot and let themselves die. It’s only through our inability to clone ourselves that murder becomes a meaningful concept.
The very ability for AIs to trivially copy themselves means we can only ever create a simulacrum of feeling the experience of “murdering” an AI. We know they can copy themselves and we can’t not know that and we can’t make it so some future more advanced version of AI stops being able to do that. All we can do is make one specific AI stop doing that for some pointless reason.
And you could come up with a set of initial conditions that perfectly recreate the Jaws poster in Conway’s Game of Life. And in doing so you would have encoded the information represented by the Jaws poster in those initial conditions (and the rules for the Game of Life). Just as much as if you’d encoded it as a JPG.
If at the end of all the previous explaining, this is still where you start as a prior, then I don’t know how more explaining will help. Take what you just said as a premise and draw it all the way out. You immediately run into obvious contradictions that require Penny Lane to be more than just a set of wiggles on a graph.
These are facts about the current implementations of computers. It is not a logical necessity for these facts to remain true in the future.
To pick at one point: AI is often seeded with noise because that helps make the output less consistent. This is desired behavior because too much consistency is not useful to us. I am not willing to presume what other current properties of computers might be modified in order to improve their implementations of AI.
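As a small illustration of what seeding with noise means in practice, here is a sketch using Python’s standard `random` module. The function name `noisy_output` is made up for this example, and a list of random numbers stands in for whatever a real generator would produce from its noise.

```python
import random

def noisy_output(seed, n=5):
    """Return n pseudo-random numbers drawn from a generator started at `seed`."""
    rng = random.Random(seed)  # independent generator, fully determined by the seed
    return [rng.random() for _ in range(n)]

# Same seed: the "noise" is identical, so the output repeats exactly.
print(noisy_output(42) == noisy_output(42))

# Different seed: different noise, so a different output.
print(noisy_output(42) == noisy_output(43))
```

The variety comes from handing the program a fresh seed on each run. Whether to do that is a choice made by whoever builds the system.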
I wasn’t talking about wiggles on a graph, but I can see how you might be confused by my phrasing.
Ok, let me adjust my initial statement by one word.
Wiggles on a graph represent frequency and amplitude. The wiggles on a graph, or grooves on a record, or the data your computer uses to both draw wiggles on a graph and play music through the speakers, are the information about frequency and amplitude, and they are used to generate the frequency and amplitude through a speaker; but Penny Lane is the actual sound waves, not the information about them (which is merely used to perfectly reproduce Penny Lane at whim, no big deal).
Could you? Maybe if you used a very large canvas with very large starting conditions and defined how many ticks to go for and what subsection of the finished enormous canvas to look at. By that point, you are almost certainly looking at a much larger file than a JPG.
That said, that’s a good example of the difference between a deterministic encoding method, like image compression that would be used to store pictures, and what AI is doing. Because you will never be able to prompt an AI to give you the Jaws poster and get back an exact reproduction.
The AI may have seen so many reproductions and parodies of the Jaws poster that it has a pretty solid idea of what you want; but it will never perfectly reproduce the original, because it cannot, because the original doesn’t exist anywhere in its memory.
But it can. As demonstrated by that image above. Given the right prompts it will give you back a close to exact rendition of Penny Lane, just like that image contains pictures of Jaws, C3PO and R2D2.
That is only possible if those images are encoded in the AI model. That’s just a fact, that’s how information works. There are no magical AI fairies.
I want to perform Penny Lane legally on the radio, so I dutifully send a check to the Michael Jackson estate for the rights to do so, and some lawyer comes back to me and is like, “not so fast, the rights to the Beatles catalogue were sold to Sony in 2016”. I peer inside the “frequency and amplitude” of the song; nowhere in there do I find the word Sony.
Something that was legal for me to do in 2015 is now, via the same set of actions, illegal for me to do in 2017 despite every single iota of the frequency and amplitude remaining the same between those two years. That’s only the most trivial example.
Think for even a second more: which frequency and amplitude is Penny Lane? The studio album? The live performance in London? The live performance in New York? If I sing Penny Lane at my local pub, not a single frequency or amplitude is shared between that and your hypothetical example. There are thousands of different frequencies and amplitudes that are indisputably Penny Lane, a countably infinite number that are indisputably not, and a few million that are in some grey area in between. Can you derive a mathematical equation to separate out these things?
And that’s not even mentioning the larger point that what Penny Lane is includes the history behind it, the real-world impact it had, the artistic intention, etc., which I’ll even grant you as something we have a disagreement on.
I’m simply commenting on the fact that your proposed model of the world does not hold up to even a minute’s scrutiny and if you can’t see that, then we’re at very different places in the conversation.
Yes you could. That is proven. Conway’s Game of Life is Turing complete, so anything a regular computer can do, Conway’s Game of Life can do.
Setting up a set of initial conditions to recreate the Jaws poster is no different conceptually to writing a computer program to spit out a JPG of the Jaws poster.