I'm missing something about AI training

The most trivial example of what?

Nothing about Penny Lane changed. It’s the same exact song - the same exact set of sounds. Our society decided that the rules about who can and cannot play those sounds (for commercial purposes) have changed, but nothing about the song itself has changed, so I’m unsure why you would expect to find a difference with the song itself.

They’re all different renditions of Penny Lane.

Are you familiar with a concept called context? In the post I was responding to, they were talking about Penny Lane.mp3. Obviously, an mp3 file is going to contain a recording of just one rendition of the song; there may very well be other commercially available renditions of the song.

If I say that “this is a table”, are you going to sputter about how many other tables exist in the world, so how could I possibly say that?

Sometimes, a word can mean multiple things! Gasp!

Yes, you can use the term “Penny Lane” to refer to the lyrics, as in the case where you sing Penny Lane at the pub. Or you could use the term to refer to the Guitar Hero level (just kidding, Penny Lane isn’t featured in Guitar Hero!). What a profound observation!

Correct, there are lots of other things you could also call Penny Lane. I was responding to a guy talking about a file named PennyLane.mp3, so that’s the context I was speaking in. In other contexts, other things are also Penny Lane, including a physical street in Liverpool.

If we were totally ignorant of all of that history because all references to the Beatles were wiped out in the Great War of 2076, and all we had was a scratchy record with Penny Lane on it, we could still identify it as art.

No but it’s a pretty good lossy encoding. This image…
[Imgur image]

Is absolutely a copy of this image.
[Imgur image]
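To make “lossy but still a copy” concrete, here’s a toy codec over made-up pixel values (a sketch only, nothing to do with any real image format):

```python
# Toy lossy "codec": squash 8-bit pixel values (0-255) down to 16
# levels and reconstruct. Detail is lost, yet the decoded version is
# still unmistakably derived from the original -- a copy in the sense
# used above. The pixel values are invented for illustration.
original = [12, 13, 200, 201, 90, 91, 255, 0]

encoded = [p // 16 for p in original]       # lossy step: 256 -> 16 levels
decoded = [q * 16 + 8 for q in encoded]     # decode to bucket midpoints

assert decoded != original                  # information was lost...
assert all(abs(a - b) <= 8 for a, b in zip(original, decoded))  # ...but it stays close
```

The round trip can never recover the original exactly, but every decoded value lands within half a bucket of its source, which is why the result is still recognizable.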

Are you familiar with context? This is the post you were responding to:

They introduced two different things, PennyLane.mp3 and the song Penny Lane by the Beatles. You said the second was frequency and amplitude.

Seriously - that shark has 4 eyes and gives serious “muppet” vibes.

We must have very different definitions of the word “copy”.

It’s clear that a computer program has taken the image of the Jaws poster, processed it, and spat out a “new” image.

If my autotuned version of Penny Lane is a derivative work, then that absolutely is.

I’m not sure why legalities are useful in this discussion.

If you seriously don’t think so, I’d suggest you try selling that image as your own on an art print website and see how long it takes to get taken down for infringing the Jaws poster’s copyright.

I only wish Umberto Eco could have been around to debate all of this.

You mean a deterministic encoding method like image compression and a deterministic AI model.

AI models are just computer programs, and they are 100% deterministic. There are no AI fairies at work here. If you run your AI program with all the same inputs (including training data, parameters, and random seeds), you will get exactly the same 1s and 0s spat out.
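A toy illustration of that point (a stand-in invented for this post, not any real model): once the seed is fixed, the pseudo-random parts repeat exactly, so every run is bit-for-bit identical.

```python
import random

# Hypothetical stand-in for a generative model: its "weights" come
# from a pseudo-random generator, so fixing the seed makes the whole
# pipeline deterministic end to end.
def tiny_model(prompt: str, seed: int) -> list[float]:
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(8)]
    return [w * len(prompt) for w in weights]

run1 = tiny_model("Penny Lane", seed=42)
run2 = tiny_model("Penny Lane", seed=42)
assert run1 == run2  # same inputs + same seed -> identical output
```

Change the seed (or any other input) and the output changes; hold everything fixed and it cannot.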

Right. As in, the frequency and amplitude contained in PennyLane.mp3 IS the song Penny Lane.

That doesn’t mean that nothing else on the planet is also Penny Lane.

I’ll ask you a question. If PennyLane.mp3 doesn’t contain the song Penny Lane, what does it contain?

I’ll ask you the same question: if the MP3 contains the song Penny Lane, why doesn’t an AI model that is able to recreate the song Penny Lane at least as accurately as a low bit-rate MP3 also contain it?

Of course it’s derivative. It’s not derivative because it came out of a generative AI; it’s derivative because Darren_Garrison literally prompted the AI to create a parody work based on an existing property. It’s derivative for the same reason that my drawing of the Mona Lisa on a Vespa is derivative, and that reason isn’t that I have all the 0s and 1s stored in my head.

Are you under the impression that directly reproducing an image is the only way to violate someone’s intellectual property rights?

If I draw the Jaws poster from memory, it’s all good?

No one is denying that the image is very similar to Jaws’ poster, to the point where you might have issues if you sold it commercially (although the Jaws image is so well parodied that it could very well fall under fair use?).

So, to bring it back to your original statement, you said:

That’s patently false, as shown by the fact that the AI was able to recreate Jaws, C-3PO, and R2-D2 as they appear in the training images. Those training images are absolutely stored in the AI model. It’s a lossy encoding, but it’s an encoding nonetheless.

Those images are absolutely stored in the AI model. It is giving you back the training images it was provided as input, with some incredibly clever processing, but that’s what’s happening. To say otherwise would be to advocate for the presence of magical AI fairies.

I can sit down and draw a recognizable Mona Lisa. Not perfect, possibly not even “good” but people would recognize that it’s the Mona Lisa I’m copying.

Is this because I have MonaLisa.png stored in my brain for retrieval or because of magical brain fairies? Are those the only two options?

No more than they are “stored” in my brain. What is stored are rules and associations for all kinds of concepts, and how to generate images that match them.

I usually screencap examples of prompts and their outputs for future reference. Here are the ones I could find of prompts involving Jaws.

Clearly, that AI is just copying the Jaws poster over and over.

By the way, is his name really “Jaws the Shark”? I thought it was Bruce, like the shark from Finding Nemo.

There is confusion about the signifier versus the signified when we are talking about a song.

Grooves on vinyl or binary data in a file are signifiers (or expressions) of the song.

The song itself (the abstract entity) is the signified (or content) that arises through interpretation of the signifier/expression in the proper circumstances.

You need the expression to hear the content, but they are not identical. That’s why I can recognize it as the same song whether it’s on MP3, vinyl, MIDI, etc.

Model parameters are not direct encodings of expressions. They are weights that adjust how the model processes inputs, based on statistical patterns in its training set.

The model could be said to have compressed the signified/content dimension into an abstract high dimensional feature space where items that share conceptual similarities tend to be clustered more closely together. This is, in a sense, the model’s learned sense of meaning.
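A toy sketch of that clustering idea, with made-up feature axes and values (a real model learns thousands of dimensions from data; nothing here comes from an actual model):

```python
import math

# Cosine similarity: standard measure of closeness between two
# vectors in a feature space.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical axes: [shark-ness, movie-poster-ness, 1970s-style-ness]
jaws_poster = [0.9, 0.8, 0.9]
shark_photo = [0.9, 0.1, 0.2]
seventies_poster = [0.0, 0.9, 0.9]

# The Jaws poster sits near both of the other concepts, while a plain
# shark photo and a random 70s poster sit far from each other.
assert cosine(jaws_poster, shark_photo) > cosine(shark_photo, seventies_poster)
assert cosine(jaws_poster, seventies_poster) > cosine(shark_photo, seventies_poster)
```

That is the sense in which items sharing conceptual content cluster together: similarity in the feature space, not stored copies of any expression.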

So when asked to make a poster like the one from Jaws, it tries to recreate something that conveys similar content using learned features, like “shark”, “movie poster”, “1970s style”, etc. It has been trained on the content the original poster conveyed, but it doesn’t contain the poster’s expression.