We do know how “AI” works. You often hear terms like “black box” applied to LLMs, CNNs, RNNs, and the like, but that’s only because we don’t know the individual weights/biases/parameters assigned to the various inputs/tokens; those values are calculated inside the model as it consumes the training data. We absolutely understand how the models work, since we (humans) wrote the code they’re executing. We could, in theory, output the values of all of those parameters at every step and every loop, but in larger models that would be next to useless due to the vast amount of data it would spit out. And we just don’t really care enough to try. On a simple perceptron (same principle, tiny scale), we do exactly that sort of thing.
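For the curious, this is what “output every value at every step” looks like at perceptron scale. A minimal sketch in plain Python (the AND truth table, learning rate, and epoch count are just illustrative choices, not anything from a real model):

```python
# A perceptron so small we can watch every weight update,
# the kind of full inspection that's hopeless at billion-parameter scale.
inputs  = [(0, 0), (0, 1), (1, 0), (1, 1)]   # truth table inputs for AND
targets = [0, 0, 0, 1]                        # expected outputs
w1, w2, bias, lr = 0.0, 0.0, 0.0, 0.1         # weights, bias, learning rate

for epoch in range(10):
    for (x1, x2), target in zip(inputs, targets):
        out = 1 if (w1 * x1 + w2 * x2 + bias) > 0 else 0   # step activation
        err = target - out
        w1   += lr * err * x1                 # classic perceptron update rule
        w2   += lr * err * x2
        bias += lr * err
        print(f"epoch {epoch}: w1={w1:.1f} w2={w2:.1f} bias={bias:.1f}")
```

Every line of that printout is readable. Multiply the parameter count by a few billion and the same printout tells you nothing.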
This is where “Death of the author” comes in. It depends entirely on how you engage with the piece. If you want to just reprint them on some cushions and sell them because they “tie the room together,” and you don’t care about who made them or why, just that they look pretty, then yeah, you’re engaging with them as content.
If you’re using it as a lens for trying to understand what the artists wanted to express, and why they made the particular choices they did to convey their meaning, then yeah, you’re engaging with it at the level of art. You’re using something someone made primarily as a way to understand them better as a person.
“Death of the author” claims that, although authorial intent is important, the author isn’t actually necessary for establishing it. The reader is also able to see authorial intent that goes unnoticed by, or is even flat-out contradicted by, the author, and this should be treated as equally valid. This doesn’t mean you can just make up any interpretation and have it be as correct as any other; people have fun making joking “interpretations” of popular fiction as the opposite of what the author intended, and they intentionally abuse the analysis process to get to those conclusions (that’s part of the joke).
But I can look at the ancient Lascaux paintings and wonder what their makers’ social structures were, what they were trying to communicate to each other, and why they made the choices they did to convey it. I can listen in, as imperfectly as we can across the vast chasm of age, to a conversation early humans were trying to have with each other, one that was among the first steps of extending beyond the literal.
This may be an example of the Overvaluing Of Ideas.
There’s a cliche (that has been discussed on the SDMB before) where someone says to an author: “Hey, I’ve got a great idea for a book. I’ll tell you the idea, you write the book, and we’ll split the profits!”
Which is ridiculous, because having an idea (for a book, or an app, or an invention, or a song, or an image), even a good one, is the easy part. It’s implementing the idea that takes skill and effort and creativity and judgment. And it’s in how you implement that idea that most of the artistic merit and interest (if there is any) comes in.
But this is only true because it currently takes skill and talent to translate those ideas into an image. If someone has the means of conjuring a competent image into being with a thought, then the value of being able to physically produce the image goes down dramatically.
Well, first off, I intentionally chose a pretty trite idea because there is little skill involved in dumping some bullets into a vase and taking a photo of it. At the same time, the result would be something that people would at least recognize as trying to be “art” and making a statement. If I spend five minutes dumping bullets into a vase and take a photo or spend five minutes prompting the image, print the results and hang them up at the Freshman Art Exhibition, they’ll probably both elicit the same responses.
Although we appreciate labor and talent, that can’t be what makes something art. Otherwise your banana-on-the-wall or “red circle on blank canvas” type works couldn’t be art since they took minimal labor and skill to execute. Their value lies in their statement and vision. When people DO scoff at those works and say “My kid coulda done this!” we view them as not understanding art because it’s not about your ability to paint a circle, it’s in WHY you painted a circle. But if an AI rendered that circle at my direction and under my intent instead of me painting a circle, what difference does it make?
I’m not even really arguing that AI art is ART. I dunno. It sort of doesn’t matter too much to me. But I find it hard to come up with a solid reason why the products of AI, under the directions of a person with intent, couldn’t be art.
So, to put this (yet) another way: I (legally) download a few MP3s of my favourite artist. Then I use my fancy audio mixing software, with all its high-tech bells and whistles like autotune and what have you, to mix those tracks into a new track. Have I just copied those tracks to make a derivative work?
No, of course not, my fancy mixing software was simply inspired by those original tracks to create a new piece of art, just like all the artists over the years who have been inspired by listening to the Beatles to produce songs that sound a bit like the Beatles.
If you disagree with this: why does the collection of matrix-multiply operations that makes up the AI model get to have the human attribute of “inspiration,” but my fancy mixing software does not? What is the technical difference between those two collections of machine-language instructions that means one is just copying but the other is creating new, unique creative work?
Well, why do humans get to have the human attribute of “inspiration”?
Because the courts of the world have decided they do. That’s a pretty uncontroversial point of law.
You’ve made a derivative work. Whether or not you “just” did that is up to interpretation since it’s possible for a lot of skill and talent to go into creating a derivative work.
But that’s not what an AI is doing anyway. If I ask a music AI to create a bluegrass-inspired track, it’s not just taking a couple of tracks and mixing them. It’s using its training on umpteen bluegrass tracks to find commonalities, determine what makes a track “bluegrass,” and build a new track from there. Which is why some people compare it to human learning – you listen to a bunch of music, learn what makes it “that” type of music, and learn how to make your own songs in a similar style based on what you experienced from the other tracks. It’s also why the legal argument for a derivative work falls apart; there’s nothing to point to and say “This riff came from this other song,” because that’s not how the AI is constructing things. And, without being able to point to that, there’s no valid argument that you were ripped off.
In fact, in the case of image generation, and I believe music (I don’t know enough about how LLMs work to guess), the AI starts with a canvas of randomized noise and then works in iterations to form that noise into what it understands the prompt to be. So asking for a watercolor of a kitten starts with random noise, then adjusts the pixels that don’t seem to fit a watercolor of a kitten, and keeps doing so until it arrives at what it thinks is a suitable image based on its model training. What it’s NOT doing is starting with images of watercolor kittens and “mixing” them into something new.
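If it helps, that generation loop is simple enough to sketch. This is a toy Python outline, not a real sampler; `model.predict_noise` is a hypothetical stand-in for the trained network, and real samplers use a proper noise schedule rather than this even split:

```python
import numpy as np

def generate(model, prompt_embedding, steps=50, shape=(64, 64, 3)):
    """Toy diffusion-style sampling: start from pure noise, then repeatedly
    subtract the noise the (hypothetical) model predicts, conditioned on
    the prompt, until an image remains."""
    x = np.random.randn(*shape)                  # the canvas of random noise
    for t in reversed(range(steps)):
        predicted = model.predict_noise(x, t, prompt_embedding)  # hypothetical call
        x = x - predicted / steps                # peel away a slice of the noise
    return x                                     # what's left is the "watercolor kitten"
```

Note that the loop never touches a training image; whatever the training data contributed was frozen into the model’s weights long before generation starts.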
Talk of AI inspiration or intent seems kind of moot. While it would be neat if an AI spontaneously decided we need more late-period French impressionism in the world and started making images to fill that desire, the reality is that right now it’s humans driving the process of prompting for new music, images, prose, etc., with varying levels of success. So the inspiration and intent are easy to pin down, since they’re coming from humans just like they have since the dawn of man.
Why is it incoherent? I can totally imagine an AI that cooks up fancy combinations of dishes, even things that no one has tried before, and this being treated as a fine dining or gourmet experience.
Problem is that a similar thing can be said about the human process: we could in theory understand all the weighting and output the values of all of those metrics at every step and every loop, but the model is so large that this is useless for understanding what is actually going on. It doesn’t explain creativity, let alone other emergent properties. There is no reason to believe that whatever processing patterns emerge are the same between the two, but both are too large for that kind of reductionism to be sufficient for understanding what they do.
I see no reason at all. It is another tool. The same arguments were made dismissing photography as art, but it clearly is. It is, however, a different medium of art.
Oh, did I not say? By “a few tracks” I mean umpteen different tracks. My favorite artist has been churning out double albums since the sixties. I spent a month in front of the computer to produce this piece of work. I am very happy with the work my mixing software has done; it’s a truly inspired artist (probably no one else is happy with it, but that’s beside the point).
All those fancy processing techniques in my mixing software also use random noise a lot.
No, it’s starting with some images of watercolors and some images of kittens and mixing them together and giving me the result. It’s doing so in an incredibly clever way, but it’s fundamentally the same operation…
- I start with a bunch of pieces of art that someone else made encoded as digital files.
- I pass those files to a computer program (a bunch of deterministic computer instructions that have no agency or human attributes) that does some very clever processing on them
- Based on instructions I give the computer program it spits out a “new” piece of work based on the inputs I provided originally (that were someone else’s art)
Why is one copying and one not?
In a nutshell, yeah, that’s pretty much what they’re doing, assuming the most common method I see in the wild.
The most common pipeline I see in the wild (typically built with TensorFlow, the most common library today) works by creating a spectrogram of the audio it’s fed. Then, at generation time, it produces an image of white noise based on a random seed and, like all diffusion models, “denoises” that image until it looks like the kind of spectrogram it was trained on. It’s the exact opposite of how humans make music, in my experience, which is an additive process.
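To make “spectrogram” concrete: it’s just audio re-rendered as a 2-D image of frequency over time, which is what lets image-style diffusion work on sound at all. A minimal sketch with NumPy/SciPy (the 440 Hz test tone is just a stand-in for real audio):

```python
import numpy as np
from scipy import signal

fs = 22050                                   # sample rate, Hz
t = np.arange(fs) / fs                       # one second of timestamps
audio = np.sin(2 * np.pi * 440 * t)          # a 440 Hz test tone

# Sxx is a 2-D array: frequency bins x time frames. To the model it is
# literally an image, which is why diffusion denoising applies to it.
freqs, times, Sxx = signal.spectrogram(audio, fs=fs, nperseg=1024)
print(Sxx.shape)                             # e.g. (513, 24)
```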
Another method is to use a GAN (generative adversarial network) instead of diffusion, where the generator half of the model “synthesizes” a new spectrogram and tries to fool the discriminator half into accepting it as real.
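More precisely, it’s two networks training against each other: one synthesizes, one judges. A toy PyTorch skeleton with made-up layer sizes, just to show the shape of a training step (flattened 1024-value “spectrograms” stand in for real ones):

```python
import torch
from torch import nn

# G maps random noise to a fake flattened "spectrogram";
# D tries to tell G's fakes apart from real training spectrograms.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 1024))
D = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 1), nn.Sigmoid())
loss = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_batch):                   # real_batch: (n, 1024) spectrograms
    n = real_batch.size(0)
    # Discriminator: label real as 1, fake as 0.
    fake = G(torch.randn(n, 64)).detach()
    d_loss = loss(D(real_batch), torch.ones(n, 1)) + loss(D(fake), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: try to make D call its fakes real.
    fake = G(torch.randn(n, 64))
    g_loss = loss(D(fake), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```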
No, it’s not.
So there are magical AI fairies that draw those pictures? Fundamentally, the outputs of the AI come from its training data. The way that training data is combined to produce the output is very complicated and really clever, but it’s not creating anything new.
If your training data only had pictures of baby puppies, then your output would not look like kittens. If your training data only had charcoal drawings, then your output would not look like watercolors.
Sure, just like a person who has never seen a kitten couldn’t draw a kitten. But someone who HAS seen kittens isn’t just mashing together two kitten pictures each time they draw one.
And similarly for my mixing software. Sure, if I hadn’t included Brian Eno’s 24-minute noodly masterpiece Extracts from Music for White Cube in its inputs, then that bit in the middle of my track wouldn’t have sounded so much like it. But that’s just because the software was so inspired by Eno’s ambient tour de force.
Why does your collection of machine code instructions get to have the human attribute of inspiration whereas mine does not?
You’re asking the wrong guy. I never said AI has independent inspiration. I said it’s a tool for human intent. I also said that its output is not directly derivative because you cannot point to what was copied. The court case you cited backed me up on that.
So what is it that makes your collection of machine code instructions, which takes other artists’ work and spits out a new piece of art based on it, “a tool for human intent” that is creating non-derivative, original work, whereas my collection of machine code instructions that does the same thing is just making a copy?
Well, since your machine doesn’t actually exist and mine does… I guess that. When you make a song in the manner you described, go ask the courts. Frankly, it’s a silly and messy argument which has done nothing to convince me that AI output is innately derivative, much less so in a legal infringement sense.
Edit: Though I actually did answer it before. Your example is taking existing tracks and mixing them to create a new track. The AI is starting with random noise and trying to use the principles it was trained on to create a new appropriate track based on your request. You want to say these are the same thing but I fundamentally disagree so there’s probably nowhere else to go from there.
So in one case it’s taking existing tracks and combining them with random noise and clever maths to produce a new track based on the parameters you provided. And in the other case it’s taking existing tracks and combining them with random noise and clever maths to produce a new track based on the parameters you provided.
Got it. Totally different. That clears it up.
Dammit, it doesn’t exist!?! Then what have I just paid one and a half grand for a copy of Magix Sequoia for, then?