Just going by what works. At the time just putting “Jaws” wasn’t enough and DE3 got confused, making a giant gremlin in the background. Modifying it to “Jaws the shark” worked. I just tried five new runs based on one of the old prompts: Jaws, Jaws the shark, Bruce the shark, Bruce, and shark. Just “Jaws” continued to be mostly insufficient to tell DE3 what I wanted. “Jaws the shark” works. “Bruce the shark” works, too. Just “Bruce” does not. Just “shark” makes very similar images to the other successful ones. And this time none of the results were particularly close to the Jaws poster.
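For anyone who wants to replicate the five runs, here's a minimal sketch assuming the official OpenAI Python SDK and an API key in the environment (I've shown the bare terms rather than my full base prompt):

```python
# Rerun the five prompt variants against DALL-E 3. Assumes the openai
# package (pip install openai) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()
prompts = ["Jaws", "Jaws the shark", "Bruce the shark", "Bruce", "shark"]

for p in prompts:
    # DALL-E 3 only supports n=1 per request; generation is randomly
    # seeded server-side, so expect different images every run.
    resp = client.images.generate(model="dall-e-3", prompt=p, n=1, size="1024x1024")
    print(p, "->", resp.data[0].url)
```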
Now I explicitly ask for the Jaws poster
We don’t actually know exactly what is happening in the brain when we create some piece of art, but we can say for sure what is NOT happening. Your brain is absolutely not a deterministic automaton that predictably produces an output based on the inputs it receives. That “tabula rasa” theory has been completely debunked for decades.
But that is 100% definitely what is happening inside all computer software, your AI model included.
If we are going to assign human qualities to your AI model, including the ability to create original work based on its inputs (which are other people’s work), then what makes other software like my mixing software or the image-processing kernel I just wrote unable to have those qualities?
That didn’t answer the question. Regardless of whether or not I create art, is MonaLisa.png stored in my brain or is it all brain fairies bringing it up when I want it?
We literally don’t know the answer to that question for the human brain
We absolutely do for any bit of computer software; it is mathematically provable. In order to produce a close approximation of the Mona Lisa, a digital representation of the Mona Lisa must be encoded in the data that the computer program reads. We can even say, at a minimum, how much information must be stored for any given approximation.
Saying your computer program produced that image without encoding an image of the Mona Lisa from the training images is as plausible as saying your machine is a perpetual motion machine. It’s fundamentally impossible.
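To put a rough number on that claim, here's a sketch (the filename is hypothetical; needs Pillow). The true minimum is the Kolmogorov complexity, which is uncomputable, but a real compressor gives a feel for the scale:

```python
# Illustration of the "minimum information" claim: even a crude
# approximation of the painting has an irreducible core of bits.
# Hypothetical filename; pip install pillow.
import zlib
from PIL import Image

img = Image.open("mona_lisa.png").convert("L").resize((64, 64))  # crude approximation
raw = img.tobytes()                    # 64*64 = 4096 bytes of 8-bit pixels
packed = zlib.compress(raw, level=9)   # generic lossless compressor
print(f"raw: {len(raw)} bytes, zlib: {len(packed)} bytes")
# Even this tiny, blurry version doesn't compress to nothing; whatever
# irreducible bits remain have to be encoded in the data the program reads.
```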
So brains are magic?
Probably not, but they are absolutely not deterministic automata that predictably produce an output based only on the input they receive. That much we do know.
Computer programs (AI models included) are definitely exactly that and no more.
Of course they are. The only reason we can’t predict how a person will react to any given input is because we have a vastly insufficient model of the very precise physical and chemical state of any given brain. The only non-deterministic factor is quantum randomness. Anything other than that is believing that human minds are magical things not subject to the laws of physics.
Yeah, if we can completely know the entire state of the brain AND everything that affects it, which is everything in the universe, then we can deterministically predict how it will react. Except no…
And fortunately none of the chemistry or physical processes of the brain involve any quantum physics
That’s not true. We don’t know everything about how the brain stores memories and experiences, but we know a good bit, and we do know that it doesn’t store MonaLisa.png. I guess that only leaves brain fairies. After all, those are the only two ways to recall information, right?
So what makes humans special is an insufficiently effective error-correcting code?
I’d say a better example is less “Look at that Jaws poster” and more what it can’t accomplish.
For example, take the painting by Pierre-Auguste Renoir, “Two Sisters (on the Terrace)”. For reference, here it is:
This is a famous enough painting that it’s certainly in the models. No one trained Stable Diffusion or Midjourney or Dall-E and left this out when they were scraping up all the art in the world. It’s also not especially obscure; any intro art student or anyone with an amateur interest in French Impressionism is going to be familiar with the work. So, knowing that the work is included in the model, it should be trivial to prompt it back out, right? Here’s Midjourney’s guess at what I mean when I ask it for Renoir’s Two Sisters (On the Terrace):
That’s not a “lossy” interpretation. That’s just wrong. That’s something that is saying “Ok, so I know who Renoir was… And I know Impressionism… and I know what two sisters would look like… in an era/style appropriate way… and it’s a terrace so there’s a railing… and I guess I heard that there’s flowers?..”
The dresses are wrong, the background is wrong, the positioning is wrong, even the strokes are way off… it’s something that understood the basics of what the painting would entail but was left to its own devices to take it from there. But why? After all, we know that Renoir-TwoSisters.png is in the training, right? So why couldn’t it just look at Renoir-TwoSisters.png and come up with a much more accurate depiction? I wasn’t asking for something LIKE Two Sisters, I literally asked it to give me Two Sisters and it failed. And this is a public domain image, so it’s not a rights-protection issue either.
The obvious answer, of course, is magic AI fairies. The other obvious answer is that there is no Renoir-TwoSisters.png in the model to extract, but rather a collection of tokenized bits telling it what generally makes up a Renoir painting and what one looks like thematically, and perhaps some general information about the painting itself (red hat, flowers, etc.), so it starts with noise and tries to work its way towards a French Impressionist-style painting of two sisters in era-appropriate garb with flowers, in a way that looks like a Renoir work, etc.
Things like Jaws or R2D2 or the Mona Lisa come out “cleaner” because they’re so popular that the appropriate tags for them have been reinforced a ton of times. Something like Two Sisters is popular enough to be in the set with some weak information. That shouldn’t matter if Renoir-TwoSisters.png was in the model as an image but, well, it ain’t because that’s not how generative AI models work. They work via fairy magic, of course.
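If you’d rather see that pipeline than take my word for it, here’s a minimal sketch with Hugging Face’s diffusers library (the model ID, CUDA device, and output filename are my assumptions; any Stable Diffusion checkpoint behaves the same way):

```python
# Minimal text-to-image sketch with the diffusers library: generation
# starts from random latent noise and is steered by a text embedding.
# There is no stored Renoir-TwoSisters.png to look up anywhere.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Same prompt + different seed = different picture, because the only
# image-specific state is the random noise the sampler starts from.
gen = torch.Generator("cuda").manual_seed(42)
image = pipe(
    "Renoir's Two Sisters (On the Terrace), French Impressionist oil painting",
    generator=gen,
).images[0]
image.save("two_sisters_attempt.png")
```

Note there’s no retrieval step in there at all: change the seed and the same prompt gives you a different painting.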
Edit: I noticed that I went with a thinner aspect ratio on my prompt so I ran it again closer to the original ratio to give the AI the best chance at getting it right. It came out considerably worse instead.
What on Earth are you talking about? These two concepts are entirely disconnected. The fact that you are not a blank slate has nothing to do with whether or not you are deterministic. Everyone is deterministic, but since everyone’s mental models are weighted differently, the result is that different people think differently in different situations.
How do we “know” that?
Everything involves quantum physics, but the fact that quantum randomness makes determinism fuzzy at a fine enough scale doesn’t really leave room for that randomness to be behind consciousness or free will. It all averages out over any meaningful scale anyhow. Pretending otherwise is just the God of the Gaps fallacy, and that gap is constantly shrinking as our understanding of quantum mechanics grows.
Have you actually looked at the code?
Would you like to? Here, Stable Diffusion is open source and you can run it locally, so you can tell exactly what data is available to the model:
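For scale, a back-of-the-envelope check you can run once you’ve pulled the weights down (both figures below are rough, from the SD v1.x checkpoint size and published LAION-scale training counts):

```python
# Rough figures only: ~4 GB weights file, training on the order of
# hundreds of millions of LAION images.
checkpoint_bytes = 4 * 1024**3        # ~4 GB checkpoint on disk
training_images = 600_000_000         # order-of-magnitude image count
print(checkpoint_bytes / training_images)  # ~7 bytes per training image
# A handful of bytes per image cannot encode the images themselves.
```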
Sure. These are based off the LAION image set which is searchable.
(The linked site is, I believe, actually an abbreviated version without all the porn and whatnot. There are more complete search sites, but they tend to be jankier, and this one makes my point anyway.)
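If you’d rather hit the index programmatically than through a web UI, the clip-retrieval project ships a Python client; the endpoint and index name below are the ones LAION published and may have moved or gone offline since:

```python
# Text query against the LAION index via clip-retrieval
# (pip install clip-retrieval). Endpoint/index may no longer be live.
from clip_retrieval.clip_client import ClipClient

client = ClipClient(
    url="https://knn.laion.ai/knn-service",  # public LAION-5B backend
    indice_name="laion5B-L-14",
)
for hit in client.query(text="Renoir Two Sisters on the Terrace")[:5]:
    print(hit.get("url"), hit.get("caption"))
```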
Dall-E 3 appears to have no clue at all about that painting. I quite like Renoir’s Two Sisters On the Terrace with a shark and R2-D2, though.
The sisters on the first terrace made a huge fucking mess. They deserve that shark attack.
Your git link goes to a user interface program. Here is the Stable Diffusion project:
That is not code. That is a database