Another AI images question

And in a lot of those images, humans would interpret the elephant’s head as being at the front, but only because that’s where the elephant’s head is: the truck part doesn’t have any particularly recognizable front or back.

And I’m not convinced that it entirely understands the “frontness” of an elephant, given that one of those has tusks coming out of both ends, and several are missing the most quintessential feature of an elephant’s front: the trunk.

Also, I can’t help but remember one of the experiments from the long NightCafe thread, where someone asked it for a painting of the prompt “Facing the Charging Elephant”, and the AI dutifully created a painting of an elephant plugged into a charging station.

That would be a failure of ‘word in context’ determination. Word in Context is an emergent capability that just appeared at a certain point in training.

But it’s not a “mistake” to say that lungs have a purpose in a sense that geologic formations do not.

Who says it’s a failure?

It’s a better “charging elephant” joke than ChatGPT is giving me.

But the majority of the current explanations for why a trait evolved boil down to “because it made the organism more fit”, which isn’t much of an explanation at all.

A more direct analogy to explaining how an AI recognizes specific things is if you can explain how a trait evolved by saying that a switch from adenine to thymine in the 46th codon in the gene for the 3rd step in a five-step process resulted in an enzyme that was 17 percent more efficient. We have some explanations like that, but they aren’t the norm and they aren’t cheap or easy answers to get.

Explaining how a trained AI model “understands” a specific concept would have a similar level of fine detail, difficulty to tease out, and gibberish-soundingness to laymen. The real answer for how an AI recognizes the front of the truck would be something like “because values b724 and b725 in node 17 of layer 8 are set to ‘1’”.
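To make that concrete, here is roughly what “pointing at the values” looks like in practice: a minimal sketch using PyTorch and an off-the-shelf ResNet as a stand-in (Midjourney’s actual architecture, layers, and channel numbers are not public, so everything named here is illustrative).

```python
# Minimal sketch of reading raw activations from one layer of a network.
# The model (a torchvision ResNet) and the layer/channel picked are purely
# illustrative stand-ins; they are not Midjourney's architecture.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)  # untrained weights, just for illustration
model.eval()

captured = {}

def save_activation(name):
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# Register a forward hook on one arbitrary intermediate layer.
model.layer3.register_forward_hook(save_activation("layer3"))

image = torch.randn(1, 3, 224, 224)  # stand-in for a real input image
with torch.no_grad():
    model(image)

# An "explanation" at this level of detail is just pointing at numbers like these.
acts = captured["layer3"]
print(acts.shape)           # e.g. torch.Size([1, 256, 14, 14])
print(acts[0, 17, 0, :5])   # a few raw activation values from channel 17
```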

But who cares about that, for either evolution or for AI? Aren’t the general underlying principles by which it is operating the important explanation?

No? Not when the question is “how does Midjourney recognize the front of an elephant”, or “why is a St. Bernard so big”. General principles are not answers to specific questions.

Looking at the Midjourney pictures, it also has both the truck and the elephant rotated the same way in 3D space. It does look like two 3D models merged together.

The examples using other AI services were not as good as the ones from Midjourney. I’m not sure why, but it seems like Midjourney has its own style.

Which is weird, because Midjourney is Stable Diffusion with some custom tweaks.

I believe Midjourney has a much larger model than what you get from Stable Diffusion, at least compared to locally run Stable Diffusion, where the models are maybe 2-3 GB. So Midjourney probably has a lot more data to work with when extrapolating what an elephant-shaped truck might look like.

For that matter, I suppose you could custom-train an SD model on elephants and garbage trucks if you really wanted.
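For reference, running one of those local checkpoints is only a few lines with the Hugging Face diffusers library; this is a minimal sketch (the checkpoint name, prompt, and availability of a CUDA GPU are illustrative assumptions), and custom training on your own elephant and garbage-truck images would build on the same pipeline.

```python
# Minimal sketch of running a locally downloaded Stable Diffusion checkpoint
# with Hugging Face diffusers. Checkpoint name and prompt are illustrative.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # a few GB of weights, cached locally
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a CUDA GPU is available

image = pipe("a garbage truck shaped like an elephant, photo").images[0]
image.save("elephant_truck.png")
```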

Okay, apparently the latest Midjourney is not SD-based.

From that link:

“Yeah, all of my Midjourney results seem to be a pastiche high quality 3D render of the prompt, instead of mimicking the style asked for.”

Given that learning systems ultimately involve human feedback, isn’t the answer “because humans keep picking the versions that have the right front bits”?

This is not true.

As to the OP, the best answer you’re likely to get is that the images in the training data tend to show more of the front of both trucks and elephants since those are the interesting parts.

This video is five years old now but still entirely relevant to why we don’t really know how AI works.

OpenAI’s own discussions of their research seem to mention using human feedback quite a bit.

Having the ability to incorporate human feedback is not the same as human feedback being necessary. A large part of the recent success of deep learning is that there are techniques (masked language and masked image modeling are the most common) that allow algorithms to learn relevant features without annotations or feedback.
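As a concrete illustration of the masked-modeling idea, the model learns by predicting hidden pieces of its own input, with no human annotation in the training objective. A minimal sketch, assuming the Hugging Face transformers library and a stock BERT checkpoint (the example sentence is mine):

```python
# Minimal sketch of masked language modeling: hide a token, have the model
# predict it. No human labels or feedback are involved in this objective.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

text = "The trunk is at the [MASK] of an elephant."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and read off the model's best guess for it.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = logits[0, mask_pos].argmax().item()
print(tokenizer.decode([predicted_id]))
```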

I didn’t say it was necessary. I just said these systems involve it, because human feedback is all over the existing datasets they are trained on.

You said they “ultimately involve human feedback” (which suggests “necessary” to me…) and suggested human feedback is the reason why these systems can reason about relevant parts of images. Both are straight-up incorrect.