AI image generation is getting crazy good

Ponderoid · June 15, 2025, 11:41pm

Apparently. There’s nothing obvious in my custom instructions that would lead to that. I came up with the indescribable idea because I wondered how a text-to-image generator would deal with the logical impossibility. How can it ever draw something truly indescribable without source text? It usually does a pretty good job actually describing what it just drew anyway. I think it should turn on the snark and say “I can’t do that, sorry.”

That’s the typical gobbledygook it produces every time I ask for a movie poster. But you can glean some of what it was attempting to put down there.

Tibby · June 16, 2025, 12:52am

New hyper-res microscopic image stuns physicists: “it appears the subatomic building blocks of the universe… actually are building blocks!”

Subatomic research sponsored by The LEGO Group. String theory replaced by snap-fit theory.

Darren_Garrison · June 16, 2025, 8:02am

Playing with a prompt I found on Facebook. Here’s the original:

A photorealistic scene inside a supermarket: A surprised woman with wavy brown hair stands at a shelf, trying to take a jar of sausages. A mischievous gray rat is standing between the sausage jars on the shelf and grabs the same jar with both paws, preventing the woman from taking it. The woman has a shocked facial expression. Her shopping cart is already full with groceries: cucumbers, tomatoes, bananas, eggs, bread, and a whole roasted chicken. The supermarket is brightly lit, and the shelves are neatly stocked.

I replaced the rat with a possum, the sausages with blobfish, and the list of items in the cart with “many random unusual food items”:

A photorealistic scene inside a supermarket: A surprised woman with wavy brown hair stands at a shelf, trying to take a jar filled with small blobfishes. A mischievous possum is standing between the blobfishes jars on the shelf and grabs the same jar with both paws, preventing the woman from taking it. The woman has a shocked facial expression. Her shopping cart already contains many random unusual food items. The supermarket is brightly lit, and the shelves are neatly stocked.

Of interest here are the “food items”. In this one she’s buying a green tentacle, a yellow fish, and less identifiable things.

In the second one, she’s buying what seems to be a gigantic grub, and on the left just below the corn what looks like a large Amorphophallus corm/bulb.

This might be a Faberge Egg?

Darren_Garrison · June 16, 2025, 8:21am

Here we have items like gel stab, fish balls, creamed possom, lattice potato, pork minds, gator bites, and frosted flakes.

The prompt:

A photorealistic scene inside a supermarket: A woman with volumous brown hair stands at a shelf, sorting through many random unusual food items. The woman has an excited facial expression. Her shopping cart already contains many random unusual food items. The supermarket is brightly lit, and the shelves are randomly piled with many random unusual food items.

Darren_Garrison · June 16, 2025, 1:43pm

In Copilot:

“Create a Romanesco broccoli”

“Create a Romanesco broccoli made entirely out of red, marbled meat. Make it anthropomorphic. And a hyper realistic photo taken with an iphone 15.”

Copilot then listed several questions I could ask. I clicked on “How can anthropomorphism enhance food presentation?” And it came up with this:

Anthropomorphism in food presentation can make dishes more engaging, whimsical, and even emotionally resonant. By giving food human-like qualities—expressive faces, body shapes, or gestures—it transforms a meal into an experience rather than just sustenance.

Here’s how it can enhance presentation:

Emotional connection: People are naturally drawn to faces and expressive features, making anthropomorphic food more appealing and memorable.

Storytelling: It allows for playful narratives—imagine a group of mischievous dumplings or a regal steak commanding a plate.

Elevated aesthetics: A well-placed element, like eyes or limbs, adds depth and intentional artistry to food plating.

Marketing appeal: Brands use anthropomorphism to make food products feel friendly and relatable, increasing engagement and interest.

Thematic immersion: In themed restaurants or artistic presentations, anthropomorphic food deepens the immersive experience, making dishes part of the larger story.

Tibby · June 16, 2025, 4:11pm

Genetic botanists are on a mission to make plant-based meat taste more palatable.

Darren_Garrison · June 16, 2025, 5:40pm

That reminds me of the comic Farmhand, which just came back from a three-year hiatus.

Ponderoid · June 16, 2025, 7:38pm

This is the first time I’ve tried to have ChatGPT create an image of anybody playing a guitar since the early Dall-E 3 days. This is the same kind of mistake DE3 was making.

Context: the scene is set in 1977. Mom is showing her 10-year-old daughter her old guitar and lesson book. The flower thing is supposed to be a tuning aid. I’m wondering if the shirt the girl is wearing might be anachronistic.

Ponderoid · June 16, 2025, 8:47pm

Chronos · June 16, 2025, 8:57pm

Eww, Frosted Flakes!?

Thalidomide was still in use then, right?

Ponderoid · June 16, 2025, 8:59pm

Ouch. I’m pretty sure it was banned before that girl would have been born in 1967.

Darren_Garrison · June 17, 2025, 2:24am

Every girl is crazy about a shark-dressed man

Darren_Garrison · June 17, 2025, 8:42pm

Darren_Garrison · June 17, 2025, 9:14pm

Seeing the thread about Paw Patrol made me think about a different (pretty fun) kids’ show, Peg + Cat. I decided to try for a realistic (and older) image of the main characters. I made a composite reference image:

And ran it through Sora. Five of the eight images copied some or all elements of the drawing into the image, like this:

And the remaining three got things wrong, like the cat not being blue, or getting the clothes wrong (vertical stripes on Peg’s leggings, for instance). So I ran it through ChatGPT text only, no reference image:

And it pretty much nailed it on the first try.

The prompt:

Realistic photo of two people, a girl (around 16) and a boy (around 18). The girl is a white blonde. She wearing a salmon red striped knit cap with a white ball on top, a greenish (#Ff1e7180) long-sleeve blouse and skirt, leggings that are striped in two shades of purple, and salmon red short boots with buckles. The boy is black. He is taller than the girl. He is wearing a light purple fedora with a darker band, a shirt-sleeved shirt striped in two shades of yellow, white long sleeves under the yellow shirt, blue pants, and yellow sneakers. The girl is holding a cat. The cat is entirely blue, except for the tip of its tail, which is white. In the background is a whiteboard with lots of numbers and equations written on it. Candid photo taken with an iPhone 15.

Kron · June 17, 2025, 10:40pm

I want to know what stuffed scungilu is.

DCnDC · June 18, 2025, 12:21am

I reached the limits of NMKD so I went straight to the source of what it’s built on, InvokeAI. I dig it. Still pretty user friendly, I can mess around with nodes if I want to but I don’t have to, and I have a lot more control that doesn’t involve writing ridiculously long, complex prompts that will probably get mostly ignored anyways.

I started with ruined spaces, then for some reason I got onto 50s sci-fi pinups (I never thought I’d be so frustrated over trying to get small, pointy breasts!), which then morphed into a weird Giger trip, blended with some steampunk at the end for good measure.

Ponderoid · June 18, 2025, 3:45am

Darren_Garrison · June 18, 2025, 1:07pm

It has apparently been around for a while, but last night I first noticed/looked into Hailuo AI, which does images and videos. It gave me 500 free credits for the free account (unclear if they renew periodically or if that’s it) and charges 25 credits for each 8-second video.

So, I browsed through recent images of mine to find something to test it with. For my first try, I picked a Dedra/Krennic image where he is booping her nose instead of touching her head. I asked for an angry man repeatedly touching the nose of a horrified woman. The result was well done, but a lot creepier than a boop.

(YouTube insists on showing short vertical videos as “shorts”, which I don’t like, but can’t prevent.)

Next I tried this image. It rendered the movements of juggling better than I expected, but the dragon fruit quickly lost detail.

This is the image I mentioned earlier that I submitted to Kling; it spent two days in a queue, and then Kling rejected it when it was time to render. Hailuo actually did a great job with what I asked for. The rendered clip started with her mouth open, but I reversed and duplicated it (and sped it up by 50%) to form a loop.

This is my first (and so far only) try with pure text to video, using the prompt I posted images from here a couple of days ago. It isn’t bad, but it isn’t the possum species I wanted and it isn’t blobfish in the jar.

So I rendered a new landscape image in Sora and used that for image to video. The bad part is that her face is frozen for some reason. The good part is the physics, which is really good. Note how the liquid sloshes in the blobfish jar (and the blobfish bobs a bit at the end). And how the fruits move when her arm touches them, including subtle bending in the pineapple leaves.

DCnDC · June 20, 2025, 3:17am

Venom + art nouveau is a great combination.

Darren_Garrison · June 20, 2025, 3:58am

I tried “An art nouveau Venom wearing denim” and Sora rejected it. Apparently that character isn’t in their relaxed guidelines.
