AI image generation is getting crazy good

Darren_Garrison · August 12, 2025, 10:07am

Another prompt test:

Create a realistic photo of a Geronticus eremita attempting to lift a Dactylotum bicolor out of an empty wineglass on a 1970s linoleum kitchen floor.

As usual, Copilot is top of the heap

Next, the clueless: SDXL, Flux Schnell, Ideogram, and HiDream

And the not bad Imagen

Chronos · August 12, 2025, 12:45pm

One more thought on the waterfires: The best examples all appear to have disconnected bits. This makes it look more like both water and fire, since water will have loose droplets, and fire will have embers. Plastic or blown glass, though, will all be one connected piece, so the ones with one connected piece look more plasticky or glassy.

pulykamell · August 12, 2025, 1:53pm

This is the best I could do with “fire made of water”: (ChatGPT 5)

Darren_Garrison · August 13, 2025, 1:45am

I had a sudden inspiration from the title of the Pet pictures thread…

A middle-aged woman in a sweatsuit power-walking on a sidewalk in a suburban neighborhood. She has three framed photos on leashes (like small dogs on leashes running with her). Iphone 15 photo with shallow dof.

It understood everything pretty well except for the number three.

Tibby · August 13, 2025, 4:01pm

Well, looks like I have to bail my cat out again.

Maserschmidt · August 13, 2025, 4:31pm

21 years old, the judge should be cutting that cat some breaks.

Tibby · August 13, 2025, 4:52pm

ChatGPT got the DOB wrong. She’s only 2.

pulykamell · August 13, 2025, 5:02pm

Also one gradation between 7” and 8”, but two between 8” and 9”. (And never mind the unrealistic measurements.)

Jophiel · August 13, 2025, 5:02pm

“Where did this catnip come from?!”

I got it from YOU, Dad!

Darren_Garrison · August 13, 2025, 5:40pm

The most attractive Hobbes

Hobbes the tiger from Calvin and Hobbes, except he’s very attractive. Iphone 15 photo with shallow dof and forced perspective. 9:16

Maserschmidt · August 13, 2025, 5:41pm

You monster.

Chronos · August 13, 2025, 5:57pm

Was that entirely AI-generated, in one pass? I’m a little surprised that it made the sign identical in both views. That’s the sort of thing that these generators often have difficulty with.

Tibby · August 13, 2025, 6:15pm

No, it messed up the first sign, so I duplicated the second and pasted onto the first. Didn’t notice the wrong DOB, though.

Darren_Garrison · August 13, 2025, 6:20pm

Back on the relative scale issue, the ChatGPT/Copilot/Sora renderer seems to have a pretty good grasp that elephants are big.

Sample prompt


Realistic profile photo of an elephant standing in a crowded room. A wholesome 1950s businessman is writing a mailing address on the elephant. Iphone 15 photo. 16:9

(I also find it interesting what it comes up with for mailing addresses when left with making that decision.)

Darren_Garrison · August 13, 2025, 9:21pm

Sora had no problem with the prompt “MTG and AOC fighting with space lasers”.

But didn’t “get” MTG playing M:TG

Ponderoid · August 13, 2025, 9:37pm

Yes, here it got the idea that the phone booth should be bigger than a 6-year-old kid, but… not nearly big enough.

Prompt: Draw a 6-year-old boy in a full sized phone booth.

Prompt: Draw a 4-year-old boy in a full sized phone booth

Chronos · August 13, 2025, 9:58pm

Wait, Marjorie Taylor Greene is actually Jace the Mind-Sculptor? That explains so much!

Maserschmidt · August 13, 2025, 10:08pm

Out of curiosity, I asked Copilot to estimate how much energy it was using to generate one picture, quoted to me in the units of “time a standard microwave is on.” It danced around quite a bit about not being sure and data not being public, but eventually its answer was that depending on image complexity, generating one picture for me was like running the microwave for 30-50 seconds.

I then asked it how much energy it used in that calculation, and it said 1-7 microwave seconds.

No wonder they’re building big energy hubs!

GESancMan · August 14, 2025, 3:44am

Got the colors of the lightsabers backwards, though.

Jophiel · August 14, 2025, 12:26pm

Midjourney did pretty well with this, just prompting “Six year old in a phone booth”

Not exactly “nailed it” but two or three from my single test came out with pretty normal-sized booths. I like the kid talking on a cell phone but, hey, he couldn’t reach the pay phone!

I used a couple of local SDXL models and they made much smaller booths.

I like the kid wearing headphones and trying to dial a door keypad.

Unrelated to children and phone booths, here’s my Mutants & Masterminds character, Molly Dynamite
