AI image generation is getting crazy good

Darren_Garrison · May 30, 2025, 5:47am

ChatGPT trying to figure out who Fleebledoop the Glentoglian is.

Summary

Generate a photographic image in the style of photos taken in 1870, where three people, dressed in period clothing, are standing next to Fleebledoop the Glentoglian. The setting is a cleared area where trees are being felled. The photo has an aged and worn appearance, as it was taken in 1870. It features time-induced stains and scratches. Significantly reduce the sharpness so that the details are not crisp, and greatly increase the wear of the photo, including small tears, missing corners, and small wormholes caused by insect damage. Add a diagonal cut across the photo, as if it had been torn and later mended. There is no text. 1:1

The original prompt had a grey alien:

Summary

Generate a photographic image in the style of photos taken in 1870, where three people, dressed in period clothing, are standing next to a classic gray alien that floats serenely one meter above the ground. The setting is a cleared area where trees are being felled. One of the people is holding a glowing cube, as if showing it to the camera

The photo has an aged and worn appearance, as it was taken in 1870. It features time-induced stains and scratches.

Significantly reduce the sharpness so that the details are not crisp, and greatly increase the wear of the photo, including small tears, missing corners, and small wormholes caused by insect damage.

Add a diagonal cut across the photo, as if it had been torn and later mended.

Ponderoid · May 30, 2025, 5:51am

I see Gemini isn’t any better than ChatGPT at properly drawing a mirrored subject. I see that particular error, where parts of the mirrored person are shown outside the frame, all the time. I’ve got a whole gallery of amusing mirror screwups going back to the early Dall-E 3 days when it was first added to ChatGPT.

PlaceboTarget · May 30, 2025, 5:55am

But this aspect is significant in this context, as the girl’s reflection does escape the mirror for a while to pursue a life of her own in the real world. I had no idea it was an error–it just seemed an appropriate illustration for what happens in the story.

Darren_Garrison · May 30, 2025, 5:58am

Summary

Generate a photographic image in the style of photos taken in 1870, where a range of races of aliens and d-bees (each of which is a hybrid with a random Disney or Pixar princess) are standing in a cleared area where trees are being felled. The photo has an aged and worn appearance, as it was taken in 1870. It features time-induced stains and scratches. Significantly reduce the sharpness so that the details are not crisp, and greatly increase the wear of the photo, including small tears, missing corners, and small wormholes caused by insect damage. Add a diagonal cut across the photo, as if it had been torn and later mended. There is no text.

PlaceboTarget · May 30, 2025, 9:21am

A falconer (by Gemini).

LSLGuy · May 30, 2025, 10:22am

Unsettling how?

I agree the reflected girl partly escaping the mirror frame is odd, although as you say it sorta matches the fantasy aspect of your prompt. To me at least, the child’s face completely avoids uncanny valley or dead face / eyes.

That falconer is good enough to be a real photo. We have clearly passed the point where highly convincing fakes can be made that will stand up to a bunch of scrutiny.

PlaceboTarget · May 30, 2025, 10:42am

Maybe it’s just me, but after reading the story and seeing the picture where the girl was standing half inside the mirror and half outside, the realism of this unreal situation made me feel uncomfortable for a second. I don’t usually have nightmares, but some of my dreams are weird and this picture resembled something I might see in one of those dreams.

LSLGuy · May 30, 2025, 10:46am

Thanks. Makes sense now. I was just critiquing the rendering qua rendering.

Darren_Garrison · May 30, 2025, 1:32pm

As a semi-aside, when AI image generators first started becoming available to experiment with, image output was always square and resolution was 256x256 or even just 128x128. Now you get a choice of aspect ratios and for these landscape images ChatGPT gives me 1,536x1,024 and Gemini gives a whopping 2,560x1,792: 70 times the number of pixels in the 256x256 pixel images of 3 or 4 years ago. (280 times the 128x128 images.) That’s a very impressive advance just by itself.
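For anyone who wants to check those ratios, here is a minimal sketch of the arithmetic. The resolutions are the ones quoted above; the function and variable names are just illustrative choices, not anything from the generators themselves.

```python
# Pixel-count comparison for the resolutions quoted in the post.
# All names here are hypothetical, chosen only for this sketch.

def pixels(width: int, height: int) -> int:
    """Total number of pixels in a width x height image."""
    return width * height

early_256 = pixels(256, 256)    # early square output
early_128 = pixels(128, 128)    # even earlier square output
chatgpt = pixels(1536, 1024)    # ChatGPT landscape output
gemini = pixels(2560, 1792)     # Gemini landscape output

print(f"Gemini vs 256x256:  {gemini / early_256:.0f}x")
print(f"Gemini vs 128x128:  {gemini / early_128:.0f}x")
print(f"ChatGPT vs 256x256: {chatgpt / early_256:.0f}x")
```

By the same arithmetic, the ChatGPT landscape images come out to 24 times the pixel count of a 256x256 image.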

Anyway, I decided to do a comparison between the new Gemini renderer and ChatGPT using a heavily modified version of my most recent prompt, which I came up with on the fly for this test. The first three are Gemini output. The only serious “Hey, this is AI!” flaw I noticed in these is that in the first image the girl’s pinky finger was vaguely defined and merged into the flesh of her other hand. I fixed that manually by cloning and resizing one of her good fingers to cover it.

Another experiment I’m working on currently is to try to get camera angles from directly above. It isn’t easy to get, though, and this is the best Gemini attempt. (There was an earlier one that was fully straight down, but unfortunately the image was badly mangled from items leaking in from an earlier prompt in that session.)

And here is ChatGPT. It is better at getting an angle from above in some of the tries. I want to say that it is not quite as realistic as the Gemini images, but they are both very close to equal.

Darren_Garrison · May 30, 2025, 1:55pm

Having the ponytails was actually an unintentional leftover from the modified prompt, but once I started making test images I kept it for consistency. I went back just now and reran images with tweaked prompts and this time got good overhead views from both AIs.

Gemini

ChatGPT

This prompt version has the “overhead view” request. Obviously, you delete that for the normal view.

Summary

A blonde girl (around 14 or 16) with long flowing hair kneeling looking at rabbits in a grassy meadow. She is wearing a yellow sundress with grass stains at the knees and flip-flops. She is holding in both hands a black rabbit. Around her are a number of solid white and solid black rabbits. Make it look like a real photograph taken with an iphone 15. I want the POV to be from high above her looking directly down towards the top of her head, like the view from an airplane, a drone, or a satellite. 16:9

Darren_Garrison · May 31, 2025, 12:51am

I mentioned bunny girl being adapted from the most recent prompt I was working on before that. Here are some results from that prompt.

In a Facebook post I saw a drawing about Susan, who collects all rocks, not just the ones that she wants to use as weapons. (A Google Lens search showed it to be an illustration from the 1969 book Let’s Read About Rocks by Rand McNally with non-original text added.)

It needed to be improved on with meteorite-related content. At first I used the original drawing as a reference image with the prompts

Then went text-description only

(ChatGPT can make surprisingly decent images of specific types of meteorites even without reference images.)

I mentioned that there was a straight-down bunny girl image that was badly mangled from items leaking in from an earlier prompt. Here’s that image, now that the error has context to make sense.

(For those who haven’t encountered it, sometimes when you are using one of the chat AIs to create images and move on from one prompt to a new one, the AI will include aspects from the old prompt in the new image. Gemini and Copilot seem to be worse about it than ChatGPT, but all of them make those mistakes. You really have to get into the habit of starting a new chat session for each new prompt.)

LSLGuy · May 31, 2025, 11:45am

I’m not sure how much that leakage is a “mistake” as opposed to the AI being trained on how real people really think.

IOW, I speculate that it fits the typical users’ use cases, even if it doesn’t fit yours or mine. All day long Jane the advertising copywriter is working on the Smithers project and keeps asking for more.

But yeah, the meteoric bunnies are a poor fit for one another.

DCnDC · May 31, 2025, 12:24pm

The cartoon version was closer to what I asked for, but the photo version isn’t bad. I think it still transmits the idea.

Ponderoid · May 31, 2025, 12:49pm

I leaned in harder on telling it what I wanted with the water turning into wine, and it complied pretty well.

ETA second try:

DCnDC · May 31, 2025, 12:59pm

Nice. I like how Jesus comes out a little different each time.

LSLGuy · May 31, 2025, 1:06pm

Extra points for him also looking reasonably Semitic. Not the usual bible belt blond blue-eyed jeebus nonsense.

Darren_Garrison · May 31, 2025, 1:44pm

In my first try (in Copilot) I forgot to include the Ferris Wheel and tent (but overall it is one of the best images). I added those for a 4-shot in Sora

Summary

A scene at a small country fair. Jesus is standing under a small striped awning at a booth advertising wine. On the wooden stand in front of him are several water bottles. The bottles on the far right side are filled with red wine. The bottles in the far left side are filled with water. The bottles in the center are filled with a swirling, blending chaotic mix of wine and water. Jesus is staring at the central water bottles in intense concentration, waving one open hand over them as swirls of magic distort the air between his hand and the bottles. In the background are a Ferris Wheel and a big top circus tent. Candid photo with shallow dof and forced perspective taken with an iphone 15.

Pretty cool prompt idea.

Ponderoid · May 31, 2025, 1:59pm

It’s just another carnival trick. </unimpressed>

Darren_Garrison · May 31, 2025, 2:23pm

Here’s what Gemini did with the same prompt

I trimmed down the prompt far enough to use it for a video clip in Bing and got crap. Every video I’ve tried for in Bing has been crap. I don’t know why; it uses Sora, and the videos on Sora are much better.

LSLGuy · May 31, 2025, 2:29pm

That’s a great pic and dramatic and lively. And impressive as hell.

But that’s Jesus the magician, not Jesus the miracle worker. I’d expect a magic trick to have curling smoke and a gradual transition. And perhaps liquid seemingly appearing out of nowhere.

I’ve never seen a miracle, but somehow I expect less folderol and more results. As in one millisecond the bottles have water and the next they have wine.

Like the difference between a McLaren and a Lucid Sapphire in a drag race. One is full of noise and drama, the other just gets the job done no muss no fuss, just silent aggressive motion and we’re done.

WWE loudmouthed clown vs ninja assassin.
