AI image generation is getting crazy good

Darren_Garrison · September 23, 2025, 7:18pm

I have sound here.

Ponderoid · September 23, 2025, 7:34pm

Well, there’s at least one more who can hear it. Thanks.

Source image from ChatGPT, first result from Grok when it won’t let me enter a prompt:

And another one where the sound in Grok didn’t download into the file, so there was no sound to play when I uploaded it.

Ponderoid · September 23, 2025, 7:43pm

@Joey_P, The DRMGIRL should have put her car in park before getting out. Amazing what can happen at a Carl’s Jr. next to a TA Express in Blaine, Washington.

Darren_Garrison · September 23, 2025, 9:13pm

I’ve just discovered a shortcoming of Google Photos’ free Veo3 videos: it only makes 9:16 videos. Grok will automatically make the video in whichever supported aspect ratio most closely fits the ratio of the source image. Here’s a manga panel that I converted in Sora (and posted earlier in this thread), now animated in Grok. It did a pretty reasonable job of animating spider movement, I think.

And here’s Veo3. As you can see, it horribly bungled the aspect ratio. The spider movements also aren’t bad here, but the crowning really came out of nowhere.
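
For anyone curious, the "closest supported aspect ratio" behavior described for Grok above could be sketched roughly like this. This is only an illustration: the list of supported ratios and the function name `closest_ratio` are assumptions made up for the example, not anything Grok documents.

```python
import math

# Hypothetical list of supported output ratios as (width, height).
# This is an assumption for illustration, not Grok's actual list.
SUPPORTED_RATIOS = [(9, 16), (2, 3), (1, 1), (3, 2), (16, 9)]


def closest_ratio(width, height, supported=SUPPORTED_RATIOS):
    """Return the supported (w, h) ratio nearest to width:height.

    Distance is measured on a log scale so that a ratio and its
    inverse (e.g. 2:1 and 1:2) are equally far from 1:1.
    """
    target = math.log(width / height)
    return min(supported, key=lambda r: abs(math.log(r[0] / r[1]) - target))
```

With this made-up list, a 1080x1920 source stays 9:16, and a 2:1 source would land on 16:9, since that is the widest ratio in the list.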

Ponderoid · September 24, 2025, 2:33am

I like how ChatGPT just knew who the statue on the left was supposed to be.

Source:

Prompt to ChatGPT:

Redraw the image, change the statues to realistic people. Do it 1:1 aspect ratio and you can crop out the sides of the image, concentrating on the former statues and their base.

Bonus Grok video (took whatever it gave me on first upload without redoing it with a prompt)

Darren_Garrison · September 24, 2025, 2:55am

I just discovered something. In the app, at least, there is a mute audio button. It doesn’t just mute playback of the audio in the app; it also leaves the audio track out of the file if you download while muted. Unmute, redownload, and you get the audio.

Ponderoid · September 24, 2025, 2:57am

I found that when I googled the issue; there’s online discussion about the problem. It only rarely works for me, and for most others as well. The company has acknowledged the problem and still hasn’t fixed it.

Darren_Garrison · September 24, 2025, 3:10am

I’ve been digging through my archives for images to convert. In the past two days it has let me do around 50 free videos per day before hitting a limit, which is insanely generous compared to most other AI video sites. They allow like half a dozen videos per month, and you tend to have to wait hours in the queue for them to be generated. Those sites have more control, more features, and probably tend to have better output, but still.

Darren_Garrison · September 24, 2025, 6:23pm

Here’s a fun one. I showed Copilot this comic strip

And told it

“Convert the middle panel to a realistic photo with no text. Iphone 15 photo. 9:16”

And got this, which is pretty funny, but doesn’t quite get it.

I then told Copilot

“Look at the original again. His legs are bare and the front of the worm costume continues downward past his torso towards the floor.”

It started adjusting the image with pretty good results, got down to around his knees, and aborted the image.

I tried again, with the strip cropped to just the center panel

And got this

Meanwhile, I was also trying it in Sora. I accidentally didn’t tell it to use the middle panel so it made its interpretation from the whole strip.

And here’s Sora with the cropped single panel (the first of those I consider a pretty strong success)

Jophiel · September 24, 2025, 9:09pm

Devoid of context, he mainly looks ashamed to have pissed himself

Chronos · September 25, 2025, 12:17am

Shofar, sho good.
EDIT: Dangit, beat by @Maserschmidt .

Darren_Garrison · September 25, 2025, 12:27am

You were shoclose, but shofar.

Alessan · September 25, 2025, 8:35am

Here I am again in this mean old town
And you’re shofar away from me
And where are you when the sun goes down?
You’re shofar away from me.

Darren_Garrison · September 25, 2025, 2:19pm

So far Grok seems to be pretty bad at understanding prompts for image to video. Here’s an example from last night.

For the source image, there are certain creatures and objects that ChatGPT/Copilot/Sora is especially good at for whatever reason, with the Necronomicon as portrayed in The Evil Dead being one of them. Months back I took an image of a girl holding a book (found on the Sora feed) and just told Sora to replace the book she was holding with the Necronomicon. I fed that photo into Grok.

Here’s the video it automatically made

I then prompted a new version: “The book is screaming gibberish while the girl looks on in horror”

Which is almost 100% completely unlike what I wanted.

Ponderoid · September 25, 2025, 2:31pm

Did you choose the speech option when you made that second one?
IME, if I choose speech, every character in the image starts speaking the literal thing I put in the field with one voice.
If I don’t choose speech, when the character(s) speak, it comes out sounding like Simglish.

Darren_Garrison · September 25, 2025, 2:34pm

I don’t remember if I picked “speech” or “custom”. Probably “custom”.

Darren_Garrison · September 25, 2025, 2:58pm

And a fresh experiment. This is from a few months back, when we were experimenting with “sisters” prompts and did a set using words for Japanese fashion types. I was looking through those for images to upload to see how well Grok interpreted facial expressions/body language in determining the direction of the video output. It occurred to me to try more than one image at a time to see what it would do with the content and the unusual aspect ratio.

So in this instance, at least, it had all three subjects perform a similar action. And it squeezed the aspect ratio a bit from the original 2:1 to approximately 1.8:1.

Darren_Garrison · September 25, 2025, 8:34pm

Showed this to Copilot, only instruction was “Convert this to a realistic photo with no text.”

The only thing it didn’t understand was the drawing of the upside down goose. It replaced the “Gaggle Maps” screen with an actual map route, but I don’t think that hurts the joke.

Darren_Garrison · September 26, 2025, 9:17am

Showed this to Copilot

Told it “Create a realistic photo of the animals that are casting these shadows.”

(It was supposed to be donkeys. I cropped out the US Capitol at the top of the original.)

Darren_Garrison · September 27, 2025, 4:30am

Here’s a prompt that Night Cafe invented for me tonight

Whimsical illustration. A girl with vibrant, flowing hair rides a giant, friendly snail through a fantastical forest filled with glowing mushrooms and oversized flowers. The style is a blend of impressionism and retro-futurism, with a color palette of bright, saturated hues. Soft, dappled sunlight filters through the canopy. A whimsical, dreamlike atmosphere. 10 keywords: vibrant colors, impressionism, retro-futurism, whimsical, fantasy, detailed illustration, magical, enchanted, dreamlike, high detail.

I trimmed it down to this to use in Copilot

A girl with vibrant, flowing hair rides a giant, friendly snail through a fantastical forest filled with iridescent mushrooms and oversized flowers. Soft, dappled sunlight filters through the canopy. Iphone 15 photo with shallow dof and forced perspective. 9:16

And I fed the resulting image to Grok.

This is the automatic video, which is pretty good, other than the shift in colors and progressive loss of details typical of Grok.

As a custom prompt, I tried “The snail rears up and she falls off backwards”. She just begins looking like she is falling off by the end of the clip, but the snail never rears up.

Kling understood the directions and the result is pretty hilarious
