AI image generation is getting crazy good

Speaking of dancing and Grok vs Sora, one prompt I tried was “a ballerina playing a concertina in a marina”. In that case, Grok’s output was hugely superior to Sora’s.

I suppose I was lucky to get even that out of Sora; the restrictions are so tight. I tried these song lyrics as a prompt, and Sora rejected them.

It was an itsy, bitsy, teenie, weenie, yellow, polka dot bikini
That she wore for the first time today
An itsy, bitsy, teenie, weenie, yellow, polka dot bikini
So, in the locker, she wanted to stay

Grok did this.

I tried these lyrics, and Sora rejected them.

Well I saw the thing comin’ out of the sky
It had the one long horn, and one big eye
I commenced to shakin’ and I said “Ooh-eee”
It looks like a purple people eater to me

Grok did this

I tried for campers around a campfire and Smokey Bear comes up and yells at them. Rejected. Changed it to a bear in a ranger hat telling them “only you can prevent forest fires”. Rejected. I had to go completely generic to get a result.

A campfire at a summer camp surrounded by campers roasting marshmallows. A brown bear in a ranger hat runs up and curses them out.

Sora did this

Grok did this

I tried a free trial of Veo3 and used it to make some short “photo to video” clips. The results were mixed but I really like this one of my friend dancing with a statue in Carmel.

The statue seems to be enjoying it.

They both seemed to be having fun. Shame about the juxtaposition of the streetlight.

The quality of the image / video generation aside, that was some pretty creative cursing from the Smokey Bear substitute: “Hey! What in the f***in’ pinecones do you think you’re doing? You can’t just roast sticky sugar bombs out here! You tryin’ to start a 4-alarm forest barbecue? Put those f#%@#% sticks down right now!”

Funny that the reaction from the campers is laughter, though. If I was out in the woods and a talking brown bear the size of an RV suddenly came up behind me and started cursing at me to stop what I was doing, I think my reaction would be “yes sir, Mr. Bear, sir!” :laughing:

It is interesting how Sora treats source images. Other AIs will strictly take the image and use it as a start frame for the video. Sora will sometimes do that, too. But other times it will just flash the original up for maybe one frame, then make something different inspired by that image. For instance, I saw a recent photo of a toy. I showed that to Sora with the prompt “He is giving a stand-up routine”. There was a quick flash of the original image, then this.

Well, now I want to know the end of the evaporation joke…

I’m pretty sure it understood what it was looking at and was referring to this:

A depression on the head, called a “dish” (sara), retains water, and if this receptacle is damaged or if its liquid is spilled or dried, a kappa becomes severely weakened.

From

Something I learned about from Usagi Yojimbo a long time ago. :smiling_face_with_sunglasses:

Ohhhh…. I’m familiar with kappas but didn’t catch on from the image.

Two kids telling knock-knock jokes.

Two Horror movie villains telling knock-knock jokes.

I had done horror movie villains at a convention a few days ago. I tried to make the big guy with a cleaver into a character and it told me it may violate their guidelines, even though it had just minutes ago created it.

Horror movie villains sitting in a semicircle of chairs in a room sharing their feelings in a support group.

Whatever censor checks characters for acceptability is way more restrictive than the censors that watch the main video generations. And it’s inconsistent. You should try again, perhaps with a different 3-second segment of your source video.

ETA: Oh, I just watched that video, I see you’re not gonna get a different segment to try. I’ve found for generating characters, it helps to have that purpose in mind when you generate the source video. You can make sure the character speaks (if they speak) a variety of things, and shows all sides of them, closeups, etc. to give the cameo generator as much ‘resolution’ as possible.
And if your character isn’t a realistic human, you might do better to create them in Grok instead. That way you won’t have to waste your limited Sora generations trying to get that perfect vignette that you can only use 3 seconds of anyway.

Well, for that one my prompt was literally nothing more than “horror movie villains at a convention”.

I just now screencapped him from the video and fed it to Grok

Sora, of course, refused to deal with the screencap.

Grok dgaf.

gory stuff you might not want to see

https://youtube.com/shorts/ahpQpcxfZdY?si=l2LGKJhnrjXaymAM

Jesus. Didn’t really wanna be seeing that.

I’ve been playing with making geoduck videos in Sora. More than half the time it censors them after they have been made.

A geoduck driving a Cybertruck. Great geoduck, lousy Cybertruck.

A gooey duck meets a geoduck, prompt only.

A gooey duck meets a geoduck, using a Copilot image.

A cat stalks a geoduck.

A bunch of geoducks on a beach.

Sora is too clever for my own good. I just tried this:

A girl with curly black hair sitting in a small, crude wooden booth. The top of the booth has a sign that says “Psychiatric help 5¢”; at the bottom of the booth is written “the doctor is in”. A middle-aged man is standing beside the booth describing an emotional problem to the attentive girl.

It refused to make the video, telling me “This content may violate our guardrails concerning similarity to third-party content.”

Try presenting the request as a metaphorical poem.