Open AI's latest wonder: text to video with Sora

The link embedded the video directly instead of giving a link like I expected. Here’s the site

Thanks for posting the free site…

There’s a lot of technology I can’t stand, even music, and can’t listen to “drum machines” et et., but this new AI stuff (starting with vocals) is great for someone who wants to do something who might not have the money and/or patience to deal with people who won’t show up.

Wow, I just used it, but only the free trial… How do you describe what you wanna get? It doesn’t seem to understand what I say, but 4 seconds isn’t much time, although some reactions are very weird and nothing close to what I typed out.

That Haiper is much, much less sophisticated than Sora. (Or Runway, whichever one you tried.)

Awesome.

The newest contender, Dream Machine.

Jesus, many of those are terrifying in the Salvador Dali acid dream kind of way.

Quite possible that within a decade, everyone can be their own private movie director. You don’t need $200 million anymore, you might be able to create a full-length blockbuster film for just $20/month.

I’ve tried Dream Machine now (free accounts get 30 videos per month). Unlike Runway and Stable Diffusion Video, which do not allow text prompts to be included with an image prompt, Dream Machine requires it. Out of 8 videos that I’ve reated from images, there is only one that I consider highly successful. The rest are from simply disappointing (only moving the camera around but not the subject) to weird failures. The one success is a Dall-E created image that I have put through all three engines. Dream Machine has the best results of the three.

And here are two failures.

What I hoped to see was the cat moving some and the thing on the wall wiggling around a little. The prompt with the image was “A cat in a room with an octopus alien behind it.”

What I hoped for was the face on the seashell being animated. (That is a photo of a real helmet shell with a face that was infilled using Stable Diffusion.) The prompt was “A laughing talking seashell monster”.

I love this one. Using this image

And the prompt “An alien and a man posing for a photo at night.”

Got this:

The queue time to create a Luma Dream Machine video is up around 6 hours now. This one submitted last night is my favorite yet, not only cute but also technically very impressive. Notice how it manages to separate the individual whiskers on the cats and hairs on the elephant from the background and move them around mostly realistically.

On an opposite note, there is censorship involved in the system. When I submitted this image (with a prompt of something like "strange biological specimen flopping on a dissecting tray) it was rejected with a message about an unsafe image detected.

An article I read about the engine says that it can create 120 frames in 120 seconds. (Since the videos are 5 seconds that of course means they are at 24 fps.)

And now there is new version of Runway, announced yesterday.

It’s heéeeeeeeere.

It’s still awful at text:

Nevertheless, just about everything else about it is perfect. The talking heads, the scrolling chyron, the hard-to-describe quality of the different video feeds, etc.

It’s just happening to freeze on a panel where there are cars on the sidewalk in odd places & the roadway/sidewalk are blended, but it does look plausible.

Sora is now open as of yesterday, but they’ve suspended account creation because of heavy traffic, even if you already have an OpenAI user ID. But I’m looking forward to playing with it!

Saw this just now. Possibly Sora, possibly one of the others.

Can’t wait to try it. I am a Chat GPT Plus user and I think we get some free credits though it doesn’t seem to be open yet.

I wonder what the first use cases at scale will be. This thing seems perfect for amateur music videos and small music groups will probably use it for that. Shouldn’t be too difficult to integrate real video footage of a group with video generated by AI. Commercials for smaller companies would be another big use case.

Yet another video model: