OpenAI's latest wonder: text-to-video with Sora

Gah, forgot I was in Cafe Society. :frowning:

Or that the phone isn’t the camera that would have been recording the image. There are half a dozen ways to keep cameras out of reflections, so the footage isn’t necessarily unnatural in that regard.

Gemini 1.5 can do video, and it now has a context window of a million tokens. It can ingest an entire movie and make sense of it. In fact, in one of the demos they gave it an entire silent film, with no text, no subtitles, and no description of the film. The AI was able to accurately summarize each scene and the whole film, describe the themes and plot, etc.
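For the curious, here’s roughly what that kind of video ingestion looks like through Google’s Gemini API in Python. This is a minimal sketch, not the actual demo: the file name, prompt, and polling loop are my own placeholders, and it assumes the public gemini-1.5-pro model.

```python
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumption: you have an API key

# Upload the film via the File API; large videos are processed asynchronously,
# so poll until the file is ready.
video = genai.upload_file(path="silent_film.mp4")  # hypothetical file name
while video.state.name == "PROCESSING":
    time.sleep(10)
    video = genai.get_file(video.name)

# Ask the long-context model to summarize the whole film.
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "Summarize each scene, then describe the film's overall plot and themes."]
)
print(response.text)
```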

Astounding how far video has come in just a few months.

This is the Gemini demo and yeah, it’s quite astonishing. It will probably revolutionize police/security/military video analysis very soon, among other things.

This was state-of-the-art text-to-video in March of 2023. At the time, all the rage was making clips of X eating Y (typified by Will Smith eating spaghetti). These are some of the very few tries I made. (Note that the AI thought all video should have a Shutterstock watermark.)

A cat eating a frog

Taylor Swift eating a cat

Bojack Horseman eating honeydew

Well, that could be Bojack eating a honeydew, filmed with a macro lens…

Yeah, the latest videos are actually production quality for things like commercials and corporate videos. Second unit directors shooting B-roll and commercial directors are probably in trouble.

If you want to try text-to-video that is better than my above examples but not as good as Sora, you get a number of free seconds at Runway.

I used most of my free time on image-to-video, which takes a still image and tries to convert it to video with no descriptive prompt at all. Results can be passable for a moment, but fall apart fast even within the short clip length available. I’d love to see what Sora could do with image-to-video.

Here’s some of my Runway results converting some of my text-to-image images into video.

And here are some using Stable Video Diffusion, which so far does only image-to-video, no text-to-video.
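Since Stable Video Diffusion is an open model, you can also run image-to-video yourself with Hugging Face’s diffusers library. A minimal sketch, assuming the svd-xt checkpoint and a GPU with enough memory; the input image path and seed are placeholders:

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the image-to-video pipeline (no text prompt: SVD conditions only on the image).
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.enable_model_cpu_offload()  # trade speed for lower VRAM use

# Condition on a still image, resized to the model's expected resolution.
image = load_image("my_still.png").resize((1024, 576))  # hypothetical input

generator = torch.manual_seed(42)
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
```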

Some more AI video

Also, this happened recently.

They’re still struggling with hands, but impressive nonetheless.

Awesome

Another day, another mind-blowing AI video demo.

The video embedded in the article:

Ew, no. Those are deep in the uncanny valley. Impressive technically, but not nice to watch.

Same applies to the reference images IMO. Most of them, anyway. Why did the company choose those? Are they supposed to express normality? Or did they choose slightly weird ones to hide the evident weirdness of the generated videos?

Most of those would have fooled me at a casual glance. Hepburn and young DiCaprio were definitely both uncanny valley for me, however. Impressive technology either way.

I don’t think I’ve ever been more impressed with technology in my life. In fact, watching the YouTube videos it creates is causing me to see the real world as if I were watching an AI-created scene. How the fuck does it get the physics 99.9% right?? That’s mind-blowing. I predict that in the not-so-far future, a matter of months, the AI will be able to play our human emotions like a fiddle. And we’ll know it but won’t really be able to do much about it… and then we’ll kind of lose confidence that we know anything is “real”, and realize that we are all inferior to what our computers can create now.

And I’ll just remind everyone here to not put too much stock in released demos. If they made 500 attempts and got one that looked good and 499 eldritch horrors, they’re only going to release the one that looked good, and won’t mention the signal to eldritch ratio.

There’s probably a fair amount of truth to that, and the released demos are undoubtedly the best so far. But it’s also worth noting that the diversity of the different scenes and objects suggests that this isn’t just a one-trick pony, but apparently an extremely versatile tool with a very broad visual knowledge of the world and its physical dynamics.

I also suspect that with videos of weird lengths (27 seconds or 52 seconds, for random made-up examples) instead of the full 1 minute available, they were originally made full length but something went wrong before they finished.

New video generator. Not as good as Sora, and only makes 2 second clips, but it is free to try. Here is my first try, a clip of Bread Climp.