Today I got fooled by an AI video

Another potential issue with AI-generated images and videos is that they could make people doubt the authenticity of real footage, leading to skepticism even when something is genuine. This could also desensitize viewers to urgent situations—like a GoFundMe campaign for someone in need—making it harder for real causes to gain support.

Whales do have the capacity to learn, though.

Interestingly, according to recent news, simple “conversations” have been held with whales, using deep learning to try to decipher their vocalizations. I put “conversation” in scare quotes because I think the researchers basically just said “hello” repeatedly until, 20 minutes later, the whales got bored and went away (like a phone call where both parties are unsure if the other person can hear them :laughing:).

But no doubt it will get better from here. And arguably just saying “hello” might already be sufficient to hypothetically enable interactions like in the OP video.

(This is not me trying to defend my credulity, just an interesting tangent)

“Stupid humans. The only word they know is ‘hello’. It’s a wonder they ever survived coming down from the trees. Just proves it was a bad idea after all.”

I think that’s one of the things that makes it rather dangerous though - the effort required is really low. I think a lot of people have a significantly baked-in subconscious metric for how impressive or complex or large-scale a thing is, and this maps somewhat to their acceptance that it is real.

Right. Whales do need some human help sometimes when they get tangled in nets, etc. I have seen some vids of this occurring.

Right. Whales do need human help with nets and such. But Right Whales (specifically, the North Atlantic right whale, Eubalaena glacialis) absolutely need human intervention, and quickly. They are in serious trouble, with only about 360 individuals left. Without active conservation efforts, they could go extinct within our lifetime.

Most of the upvoters are also bots.


That’s bad.

I see comments like that all the time on Facebook: people claiming that a video is AI when the content is far beyond the state of the art of what AI is able to generate at the moment. That and an utter lack of understanding of the distinction between AI and traditional CGI.

But you get your choice of toppings.

You underestimate the gullibility of large swaths of our population.

And you seem to have misunderstood my comment?

Ah, I see. I thought you were talking about comments like mine on FB, but you were talking about other people’s comments. You are forgiven.

But, people are gullible.

Can I get barnacles on mine?

Have you ever scratched a Dachshund’s belly once? Trick question! Nobody has! If you do it once, you’ll end up doing it forever! They will not let you stop!

Some of the moving pictures in the video linked below are, I think, real-looking enough to fool most casual observers - these are technical demos of a video generator that works from only one still photo of a subject, plus a voice recording, and outputs a fully animated, expressive performance of the person talking. There are still tells in most of the clips - some more obvious than others (the second clip of the boy talking looks like a CGI render, for example, but that might be mostly determined by the qualities of the still subject image that was fed in):

Mind-blowing videos, but it looks like the review was made with clips provided by the company, not made by the reviewer, and the AI isn’t available for mere mortals to use yet. I’m impressed that the model was created from only 18 thousand hours of training video.

https://www.datacamp.com/blog/omnihuman

But your mentioning this reminded me that there was a “lip synch” tab at Kling AI when I was trying it yesterday. I didn’t look into it at the time and hadn’t thought about it again. Looking at it now, it doesn’t directly convert still images into lipsynched videos. Instead, it works on text-to-video or image-to-video clips created in an earlier step, or on clips that you upload. It seems to be limited to 5 to 10 second lipsynch clips. You can either upload a sound clip or generate audio with text-to-speech. I searched through my archives of image-to-video clips I had made in the past to find a suitable candidate and picked one generated by Luma Dream Machine (from a still image created with SDXL). At first I looked around the web for an appropriate sound clip, but then chose to go with a text-to-speech tongue-twister. I actually started to write this reply last night, but the video stayed in Kling’s queue for several hours overnight.

The result isn’t as impressive as the Omnihuman clips, but it is available right now. And pretty cheap, too. Generating a 5-second image-to-video clip costs 20 credits, but lipsynching a 5-second clip costs only 5. I would have experimented more already if I hadn’t spent hours waiting for the first clip.

Meanwhile, I looked back at Luma to see if they happened to have added a lipsynch option. They hadn’t, but there is an option to auto-generate audio that the AI thinks is appropriate for the video. I tested a few of my videos archived there and the results were creepy.

I thought it was interesting that the AI interpreted the movement of a subject in the third clip as a cough.

I think that AI girl (with the inhumanly huge mouth) to the right of vid 2 is trying to swallow my soul. :shudder:

Okay, the queue time is down to minutes. Here’s a try with an audio clip.

The original already had soundless mouth movements, but the AI did a pretty convincing job, I think, given that it wasn’t a photorealistic source in the first place. (I don’t have an immediately handy 5 second clip of a real person talking to test it with.)

I’ve tested Kling AI with a real photo. I didn’t really have an idea what I wanted to go with, so I searched “famous photos” on Google Images until something interested me. I outpainted the original photo to get a 16:9 aspect ratio and converted it into a video clip with Kling. I then tried several audio clips for lipsynching. The audio clips came from scouring the web for sites with movie/TV quote clips. (I didn’t find much.) Then I browsed for female quotes close to but not over 5 seconds. I ended up using two Buffy quotes (one Willow, one Cordelia), a Harry Potter quote (Luna Lovegood), and a Firefly quote (River).

Here are all the results.

Kling isn’t flawless, but it is pretty good.