Is there a way for a program to identify what is being said (transcribe conversation into text)? For example, I want to know if there is swearing in a video. Can I enter keywords for words I don’t want to hear and have a program scan the video for them?
This technology exists. For example, YouTube uses it to automatically generate closed captions on some videos. From what I’ve seen, it doesn’t work especially well. I don’t know of any version for the home user.
With respect to “it doesn’t work especially well”: YouTube autocaption works hilariously well. For humorous effect*, anyway. For serious purposes, not so much.
*My cite is Googling “Youtube automatic caption fail”.
Even voice command recognition is still shaky. With dialogue from a video, it is going to miss the words you don’t want to hear and remove words you do want to hear.
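For what it’s worth, once you have a transcript (say, auto-generated captions), the keyword-scanning half of the question is the easy part. Here is a rough sketch in Python, assuming the captions are already available as (start time in seconds, text) pairs. That format, the function name and the sample lines are all made up for illustration, and any mistakes in the transcript carry straight through to the results.

```python
import re


def find_flagged_words(captions, blocked_words):
    """Return (start_seconds, word) for each blocked word found in the captions.

    captions: iterable of (start_seconds, text) pairs (an assumed format).
    blocked_words: the keywords you don't want to hear.
    """
    if not blocked_words:
        return []
    # Whole-word, case-insensitive match, so "hell" does not flag "Hello".
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in blocked_words) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for start_seconds, text in captions:
        for match in pattern.finditer(text):
            hits.append((start_seconds, match.group(1).lower()))
    return hits


# Made-up sample captions, standing in for a real transcript.
sample_captions = [
    (12.0, "Well, darn it, that hurt."),
    (47.5, "Hello and welcome, everyone."),
    (63.2, "What the heck? Darn!"),
]

print(find_flagged_words(sample_captions, ["darn", "heck", "hell"]))
# → [(12.0, 'darn'), (63.2, 'heck'), (63.2, 'darn')]
```

The whole-word match is there to cut down on false alarms from the scanning step. It does nothing about words the transcription itself got wrong.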
As has been said, there is no consumer version of this available (yet), but speech recognition is coming on strong. With Google, Apple and Microsoft each investing billions into this technology, you can expect it to keep improving.
There is no question the Star Trek version of computing is coming. In it, computers are ubiquitous and unseen, and to “use” one, you just talk. It either talks back or displays whatever you need on whatever screen you happen to be close to.
I thought the OP title was referring to lip reading from just video (no sound). IOW, what HAL did in 2001. I’ve never heard of or seen anything like that yet (and here it is almost 2015!)