Using Google's voice recognition to convert audio files to text

The_Sheikh · December 11, 2014, 7:55am

Google’s API for voice recognition is more accurate than most voice recognition software that I have tried, including Nuance’s Dragon. In addition, it recognizes a multitude of languages.

Is there any way I can use Google’s API to convert audio files into text? I want to upload lengthy audio files and have them converted to text, rather than speak into a microphone.

Reply · December 11, 2014, 8:43am

You could try YouTube’s automatic captions feature.

Reply · December 11, 2014, 8:53am

Or you can try one of the following API demos:

https://gist.github.com/alotaiba/1730160

google_speech2text.md

# Google Speech To Text API
Base URL: https://www.google.com/speech-api/v1/recognize  
It accepts `POST` requests with a voice file encoded in FLAC format, and query parameters for control.

## Query Parameters
`client`  
The client's name you're connecting from. For spoofing purposes, let's use `chromium`

`lang`  
Speech language, for example, `ar-QA` for Qatari Arabic, or `en-US` for U.S. English
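
As a rough illustration of how the base URL and the two query parameters above fit together, here is a minimal Python sketch that only builds the `POST` request and does not send it. The helper name `build_recognize_request`, the `sample_rate` argument, and the `Content-Type` header value are assumptions for illustration; they are not taken from the visible part of the gist, and this endpoint dates from the time of the post and may no longer respond.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as given in the gist excerpt above.
BASE_URL = "https://www.google.com/speech-api/v1/recognize"


def build_recognize_request(flac_bytes, lang="en-US", client="chromium", sample_rate=16000):
    """Build (but do not send) a POST request carrying FLAC audio.

    `client` and `lang` are the query parameters described above.
    The Content-Type value and `sample_rate` are assumptions, not
    something the visible part of the gist confirms.
    """
    query = urlencode({"client": client, "lang": lang})
    return Request(
        url=f"{BASE_URL}?{query}",
        data=flac_bytes,
        headers={"Content-Type": f"audio/x-flac; rate={sample_rate}"},
        method="POST",
    )


# Placeholder bytes stand in for the contents of a real FLAC file.
req = build_recognize_request(b"fLaC", lang="ar-QA")
print(req.get_method(), req.full_url)
```

To actually submit a file, you would read its bytes from disk, pass them in as `flac_bytes`, and hand the resulting request to `urllib.request.urlopen`.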

But even if the underlying engine is great (and it is), you don’t get the full UI of something like Nuance. You can’t easily go back and correct mistakes, customize how it’s trained, etc.