Voice to Text for Meeting Minutes

Bone · July 17, 2019, 8:34pm

I’m looking for a solution that would capture meeting minutes without needing a meeting secretary to record them and type them out manually. Is there any system that folks are aware of that would do this? Think board meetings with large numbers of participants.

I have made an audio recording, played it through my computer, and had Google Docs voice typing transcribe it. However, this doesn’t capture who is actually talking, and it doesn’t handle punctuation very well.

I’m willing to pay if there is a system out there that would do it. Ideally all the speakers have designated microphones and based on whose mic is being used it would attribute that text to that person.
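To picture the per-microphone idea: if each mic were transcribed separately into timestamped segments, attribution would come down to tagging each segment with the mic's owner and merging everything by start time. Here is a minimal sketch in Python, assuming made-up segment data and hypothetical names (`Segment`, `merge_by_mic`, `format_transcript`); it isn't tied to any particular product.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the meeting
    text: str


def merge_by_mic(per_mic):
    """Merge per-microphone transcripts into one speaker-attributed list.

    per_mic maps a speaker name (the owner of a mic) to that mic's segments.
    Returns (start, speaker, text) tuples ordered by start time.
    """
    merged = [
        (seg.start, speaker, seg.text)
        for speaker, segments in per_mic.items()
        for seg in segments
    ]
    # Sort on start time only; sorted() is stable, so ties keep input order.
    return sorted(merged, key=lambda item: item[0])


def format_transcript(merged):
    """Render the merged segments as 'Speaker: text' lines."""
    return "\n".join(f"{speaker}: {text}" for _, speaker, text in merged)


# Hypothetical data: two board members, each on a designated mic.
example = {
    "Chair": [Segment(0.0, "I call the meeting to order."),
              Segment(9.5, "All in favor?")],
    "Treasurer": [Segment(4.2, "I move to approve the budget.")],
}
transcript = format_transcript(merge_by_mic(example))
```

The sketch assumes each mic only picks up its owner; in a real room, mics bleed into each other, so the same words may show up on more than one channel.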

control-z · July 17, 2019, 8:39pm

From everything I’ve seen, no such technology exists. I’ve used Google voice recognition a lot, and its accuracy is mediocre at best. Some of the typos are hilariously wrong, others are subtle. And that’s just my voice, after years of trying. I don’t think there is any way voice recognition is going to accurately recognize what different speakers are saying.

Bone · July 18, 2019, 8:44pm

I’ve been looking at a product called Voicea, but can’t find a lot of user info. Anyone?

Dewey_Finn · July 18, 2019, 9:50pm

Take a look at Otter and Trint, which are designed for transcription, either in real time or with a short turnaround. Both were mentioned in recent New York Times articles in which Times journalists described the technology they use. I really doubt either will identify who is speaking, though.

Pork_Rind · July 18, 2019, 9:54pm

I’ve used Trint a couple of times to capture the conversation from working sessions that I was leading where I couldn’t bring along a note taker. I thought the quality was good, although some sections were difficult to follow where several people were talking at once. I don’t recall it having any way to identify the speaker. I had to go back later and do that by ear.

Tee · July 19, 2019, 12:00am

If recorded, they wouldn’t be minutes, they’d be transcripts. Legal minutes record board actions and not every spoken word.

Bone · July 19, 2019, 7:45am

The purpose is to assist in minute taking. Some minutes can be very short, others much more verbose. To reduce manual typing, a transcription can be a good starting point.

Tee · July 19, 2019, 2:30pm

I don’t blame you for trying to make the process more efficient. When I do it, the job is to condense and summarize the verbosity (hours of it) and use it rather sparingly in describing board actions. This creates a formal document that satisfies legal obligations. A transcript would be a separate document that is not legally required, but which now legally exists, and not everyone wants that. Just a heads up.

Bone · July 19, 2019, 10:01pm

I was looking at Amazon Transcribe, but it exceeds my technical knowledge. It looks like a reasonably priced service, but I don’t know how to actually test it. The documentation talks about development in AWS and about APIs, and I don’t know how to interpret it.

After experimenting with Voicea, I don’t think that would work. I also tried Trint and it seems to work well so that’s good.
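For anyone wanting to try Amazon Transcribe, testing it roughly means uploading the audio to an S3 bucket and starting a transcription job through the API. Below is a rough sketch using the `boto3` Python library, with speaker labels turned on. The job name, bucket, and file name are made-up placeholders, the helper names are hypothetical, and the code assumes an AWS account with credentials already configured.

```python
def build_transcribe_request(job_name, media_uri, max_speakers=10):
    """Build keyword arguments for Amazon Transcribe's StartTranscriptionJob."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},  # audio already uploaded to S3
        "MediaFormat": "mp3",
        "LanguageCode": "en-US",
        "Settings": {
            # Ask the service to label distinct speakers in the output.
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        },
    }


def start_job(request):
    """Submit the job. Requires the boto3 package and AWS credentials."""
    import boto3  # third-party; imported here so the rest runs without it

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)


# Hypothetical job name and bucket; replace with your own.
request = build_transcribe_request(
    "board-meeting-2019-07", "s3://example-bucket/board-meeting.mp3"
)
```

Calling `start_job(request)` would submit the job; the speaker labels come back as generic tags rather than names, so someone still has to map them to people.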

Reply · July 20, 2019, 12:42am

Came in here to suggest Trint. Play with it some; it’s actually very good at what it does. Sure, you’ll have to do a little bit of manual cleanup afterward, but a lot less than typing it all out by hand.

And they do have speaker separation as a feature, but last I tried, it was less than stellar:

https://support.trint.com/hc/en-us/articles/360000235517-Speaker-Separation

Madam_Librarian · July 21, 2019, 12:46am

A good number of voice-to-text programs require one to teach the software to interpret voices, so you’ll have to consider the set-up time involved (this includes all of the speakers providing speaking samples) for the sake of accuracy. Moreover, because it’s often difficult for them to distinguish among multiple voices, there could be several hours of editing the transcript, identifying who said what. And it’s miserable when people talk over each other. Most, like Dragon and Transcribe, are going to present the same types of problems you have with Google Docs.

In the decades I worked in a university oral history program, this was an ongoing discussion. After several attempts to use this type of software, we always decided to return to more traditional transcription from the sound recordings.

don_t_ask · July 21, 2019, 1:54am

Arbie’s dragon fought a porpoise and bin elbow to achieve excellent results - as toucan see here.

Reply · July 24, 2019, 4:31pm

Madam_Librarian:

A good number of voice-to-text programs require one to teach the software to interpret voices, so you’ll have to consider the set-up time involved (this includes all of the speakers providing speaking samples) for the sake of accuracy. Moreover, because it’s often difficult for them to distinguish among multiple voices, there could be several hours of editing the transcript, identifying who said what. And it’s miserable when people talk over each other. Most, like Dragon and Transcribe, are going to present the same types of problems you have with Google Docs.

In the decades I worked in a university oral history program, this was an ongoing discussion. After several attempts to use this type of software, we always decided to return to more traditional transcription from the sound recordings.

Have you tried Trint? I’d be curious as to your thoughts on it, as someone who’s used similar stuff for years.

Machine learning has drastically improved speech recognition in the last 5-10 years, using new technology entirely different from the old Dragons and such. Even speaker separation is making huge strides. This has to do with huge companies like Google and Amazon investing big-time in the tech, powering things like Alexa, Siri, and the Google Assistant, using machine-driven statistical analyses over tens of thousands of hours of recordings (an approach that wasn’t yet quite feasible in decades past).

For example, now YouTube can automatically caption (to maybe 80% accuracy?) uploaded videos in several languages with no prior training from the speaker(s), entirely for free. It’s not as user-friendly as Trint, but is still a very affordable way of getting semi-usable transcripts that you can edit in much less time than manually transcribing from scratch.

Kropotkin · July 25, 2019, 10:24pm

This is an important distinction and it’s not clear from the OP that it has been grasped. Minutes do not, and should not, report who said what on what issue; they record motions, votes, directives to officers, and the like. They don’t include committee reports, speeches, questions, and the random lunacy of meetings. A 3-hour strata council meeting might generate 2 pages of minutes, for example.
