Voice to Text for Meeting Minutes

I’m looking for a solution that would capture meeting minutes without needing a meeting secretary to record them and manually type them out. Is there any system that folks are aware of that would do this? Think board meetings with large numbers of participants.

I have made an audio recording, played it back through my computer, and had Google Docs voice typing transcribe it. However, this doesn’t capture who is actually talking, and it doesn’t handle punctuation very well.

I’m willing to pay if there is a system out there that would do it. Ideally all the speakers have designated microphones and based on whose mic is being used it would attribute that text to that person.
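To make the one-mic-per-speaker idea concrete, here is a minimal sketch in Python. It assumes each microphone has already been transcribed separately into timestamped segments; the `Segment` type, the `merge_by_microphone` function, and the speaker names are hypothetical and not taken from any particular product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed stretch of speech from a single microphone (hypothetical type)."""
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


def merge_by_microphone(channels: dict[str, list[Segment]]) -> list[tuple[str, str]]:
    """Interleave per-microphone segments into one speaker-attributed transcript.

    `channels` maps a speaker name (the owner of a microphone) to the segments
    transcribed from that microphone. Segments are ordered by start time, and
    consecutive segments from the same speaker are joined into one entry.
    """
    tagged = [
        (seg.start, speaker, seg.text)
        for speaker, segments in channels.items()
        for seg in segments
    ]
    tagged.sort(key=lambda item: item[0])  # order by start time only

    merged: list[tuple[str, str]] = []
    for _, speaker, text in tagged:
        if merged and merged[-1][0] == speaker:
            # Same speaker is still talking: extend the previous entry.
            merged[-1] = (speaker, merged[-1][1] + " " + text)
        else:
            merged.append((speaker, text))
    return merged
```

This only handles attribution. It does nothing about crosstalk picked up by neighboring microphones, which a real setup would also have to deal with.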

From everything I’ve seen, no such technology exists. I’ve used Google voice recognition a lot, and its accuracy is mediocre at best. Some of the typos are hilariously wrong, others are subtle. And that’s just my voice, after years of trying. I don’t think there is any way voice recognition is going to accurately recognize what different speakers are saying.

I’ve been looking at a product called Voicea, but can’t find a lot of user info. Anyone?

Take a look at Otter and Trint, which are designed for transcription, either in real time or shortly afterward. Both were mentioned in recent New York Times articles in which Times journalists described the technology they use. I really doubt either will identify who is speaking, though.

I’ve used Trint a couple of times to capture the conversation from working sessions that I was leading where I couldn’t bring along a note taker. I thought the quality was good, although there were difficult-to-follow sections where several people were talking at once. I don’t recall it having any way to identify the speaker. I had to go back later and do that by ear.

If recorded, they wouldn’t be minutes, they’d be transcripts. Legal minutes record board actions and not every spoken word.

The purpose is to assist in minute taking. Some minutes can be very short, others much more verbose. To reduce manual typing a transcription can be a good starting point.

I don’t blame you for trying to make the process more efficient. When I do it, the job is to condense and summarize hours of verbose discussion and use it rather sparingly in describing board actions. This creates a formal document that satisfies legal obligations. A transcript would be a separate document that is not legally required, but which now legally exists, and not everyone wants that. Just a heads up.

I was looking at Amazon Transcribe, but it exceeds my technical knowledge. It looks like a reasonably priced service, but I don’t know how to actually test it. The documentation talks about development in AWS and APIs, and I don’t know how to interpret it.
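For anyone wondering what "using the API" might look like in practice, here is a rough Python sketch using the boto3 library. It assumes you have an AWS account with credentials configured and have already uploaded the recording to an S3 bucket. The job name, bucket, and file name in the usage note are made up, and the helper names `build_transcribe_request` and `start_job` are hypothetical. The `ShowSpeakerLabels` and `MaxSpeakerLabels` settings ask the service to attempt speaker separation; how well that works on a crowded board meeting is a separate question.

```python
def build_transcribe_request(job_name, media_uri, max_speakers=10, media_format="mp3"):
    """Build the arguments for an Amazon Transcribe job with speaker labels turned on.

    job_name and media_uri are supplied by the caller; the defaults here
    (English, MP3, up to 10 speakers) are assumptions for illustration.
    """
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            "ShowSpeakerLabels": True,         # ask for "who spoke when" labels
            "MaxSpeakerLabels": max_speakers,  # upper bound on distinct speakers
        },
    }


def start_job(request):
    """Submit the job. Needs boto3 installed and AWS credentials configured."""
    import boto3  # imported here so the request builder works without boto3

    client = boto3.client("transcribe")
    return client.start_transcription_job(**request)
```

Usage would be something like `start_job(build_transcribe_request("board-meeting-01", "s3://my-example-bucket/board-meeting.mp3"))`, after which the finished transcript can be retrieved from the AWS console or via the API once the job completes.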

After experimenting with Voicea, I don’t think that would work. I also tried Trint and it seems to work well so that’s good.

Came in here to suggest Trint. Play with it some; it’s actually very good at what it does. Sure, you’ll have to do a little bit of manual cleanup afterward, but a lot less than typing it all out by hand.

And they do have speaker separation as a feature, but last I tried, it was less than stellar:

https://support.trint.com/hc/en-us/articles/360000235517-Speaker-Separation

A good number of voice-to-text programs require one to teach the software to interpret voices, so you’ll have to consider the set-up time involved (this includes all of the speakers providing speaking samples) for the sake of accuracy. Moreover, because it’s often difficult for them to distinguish among multiple voices, there could be several hours of editing the transcript, identifying who said what. And it’s miserable when people talk over each other. Most, like Dragon and Transcribe, are going to present the same types of problems you have with Google Docs.

I worked in a university oral history program for decades, and this was an ongoing discussion. After several attempts to use this type of software, we always decided to return to more traditional transcription from the sound recordings.

Arbie’s dragon fought a porpoise and bin elbow to achieve excellent results - as toucan see here.

Have you tried Trint? I’d be curious as to your thoughts on it, as someone who’s used similar stuff for years.

Machine learning has drastically improved speech recognition in the last 5-10 years, using new technology entirely different from the old Dragons and such. Even speaker separation is making huge strides. This has to do with huge companies like Google and Amazon investing big-time in the tech, powering things like Alexa, Siri, and the Google Assistant, using machine-driven statistical analyses over tens of thousands of hours of recordings (an approach that wasn’t yet quite feasible in decades past).

For example, now YouTube can automatically caption (to maybe 80% accuracy?) uploaded videos in several languages with no prior training from the speaker(s), entirely for free. It’s not as user-friendly as Trint, but is still a very affordable way of getting semi-usable transcripts that you can edit in much less time than manually transcribing from scratch.

This is an important distinction, and it’s not clear from the OP that it has been grasped. Minutes do not, and should not, report who said what on each issue; they record motions, votes, directives to officers, and the like. They don’t include committee reports, speeches, questions, or the random lunacy of meetings. A 3-hour strata council meeting might generate 2 pages of minutes, for example.