Twisted Wave Speech Recognition to the Rescue – applying tools in the VO studio

Using the optional Speech Recognition tool in Twisted Wave on macOS. The utility identifies the spoken words and transcribes them into a matched text resource that is linked to the audio file. This can help you locate phrases within your recordings.

Speech Recognition is functionality you can add to the core Twisted Wave recording application. Since it requires an additional purchase, and not everyone wants that capability, it’s a tool I touch upon only lightly during my Twisted Wave Deep Dive workshops. I upgraded to Speech Recognition (as well as Twisted Wave’s Video editing tool) back when it first came out.

After the initial setup, Speech Recognition worked reasonably well, though it (like most transcription tools) had a little trouble with certain words and phrases. I could scan the generated text to find particular locations, and highlighting words in the text stayed synced with the recording. However, I hadn’t found a specific production workflow that really demanded its use. Certainly a cool tool, but it tended to stay on the shelf in my studio.

An audiobook production challenge

Recently I faced an audio production challenge where Speech Recognition turned out to be the perfect tool for the task. Being able to reach for precisely the tool I need remains one of the key reasons why I prefer a versatile working environment such as Twisted Wave. The application contains a multitude of useful and precise utilities that are there when you need them. Twisted Wave may seem “simple” due to its refined user interface, but it has deep tools under the hood.

In this case, I had completed a long-form audiobook project. After the book passed into final review, the client realized that they had not specified a slightly unusual pronunciation of a fairly common phrase. Since this phrase was used in their marketing and branding, that minor difference mattered. The phrase appeared heavily throughout one particular section, so every instance in those chapters needed to be replaced. The client had offered to listen through and log all the timestamps, but it struck me that there would be a quicker way to fix things.

Time for Speech Recognition

Speech Recognition has started to appear in a few audio applications over the past couple of years, built upon the macOS system capability to recognize speech. To use the feature, you first have to turn on a utility inside the computer’s System Settings. It can sometimes take 24-48 hours for the language recognition utility to fully download and install (the full file is more than a couple of gigabytes). Turning it on is not as quick as simply downloading a new plug-in.

Luckily, I had gone through the initial setup back when I first added the feature to Twisted Wave. Since macOS already had the speech recognition framework installed, it was a simple matter of choosing “Recognize Speech” from the SPEECH menu in Twisted Wave. My stalwart studio Mac Mini (the last of the Intel-based Mini models) chewed its way through the chapter and generated a visual text track that floated above the audio waveform.

The simple example above shows how the text document synchronizes with the audio file. I had highlighted “the lazy dog” in the text, and Twisted Wave found and highlighted the same portion of the audio waveform. This can speed up finding certain phrases in a longer segment of your project.

A further trick with “Find and Replace”

Once that process had completed, it was simply a matter of saving the text file. The Twisted Wave Speech Recognition tool syncs between the text and the audio. When you highlight a term in the text document, Twisted Wave highlights the audio waveform of that same phrase. A quick insert edit at the highlighted point in the file let me directly record and replace the incorrect pronunciation.

To make sure that I didn’t visually miss that phrase in any of the chapters, I added a simple text manipulation to make the phrase more visible. Since Twisted Wave allows you to save the synced text document, it was fairly trivial to open the text file within Google Docs and use the “Find and Replace…” tool to track down all instances of the phrase. This also gave me a count of how many times the phrase appeared. I replaced the original with an ALLCAPS version that made it stand out when viewing the document. Then, I saved that text document and resynced it with the audio in Twisted Wave.
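For anyone who would rather script that step than open Google Docs, here is a minimal Python sketch of the same find, count, and capitalize pass. It is not part of Twisted Wave, and it assumes the saved transcript is ordinary plain text. The function name highlight_phrase and the sample sentence are made up for illustration.

```python
import re


def highlight_phrase(text: str, phrase: str) -> tuple[str, int]:
    """Uppercase every case-insensitive match of `phrase`.

    Returns the updated text and the number of matches found.
    """
    # Allow any run of whitespace between the words, so a phrase that
    # wraps across a line break in the transcript is still found.
    words = [re.escape(word) for word in phrase.split()]
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    # subn() performs the replacement and reports how many it made.
    return pattern.subn(lambda match: match.group(0).upper(), text)


if __name__ == "__main__":
    # Hypothetical sample text standing in for a saved transcript.
    sample = "The quick brown fox jumps over the lazy dog. Then the lazy dog naps."
    updated, count = highlight_phrase(sample, "the lazy dog")
    print(f"{count} instance(s) found")
    print(updated)
```

Only the letter case changes, so the wording and word order of the transcript stay exactly as they were. The count gives you a checklist number to confirm against once all the replacement takes are recorded.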

At this point, it was a simple matter of scanning through the document for the “ALLCAPS PHRASE”, highlighting it within the text document, and inserting the revised phrase into the highlighted audio. It went smoothly, if somewhat repetitively.

Luckily, this sort of thing is a rare occurrence. It was satisfying to find that, when faced with a production challenge, this sometimes underutilized tool proved to be a robust asset that solved the problem effectively.




2 Responses to “Twisted Wave Speech Recognition to the Rescue – applying tools in the VO studio”

  • A wonderful hack and step-by-step guide to saving a lot of time in this situation. Looking forward to exploring this feature!

    Thanks, Jim!

  • Jim! I have used TW for many years, but recently thought I’d try using this with a rather long, 13-hour book. Thank you times 20 for making this guide…definitely worth a dollar a month added to my subscription…I’m going to loop my editor in on this little utility and see if he can use it in his workflow….again, thanks for making this guide…cheers! Dave
