VO Studio Basics: Numbers We Might Need
While it’s important not to get bogged down by numbers and measurements, there are some important ones to know about in your home voiceover studio.
Most voice actors did not pursue VO due to an overwhelming love of numbers. For the most part, we don’t have to deal with a great deal of them in VO (unless you are doing a lot of scientific eLearning work, perhaps). Numbers can be pesky little beasts, but you do have to get them right. They pop up in VO studio workflow, and it can be easy to get them confused. In my classes and when working with clients, I try to untangle things.
Why Use Numbers At All?
We don’t really begin to depend upon specific numbers until audio moves into the digital realm. The audio input signal is “analog” when it is voltage traveling down a wire – as from your microphone through an XLR cable. But when that signal goes into your interface, it changes to the digital 1’s and 0’s that your computer can deal with. When converting from voltage into the digital signals, numbers start to matter.
A Bit About Bits
The first number you probably come across is the “Bit Depth.” This often gets abbreviated to just “bit.” Your interface (or even your USB-direct-connected microphone) contains a converter circuit (likely a single hardware chip) which is commonly 24 bit. Early converters used a 16 bit chip. The difference in Bit Depth is all about accuracy.
When you convert from the continuously variable voltage in the XLR cable to the 1’s and 0’s for your computer, a higher Bit Depth like 24 lets you be more accurate in the conversion. It’s like using a ruler with a finer measuring scale. Rather than seeing 1.23”, you would be able to measure 1.23751”.
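To put the "finer ruler" idea into actual numbers, here is a small sketch. It assumes the usual rule that an N-bit converter can represent 2 to the power of N distinct values; the function name is just something I made up for illustration.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct values an N-bit sample can take (2 ** N)."""
    return 2 ** bit_depth


# Compare the "tick marks on the ruler" for the two common bit depths.
for bits in (16, 24):
    print(f"{bits}-bit: {quantization_levels(bits):,} possible values")
```

Going from 16 to 24 bits does not add a few more tick marks. It multiplies them by 256.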
24 bit converters give plenty of accuracy for what we do. More accuracy – such as a 32 bit converter – doesn’t matter all that much because the digital audio is converted back into analog before we hear it. The signal changes from those 1’s and 0’s back into continuous voltage. That drives our speakers or headphones, effectively “smoothing” the signal. The analog signal fills in the gaps between the data points.
Bit Depth also comes into play when we create an audio file. I usually recommend using a Bit Depth of 24 when creating a new file. That ensures there’s no loss between the precision of the hardware converters and the accuracy of the stored data. With 24 Bit, there’s also a benefit in the theoretical noise floor of the recording – though even 16 Bit is likely below your actual noise floor. More importantly, a Bit Depth of 24 means an improvement in the dynamic range of your recording. Increased dynamic range lets you record your raw audio a bit more conservatively, allowing you to increase volume later if necessary.
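If you are curious how much dynamic range we are talking about, the textbook estimate is roughly 6 dB per bit. The sketch below uses that standard formula (20 times the base-10 log of the number of levels). These are theoretical ceilings, not what your room and microphone will actually deliver, and the function name is my own.

```python
import math


def theoretical_dynamic_range_db(bit_depth: int) -> float:
    """Ideal dynamic range of an N-bit system: 20 * log10(2 ** N)."""
    return 20 * math.log10(2 ** bit_depth)


for bits in (16, 24):
    print(f"{bits}-bit: about {theoretical_dynamic_range_db(bits):.1f} dB")
```

That works out to roughly 96 dB for 16 Bit and 144 dB for 24 Bit, which is where the extra headroom for conservative recording levels comes from.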
What is “Sample Rate” in Recording?
Sample Rate is the number of times per second that we “look” at the incoming audio when converting from analog to digital. If the Bit Depth is the accuracy of our ruler, then the Sample Rate is how often we take the measurement. The two most common Sample Rates we use for VO recording are 44.1 kHz (44,100 Hz) and 48 kHz (48,000 Hz). We basically inherited 44.1 kHz from audio CDs, while 48 kHz lines up nicely with the 24 frames per second rate of most video (48,000 divided by 24 is a whole number).
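Here is the video-frame arithmetic worked out for both Sample Rates. This is only a sketch of the division, assuming a flat 24 frames per second; the helper name is hypothetical.

```python
from fractions import Fraction


def samples_per_frame(sample_rate_hz: int, frames_per_second: int = 24) -> Fraction:
    """Exact number of audio samples that fit in one video frame."""
    return Fraction(sample_rate_hz, frames_per_second)


for rate in (44_100, 48_000):
    spf = samples_per_frame(rate)
    is_whole = spf.denominator == 1
    print(f"{rate:,} Hz: {float(spf):,.1f} samples per frame (whole number: {is_whole})")
```

At 48 kHz each frame holds exactly 2,000 samples. At 44.1 kHz it comes out to 1,837.5, which is why 48 kHz sits more comfortably alongside video.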
That means when converting to digital, your interface is measuring the value of the incoming signal at more than 44,000 times per second, and generating accurate values which then travel down the USB cord to your computer. That’s what your recording software is capturing. Yes, it’s a lot of data.
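To get a feel for "a lot of data," you can multiply it out. This rough sketch assumes uncompressed mono audio and ignores the small header a WAV file carries; the function name is an illustration, not something from any particular recording software.

```python
def uncompressed_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Raw audio data rate: samples per second * bytes per sample * channels."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels


per_second = uncompressed_bytes_per_second(48_000, 24)
per_minute = per_second * 60
print(f"24 Bit / 48 kHz mono: {per_second:,} bytes per second")
print(f"That is about {per_minute / 1_000_000:.2f} MB per minute of recording")
```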
These days, most non-audiobook projects request a Sample Rate of 48 kHz. 44.1 kHz is still more common on ACX or other audiobook recording projects. To eliminate an extra step in my project workflows, I usually use 48 kHz as my project template for all non-audiobook projects. While I would always try to record at the Sample Rate I’ll be delivering in, it is fairly trivial to change between those two if you get in a bind with a client. (In Twisted Wave, you simply go to the “Effects” menu, then “Change Sample Rate” and make sure “Resample” is checked. In RX, there’s a semi-hidden “Resample” tool in the “Modules” menu.)
One more thing about Sample Rates – they determine the theoretical maximum recordable frequency. The highest frequency you can record is half the sample rate. A 48 kHz Sample Rate lets you record a pitch up to 24 kHz, which only dogs and teenagers can hear. But if you accidentally set your Sample Rate at 22,050 Hz (22.05 kHz), then it would cut off all sound above 11,025 Hz, which might be noticeable to most humans.
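The "half the sample rate" rule is easy to check for any setting you might run into. A minimal sketch, with a function name of my own choosing:

```python
def max_recordable_frequency_hz(sample_rate_hz: float) -> float:
    """Theoretical highest frequency a given sample rate can capture (half the rate)."""
    return sample_rate_hz / 2


for rate in (22_050, 44_100, 48_000):
    print(f"{rate:,} Hz sample rate -> up to {max_recordable_frequency_hz(rate):,.0f} Hz")
```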
I Hate To Be Negative, But That’s The Way We Measure Audio
“Wait! You mean minus 6 is louder than minus 12?”
Unless you are using Audacity’s “Linear” scale, you’ve probably seen that the “dB” or Decibel scale we use for recording runs from a large negative number at the bottom (effectively silence) up to a maximum value of 0 dB. The scale is logarithmic rather than linear: every 10 dB step is a tenfold change in signal power, and every 6 dB step roughly doubles or halves the signal level. That’s why input levels react quickly if we change position on the microphone. Microphone placement distance matters – but that’s a bit of a different subject.
Suffice to say that you’ve probably already heard the recommendation to have the peaks of your raw input somewhere in the -12 to -6 dB range – and that -12 is quieter than -6.
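If the negative numbers still feel backwards, it can help to convert them into a fraction of full scale. This sketch assumes the common convention for digital meters, where 0 dB is full scale and the signal level ratio is 10 raised to (dB / 20). The function name is hypothetical.

```python
def db_to_level_ratio(db: float) -> float:
    """Convert a dB reading (0 dB = full scale) into a fraction of full-scale level."""
    return 10 ** (db / 20)


for db in (0, -6, -12):
    print(f"{db:>4} dB is about {db_to_level_ratio(db):.3f} of full scale")
```

So a peak at -6 dB is roughly half of full scale, and a peak at -12 dB is roughly a quarter. Minus 6 really is the louder one.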
There’s One More Bit Rate Hidden Behind The Curtain
Output Bit Rate is different than Sample Rate or Bit Depth
So far, so good. We’ve used an appropriate interface hardware with a good quality 24 Bit converter, then recorded our audio with 24 Bit Depth Accuracy and a 48 kHz Sample Rate into our WAV (or other full-spectrum) audio file. We did all the editing and mastering to bring our deliverable audio to spec. If our end client wants an MP3, we just do a “Save As” or “Export” and we’re done, right?
In fact, MP3 format audio allows you to define an “Output Bit Rate” which is different than any of the other values we’ve discussed. Output Bit Rate determines the accuracy in the encoding of the file and it directly affects the quality of your MP3. (In some software, there is also a separate “Quality” setting in the MP3 options, but that is a different variable).
The first thing to do is turn off the “Variable Bit Rate” option in your recording software – “Variable” lets the encoder raise and lower the Output Bit Rate from moment to moment within the file. You want “Constant Bit Rate”. In Twisted Wave (shown below), you actually uncheck the “Variable Bit Rate” option. In other software, you might have to select “CBR” or “Constant Bit Rate”.
Once you do that, you should have the option to select a specific value for your Output Bit Rate. My default here is 192 kbps (kilobits per second), as that is the ACX-mandated minimum. Most people can hear the difference when quality is increased from 128 (a common application default value) to 192 kbps. Above that value, the benefit is negligible.
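One nice side effect of Constant Bit Rate is that file size becomes simple arithmetic. The sketch below assumes 1 kbps means 1,000 bits per second and ignores any tags or header data the file may carry, so treat the results as estimates. The function name is mine.

```python
def mp3_cbr_size_bytes(bit_rate_kbps: int, duration_seconds: int) -> int:
    """Approximate size of a constant-bit-rate MP3 (audio data only)."""
    bits = bit_rate_kbps * 1000 * duration_seconds
    return bits // 8


for kbps in (128, 192):
    size = mp3_cbr_size_bytes(kbps, 60)
    print(f"{kbps} kbps: about {size / 1_000_000:.2f} MB per minute")
```

At 192 kbps that is about 1.44 MB per finished minute, compared with roughly 8.64 MB per minute for an uncompressed 24 Bit/48 kHz mono WAV.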
The most common confusion I encounter is between initial Bit Depth (where I’d recommend 24 Bit) and Output Bit Rate for MP3 files (where 192 kbps is optimal).
Also – bear in mind that after you have saved and closed a file you can’t go “back” – if you save a 24 Bit/48 kHz file as an 8 Bit/12 kHz file, it will never sound good again – even if you re-saved it into the original format. Once that accuracy and frequency information is lost, it does not come back.
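Here is a toy illustration of why you can't go "back." It only covers the Bit Depth half of the story (not Sample Rate), and it models a sample as a single number between -1 and 1 that gets rounded to the nearest step. The `quantize` helper and the sample value are made up for the demonstration.

```python
def quantize(value: float, bit_depth: int) -> float:
    """Round a sample in the range -1..1 to the nearest step of a signed N-bit grid."""
    steps = 2 ** (bit_depth - 1)
    return round(value * steps) / steps


original = quantize(0.123456789, 24)   # what a 24 Bit file would store
degraded = quantize(original, 8)       # saved down to 8 Bit
restored = quantize(degraded, 24)      # re-saved as 24 Bit

print(f"original: {original:.8f}")
print(f"degraded: {degraded:.8f}")
print(f"restored: {restored:.8f}")
```

The restored value matches the degraded one, not the original. Re-saving at 24 Bit gives you a bigger container, but the detail that was rounded away is gone.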
Set It And Forget It (But Do Check It)
Luckily most of these settings are tenacious – which means they should remember what we set the last time we used the application. Your recording settings of 24 bit/48k should not change between sessions. Nor should your output bit rate reset. But it’s a good plan to just take a moment to confirm those things. An errant mouse click or a computer crash could change things.