The Problem
For every podcast, there’s at least one editor. That means roughly 850,000 podcast editors are out there mucking around in the audio space. Anyone who has ever lived with a roommate knows that every person has a different definition of the words “clean” and “loud.” Audio editing is no different.
If I were a hungry junior creative director at an ad agency and I wanted the maximum number of people to be aware of my newest commercial, what’s stopping me from telling my editors to simply crank the volume as loud as it will go? Yes, it may cause some ear and speaker damage, but all press is good press, right?
For troubled suburban mothers, this battle was fought out long ago in what’s come to be called “The Loudness War.” This was a panic perpetuated in the ’90s claiming that as music labels fought for our attention through volume, their ever-increasing loudness would eventually lead to widespread hearing loss for their delicate children. For audio engineers, it posed a different kind of threat: distortion. Classical composers used wide dynamics (the difference between the loudest and quietest parts of the audio) for a reason: high volume does not mean high retention. The louder your audio is, the higher your risk of losing audio depth and the less interesting your audio will sound. Why do horror movies feature very quiet scenes followed by huge orchestral stings? Low lows and high highs create captivating audio experiences.
So what am I, the average podcast producer, supposed to do? Manually increase and decrease the levels of every peak and trough in my waveform? Sort of.
The Technicals
We have standards for this sort of thing, and if you don’t comply, there are machines that will make your audio comply. TV and radio broadcasters are required by law (in the US, the CALM Act) to keep commercials from being dramatically louder than the programs around them. To protect themselves, broadcasters run everything that goes to air through a process of normalization and/or loudness averaging.
That means every sample in a piece of audio is assigned an amplitude value; these values are what we refer to as volume. They’re charted by how high the line goes on a waveform. If the line goes up and out of the waveform, the sample’s amplitude is too great for the machinery to accurately record, and that information gets distorted, or it “clips.”
Since these numbers can all be charted, a computer can calculate an average for all of the samples. That average can take into consideration only the peaks (the loudest parts of the samples), or the root mean square, RMS for short.
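The difference between the two measurements can be sketched in a few lines of Python. This is a simplified illustration, assuming samples are floats normalized to the −1.0..1.0 range (as most DAWs store them internally):

```python
import math

def peak_db(samples):
    """Peak level: the single largest absolute amplitude, in dB."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_db(samples):
    """RMS level: square every sample, average, square-root, in dB."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

# A steady tone and a mostly-quiet signal with brief spikes share
# the same peak level...
steady = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
spiky  = [0.5, 0.01, -0.01, 0.01, -0.01, 0.01]
# ...but their RMS values reveal which one is actually louder overall.
```

Both signals peak at the same level, yet the steady one carries far more energy, which is exactly the distinction peak-only averaging misses.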
Measured by peaks, a very consistently loud song and a much softer song with a few crescendos can register the same average, despite one being objectively louder.
LUFS, or Loudness Units relative to Full Scale, is an attempt to quantify audio with moving averages. In practical terms, it’s a way of bringing humanity back to volume averaging. It reads average loudness over time and as a whole. It still brings your lows up and your highs down when normalizing, but it does so within the context in which those samples appear: the context of the audio as a whole, not just the audio’s highest peaks.
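The “over time” part is the key idea. A real LUFS meter follows the ITU-R BS.1770 standard, which adds K-weighting filters and gating; the sketch below strips all that away and just shows what windowed measurement buys you over a single global average:

```python
import math

def windowed_levels(samples, window):
    """RMS level of each consecutive window, in dB. A crude stand-in
    for the momentary/short-term readings a real LUFS meter provides;
    the actual standard (ITU-R BS.1770) adds K-weighting and gating."""
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in chunk) / window)
        levels.append(20 * math.log10(max(rms, 1e-10)))
    return levels

# A quiet passage followed by a loud one: a single global average would
# blur them together, but windowed readings keep the contrast visible.
quiet_then_loud = [0.05] * 1000 + [0.5] * 1000
```

Because each window is measured in its own context, a processor driven by these readings can treat the quiet passage and the loud passage differently instead of flattening both toward one number.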
When an editor or machine averages a piece of audio, it adjusts sample values up or down to pull them toward the mean, above it, or below it, as directed. The problem is that when a machine makes these adjustments instantly, your dynamics can get lost. To ensure that they don’t get sued, broadcast entities crush your audio and kill its dynamics so that its average loudness fits their constraints. They don’t much mind, because people who listen to TV and radio don’t typically do so with headphones.
The Solution
Podcasts exist in a new (read “unregulated”) space, and that means there isn’t a governing force that requires loudness compliance. This leaves a lot of wiggle room for bad audio. My goals in writing this are twofold. One, to get more people volume averaging, in an attempt to save all of our ears. And two, to help people preserve their dynamics when they do.
First, you’ll need to Google how your DAW handles loudness levels. I edit in Adobe Audition, where there’s a tool called “Match Volume” under the “Window” tab at the top of the screen. In a full-scale waveform readout, 0 dB is the loudest audio can go without clipping. That is why LUFS is measured in negative numbers: it describes how far below that ceiling the audio’s average amplitude is kept, both to avoid clipping and to bring quieter sounds closer to the mean.
The loudness level engineers generally agree on for podcasts is -16 LUFS for stereo audio and -19 LUFS for mono audio. For comparison, TV and radio are constrained to -23 LUFS. This means your podcast has more wiggle room than TV and radio, which brings me to my second point.
The art of audio comes in balancing loudness averaging with dynamics. You want all of the audio you produce to be of the same average LUFS value, but you don’t want to lose the troughs and peaks that make your audio interesting. Here’s how I go about doing this:
Ensure that when you record VO, you’re in a well-insulated space with as little background noise as possible. The more silent it is when you’re not speaking into the mic, the better your “audio floor” (or quietest sounds) will be. That means if your averaging brings your lows up, you won’t get noise as a result of raising the volume.
Adjust your recording volume to be as close to -12 dB as possible. Ideally, this happens in the hardware rather than in post, but you can do it after the fact if you forget. -12 dB is a solid level that gives your average volume room to breathe.
This room to breathe is called “headroom.” You lose audio information as your volume approaches the 0 dB limit. You therefore want to keep your audio a healthy distance from it.
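Headroom is easy to quantify: it’s the gap between your loudest peak and the 0 dB ceiling. A minimal sketch, again assuming samples normalized to −1.0..1.0:

```python
import math

def headroom_db(samples):
    """Distance in dB between the loudest peak and the 0 dB ceiling."""
    peak = max(abs(s) for s in samples)
    return -20 * math.log10(peak)

# A take peaking at one quarter of full scale has roughly 12 dB of
# headroom; a take peaking at full scale has none left at all.
```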
Normalize sparingly. If you do have some audio that peaks a lot, go ahead and normalize it, but do so only when needed. Normalizing crushes your peaks to bring your audio into a defined range; if you’re also going to LUFS-average your audio at the end of your project, you’re crushing it twice.
Run your loudness averaging on your finished audio. When your multitrack is finished, export all of the contained audio onto a single track. Then run LUFS averaging on that single track so it can take into consideration all of your VO, SFX, and music. If your VO is close to -12 dB, you’ve left yourself room to add music on top without clipping when averaging.
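The payoff of that -12 dB cushion can be checked directly. A loudness correction is just a linear scale applied to every sample, so you can verify whether the resulting peaks would exceed full scale. A simplified sketch:

```python
def apply_gain_db(samples, gain_db):
    """Scale every sample by a dB gain and report whether anything
    would exceed full scale (i.e., clip) after the change."""
    factor = 10 ** (gain_db / 20)
    out = [s * factor for s in samples]
    return out, any(abs(s) > 1.0 for s in out)

# VO peaking near -12 dB (about 0.25 of full scale) survives a +4 dB
# loudness correction with headroom to spare; audio already sitting
# near full scale does not.
```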
Edit at a standard volume. It may sound simple, but I’ve seen dozens of editors make this mistake. Set your computer’s output volume to a specific level, and then use that same level whenever you’re editing, so you can truly hear how loud your audio is. I edit everything as loud as my computer will go in my headphones. That way, if something sounds too loud to me, it’ll definitely sound too loud to my listener.
By following the above guide, you’re doing your part to stop the march of terrible audio on the podcasting frontlines. Good luck, soldier.
Sources:
https://theaudacitytopodcast.com/why-and-how-your-podcast-needs-loudness-normalization-tap307/
https://www.sweetwater.com/insync/what-is-lufs-and-why-should-i-care/
https://www.audiodraft.com/blog/audio-levels-101-all-you-need-is-lufs/
https://dynamicrangeday.co.uk/about/
https://productionadvice.co.uk/loudness-means-nothing-on-the-radio/
https://productionadvice.co.uk/online-loudness/
https://en.wikipedia.org/wiki/Commercial_Advertisement_Loudness_Mitigation_Act
https://transom.org/2016/podcasting-basics-part-5-loudness-podcasts-vs-radio/