Monitor Level Calibration: Engineering Secrets

Have you ever worked on mixes for hours only to be surprised when you play them in your car or on a friend’s speakers? Have you ever turned a mix up as loud as you can but it still doesn’t compete with professional CDs? Have you ever listened to your songs only to be annoyed that they’re all different volumes? Does it take you more than four hours to mix a song? If you answered yes to one or more of these questions, you’re not alone. There is a simple explanation for the problems you are experiencing: It has to do with your monitor calibration.

What I am about to reveal is a secret technique that separates the professionals from the hobbyists. It involves setting up your playback system in a specific manner. This practice is not often mentioned on Internet forums and conference panels. Why? Because this way of working is second nature to many engineers, especially those of us who grew up on consoles and analog tape machines. But for those raised on digital audio workstations (DAWs), this is a fact of life that no one has bothered to discuss.

The reason mix engineers must calibrate their playback systems stems from the nature of human hearing. The following will explain this phenomenon, examine how contemporary monitoring got so messed up, provide some steps that you can follow to fix your setup, and delineate the advantages of this approach. Other than the investment of your time (and some of this stuff may require a double-take to absorb), this solution relies on free or low-cost technology to implement at a basic level.

Not All Sounds Are Created Equal (At Least To Humans)

Studies have shown that human hearing is not linear. We do not perceive different frequencies at equal loudness. Midrange signals in the 2 to 4kHz range are the most noticeable, while other frequencies are perceived as quieter at the same level (now you know why the telephone sounds the way it does – those upper mid-range frequencies are most discernible to humans). Scientists reading that statement automatically grasp its meaning and implications. But if you’re scratching your head wondering “what?” it probably means you’re normal! It’s easier to explain with examples.

When listening to music at soft levels, humans are less sensitive to bass frequencies than to midrange and high frequencies. (Ever notice those bass-boost buttons on music players or loudness buttons on old-school stereo receivers? Those circuits raise the bass levels for low-volume listening. Now you know why they were added to so many playback devices.) Conversely, at louder levels, our hearing flattens out, so bass frequencies sound stronger relative to the rest of the spectrum. Back in 1933, researchers Harvey Fletcher and Wilden A. Munson conducted hearing tests on humans. They graphed their results to show sensitivity to frequency by level of playback. Their names got attached to the curves they published. To this day audio engineers refer to the Fletcher-Munson curve when discussing this phenomenon. Technically, the correct term is “equal-loudness contour,” but the core idea is the same.

Okay, you’re probably thinking, but what does this have to do with my recording studio? As it turns out, there is a loudness “sweet spot” where most professional mix engineers work. When listening in this range, engineers are able to make decisions regarding the levels of various mix elements that hold up across many playback environments and at varied levels. As an added bonus, this “sweet spot” happens to be lower than the fatigue threshold, allowing you to work long days without fear of long-term hearing damage. (Fig. 1, below.)

Fig. 1 This graph depicts the SPL required for humans to perceive a signal at a given frequency. For example, the lowest red curve shows that a 20Hz bass signal would have to reach 60dB SPL for a human to hear it, but a 10kHz signal would only need to reach 10dB SPL to be perceived.

Where Did We Go Wrong?

Computer-based audio workstations have given millions access to recording technology that was previously kept in the hands of a few. But that doesn’t mean everyone who buys a DAW automatically knows how to use it. As Steven Slate (maker of Trigger and Steven Slate Drums) notes, “As recording applications become more accessible and more and more people start to record audio, I’d like to make sure that they are aware of how to use their tools in a way that will ensure that they can make music in a professional way.”

When people come to me with these problems, the setup is invariably similar: The person purchases a digital interface, connects the monitor speakers to the main outs (L & R) on the back of the box, and uses the master fader in the DAW to control playback volume. A variation is that the person turns the volume knob on the interface down too low, or simply picks a level that sounds loud enough depending on their mood. Working this way has two major pitfalls. First, there is no standardized playback calibration, meaning all the benefits of mix translation and gain staging are forfeited. But perhaps worse, with no standard level, the engineer will resort to turning up faders to make the song loud relative to that arbitrary volume setting. This usually drives the DAW into distortion; hides clicks, pops, and bad cross-fades; and provides a false sense of how much compression is happening. Remember: the faders exist to blend the audio tracks relative to one another, and to provide the software with guidance regarding the summing of the overall signal. They were never designed to serve as a playback level control for your monitor speakers. Working this way is like hammering a screw into a wall – yes, the hammer will get the job done, but it’s the wrong tool, the results won’t hold up, and you’ll probably have to start over if you want to do it right.

Steps To Calibrate Your System

Calibrating your playback system does not take very long, and does not require expensive gear. You will need: a sound pressure level (SPL) meter, a pink noise audio file, monitors, and a knob that controls the speaker volume. (Pink noise is used because it carries equal energy in every octave of the audio spectrum.) Having earplugs or shooting muffs handy will help keep your ears comfortable during the testing. Some professional monitor controllers (such as the Crane Song Avocet) are marked in dB, which is ideal, but the method below will let you calibrate any monitor controller. Please note: Mastering engineers and those who work with acoustic designers go through a more rigorous and detailed process. But the following steps will help:

Load the stereo pink noise file into your DAW. Route it to play out the main Left and Right speakers with your faders at 0dB (unity gain) and pan pots set to full left for the left channel and full right for the right channel. Mute the right speaker on your monitor controller, or disconnect the cable to the right speaker. Then perform the adjustment described below on the left speaker, and repeat it for the right speaker. If the results for the left and right speakers differ by more than 0.5 to 1dB, consider getting a new monitor controller, as yours is probably not accurate enough for good monitoring judgments.

Before playing the file, turn your monitor volume control down all the way. Position the SPL meter between the speakers, as close as possible to where your head would be if you were mixing. Set the SPL meter to slow (average) or RMS readings. If there is a weighting choice, choose “C,” which is nearly flat across most of the audio band and so suits full-range music. Put your earplugs in, press play, and start to turn up your monitor controller. Keep going until the SPL meter reads 83dB (warning: This will be loud). Using tape, grease pencil, or whatever you want, mark this level on your monitor controller as “0.”
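To see why “C” is the usual pick here, it helps to compare it with the “A” setting found on most meters. The sketch below is not part of the calibration procedure itself; it simply evaluates the standard analog weighting formulas (as published in IEC 61672) at a few frequencies. The function names are my own, chosen for illustration.

```python
import math


def c_weighting_db(freq_hz: float) -> float:
    """C-weighting gain in dB at freq_hz (standard analog formula)."""
    f2 = freq_hz ** 2
    r = (12194.0 ** 2 * f2) / ((f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2))
    return 20.0 * math.log10(r) + 0.06


def a_weighting_db(freq_hz: float) -> float:
    """A-weighting gain in dB at freq_hz (standard analog formula)."""
    f2 = freq_hz ** 2
    r = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(r) + 2.00


if __name__ == "__main__":
    # Print both curves at a few spot frequencies for comparison.
    for f in (31.5, 63.0, 125.0, 1000.0, 8000.0):
        print(f"{f:7.1f} Hz   C: {c_weighting_db(f):6.1f} dB   A: {a_weighting_db(f):6.1f} dB")
```

Both curves sit at roughly 0dB at 1kHz, but A-weighting discounts the low end far more heavily than C-weighting does, so an A-weighted meter would under-read the bass content of a full-range pink noise signal.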

Now proceed to put marks on your volume control every 3dB, turning the control down until the SPL meter reads 3dB less each time. When you’re done, you’ll have marks at 0dB, -3dB, -6dB, and so on down to -18dB, which should be low enough. Now that your monitor control is calibrated, you’ll find that with normal, well-recorded music and your loudspeakers about 6 to 9 feet from your ears, a comfortable monitor gain sits at around -6 to -9dB. The lower you have to place your monitor gain control, the more compressed or squashed the recording must be. In other words, Snoop Dogg really is a hot recording, as you’ll probably have to put your monitor at -15dB or lower to keep your ears from bleeding. This means the peak-to-average ratio of the recording must be very low: the recording is compressed, and all the transients and life have been pulled out of it.
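As a quick cheat sheet for the marking step, here is a minimal sketch of the meter reading to aim for at each mark while the pink noise file is playing. It assumes the SPL meter tracks the knob dB for dB, which should hold in a reasonably quiet room; the function names are made up for this example.

```python
def mark_to_spl(mark_db: float, reference_spl: float = 83.0) -> float:
    """SPL meter reading expected at a given knob mark with the pink noise playing."""
    return reference_spl + mark_db


def mark_to_voltage_ratio(mark_db: float) -> float:
    """Linear signal ratio that corresponds to a knob mark in dB."""
    return 10.0 ** (mark_db / 20.0)


# Marks every 3dB from 0 down to -18, as described in the text.
MARKS_DB = [0, -3, -6, -9, -12, -15, -18]

if __name__ == "__main__":
    for mark in MARKS_DB:
        print(
            f"mark {mark:4d} dB -> aim for {mark_to_spl(mark):5.1f} dB SPL "
            f"(signal ratio {mark_to_voltage_ratio(mark):.3f})"
        )
```

The ratio column is a reminder of how compressed the dB scale is: the -6dB mark already halves the signal voltage, and the -18dB mark passes only about one eighth of it.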

On the other hand, if you have to turn your monitor up to -3dB to make the recording sound loud, the recording itself was likely either mixed low or it has a very high peak-to-average ratio. Then you have to make the decision whether it needs more compression in order to compete with other recordings, or whether you want to be brave because you like the sound quality of a recording with good-sounding transient peaks. Depending on the kind of music, a happy medium will probably be found when the monitor is somewhere between -6 and -12dB. The closer your loudspeakers are to your ears, the more you will have to turn the monitor control down, and you’ll have to learn where the sound quality turns for your particular monitor setup. Transients are a very important part of sound quality and if you pull them out even before the recording is mastered, chances are it’s going to sound spongy, squashed, or even smashed, and that distortion will translate to fuzzy sound, reduced stereo separation, and poor translation to the end medium, whether that be the radio, Internet streaming, iTunes, or CD.
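If you want to put a number on peak-to-average ratio, the following sketch measures it for a list of samples normalized to a full scale of 1.0. The test signals (a pure sine and a crudely clipped copy) are illustrative stand-ins, not real program material, and the helper names are my own.

```python
import math


def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


def rms_db(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def crest_factor_db(samples):
    """Peak-to-average ratio: peak level minus RMS level, in dB."""
    return peak_db(samples) - rms_db(samples)


# Ten full cycles of a sine wave, 100 samples per cycle.
sine = [math.sin(2.0 * math.pi * k / 100.0) for k in range(1000)]

# The same sine with its tops chopped off at half scale, a crude
# stand-in for what heavy limiting does to transients.
clipped = [max(-0.5, min(0.5, s)) for s in sine]

if __name__ == "__main__":
    print(f"sine:    {crest_factor_db(sine):.2f} dB")
    print(f"clipped: {crest_factor_db(clipped):.2f} dB")
```

The clipped signal comes out with a smaller crest factor than the sine, which is the same thing your monitor knob is telling you when a squashed recording forces you down to -15dB.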

Try to work at the recommended monitor gain as much as possible from here on out. If your average SPL on loud passages at your listening position is around 83dB with your monitor gain at around -6 to -9dB, you can be assured you are in the ballpark for producing open-sounding, pleasurable, wide-range recordings that compete with anything that was produced before the loudness race got really bad.

Also, turn down the monitor control further to check that the vocal is not lost in the mix when people listen at lower levels. But get the bass right at your reference position (likely -8 to -9dB) or it will not translate to higher or lower levels because of that Fletcher-Munson effect. Turning up the controller to -5dB is a great way to “get with the bass,” make sure there is chest-moving impact, and verify that the mix is not distorting or breaking up and that the bass is not too fat or muddy. (Fig. 2, below.)

Once you have marked the levels on your volume knob, take off the ear protection. Pull up an excellent reference recording, one with good headroom and transient response. These are getting hard to find in the days of the loudness race. I’ll bet it sounds great at around -8 or -9dB. Now play a current overcompressed recording and try to play it at -8 or -9dB. I’ll bet it sounds distorted, and very fatiguing. Here is a gestalt moment. When you find a recording that forces you to turn the monitor knob super low to keep it from sounding too loud, you’re lying to yourself about what is really going into the mix.

Now that your system is calibrated, you’ll begin to intuitively know your mixing levels. It’s even possible to mix blind, without using a meter, if you set the knob to 0dB and just mix. This allows you the freedom to leave the meters and just use your ears because the meters will never overload. Just as guitarists hear out-of-pitch strings and drummers hear wonky drumheads, your ear will start to discern these differences. As Bob Katz explains in his book Mastering Audio, “When monitor gain is calibrated so average SPL is 83dB at -20 dBFS, and you then mix by the loudness of the monitor, then the music will never overload and you will never have to look at a record-level meter!” Yes, to compete with latter-day masters, the recording will have to be mastered later, but please don’t try to master and mix at the same time. Since distortion accumulates, the better the mix you make, the better the master that the mastering engineer can make from it. (Fig. 3, below.)
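The arithmetic behind that calibration point can be sketched in a few lines. This is a rough model only: it assumes program material behaves like the calibration noise as far as the meter is concerned, and it ignores room effects and the level gained when both speakers play together. The function names are hypothetical.

```python
def estimated_spl(program_rms_dbfs: float,
                  monitor_mark_db: float = 0.0,
                  reference_spl: float = 83.0,
                  reference_dbfs: float = -20.0) -> float:
    """Rough SPL estimate for program material on a calibrated system."""
    return reference_spl + (program_rms_dbfs - reference_dbfs) + monitor_mark_db


def mark_for_target_spl(program_rms_dbfs: float,
                        target_spl: float = 83.0,
                        reference_spl: float = 83.0,
                        reference_dbfs: float = -20.0) -> float:
    """Knob mark needed to bring the program to the target SPL."""
    return target_spl - reference_spl - (program_rms_dbfs - reference_dbfs)


if __name__ == "__main__":
    # The calibration noise itself, knob at the 0 mark.
    print(estimated_spl(-20.0))
    # Loud passages averaging -14 dBFS RMS, knob at the -6 mark.
    print(estimated_spl(-14.0, monitor_mark_db=-6.0))
    # A squashed master averaging -8 dBFS RMS needs a much lower mark.
    print(mark_for_target_spl(-8.0))
```

Read it the other way around and you get the mixing rule: with the knob at 0dB, a mix that sounds comfortably loud is averaging around -20 dBFS, which leaves about 20dB above the average for peaks before the meters can overload.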

Advantages Of Professional Calibration

If all this seems like too much trouble, or not worth the effort, please consider why we go through these steps. There are many advantages to mixing at a calibrated level. Most importantly, mixes will translate better across playback systems. No one wants a mix that sounds good in the club but terrible on iPods. As you need fewer revisions to dial in the final version, you’ll get better as a mixing engineer, and have happier clients. You will be able to do more projects in less time, which is good for the wallet. When dealing with fellow engineers, your mixes will travel to other studios with fewer surprises, and you will be less likely to have the mastering engineer calling to ask for a remix, adjustment, or edit.

Speaking of mastering, if you provide the mastering engineer with a mix whose average RMS level falls between -18dBFS and -12dBFS, he or she will be able to spend their time polishing the mixes. If the mix was monitored too low, the mastering engineer will need to compensate for the equal-loudness equalization changes that come with adding enough gain to be as loud as contemporary recordings. If the mix is too hot, the mastering engineer will not be able to use his or her gear (such as de-essers, compressors, and limiters) in the most transparent and artistically pleasing way, and the distortion will accumulate and produce a poor master when it hits the radio or iTunes, especially iTunes, since AAC is not a medium that likes hot levels. Give the mastering engineer the headroom your recording needs for his or her equipment to do a good job. These things save time, and advance your reputation as a professional, or just make your own recordings sound much better.
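As a sanity check before sending a mix out, the delivery window above can be written as a tiny helper. This is only a sketch: the function name and labels are made up, and the RMS figure is assumed to come from whatever meter you already trust.

```python
def classify_mix_level(rms_dbfs: float,
                       low: float = -18.0,
                       high: float = -12.0) -> str:
    """Label a mix's average RMS level against the -18 to -12 dBFS window."""
    if rms_dbfs < low:
        return "low"
    if rms_dbfs > high:
        return "hot"
    return "in range"


if __name__ == "__main__":
    for level in (-22.0, -15.0, -8.0):
        print(f"{level:6.1f} dBFS RMS -> {classify_mix_level(level)}")
```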

Finally, a calibrated playback environment can help protect your hearing, allow you to work for longer hours, and decrease fatigue. While many things in music have wiggle room for taste, aesthetic, and vision, the correct way to set up your monitoring environment is not one of these areas. There is a correct and incorrect way to do things. Do it the right way. 

Supplies

SPL Meters are available as standalone hardware, or for iPhone and Android phones
• Radio Shack has a great one for $49 (Model: 2055)
• iPhone users can get studiosixdigital.com/spl_meter.html
• Android users can get Sound Meter from the Google Play Store

Pink Noise Signal

• You need a pink noise signal calibrated to exactly -20 dBFS RMS level. I suggest you download a test file from: digido.com. Register (it’s free), and go to the downloads section.
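If you are curious what “-20 dBFS RMS” means in numbers, here is a minimal sketch that measures the RMS level of a block of samples and scales it to a target. Two assumptions to flag: the noise generated below is plain white noise used as a stand-in (it is not pink, so do not calibrate with it), and the RMS figure uses the strict mathematical definition, in which a full-scale sine wave reads about -3dBFS. Some meters follow the convention where a full-scale sine reads 0dBFS RMS, so their readings will be about 3dB higher; check which one your meter and test file use.

```python
import math
import random


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0), strict mathematical definition."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square)


def scale_to_rms(samples, target_dbfs=-20.0):
    """Return a copy of samples with gain applied so its RMS level hits target_dbfs."""
    gain_db = target_dbfs - rms_dbfs(samples)
    gain = 10.0 ** (gain_db / 20.0)
    return [s * gain for s in samples]


# One second of white noise at 48kHz as a stand-in signal (NOT pink noise).
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(48000)]
calibrated = scale_to_rms(noise, target_dbfs=-20.0)

if __name__ == "__main__":
    print(f"before: {rms_dbfs(noise):.2f} dBFS RMS")
    print(f"after:  {rms_dbfs(calibrated):.2f} dBFS RMS")
```

For the actual calibration, stick with a known-good test file rather than a homemade one, since the whole point is a trustworthy reference level.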

Monitor Controller

• For the time being, the built-in knob on your interface can work. Better fidelity, more features, and ergonomics can be had with the Kush Audio Main Gain: store.wavedistribution.com/kush-audio/main-gain.html
• More features for larger setups in an analog system: dangerousmusic.com/stsr.html
• If your budget permits, an industry flagship is the Crane Song Avocet: cranesong.com/avocet.html

Glossary

Crest Factor: the difference between the average level of a recording and its peak level. Also look for the newer term coined by the PLOUD group at the EBU, “PLR,” short for “Peak to Loudness Ratio.”

Sound Pressure Level (SPL): a measure, in decibels, of sound pressure relative to a reference value (20 micropascals in air). Sound pressure or acoustic pressure is the local pressure deviation from the ambient (average, or equilibrium) atmospheric pressure caused by a sound wave.

Level: a measure of the magnitude of a sound or signal. However, it must accompany another defining term; otherwise it doesn’t mean anything (e.g., Voltage Level, Sound Pressure Level).

RMS Level (Root Mean Square Level): the average level of program material. Humans tend to hear things on average levels versus peak levels.

Peak Level: the maximum level of signal in a sound program. In digital audio, this peak level is usually noted in relation to how far it is from the maximum ceiling of 0 dBFS. For example, a -2 dBFS peak is 2 decibels lower than the maximum.

dB: decibel, a relative measure that expresses a level as a ratio to a reference.

dBFS: decibels relative to full scale in a digital PCM system. 0 dBFS is the maximum level a digital recorder can encode or play back.

Fletcher-Munson Curves: often used to mean equal-loudness contours. Named after the researchers who first widely published research indicating human hearing is not linear with respect to frequency.

Volume: a term invented by marketing people for the name of the knob that makes things louder and softer. It has no precise definition in audio engineering.

For More Information

Katz, Bob. Mastering Audio: The Art And The Science, Second Edition.

Although the title suggests this is a book for mastering engineers only, over half of the text is devoted to digital audio fundamentals. I consider this a must-have reference book for anyone running a digital audio workstation. This is a reference that should come with every new DAW.

Brixen, Eddy. Audio Metering: Measurements, Standards And Practice, Second Edition.

For a more detailed look at measurements, applications, and use across industries, Brixen’s book is considered one of the best reference books by broadcasters around the globe.