This blog has been edited from its original format; some references have been changed to reflect Ozone 7.
Recorded music in the present day is widely distributed and consumed in a variety of lossy compression formats. The most common compression formats in digital audio are MP3 and AAC. These two codecs represent the necessary compromise between file size/convenience and audio quality to deliver easy-to-download but still acceptably listenable material to an audience.
These formats are likely to be around for some time, particularly given the popularity of iTunes, SoundCloud, and streaming services such as Spotify and Pandora. When digital audio is compressed for delivery on these platforms, artifacts and audio quality tradeoffs are inevitable.
Have no fear, though! It is possible to minimize or even prevent some of these issues from occurring down the line by following some best practices during the audio mastering process.
What is Mastering?
There are many definitions of audio mastering. Most commonly, though, the term mastering is used to refer to the process of taking an audio mix and preparing it for distribution. There are several considerations in this process:
Unifying or adjusting the sound of a record to correct any mix balance issues, or enhance a particular sonic characteristic.
Maintaining consistency across an album so that each track sits comfortably within the overall aesthetic of the playlist.
Preparation for distribution, which could mean traditional duplication or replication onto CD or vinyl, or preparing for digital download, depending on the intended delivery format.
This article takes a look at preparing your audio for digital downloads.
What do MP3 and AAC do to your audio?
Lossy compression refers to a class of data encoding methods that uses inexact approximations (or partial data discarding) for representing the content that has been encoded. (Read the Wiki article for more.) In simple terms, lossy compression formats utilize psychoacoustic models in an attempt to remove the audio information our ears won’t detect as missing.
This often means removing audio information from both the high end and the wider elements of the stereo image. Any lossy encoder also introduces an approximation error, a form of noise that can increase peak levels and cause clipping in an audio signal, even if the uncompressed source audio file appears to peak under 0 dBFS. It can be hard for the average ear to isolate and detect these artifacts, so consider the following audio examples.
NOTE: SoundCloud streams all audio at 128 kbps MP3, which undermines this comparison when the examples are streamed. For a valid comparison, the source files are available for download.
Here’s an uncompressed, 44.1kHz, 16-bit CD quality .WAV file.
Here’s the same .WAV file, compressed to a 320 kbps MP3, the highest bit rate available in the MP3 standard.
Here’s the same MP3 file, but with the ‘side’ channel isolated. These same artifacts were present in the previous example, but are suddenly much more apparent when heard in isolation.
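Isolating the ‘side’ channel is a simple sum-and-difference operation. Here is a minimal Python sketch, assuming the stereo file has already been decoded into two lists of float samples. The function names are illustrative and not taken from any particular tool.

```python
def to_mid_side(left, right):
    """Split a stereo pair into mid (L+R)/2 and side (L-R)/2 channels."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Rebuild left and right from mid and side (inverse of to_mid_side)."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Tiny hand-made stereo signal (hypothetical sample values).
left = [0.5, 0.25, -0.5, 0.0]
right = [0.5, -0.25, -0.25, 0.0]

mid, side = to_mid_side(left, right)
# Wherever left and right are identical, the side channel is silent,
# so anything audible in 'side' is stereo-only content.
print(side)
```

Listening to the side signal on its own removes everything the two channels share, which is why encoder artifacts in the stereo image stand out so clearly there.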
There is wide variety in how different lossy encoders work and in their resulting quality. As usual, use your ears to audition the results and determine the encoder and codec that sound best to you.
An audio mastering engineer can take steps to anticipate and prevent these artifacts, regardless of the encoder being used.
Mastering for iTunes
Since Apple moved to iTunes Plus in 2007, iTunes downloads are 256 kbps AAC.
iTunes uploads require uncompressed, 24-bit .WAV files, which are transcoded to 256 kbps AAC further down the line.
Here are some recommended settings when mastering audio for iTunes:
Use a True Peak limiter, like the Maximizer in Ozone 7, to ensure that the margin is set to –1 dBFS. Apple recommends leaving 1 dB of headroom to prevent any clipping from occurring due to the noise added by the AAC encoder.
Forget about the Loudness War and go easy on compression, limiting, and other dynamics processing. Compress a track to improve its sound quality, not to increase its volume. SoundCheck in iTunes uses an advanced algorithm to determine perceived loudness (not simply the peak/RMS values), level-matches each track to –16 dB, and then adds this volume information to the metadata in the header of each audio file. As a result, a ‘competitive’ track with no dynamic range sounds worse in iTunes when played next to a track with greater dynamic range.
Use iTunes as a tool to compare your masters to reference tracks from other artists that you like the sound of. With SoundCheck enabled, hearing your tracks side-by-side at the same perceived volume as the work of others can help to determine whether your masters will hold up for an iTunes listener. Adjust your tracks based on your listening, then render and submit the full quality uncompressed .WAV or .AIFF masters to iTunes.
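The arithmetic behind these recommendations can be sketched in a few lines of Python. This is only an illustration: the –16 dB target is the figure quoted above, the loudness readings are made-up values, and the helper names are hypothetical. The trim function is a plain gain change, not a limiter.

```python
import math


def dbfs_to_linear(db):
    """Convert a level in dBFS to a linear amplitude (1.0 = full scale)."""
    return 10 ** (db / 20)


def gain_to_ceiling_db(peak_dbfs, ceiling_dbfs=-1.0):
    """Gain change in dB that brings a peak down to the ceiling.

    Never returns a boost: a peak already under the ceiling is left alone.
    """
    return min(0.0, ceiling_dbfs - peak_dbfs)


def normalization_gain_db(measured_loudness_db, target_db=-16.0):
    """Gain a level-matching player would apply to reach the target loudness."""
    return target_db - measured_loudness_db


# A -1 dBFS ceiling corresponds to roughly 89% of full scale.
ceiling_linear = dbfs_to_linear(-1.0)

# Two hypothetical masters: one heavily limited, one more dynamic.
loud_master_gain = normalization_gain_db(-8.0)      # turned down by 8 dB
dynamic_master_gain = normalization_gain_db(-16.0)  # left untouched

print(round(ceiling_linear, 4), loud_master_gain, dynamic_master_gain)
```

Under these assumptions the heavily limited master is simply played back 8 dB quieter, so it gains nothing in loudness and keeps only its reduced dynamic range.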
Mastering for SoundCloud
SoundCloud transcodes uploaded audio to 128 kbps MP3 to prepare the audio to stream from the site. If an audio file on SoundCloud is made available for download, the downloaded version will be in the original format.
Uploading an MP3 is redundant since SoundCloud will transcode it anyway, which could in turn introduce more artifacts to audio that’s already compressed. Therefore, the best practice is to upload an uncompressed, 24-bit .WAV file and allow SoundCloud to process it.
Here are some recommended settings when mastering audio for SoundCloud:
Use a True Peak limiter, like the Maximizer in Ozone, to ensure that the margin is set to –0.3 dBFS. This is an acceptable threshold to mitigate most of the clipping that occurs during the encoding process. However, depending on the source material, you may find a margin of –0.5, –0.7, –1.0, or –1.5 dBFS sounds better, with less distortion. In these cases, you simply have to perform trial and error, perhaps by uploading several versions and deleting all but the best sounding one.
SoundCloud does not have a feature like Apple’s SoundCheck, so an audio master destined for SoundCloud has more freedom to raise the overall RMS level for competitive loudness. Consider this a practical and aesthetic choice. Make sure to use volume matching, such as the ‘automatically match effective gain’ feature in Ozone, to evaluate loudness increases objectively.
Using a stereo imaging tool like the Imager in Ozone, narrow the high end by 5–20%. 128 kbps MP3 is the lowest commonly acceptable audio quality. As such, a lot of information is lost during encoding, and an extremely wide mix is more susceptible to noticeable artifacts. Ironically, some pre-emptive narrowing can help avoid a perceived loss of energy and width.
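To make the idea of narrowing the high end concrete, here is a rough Python sketch. It is not how the Ozone Imager works internally. It assumes a crude one-pole crossover, and the 4 kHz crossover frequency and the function name are hypothetical choices for illustration. The side channel is split into low and high bands, and only the high band is scaled down.

```python
import math


def narrow_high_end(left, right, amount=0.10,
                    crossover_hz=4000.0, sample_rate=44100):
    """Reduce stereo width above crossover_hz by `amount` (0.10 = 10%)."""
    # Coefficient of a one-pole low-pass filter used as a crude crossover.
    alpha = 1.0 - math.exp(-2.0 * math.pi * crossover_hz / sample_rate)
    low = 0.0
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2
        side = (l - r) / 2
        low += alpha * (side - low)         # low band of the side channel
        high = side - low                   # what remains is the high band
        side = low + (1.0 - amount) * high  # attenuate only the high band
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right


# Extreme test signal: pure side content alternating every sample.
wide_left = [1.0 if n % 2 == 0 else -1.0 for n in range(32)]
wide_right = [-x for x in wide_left]
narrow_left, narrow_right = narrow_high_end(wide_left, wide_right, amount=0.20)

# A mono signal has no side content, so it passes through unchanged.
mono = [0.5, -0.25, 0.125, 0.0]
mono_left, mono_right = narrow_high_end(mono, mono, amount=0.20)
```

Because the mid channel is never touched, content shared by both speakers keeps its level. Only the high-frequency difference between the channels is reduced.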
Mastering for YouTube
YouTube transcodes all uploaded video (and the contained audio) in order to offer streaming qualities at 360p, 480p, 720p, 1080p, 1440p (2K) and 2160p (4K). YouTube uses the H.264 video codec with the AAC audio codec. The quality of stereo audio playback depends on the user-selected streaming quality setting as follows:
360p and 480p video will play back audio at 128 kbps
720p, 1080p, 1440p (2K), 2160p (4K) video will play back audio at 384 kbps
YouTube can only down-convert video, so it’s best to upload the highest quality level you can within the H.264 codec. Why not upload a .MOV with uncompressed audio? For best results, YouTube actually recommends uploading media that is already encoded, rather than uploading a .MOV that contains a full quality .WAV file.
Here are some recommended settings when mastering audio for YouTube:
Use a True Peak limiter, such as the Maximizer in Ozone, to ensure that the margin is set to no higher than –1 dBFS.
Not all encoders are created equal. Render from the video editor in full, uncompressed quality for both video and audio, and then audition the audiovisual quality of different media encoders.
The True Peak Limiting option can be enabled right below the Threshold Meter in the Ozone Maximizer.
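A True Peak limiter considers the waveform between the stored samples, not just the sample values themselves. The Python sketch below is a rough illustration of why that matters, not a true-peak meter. It samples the same full-scale sine wave at two rates: once where the samples happen to miss the crests, and once four times more densely. The helper names are hypothetical.

```python
import math


def peak_dbfs(samples):
    """Highest absolute sample value, expressed in dBFS."""
    return 20 * math.log10(max(abs(s) for s in samples))


def sample_sine(freq_ratio, phase, count, oversample=1):
    """Sample sin(2*pi*freq_ratio*n + phase) in steps of 1/oversample.

    freq_ratio is the sine frequency divided by the base sample rate.
    """
    return [math.sin(2 * math.pi * freq_ratio * (k / oversample) + phase)
            for k in range(count * oversample)]


# A full-scale sine at a quarter of the sample rate, offset by 45 degrees,
# so every stored sample lands at about 0.707 of the true crest.
stored = sample_sine(0.25, math.pi / 4, 64)
dense = sample_sine(0.25, math.pi / 4, 64, oversample=4)

stored_peak = peak_dbfs(stored)  # about -3 dBFS
dense_peak = peak_dbfs(dense)    # about 0 dBFS

print(round(stored_peak, 2), round(dense_peak, 2))
```

A meter that reads only the stored samples would report about 3 dB of headroom here, even though the waveform itself reaches full scale.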