SPATIAL AUDIO CODECS OVERVIEW
Spatial audio in various forms has been around for decades, and Dolby Atmos had already become synonymous with cinema and movie soundtracks. It wasn't until late 2020, however, when Apple announced its "Spatial Audio" branding of Dolby Atmos, that spatial audio and the other codecs listed below began to be rapidly adopted across the wider audio industry.
There has simultaneously been an increase in hype, confusion and misinformation about this exciting audio revolution.
As we are developing codec-agnostic hardware post-production platforms, we felt it was our responsibility to share unbiased facts about these codecs, to help the industry choose which system best meets its workflow needs and commercial objectives. We will therefore continue to grow this section of our website.
How is Spatial Audio different?
Although there is a plethora of audio systems (mono, stereo, 4-track, Dolby Surround, DTS, etc.), the primary systems can be summarized as follows:
- Stereo Sound is a two-channel audio system where sound is divided into left and right channels. It gives a sense of width but doesn't provide much sense of depth or any height.
- 5.1 (or 7.1) Surround Sound involves multiple speakers placed around the listener. It includes front, rear (and, in the case of 7.1, side) channels, as well as a subwoofer for bass. It provides a sense of width and depth but doesn't incorporate height channels.
- Spatial Audio refers to techniques that create a three-dimensional sound field, providing a sense of width, depth, and height. It uses discrete channels (as stereo, 5.1 and 7.1 do). It can make sound feel as if it is coming from all around you, including above and below.
- Object-Based Audio goes a step further by treating individual sounds as "objects" that can be positioned anywhere in a 3D space. The position of each object is defined within a given space, and the codec renders the sound environment in real time depending on how many speakers are defined for that environment, e.g. 5.1.2, 7.1.4 or 9.1.6 (with the last digit being the number of height channels). This enables the intent of the sound designer to be accurately reproduced no matter the size of the room or the system configuration.
- Binaural Audio is not a "codec" but a method of using the phase/timing of audio signals sent to headphones to create a 3D spatial illusion.
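The speaker-layout labels mentioned under Object-Based Audio (5.1.2, 7.1.4, 9.1.6) can be read mechanically. The following is a minimal sketch only, assuming the common reading of the three positions as ear-level speakers, subwoofer (LFE) channels and height speakers; the names `SpeakerLayout` and `parse_layout` are hypothetical and are not part of any codec's tooling.

```python
from typing import NamedTuple


class SpeakerLayout(NamedTuple):
    """Illustrative container for a layout label such as '7.1.4'."""
    ear_level: int  # speakers around the listener at ear height
    lfe: int        # subwoofer / low-frequency channels
    height: int     # overhead or height speakers (0 if the label has none)

    @property
    def total(self) -> int:
        # Total number of speaker feeds implied by the label.
        return self.ear_level + self.lfe + self.height


def parse_layout(label: str) -> SpeakerLayout:
    """Parse 'A.B' or 'A.B.C' into a SpeakerLayout (assumed convention)."""
    parts = label.strip().split(".")
    if len(parts) not in (2, 3) or not all(p.isdigit() for p in parts):
        raise ValueError(f"unrecognised layout label: {label!r}")
    numbers = [int(p) for p in parts]
    if len(numbers) == 2:
        numbers.append(0)  # e.g. '5.1' carries no height channels
    return SpeakerLayout(*numbers)


if __name__ == "__main__":
    for label in ("5.1", "5.1.2", "7.1.4", "9.1.6"):
        layout = parse_layout(label)
        print(label, layout, "total feeds:", layout.total)
```

Read this way, a plain 5.1 label has zero height speakers, which is why the surround formats above are described as lacking height.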
The Main Codecs in Spatial Audio:
- Dolby Atmos
- DTS:X (incl. IMAX Enhanced)
- Auro-3D
- MPEG-H 3D Audio
- Sony 360 Reality Audio
- Spatial Inc.
- Ambisonics
Key differences between the codecs:
- Object-Based vs Channel-Based: Dolby Atmos, DTS:X, MPEG-H 3D Audio, and Sony 360 Reality Audio are object-based, meaning they can position sound anywhere in a 3D space. Auro-3D and Ambisonics are channel-based, where sound is mixed into specific channels or layers.
- Height Information: Dolby Atmos, DTS:X, Auro-3D, and MPEG-H 3D Audio all support overhead sound natively. Sony 360 Reality Audio can reproduce a sense of height through headphones using HRTF (Head-Related Transfer Function) techniques. Ambisonics requires additional processing to reproduce height information.
- Flexibility: Dolby Atmos, DTS:X, MPEG-H 3D Audio, and Sony 360 Reality Audio can create an immersive audio experience regardless of the number or placement of speakers, as long as the playback system supports these formats. Auro-3D requires a specific speaker setup to perform optimally.
- Ideal Application: Each codec has its own specialty. Dolby Atmos and DTS:X are popular in home theaters and cinemas. Auro-3D is also used in cinemas. Sony 360 Reality Audio is optimized for music, creating a "live performance" feeling. Ambisonics is commonly used for VR and 360-degree videos, and MPEG-H 3D Audio is widely used in broadcasting.
- Maximum Channels: Dolby Atmos supports up to 128 audio tracks (or objects) in cinema, DTS:X supports up to 32 discrete channels, Auro-3D supports up to 13.1 channels, Sony 360 Reality Audio's max channels are not specified (as it's more about object positions in the spherical sound field), Ambisonics depends on the order (First order has 4 channels, second order has 9, etc.), and MPEG-H 3D Audio supports up to 64 discrete channels.
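The Ambisonics channel counts quoted above (4 for first order, 9 for second) follow the pattern (order + 1) squared for full-sphere Ambisonics. A small sketch of that arithmetic, with `ambisonic_channels` being a hypothetical helper name used for illustration only:

```python
def ambisonic_channels(order: int) -> int:
    """Channel count for full-sphere Ambisonics of a given order: (order + 1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


if __name__ == "__main__":
    # Print the channel count for the first few orders.
    for order in range(1, 4):
        print(f"order {order}: {ambisonic_channels(order)} channels")
```

By the same formula, third order needs 16 channels, so the channel count grows quickly as the order (and therefore the spatial resolution) increases.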