With all the exciting developments around Spatial Audio recently, Abbey Road's Head of Audio Products, Mirek Stiles, decided it was time to sit down with an artist and talk frankly about the potential of Spatial Audio and what the technology means to them from a creative perspective.
So we turned to Abbey Road regular, Stephen Barton, who himself has carried out extensive work looking into applications of spatial music, utilizing high/ultra-high order ambisonics and object-based audio for film, non-VR games, and VR/AR/XR.
Stephen's extensive credits as a composer for film, television and games include scores to Titanfall, Titanfall 2, Call of Duty: Modern Warfare, 12 Monkeys, Unlocked, Cirque du Soleil 3D: Worlds Away, Jennifer’s Body, and many others. His current focus is primarily on spatial composition and mixing, writing immersive music for both Atmos and other spatial formats, and exploring how spatial audio can enhance both movies and television/streamed content as well as recorded and broadcast music.
Mirek - From an artist's point of view, what does Spatial Audio mean to you?
Stephen - The word “revolutionary” is thrown about very frequently in audio, and often without much justification - but it is justified here. When spatial audio is done well, it’s like the walls of the listening space, or the headphones, drop away completely - and the emotional effect of that is absurdly powerful. We all understand space in music, but we know it largely subconsciously, as one component of why most music is more powerful live, for example. As a composer and music producer I can transport the listener not just to hearing a performance, but to hearing a place and a time, be it a realistic acoustic space…or something much much more than that. The emotional power of that for both recorded music and live broadcast is impossible to describe.
Mirek - There are a lot of “buzz words” doing the rounds at the moment: Immersive audio, 3D sound, Binaural, Dolby Atmos, Object Based Audio, 360-sound and Spatial Audio – what are your thoughts on how we avoid potential consumer confusion?
Stephen - There’s a push to avoid calling it 3D Audio which I think is smart, because it doesn’t describe it all that well to the consumer, in truth. Whilst I also love what Dolby do, and I think they play a huge role in the future of this (particularly as regards movie theaters and that content reaching home entertainment systems, plus content streaming), I think the more that it can be a market with easy access both for producers and consumers the better. It’s no good if only 10 people can make music in this way and 1000 people can play it back. The good news is the tools and the tech already exist; what is missing right now is the seminal content, but that’s coming rapidly. Until last week I hadn’t heard a truly great pop/rock mix in a spatial format, one that made you never want to drop back to stereo again…but I have now. If everybody can hear that they’ll get the point instantly. The question is how that happens. I think if by and large we stick to “spatial” and “immersive” audio, and then are able to show people with sound rather than explain with words, we’ll actually get past that problem. The issue is that for a while now we’ve had to explain this more in concept than in practice, which is going to gradually change.
Mirek - You have done a few SA experiential recordings in Studio One to date – what has been the biggest surprise so far?
Stephen - The biggest surprises have been those moments where you’re listening back with everything balanced and mapped out correctly and you get the out-of-body moment where what you are hearing and what you are seeing no longer match. Close your eyes and you’re hearing something that feels borderline indistinguishable from standing in the space - although we’re striving for something bigger than reality the same way we do with stereo and 5.1 recording now.
Other surprises have included having to think about 3D layout when recording - high frequencies in particular, if you have them off to one side, will be very noticeable; not necessarily bad but you have to think about what you’re after and be prepared to throw out traditional rules of thumb and layouts. It’s challenging years of convention but you have to be guided entirely by the music and if that means changing the layout of something fairly fixed, like a symphony orchestra, you just have to go for it. Why shouldn’t the lead vocal come from above your head or the kick drum from behind you? There’s no rule that says it shouldn’t…if it works for the track.
Mirek - What challenges are facing Spatial Audio?
Stephen - The biggest challenge is uptake. Predominantly, we have to have a way for people to hear this that requires little to no setup or cost, and is foolproof. It has to be streamable and broadcast-able. Binaural technology is very, very nearly there, but still missing a few key pieces. Soundbars improve on a daily basis. There are so many companies working on this I have no doubt it’ll be cracked, which is why it’s important the content is there for when it is. I have no doubt immersive audio will be a key selling point of everything from cars to mobile phones within the next 2 years.
The funny thing is from a consumer standpoint, it’s one of those things that people notice most when it’s taken away…play someone a stereo mix and then an immersive mix and they’ll notice the difference to an extent - but drop back from immersive to stereo and it’s like you just turned the Wi-Fi off. Everybody wants the immersive version back.
Mirek - Do you have any advice to other artists and producers looking to explore SA techniques?
Stephen - Have absolutely no fear of it - lots of the tools are, once you get into them, easier than traditional techniques. Surround mixing is seen as immensely scary, but lots of the spatial tools simply have a ball on a window that you move around. You place things where you want them. Even Atmos sounds scary to mix in but totally isn’t. Want something above you to the left? Pick it up and put it there. It’s that simple. My other big bit of advice would be to avoid being too literal, especially for work in binaural, where exaggeration really helps. It’s no different than what we do in a normal mix now - an amped up guitar is infinitely louder than a vocal, but we aren’t literal in that way when mixing - we go by our ears to create a mix, and the same is true for spatial audio.
Lastly, nothing should be spatial for spatialisation’s sake, like the early goofy 3D video experiments where the monster jumped off the screen at the audience largely because they needed a way to show it off to justify the ticket price. People tire of that after one or two times, but they never tire of something that transports them to another place entirely.
Mirek - What do you think SA has in store for the future of music?
Stephen - I think spatial audio literally is the future of music! I think within a couple of years if you’re stuck in a traffic jam on the M4 or the 405 or wherever, you’ll be able to turn on a radio broadcast and be transported sonically to a concert, to a news report, to another place. If you’re working out, you’ll have a full 360° audio experience over earbuds.
Your phone will project 3D audio of live concerts. VR and especially AR and MR/XR will rely on it and instead of music videos, we’ll have music experiences once the visual side of the technology improves and the headsets get practical. I’d be surprised if stereo goes away; it’ll be a slow transition, but it could easily end up being the equivalent of what mono is now. Nothing has displaced stereo in the last sixty years because nothing really offered that much more in terms of experience. This does.