Make it sound right or break immersion

June 2017

If you close your eyes you can still see the world… through your ears. Not only can you separate the barking dog from the light footsteps of a woman, you can also place them in the space surrounding you with surprising accuracy. The soundscape paints a picture for your brain, with you smack bang in the middle of it all. This feeling of presence is the golden promise of Virtual Reality. But it’s a fragile illusion that can easily be broken.

Localization and Spatialization

With our ears we can localize many sounds with a high degree of resolution from all angles.

The human outer ear, the pinna, collects incoming sound. Together with the head, shoulders and torso, it provides filtering cues to our brain about the direction of a sound source. So where localization describes our ability to determine where a sound is coming from, spatialization is the process of synthesizing the same cues in order to «trick» your ear into experiencing a localized sound.

Headphones, head tracking and HRTF (head-related transfer function) filtering tools give the sound designer a workflow for making convincing spatialized audio that tricks your ears into believing a sound is localized. Spatial audio is important in immersive experiences because it can enhance the perception of size, space and distance.
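As a rough illustration of what HRTF filtering does, the sketch below convolves one mono signal with a pair of impulse responses, one per ear. The function names and the impulse response values are assumptions made up for this example; real tools use measured responses for many directions and switch between them as the head moves.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Filter one mono signal into a (left, right) headphone pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Placeholder impulse responses for a source on the listener's right:
# the right ear hears it first and louder, the left ear later and quieter.
# These numbers are invented for illustration, not measured data.
HRIR_RIGHT = [1.0, 0.3, 0.1]
HRIR_LEFT = [0.0, 0.0, 0.5, 0.2]

click = [1.0, 0.0, 0.0, 0.0]  # a single-sample click
left, right = spatialize(click, HRIR_LEFT, HRIR_RIGHT)
```

With a click as input, each ear simply receives its own impulse response, so the left channel starts two samples later and at half the level of the right. That difference in timing and level is the kind of cue the brain reads as direction.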

It can help with orientation and the sense of balance and direction. It can expand the storytelling by putting action in audio where the eyes can’t see. If you hear a branch snap behind you, you turn your head to see what’s there.

Recording and capture of audio for VR and 360° film

In a 360° film production, a 360° camera records the action from a first-person point of view.

It is therefore common, and makes sense, to record audio from that same point of view. Ambisonics is a full-sphere surround technique suited to recording and reproducing audio in full 360° surround.
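To give a feel for what a first-order ambisonics signal contains, here is a minimal sketch that encodes a single mono sample into the four traditional B-format channels (W, X, Y, Z). It assumes the classic convention where W is scaled by 1/√2, azimuth is measured counter-clockwise from straight ahead and elevation upwards from the horizon. The function name is hypothetical, and an ambisonics microphone captures the whole sound field at once rather than encoding one source at a time.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into traditional B-format (W, X, Y, Z).

    Assumed convention: azimuth is counter-clockwise from the front
    (positive = left), elevation is upwards from the horizon.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back
    y = sample * math.sin(az) * math.cos(el)  # left-right
    z = sample * math.sin(el)                 # up-down
    return w, x, y, z


front = encode_first_order(1.0, azimuth_deg=0.0)
left = encode_first_order(1.0, azimuth_deg=90.0)
```

A source straight ahead ends up only in W and X, while a source directly to the left ends up only in W and Y. Because direction is stored this way, the whole sound field can be rotated after recording to follow the listener's head.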

So, that’s it then? Stick an ambisonics mic at the camera position and we’re done?
Well – could be. And in some cases that might cover it.

But just as in regular audio production, building the mix with production dialogue, ADR, foley, ambiences, voiceover, music and sound effects makes for a richer and more detailed experience. And for animation, games and interactive experiences you would of course need to create the world from scratch.

Sound design and post-production

Back in the studio we start working out how best to support the storytelling with all of the audio material we have collected. The sound designer would approach this much as they would a regular movie or game production. But again we have to take into account the head-tracked, immersive character of the VR medium, and decide what we spatialize… or not. So why not just spatialize everything?

Again, you could. There are few strict rules, and there are artistic choices to make.

But consider this: how would you react if you followed a localized sound cue only to find out that it had no visual source? In an experience that mimics how we explore environments in the real world, we should be precise about what we treat as diegetic (sound with a visible source) and non-diegetic sources, and avoid creating unintentional phantom sources.

Spatialization of a narrator’s voice would probably cause confusion, unless the narrator is actually present as a person inside the visual scene. So, music is obviously non-diegetic and should be head-locked and not spatialized, right? Maybe. This is the established way to treat it in regular film and games.

But again there are few rules, and maybe with VR it’s time to experiment with this. Maybe head-locked music has the effect of pulling you out of the immersive experience? In «Farlands» Tom Smurdon experiments with a quadraphonic mix of the music where the four tracks have fixed directionality. The «north» track always plays from the north, regardless of how you turn your head. But to avoid the phantom-source issue the distance is always constant. The listener will never «reach the music».
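The difference between head-locked and world-locked music comes down to one subtraction. The sketch below is not taken from «Farlands»; it only illustrates the idea, assuming four tracks at hypothetical compass bearings of 0°, 90°, 180° and 270° and a head bearing supplied by the head tracker. All names are made up for the example.

```python
# Hypothetical compass bearings for the four music tracks, in degrees
# clockwise from north. Only the "north" track is named in the text.
TRACK_BEARINGS = {"north": 0.0, "east": 90.0, "south": 180.0, "west": 270.0}


def relative_bearing(track_bearing_deg, head_bearing_deg):
    """Direction of a world-locked track as heard by the listener.

    Result is in [-180, 180): 0 is straight ahead, negative is to the
    left, positive is to the right.
    """
    return (track_bearing_deg - head_bearing_deg + 180.0) % 360.0 - 180.0


def world_locked_directions(head_bearing_deg):
    """Where each track should be rendered for the current head bearing."""
    return {
        name: relative_bearing(bearing, head_bearing_deg)
        for name, bearing in TRACK_BEARINGS.items()
    }


facing_east = world_locked_directions(90.0)
```

When the listener faces east, the «north» track is rendered 90° to the left and the «east» track straight ahead. A head-locked mix would skip the subtraction and keep every track at the same position relative to the ears. Only the direction is updated here; the distance is left untouched, which matches the idea that the listener never reaches the music.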

In «Land’s End», Todd Baker blends a head-locked stereo music track with spatialized point sources tuned to a global musical key, effectively blurring the line between the two.

With VR, we are at the starting point of experimenting with, exploring and discovering new storytelling techniques. Audio is a big part of the experience, and when done right it can take the immersive experience to a higher level.