We’re offering concrete, directional ideas for the future of immersion in music creation: we need to move away from traditional channel-based thinking and toward object-based thinking.
First, let’s define the problem spaces we’re talking about:
- Immersive music creation
- Immersive audio experiences for cinematic and episodic TV
- Immersive sound experiences for theme parks, rides and events
- Live events – sports, shows, performances
- “GarageBand” – immersive capture of rehearsals
- User-generated content, e.g. VR videos on your phone
- Performer / music creator is inside a virtual space, creating music
- Would artists create in a real environment or virtual environment?
- If we’re thinking 5 years out, then we’ll assume current “comfort” issues in virtual environments have been solved.
For the purposes of this report, immersion means adding height: getting off a 2D plane and into 3D space.
Once in 3D space, playback could be on an Atmos system, as 3D audio in headphones, or in a VR/AR/MR environment with head-tracking.
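As a rough illustration of what head-tracked playback adds, the Python sketch below converts a world-space source position into the azimuth and elevation a listener would perceive after turning their head. The function name `head_relative_direction`, the axis convention (x right, y up, z forward) and the yaw-only rotation are our assumptions for the example; a real renderer would also handle pitch, roll and listener translation.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def head_relative_direction(source: Vec3, head_yaw: float) -> Tuple[float, float]:
    """Return (azimuth, elevation) in degrees of a source relative to the head.

    Assumed convention: listener at the origin, x = right, y = up, z = forward.
    head_yaw is in radians, positive when the head turns to the right.
    Positive azimuth means the source is heard to the listener's right.
    """
    x, y, z = source
    c, s = math.cos(head_yaw), math.sin(head_yaw)
    # Rotate the source by -yaw so it is expressed in head coordinates.
    x_rel = x * c - z * s
    z_rel = x * s + z * c
    azimuth = math.degrees(math.atan2(x_rel, z_rel))
    # Height above the horizontal plane is unaffected by a pure yaw rotation.
    elevation = math.degrees(math.atan2(y, math.hypot(x_rel, z_rel)))
    return azimuth, elevation
```

With this convention, a source straight ahead ends up 90 degrees to the left once the listener turns 90 degrees to the right, while its elevation stays the same.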
If height is the thing, then what would musicians want to do with height? We must consider both creative and re-creative applications, i.e. creating new experiences vs. re-creating orchestrations and performances. Examples include environmental sounds, chimes, rustling, a stack of guitar amps, or a tall pipe organ.
In this report, we are exploring a few perspectives:
- The artist / musician is in a VR/AR environment creating music – either for themselves, or working on a production
- One may also be producing non-immersive (traditional) music within this environment
- Multiple artists & musicians may collaborate within this environment
- Artists may be creating an immersive environment for the audience / listener
- Events and exhibits
- VR tools to put the content creator in a simulated, walk-through environment for the purpose of creating soundscapes for real places.
Object-based music creation
In this brave new world, we no longer know what the rendering is going to be. Games have been addressing this for years – all audio in games is object-based. However, musicians don’t think like game audio programmers and sound designers!
Dolby Atmos, DTS MDA, Fraunhofer MPEG-H are all object-based audio formats, but we need a descriptive language / notation that allows musicians to express immersive music. There are examples of this work going on today:
- ADM (Audio Definition Model), an open ITU standard (ITU-R BS.2076) that defines metadata for describing audio objects
- SDIF (Sound Description Interchange Format), from an international working group
Our challenge is that current object-based formats talk about where a sound snippet lives in space, but they don’t describe anything about the sound itself from a musical / compositional perspective. We need objects whose properties synthesizers can control; the glTF group is working to address this (https://www.khronos.org/gltf). We’d like to see controls for parameters like:
- Size & radiation patterns
- Dynamic behaviors that change over time
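To make the idea of a musically meaningful object more concrete, here is a minimal Python sketch of an object that carries size, a radiation pattern and time-varying behaviors alongside its position. Everything in it (the `SoundObject` class, its field names, the `rise` behavior and the omni-to-cardioid blend) is a hypothetical illustration of ours; none of it is taken from ADM, SDIF, glTF or any shipping format.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SoundObject:
    """Hypothetical sound object with musical properties beyond position."""
    name: str
    position: Vec3 = (0.0, 0.0, 0.0)  # metres; x right, y up, z forward (assumed)
    size: float = 0.0                 # apparent source radius in metres
    directivity: float = 0.0          # 0 = omnidirectional, 1 = cardioid
    facing: Vec3 = (0.0, 0.0, 1.0)    # unit vector the source points along
    behaviors: List[Callable[["SoundObject", float], None]] = field(default_factory=list)

    def gain_toward(self, listener: Vec3) -> float:
        """Linear gain radiated toward the listener (simple first-order pattern)."""
        dx, dy, dz = (l - p for l, p in zip(listener, self.position))
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        if dist == 0.0:
            return 1.0  # listener sits on the source; treat as on-axis
        fx, fy, fz = self.facing
        cos_theta = (dx * fx + dy * fy + dz * fz) / dist
        # Blend between an omni pattern (1) and a cardioid (0.5 * (1 + cos)).
        return (1.0 - self.directivity) + self.directivity * 0.5 * (1.0 + cos_theta)

    def update(self, t: float) -> None:
        """Apply every dynamic behavior for absolute time t (seconds)."""
        for behavior in self.behaviors:
            behavior(self, t)


def rise(speed: float, start_height: float = 0.0) -> Callable[[SoundObject, float], None]:
    """Example behavior: the object climbs at `speed` metres per second."""
    def apply(obj: SoundObject, t: float) -> None:
        x, _, z = obj.position
        obj.position = (x, start_height + speed * t, z)
    return apply
```

For example, a set of chimes could be created as `SoundObject("chimes", position=(0.0, 0.0, 2.0), directivity=1.0, facing=(0.0, 0.0, -1.0), behaviors=[rise(0.5)])`. Calling `update(t)` each frame would lift it overhead, and `gain_toward` would fall as the listener drifts off its axis. A synthesizer could read the same fields as modulation sources.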
Also missing from our current object-based audio descriptions is the scene / context in which the object is deployed. There is some work going on in neuroscience on the cognitive model of the space you occupy (physically, virtually, or in an augmented/mixed reality space). We must also consider that the context of the music creator may be different from the context of the audience. Further, the content creator may want to re-shape and influence the context of the audience, bringing the audience out of their physical environment and into a virtual or augmented reality that changes their perception of where they are.
Music instruments and effects
New virtual instruments need to be designed from inception to create immersive sound. MIDI already supports XYZ position controls, but it would be great to be able to access and manipulate object metadata as well. Further, the HD Protocol adds acceleration/ramps… (for future work – how does this tie into spatial processors for live immersive/AR/VR playback or real-time interaction?) https://www.midi.org/forum/midi-hd-protocol
We will need to create 3D effects processors, as well as take advantage of new opportunities presented in object-based effects processors, i.e. directionality processors way beyond stereo panners!
The tools we currently use to create object-based audio are fragmented – what software platform should I be working in?
- A traditional DAW?
- An object-based tool like Wwise?
- An Object-Based DAW – what would it do? What is the next generation DAW that can be used for 3D sound object design, from creation to delivery? We at least need:
- Ability to create and manipulate objects in real-time
- DAW environment should allow for real-time preview in a 3D immersive space
- Ability to “paint the soundscape” in real-time while using an HMD (headset)
- Should I be using 3D positioning plug-ins, or should I be thinking about object-based sound generators?
- What is my “audio quill”? (Tilt Brush for music creation)
- What is an event? A pre-existing sound or waveform?
- What are the gestures / constructs?
- How / where is the sound then produced?
- How long does that sound last – aurally and visually?
- What if I sing something and the waveform projects from my mouth and across the room, hits a wall and falls to the ground, but I can pick it up later and use it as a sample?
- What are the artistic sensors and triggers?
- What happens when multiple people interact?
- What happens when individual performers create sound objects – the traditional rules of music collaboration are out the window because players have direct access to manipulate each other's sounds
- In my AR environment
- Everything could be a sound emitter or manipulator
- We can grab (sample) sound objects in a mixed reality space
- When we talk about immersive music creation, are we assuming there is always a video / visual component?
- In this space, what is good for live and what is good for post, with respect to both tools and audience experience?
- How can we break the old paradigm of how we make music today so that a new paradigm can be created for AR/VR experiences?
- Snapshot of current tools / pipeline
- Gaming tools are leading in this space
- Audio gaming middleware engines like Wwise and FMOD
- World-building gaming middleware engines like Unity
- Ambisonic capture
- Likely requires heavy post-production work
- Oculus Rift immersive 3D VR tool
- Using ambisonics and audio objects, you are creating a virtual soundscape you can walk around
- Objects have properties that synthesizers will control: (see glTF rogue group)
- Size & radiation patterns
- Dynamic behaviors that change over time
- Vive application that uses triggers to create waveforms through particle generators, painting sound
- Aaron’s collaborative Minecraft hack to build synths
- The user’s body should be emitting sound
- As a visual environment – how do visuals complement and enhance the sound?
- Moogfest hack (see Bobby)
- Ordered, mathematical ways of manipulating sound spatially, e.g. billiard balls, spirograph
- There can be 2D controllers for 3D music and 3D controllers for 2D music (matrix?)
- Justin Lassen: Shapesong VR https://youtu.be/5k8EynUtqOQ
- Alto Nova: real-time house/trance concerts – every sound is represented by an interactive live waveform projected in the club
- Tilt Brush “Audio Reactive Brushes” https://youtu.be/uFzAB4mr3KI
- SoundStage for HTC Vive – emulating current instruments: http://www.soundstagevr.com
- Representing waveforms in VR: http://groovesizer.com/tag/waveform/
- Existing tools
- IRCAM SPAT
- ICST (Max for Live)
- Max/MSP
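The "ordered, mathematical" item in the list above (billiard balls, spirograph) can be sketched in a few lines. The Python example below samples a spirograph curve (a hypotrochoid) in the horizontal plane at a fixed height, producing positions that could drive an audio object's location. The function name `spirograph_path`, its parameters and the axis convention (x right, y up, z forward) are our assumptions for illustration, not part of any existing tool.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def spirograph_path(R: float, r: float, d: float, steps: int,
                    height: float = 0.0, turns: float = 1.0) -> List[Vec3]:
    """Sample a hypotrochoid as a list of (x, y, z) positions in metres.

    R: radius of the fixed outer circle
    r: radius of the rolling inner circle (must be > 0)
    d: distance of the "pen" from the centre of the rolling circle
    steps: number of samples; turns: revolutions around the centre
    The curve lies in the horizontal x/z plane at constant y = height.
    """
    if r <= 0 or steps <= 0:
        raise ValueError("r and steps must be positive")
    k = (R - r) / r  # how fast the pen spins relative to the orbit
    points: List[Vec3] = []
    for i in range(steps):
        t = 2.0 * math.pi * turns * i / steps
        x = (R - r) * math.cos(t) + d * math.cos(k * t)
        z = (R - r) * math.sin(t) - d * math.sin(k * t)
        points.append((x, height, z))
    return points
```

Feeding one point per control tick into an object's position would sweep the sound around the listener in a repeating pattern. The path never strays further than `(R - r) + d` from the centre, which makes it easy to keep inside a room or speaker array. Varying `height` over time would be one way to bring the third dimension into play.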
Project Bar-B-Q 2014
Interactive Music Creation and Adoption - The Quest Continues!!!
Relevance: Using game audio tools for non-game music creation
Project Bar-B-Q 2013
Enabling More Profound Human Expression with Modern Musical Instruments
Relevance: A study of how people want to control musical instruments
Project Bar-B-Q 2011
Making Spatialization Work Within Constraints of New Form Factors
Relevance: Automatic spatial rendering that preserves the artist intent
Project Bar-B-Q 2010
Wherever You Go, There You Are: Audio that Understands Context and Mobility
Relevance: Focusing on the context problem we identified
Project Bar-B-Q 2005
New Approaches for Developing Interactive Audio Production Systems
Relevance: A deep discussion of interactive audio standards, as well as extensions to game audio platforms
Project Bar-B-Q 2002
User Interface Design Issues for Audio Creation Tools
Relevance: UI considerations for audio creation tools that could apply to VR/AR as well
Project Bar-B-Q 2000
The Multichannel Audio Working Group
Relevance: End-to-end multi-channel audio considerations including authoring, metadata and rendering
Project Bar-B-Q 2000
General Interactive Audio
Relevance: Requirements and frameworks for generic interactive audio that could apply to VR/AR