The Eleventh Annual Interactive Music Conference
PROJECT BAR-B-Q 2006

Group Report: Providing a High Level of Mixing Aesthetics in Interactive Audio and Games

   
Participants: A.K.A. "Mixolydians (Bob in a Box)"

Chris Grigg, Beatnik
Guy Whitmore, Microsoft Games
Pat Azzarello, Microsoft
Jonathan Pilon, Ubisoft
Fabien Noel, Ubisoft
Scott Snyder, Dancin’ Mouse Productions
Pierre Lemieux, Dolby
Peter Drescher, Danger
Tracy Bush, NCSoft
Alain (Dr. Mad) Georges, Madwaves
Peter Otto, UCSD
Oren Williams, Dolby
Matt Tullis, Dolby
Jim Rippie, Independent
Facilitator: David Battino, Batmosphere
 

Putting Bob in a Box
The Art of Automated Interactive Mixing


Executive Summary

Traditional methods of producing and delivering audio experiences have never managed to overcome a universal problem: the listener never hears audio in an “ideal” context, and each listener’s situation presents different challenges (sometimes slight, sometimes great) for the audio professional. Audio in electronic, interactive games presents unusually complicated cases to manage, since the content of the audio itself changes depending on what actions the listener takes in the game. What’s needed is a new approach, where high quality audio is mixed “on the fly,” specific to the user’s changing context. Fortunately, the audio industry, and game audio developers in particular, have years of experience and techniques to apply to the problem. Specific kinds of new technical innovations could help us provide even more consistently memorable and stunning experiences to our listeners. In an attempt to improve the state of the art, we provide some preliminary conclusions and propose specific educational resource efforts to increase our collective knowledge, share our hard-won experience, and prompt some technical innovations to propel the industry forward.


Introduction

Many individuals representing a wide range of perspectives (technology development, game development, game publishing, academia, journalism, game enthusiasts, professional audio development, and professional music) joined together at the 2006 Project Bar-B-Q conference to debate common problems and explore common solutions to one of the deep unsolved dilemmas of contemporary audio development: how to achieve a complete and artful integration of the “listener’s context” into all aspects of audio in the gaming environment. In our different areas, we all strive to provide listeners with excellence, both artistically and technically, and we all recognize the ways in which our efforts fall short of our imagined potential.

If only, we thought, we could cram a little homunculus clone of famed audio mixing engineer Bob Clearmountain into each and every computer system, game console, and home music amplifier, we’d have it: audio excellence for each and every listener every time, as good as their equipment can sound, Bob in the Box! Though in reality, we figured we’d have to settle for something a little less.

Every bit of produced audio heard by every set of ears is the product of compromise: one assembled result must sound as good as possible over a range of playback equipment and playback environments. Managing that compromise is a difficult problem and a primary objective for any audio professional, whether attempting to convey the profound experience of a world-class orchestral or jazz ensemble performance, or the energized drive of a loud, buzz-guitar pop band.

Electronic games (for PCs, game consoles, handheld devices, etc.) present a complication beyond that of standard audio production: the game player drives the nature of the audio experience through continued input and interactive decision-making. The output signals of game audio may never be exactly the same twice. So how do you mix audio for that - the unpredictable audio experience?

The goal of improving audio-for-games represents our biggest challenge, our toughest set of problems, and our biggest opportunity. We set out to define the problem, find some answers, summarize the areas of experience where we felt able to produce results, and explore the unfamiliar ground where our experience provided no ready-made solutions.


Mapping the Problem

Problem Statement

“How do we introduce a high level of mixing aesthetics to interactive audio and games (at a level that compares with the best musical and cinematic examples)?”

Examples

Some game audio developers have an uncanny ability to make games sound good, despite severe constraints on time and equipment. Even so, most electronic game players have experienced at least some of the hallmark problems of poor audio. A few unfortunates have experienced them all, many times. Some examples:

  • Inappropriate or accidental “dead” audio zones – game areas with no sound
  • Pileups – unintentional sound clashes
  • Static focus – microphone location or auditory POV never changes
  • Lack of psychological perspective – character’s emotional state doesn’t affect the mix (e.g., during intense moments, all sound except player’s breathing could drop out)
  • Slavish realism (consistency that becomes predictable and boring)
  • Repetition
  • Repetition
  • Not enough variation
  • Too “in your face” (all foreground and loud, little 3D perspective, poorly controlled dynamics)
  • Distortion, Truncation
  • Lack of masking control
  • Unintelligible dialogue

Contributing Factors

Poor audio results stem from many root causes, including:

  • Lack of clear language for communicating about game audio (between software developers, artists, directors, composers and sound designers)
  • Lack of time
  • Lack of budget
  • Lack of run-time resources dedicated to audio (RAM, CPU, real time DSP)
  • Rendering differences between playback systems
  • Lack of consistency between localized assets
  • Lack of communal knowledge – developers constantly reinvent the wheel
  • Lack of automation and high-level control tools

Throwing more money at the problem could clearly solve some of it, but only some; we’d still be left with an unsolved problem, and a larger financial crisis. We believe that the industry can ameliorate much of this problem by changing our production processes.


Surveying Our Assets

Part of our solution is already in hand: we occasionally triumph over our imposed limitations and create some stunningly great audio for our listeners. The group set out to identify things that human mixers already do to produce pleasing mixes (often outside the context of game audio), and the data, parameters, and engines that we would need to accomplish this within a game. We realized that:

  • We would not be able to complete the list during the conference proceedings - an ongoing effort would be required
  • The list would not guarantee successful mixes
  • To a large extent, tools already exist (XACT, FMOD, ISACT, WWISE, CRI, Miles, SCREAM, DARE, PUNCH, KICK, SNARL, etc.), but we still have a long way to go to “ideal”
  • Designing a master tool that would handle all cases is not efficient or desirable; a plug-in architecture would be preferable.
  • Education and training would be needed to reach our objective.


The Double-Sided Coin of Technology and Processes

Reaching our goal of “audio always delivered as good as it can sound” will take new audio development processes (pre-production, production, and post-production) and new tools. The group agreed on the need to generate a list of parameters for controlling essential aspects of the audio experience, plus the technical feature sets in our tools and playback systems that would be required to deliver this experience successfully.

We recommend working backwards from the aesthetic experience to determine those parameters, provide necessary background work, and raise the bar slowly.

  • We will develop clear language for communicating about game audio.
  • We will share techniques with developers, through a site containing featured articles and a public forum.
  • We need to raise awareness and find examples of how great game audio sounds.
  • We envision a more powerful tool set, and will continue exploring its requirements, which will include parametric control for items such as:
    • Numeric scaling - the number of objects being heard (e.g., one cricket, two crickets, thousands); see the sketch following this list
    • Psychologically-oriented mixing
    • Contextual mixing
    • Consistent preproduction and rendering tools (authoring tools use the same parameters, arguments, data values and DSP/rendering architecture as the eventual runtime engine).
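
To make “numeric scaling” concrete, here is a minimal sketch (in Python, purely illustrative) of how a count of virtual sound sources might select density layers and scale their gains logarithmically rather than linearly. The asset names, thresholds, and gain curve are assumptions invented for this example; they do not come from any existing tool.

    import math

    # Hypothetical asset layers for a cricket ambience, ordered by density.
    # Names and thresholds are illustrative only.
    CRICKET_LAYERS = [
        ("cricket_single", 1),       # one discrete chirper
        ("cricket_cluster", 10),     # small-group loop
        ("cricket_field_wash", 500), # dense bed standing in for "thousands"
    ]

    def numeric_scaling(count: int) -> list[tuple[str, float]]:
        """Return (asset, gain) pairs for the given number of virtual sources.

        Gain grows with the logarithm of the count rather than linearly,
        so 1,000 crickets read as "denser" rather than simply "louder."
        """
        if count <= 0:
            return []
        active = []
        for asset, threshold in CRICKET_LAYERS:
            if count >= threshold:
                # Scale each layer by how far the count exceeds its threshold.
                headroom = math.log10(count / threshold + 1.0)
                gain = min(1.0, 0.5 + 0.25 * headroom)
                active.append((asset, round(gain, 2)))
        return active

    if __name__ == "__main__":
        for n in (1, 2, 40, 5000):
            print(n, numeric_scaling(n))

One cricket plays only the single-chirp asset; a few dozen bring in the cluster loop; thousands cross into the wash layer while the overall level rises only gently.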

Defining and exploring parameters to control psychologically oriented mixing may be one of the bigger challenges in designing these tools. The most effective mix may not be the most realistic one in the strictest sense. For example, consider any number of movies where the protagonist walks down a busy city street, and the audience hears primarily the interior monologue of the character’s thoughts, not the traffic sounds.
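
As a rough illustration of how a psychologically oriented parameter might drive such a mix, the following sketch (Python, hypothetical) maps a single 0-1 “introspection” value onto per-category gains for the street-scene example above. The category names and curve shapes are placeholders, not features of any existing engine.

    def psychological_mix(introspection: float) -> dict[str, float]:
        """Map a 0..1 'introspection' value to per-category gains.

        At 0.0 the street scene is literal: traffic up, inner voice down.
        At 1.0 the mix follows the character's head: traffic ducked hard,
        monologue and breathing brought forward.
        """
        t = max(0.0, min(1.0, introspection))
        return {
            "dialogue_inner": 0.2 + 0.8 * t,     # inner monologue rises
            "ambience_traffic": (1.0 - t) ** 2,  # traffic falls away quickly
            "foley_breathing": 0.1 + 0.9 * t,    # breathing becomes audible
            "music_score": 0.6 + 0.2 * t,        # score leans in slightly
        }

    if __name__ == "__main__":
        for level in (0.0, 0.5, 1.0):
            print(level, psychological_mix(level))

A game could drive the introspection value from whatever state it already tracks (health, story beats, proximity to a threat), letting one authored curve replace dozens of hand-placed volume tweaks.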

Future Perfect: Imagining New Tools

While we all agreed there are many excellent audio tools available now, and there has been significant progress over the past several decades, we each found it easy to envision a new, more profoundly capable tool chain that a competent professional could use to ensure a great audio experience.

The tool chain involves several key procedural stages and technical aspects:

  • the ability to define discrete parameters “around” individual categories of playback sounds (music, dialogue, effects, spontaneously generated audio, etc.).
  • a mechanism to collate and communicate those parameters as sets of “metadata”.
  • a game engine smart enough to collect real time player and game environment variables.
  • an audio playback engine smart enough to apply the metadata to those game state variables and thus create a full mix representing the game designers’ intent (see the sketch following this list).
  • a mastering stage smart enough to deliver polished mixes to the users.
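
The following sketch suggests, in simplified form, how authored per-category metadata and run-time game state might be combined by such a playback engine. The data fields, category names, and ducking rules are hypothetical, chosen only to make the data flow concrete; a real pipeline would define them in the authoring tool and serialize them alongside the assets.

    from dataclasses import dataclass

    @dataclass
    class CategoryMeta:
        """Authored metadata traveling with each sound category (illustrative fields)."""
        base_gain: float         # designer's nominal level
        duck_by_dialogue: float  # how much active dialogue pulls this category down

    def resolve_mix(metadata: dict[str, CategoryMeta],
                    game_state: dict[str, float]) -> dict[str, float]:
        """Combine authored metadata with run-time game state into per-category gains."""
        dialogue_active = game_state.get("dialogue_active", 0.0)  # 0..1
        tension = game_state.get("tension", 0.0)                  # 0..1
        mix = {}
        for name, meta in metadata.items():
            gain = meta.base_gain
            gain *= 1.0 - meta.duck_by_dialogue * dialogue_active
            if name == "music":
                gain *= 0.8 + 0.2 * tension  # score swells slightly with tension
            mix[name] = round(max(0.0, gain), 2)
        return mix

    if __name__ == "__main__":
        meta = {
            "dialogue": CategoryMeta(base_gain=1.0, duck_by_dialogue=0.0),
            "effects":  CategoryMeta(base_gain=0.9, duck_by_dialogue=0.3),
            "music":    CategoryMeta(base_gain=0.7, duck_by_dialogue=0.5),
        }
        print(resolve_mix(meta, {"dialogue_active": 1.0, "tension": 0.4}))

The point is not the specific arithmetic but the separation of concerns: designers author the metadata once, the game supplies live state, and the engine resolves the two into a mix that reflects the designers’ intent.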

Figure 1 illustrates the basic process for collecting and creating component sounds for the final mix, assembling them as a unified set ready for mixing, and handing them to a smart playback mixing engine.

Figure 1. Basic procedural components for delivering interactive audio.


Figure 2 illustrates the technical relationship of the assets, their metadata, and the playback engine.


Figure 2: High-Level Smart Audio Mix Engine


Next Steps

Group members considered this report to be the first step in a long process toward the human and technical development required to support an “intelligent mixer.” Members accepted some specific tasks to drive the effort (they know who they are!):

  • Set up a wiki and/or forum, perhaps on the IASIG site, to share techniques that will advance the art of interactive mixing
    • NOTE: See the section below, “Enter the WikiBlog™: 'The Art of Interactive Mixing Journal'”
  • Contributors will write one technical or artistic article for the site, addressing, for example, one of the following topics:
    • Analysis of existing games for examples of successful mixing
    • Reviewer Guidelines
    • Defining Example Scenarios
    • Identification of features that exist in current audio tools
    • Define and explore “Dynamic Ambience Tracking”
    • Field recording
    • Write an Annoying Audio blog about mix annoyances and their solutions
    • Sound Source occlusion and obstruction
    • Specification of “Dynamic Music Systems”
    • Define and explore “Contextual Audio”
    • Specification of new systems/new tool architectures
  • Communicate by e-mail reflector, with conference calls when necessary
  • Start an IASIG working group on interactive mixing and present a progress report at GDC
  • Present our findings to Project Horseshoe (a conference for game designers)
  • Investigate SMU’s interactive audio program


Enter the WikiBlog™: “The Art of Interactive Mixing Journal”

As part of the first and primary action item, the group discussed and took steps to create an ongoing project: an educational resource for audio professionals (and professionals-to-be) that can grow, evolve and feed necessary input to technical innovators for improving audio creation and delivery tools. The Interactive Audio Special Interest Group (IASIG) provides a natural forum for hosting and developing this resource. An abstract follows below.

Art of Interactive Mixing Journal
Strawman

This memorandum is a strawman for the Art of Interactive Mixing online journal. This concept was conceived at the Project BBQ 2006 think-tank.

Format

The Journal will focus on documenting techniques, wisdom, experiences and tools for the advancement of the art of interactive audio mixing. It is meant as a neutral ground for the exchange of ideas between professionals in the field.

The Journal will be structured as an infrequent blog, à la memepool.com, with a new entry every 2-4 weeks. Each entry will consist of a full-length article (2000 words) and an associated discussion between registered participants. Ultimately the collection of articles may be turned into a book.

Sections

The Journal strives for simplicity and focus. It will consist of the following sections.

  • Front Page. The Front Page will contain a list of recent articles in chronological order. Each entry will consist of the author, an abstract, the date, and a pointer to the associated discussion.
  • Contributors. The Contributors section will contain a short biography of each contributor to the Journal.
  • Glossary. The Glossary section will consist of a number of pages, each focused on a particular technique or term of art, e.g. cue.
  • Archive. The Archive section will contain a chronological list of all published articles.
  • About. The About section will provide a brief description of the Journal.

Organization

The Journal will be edited and moderated by an Editor (Pierre-Anthony Lemieux, Dolby) with an Associate Editor (Peter Otto, UCSD). The Journal will be the main work item of a newly created Art of Interactive Mixing group within the IASIG, chaired by the Editor. Conference calls may be scheduled to address specific topics and business related to the Journal.

Milestones

  • December 2006. Journal launches with its first article.
  • March 2007. Three articles published; the Journal broadly announced at the Game Developers Conference 2007.

Access

The Journal will be open to all, but editorial control will remain with IASIG members. Specifically, anyone may post comments to any article, and anyone may submit articles for publication to the Editor or Associate Editor. However, glossary entries and other editorial content may be added only by IASIG members. To improve ease of use and focus, the Journal will have its own dedicated domain, e.g. interactivemixing.com.

Proposed Articles

  • Occlusion and Obstruction (Fabien Noel, Ubisoft)
  • New system architectures (Peter Otto, UCSD)
  • Field Recording (Guy Whitmore, Microsoft Game Studios)
  • Game Audio Reviewer Guidelines (Matt Tullis, Dolby)
  • Dynamic Ambience Tracking (Tracy Bush, NCSoft)
  • Contextual Audio (Scott Snyder, Dancin’ Mouse Production)
  • Mix annoyances and solutions (Peter Drescher, Danger)


Deep Thoughts on Interactive Audio

Group member Guy Whitmore contributed some well-considered thoughts from his vantage point as an industry veteran and hands-on director of audio for a large game publishing company.

Welcome to the Art of Interactive Mixing

Introduction (Pre-production)
Welcome to The Art of Interactive Mixing. At this year’s Bar-B-Q Interactive Audio Think Tank, a group of audio industry professionals racked their brains on the topic of the current state of mixing sound for games. The very idea of a ‘mix’ for a game is something that currently gets very short shrift not only from our industry, but also from the very sound designers that create and implement audio for games. We’re often happy if we get great sounding assets working at a good relative balance; but that’s where it all too often stops.

How do we, as audio directors, designers, and producers, advance the art of game audio mixing? …and what does that even entail? What is the possible range of expression we can give games through our design and mixes?

Body (Integration and Mix)
I posit that we’re only scratching the surface of the potential here. Game mixes tend to be overtly utilitarian, i.e. if it’s closer, it’s louder. But that’s a very limited paradigm, and I would like to challenge us to increase our palette of expression, and create mixes that not only rival film mixes in quality, but do things that linear media are incapable of. To that end we have set up this site as a central repository of ideas, techniques, opinions, white papers, and post mortems, as well as a meeting place for discussion.

While at Bar-B-Q, we agreed that one person or small group could not come up with the UBER-solution and present that to the world on a golden platter. No single one of us has the full range of experience necessary to be so audacious. Solutions will be emergent as we all create the best mixes possible for the games we work on. Successful techniques, ideas, and tools will float to the top organically. As we experiment with mixing concepts on the various games we create, we learn what works and what doesn’t. I also believe that over time, it won’t be one uber-tool or uber-technique that wins out. Hopefully, a broad collection of approaches and tools will be known and available, and we can choose the most appropriate to accomplish the artistic goal of any particular project.

The intention of this site is to accelerate the growth and development of game audio mixes. Audio folks are often sequestered at work on our individual projects and our sphere of influence is limited to the company we work for. Therefore we’re all solving the same problems in isolation. Furthermore, we are failing to educate students and audio designers new to our industry, and that means wasting time and money to get new personnel trained. Sharing ideas and techniques can only help the still nascent field of game audio.

No excuses. (Noise Reduction)
For the first time in our industry, there is a small but good collection of tools available to us (most, if not all, will be discussed on this site over time). With this current crop of tools, much is already possible, and there’s much to be explored without adding a single new feature. That said, we are and will be the greatest influence on these and future tools. Therefore, let this site also be a place where sound designers and tool makers can meet.

Lack of time, money, and resources, are real problems, but let’s not beat this drum to excess. We often come off as the gripey, whiney sound people, and that doesn’t further our cause. To advance our art (or just get that must-have feature), become a positive evangelist and educator to your team, company and game audience. Don’t simply ask for new features or resources; convince your team through demos and mock ups. In other words, don’t ask them - show them! That’s the only way I’ve ever made any headway.

Conclusion (Mastering)
So I encourage everyone to approach this site with an attitude of openness and sharing. We often feel we should keep our best ideas to ourselves, but I suggest that this is counter-productive, not only to game audio as a whole, but to your company and even to your personal career. Over my 12 years in this industry, nothing has helped my career more than openly sharing my ideas and techniques. Of course there’s a place for proprietary technology and many companies can and should develop tech that gives them an ‘advantage’. But even here, the big concepts can be expressed and shared. In the end, techniques and tools only take you so far; the key ingredient is a creative mind and a killer set of ears, and these you cannot give away even if you wanted to.

Contribute. Participate. Enjoy!

- Guy Whitmore, Microsoft Games


Other Resources

A blog entry from group member Peter Drescher imagining the possibilities of better tools:

http://www.oreillynet.com/digitalmedia/blog/2006/11/the_homunculonic_aestheticator_1.html

A report from a prior Project BBQ exploring a “produce once, playback anywhere” model for delivering positional audio (whether to headphones, stereo speakers, or surround sound):

http://www.projectbarbq.com/reports/bbq00/bbq00r5.htm

