The Fourth Annual Interactive Music Conference

Group Report: The Big Picture

Participants:
David Baker; Postscore
Keith Charley; Creative Labs
George Sanger; Fat Labs
Duane Ford; Staccato Systems
Sandi Geary; SingleTrac
Chris Grigg; Control-G
David Javelosa; Yamaha
Michael Land; LucasArts
Danny Petkevich; Staccato Systems
Eric Scheirer; MIT Media Lab
Brian Schmidt; Microsoft
Keith Weiner; DiamondWare

Facilitator: Linda Law; Fat Labs, Inc.


Day One

The "Big Picture" working group started by listing the major complaints ("rants") about audio systems and how audio is designed.

Main rants were:

  • The audio programmer screws up or doesn't include my audio
  • Division of tasks is messed up
  • Iteration is far too slow and difficult
  • Not enough tools to do interactive audio
  • Audio is not conceived as an important part of the game
  • The people who care about the audio don't have the means to control it
  • It's too hard to do simple things

From these rants, the group quickly focused on a major problem: the reliance on the programmer for audio integration. Though this didn't tackle "the big picture," we determined that putting power into the hands of composers and sound designers was essential to solving the problem. The "big picture" itself was therefore tabled for another venue, such as the ia-sig. This set the focus for the group: "Making powerful interactive audio easy: Games - a case study."

Although many applications besides games were determined to suffer from the same or similar issues, games were chosen as a prototypical subject, with the expectation that, given careful thought, the results could be extrapolated to a larger scope. There was also consensus that there was an immediate and pressing need to solve these issues for this particular example.

Out of the rants came the brainstorming sessions, which attempted to categorize the issues whose solution would make everything easier. The basic categories were:

Authoring Tools
Any solution to "making powerful audio easy" must include an authoring component. The authoring component manipulates both the media and the control information describing how the media is to be used.

  • Scalable construction kits, where anyone can do the first one
  • Tying the composer/sound designer directly to the process, without the programmer. You can do some things right out of the box, and more as you get better.
  • Making rework easier and faster
  • Standalone auditioning tools
  • In-context auditioning without rebuilding: rapid prototyping tool

The following are partially beyond our control, since they involve integration with scene graphics. Work has been done in this area (particularly outside games) and by some game audio APIs, but these issues have not been addressed in a generic manner:

  • Direct attachment of audio to graphic objects where appropriate.
  • Rules that go along with sound.
  • Real-time parameter control.
  • Geometry-based and model-based control: world geometry and prop objects, where appropriate.

Advocacy Issues

Several issues were deemed to be outside the scope of the workgroup, falling instead under the existing audio advocacy banner:

  • Learn from film: apply post-production techniques, then move beyond film.
  • Develop/enable richer aesthetics of non-linear composition. There was worry about this one being a big can of worms, which may be fun to follow up on.
  • Making it profitable: making it cost-justifiable. What's the business model/ROI? Selling time to sell quality.
  • Having more powerful PC audio. The need for this was considered almost a given, but the PC audio message must be carried beyond the audio community to the PC OEMs, game producers, etc. This was a point of big contention; the worry is that we've "given up" on pushing audio hardware acceleration. M. Land made the point that the aawg is handling these issues, which are out of the scope of this group.

Concepts and Definitions

  • Abstracting details of the implementation
  • Clear definition of tasks and role
  • Clear definitions of terms and concepts
  • Disambiguating the sound designer's vision. "Common unit of currency (the cue)"
  • Easy definition of translation layers: Control variables->control of sound
  • Use appropriate synthesis type for the sound you're trying to get.
  • Focus/Rethink API architectures

Technosocial Issues
Technosocial issues refer to the bridge needed for content-creation people to get their vision and ideas into the heads of the game programmer/developer. This includes traditional notions of "turf" as well as in-house/out-of-house designer issues.

  • Enabling different communities to talk to one another
  • Sound designers to programmers; hardware people to software people; vendor to vendor

Special concern should be given to these.  They're good global issues.

  • Empowering the sound designers with more control over how the media is presented.
  • Think cross-platform and platform-agnostic, but beware the dark side of the lowest common denominator.

Day Two

Day two started with a point of contention: evolution vs. revolution. Should we present from an evolutionary point of view? Or are there things that need to be done that are so different that we're tying our hands by considering existing methodologies too much? Since the common thread in many or most of the rants boiled down to "the audio designer doesn't have enough control over audio integration," this became the area of focus.

The most abstracted, interoperable system that allows an audio expert control over the integration of audio into an application was determined to be a cue-based interface between the application and the "audio vision."

DEFINITION: A cue is a mapping between an abstract request for sound services and the supplied service.  It may also refer to the event itself, and may be used as a verb.
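This definition can be illustrated with a small sketch (the class, names, and sound file below are invented for illustration and are not part of the group's specification):

```python
# Hypothetical sketch of the cue definition above: a cue maps an
# abstract request ("door_open") to whatever service supplies the sound.
class Cue:
    def __init__(self, name, service):
        self.name = name          # abstract request, e.g. "door_open"
        self.service = service    # callable that supplies the sound service

    def fire(self, **params):
        # "Cue" used as a verb: trigger the mapped service.
        return self.service(**params)

# The sound designer binds the abstract name to a concrete rendering;
# the application only ever refers to the cue by name.
door_cue = Cue("door_open",
               lambda velocity=1.0: f"play creak.wav at gain {velocity:.1f}")

result = door_cue.fire(velocity=0.5)
```

The point of the indirection is that the designer can rebind the service (different sample, different synthesis) without the application code changing.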

The Cueing System
Consensus was that several items need to be specified. The programmer uses a simple cue-based API provided by an abstracted layer (see diagram).

Cue Chart

The cue interpretation layer does all interpretation of cues. It could be a simple language interpreter using a script-like language, or it could implicitly control synthesis parameters. Especially for things like MP4, some things that have traditionally been game logic now become "audio logic": for example, algorithms that expose high-level knobs to the game but cause multiple changes to underlying audio processing parameters.
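The "high-level knob" idea can be sketched as follows. The knob name ("tension") and the particular parameter mapping are invented for illustration; the report does not define any specific mapping:

```python
# Illustrative "audio logic": one game-facing knob (tension, 0..1)
# drives several underlying audio processing parameters at once.
def tension_knob(tension):
    tension = max(0.0, min(1.0, tension))   # clamp to the knob's range
    return {
        "music_volume": 0.5 + 0.5 * tension,       # louder as tension rises
        "filter_cutoff_hz": 500 + 7500 * tension,  # open the filter up
        "percussion_layer_on": tension > 0.6,      # extra layer near the top
    }

# The game sets a single value; the audio logic fans it out.
params = tension_knob(0.8)
```

The game code never touches volume, filter, or layering directly; those decisions stay with the sound designer, inside the cue interpretation layer.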

The abstracted layer provides two types of information to the cue interpreter: cue control and parameter query. The application can also receive callbacks or notifications from the audio engine, for such things as determining when certain musical or audio events occur.
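The notification path might look like the following sketch, where game logic registers a callback so the engine can report musical events (all names here are hypothetical):

```python
# Illustrative notification mechanism: the application registers a
# callback so the audio engine can report musical events (e.g. a bar
# boundary) back to game code.
class NotifyingEngine:
    def __init__(self):
        self._listeners = []

    def on_event(self, callback):
        # Game code subscribes to audio/musical events.
        self._listeners.append(callback)

    def _emit(self, event):
        for cb in self._listeners:
            cb(event)

    def advance_to_next_bar(self):
        # A real engine would fire this from its playback/mixing loop.
        self._emit("bar_boundary")

events = []
engine = NotifyingEngine()
engine.on_event(events.append)     # game logic reacts to musical timing
engine.advance_to_next_bar()
```

This lets the game synchronize visuals or transitions to the music without polling the renderer's internal state.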

It was determined that certain components of the system should be standardized. In the above diagram, light-shaded items represent candidates for standardization. Defining a standard control interface between the application engine and the cue interpreter allows applications to use various proprietary audio engines without affecting their main application code. The engine would provide both a rendering engine and a cue interpretation layer that understands standard cue events, parameter queries, and notification requests, translating them into specific parameters for the specific audio renderer.
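One way to picture this standardization (all interface and vendor names below are hypothetical): the application codes against a common cue-interpreter interface, and any vendor's engine can be plugged in behind it:

```python
# Sketch of the standardization idea: the application talks only to a
# standard cue-interpreter interface; proprietary engines implement it.
class CueInterpreter:
    def cue(self, name, **params):
        raise NotImplementedError

    def query(self, parameter):
        raise NotImplementedError

class VendorAEngine(CueInterpreter):
    def cue(self, name, **params):
        # Translate the standard cue into this renderer's parameters.
        return f"VendorA renders cue '{name}'"

    def query(self, parameter):
        return {"voices_free": 32}.get(parameter)

class VendorBEngine(CueInterpreter):
    def cue(self, name, **params):
        return f"VendorB renders cue '{name}'"

    def query(self, parameter):
        return {"voices_free": 64}.get(parameter)

def application_tick(engine):
    # Application code is identical regardless of which engine is used.
    return engine.cue("explosion", distance=10)
```

Swapping `VendorAEngine()` for `VendorBEngine()` changes the renderer without touching `application_tick`, which is the portability benefit the group identified.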

The realization was made that a cueing system would likely be neither practical nor desirable for 100% of the audio integration tasks.  For this reason, an application would also have a (proprietary) connection directly to the rendering engine.  This would allow for situations that were either impossible to describe using the cue-based interface, or where it is simply more expedient to do so.

Common Cue Formats:
Do we need to standardize a common cue map format? Consensus was that it may be very useful and should be investigated. It was noted that there is danger in insufficiently specifying a common format, resulting in a "GM"-like problem. The scope and definition of a common cue format were determined to be best left to a body like the ia-sig.


Action Items

  1. Email campaign re: interest in a "big picture" working group (ia-sig) (CG)
  2. New ia-sig group for author's cue data and the game/cue system interface
    a) propose cue data file format concept (GS)
    b) develop cue data file format (GD)
    c) propose system interface concept (KW)
    d) develop system interface
    e) develop cue data file generator (KW)
  3. Getting our "block diagram" out

