The Fifth Annual Interactive
PROJECT BAR-B-Q 2000
Group Report: The Multichannel Audio Working Group
A.K.A. "The Story of O"
Participants:
Jack Buser; Dolby Laboratories
Keith Charley; Creative Labs, Inc.
Trudy Culbreth-Brassell; Microsoft
Todd Hager; Dolby Laboratories
Jonathan Hoffberg; Dolby Laboratories
Jean-Marc Jot; Creative Advanced Technology Center
Phil Lehwalder; Intel
Scott McNeese; Philips Semiconductors
Adam Philp; Sensaura Ltd.
Jim Rippie
David Roach; SigmaTel, Inc.
Larry The O; LucasArts
Keith Weiner; DiamondWare, Ltd.
Facilitator: Linda Law; Fat Labs, Inc.
The goal of this group was to outline a "write-once, deliver anywhere," platform independent, format-agnostic approach to 3D interactive multichannel audio delivery from authoring through to the final consumer experience. We began our investigation with a survey of the delivery scenario, and worked back to the authoring process.
The primary intent of our discussion was to identify issues, not investigate them in detail. The issues cited in this report must all be explored and solved before a complete multichannel signal chain can be realized. It is our hope that those solving individual problems will do so with awareness of the holistic context we describe here, and we believe that maintaining this awareness will yield the greatest integration and efficiency across the entire multichannel signal chain.
This report touches on and incorporates subjects that have been part of previous Project Bar-B-Q groups, including the multiformat audio work group of '98 and the interactive audio "big picture" work group of '99.
This report also reflects on the Interactive Audio Special Interest Group's (IASIG) Multi-Format Audio Working Group, which has made substantial progress toward a report entitled "Recommended Practices For Handling Multi-Format Audio," which awaits ratification by the IASIG (at which time it will become publicly available). The observations and recommendations in this report will be forwarded to the IASIG in the hopes that the Multi-Format Audio Working Group will explore these concerns and proposed enhancements, incorporating them into a "Recommended Practices" version 2 document.
This Bar-B-Q group's recommendations revolve around three main proposals, developed in the sections that follow.
Note: another Project Bar-B-Q 2000 working group focused on Interactive Audio authoring issues in detail, including multichannel authoring and delivery issues. Readers looking for supplemental information are strongly encouraged to seek out this report at http://www.projectbarbq.com/reports/bbq00/bbq00r7.htm.
Delivery World: Stating the Problem
We began with a major concern: How does a developer account for different users' listening environments?
Currently, audio playback chains take on many configurations: stereo speakers, stereo headphones, 5.1-channel speaker environments, 6.1- and 7.1-channel environments, and so on.
Furthermore, audio streams can be encoded for more efficient transmission using Dolby Digital, Dolby Pro Logic, or otherwise processed to achieve a greater sense of spatiality using techniques such as Head-Related Transfer Function (HRTF) or stereo field expansion algorithms.
Finally, customizable platforms, such as PC and Macintosh, can have wide variance in available resources and CPU power, which results in a range of capabilities for reproducing multichannel sound. (This is in contrast to console platforms, which have the same resources from unit to unit.)
The combination of these delivery variables, aspects of an end user's listening environment (reflective surfaces, bass absorption, etc.), and the lack of sufficient connectors on the back of devices supposedly equipped for multichannel audio delivery poses wide-ranging challenges to audio providers seeking to deliver the optimum listening experience to the most users.
Delivering audio properly formatted for many configurations is beyond impractical for audio providers, even if we ignore issues specific to each listener's environment. Audio providers need a platform independent solution that will realize the audio designer's goal on as wide an array of systems as possible, without requiring custom authoring for any particular configuration.
Defining the Listening Environment
A couple of terms are useful in quantifying and qualifying the listener's environment, so that the delivery mechanism may more accurately deliver an audio experience.
Once the end user's system is profiled and calibrated, applications playing audio to that environment can adjust their audio content to the system to provide predictable results. With a calibrated playback chain whose attributes are known by the delivery mechanism, any given end user will be that much more likely to receive an optimized performance.
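As a rough illustration of what a profiled, calibrated playback chain might expose to applications, the sketch below defines a minimal profile record and a selection routine. All of the names (`PlaybackProfile`, `choose_render_mode`, the field set) are hypothetical, invented for this example; the report does not specify a concrete profile format.

```python
from dataclasses import dataclass

@dataclass
class PlaybackProfile:
    """Hypothetical record produced by profiling/calibrating a user's system."""
    speaker_config: str       # e.g. "stereo", "headphones", "5.1", "7.1"
    channel_gains_db: dict    # per-channel trim measured during calibration
    has_hrtf: bool            # driver/hardware offers HRTF virtualization
    cpu_budget_voices: int    # mixing voices the host can sustain

def choose_render_mode(profile: PlaybackProfile) -> str:
    """Pick a rendering strategy the delivery system can honor."""
    if profile.speaker_config == "headphones" and profile.has_hrtf:
        return "binaural"
    if profile.speaker_config in ("5.1", "6.1", "7.1"):
        return "discrete-multichannel"
    return "stereo-downmix"

profile = PlaybackProfile("5.1", {"C": -1.5, "LFE": 0.0}, False, 32)
print(choose_render_mode(profile))  # discrete-multichannel
```

An application consulting such a record before playback could adjust its content to the calibrated chain, as the paragraph above describes.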
Of course, variances between listening environments occur for many reasons, including
Aspects of a self-calibration system could include:
As an existing real-world example, many games already perform such profiling on installation and select (either automatically or through user choice) an installation appropriate for the resources found. This is a good model for multichannel profiling.
The group recognized that system profiling itself is only part of the long-term solution for a "write-once, deliver anywhere" approach to multichannel audio; the solution should also provide a sensible way to handle variances between many different types of systems. (More detail on how these systems can vary, and the problems this poses, appears further below.)
These variances can be accommodated by having each type of system adaptively determine how best to resolve the audio the author has designed. To that end, the group proposes the introduction of a new data protocol that will accompany an interactive audio stream for the purpose of assisting the playback system in determining how best to reproduce the associated audio.
The group defined this set of instructions as a metadata layer.
The Metadata Layer
The group worked to identify ways that audio experiences could be authored and delivered to provide the end users with maximum benefit from their varied playback environments.
It was agreed that a metadata layer is required to supply information to the playback system regarding channel routing and the location, orientation, and other attributes of the accompanying audio data. The format of this metadata layer could be based upon a 3D audio API such as Interactive 3D Audio Level 2 (I3DL2). That API defines interactive simulations of spatial acoustics and real-time audio environment modeling, and is intended to deliver highly realistic experiences.
The authoring tools should generate the metadata layer, which will accompany any audio stream produced in the tool and will instruct the playback system how to play the associated audio.
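To make the idea concrete, here is one possible shape for such per-stream metadata. The field set (channel layout, 3D position and orientation, channel routing, a fold-down permission flag) follows the attributes named above, but the structure itself is a hypothetical sketch, not a format defined by the report or by I3DL2.

```python
from dataclasses import dataclass, field

@dataclass
class SourceMetadata:
    """Hypothetical metadata record authored alongside an audio stream."""
    channel_layout: str                            # authored layout, e.g. "5.1"
    position: tuple = (0.0, 0.0, 0.0)              # 3D position of the source
    orientation: tuple = (0.0, 0.0, 1.0)           # facing vector for directional sources
    routing: dict = field(default_factory=dict)    # authored channel -> output bus
    downmix_allowed: bool = True                   # may the player fold down if needed?

meta = SourceMetadata(
    channel_layout="5.1",
    routing={"L": "front-left", "R": "front-right", "C": "center",
             "LFE": "lfe", "Ls": "surround-left", "Rs": "surround-right"},
)
```

A playback system reading this record would know both how the author intended the channels to be routed and whether it is permitted to adapt the stream when the intended layout is unavailable.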
Aspects of this metadata layer and the systems that use them include:
The metadata layer should also contain parameters that are sensitive to the following playback chain limitations:
The group also recognized that variances between systems will inevitably lead to cases where some systems will be incapable of delivering the audio designer's intended experience as expressed through the metadata layer. Some form of graceful degradation must be available to accommodate situations where the platform is not capable of reproducing full, discrete multichannel data (based on a profile of the user's circumstances).
For example, a system capable only of stereo output should have a way of resolving audio content such as streamed 5.1 format audio that outstrips its abilities, through a recommended process of downmixing or by other means, such as alternate compressed versions. The group sees a revision to the IASIG's "Recommended Practices For Handling Multi-Format Audio" report as ideally suited to providing these recommendations.
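One widely used fold-down for exactly this stereo-only case applies the ITU-R BS.775 downmix coefficients: the center and surround channels are mixed into the front pair at -3 dB, and the LFE channel is typically discarded. The function below sketches this for a single sample frame; the function name and signature are invented for illustration.

```python
import math

def downmix_51_to_stereo(l, r, c, lfe, ls, rs):
    """Fold one 5.1 sample frame down to stereo using the common
    ITU-R BS.775 coefficients; the LFE channel is discarded."""
    g = 1.0 / math.sqrt(2.0)   # -3 dB, approximately 0.707
    lo = l + g * c + g * ls
    ro = r + g * c + g * rs
    return lo, ro

# Center-only content lands equally in both output channels.
lo, ro = downmix_51_to_stereo(0.0, 0.0, 1.0, 0.5, 0.0, 0.0)
```

A metadata-aware player could apply this only when the stream's metadata permits fold-down, falling back to an alternate authored version otherwise.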
Authoring Tools and Methods
Platform independent development is the grail of the interactive multichannel authoring tool quest. It was proposed that, ideally, there should be one authoring platform for both linear and interactive content. The environment should have the following attributes:
The discussions that produced this document were quite broad in scope, but some key ideas surfaced that could form the basis for significant advances in multichannel and positional audio delivery on PCs and game platforms:
It is our hope that these ideas will continue to be explored and refined to the point where useful recommendations can be made to equipment and tools manufacturers for the development of technology which addresses this fundamental problem: How to deliver a 3D interactive multichannel audio experience in a complex, multiplatform world.
The working group will flesh out and distribute guidelines.
The working group will seek participation of representatives from other potential multichannel platforms, and more content providers for the working group.
Jim Rippie will forward the group's recommendations to the IASIG, with the aim of having its multiformat working group take them up, provided Michael Land and that group's executive advisor assent.
Individual members will evangelize the recommendations of the BBQ workgroup and the IASIG group to their respective constituencies.
Copyright 2000-2014, Fat Labs, Inc., ALL RIGHTS RESERVED