The Seventh Annual Interactive Music Conference
PROJECT BAR-B-Q 2002

Group Report: Proposal for Latency and Uncertainty (Jitter) Management by Enumerating Renderers and Sources
Participants, a.k.a. "The Plumbers":

Dan Bogard, SigmaTel
David Zicarelli, Cycling '74
Chris Grigg, Beatnik
Mak Jukic, Yamaha
Ron Kuper, Cakewalk
Steve Pitzel, Intel
Jim Rippie, Sonic Network
Brian Smithers, Full Sail
Keith Weiner, DiamondWare
Devon Worrell, Intel
Nathan Yeakel, Gibson Guitar

Facilitator: Linda Law, Fat Labs, Inc.
Executive Summary

Existing media streaming architectures do not adequately manage latency and synchronization in the converged home media environment. Left unaddressed, these limitations risk customer dissatisfaction and slow adoption of next-generation consumer products. We propose a problem-solving approach that applies principles of object-oriented computing to computing hardware and software. We also explore the challenges inherent in applying aggressive Digital Rights Management technologies to next-generation products. By creating a software-based management system that treats all elements of the audio chain as interconnected components, we can control and reduce latency, ensure that all media is properly synchronized, and provide a new generation of successful products for the emerging digital lifestyle in the home and for the professionals who provide that content.
Introduction

Existing media streaming architectures are already very mature and full-featured, and they continue to evolve and improve. Recent steps toward media convergence, e.g., the computer in the living room controlling the TV set, present a new proliferation of multiple input and output streams and disparate media types. The market will reveal new and no doubt unpredicted ways to use these multiple streams. The market for these products, however, appears to be developing faster than the products themselves.

While there are many and varied reasons for slow product innovation, including the lack of a widely accepted market solution for dealing with copyrighted (and possibly copy-protected) content, there are clearly technical limitations hampering our efforts. In this report, we deal with two principal issues: latency and synchronization. In particular, existing architectures do not adequately solve the synchronization and latency management problems that arise in the converged home media environment. At the same time, the unique needs of the professional media content creator have not been adequately met. Major software and hardware manufacturers (who shall remain nameless) are currently wrestling with issues of latency and synchronization in next-generation platforms, so there is an opportunity at BBQ 2002 to provide some guidance to their efforts.

This report addresses synchronization and latency in a framework for real-time media. It is platform agnostic and addresses the needs of the consumer as well as the professional content creator. We hope that the general principles we outline will form the basis for new hardware platform and operating system development and, when necessary, software standards that provide the media infrastructure for home entertainment and professional production.
Definitions and Assumptions

This paper describes a media system that represents streaming components as blocks connected in an acyclic graph, a la AVStream in Windows. We will use the term "component" to describe an element of the graph, such as an audio renderer or codec. We will use the term "graph manager" to describe the presumed system element which manages the connection and lifetime of components in a graph.
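To keep the terminology concrete, the following is a minimal C++ sketch of these two roles. Every type and method name here is our own illustration, not an existing API.

```cpp
#include <memory>
#include <utility>
#include <vector>

// An element of the acyclic media graph: a source, codec, effect, or renderer.
class Component {
public:
    virtual ~Component() = default;
};

// The presumed system element that owns components, their connections,
// and their lifetimes. Only the graph manager sees the whole graph.
class GraphManager {
public:
    Component& Add(std::unique_ptr<Component> c) {
        components_.push_back(std::move(c));
        return *components_.back();
    }
    // Record a directed edge; the manager is responsible for keeping
    // the graph acyclic (the check is omitted in this sketch).
    void Connect(Component& from, Component& to) {
        edges_.emplace_back(&from, &to);
    }
private:
    std::vector<std::unique_ptr<Component>> components_;
    std::vector<std::pair<Component*, Component*>> edges_;
};
```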
Latency: Background

Based on the usage scenarios, we see two key requirements for the latency of a media system on common hardware platforms: latency must be low enough for interactive use, and whatever latency remains must be known and reported, so that the rest of the system can compensate for it.
Latency: Proposed Solutions

"Discovery time" describes when new media components come to life. For example, if an end user plugs a DV camera into a 1394 port on a desktop computer, a new media component appears on the graph, perhaps as a DV input and decoder, with video rendered on the computer screen and audio rendered through the computer's speakers. A user downloading and installing a new audio processing component, like a three-dimensional spatializing or reverberation processor, represents another kind of media component appearing in the same system context. At discovery time, the graph manager asks each component to report its latency characteristics, as sketched below. These characteristics include, but are not limited to, quantities such as minimum and maximum latency, jitter (uncertainty), and whether internal buffering can be adjusted or disabled.
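For illustration, a discovery-time latency report might look like the following sketch. The exact field set is an assumption drawn from the characteristics discussed in this report, not a defined format.

```cpp
// Hypothetical latency report returned by a component at discovery time.
struct LatencyCharacteristics {
    double min_latency_ms;      // best-case delay through the component
    double max_latency_ms;      // worst-case delay, including internal buffering
    double jitter_ms;           // uncertainty: variation around the nominal delay
    bool   buffering_optional;  // can internal buffering be reduced or disabled?
};

class Component {
public:
    virtual ~Component() = default;
    // Queried by the graph manager when the component is discovered.
    virtual LatencyCharacteristics ReportLatency() const = 0;
};
```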
To avoid complicated logic among components, it is assumed that only the graph manager or (optionally) an application should need to traverse the graph. In other words, any determination of "global" latency is the sole responsibility of the graph manager, as in the sketch below. During runtime, changes can occur to the latency of the system. To allow for this, components must be able to notify the graph manager when their latency characteristics change, so that the manager can recompute global latency and inform interested components and applications.
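A minimal sketch of that traversal, assuming each node carries the latency its component reported at discovery: the graph manager alone walks the acyclic graph, and the worst-case path to a renderer is the global latency.

```cpp
#include <algorithm>
#include <vector>

struct Node {
    double latency_ms = 0.0;          // as reported by the component
    std::vector<const Node*> outputs; // downstream edges; the graph is acyclic
};

// Worst-case end-to-end latency from `source` to any renderer it reaches.
// Only the graph manager calls this; components never traverse the graph.
double GlobalLatencyMs(const Node& source) {
    double downstream = 0.0;
    for (const Node* out : source.outputs)
        downstream = std::max(downstream, GlobalLatencyMs(*out));
    return source.latency_ms + downstream;
}
```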
The graph manager is a suitable mechanism for global latency management because it has a bird's-eye view of the graph and can make good decisions about how to minimize latency. For example, suppose a USB component can optionally have its buffering disabled, thereby exposing all of its jitter. If this USB component lives upstream from an FFT plug-in that requires large buffer sizes, then there is no reason for the USB device to buffer as well; this additional stage of buffering is redundant and adds unnecessary latency to the system with no benefit. Since the graph manager is responsible for latency management, it could disable buffering on the USB device and minimize overall latency.
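That decision reduces to a simple comparison. The sketch below rests on our own simplifying assumption that a downstream buffer can absorb upstream jitter of equal or smaller magnitude; the names are hypothetical.

```cpp
struct StageBuffering {
    double jitter_ms;  // jitter the stage would expose with its buffering off
    double buffer_ms;  // latency the stage's internal buffering adds
};

// True if the upstream stage's buffering is redundant: the downstream stage
// already buffers enough to absorb the upstream jitter, so disabling the
// upstream buffering removes latency at no cost.
bool UpstreamBufferingRedundant(const StageBuffering& upstream,
                                const StageBuffering& downstream) {
    return downstream.buffer_ms >= upstream.jitter_ms;
}
```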
Synchronization: Background

At the same time, whatever solution is devised cannot degrade the experience for professional content creators: professionals and consumers will increasingly run different applications on the same basic platforms. As with latency, usage scenarios shed some light on the synchronization problems confronting professionals and consumers.
Scope and Goals for a Synchronization Solution

The system must be able to tolerate multiple simultaneous clocks, such as an audio card's sample clock and a video frame rate clock. The system must support multiple media types at once, such as audio, video, animations, and MIDI. If hardware-based synchronization is available, as would be the case in a professional authoring environment, it should be utilized. In a similar vein, because software-based asynchronous sample rate converters (ASRCs) are lossy, there needs to be a way to configure the graph without them. Some kinds of media streams live inside a "closed pipe", for example, a DVD movie that is being rendered directly to a TV set. The synchronization system needs to allow and account for these kinds of streams.
The Relationship between Synchronization and Latency

Latency in the processing chain also makes the task of actually implementing synchronization much more complicated for the developer. For example, suppose your processing chain has 1 second of latency. When you deliver a buffer into your graph, you are doing so based on the clock skew defined as "right now", even though the buffer won't actually be heard until a second from now. If your sync algorithm doesn't allow for this delayed effect, you will oscillate as you try to synchronize and never achieve useful sync.
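One way to avoid that oscillation is to compute corrections against the clock relationship at render time rather than delivery time. A minimal sketch, assuming a simple linear drift model and hypothetical names:

```cpp
// Correction (in seconds) to apply to a buffer delivered at `now_s`, given
// that the graph will not render it until `now_s + latency_s`. Targeting the
// render time rather than the delivery time means the correction matches the
// clock relationship at the moment the audio is actually heard.
double SyncCorrectionSeconds(double now_s, double latency_s, double skew_ppm) {
    double render_time_s = now_s + latency_s;  // when the buffer is heard
    return render_time_s * skew_ppm * 1e-6;    // drift accumulated by then
}
```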
Synchronization and DRM

If DRM restrictions were to prevent a platform from achieving reliable synchronization, the consequences would be tragic for content creators, the very people DRM purports to protect. Product adoption rates will slow dramatically as these customers turn instead to products that allow them to do their jobs on spec and on deadline. Since synchronization problems affect both consumers and creators, consumers will also reject products that don't behave the way they expect. While the group acknowledges the challenges faced by copyright owners, successful DRM cannot be the sole criterion for any new platform development, and should not be allowed to compromise the core platform's sync capabilities.
Synchronization Solutions

The graph manager must actively detect clock skew, e.g., by receiving a periodic notification from the master clock that an "abstract tick" has occurred. On each abstract tick, the graph manager can compare the reference clock to all other clocks and determine skew values, as in the sketch below. Based on the choice of the master clock, the graph manager must insert ASRCs in the appropriate path(s) before the renderer. To match the buffer size variation that will occur at the outputs of the ASRC, the graph manager may optionally require that the sources vary the output buffers that they produce upstream. In a professional production environment, all of the above mechanisms must be able to defer to external or hardware sync.

Because ASRCs will vary buffer sizes, the actual latency of the system may vary dynamically in cases where synchronization is being used. Fortunately, we've already designed a latency manager to assist in keeping components of the graph notified about these changes. Nonetheless, ASRCs should be avoided when possible. Some leverage in sync can be obtained by using the error-correcting capabilities of certain formats to avoid ASRC. For example, in AC-3 you can remove a sample and the codec will reconstruct the missing data (transport error correction). Similarly, phase vocoder techniques can be used during the decompression of compressed sources (such as MP3) to avoid ASRC. Finally, any DRM or trusted components must expose control inputs for synchronization, in the event that it is not possible to ASRC the unprotected data stream directly.
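The abstract-tick mechanism might look like the following sketch, with all names our own: on each tick, the graph manager compares how far each slaved clock advanced against the master and updates the conversion ratio for the ASRC on that path.

```cpp
#include <vector>

// Any clock in the system: an audio sample clock, a video frame clock, etc.
class MediaClock {
public:
    virtual ~MediaClock() = default;
    virtual double NowSeconds() const = 0;
};

struct SlavedPath {
    MediaClock* clock;
    double last_position_s = 0.0;
    double asrc_ratio = 1.0;  // conversion ratio for this path's ASRC
};

// Called by the graph manager on each abstract tick from the master clock.
void OnAbstractTick(const MediaClock& master, double& last_master_s,
                    std::vector<SlavedPath>& paths) {
    const double m = master.NowSeconds();
    const double master_delta = m - last_master_s;
    last_master_s = m;
    for (SlavedPath& p : paths) {
        const double s = p.clock->NowSeconds();
        const double slave_delta = s - p.last_position_s;
        p.last_position_s = s;
        if (slave_delta > 0.0 && master_delta > 0.0)
            // > 1.0 when the slaved clock ran slow relative to the master
            // over this tick; the path's ASRC compensates accordingly.
            p.asrc_ratio = master_delta / slave_delta;
    }
}
```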
Conclusion and Action Items

Appendix: A Call for a Standardized Audio Component Framework

Before moving to the specific problems of synchronization and latency management, the PLUMBER group discussed another barrier to new product development in the home and in professional markets: software audio components. This general term can be applied to a wide range of product types, from 3D expansion processors in hardware or software to sophisticated host-based audio reverberation and synthesis processors. The professional audio market offers a variety of audio component formats, some hardware and some software based, all entirely incompatible. While they can be considered "standards," they are proprietary, and the companies responsible for their development assume a documentation and support burden. In other industries, web standards for example, organizations independent of any single company take responsibility for the growth and development of the standard. In the absence of any industry-wide audio software component standards organization, audio component developers are required to develop multiple formats, or take the business risk of focusing on a single format. Application developers are frequently required to support and host multiple formats to satisfy customer demand. We discussed the definition of a component framework: what it provides, what properties it must have, and the characteristics of components, including their roles, responsibilities, and duties. A rough sketch of such a component contract follows.
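As one concrete illustration, a minimal component contract under such a framework might resemble the sketch below. It is entirely hypothetical and is not any existing plug-in format.

```cpp
#include <cstdint>
#include <string>

// Hypothetical minimal contract a standardized framework might define, so a
// single host can enumerate, configure, and run components from any vendor.
class AudioComponent {
public:
    virtual ~AudioComponent() = default;
    virtual std::string Name() const = 0;  // for host enumeration and UI
    // Negotiate the stream format before processing begins.
    virtual void Prepare(double sample_rate_hz, uint32_t max_block_frames) = 0;
    // Process one block of interleaved float samples in place.
    virtual void Process(float* samples, uint32_t frames, uint32_t channels) = 0;
};
```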
Several members of the group will assume the subtask of further defining and promoting such a component framework with an appropriate standards organization, such as the MMA. (See "Action Items," above.)

Benefits to Consumers, Software Developers, Pro Users, and Hardware Manufacturers

Consumers
Pro users
Software developers
Hardware manufacturers