The Eleventh Annual Interactive Music Conference

Group Report: Ensuring that PC Audio Editing/Rendering Plug-ins and Processors Always Work

Participants: A.K.A "Prug and Pray"

Gary E. Johnson, SMSC Austin
Len Layton, C-Media Electronics
Keith Weiner, DiamondWare
Benjamin Masse, Double V3
Peter Lupini, 3dB Research Ltd.
Trenton Henry, SigmaTel
Facilitator: Van Webster, Webster Communications

Problem Statement

Today’s PC-based audio editing/rendering equipment doesn’t always work well, especially with plug-ins, and can be more expensive than necessary due to piracy. In a complex computer environment, a broad range of tasks compete for CPU time, causing glitches, latency and delays. We propose a potential solution to multitasking demands in a PC environment.

Specific problems to be addressed:

  • Current applications have excessive latency (non-realtime OS), and hardware assist increases latency.
  • Plug-ins are application dependent and not universal (as opposed to the analog XLR 600 ohm connection standard).
  • Obsolescence occurs unacceptably often.
  • Current plug-ins are not cross-market applicable (e.g. home theater receivers vs. Pro Tools).

A Brief Statement of the Group’s Solutions to Those Problems:

The model for our solution approach is the analog world’s rack-mount outboard processors and patch bays. In these systems, each processor (e.g. UREI 1176) is a self-contained, fully functional unit that operates independently of the host mixing console. The unifying properties of the device are its audio interconnections, physical mounting and AC power connection. Applying the same principle to plug-ins and audio processing, we propose to define a combination of packaging, goesinto (inputs), goesouta (outputs), control and powering that can be modularly implemented across a wide range of applications. The core concept is that the application includes its own processing power and does not depend on the host CPU to operate.

With “It Just Works”, we propose dedicated, module-based processing units, similar in scale to a compact flash card, containing the software, processing capability, control communications, inputs, outputs, timing and power connections. As a hardware solution, this approach both protects the application IP and provides a known working product. The modules include full rendering-capable signal processing, so the PC is not required for any signal processing at all. The modules ship with a PC-based software Graphical User Interface (GUI) for control.

It Just Works; small format; mini version of 19” rack concept with standard backplane; protects applications from copying; guaranteed interoperability; signal processing on modules; markets could be expanded beyond audio to photos, video, graphics, etc.

Examples of modules include the main application (Pro Tools, Sonic Foundry, Acid, etc.), plug-ins for such applications (reverb, EQ, mixing, scaling, etc.), and DSP for loudspeaker compensation and spatialization. The control and signal interfaces will be standardized so any unit will interoperate with any other.

Action Items:

Review the concepts with industry participants, particularly Line 6, Sony and Cakewalk. Keith, Gary and Van to identify a key industry sponsor.

If the concept is feasible, an open group would then need to determine the specifications. BBQ reflector.

Expanded Problem Statement:

Plug-ins and audio processing all depend on the use of a central CPU which is burdened with a wide range of tasks, both specific and general. While there are options to prioritize the housekeeping tasks, implementing the options is tedious and often incomplete. In complex audio applications, the insertion of plug-ins and mixing functions can overload the CPU. There are also incompatibilities between audio application programs that interfere with the operation of processes on a cross-application basis.

The goal of a solution to this problem is to ensure that plug-ins and other processors always work, and that the processing burden on the CPU is reduced by dedicating outboard processing power to each specific application.

Expanded solution description:

The proposed solution is to create an outboard software and hardware modular packaging, bussing and interface that will combine all the functions needed to implement an application, independently of the host CPU. Because each modular package is functionally self-contained, the interconnections can be summarized as goesinto, goesouta, control, timing and power. Each module would contain the software and processing necessary to accomplish the application. By optimizing the processing capability to the task at hand and operating in parallel with the CPU and other modular units, computer capacity can be increased without replacing the CPU, and conflicts within the operating system can be avoided.

The form factor of each module is a single plug-in card with scalable dimensions. Given a common connecting scheme and physical mounting, cards could be constructed in varying sizes, including alternatives for length and multiple layers. The docking hardware could be an outboard device similar in size to an AC plug strip, slots in a host piece of hardware, or an internally located, non-user-changeable mounting.

An additional benefit of this hardware/software-based approach is to reduce the likelihood of incidental piracy and unauthorized copying. As the software and hardware are a matched set, casual copying is unlikely and mass piracy is prohibitively expensive. There could be a version of the product that would permit software upgrades to existing hardware.

Pro audio and musical applications can include all of the types of processing currently implemented in VST and similar software. Mixing applications, semi-virtual synthesizers and loudspeaker DSP could all be implemented in this system. Multi-plug outboard buss systems could be used for audio applications using PCs of any flavor as controllers. Plug-in slots could be included in keyboards, amplifiers and other musical instruments to apply custom processing of the user’s choice, controlled by hardware knobs and/or small-scale LCD screens with menu implementation.

Mass market CE applications may be essential to establish the production volume necessary to attract chip manufacturers to this product category. There is also a growing capacity with overseas manufacturers to produce custom chips in low volume at a reasonable price point. Some mass-market applications could include loudspeaker compensating DSP for consumer sound systems. Receivers and other audio devices could include a slot for a processor matched to a specific set of speakers. Each speaker system would come with its own card for field installation. Other applications could be dedicated processing for mobile devices, automotive sound and workspace environments.

We further feel that this infrastructure could be applied to non-audio applications including video editing, special effects, graphics and photographic manipulations. Self-contained computing modules may also be applicable to other commercial and business applications. Medical equipment is a likely candidate for this packaging scheme. Machine control and automation are possible. Industrial testing and measurement could also find this approach commercially viable.

Brainstorming List:

  • Digital hardware/software add in card system
  • Music rendering tools
  • Example products available: VST standard for plug-ins; Euphonix, Neve & Harrison automated consoles; Sound Forge; Transputer, Nuendo, Cubase, Pro Tools.
  • Compact Flash physical size as an initial model for packaging. Scalability of the package size would be needed to accommodate a range of processing capabilities.
  • We view the “buss” based module housing sized to fit 10-12 devices max.
  • This standard will need to define – control, audio/data input, audio/data output buses; file formats; physical descriptions for modules, chassis, etc.
  • GUI on PC, defined for use in all applications (VST standard?)
  • To make this work we will need to identify key partners with the capacity and means to implement a solution.
  • Each module must have dynamic enumeration discovery: unique ID; type / capabilities; ins / outs; parameters
  • The processing modules will require a defined sample rate / bit depth for both final output and intermediate processing (to avoid compounding of errors).
  • Define transport format
  • Sneakernet application-key transfer using hardware modules, enabling multiple users in a work group to access the processing as needed without the need for multiple licenses.
  • Bus speed must be scalable for future proofing
  • One of our tasks is to build a business case.
  • A key opportunity at this time is that contract manufacturers can now do small runs of custom designs cost effectively.
  • Concept can be extended beyond studio tool to home/office
  • One CPU, memory, etc. per module for complete signal processing on card, pushing control off to PC

Other reference material:

“A Freely Configurable Audio-Mixing Engine with Automatic Loadbalancing”, M. Rosenthal, M. Klebl, A. Gunzinger, G. Troster; Electronics Laboratory, Swiss Federal Institute of Technology, March 7, 1995

“Automatic Reconfiguration of a Digital Audio Mixing Engine”, M. Rosenthal, P. Kohler, A. Gunzinger; Electronics Laboratory, Swiss Federal Institute of Technology

Technical Issues:

32-bit (integer? Floating point?) 48kHz linear PCM is sufficient resolution for now and forever due to human limitations (validate with group). Avoid accumulating round-off errors.
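
As a rough illustration of the round-off concern, the sketch below runs a test tone through a chain of gain stages and compares re-quantizing after every stage at 16 bits versus 32 bits against an unquantized reference. The function names, the test signal and the gain values are all assumptions made for this example; nothing here was specified by the group.

```python
import math

def quantize(x, bits):
    """Round x to the nearest step of a signed fixed-point grid of the given word length."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def run_chain(samples, gains, bits=None):
    """Apply each gain in turn; if bits is given, re-quantize after every stage."""
    out = list(samples)
    for gain in gains:
        out = [s * gain for s in out]
        if bits is not None:
            out = [quantize(s, bits) for s in out]
    return out

def max_error(a, b):
    """Largest absolute difference between two equal-length sample lists."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical test signal: 10 ms of a 1 kHz sine at 48 kHz.
signal = [0.1 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]
# Hypothetical chain: 20 stages alternating attenuation and boost (net gain about 1).
gains = [0.7, 1 / 0.7] * 10

reference = run_chain(signal, gains)  # floating point throughout, no re-quantizing
err16 = max_error(reference, run_chain(signal, gains, bits=16))
err32 = max_error(reference, run_chain(signal, gains, bits=32))
print(f"16-bit intermediates: max error {err16:.3e}")
print(f"32-bit intermediates: max error {err32:.3e}")
```

The 16-bit chain drifts much further from the reference than the 32-bit chain, which is the reason for fixing a wide intermediate format across all cards.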

If this is so, then a given card needs only N times ~3 Mbps of bandwidth (N = number of simultaneous channels):

  • 4 bytes/sample * 8 bits/byte * 48000 samples per second * 2 (full duplex) = 3,072,000 bits per second per channel
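
The same arithmetic can be written as a small helper. This is only a sketch of the estimate above: the function name and its defaults are assumptions, and it ignores any packet or framing overhead the eventual bus format would add.

```python
def card_bandwidth_bps(channels, bytes_per_sample=4, sample_rate_hz=48_000, full_duplex=True):
    """Raw audio payload bandwidth for one card, in bits per second."""
    bits_per_sample = bytes_per_sample * 8
    directions = 2 if full_duplex else 1
    return channels * bits_per_sample * sample_rate_hz * directions

one_channel = card_bandwidth_bps(1)     # 3_072_000 bps, i.e. roughly 3 Mbps
eight_channels = card_bandwidth_bps(8)  # 24_576_000 bps
print(one_channel, eight_channels)
```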

Key concept: The bus should be a switched fabric where bandwidth to each card is independent of the others.
This means no obsolescence for a card (just like 1960’s vintage effects: if you still like it, keep using it forever!). It also means small systems (e.g. < 8 cards) can run a slower fabric than large systems (just like Ethernet switches).

Each card has a DSP and real-time OS.
Latency should be < 1ms (plus algorithm delay).
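
To put the latency target in terms of samples: at 48 kHz, 1 ms is exactly 48 sample periods, so a card's transport buffering must stay below that. The helpers below are a sketch with assumed names, and they cover only buffering delay, not algorithm delay.

```python
import math

def latency_ms(buffered_samples, sample_rate_hz=48_000):
    """Delay contributed by holding this many samples in a buffer."""
    return 1000.0 * buffered_samples / sample_rate_hz

def max_buffer_samples(budget_ms=1.0, sample_rate_hz=48_000):
    """Largest whole number of samples whose delay is strictly under the budget."""
    return math.ceil(budget_ms * sample_rate_hz / 1000.0) - 1

print(latency_ms(48))        # 1.0 ms: exactly at the limit, so not allowed
print(max_buffer_samples())  # 47 samples keeps the card under 1 ms
```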

Kinds of functions:

  • Source
    • No inputs (in system scope)
    • Provides data at a fixed rate
    • No parameters(?)
  • Sink
    • No outputs (in system scope)
    • Consumes data at a fixed rate
    • Upstream must be synced to Sink
    • No parameters(?)
  • Processor
    • Input and output
    • Parameters control processing
  • Mixer
    • Maps X inputs to Y outputs (X not necessarily > Y, i.e. could also be a “Y” cable)
    • Parameters control processing
      • Volume / cross-fade / panning
      • Spatial position
      • Etc.
    • Upstream must be synced to the mixer
  • Hybrid
    • No reason why multiple blocks can’t be on one card
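
One possible way to model these kinds of functions in software is sketched below. The class and field names are hypothetical. The open question of whether sources and sinks carry parameters is left unenforced, as it is in the list above.

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    SOURCE = "source"
    SINK = "sink"
    PROCESSOR = "processor"
    MIXER = "mixer"

@dataclass(frozen=True)
class FunctionBlock:
    kind: Kind
    inputs: int
    outputs: int
    parameters: tuple = ()

    def check(self):
        """Enforce the input/output rules from the list; return self if valid."""
        if self.kind is Kind.SOURCE and self.inputs != 0:
            raise ValueError("a source has no inputs in system scope")
        if self.kind is Kind.SINK and self.outputs != 0:
            raise ValueError("a sink has no outputs in system scope")
        if self.kind in (Kind.PROCESSOR, Kind.MIXER) and (self.inputs < 1 or self.outputs < 1):
            raise ValueError("processors and mixers need both inputs and outputs")
        return self

# A hybrid card is simply several blocks on one card.
hybrid_card = [
    FunctionBlock(Kind.SOURCE, 0, 2).check(),
    FunctionBlock(Kind.PROCESSOR, 2, 2, ("threshold", "ratio")).check(),
    FunctionBlock(Kind.MIXER, 1, 2, ("volume",)).check(),  # 1 in, 2 out: a "Y" cable
    FunctionBlock(Kind.SINK, 2, 0).check(),
]
print([block.kind.value for block in hybrid_card])
```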

Each card reports:

  • GUID
  • Manufacturer ID
  • Product ID
  • Friendly name
  • Number of functions
  • Type of each function (e.g. source, processor, etc.)
  • Number of parameters
  • Type of each parameter, range, etc. (needs to be defined)
  • Capacity: number of simultaneous instances

Cards must be dynamically discovered.
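
A minimal sketch of what a card's self-description and dynamic discovery might look like from the host side follows. The field names mirror the list above, but the classes, the `Backplane` API, the ID values and the example card are all hypothetical; a real system would enumerate over the bus rather than in memory.

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class ParameterInfo:
    name: str
    value_type: str   # e.g. "float", "int", "bool" (type vocabulary still to be defined)
    minimum: float
    maximum: float

@dataclass(frozen=True)
class CardDescriptor:
    guid: uuid.UUID
    manufacturer_id: int
    product_id: int
    friendly_name: str
    function_types: tuple   # one entry per function, e.g. ("source", "processor")
    parameters: tuple       # ParameterInfo entries
    capacity: int           # number of simultaneous instances

class Backplane:
    """In-memory stand-in for the bus: cards can appear and disappear at any time."""
    def __init__(self):
        self._slots = {}

    def plug(self, slot, card):
        if slot in self._slots:
            raise ValueError(f"slot {slot} is occupied")
        self._slots[slot] = card

    def unplug(self, slot):
        return self._slots.pop(slot)

    def discover(self):
        """Return descriptors of all cards currently present, in slot order."""
        return [self._slots[s] for s in sorted(self._slots)]

reverb = CardDescriptor(
    guid=uuid.UUID("12345678-1234-5678-1234-567812345678"),
    manufacturer_id=0x1234,
    product_id=0x0001,
    friendly_name="Example Reverb",
    function_types=("processor",),
    parameters=(ParameterInfo("decay_seconds", "float", 0.1, 10.0),),
    capacity=4,
)
bus = Backplane()
bus.plug(3, reverb)
names = [card.friendly_name for card in bus.discover()]
print(names)
```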

A generic user interface should be able to present a screen to control all parameters for each function (GUID could allow manufacturer to find its specific hardware and provide a better plug-in for a complex function like reverb).
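
The sketch below shows one way a host could build a generic control screen from reported parameters and fall back to it when no manufacturer-specific editor is registered for a card's GUID. The dictionary keys, widget names and GUID strings are assumptions made for illustration only.

```python
def generic_controls(parameters):
    """Map each reported parameter to a plain widget description."""
    widgets = {"bool": "checkbox", "enum": "dropdown"}
    return [
        {
            "label": p["name"],
            "widget": widgets.get(p["type"], "slider"),  # numeric types get a slider
            "min": p.get("min"),
            "max": p.get("max"),
        }
        for p in parameters
    ]

def choose_ui(card_guid, parameters, vendor_uis):
    """Prefer a manufacturer-supplied editor keyed by GUID, else the generic screen."""
    custom = vendor_uis.get(card_guid)
    if custom is not None:
        return custom(parameters)
    return generic_controls(parameters)

reverb_params = [
    {"name": "decay", "type": "float", "min": 0.1, "max": 10.0},
    {"name": "bypass", "type": "bool"},
]
vendor_uis = {"guid-with-custom-editor": lambda params: "vendor reverb editor"}

print(choose_ui("guid-without-editor", reverb_params, vendor_uis))
print(choose_ui("guid-with-custom-editor", reverb_params, vendor_uis))
```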

The design must define the data format (e.g. RTP?), number of channels, number of samples, sample rate, etc. Should a block be able to constrain the system (e.g. don’t give me < 1000 samples or > 2000 samples)?
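
If blocks are allowed to constrain the system, the host has to find a block size that every block in a chain accepts. The sketch below assumes each block reports a (minimum, maximum) sample count per call; the function name and the tuple convention are hypothetical.

```python
def negotiate_block_size(constraints, preferred):
    """Pick a block size every block accepts, as close to `preferred` as possible.

    `constraints` is a list of (min_samples, max_samples) pairs, one per block.
    Returns None when the ranges do not overlap.
    """
    lowest_allowed = max(lo for lo, _ in constraints)
    highest_allowed = min(hi for _, hi in constraints)
    if lowest_allowed > highest_allowed:
        return None
    return min(max(preferred, lowest_allowed), highest_allowed)

# One block insists on 1000..2000 samples; the other accepts anything up to 4096.
print(negotiate_block_size([(1000, 2000), (1, 4096)], preferred=512))  # 1000
# Incompatible blocks: one wants at least 1000, the other at most 512.
print(negotiate_block_size([(1000, 2000), (1, 512)], preferred=512))   # None
```

Note that a 1000-sample minimum at 48 kHz is already about 21 ms of buffering, so a constraint like this conflicts with a sub-millisecond latency target and the specification would need to say which one wins.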

The PC runs the user interface. Alternatives include a small touch panel, a remote-control UI on a TV (for home theater receivers), etc.

Interesting problem: a speed-changer function provides output samples at a different rate than it consumes them. Buffering doesn’t solve this if consumption is faster than production. Also, a finite buffer would eventually overflow. Solution: traverse the chain from sink to source and query each block: to produce Y samples of output, how many samples of input do you need?
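
The sink-to-source query can be sketched as follows. Each stage is reduced to a fixed ratio of output samples per input sample, which is an assumption made for brevity (a real speed changer would also carry fractional state between calls), and the class and function names are hypothetical.

```python
import math
from fractions import Fraction

class Stage:
    """A block that produces `ratio` output samples for every input sample."""
    def __init__(self, name, ratio=Fraction(1)):
        self.name = name
        self.ratio = Fraction(ratio)

    def inputs_needed(self, outputs_wanted):
        """Answer the query: to produce this many outputs, how many inputs are needed?"""
        return math.ceil(outputs_wanted / self.ratio)

def plan_pull(chain, sink_request):
    """Walk from the sink back to the source, asking each stage what it needs.

    `chain` is ordered source-side first. Returns (per-stage plan, samples the
    source must supply), where each plan entry is (name, inputs, outputs).
    """
    needed = sink_request
    plan = []
    for stage in reversed(chain):
        required_in = stage.inputs_needed(needed)
        plan.append((stage.name, required_in, needed))
        needed = required_in
    plan.reverse()
    return plan, needed

# Double-speed playback consumes two input samples per output sample (ratio 1/2).
chain = [Stage("reverb"), Stage("speed x2", Fraction(1, 2))]
plan, from_source = plan_pull(chain, sink_request=480)
print(plan)         # [('reverb', 960, 960), ('speed x2', 960, 480)]
print(from_source)  # 960
```

Because the sink drives the request, no block is ever handed more data than it can consume, which avoids the unbounded-buffer problem described above.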

KW/DiamondWare will put some code into the pool if this becomes real.
