Project Bar-B-Q 2013 report section 4

The Eighteenth Annual Interactive Audio Conference PROJECT BAR-B-Q 2013

Group Report:
HD Audio Capture in Consumer Devices

Participants: A.K.A. "Dark Side of Devon"

Phil Brown, Dolby Labs

Devon Worrell, Intel

Scott McNeese, Cirrus Logic

Diby Nandy, Knowles

Leng Ooi, Google

Ted Kao, DTS

Michael Jessup, Dolby

Mikko Suvanto, Akustica

Facilitator: Phil Brown, Dolby Labs


Problem Statement

Consumer “smart” devices are used to capture audio for multimedia, voice and speech. User-generated media makes up a growing fraction of the media consumed through video and audio sharing services and social apps. Ubiquitous audio capture is driven by easily available multi-purpose devices, such as tablets and smartphones, at very accessible price points. Such devices tend to be limited by the platform, which may be designed primarily for one purpose (usually not audio) and then re-purposed for others, with audio included only as a minimum-requirement checklist item. The evolution of devices toward smaller, thinner and lighter form factors imposes further constraints on acoustic design. As a result, audio captured by current mobile devices has low quality and fidelity.

The group identified use cases that highlight the deficiencies in audio capture and provide opportunities for high-quality consumer audio:

Use phone/tablet as camcorder – long range capture
- record concert, kids playing, activities, lectures, conference
Capture people talking
- Voice communication – Skype, FaceTime
  - Transmit and receive clear speech
- Speech recognition
- Simultaneous communication + speech recognition
  - Distinguish and manage communication speech and command & control words
- Biometric analysis
  - Voice Recognition
  - Stress, emotion detection
Acoustic scene analysis
- Activity detection during low power standby
- Sound track acoustic analysis to determine context of the content
- Use mic to monitor and optimize playback performance
Directional/focused capture
- Full-band audio capture (e.g. a concert) – capture the source at 30 ft without interference from sources at 5 ft; record based on proximity
Wireless/Remote capture
- Lavalier mic on a speaker broadcast over a local network
Control directional capture automatically
- While changing cameras or when the device is rotated, e.g. during Skype/FaceTime capture
Capture audio on wearable devices
- Command and control on wrist in any position
- Context based audio capture for smart eyewear for multi-media, command and control, communication
Multimedia capture
- Capability to provide mono, stereo, surround, spatial depending on playback mode
Capture and stream real-time or store it for later playback
Capturing ultrasonic data
- Impact of location of mic, port geometry

Key Problems
The problems may be defined by limitations that arise in old and new use cases for audio capture enabled on “smart” mobile devices.

Dynamic range limitations in the transducers
- The noise floor of the microphones limits the low end.
- Acoustic handling capability limits the high end.
Use of multiple microphones on a device:
- Unable to select a subset of microphones e.g. horizontal pairs of microphones based on orientation.
- Unable to use more than 2 microphones simultaneously
- Different types of microphones are used on a single device, yet where they are located, and which microphone(s) to use for a given application and orientation, is unknown
Devices cannot fully determine the intent of the content creator, even in limited contexts. It is challenging to determine what to capture, e.g. environment, individuals, wideband, narrowband, speech, voice, etc.
- Sensors, like accelerometers and gyroscopes, which may provide context are not being exploited for controlling audio capture.
- Power management: Sensors are not on same power domain and may not all be accessible in the same power state.
- Components like microphones and codecs usually come from different vendors and have different performance characteristics.
Processing solutions/algorithms
- Algorithms come from multiple vendors and do not interoperate.
- Most noise reduction produces monaural output where spatial audio is preferred.
- The OS is an impediment to high-quality audio capture.
Audio quality is compromised due to BOM cost of devices and software
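
As a rough illustration of the dynamic range limitation listed above, the usable range of a microphone can be estimated from two datasheet figures: the signal-to-noise ratio (conventionally specified against a 94 dB SPL reference tone) and the acoustic overload point (AOP). The sketch below is not from the workgroup; the function names and the example figures (65 dB SNR, 120 dB SPL AOP) are illustrative assumptions.

```python
# Minimal sketch: estimate microphone dynamic range from datasheet figures.
# All names and example values are hypothetical, chosen for illustration only.

REFERENCE_SPL_DB = 94.0  # 1 Pa reference level commonly used for microphone SNR specs


def equivalent_input_noise_db_spl(snr_db: float) -> float:
    """Noise floor expressed as an equivalent acoustic level (dB SPL)."""
    return REFERENCE_SPL_DB - snr_db


def dynamic_range_db(snr_db: float, aop_db_spl: float) -> float:
    """Span between the noise floor (low end) and the overload point (high end)."""
    return aop_db_spl - equivalent_input_noise_db_spl(snr_db)


if __name__ == "__main__":
    snr_db = 65.0       # assumed datasheet SNR
    aop_db_spl = 120.0  # assumed acoustic overload point
    print(f"noise floor:   {equivalent_input_noise_db_spl(snr_db):.1f} dB SPL")
    print(f"dynamic range: {dynamic_range_db(snr_db, aop_db_spl):.1f} dB")
```

With these assumed figures the noise floor sits at 29 dB SPL and the dynamic range is 91 dB, so a quiet room and a loud concert cannot both be captured cleanly without some form of gain control.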

Proposed Solutions

The group determined that solutions need to be defined in terms of the full platform design. The diagram below defines the interdependencies between the different components. The following are necessary to enable such capabilities:

More microphones
Better microphones with improved SNR, dynamic range, resonance, sealing & isolation
Glue-only microphones to improve fidelity and to lower cost
Single package microphone arrays
Better speakers for better echo cancellation and better playback of recorded content
Better algorithms that work well with microphones and codec – robust to microphone placements, distance and quality
Improve dynamic range through microphone control of amplifiers
Microphone characterization / parameters available to algorithm developers and in real-time to system
Real-time availability of sensor data to improve ambient contextual awareness, e.g. orientation, geo-location, focal distance of the lens, distance of the object, face recognition, time stamp, format, position
Pluggable compute architecture to extend processing capability
Ensure that needed sensors are ON when microphones are used
Real time algorithm change based on sensor data
Standardize info reporting so codec, microphone, algorithm developers can acquire info for device customization, updates, etc.
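
To make the ideas of reported microphone parameters and sensor-driven algorithm changes more concrete, here is a minimal sketch of selecting a stereo microphone pair from the device orientation. It is not a design from the workgroup: the `MicDescriptor` record, its fields, the `select_stereo_pair` function, the example microphone positions, and the assumption that rotation is reported in 90-degree steps are all hypothetical.

```python
# Minimal sketch: pick the microphone pair that is most "horizontal" for the
# current device rotation. The descriptor format and all names are hypothetical.
from dataclasses import dataclass
from itertools import combinations
from typing import Sequence, Tuple


@dataclass(frozen=True)
class MicDescriptor:
    """Hypothetical per-microphone record a platform could report to algorithms."""
    name: str
    x_mm: float  # position along the short edge of the device (portrait)
    y_mm: float  # position along the long edge of the device (portrait)


def select_stereo_pair(
    mics: Sequence[MicDescriptor], rotation_deg: int
) -> Tuple[MicDescriptor, MicDescriptor]:
    """Return the pair with the widest horizontal and smallest vertical spacing."""
    if len(mics) < 2:
        raise ValueError("need at least two microphones")
    if rotation_deg % 90 != 0:
        raise ValueError("rotation is assumed to be reported in 90-degree steps")
    landscape = (rotation_deg // 90) % 2 == 1

    def score(pair: Tuple[MicDescriptor, MicDescriptor]) -> Tuple[float, float]:
        a, b = pair
        dx, dy = abs(a.x_mm - b.x_mm), abs(a.y_mm - b.y_mm)
        # In landscape the long edge of the device becomes the horizontal axis.
        horizontal, vertical = (dy, dx) if landscape else (dx, dy)
        return (horizontal, -vertical)

    return max(combinations(mics, 2), key=score)


if __name__ == "__main__":
    mics = [
        MicDescriptor("bottom_left", -30.0, -70.0),
        MicDescriptor("bottom_right", 30.0, -70.0),
        MicDescriptor("top_left", -30.0, 70.0),
    ]
    for rotation in (0, 90):
        pair = select_stereo_pair(mics, rotation)
        print(rotation, [m.name for m in pair])
```

The sketch does not decide which microphone of the pair is the left channel, since that depends on the direction of rotation. It only shows that orientation data combined with known microphone positions is enough to choose a subset of microphones at run time.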

[Figure: block diagram of sensors and components, showing what app and algorithm developers need for development]

Smart processing blocks (AGC/ALC, spatializer) are available.

References

2011 Definition of Audio Quality and Happiness
Explores audio quality in terms of experience and presents 6 metrics that attempt to revitalize the definition of ‘quality audio’ by focusing on consumer experiences.
2008 Smart Ambient Sound Sensor
Proposes the creation of a new form of acoustic monitoring for the PC space that can be used to improve user experience with minimal user interaction.
2006 A Consumer-friendly Quantifiable Metric for Audio Systems
A proposal for a consumer-friendly quantifiable metric for audio systems that can help provide a great listening experience for the user, as well as generate market growth through increased awareness of the value of quality components.

The action item list

#	Who’s Responsible	Due Date	Description
1	Diby	11/21/2013	Complete report for publication
2	Devon	Ongoing	Make recommendations to OEMs on designs
3	Diby, Mikko	Ongoing	Microphone vendors to improve microphone design
4	Leng	Ongoing	OS vendors (Microsoft, Apple, Google) to provide a methodology for exposing sensor data
5	Phil, Mike, Ted	Ongoing	Algorithm developers to update algorithms
