Consumer “smart” devices are used to capture audio for multimedia, voice and speech applications. Such user-generated media makes up a growing fraction of the media consumed through video and audio sharing services and social apps. Ubiquitous audio capture is driven by easy access to multi-purpose devices, such as tablets and smartphones, at very accessible price points. Such devices tend to be limited by the platform, which may be designed primarily for one purpose (usually not audio) and then repurposed for others, with audio included only as a minimum-requirement checklist item. The evolution of devices toward smaller, thinner and lighter form factors also constrains acoustic design. As a result, audio captured by current mobile devices has low quality and fidelity.
The group identified use cases that highlight the deficiencies in audio capture and provide opportunities for high-quality consumer audio:
- Use phone/tablet as camcorder – long range capture
- Record concerts, kids playing, activities, lectures, conferences
- Capture people talking
- Voice communication – Skype, FaceTime
- Transmit and receive clear speech
- Speech recognition
- Simultaneous communication + speech recognition
- Distinguish and manage communication speech and command & control words
- Biometric analysis
- Voice Recognition
- Stress, emotion detection
- Acoustic scene analysis
- Activity detection during low power standby
- Sound track acoustic analysis to determine context of the content
- Use mic to monitor and optimize playback performance
- Directional/focused capture
- Full-band audio capture (e.g. a concert): capture the source at 30 ft but not interference at 5 ft; record based on proximity
- Wireless/Remote capture
- Lavalier mic on a speaker broadcast over a local network
- Control directional capture automatically
- While switching cameras or when the device is rotated, e.g. during Skype/FaceTime capture
- Capture audio on wearable devices
- Command and control on wrist in any position
- Context based audio capture for smart eyewear for multi-media, command and control, communication
- Multimedia capture
- Capability to provide mono, stereo, surround, spatial depending on playback mode
- Capture and stream real-time or store it for later playback
- Capturing ultrasonic data
- Impact of location of mic, port geometry
The problems may be defined by limitations that arise in old and new use cases for audio capture enabled on “smart” mobile devices.
- Dynamic range limitations in the transducers
- The noise floor of the microphones limits the low end.
- Acoustic handling capabilities limit the high end.
- Use of multiple microphones on a device:
- Unable to select a subset of microphones e.g. horizontal pairs of microphones based on orientation.
- Unable to use more than 2 microphones simultaneously
- Different types of microphones are used on a device, but where they are located, and which microphone(s) to use for a specific application and orientation, is unknown.
- Devices are not capable of fully determining the desire of the content creator, even in limited contexts. It is challenging to determine what to capture e.g. environment, individuals, wideband, narrowband, speech, voice, etc.
- Sensors, like accelerometers and gyroscopes, which may provide context are not being exploited for controlling audio capture.
- Power management: Sensors are not on the same power domain and may not all be accessible in the same power state.
- Components like microphones and codecs usually come from different vendors and have different performance characteristics.
- Processing solutions/algorithms
- Algorithms come from multiple vendors and they don't interoperate.
- Most noise reduction produces monaural output where spatial audio is preferred.
- The OS is an impediment to high-quality audio capture.
- Audio quality is compromised due to BOM cost of devices and software
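The dynamic range limitation listed above can be made concrete with a little arithmetic. The following is a minimal sketch, not a measurement procedure: the function names and the example figures (a 29 dB SPL noise floor and a 120 dB SPL overload point) are illustrative assumptions and do not come from the group's data. It treats usable dynamic range as the span between the noise floor and the highest level the transducer handles, and uses the common datasheet convention of quoting microphone SNR relative to a 94 dB SPL (1 Pa) reference tone.

```python
def dynamic_range_db(noise_floor_dbspl: float, overload_point_dbspl: float) -> float:
    """Usable dynamic range in dB: overload point minus noise floor (both in dB SPL)."""
    if overload_point_dbspl <= noise_floor_dbspl:
        raise ValueError("overload point must be above the noise floor")
    return overload_point_dbspl - noise_floor_dbspl


def snr_db(noise_floor_dbspl: float, reference_dbspl: float = 94.0) -> float:
    """SNR in dB relative to a reference tone (94 dB SPL by common convention)."""
    return reference_dbspl - noise_floor_dbspl


# Hypothetical microphone figures, chosen only for illustration.
noise_floor = 29.0      # equivalent input noise, dB SPL
overload_point = 120.0  # highest level handled without severe distortion, dB SPL

print(dynamic_range_db(noise_floor, overload_point))  # 91.0
print(snr_db(noise_floor))                            # 65.0
```

Lowering the noise floor extends the low end and raising the overload point extends the high end; either change widens the range by the same number of decibels.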
The group determined that solutions need to be defined in terms of the full platform design. The diagram below defines the interdependencies between the different components. The following are necessary to enable such capabilities:
- More microphones
- Better microphones with improved SNR, dynamic range, resonance, sealing & isolation
- Glue-only microphones to improve fidelity and to lower cost
- Single package microphone arrays
- Better speakers for better echo cancellation and better playback of recorded content
- Better algorithms that work well with microphones and codec – robust to microphone placements, distance and quality
- Improve dynamic range through microphone control of amplifiers
- Microphone characterization / parameters available to algorithm developers and in real-time to system
- Real-time availability of sensor data to improve ambient contextual awareness, e.g. orientation, geo-location, focal distance of lens, distance of the object, face recognition, time stamp, format, position
- Pluggable compute architecture to extend processing capability
- Ensure that needed sensors are ON when microphones are used
- Real time algorithm change based on sensor data
- Standardize info reporting so codec, microphone, algorithm developers can acquire info for device customization, updates, etc.
- sensor, components, block diagram, what the app/algorithm developers need for development
- Smart processing, such as AGC/ALC and a spatializer, is available
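As one illustration of how real-time sensor data and microphone placement information could work together, the sketch below picks the microphone pair that is most nearly horizontal for the current device orientation. Everything in it is an assumption made for illustration: the function name `horizontal_pair`, the microphone layout, and the premise that the platform exposes microphone positions and an in-plane gravity reading to the application. It is not an existing OS or vendor API.

```python
import math
from itertools import combinations


def horizontal_pair(mics, gravity_xy):
    """Return the two microphone names with the widest world-horizontal spacing.

    mics:       dict mapping microphone name -> (x, y) position in mm,
                in device coordinates (hypothetical layout data).
    gravity_xy: in-plane (x, y) components of the accelerometer's gravity
                reading, in the same device coordinates. Only the direction
                matters, so either sign convention works.
    """
    gx, gy = gravity_xy
    norm = math.hypot(gx, gy)
    if norm < 1e-6:
        raise ValueError("device is lying flat; no in-plane orientation available")
    # Unit vector perpendicular to gravity within the screen plane,
    # i.e. the world-horizontal axis expressed in device coordinates.
    hx, hy = -gy / norm, gx / norm

    def horizontal_span(pair):
        (ax, ay), (bx, by) = mics[pair[0]], mics[pair[1]]
        return abs((bx - ax) * hx + (by - ay) * hy)

    return max(combinations(sorted(mics), 2), key=horizontal_span)


# Hypothetical four-microphone layout (positions in mm from the device centre).
layout = {"top": (0, 70), "bottom": (0, -70), "left": (-30, 0), "right": (30, 0)}

print(horizontal_pair(layout, (0.0, -9.8)))   # portrait: ('left', 'right')
print(horizontal_pair(layout, (-9.8, 0.0)))   # landscape: ('bottom', 'top')
```

A real implementation would also need hysteresis so the selection does not flip back and forth near 45 degrees, and it would depend on the orientation sensor being powered whenever the microphones are in use.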
2011 Definition of Audio Quality and Happiness
Explores audio quality in terms of experience and presents 6 metrics that attempt to revitalize the definition of ‘quality audio’ by focusing on consumer experiences.
2008 Smart Ambient Sound Sensor
Proposes the creation of a new form of acoustic monitoring for the PC space that can be used to improve user experience with minimal user interaction.
2006 A Consumer-friendly Quantifiable Metric for Audio Systems
A proposal for a consumer-friendly quantifiable metric for audio systems that can help provide a great listening experience for the user, as well as generate market growth through increased awareness of the value of quality components.
| Item | Who’s Responsible | Due Date | Description |
|------|-------------------|----------|-------------|
| 1 | Diby | 11/21/2013 | Complete report for publication |
| 2 | Devon | Ongoing | Make recommendations to OEMs on designs |
| 3 | Diby, Mikko | Ongoing | Microphone to improve design |
| 4 | Leng | Ongoing | OS: Microsoft, Apple, Google to provide methodology to provide sensor data, |
| 5 | Phil & Mike and Ted | Ongoing | Algorithm developers to update algorithm |