We’re coming up on 23 years of collective brainstorming and problem solving across a broad swath of audio and music related topics. We’ve gone from trying to get crappy 8-bit SoundBlasters to work, to pervasive microphones listening to everything we say.
Have themes arisen and faded away? Why did they arise? Why did they fade away?
Is any old problem becoming new again?
Is the world becoming a better place for music and sound, or worse?
Why aren’t we celebrating the durability of MIDI???
I propose we look back on all of the subjects covered in past BBQs and look for patterns and predictors.
Why are music visualizers so artless? Most look like hardware spectrum analyzers or jiggling oscilloscopes. Perhaps it’s because the input data is limited to snapshots of bass and treble levels.
Let’s explore ideas for extracting emotionally meaningful information from audio to produce next-gen visual experiences. Imagine if a screen, garment, or virtual-reality experience could react to chord quality (e.g., major/minor/augmented), subtle tempo changes, solos, or even lyrics. A combination of deeper audio analysis and metadata could create experiences as exciting as movies and live dancers.
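As one hedged illustration of what "reacting to chord quality" might involve, the sketch below matches a 12-bin chroma vector (energy per pitch class, which some upstream analysis stage is assumed to provide) against simple triad templates. The function name, template set, and scoring rule are illustrative assumptions, not a reference to any existing visualizer.

```python
# Hypothetical sketch: guess triad quality from a 12-bin chroma vector.
# Index 0 is C, index 1 is C#, and so on up to B at index 11.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets above the root for each triad quality.
TEMPLATES = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "augmented": (0, 4, 8),
    "diminished": (0, 3, 6),
}


def chord_quality(chroma):
    """Return (root_name, quality) for the best-scoring triad template.

    The score is the summed chroma energy on the template's three pitch
    classes. Ties keep the first candidate found, so symmetric chords such
    as augmented triads come back with the lowest-indexed root.
    """
    if len(chroma) != 12:
        raise ValueError("chroma must have exactly 12 bins")
    best = None
    for root in range(12):
        for quality, intervals in TEMPLATES.items():
            score = sum(chroma[(root + step) % 12] for step in intervals)
            if best is None or score > best[0]:
                best = (score, NOTE_NAMES[root], quality)
    return best[1], best[2]


# Example: energy on A, C and E only.
a_minor = [0.0] * 12
for pitch_class in (9, 0, 4):
    a_minor[pitch_class] = 1.0
print(chord_quality(a_minor))
```

A visual layer could then map the returned quality to color or motion; real recordings would need smoothing over time and a "no chord" state, which this sketch leaves out.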
As I sat in a concert hall listening to an orchestra, it struck me how different the experience, and even the sound, was compared to listening to recordings of the same music on my “high quality” headphones.
Our microphones, ADCs, mixers, effects, file formats, DACs and drivers measure better than ever but we are still nowhere close to recreating the concert experience with headphones. Even the lowly triangle sounds better in real life than it does through headphones!
Fellow Bar-B-Qers I suggest we try and
We might want to consider new recording techniques, more DSP algorithms, 100 channel headphones, haptic suits, spatial audio, virtual reality. Let your imagination run wild, dream big, have fun!
The MIDI standard went from bright idea, to being embraced by Roland, Yamaha, Korg, Kawai, and Sequential Circuits, to shipping in product in a span of only 18 months. Notwithstanding the limitations of MIDI as currently viewed, that’s a remarkable time frame, especially considering how enduring the standard has been.
My proposal is that we come together again as an industry to create a new standard. But, this time around we create a standard for data.
Advances in machine learning are allowing breakthrough new approaches to solving previously hard- or impossible-to-solve audio problems. We can identify and label objects within an audio stream. We can unmix a mixture. We can blindly segment a recording based on musical or other events. We can interpolate between acoustic and timbral spaces.
Where machine learning differs from the traditional science in music technology is that we need – in addition to tech, know-how and creativity – large amounts of well-labelled and structured training data.
Those of us pursuing this work are having to do this ourselves, likely duplicating work to create proprietary sets. The academic community provides some stuff, but it’s not enough.
In the biotech world, The Broad Institute was founded to support exactly this kind of collaborative partnership. Companies pool their data, collectively gain access, do work … and have been able to create major breakthroughs in genomics and biomedical research. Some companies in that space are going public without a business model or even a product … simply because of the tech they have created via access to this communal data set.
So, what if we joined forces in our corner of the world?
Can ultrasonic communication provide helpful connectivity to Voice Interface and IoT devices, without facilitating intrusive data mining?
Ultrasonic communication, and in particular near-ultrasonic communication, has for the most part been an untapped resource in the voice interface and IoT worlds. It has been described by Wired as the “wild west of wireless tech”, in an article which explores the possible privacy and security risks of the technology:
There are many reasons why it is appealing as a protocol:
Alternative communication method to WiFi or BLE, where they are either not optimal or unavailable
Voice Interface devices already have necessary hardware (a microphone and a speaker)
Allows cross platform sharing of information
Proximity verification of a user
There are also reasons why it is concerning as a protocol:
Unauthorized tracking of individuals’ movements or habits
Spamming the audio spectrum
Considerations beyond humans – dogs, other ultrasonic sensitive animals
Some companies, such as Google, have their own internal protocol, like Google Nearby, which is a fully secured end-to-end solution that works in conjunction with WiFi and BLE. Other companies, like Chirp, provide secure encryption protocols to allow ultrasonic data exchange to operate independently.
From an acoustics point of view, is it possible to implement sufficient protocols in an ultrasonic transmission standard so it is used for useful connectivity?
Standard transmission levels so devices have to be within a certain range?
Individual transmission bursts rather than constant beacons?
Use of constantly varied frequencies to avoid potential interference?
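To make the three ideas above concrete, here is a minimal sketch of a near-ultrasonic link that sends a short burst at a fixed low amplitude and hops between carrier pairs on every bit. Everything in it is an assumption chosen for illustration (the 48 kHz sample rate, the 18 to 19.4 kHz carrier grid, the 10 ms symbol, the seeded hop schedule); it is not how Google Nearby or Chirp work, and a shared seed is not a substitute for encryption.

```python
import math
import random

SAMPLE_RATE = 48_000      # assumed device sample rate in Hz
SYMBOL_SAMPLES = 480      # 10 ms per bit, so a message is a short burst
# Assumed carrier grid; 200 Hz spacing keeps tones orthogonal over 10 ms.
CARRIERS_HZ = [18_000 + 200 * k for k in range(8)]


def hop_schedule(seed, count):
    """Per-bit (freq_for_0, freq_for_1) pairs derived from a shared seed."""
    rng = random.Random(seed)
    return [rng.sample(CARRIERS_HZ, 2) for _ in range(count)]


def encode(bits, seed, amplitude=0.1):
    """Render bits (0 or 1) as consecutive tone bursts at a fixed, low level."""
    samples = []
    for bit, pair in zip(bits, hop_schedule(seed, len(bits))):
        freq = pair[bit]
        samples.extend(
            amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(SYMBOL_SAMPLES)
        )
    return samples


def tone_power(block, freq):
    """Energy of one block at one frequency (single-bin DFT by correlation)."""
    re = sum(s * math.cos(2 * math.pi * freq * i / SAMPLE_RATE)
             for i, s in enumerate(block))
    im = sum(s * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
             for i, s in enumerate(block))
    return re * re + im * im


def decode(samples, seed):
    """Recover bits by comparing energy on the two scheduled carriers."""
    count = len(samples) // SYMBOL_SAMPLES
    bits = []
    for k, (f0, f1) in enumerate(hop_schedule(seed, count)):
        block = samples[k * SYMBOL_SAMPLES:(k + 1) * SYMBOL_SAMPLES]
        bits.append(1 if tone_power(block, f1) > tone_power(block, f0) else 0)
    return bits


message = [1, 0, 1, 1, 0, 0, 1, 0]
burst = encode(message, seed=42)
print(decode(burst, seed=42) == message)
```

This only round-trips over a clean, synchronized channel. A real acoustic path would add symbol timing recovery, error correction, and a measured level limit, which is exactly where a standard would have to make choices.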
Untold millions of Internet-connected devices are slacking off when they could be working together to solve Big Audio Problems. The Search for Extraterrestrial Intelligence (SETI) program harnessed unused cycles on thousands of computers to hunt for aliens. What kinds of audio problems could we solve by networking millions of underused devices? What new sonic experiences could we create?
Now imagine extending the processing pool to mobile devices with microphones and location sensors. What are the opportunities for Massively Multiplayer Music? What incentives would inspire people to share access to their processors and I/O?
I suggest the BBQ brain was correct in calling for the end of the analog jack, but we failed to offer a compelling replacement. Previous brains showed the limitations of wireless transport for audio, and we still want a good wired experience for low cost, highest performance, and ultimate DiY hackability. (don’t we?)
Let’s show the industry what’s needed for a good wired digital headset experience:
Capitalize on power: now there’s power available in the headset, what awesome features will make wired digital headsets worthwhile?
But be frugal: wired headsets are DOA if they drain the host’s battery like a 5G modem…
5G is coming: charge while you get your groove on.
Capitalize on digital interface: lower latency and higher bandwidth than wireless, what awesome features will make wired digital headsets worthwhile?
Low cost to high-feature spectrum or: how I learned to stop worrying and spec decent headsets for developing markets and cheap Americans
Hearing is the body’s only 360-degree sense: a sound can cue someone to turn almost precisely toward its source without anyone touching them.
VR and AR/MR provide immersive video experience, but the audio needs to be created and combined with this video providing a similar immersive experience at an affordable cost.
The purpose of this topic is to discuss and identify the challenges for audio in these various types of “reality” (VR, AR, MR,) and the solutions thereof.
Creating content for these headsets raises multiple challenges. Some probably have to be handled during content creation (scene-based audio, for example), while others have to be handled in the headset, such as distance rendering or factoring in the room impulse response.
Once these problems are identified, the next challenge will be to identify the best format in which to store, transmit, and play the content: object-based or ambisonics.
In a world where kids seem excited to realize that dropping their phone into a pint glass ‘improves the sound quality’ it is a terrifying realization that we have an entire generation of audio consumers that have never enjoyed the impact and dynamics of real HiFi. The last ten years or so has started to bring audio out of the smartphone malaise, but what innovations will help to create a new generation of audio lovers?
‘big sound technologies’
psycho acoustic harmonic overlay
wide range transducers
What technology do we need to make the next generation of killer audio products?
Will I-V sense amps and closed loop amplification give us the ability to extract substantially more performance from traditional transducers (not just micro speakers)?
Can we further leverage advanced DSP platforms and multi-channel arrays to improve sound quality and deliver true dynamics within the constraints of the current industrial design trends?
Will distributed/mesh audio solutions proliferate into the mainstream?
What problems do we need to solve to allow small form factor audio solutions to deliver more impact and larger sound fields? Are we really at the limit of the laws of physics?
Per Moore’s law, transmission and processing bandwidth and storage density double every 18 months. Bandwidth required by audio and video applications, on the other hand, has historically doubled every 10 years.
We have already reached a point where many audio applications no longer strain our networks, computers and storage and the situation for video is not far behind. From here on, the gap between available performance and the requirements of our AV applications will widen rapidly, exponentially.
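A quick calculation shows how fast that gap opens up if the two doubling periods quoted above are taken at face value. The sketch assumes supply and demand start at parity; the function name and defaults are mine, and the doubling periods are the proposal's figures rather than measured data.

```python
def headroom_ratio(years, supply_doubling_years=1.5, demand_doubling_years=10.0):
    """Factor by which available capacity outgrows demand after `years`.

    Both quantities grow as powers of two and start equal, so the ratio is
    2 ** (years / supply_period - years / demand_period).
    """
    return 2.0 ** (years / supply_doubling_years - years / demand_doubling_years)


for years in (5, 10, 20, 30):
    print(f"after {years:2d} years: capacity/demand = {headroom_ratio(years):,.0f}x")
```

With these assumptions the surplus grows by roughly a factor of fifty per decade, which is the "vacuum" the next question asks us to fill.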
Keeping in mind that nature abhors a vacuum, what sorts of new AV applications can we envision to make use of these resources?
Transmission systems: What are the file/transport format requirements?
Composing and synthesizing systems
I’ve listed these in reverse to imagine the payoff so we can identify ways to get there.
As a background question, what are some Joe-friendly ways to record and distribute immersive audio today? I’ve tried software to make “AC3” DVDs and DTS CDs, the Zoom H2N 5-mic recorder’s stereo mixdown, in-ear mics, and spatializer plugins, but none has really curled my mustache.
Machine Learning (ML) or Artificial Intelligence (AI) is being used in various fields today including audio applications.
We should identify and list the potential “future” applications of ML or AI in professional audio and home audio. The focus is to identify which of these applications need networking and which can work without it.
A love of music is at the root of all our passions for audio. The ways that you and I learned to appreciate listening to, and creating music, are going to be different than the ways that people listen to and create music in the future. There is a constant struggle between the disruption and preservation of those methods.
Record stores have been closing steadily since the demise of Tower Records, streaming music via YouTube and Spotify is becoming preferred to AM/FM, and Guitar Center is $1.3 billion in debt, while Amazon is thriving. The romance of digging through dusty record crates to find two copies of a great break on vinyl has been fading like a well-loved album cover. Music discovery, once a truly social activity requiring physical presence, grueling research, and conversation, has become more voyeuristic and individualized through the use of social media and search engines. Over the past century, passive solitary music consumption has overtaken active group musical performance. Percentages of discretionary income spending are also shifting away from musical content and moving toward hardware devices that commoditize music.
However, with tectonic shifts comes opportunity.
Fundamentally, one of the greatest powers of music is to bring people together, and bringing people together is core to building community, identity, and purpose. How could the power for music to connect people be amplified in a positive way?
What problems could be solved in the next generation of music discovery?
How could this industry inspire a viral interest in music education and creation, or redefine those terms?
How could brick-and-mortar record stores and musical instrument stores reinvent themselves not just to survive, but to thrive?
What incentives could be created to drive an employment boom in the greater music industry?
What should the 5, 10 and 20-year goals and roadmap be for music discovery and creation?
Let’s get together and make a formal recommendation for adding audio properties to the glTF open source asset delivery format for digital 3D objects. Currently there is zero notion of audio in the format.
Possible recommendations could include:
Acoustic properties: absorption, reflection
Diffusion relating to material types
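To give the discussion something to point at, here is a rough sketch of how such properties could ride along on a glTF 2.0 material using the format's generic `extensions` mechanism. The extension name `BBQ_audio_material`, its field names, and the coefficient values are all invented for illustration; none of them exist in glTF or in any published extension.

```python
import json

# Made-up extension name, used here purely as a placeholder.
EXTENSION_NAME = "BBQ_audio_material"


def acoustic_material(name, absorption, scattering):
    """Build a glTF material entry carrying hypothetical acoustic properties.

    absorption: dict of band centre frequency (Hz) -> coefficient in [0, 1]
    scattering: one diffusion/scattering coefficient in [0, 1]
    """
    for value in list(absorption.values()) + [scattering]:
        if not 0.0 <= value <= 1.0:
            raise ValueError("coefficients must lie in [0, 1]")
    return {
        "name": name,
        "extensions": {
            EXTENSION_NAME: {
                "absorption": [
                    {"frequencyHz": hz, "coefficient": coeff}
                    for hz, coeff in sorted(absorption.items())
                ],
                "scattering": scattering,
            }
        },
    }


def build_asset(materials):
    """Wrap materials in a minimal glTF 2.0 document skeleton."""
    return {
        "asset": {"version": "2.0"},
        "extensionsUsed": [EXTENSION_NAME],
        "materials": materials,
    }


# Placeholder numbers, not measured data.
carpet = acoustic_material("carpet", {125: 0.05, 1000: 0.30, 4000: 0.55}, 0.4)
document = json.dumps(build_asset([carpet]), indent=2)
print(document)
```

Per-band absorption plus a single scattering value is only one possible shape; the workgroup would need to decide on band layout, units, and whether reflection is stored or derived.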
The format as described by the website:
The GL Transmission Format (glTF) is a runtime asset delivery format for GL APIs: WebGL, OpenGL ES, and OpenGL. glTF bridges the gap between 3D content creation tools and modern GL applications by providing an efficient, extensible, interoperable format for the transmission and loading of 3D content.
Also…there is a session at the W3C workshop where audio will be discussed. If we can formalize our recommendation quickly it can be presented and submitted there:
“Augmented Audio Reality (AAR)” has become another buzzword with the advent of Air Buds, apps like RjDj Here and innovations from audio companies like Doppler Labs and Harman, as well as the popularity of applications of its big brother, “Augmented Reality”, like Pokémon Go. Will consumers embrace AAR or will it be another personal intrusion only geeks will embrace?
I suspect it will be a mixed bag, with broad adoption of AAR applications that enhance user experience and resistance to applications that intrude on the user’s brain. But considering AAR is a close cousin to Interactive Audio, who is better qualified than the BarBQ (y’all) to sort the wheat from the chaff?
I pose these questions to the BarBQ Brain:
What AAR applications will be embraced by consumers and why?
What audio technology is needed to make advanced AAR compelling?
Will all AAR solutions be proprietary or is there a need for any standards?
What will AAR applications look like in 2021?
I’m sure others can add even more meaningful questions to this list.
Anyone and everyone – literally, anyone and everyone – is recording videos on their phones and uploading them to YouTube to share with friends and family. While picture quality steadily improves with each device iteration, audio quality consistently leaves a lot to be desired.
We have dialogue-disrupting ambiences!
We have diaphragm-distorting wind!
We have directionality-driven dynamics fluctuations!
With recent trends toward livestreaming, this audio quality problem has now evolved from a mere “fix it in post” problem into a no-holds-barred, guns-a-blazing, real-time audio challenge for us to solve!
It’s a physics problem.
No, it’s a transducer problem.
No, it’s a DSP problem.
No, it’s a latency problem.
No, it’s a UI/UX problem.
No, it’s a marketing/education problem.
No, it’s a product problem.
Q: What is the future of personalization as it relates to interactive audio applications and technologies?
Head-worn computers, smartphones that are essentially powerful PCs in our pockets that we carry with us at all times, fitness trackers…the list goes on…computing is just getting more “personal”. Customization is the next step: it is how we get closer to these new, more personal computing devices, and it is what integrates them more deeply into our lives. We assign special ringtones to important contacts, adjust our inter-pupillary distance for VR/AR headsets, and enter personal details into health trackers like weight, height, and birth date. Digital AI assistants like Siri and Cortana interface with us conversationally and learn our preferences. Active earbuds like the Bragi Dash and Here allow for personalization of the sound of the world around you to the point where we can almost dial in our own personal “mix” for a live musical performance.
Sample questions or discussion topics:
How does personalization relate to our ability to experience things like spatial audio?
How does personalization improve listening in general?
Why do we settle for a spatial audio effect based on an HRTF profile of some guy at Cambridge?
Will your HRTF profile be just like your height and weight someday?
Is there a way we can personalize user interface audio to make it more informative?
Does the mass-market care about personalization? If not, how do we get them interested?
How can machine learning be applied to continuously refine or tune personalization over time for a better listening experience?
Does inevitable hearing loss present an opportunity to adjust or tune audio systems over time to accommodate the individual hearing needs of the user?
YouTube is now the largest music-streaming service. Facebook lets you post videos, but not audio. Many musicians get around this by posting non-moving movies consisting of a stereo music track and a picture of an album cover. But what if music could become the foundation for dynamic, compelling video? That could make the whole audio chain more popular.
Imagine a visualizer driven by a combination of audio metadata, DSP, and artificial intelligence…perhaps even influenced by other sensor inputs. Instead of wiggling wireframes, this system could approach cinematic storytelling. And not just in video, but AR and VR as well.
What hooks could we add to audio files to generate more immersive visuals? What are the opportunities in production and delivery? And why are people who imagine the future called visionaries?
Footnote: Creative Labs did some groundbreaking work on music visualization back in 1999 with Lava/Oozic. The system used a proprietary file format and web player, and it died around the dot-com crash, but there were some ambitious ideas in there.
What advances in audio can enhance the effectiveness of augmented reality solutions? Is state-of-the-art audio enough or are there sound barriers to be broken to make the augmented reality user experience more compelling? What AR applications will drive these needs?
While I considered inclusion of virtual reality, I believe there are so many diverse applications for augmented reality that it is a huge topic in itself.
Ok, so we championed the open DSP architecture in 2014. Now it’s 2020: audio accessories skipped our Smart Connector interface (also from 2014) and went straight to wireless (really Devon? BT inside the chassis?); our Open DSPs are now little islands isolated by high latency, low bandwidth links.
How does the Open DSP ecosystem evolve to support – and maximize – an all wireless world?
Are signal processing entities portable? How does the framework optimize processing?
SPE = Signal Processing Element
Monkey Bus Wireless = A wireless bus from a future bbq. It must be better than Bluetooth.
What if every sound in the world was being recorded, and tagged with location and time? What if it was all searchable, reusable and accessible from any device? What new information could we learn from a sonic omniscience? What could we detect and automate? What problems could a system like this create or solve? What would it disrupt? What new forms of art could emerge?
As our world becomes increasingly filled with sensors and microphones, and the services we use are paid for with disclosure of data, it seems as though a system like this might one day be possible. What are the long term implications of a sonic omniscience? Is it all NSA and 1984, or are there opportunities to mitigate an Orwellian dystopia and use a system like this to create a better world? What responsibilities should those developing sensor networks and search algorithms have to ensure the best possible outcome? What should the equivalent be to Asimov’s “Laws of Robotics?”
Over the years we’ve all seen several promising standards efforts fail to bear timely fruit, consuming huge amounts of valuable volunteer time and energy in the process. I posit that this is a relevant problem for at least some of the industries represented at BBQ, and worthy of careful thought inside those industries.
Therefore, let the Big BBQ Brain think together upon: When to Standardize, vs. When Not To? Each path has its peculiar advantages and disadvantages which some people understand well but others don’t, particularly. A BBQ Workgroup Report gathering knowledge on this subject could, perhaps, have practical use as inception-time advice for future efforts by helping them to choose whatever path’s best for the particular project.
Under certain circumstances standards development can be slow and contentious, and therefore frustrating. Participants may burn out, then drop out, making subsequent progress even slower. Sometimes standards efforts fail as a result.
When progress toward any important thing is perceived as excessively process-heavy, technical people naturally become impatient and seek a faster workaround… and start thinking of open-source projects etc. … but this is also not always a perfect solution. After the feel-good launch and coding-party stages, the practical end results from that path don’t always display quite the required level of technical rigor, nor succeed quite as widely, nor attract quite the kinds of companies needed, nor exhibit quite the kind of technical stability over time that a large market may require.
A timely and good quality standard from a recognized standards development organization that’s created by major relevant companies can, by contrast, powerfully succeed and prevail in the market for many years, even as individual vendors come and go. And for the right kind of project with the right individual participants, the fluidity of an open source project is absolutely the best and most productive way to go.
What exactly is it about a given project that makes it likely to fail as a standard, or fail as an open-source project? This topic is all about characterizing the two ways, and characterizing projects.
How to funnel precious volunteer-hours toward more (vs. less) productive outcomes?
What are the characteristics of successful standards efforts?
What are the characteristics of unsuccessful standards efforts?
What characteristics make something other than a standards effort – for example, an open-source project, or establishing a new community – a more effective path for a given project?
What does taking a standards path achieve that other approaches (open-source, etc.) don’t, or can’t?
What does taking a non-standards path achieve that a standard doesn’t, or can’t?
What about IPR models?
Is there anything standards bodies could be doing differently to help troubled projects succeed?
Which of standardization’s many inconveniences are simply unavoidable?
How about hybrid models, for example combining standardized specifications with open-source implementations?
‘…or meaningful ways to safeguard hearing without becoming a nanny state.’
I know that protecting hearing is a major hot-button issue with several BBQers, and I think it is an important issue that we should talk about.
I have not recently been involved in any updates to the EU hearing protection rules, but when I last read the proposed changes I nearly fainted. What I read:
-Dupe users into thinking they’re deaf or suffering from tinnitus when they’ve listened to too much loud music.
-Plaster ugly UI elements all over otherwise beautiful OSes.
-Hosts/players must psychically intuit rendering devices in order to know their output parameters.
-Track users across devices to monitor exposure.
If you are involved with the EU rulemaking and these do not reflect the current state of affairs (and assuming that you are permitted to do so), please correct my understanding. Note that I have taken some liberty in describing my observations.
What is the best way to protect hearing? What role (or controls) should content creators/parents/governments/police have in protecting their fans/children/citizens/sheep? What can we do as technologists to help?
In the article, Peter, who’s also a two-time BBQ speaker, shares his insights on adding warmth and personality to devices through evocative sound. In the emerging Internet of Things, imagine how much further that could go if devices not only sang beautifully, but also harmonized with each other and the environment.
What’s the hottest new consumer technology? Helicopter drone video. What’s the worst thing about helicopter drone video? That droning sound. (Or no sound at all.) Imagine…
A drone-mounted mic that cancels the propeller sound, producing pristine soundtracks
A ring of drone-mounted speakers that follow you around, for mobile surround sound (“wingtones”)
An app that synthesizes music from silent drone video
More practically, future noise cancellation algorithms will offer numerous opportunities for adding sound and music to previously hostile environments. What are some scenarios that would encourage that development?
While the creators of augmented and mixed reality are pioneering great experiences in the realm of binaural audio, one might ask the question, what’s next? Where are the greatest opportunities for understanding our environment through sound, and seamlessly blending audio content with the world around us? What would we do with greater contextual awareness and responsiveness? What problems could we solve? What are the limitations and the possibilities?
From the earliest sputtering combustion engine of the Ford Model T, or the clackity-clack of a Mickey Mantle card in your bicycle spokes, to the modern stealthy sounds of the Tesla, the symphony of transportation continues to evolve. Knight Rider’s Kitt sold us on the dream of a car with the ability to carry on a conversation, although we’re not there yet. Car enthusiasts modify their exhausts to make them louder, and researchers are designing tires to make them quieter. As vehicle sound systems become more complex, what will this mean for our interactions with them? How will the sonic experience of vehicles impact the emotional relationships that users or bystanders have with them? What risks and opportunities does the vehicle give us that are different from other platforms?
As Apple’s dominance continues to soar, most recently with iOS device sales overtaking Win PC sales, what does this mean for us developers? How much do the current popular platforms define the products we build? How much should they define what we build? In the case of iOS and music production there are some serious hurdles, namely screen real-estate and, until recently, lack of a good standard for inter-app communication. In the case of musical instruments, lack of tactile feedback creates serious design challenges. And in the case of all iOS products (software at least) there are significant economic challenges, namely, how do you fund and profit from serious development with a $5 product (if you’re lucky enough to charge any money at all)? Is anybody other than Apple making money on iOS apps? Will El Capitan’s AU3 and Audio Extensions change things? Will Windows 10 audio updates change things? Can touch devices really change the way music is produced without Android attending the party?
The IoT buzzword and concept has a lot of push behind it. The question I think would be interesting is whether the Audio of Things in a given place wants or needs to use the internet. The companies with server-side services are pushing it, but that doesn’t mean it is the “right” answer. Given the implied power (radios) and security (information on a shared server) costs, can I keep the information local and still accomplish all of my home automation goals and get a benefit?
With the advent of digital interfaces for headset accessories, what kind of functionality are end-users wanting in their next gen headset/accessories?
Let’s keep the conversation away from the “how” and “over what interface” and think bigger to what end users are really looking for in future systems. Sensor, lights, multichannel, floating cameras that take selfies! etc.
Humans communicate with each other not just with pure speech but with speech augmented by a variety of cues including audio, facial, posture and gestures. Do any of these cues also add value in human to machine communications? Could they serve as another form of contextual awareness, making ASR more accurate and machine responses more meaningful?
Some suggest these cues are too individual in nature to augment human to machine communication. Others point out that experts can identify and interpret these audio and visual cues by observing anyone, which suggests computer intelligence could easily be programmed to interpret these cues.
Let’s identify specific applications where a multifactor interface (speech plus other visual and audio cues) would add value in the new world where your voice becomes the primary interface to consumer products.
What wearable and IoT applications featuring audio will make a splash and stick? What applications will just splat? What sound, voice, speech and audio technologies will make a difference in these emerging applications? What advances in sound technologies are needed in the next five years to create compelling and sustainable applications in this space? Let’s brainstorm potential applications with real consumer value propositions, debate their merit, prioritize them and define a sound technology roadmap to support them.
Dealing with thousands of dialogue lines has inspired ingenuity, creativity, tantrums and nervous breakdowns. The software used is myriad and methods swing from lunacy to genius. Get down and dirty therapy; share stories and find Holy Grails.
Almost every studio and every audio engineer still have different approaches and ‘standards’ when it comes to recording and mastering dialogue. Can we thrash out and write a best practice Bible on one side of paper?
We’ve got two days to validate a business model for a new company that we have created out of the ether of beer, sweat and BBQ.
We will identify our target customers, understand what they’re bitching about, and propose what we’re going to do about it & blow their socks off. We’ll run experiments on other groups (our target customers) to validate the hypotheses of our model. We’ll iterate, and pivot and all that good stuff. And when our model is complete, we will be ready to put a dent in the universe and make a ton of cash. Yeehaw!
With some sincerity, this topic is really about tech and business, not just tech. I’ve heard as many grumblings about business and politics at BBQ as I have about tech problems. Like Love and Marriage, you can’t have one without the other. So, what if we set out to solve some business problems?
Regardless of where you stand on the “Is High-Resolution Audio Worth It” debate, the marketing departments have already opened the barn door and the cows are out. In the corporate world’s never-ending quest for brand differentiation, market relevance, and lavish CEO compensation packages, “High Resolution Audio” is already being sold as the “Next Big Thing” in audio. As the owners of audio for the next five years, do we:
1: Ignore it: “That’s Snake Oil and we don’t want any part of it!”
2: Sell it: “We’ve got the BEST Snake Oil, and we’re gonna milk this to stay employed for a few more years!”
3: Build it: “We love it, you really want it, and we’re gonna do it right, even if that means ultrasonic tweeters in your headphone cans!”
I see that the 2011 “Galileo” group might be real big on topic 1, where we discuss whether or not High-Resolution Audio really does bring audio happiness. I think we’ll need some Screaming Monkeys when we get to topic 2, because we need to know if the new Monkey Bus can support High-Res. Finally, for #3, I think we need some support of the Doppler Chickens. As engineers, let’s do it right. I know mics can go ultrasonic, but what about speakers? How do we engineer speakers, microspeakers, and headphone receivers that can cleanly go up to 40kHz? What about low-power portable audio amplifiers with no intermodulation distortion up there? What about headphones (both circumaural and in-ear) design considerations? What kinds of transducers are we looking at? How about test systems and standards? I don’t think the type 3.3 ears go that high. Where do we go from here?
When it’s time to build the dream product, or series of products, yer gonna want yer best possible noises.
So you bring in the Dream Team, naturally, but how best to set them up? What tricks have Time and Experience taught us about the environment, the structure, the attitude–and how can we anticipate changes to those lessons over the next few years?
How do these things change in the future, with vastly improved collaboration tools, teleconferencing, telepresence, lifecasting, and with new tools blurring lines between “integration,” “music” and “sound design”? Who punches the clock at the factory, who commutes to his garage studio in bunny slippers? How big are the teams? How do we account for slippery job titles? How frequent are physical/virtual meetings? What flow of control and command will work best? What about interactions Audio has with the deeper technical teams and loftier Vision-Holders? Collaborations with outside contributors? What about it? HUH? WHAT ABOUT IT?!?!?!?
Looks like there’s lots of interest in Binaural and Headphones this year. Hmmmm.
I don’t recall who is handsome, let alone who “Handsome” is (Howard Brown?) but I like this topic, and it appears to have slipped through the cracks of 2013’s Giant Brain. Can we take another swing at it?
We often talk about ‘immersive audio’, where one feels like they are in the middle of a game, orchestra or movie. The use of spatial audio (HRTFs, room models, BRIRs, etc.) to render these immersive scenes is usually the ‘go-to’ idea. Some of the problems with synthetic spatial audio, as well as binaural field recordings, are:
1) The visual cues are missing or wrong.
2) Head motion is not taken into account.
3) HRTFs are generic and not individualized.
4) The listener’s environment is not taken into account.
That last point is particularly important. If you have a binaural recording made in a small room, but you listen to it in a large room, it will sound terribly colored. In fact, if the room you are listening in is not taken into account, any synthetic or binaural recording will have coloration.
Another big issue is that, if the visual cue is missing, the listener tends to localize the sound behind them (or at least somewhere outside of their field of vision).
So what can be done to mitigate these issues? Is this something that we can engineer (i.e., build me some new, celebrity endorsed headphones), or is it a matter of getting the signal processing just right (can you say ‘head tracker’, hallelujah!), or are there limitations at the cognitive level that need to be addressed?
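On the signal processing side, the head-motion issue in particular can be sketched in a few lines: subtract the tracked head yaw from the source direction before computing any binaural cue. The example below does this for the interaural time difference (ITD) using the classic Woodworth spherical-head approximation. The head radius, the sign conventions, and the function names are assumptions for illustration; a real renderer would use measured or individualized HRTFs rather than this single cue.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed, air at roughly room temperature


def relative_azimuth(source_az_deg, head_yaw_deg):
    """Source direction relative to the nose, wrapped to (-180, 180].

    Positive angles are to the listener's right (assumed convention).
    """
    rel = (source_az_deg - head_yaw_deg) % 360.0
    return rel - 360.0 if rel > 180.0 else rel


def itd_seconds(source_az_deg, head_yaw_deg=0.0):
    """Woodworth ITD estimate; positive means the right ear leads."""
    rel = relative_azimuth(source_az_deg, head_yaw_deg)
    # Fold rear directions onto the front hemisphere: a sphere gives the
    # same ITD for front/back mirror images (the "cone of confusion").
    if rel > 90.0:
        rel = 180.0 - rel
    elif rel < -90.0:
        rel = -180.0 - rel
    theta = math.radians(rel)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


# A source straight ahead; the listener turns 30 degrees to the right.
print(itd_seconds(0.0, head_yaw_deg=0.0))
print(itd_seconds(0.0, head_yaw_deg=30.0))
```

Note that a source at 30 degrees and its mirror image at 150 degrees produce the same ITD here, which is one reason small head movements (and visual cues) matter so much for resolving front from back.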