Project Bar-B-Q 2017 report section 3

The Twenty-second Annual Interactive Audio Conference PROJECT BAR-B-Q 2017


Group Report:
Alexa, Siri, Cortana or: How I Learned to Stop Worrying and Love the Cloud


Participants: A.K.A. "Always Listening, Always"
Michael Ricci, Knowles	Bobby Littrell, Vesper
Dafydd Roche, Dialog Semi	Phil Brown, Dolby
Chris Morrison, Dialog Semi	Rajeev Morajkar, Analog Devices
Dan Bogard, Synaptics	David Berol, Amazon
David Dully, Dolby	Neil Hinnant, Microsoft
Jack Joseph Puig, Waves

Facilitator: David Battino

Brief statement of the problem(s) on which the group worked

There are now 10 billion devices that are always on and, for example, are able to discern music playing or to characterize the background before a keyword. What challenges do these systems bring in terms of ethics, security and privacy, like the expectation of a similar experience about privacy across multiple ecosystems? Can we do cool stuff and still keep our privacy intact?

A brief statement of the group’s solutions to those problems

We created a diagram to explain the flow of data from a hardware and software perspective in today’s systems to identify the trust boundary for each respective node. Of note for each node is the privacy, security and ownership.

Privacy is defined as users having a clear understanding of where and how their data is being used.
Security is defined as how safe the data is from unknown use.
Ownership is defined as who can use, control access to, and store/remove data.

Software Model:

Link between Mic and Firmware
Security concerns: Physically connecting to the mic output (low probability) and/or Firmware hack (medium probability)
Recommendations: Hardware switch and/or encryption method along with some type of indication mechanism to user.
Owner: HW system vendor.

Link between Firmware and Software driver
Security concerns: TBD
Recommendations: TBD
Owner: Microphone Aggregator and SW Stack

Link between SW driver and the OS
Security concerns: Who can talk to the SW driver.
Recommendations: TBD
Owner: OS Vendor. If OS with multiple consumers (apps), have a device access trust model.

1/ Link between OS and Application layer
Security concerns: No concerns in a single-app environment; in a multi-app environment, the brokering of access to the data (microphone input)
Recommendations: TBD

2/ Link between OS and OS platform data collection
Security concerns: Excessive data collection (typically not screened, capture everything)
Recommendations: Best practices like those applied to financial data: encryption, appropriate tagging, strict policy

Link between Application Layer / Device and Router
Security concerns: No practical concerns, but it’s the responsibility of the application to maintain data encryption as needed.
Recommendations: Clearly articulate the security namespace is captured
Owner: Application developer

Application and/or OS cloud
Security concerns: What data is being collected?
Recommendations: Opt-in for feature (like voice print improvement)
Owner: Cloud owner

Link between Application Cloud and 3rd Party, e.g., Bank, Retail, Utility Cloud
Security concerns: Wrong cloud gets access to data outside of its need to know.
Recommendations: Clearly define roles and responsibility for App Cloud and 3P Cloud
Owner: Both Cloud services

Expanded problem statement

Original problem statement to reference:
There are now 10 billion devices that are always on and, for example, are able to discern music playing or to characterize the background before a keyword. What challenges do these systems bring in terms of ethics, security and privacy, like the expectation of a similar experience about privacy across multiple ecosystems? Can we do cool stuff and still keep our privacy intact?

Next evolution of “always-on” systems taking into account privacy, clarity of terms of use, competitive landscape and use cases.
Does always listening mean always analyzed and/or always stored?
Stored for how long, who owns the data and who can access the data?
Complexity vs. user experience in balancing all the permissions.
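
The link-by-link model in the group's solution summary lends itself to a small table in code, which makes it easy to see where concerns, recommendations or owners are still open. The following is only a sketch under stated assumptions: the class name Link, the list LINKS, the helper functions and the shortened link names are all hypothetical and are not part of the report; the field values paraphrase the report's text, and an owner of None marks links for which the report names no owner.

```python
from dataclasses import dataclass
from typing import List, Optional

TBD = "TBD"  # marker the report itself uses for undecided fields


@dataclass(frozen=True)
class Link:
    """One hop in the data flow (hypothetical structure for illustration)."""
    name: str
    concern: str
    recommendation: str
    owner: Optional[str]  # None where the report names no owner


# Values paraphrased from the report's Software Model.
LINKS: List[Link] = [
    Link("Mic -> Firmware",
         "Physical tap on mic output (low); firmware hack (medium)",
         "Hardware switch and/or encryption, plus a user indication",
         "HW system vendor"),
    Link("Firmware -> SW driver", TBD, TBD,
         "Microphone aggregator and SW stack"),
    Link("SW driver -> OS", "Who can talk to the SW driver", TBD,
         "OS vendor"),
    Link("OS -> Application layer",
         "Brokering of mic access in multi-app environments", TBD, None),
    Link("OS -> OS platform data collection",
         "Excessive data collection",
         "Encryption, appropriate tagging, strict policy", None),
    Link("Application/Device -> Router",
         "No practical concerns; app must keep data encrypted",
         "Clearly articulate the security namespace",
         "Application developer"),
    Link("Application and/or OS cloud",
         "What data is being collected?",
         "Opt-in per feature", "Cloud owner"),
    Link("Application cloud -> 3rd-party cloud",
         "Wrong cloud gets data outside its need to know",
         "Define roles and responsibilities for both clouds",
         "Both cloud services"),
]


def open_items(links: List[Link]) -> List[str]:
    """Names of links whose concern or recommendation is still TBD."""
    return [l.name for l in links if TBD in (l.concern, l.recommendation)]


def unowned(links: List[Link]) -> List[str]:
    """Names of links for which no owner is recorded."""
    return [l.name for l in links if l.owner is None]


if __name__ == "__main__":
    print("Still TBD:", open_items(LINKS))
    print("No owner named:", unowned(LINKS))
```

Keeping the model as plain data means the questions in the expanded problem statement (what is stored, for how long, who owns it) could later be added as extra fields per link without changing the helpers.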

Expanded solution description

Privacy Expectations: Articulate Usage States (LED state or icon is strongly desired)

Current state / Normal Mode: In a home, anyone can use, no parental control (free use), all apps use approved, purchase with pin.
“Family” Mode (Multi-User / Speaker ID): Tiers of users; purchase capability per account; automatic categorization of searches.
“Friends” Mode: Continue playing song or movie; take a note for another person’s account.
“Incognito” Mode: A mode where search history is not saved; keyword and apply to wake is still present.
“Mute” Mode: Microphone off; some implement as a hard button (more robust).
“Privacy” Mode: May need to include camera as well. Any and all capture functions are off. This is desired to be cross-ecosystem, with the same keyword for clarity to users.
“Off”: You know, Off.

Need to address what the minimum privileges are in broad report.

Shared keywords / Privacy mode:
This is addressing the idea of an industry-wide phrase or keyword to enable a given feature. For example, “Engage Cone of Silence” or “Privacy” to turn off all devices across voice assistants (Google, Amazon, Apple, Microsoft, etc.).

Issues:
  Buy-in from all parties.
  Personality of the given assistant.
  Cost of performance to keywords or battery life.
  Additional infrastructure: All devices may not hear you, but if one does it pokes all the other devices in your accounts (across ecosystem).

Advantages:
  Trust for users that they are in a safe space for what they want to do.
  Convenience to not have to engage each device directly (push all of the mute buttons).

Intro
Privacy vs. Security
What is the implementation of background apps in the future?
Each Link SWOT
SW Side: Who’s responsible?
HW Side: Who’s responsible?
Standardization of trust model
Policies among cloud providers
Articulate the terms and conditions for data storage and persistence
One person asks a question on another person’s account
Minimum Privileges
Recommendations: Privilege level change needs two-factor authentication
What cool stuff can we do?

Items from the brainstorming lists that the group thought were worth reporting

What cool things can we do with always-on?
Camouflage the speaker so they can’t be listened to
Measuring accuracy of always-on devices
Can we preserve audio for posterity’s sake?
Ethical and Philosophical issues
Terms of Service & Clickthrough
Convenience vs. Privacy
Each product has different rules/terms etc.
Regulation and Legislation
Multiple devices listening; do they interact? Communicate?
What is captured? Is it preserved? What privacy rights does it have?
What info is shared between devices?
Can my wife recall a conversation to prove she’s right?
Is there individual speaker ID with different permissions?
Can we switch to higher quality modes at the right time?
What is protected by different countries or states?
What other devices can we add mics to?
How do you individually address each device?
What is being stored and where? What is local, what is in the cloud?
Who is responsible for right to privacy? Chip vendor? System integrator? Software provider?
How do we integrate authentication on always-on devices?
Privacy in/out of house: how do we handle these? Different considerations?
Shotspotter type crowdsourcing
Acoustic event detection, do we want to use these devices to monitor?
Google, Amazon, Apple are clear as to what they are doing with the audio
Facebook Terms of Service (TOS) says it listens all the time (and pushes ads)
How does the user know what the hell is going on? No one reads TOS and there is not always a light on saying that the device is listening.
How do we handle opt-in/out?
When you use the service, the data is used for training Deep Neural Networks (DNNs) etc. (anonymously?)
“There needs to be a way to delete all my data” implies that my data is annotated and not anonymized
Take snippets of speech and re-synthesize people saying things that they didn’t actually say
Is our memory of recorded events distorted or romanticized from what is actually recorded?
Buffered recording: can we have enough buffer to save/preserve things that just happened? “In the moment” type recording.
Is relevant advertising useful? Friends talking about bands might be useful. But browsing history is already used for targeting ads, so is voice-triggered advertising more or less annoying?
How do we balance convenience vs. privacy? Scrubbing email to put flight info on a calendar is useful, but other things seem invasive
How has culture changed to allow mics to be always on?
Identifying individual speakers is further invasive.
If we talk about going to dinner in two weeks, do I get a calendar entry predicting this?
Level of trust: a real assistant might provide a completely trustworthy conduit but an electronic one is not?
A real assistant would be discreet and confidential; for instance, if I buy Xmas gifts for my family, how do I keep my kids from finding this out?
Do we need deeper understanding from machine learning? Able to interpret or use common sense to get to the core of the topic or request?
Alexa has a “personality” that is a choice by Amazon; she has favorite teams, colors, etc. This personality might not be appropriate for another country
Can the AI personalize herself based on my favorites (sports, politics, etc.)?
Communication between virtual assistants: “Alexa, unlock the privacy rights skill.” Can we ask the AI to set the rules for sharing data?
What if I tell the AI to not tell something? Or tell a lie? Or ask something that might indicate that I’m dangerous or criminal?
Can there be a natural language way to have a private conversation? “Alexa, go private.” How do we do it across all the different mics/platforms?
Is there a market for a private assistant that might not be used for marketing purposes? Would that be a subscription?
Opt-out options: see Google Analytics opt-out add-on
Trade features or money for privacy
Opt-in/out in a selective way: dates, times, content, etc.

Address direction of the future groups: 2018

Multiple groups have addressed the future of voice assistance over the past several years. In addressing topics in this space going forward, we advise future groups to avoid going too broad (a consequence of clumping) and address the direction of a specific topic in the field.

Other reference material

Previous BBQ reports
The Future of Voice Interfaces: https://www.projectbarbq.com/reports/bbq17/bbq17r3.htm
Audio Of Things: Audio Features and Security for Smart Homes/Internet of Things: http://www.projectbarbq.com/reports/bbq15/bbq15r4.htm
Audio opportunities in the Internet of Things: http://www.projectbarbq.com/reports/bbq14/bbq14r7.htm
Using Sensor Data to Improve the User Experience of Audio Applications: http://www.projectbarbq.com/reports/bbq13/bbq13r6.htm
Form Factors and Connectivity for Wearable Audio Devices: https://www.projectbarbq.com/reports/bbq12/bbq12r5.htm

Privacy links
https://www.aclu.org/blog/privacy-technology/privacy-threat-always-microphones-amazon-echo
https://comm.ncsl.org/productfiles/95595057/FPF_Always_On_WP.pdf
http://www.red5security.com/index_36_1224411244.pdf
http://thebigsmoke.com/2017/01/14/always-always-listening-privacy-no-place-future/
https://www.democracynow.org/2017/1/4/privacy_advocates_warn_of_potential_surveillance
https://www.democracynow.org/shows/2017/1/4?autostart=true
https://en.m.wikipedia.org/wiki/General_Data_Protection_Regulation
https://www.wired.com/2017/02/murder-case-tests-alexas-devotion-privacy/
https://images.apple.com/business/docs/iOS_Security_Guide.pdf (pg 49)
https://lifehacker.com/how-to-delete-the-voice-data-that-amazon-echo-and-googl-1820737802
https://www.getakita.co/?utm_source=FB_TRS&utm_campaign=Akita+valifation

Fun Videos!
https://youtu.be/YvT_gqs5ETk
https://youtu.be/sAz_UvnUeuU
https://youtu.be/LRq_SAuQDec


Copyright 2000-2017, Fat Labs, Inc., ALL RIGHTS RESERVED
www.projectbarbq.com