We know that customers have been concerned by recent reports of people listening to Siri audio recordings as part of our Siri quality evaluation process. We heard their concerns, immediately suspended human grading of Siri requests, and began a thorough review of our practices and policies. As a result, we decided to make some changes to Siri that will be available in a software update this fall. You can learn more about these changes in this Apple Newsroom post.
In addition, here are some answers to common questions about Siri privacy and grading.
What is grading?
Before we suspended grading, our process involved reviewing a small sample of audio from Siri requests — less than 0.2% — and their computer-generated transcripts to measure how well Siri was responding and to improve its reliability. For example, did the user intend to wake Siri? Did Siri hear the request accurately? And did Siri respond appropriately to the request? By using grading across a small sample of Siri requests over time, Apple can make big improvements that help ensure that our customers around the world have the best Siri experience possible.
How are Siri’s privacy policies unique among intelligent assistants?
At Apple, we believe privacy is a fundamental human right. We design our products to protect users’ personal data, and we are constantly working to strengthen those protections. This is true for our services, as well.
Our goal with Siri is to provide the best experience for our customers while vigilantly protecting their privacy. We believe that customers should have privacy by default with respect to their audio recordings, without requiring that they change device settings. With Apple, customers will need to opt in to share their audio to help improve Siri.
Siri has been engineered to protect user privacy from the beginning, because Apple’s business doesn’t depend on collecting anyone’s data. We use as little data as possible to provide a great service, and we process that data — including Siri requests — on device as much as possible.
Siri uses a random identifier (a long string of letters and numbers associated with a single device) to keep track of data while it’s being processed, rather than tying it to your identity through your Apple ID or phone number. We believe this process is unique among the digital assistants in use today. For further protection, after six months, the device’s data is disassociated from the random identifier.
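As a rough illustration of the random-identifier approach described above, the following Python sketch keys request data by a random string instead of an account, and removes that link once a record is older than about six months. The class name, the record fields, and the 183-day window are assumptions made for this example; they are not a description of Apple's actual systems.

```python
import secrets
from datetime import datetime, timedelta

# Assumption: "six months" is approximated as 183 days for this sketch.
RETENTION = timedelta(days=183)


class RequestStore:
    """Hypothetical store that keys request data by a random identifier."""

    def __init__(self):
        # Each record is a dict with "device_key", "created", and "text".
        self.records = []

    @staticmethod
    def new_identifier():
        # A long random string of letters and numbers; it carries no
        # account information such as an Apple ID or phone number.
        return secrets.token_hex(16)

    def add(self, identifier, text, created):
        self.records.append(
            {"device_key": identifier, "created": created, "text": text}
        )

    def disassociate_old(self, now):
        """Drop the identifier from records older than the retention window."""
        changed = 0
        for record in self.records:
            is_linked = record["device_key"] is not None
            if is_linked and now - record["created"] > RETENTION:
                record["device_key"] = None
                changed += 1
        return changed
```

In this sketch the data itself is kept and only the link to the device identifier is removed, which is one possible reading of "disassociated."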
In iOS, we offer details on the data Siri accesses, and how we protect your information in the process, in Settings > Siri & Search > About Ask Siri & Privacy.
Is Siri always listening? What do you do to prevent Siri from listening when I haven’t said “Hey Siri”?
No. Siri is designed to activate and send audio to Apple only after you trigger your device by saying “Hey Siri,” use the raise to speak feature on Apple Watch, or physically trigger Siri using the designated buttons on iPhone, iPad, Mac, Apple Watch, Apple TV, AirPods, and HomePod.
To recognize “Hey Siri,” we process audio solely on device through multiple stages of analysis to determine whether the audio matches the “Hey Siri” pattern. Only when the device recognizes the “Hey Siri” pattern is your audio sent to the server. On the server, as an additional mitigation, we analyze the full request to confirm it is intended for Siri.
Occasionally we have what’s called a “false trigger,” where Siri activates when you did not intend it to. We work hard to minimize false triggers and have updated the review process to limit graders’ exposure to them. When we resume grading, our team will work to delete any recording that is determined to be an inadvertent trigger of Siri.
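The staged check described above can be sketched in a few lines of Python. This is only an illustration of the gating logic: the function names, the numeric scores, and the thresholds are hypothetical, and the real detectors are not described in this document.

```python
def should_send_to_server(stage_scores, thresholds):
    """Return True only if every on-device stage meets its threshold.

    Both arguments are hypothetical: one confidence score and one
    threshold per analysis stage.
    """
    if len(stage_scores) != len(thresholds):
        raise ValueError("one score per stage is required")
    return all(score >= limit for score, limit in zip(stage_scores, thresholds))


def handle_audio(stage_scores, thresholds, server_confirms):
    """Hypothetical flow: on-device stages first, then a server-side check."""
    if not should_send_to_server(stage_scores, thresholds):
        # The pattern was not recognized, so audio never leaves the device.
        return "discarded on device"
    if not server_confirms:
        # The server decided the full request was not intended for Siri.
        return "rejected by server"
    return "handled as Siri request"
```

For example, `handle_audio([0.9, 0.4], [0.5, 0.6], True)` stops at the second on-device stage, so nothing is sent to the server in that case.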
When you say you are minimizing the amount of data reviewers have access to, what does that mean? What will they still be able to hear?
We are making changes to the human grading process to further minimize the amount of data reviewers have access to, so that they see only the data necessary to effectively do their work. For example, the names of the devices and rooms you set up in the Home app will be accessible to the reviewer only if the request being graded involves controlling devices in the home.
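One way to picture this kind of data minimization is a filter that builds the reviewer's view of a request and includes Home app names only for home-control requests. The sketch below is in Python; the field names and the "home_control" category are invented for illustration and do not reflect Apple's actual data model.

```python
# Hypothetical field names for Home app data attached to a request.
HOME_FIELDS = ("device_names", "room_names")


def reviewer_view(request):
    """Return only the fields a reviewer needs for this request.

    `request` is a dict with a "transcript", an optional "category",
    and optional Home app fields (all names are assumptions).
    """
    visible = {"transcript": request["transcript"]}
    if request.get("category") == "home_control":
        # Home names are exposed only when the request controls home devices.
        for field in HOME_FIELDS:
            if field in request:
                visible[field] = request[field]
    return visible
```

With this filter, a weather request that happens to carry room names would reach the reviewer with the transcript only.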
Why does Siri need access to user information such as contacts, personal playlist names, and names of rooms and devices set up in the Home app? Can Apple identify me or control my HomeKit devices?
For Siri to more accurately complete personalized tasks, it collects and stores certain information from your device. For instance, when Siri encounters an uncommon name, it may use names from your Contacts to make sure it recognizes the name correctly. In iOS, we offer details on the data that Siri accesses, and how we protect your information in the process, in Settings > Siri & Search > About Ask Siri & Privacy.
Siri uses as little data as possible to deliver an accurate result. When you ask a question about a sporting event, for example, Siri uses your general location to provide suitable results. But if you ask for the nearest grocery store, more specific location data is used.
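The difference between general and specific location can be illustrated with a small Python sketch that coarsens coordinates unless the request needs precision. The request kinds and the choice of rounding to one decimal place are assumptions made for this example, not a statement of how Siri actually derives a general location.

```python
def location_for_request(kind, lat, lon):
    """Return coordinates at the precision a request needs.

    `kind` is a hypothetical request type. Only "nearby_search" gets the
    precise position; everything else gets a coarsened one.
    """
    if kind == "nearby_search":
        # For example, "the nearest grocery store" needs specific location.
        return (lat, lon)
    # Assumption: rounding to one decimal place stands in for "general location".
    return (round(lat, 1), round(lon, 1))
```

Here a sports question would be answered with the rounded coordinates, while a nearest-store request would pass the original values through unchanged.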
If you ask Siri to read your unread messages, Siri simply instructs your device to read aloud your unread messages. The contents of your messages aren’t transmitted to Siri’s servers, because that isn’t necessary to fulfill your request.
Who does the grading?
When customers opt in, only Apple employees will be allowed to listen to audio samples of the Siri interactions. Our team will work to delete any recording that is determined to be an inadvertent trigger of Siri.
Why do you keep transcripts for customers who do not opt in?
Computer-generated transcripts are used to improve Siri and its reliability. These transcripts are used in machine learning training to improve Siri, determine common usage patterns, and update language and understanding models. The transcripts may also be used to resolve critical problems that affect Siri reliability.
Is the only way for Siri not to retain my audio recordings and transcripts to disable Siri?
By default, Apple will no longer retain audio of your Siri requests, starting with a future software release in fall 2019. Computer-generated transcriptions of your audio requests may be used to improve Siri. These transcriptions are associated with a random identifier, not your Apple ID, for up to six months. If you do not want transcriptions of your Siri audio recordings to be retained, you can disable Siri and Dictation in Settings.