Apple reveals it keeps anonymized Siri data for up to 2 years

Posted in iPhone, edited January 2014
Data collected by Apple to improve its voice-driven Siri service is anonymized and kept on the company's servers for up to two years before it is discarded.

The disclosure was made by Apple to Wired after privacy advocates called on the company to reveal exactly what information it knows and keeps about users. Apple spokeswoman Trudy Muller said the anonymized data is collected solely to improve the service, and that the company takes customer privacy very seriously.



Much of the work for Siri is done remotely, which is why the personal assistant software available on iPhone, iPad and iPod touch requires a data connection to operate.

Voice clips stored by Apple are tagged with a random number that represents the user who recorded them. The number is not associated with an Apple ID, email address, or anything else that could easily identify a person.

After six months, the random number is no longer associated with the saved clip, but the audio file may be saved for up to two years in total for what Wired said are "testing and product improvement purposes."

However, if a user turns off Siri on their device, their randomized identifier is deleted, along with any data associated with it.
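The retention rules described above can be sketched in a few lines of Python. This is only an illustration of the lifecycle as reported, not Apple's actual implementation: the class and method names (`VoiceClip`, `ClipStore`, `apply_retention`, `siri_turned_off`) are hypothetical, and the exact day counts used for "six months" and "two years" are assumptions.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Assumed durations: the article only says "six months" and "up to two years".
DISSOCIATE_AFTER = timedelta(days=183)
DELETE_AFTER = timedelta(days=730)


@dataclass
class VoiceClip:
    recorded_at: datetime
    random_id: Optional[str]  # random number standing in for the user, or None
    audio: bytes


class ClipStore:
    """Hypothetical server-side store for voice clips."""

    def __init__(self) -> None:
        self.clips: List[VoiceClip] = []

    def new_identifier(self) -> str:
        # A random value that is not derived from an Apple ID or email address.
        return secrets.token_hex(16)

    def add(self, random_id: str, audio: bytes, now: datetime) -> None:
        self.clips.append(VoiceClip(now, random_id, audio))

    def apply_retention(self, now: datetime) -> None:
        kept = []
        for clip in self.clips:
            age = now - clip.recorded_at
            if age >= DELETE_AFTER:
                continue  # discarded after two years
            if age >= DISSOCIATE_AFTER:
                clip.random_id = None  # number no longer tied to the clip
            kept.append(clip)
        self.clips = kept

    def siri_turned_off(self, random_id: str) -> None:
        # Delete every clip still associated with this identifier.
        if random_id is None:
            return
        self.clips = [c for c in self.clips if c.random_id != random_id]
```

In this sketch, a clip that has already been dissociated from its random number can no longer be matched when the user turns Siri off. That is a property of the sketch, not a statement about how Apple handles such clips.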

The fact that Siri data must be sent to Apple before it can provide results has been a concern for security advocates, as well as some companies. For example, last year it was revealed that security-conscious IBM barred the use of Siri on its corporate networks, out of concern that sensitive information could leak.

Comments

  • Reply 1 of 58
    ascii Posts: 5,936 member
    Wasn't there a thing a few years ago, where Google kept all their search queries associated with an anonymous id, and they released a database to a university researcher, and she was able to identify people and contact them just by going through all the queries associated with a particular id?

    Sometimes just the fact of linking things together has a de-anonymising effect.
  • Reply 2 of 58
    solipsismx Posts: 19,566 member
    This isn't unexpected. However, I do wish Apple would use something like the Harvard Sentences or some other derivation so that you can train Siri to know how you speak so it can tie what you say to various phonemes.

    However, if a user turns off Siri on their device, their randomized identifier is deleted, along with any data associated with it.

    I wonder if this is accurate as stated. I wouldn't think the data associated with it is deleted, but rather just the identifier that ties it to that device. From a user's perspective it's gone, but I would think it's still on Apple's servers.
  • Reply 3 of 58
    bdkennedy1 Posts: 1,459 member
    If you're worried about it, don't use it.
  • Reply 4 of 58
    solipsismx Posts: 19,566 member
    ascii wrote: »
    Wasn't there a thing a few years ago, where Google kept all their search queries associated with an anonymous id, and they released a database to a university researcher, and she was able to identify people and contact them just by going through all the queries associated with a particular id?

    Sometimes just the fact of linking things together has a de-anonymising effect.

    Yeah, I do recall that now. Hopefully Apple's service is more anonymous, but since we're talking bits of speech it's unlikely to get looked over at some point by the public or sold.
  • Reply 5 of 58
    nagromme Posts: 2,834 member
    I'm not sure how this is much of a shock: we already knew it was a cloud service, and Apple's terms already stated that info was stored. The main comfort for me: unlike SOME companies I could name, Apple doesn't make essentially all its profit by selling such "anonymized" personal info! Apple makes its profit by making its users happy, and selling the Siri data would not achieve that!

    As for Google and DE-anonymizing... I am NEVER logged into Google. I use DoNotTrack. I use the Google opt-out extension. I block third-party cookies. Yet when I searched Google for specific car models last week, I almost immediately received spam about those same cars! Now, I don't think Google turned around and directly sold my info... but they are part of the chain, which is not anonymous at all! Clearly third parties have profiles on me including private email addresses. I don't think Siri plays any part in that kind of personal profile-building.

    (This is all the more puzzling because I thought Google said they'd stop sharing search terms in their referrals. Really not sure how this kind of thing happens. Something similar happened to my friend with a medical condition: she did some searches on Google, and shortly afterward, entirely unrelated sites were showing relevant banner ads! She too is never logged into Google.)

    Bottom line... privacy worries me. Siri doesn't.
  • Reply 6 of 58
    gatorguy Posts: 24,176 member

    Quote:

    Originally Posted by nagromme View Post



    I'm not sure how this is much of a shock: we already knew it was a cloud service, and Apple's terms already stated that info was stored. The main comfort for me: unlike SOME companies I could name, Apple doesn't make essentially all its profit by selling such "anonymized" personal info! Apple makes its profit by making its users happy, and selling the Siri data would not achieve that!


    Which companies are selling that "anonymized" personal info? I don't recall you mentioning their names before.

  • Reply 7 of 58


    That's just totally creepy.  Apple seems to be no better than Google when it comes to privacy.

    And BTW - what the heck happened to Apple Insider?  It used to get a lot of traffic on the forums.  Now it is like a ghost town, with no more than a handful of comments.  What chased everyone away?  How can the owners make any money with so few participants?

  • Reply 8 of 58
    tbell Posts: 3,146 member

    Quote:

    Originally Posted by JoeySmith2 View Post


    That's just totally creepy.  Apple seems to be no better than Google when it comes to privacy.

    And BTW - what the heck happened to Apple Insider?  It used to get a lot of traffic on the forums.  Now it is like a ghost town, with no more than a handful of comments.  What chased everyone away?  How can the owners make any money with so few participants?


    You have only made one comment before. Considering you suggest you have been a long-time follower, it is strange that this issue would be the one you decide to chime in on.

    As far as creepy goes, how do you think services like this work and improve? Apple, like Google with Google Voice and its voice search, keeps the voice data to improve the service. With that said, from personal experience, my long relationship with both companies, and the difference in business models, I trust Apple more with my data.

  • Reply 9 of 58
    mhikl Posts: 471 member

    Quote:

    Originally Posted by nagromme View Post



    (This is all the more puzzling because I thought Google said they'd stop sharing search terms in their referrals. . . .)


     


    Just because I say I'm pretty, doesn't make it so. :)


  • Reply 10 of 58
    tbell Posts: 3,146 member

    Quote:

    Originally Posted by Gatorguy View Post


    Which companies are selling that "anonymized" personal info? I don't recall you mentioning their names before.



     


    I don't think Google sells anonymized data. That would likely undermine its business model.  It certainly uses the data to drive up the price it charges to deliver people supposedly personalized ads. In my view, it fails at delivering those ads effectively, both annoying me and robbing advertisers who think Google is sending ads to the right people. I also think Google is poor at protecting users' data. I have had multiple Gmail accounts from friends, family, and myself hacked. That raises the question: can parties get their hands on the data without Google's permission?  Google certainly isn't the only company that fails regarding protecting data, but I haven't had any issues with Apple yet.

    However, there are companies like Verizon, OnStar, and Mint that sell anonymized data.

  • Reply 11 of 58
    gatorguy Posts: 24,176 member

    Quote:

    Originally Posted by nagromme View Post



    I'm not sure how this is much of a shock: we already knew it was a cloud service, and Apple's terms already stated that info was stored. The main comfort for me: unlike SOME companies I could name, Apple doesn't make essentially all its profit by selling such "anonymized" personal info! Apple makes its profit by making its users happy, and selling the Siri data would not achieve that!



    As for Google and DE-anonymizing... I am NEVER logged into Google. I use DoNotTrack. I use the Google opt-out extension. I block third-party cookies. Yet when I searched Google for specific car models last week, I almost immediately received spam about those same cars! Now, I don't think Google turned around and directly sold my info... but they are part of the chain, which is not anonymous at all! Clearly third parties have profiles on me including private email addresses. I don't think Siri plays any part in that kind of personal profile-building.



    (This is all the more puzzling because I thought Google said they'd stop sharing search terms in their referrals. Really not sure how this kind of thing happens. Something similar happened to my friend with a medical condition: she did some searches on Google, and shortly afterward, entirely unrelated sites were showing relevant banner ads! She too is never logged into Google.)



    Bottom line... privacy worries me. Siri doesn't.


    If you weren't logged into Google and were using Do Not Track, then it sounds like some company other than Google was responsible for the targeted ads you saw. Wouldn't you agree?

    Are you an ATT subscriber for your iPhone? Are you familiar with ATT Adworks? Perhaps you should be.

    http://www.adworks.att.com/

    Or maybe you're with Sprint. Did you know they track the websites you visit on your mobile device too, as well as log the apps you use, to help third parties in targeting ads just for you?

    How about Verizon? Yup, they do it too. In fact they changed their privacy policy in late 2011 to allow them to sell your "profile" outright to third parties. Supposedly ATT, Sprint and T-Mo only assist with targeted ad placement and don't actually hand over your data.

    While Google has said they'll respect Safari users' Do Not Track requests, prompted by their "accidental" bypassing caught a few months back, there are a lot of ad providers that have made no such commitment.

    http://www.nytimes.com/2012/10/14/technology/do-not-track-movement-is-drawing-advertisers-fire.html?_r=0

    http://money.cnn.com/2011/11/01/technology/verizon_att_sprint_tmobile_privacy/index.htm

    If you REALLY want to get a slap in the face waking you up to your myth of privacy, here's a quick and easy read.

    http://www.propublica.org/article/everything-we-know-about-what-data-brokers-know-about-you

  • Reply 12 of 58
    vorsos Posts: 302 member


    Cue clueless senators calling for Apple to limit this data to 7 days, drastically reducing Siri's quality improvements. Just like with location data.

  • Reply 13 of 58
    gatorguy Posts: 24,176 member

    Quote:

    Originally Posted by TBell View Post


     


    However, there are companies like Verizon, OnStar, and Mint that sell anonymized data.



    Yes sir, they do, and they're not alone. IIRC every major credit reporting agency has been caught selling personally identifiable financial information to 3rd party marketers. There have been several FTC settlements made with them over the years.

  • Reply 14 of 58
    isaidso Posts: 750 member


    No shit...?!    How'd people think it worked?

    Siri is a learning system.  You thought the way it learned was by "forgetting"?

  • Reply 15 of 58


    Yes, it's obviously Google's fault here... A giant company (with lots of money) is obviously willfully violating the law.  Doing something that computer-savvy people can easily track and determine who's to blame.  I mean, spam is worth a lot of money, and it's not like the lawyers would launch a huge class action lawsuit against Google for violating federal law.  Plus, no one from these companies would ever think that they're violating CAN-SPAM (or whatever law it is), and blow a whistle on Google.  Obviously they have their people locked down.

    Or maybe, your system has been compromised by adware/spyware.  Something on your system knows your email address, and then knows what you search for... This makes FAR more sense to me.  There are tons of websites that also track your data.  I'm assuming when you were searching for cars you also clicked on links, and didn't just read Google's results.  Maybe someone had a cookie on your system storing your email address, or an identifier linked to it, and then sent it.  There are tons of ways companies can collect this, but I suppose this is AppleInsider, where blaming Google is the easy way out.

    Phil


    Quote:

    Originally Posted by nagromme View Post



    I'm not sure how this is much of a shock: we already knew it was a cloud service, and Apple's terms already stated that info was stored. The main comfort for me: unlike SOME companies I could name, Apple doesn't make essentially all its profit by selling such "anonymized" personal info! Apple makes its profit by making its users happy, and selling the Siri data would not achieve that!



    As for Google and DE-anonymizing... I am NEVER logged into Google. I use DoNotTrack. I use the Google opt-out extension. I block third-party cookies. Yet when I searched Google for specific car models last week, I almost immediately received spam about those same cars! Now, I don't think Google turned around and directly sold my info... but they are part of the chain, which is not anonymous at all! Clearly third parties have profiles on me including private email addresses. I don't think Siri plays any part in that kind of personal profile-building.



    (This is all the more puzzling because I thought Google said they'd stop sharing search terms in their referrals. Really not sure how this kind of thing happens. Something similar happened to my friend with a medical condition: she did some searches on Google, and shortly afterward, entirely unrelated sites were showing relevant banner ads! She too is never logged into Google.)



    Bottom line... privacy worries me. Siri doesn't.

  • Reply 16 of 58
    Worried about technology? Then don't use it.
    I love it and use it.
  • Reply 17 of 58


    @ascii and others talking about Google releasing searches.

    I believe you are thinking of the AOL 2006 data release http://en.wikipedia.org/wiki/AOL_search_data_leak through which The New York Times successfully discovered the identity of several searchers, most notably Thelma Arnold.

  • Reply 18 of 58
    Privacy is concerning. Sites like Facebook often know more about you than you do. And there isn't a ton of net benefit except to Facebook. Inline ads on your phone are a scourge.

    But Siri's anonymized data is what will make Siri useful and valuable for YOU the more you use it, and an annoying gimmick if politicians successfully limit the ability for Apple to use voice recordings to improve quality.

    Tell your politicians! Send them emails!
  • Reply 19 of 58
    kdarling Posts: 1,640 member

    Neither Apple nor Google is going to sell our voice snippets, and both undoubtedly have very few people with access to the data and the relationship key.  (Google's voice recognition policy specifically states this.)

    Google gives the user full control.  On an Android device, go to Settings - Language & Input - Voice Search - Personalized Recognition, and you can turn it on or off.

    You can also go to your Google Dashboard (https://www.google.com/dashboard/) and instantly anonymize all your previous voice recordings, something which I decided not to do, since I _like_ how well Google deciphers my voice.

    Does Apple have any kind of user dashboard, one place where we can go to see at least some of what info they have stored, and/or control it?

  • Reply 20 of 58
    gatorguy Posts: 24,176 member

    Quote:

    Originally Posted by emcomments View Post


    @ascii and others talking about Google releasing searches.


     


    I believe you are thinking of the AOL 2006 data release http://en.wikipedia.org/wiki/AOL_search_data_leak through which New York Times successfully discovered the identity of several searchers, most notably Thelma Arnold.



    There's also this more recent study:


     


     


    While in the past, mobility traces were only available to mobile phone carriers, the advent of smartphones and other means of data collection has made these broadly available. For example, Apple® recently updated its privacy policy to allow sharing the spatio-temporal location of their users with “partners and licensees” [21]. 65.5B geo-tagged payments are made per year in the US [22] while Skyhook wireless is resolving 400 M user's WiFi location every day [23]. Furthermore, it is estimated that a third of the 25B copies of applications available on Apple's App Store℠ access a user's geographic location [24, 25], and that the geo-location of ~50% of all iOS and Android traffic is available to ad networks [26]. All these are fuelling the ubiquity of simply anonymized mobility datasets and are giving room to privacy concerns.

    A simply anonymized dataset does not contain name, home address, phone number or other obvious identifier. Yet, if individual's patterns are unique enough, outside information can be used to link the data back to an individual. For instance, in one study, a medical database was successfully combined with a voters list to extract the health record of the governor of Massachusetts [27]. In another, mobile phone data have been re-identified using users' top locations [28]. Finally, part of the Netflix challenge dataset was re-identified using outside information from The Internet Movie Database [29].


    http://www.nature.com/srep/2013/130325/srep01376/full/srep01376.html
