Apple reveals it keeps anonymized Siri data for up to 2 years

Comments

  • Reply 21 of 58
    Marvin Posts: 15,324 moderator
    bdkennedy1 wrote: »
    If you're worried about it, don't use it.

    Yeah, if you don't want Siri to hear certain things, you shouldn't be saying them.

    I watch Eric Schmidt while he sleeps. If he doesn't like it, maybe he shouldn't be doing it.

    I don't particularly mind Google or Apple using anonymous data as long as they are transparent about the level of anonymity, amount of data and data retention period. Siri hasn't even been with Apple for 2 years, so I wonder if the result quality will start to change for some people around the end of the year when they start removing some data.
  • Reply 22 of 58
    tbell Posts: 3,146 member

    Quote:

    Originally Posted by KDarling View Post




    Neither Apple nor Google is going to sell our voice snippets, and both undoubtedly have very few people with access to the data and the relationship key.  (Google's voice recognition policy specifically states this.)


     


    Google gives the user full control.  On an Android device, go to Settings - Language & Input - Voice Search - Personalized Recognition, and you can turn it on or off.


     



     



    You can also go to your Google Dashboard (https://www.google.com/dashboard/) and instantly anonymize all your previous voice recordings, something which I decided not to do, since I _like_ how well Google deciphers my voice.


     



     


     


    Does Apple have any kind of user dashboard, one place where we can go to see at least some of what info they have stored, and/or control it?



     


     


    Not that I am aware of, but as you mentioned, Google's model by default does not dissociate your voice. Like with all its services, it links your voice to your profile. Apple by default dissociates your voice by assigning you an anonymous identifier number. After 6 months it gets rid of even that. Apple's policy seems superior from a pure privacy standpoint.

  • Reply 23 of 58

    Quote:

    Originally Posted by TBell View Post


     


     


    Not that I am aware of, but as you mentioned, Google's model by default does not dissociate your voice. Like with all its services, it links your voice to your profile. Apple by default dissociates your voice by assigning you an anonymous identifier number. After 6 months it gets rid of even that. Apple's policy seems superior from a pure privacy standpoint.



     


    There's still something linking that number with you. Google probably uses some sort of generated key to store voice clips too. Perhaps it's whatever key your Google account already has but it really doesn't matter. Honestly, there's really no difference between how these 2 companies are storing the data except that as someone pointed out, Google gives you an explicit way to actually completely sever that link.

  • Reply 24 of 58
    kdarling Posts: 1,640 member

    Quote:

    Originally Posted by TBell View Post


    Not that I am aware of, but as you mentioned, Google's model by default does not dissociate your voice. Like with all its services, it links your voice to your profile. Apple by default dissociates your voice by assigning you an anonymous identifier number. After 6 months it gets rid of even that. Apple's policy seems superior from a pure privacy standpoint.



     


    Google, like Apple, uses an identifier key, known only to them, to associate voice samples with a person.   However, you're right, it's not random like the Apple one.


     


    Quote:


    "Normally, all saved speech samples remain anonymous. In other words, Google stores millions of voice recordings with no way of telling who was speaking. When you sign up for personalized speech recognition, Google creates an electronic key that links your speech samples with your Google Account. Google uses this key to access voice samples and improve recognition of your speaking voice."


     


    - Google Voice Recognition Policy



     


    Now, obviously if you want personalized voice recognition that works best and automatically across ALL your devices, the key will have to be linked to the same account.  


     


    Moreover, if an Android user buys another device (new or replacement), the personalized voice recognition carries over automatically. Sounds like Apple cannot do that.  


     


    I agree the Google method is not as secure in a strict sense, but the advantages are worth the trade-off.  Plus I'm not worried, since only Google stores the voice and key.  And again, if I don't like that, I can uncheck the personalized recognition and/or wipe out the association anytime I wish.  I get to choose.

  • Reply 25 of 58
    bregalad Posts: 816 member
    I've worked in software development for the better part of the last two decades. Anonymizing data is a lot of work. Nobody does it unless they really have to.

    You know that software that runs your doctor's office? What do you think happens when the office manager starts seeing weird stuff? Tech support asks for a copy of the data so they can try to figure out what's wrong. The data gets passed to developers and testers who treat it like gold.

    HIPAA rules dictate that all customer data must be either destroyed or anonymized, but it rarely happens because real data filled with empty "required" fields, duplicate IDs and other conditions that shouldn't exist is the holy grail of software development.
  • Reply 26 of 58
    charlituna Posts: 7,217 member
    And what about Samsung etc.? Don't they have a voice system that works the same way? How long do they keep the data?
  • Reply 27 of 58
    kdarling Posts: 1,640 member

    Quote:

    Originally Posted by charlituna View Post



    And what about Samsung etc.? Don't they have a voice system that works the same way? How long do they keep the data?


     


    Samsung's "S-Voice" is powered by VLingo, which uses Nuance voice recognition servers.


     


    It's basically a reskinned version of "Dragon Mobile Assistant", which is available for Android or iOS.


     


    You'd have to check with Nuance to see what their personalization (if any) and privacy policies are.

  • Reply 28 of 58
    mstone Posts: 11,510 member


    A friend and I were talking about gardening, which led to reminiscing about the old days of growing pot, which was pretty funny with all the odd lingo. Then the question came up of how many grams were in a lid. He asked Siri and she actually came back with: "This might answer your question" Convert 1 lid to grams = 28.35g. She also knows how many pounds are in a "key": 2.205.


     


    I'm glad he used his phone and not mine. I don't want that sort of stuff saved in my profile.

  • Reply 29 of 58
    ipen Posts: 410 member

    Quote:

    Originally Posted by widmark View Post



    Privacy is concerning. Sites like facebook often know more about you than you do. And there isn't a ton of net benefit except to facebook. Inline ads on your phone are a scourge.



     


     


    ?? Why would facebook know more about me than I do?  Aren't we always supposed to use fake names on the web?  Yes, facebook knows a lot about this fictitious "person".  But everything was just made up.  

  • Reply 30 of 58
    hill60 Posts: 6,992 member

    Quote:

    Originally Posted by KDarling View Post




    Does Apple have any kind of user dashboard, one place where we can go to see at least some of what info they have stored, and/or control it?



     


    Buy an iPhone and find out.

  • Reply 31 of 58
    blah64 Posts: 993 member


    Originally Posted by ipen View Post


    ?? Why would facebook know more about me than I do?  Aren't we always supposed to use fake names on the web?  Yes, facebook knows a lot about this fictitious "person".  But everything was just made up.  



     


    If you think simply creating an account under a false name doesn't give Facebook lots of information about you, then you are sadly selling them short.  Unless you use extreme (tinfoil level, which isn't always a bad thing) care and diligence, Facebook, Google and other personal profile-generating sites can make very good guesses about who you are, no matter what fake FB name you use.


     


    Think about it.  Do you have any "real" friends on this fake account?  Your social graph exists across several web properties, it's not like they can't piece that stuff together.  If you don't have any real friends on the fake account, then why bother with it?


     


    Also, if you access Facebook (using your fake account) from your home or office using the same computer that you access other "real" accounts with, then there is an easy path to put those two pieces of data together.  Happens all the time.  If you really, truly, want to stop FB (or Google, etc.) from tracking everything you do online, you need to block their cookies, block their widgets, block their ads and probably even avoid connecting via the same IP address from which you connect to other services that do business with them.


     


    The odds are that FB has a very good idea that your account is "fake", but they also have vested interest in keeping huge #s of accounts active, so they don't shut them all down, only where something suspicious is happening.

  • Reply 32 of 58
    kdarling Posts: 1,640 member

    Quote:

    Originally Posted by hill60 View Post


    Buy an iPhone and find out.



     


    If you don't know the answer, why post?


     


    In all my time owning, using and programming iOS devices, I've never run across an Apple dashboard site, so I was curious if anyone knew of one.


     


    (There is a page for controlling ad id numbers for people with older versions of iOS, but that's all it does.)

  • Reply 33 of 58
    lightknight Posts: 2,312 member
    Bullshit.

    "The number is not associated with an Apple ID, email address, or anything else that could be easily personally identifiable.

    However, if a user turns off Siri on their device, their randomized identifier is deleted, along with any data associated with it", since the device is not easily personally identifiable because it definitely doesn't have a UID, totally not.
  • Reply 34 of 58

    Quote:

    Originally Posted by Gatorguy View Post


    A simply anonymized dataset does not contain name, home address, phone number or other obvious identifier. Yet, if individual's patterns are unique enough, outside information can be used to link the data back to an individual. 



     


    This is how theorists have been able to track Elvis around Vegas from one Denny's restaurant to another late at night. 

  • Reply 35 of 58

    Quote:

    Originally Posted by KDarling View Post


     


    If you don't know the answer, why post?


     


    In all my time owning, using and programming iOS devices, I've never run across an Apple dashboard site, so I was curious if anyone knew of one.


     


    (There is a page for controlling ad id numbers for people with older versions of iOS, but that's all it does.)





    How else is he supposed to hit 5,000 posts unless he posts inane comments like that?  On topic - I haven't seen anything like that either.  Would be nice though.

  • Reply 36 of 58
    copeland Posts: 298 member
    "However, if a user turns off Siri on their device, their randomized identifier is deleted, along with any data associated with it."

    How can Apple delete my voice clips if they are not connected to my account?
    I have no problem with Apple storing it, but that remark I don't understand.
  • Reply 37 of 58
    dominoxml Posts: 110 member

    Quote:

    Originally Posted by copeland View Post



    "However, if a user turns off Siri on their device, their randomized identifier is deleted, along with any data associated with it."



    How can Apple delete my voice clips if they are not connected to my account?

    I have no problem with Apple storing it, but that remark I don't understand.


     


    I'll try to clarify this by using a simple example and the comparatively easy-to-understand SQL language.


     


    1. You have messages saved in a table with a direct reference to the user. In this case you would get the result by typing this:


     


    select * from messages where user = 'John Dow'


     


    2. The more common way is to build references by UIDs (faster and more functional because you can use multiple references). 


     


    In this case there might be a table called user with the row ID = 12, User='John Dow'. This ID is then stored in the messages table and you can get the personalized messages through:


     


    select * from messages where user_id = 12


     


    When you delete the entries for the user ID, but the user keeps this value on his device, the messages are partially anonymized. The user can still build the reference, but nobody else can without knowing the ID.
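
A minimal sketch of that step, using Python's built-in sqlite3 module. The table and column names (users, messages, user_id) are assumptions made up for illustration and say nothing about any real schema. It shows that once the user row is deleted, a lookup by name finds nothing, while anyone who still holds the ID can reach the messages.

```python
import sqlite3

# In-memory database with a hypothetical two-table layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages (user_id INTEGER, body TEXT);
    INSERT INTO users VALUES (12, 'John Dow');
    INSERT INTO messages VALUES (12, 'directions to the Eiffel Tower');
    INSERT INTO messages VALUES (12, 'set a timer');
""")

def messages_for_name(name):
    """Look up messages by going through the users table."""
    rows = conn.execute(
        "SELECT m.body FROM messages m "
        "JOIN users u ON m.user_id = u.id "
        "WHERE u.name = ? ORDER BY m.rowid",
        (name,),
    )
    return [row[0] for row in rows]

def messages_for_id(user_id):
    """Look up messages directly by the stored ID."""
    rows = conn.execute(
        "SELECT body FROM messages WHERE user_id = ? ORDER BY rowid",
        (user_id,),
    )
    return [row[0] for row in rows]

before = messages_for_name("John Dow")

# Delete the row that ties ID 12 to a name; the messages stay.
conn.execute("DELETE FROM users WHERE id = 12")

after_by_name = messages_for_name("John Dow")  # the name no longer resolves
after_by_id = messages_for_id(12)              # the ID holder still can
```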


     


    Anonymization can be prepared and made stronger by using random UIDs, because normal IDs can be guessed. The strongest way is to create them independently on your device instead of on a central server. I often do this for things like one-time transactions or passwords.


    Those device-specific UIDs might look like 'EFXwjTV86LcT'.
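
One way such an identifier could be generated on the device, sketched with Python's standard secrets module. The function name new_device_id, the 12-character length and the alphanumeric alphabet are assumptions chosen to match the sample above, not a description of what any vendor actually does.

```python
import secrets
import string

# Letters and digits, matching the look of the sample ID above.
ALPHABET = string.ascii_letters + string.digits

def new_device_id(length=12):
    """Return a random identifier drawn from a cryptographically strong source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

device_id = new_device_id()
```

Because the value comes from the operating system's secure random source and not from a counter, it cannot be guessed the way a sequential ID like 12 can.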


     


    The point here is that you can obtain the same functionality for referencing the user's messages by


     


    select * from messages where ID = 'EFXwjTV86LcT'


     


    This query can only be invoked by the device, because the server doesn't know who is related to the ID. You can also delete all messages related to this ID.


     


    This is what Siri does when you disable it.
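
A rough sketch of that lookup-and-delete idea with Python's sqlite3 module. The schema, the second sample ID and the function names fetch and forget are invented for illustration; this is not Siri's actual implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [
        ("EFXwjTV86LcT", "call home"),
        ("EFXwjTV86LcT", "weather tomorrow"),
        ("q7Zk2LmP0sNd", "play some music"),  # a different, hypothetical device
    ],
)

def fetch(device_id):
    """Return all messages stored under one random device ID."""
    rows = conn.execute(
        "SELECT body FROM messages WHERE id = ? ORDER BY rowid", (device_id,)
    )
    return [row[0] for row in rows]

def forget(device_id):
    """Delete everything stored under the ID and report how many rows went."""
    cur = conn.execute("DELETE FROM messages WHERE id = ?", (device_id,))
    conn.commit()
    return cur.rowcount

mine = fetch("EFXwjTV86LcT")
removed = forget("EFXwjTV86LcT")   # the device presents its ID and asks for deletion
left_for_me = fetch("EFXwjTV86LcT")
left_for_others = fetch("q7Zk2LmP0sNd")
```

The server never needs a name or account to do this: the device presents its ID, and only rows under that ID are touched.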


     


    But we are still at a point where your messages are only strongly, but not fully, anonymized.


    The reason is that your messages might contain content-based references to you, like spoken words.


     


    In order to learn from those they have to be analyzed. The system might e.g. link your pronunciation of 'Eiffel Tower' to the correct recognition for giving the directions.


    When this pattern gets linked to the target and command, but the additional information gets wiped, your messages are fully anonymized.
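
As a loose illustration of that last step, here is a Python sketch in which only the learned link (pattern, target, command) survives and the identifier is dropped. The class names, the fields and the idea of representing a pronunciation as a tuple of numbers are all assumptions for the example. Whether such a pattern is itself identifying is a separate question the sketch does not address.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Utterance:
    device_id: str         # random per-device identifier
    audio_features: tuple  # stand-in for the pronunciation pattern
    recognized: str        # what the speech was resolved to
    command: str           # what the user wanted done

@dataclass(frozen=True)
class LearnedPattern:
    audio_features: tuple
    recognized: str
    command: str

def anonymize(utterance):
    """Keep only the pattern, target and command; drop the identifier."""
    return LearnedPattern(
        utterance.audio_features, utterance.recognized, utterance.command
    )

sample = Utterance("EFXwjTV86LcT", (0.1, 0.7), "Eiffel Tower", "directions")
learned = anonymize(sample)
```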


     


    I think what's clear is that Apple chose the technically harder way for the best possible privacy with Siri.


     


    Keeping the full messages, including the references to the user, would be much simpler and would also allow functionality you can only offer when you know all about the user.


     


    Please let me know if this is too technical. I'll try to clarify it further.

  • Reply 38 of 58
    Great! So all the times I told Siri to open her f@#!ing ears, are stored for 2 years. Heh
  • Reply 39 of 58
    No matter what you do, there is no such thing as privacy anymore. Unless you are truly off the grid
  • Reply 40 of 58
    dominoxml Posts: 110 member

    Quote:

    Originally Posted by btracy713 View Post



    Great! So all the times I told Siri to open her f@#!ing ears, are stored for 2 years. Heh


    The fact that it still doesn't recognize it correctly points to the fact that it's not considered helpful for improving the service.


    As I explained, this BS also isn't stored in your profile.
