The disclosure was made by Apple to Wired after privacy advocates called on the company to reveal exactly what information it knows and keeps about users. Apple spokeswoman Trudy Muller said the anonymized data is collected solely to improve the service, and that the company takes customer privacy very seriously.
Much of the work for Siri is done remotely, which is why the personal assistant software available on iPhone, iPad and iPod touch requires a data connection to operate.
Each voice clip stored by Apple is tagged with a random number that represents the user who recorded it. The number is not associated with an Apple ID, email address, or anything else that could easily be used to identify the person.
After six months, the random number is no longer associated with the saved clip, but the audio file may be saved for up to two years in total for what Wired said are "testing and product improvement purposes."
However, if a user turns off Siri on their device, their randomized identifier is deleted, along with any data associated with it.
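The retention rules described above can be sketched as a small Python model. This is only an illustration of the policy as reported, not Apple's implementation: the names `VoiceClip`, `new_user_number`, and `apply_retention` are hypothetical, the day counts are rough stand-ins for "six months" and "two years," and the sketch assumes that clips already separated from their random number are unaffected when a user turns Siri off.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional
import secrets

# Assumed approximations of the periods reported in the article.
DISASSOCIATE_AFTER = timedelta(days=183)  # roughly six months
DELETE_AFTER = timedelta(days=730)        # roughly two years in total


@dataclass
class VoiceClip:
    recorded_at: datetime
    user_number: Optional[int]  # random number; None once disassociated


def new_user_number() -> int:
    """Hypothetical helper: a random identifier not derived from any account."""
    return secrets.randbits(64)


def apply_retention(clips, now, disabled_numbers=frozenset()):
    """Return the clips that would still be stored at time `now`.

    `disabled_numbers` holds the random numbers of users who turned Siri off.
    """
    kept = []
    for clip in clips:
        age = now - clip.recorded_at
        # Siri turned off: the identifier and the data tied to it are deleted.
        if clip.user_number is not None and clip.user_number in disabled_numbers:
            continue
        # Nothing is kept beyond the two-year limit.
        if age >= DELETE_AFTER:
            continue
        # After six months the clip is kept, but without the random number.
        if age >= DISASSOCIATE_AFTER:
            clip = VoiceClip(clip.recorded_at, None)
        kept.append(clip)
    return kept
```

In this model, a ten-day-old clip keeps its random number, a 200-day-old clip is kept with no number attached, and an 800-day-old clip is dropped entirely.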
Because Siri must send data to Apple before it can return results, the service has been a concern for security advocates, as well as some companies. For example, last year it was revealed that security-conscious IBM barred the use of Siri on its corporate networks, out of concern that sensitive information could leak.