Apple licenses millions of Shutterstock images to train its AI models
Apple has struck a deal to license millions of images from Shutterstock in order to train its AI models.

Other tech companies, including Google, Meta, and Amazon, have struck similar deals with Shutterstock to help develop visual AI engines. News of Apple's deal comes well after it was signed in late 2022, and the deal is expected to cost Apple up to $50 million.
This follows news of earlier negotiations between Apple and various publishers to use content from news articles for training AI large language models (LLMs). Conde Nast, IAC, and NBC are among the big media names that have allegedly been in talks with Apple about licensing their content.
Apple is expected to make some major announcements about its efforts to add more AI technologies into its operating systems this June, at WWDC. Though often perceived as being behind its rivals in AI integration, Apple has made some innovations of its own.
Over the past year, Apple device users may have noticed smaller improvements in Apple's "machine learning" technologies. Predictive text, for example, has grown steadily more accurate in adapting to a given user's preferred vocabulary, and Siri has improved its ability to translate common phrases.
The next generation of Apple's processors is rumored to include substantially more powerful neural engines.
Apple's Senior VP of Worldwide Marketing, Greg Joswiak, has quipped on social media that the next WWDC will be "Absolutely Incredible," hinting that the conference will focus heavily on AI-type features being added to iOS 18 and other Apple OSes.
The big challenge for Apple in using AI technologies is maintaining its standards on user privacy, a problem other big AI-using tech firms don't concern themselves with. Apple has recently revealed that it intends to develop LLMs that run on-device as much as possible.
Comments
If you're curious, prior to this 2022 licensing deal with Shutterstock for AI training images, Apple had been using undisclosed "other photo data sets" for AI training data for over 8 years, though none of them were images from their own customers...
unless you or I agreed to it hidden somewhere in the multipage ToS for some Apple service or iOS app.
Further to that, as I read Apple's iOS 10 disclosures, Apple did say they may use other data from us for AI training, but anonymized with differential privacy so it could no longer be connected to us as personal data. That's quite similar to what some other tech companies do with anonymized/differential data, which was considered acceptable since it was no longer deemed identifiable.
The ChatGPT, Canva, DALL-E, etc. generative AI training models have reopened the conversation, so what was once considered OK no longer is.
Thus the relatively recent rush to pay to license data sets rather than scraping from customer-contributed content, interactions, and the general web, even if it is anonymized.