Apple keeps pushing AI industry forward with more open-source models
Apple's Apple Intelligence research team have released two new small but high-performing language models used to train AI generators.
Apple's ability to create incredibly compact yet powerful AI models is unequaled in the industry.
The Machine Learning team at Apple are taking part in an open-source DataComp for Language Models project alongside others in the industry. The two models Apple has recently produced have been seen to match or beat other leading training models, such as Llama 3 and Gemma.
Language models like these are used to train AI engines, like ChatGPT, by providing a standard framework. This includes an architecture, parameters, and filtering of datasets to provide higher-quality data for the AI engines to draw from.
I am really excited to introduce DataComp for Language Models (DCLM), our new testbed for controlled dataset experiments aimed at improving language models. 1/x pic.twitter.com/uNe5mUJJxb
-- Vaishaal Shankar (@Vaishaal)
Apple's submission to the project includes two models: a larger one with seven billion parameters, and a smaller one with 1.4 billion parameters. Apple's team said the larger model has outperformed the previous top model, MAP-Neo, by 6.6 percent in benchmarks.
More remarkably, the Apple team's DataComp-LM model uses 40 percent less computing power to accomplish those benchmarks. It was the best-performing model among those with open datasets, and competitive against those with private datasets.
Apple has made its models fully open -- the dataset, model weights, and training code are all available for other researchers to work with. Both the larger and smaller models scored well enough in the Massive Multitask Language Understanding (MMLU) benchmarks to be competitive against commercial models.
Benchmarks for Apple's larger dataset prove competitive against other models.
In debuting both Apple Intelligence and Private Cloud Compute at its WWDC conference in June, the company silenced critics who had claimed that Apple was behind the industry on artificial intelligence applications in its devices. Research papers from the Machine Learning team published before and after that event proved that the company is in fact an AI industry leader.
These models the Apple team has released are not intended for use in any future Apple products. They are community research projects to show improved effectiveness in curating small or large datasets used to train AI models.
Apple's Machine Learning team have previously shared research with the larger AI community. The datasets, research notes, and other assets are all to be found at HuggingFace.co, a platform dedicated to expanding the AI community.
Read on AppleInsider
Comments
Meanwhile, Apple's LLM can't even say who Obama is.
https://www.reddit.com/r/singularity/comments/1ck55ag/apples_open_elm_model_is_fast_but_it_isnt_very/?rdt=32852
"Apple's ability to create incredibly compact yet powerful AI models is unequaled in the industry."
There is a huge amount of research going into tiny LLMs, and it seems new advances come out every week, open source or not. Many are specifically tailored to particular areas or languages. Apple is unlikely to challenge native Chinese models for the Chinese language. And what about Arabic?
When I say 'huge' I mean it's very difficult to know exactly what might appear and keep track of it all.
Earbuds can make great use of NLP and everything related to voice biometrics, bone conduction, audio processing, etc. Little more is needed there.
Then this:
"In debuting both Apple Intelligence and Private Cloud Compute at its WWDC conference in June, the company silenced critics who had claimed that Apple was behind the industry on artificial intelligence applications in its devices"
The critics were simply pointing out reality. No equivalent shipping product from Apple was available. That remains the case today and will remain the case until something actually ships. Sometime late this year on a small range of models and well into next year for the rest of what was announced.
To all intents and purposes 'Apple Intelligence' was more akin to a placeholder at WWDC to generate buzz and let people know what's coming.
That's fine but we still have to wait to see what eventually comes out of the pipe and how it performs.
The more the better IMO but the 'industry' isn't slacking and is actually shipping.
https://opensource.apple.com/projects/
Perhaps those "critics" were entirely wrong?
MS CoPilot, as an example, has subsequently received quite a bit of "well earned" bad press by early users.
Perhaps the rush to deliver has consequences?
More to the point, critics totally ignored the fact that Apple has something on the order of 1.5 billion iPhone users, most of whom will upgrade in the future to more powerful AI hardware that does, in essence, allow increasingly larger models.
Calling this AI "race" at the starting gate, as you are wont to do, isn't a determining factor in Apple's ultimate AI success.
https://www.youtube.com/watch?v=dx-tMK7w5g8
Recall, Qualcomm SOC fiasco, Windows emulation software on Arm, Crowdstrike and Ads.......
https://www.pcmag.com/news/10-reasons-not-to-upgrade-to-windows-11
https://www.microsoft.com/en-us/windows/business/windows-11-pro-onward-itdm?ef_id=_k_36c7ed8943d21155b9758b17d2c20ca4_k_&OCID=AIDcmmdmw4ue0n_SEM__k_36c7ed8943d21155b9758b17d2c20ca4_k_&msclkid=36c7ed8943d21155b9758b17d2c20ca4#layout-container-uid3104 Bold claims the most secure Windows ever.
Market inertia is what Microsoft leads in.
The stock market impact is irrelevant to the point.
If Apple announced, but didn't ship, a non-invasive continuous glucose monitor, the stock price would soar. It wouldn't change the argument of the critics in the slightest, even if the term 'critics' is entirely incorrect.
Apple had little to nothing ready to go at WWDC while others had multiple versions of their solutions under their belts.
Previously, Apple had deliberately avoided the term which, in hindsight, wasn't the greatest of moves.
They couldn't go another year with only the ML line. Yeah, the stock can go both ways.
What has been announced will roll out very slowly and until it actually reaches users no one knows how it will perform.
That isn't being critical. It is being a realist.