Apple's custom Neural Engine in iPhone XS about 'letting nothing get in your way'
Apple's insistence on custom design of chips like the Neural Engine in the iPhone XS, XS Max, and XR is about unchaining the company's other designers, according to the lead of its chip architects.
"It's about owning the pieces that are critical and letting nothing get in your way," VP Tim Millet told Wired in an interview published on Tuesday. "The experiences we deliver through the phone are critically dependent on the chip."
Work on the first-generation Neural Engine, which appeared in the iPhone 8, 8 Plus, and X, reportedly began a few years ago with photography in mind. Engineers at the company thought iPhone cameras could be enhanced by machine learning, and some of the initial results included 2017's Portrait Lighting and Face ID technologies.
"We couldn't have done that [Face ID] properly without the Neural Engine," Millet said.
The second-generation Neural Engine in 2018 iPhones can run 5 trillion operations per second, and helps deliver more photo-related features such as the ability to control depth-of-field after a photo was taken, and better augmented reality. Apple is additionally opening up the chip to use by outside developers.
Most non-Apple smartphones use off-the-shelf chip designs from companies like Qualcomm. While those can be powerful and are steadily advancing, Apple's in-house design work has allowed it to build tight hardware/software integration and achieve features that would otherwise have to wait.
Apple has been designing custom chips since the A4 processor used in 2010's iPhone 4, following the takeover of PA Semi. Actual manufacturing was for some time handled by Samsung, but is now thought to be the exclusive domain of TSMC.
The use of custom designs has spread beyond central processors to parts like the T2 chip, which handles the Touch Bar and SSDs in Macs. Some third-party chips remain, such as cellular modems and Wi-Fi.
Comments
The Kirin 970 claimed 1.92 trillion operations per second (TOPS).
The A11 claimed 600 billion operations per second, or 0.6 TOPS.
The Kirin 970 performs 3.2x as many TOPS as the A11.
According to Huawei's own benchmark tests using ResNet50 (image inferences per second), we have the following results:
Kirin 970 - 2,030 images per second.
A11 - 1,458 images per second.
The Kirin 970 performs 1.4x as many images as the A11.
The question I have is this: how can a processor that claims to have 3.2x the performance in TOPS only manage 1.4x the performance on an actual task (image inferences)? A task that Huawei themselves picked to showcase their processor, so nobody can claim bias in favor of the A11.
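For what it's worth, the arithmetic behind those two ratios checks out. A quick sketch (the inputs are the vendor-claimed figures quoted above, not independent measurements):

```python
# Recomputing the ratios from the vendor-claimed figures quoted above.
kirin_tops = 1.92   # Kirin 970 claimed trillion ops/sec
a11_tops = 0.6      # A11 claimed trillion ops/sec
kirin_imgs = 2030   # ResNet50 images/sec (Huawei's benchmark)
a11_imgs = 1458     # ResNet50 images/sec (A11, same benchmark)

tops_ratio = kirin_tops / a11_tops          # raw compute claim
throughput_ratio = kirin_imgs / a11_imgs    # actual task throughput

print(f"TOPS ratio: {tops_ratio:.1f}x, "
      f"ResNet50 throughput ratio: {throughput_ratio:.1f}x")
# prints "TOPS ratio: 3.2x, ResNet50 throughput ratio: 1.4x"
```

The gap between the two ratios is exactly the commenter's point: peak TOPS is a theoretical ceiling, while real inference throughput depends on memory bandwidth, precision, and how well the workload maps onto the NPU.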
Speaking of custom design, the Pixel 2 has its own neural engine as well. But since they don't design their own SoC they had to "tack it on" to the Snapdragon. Which means it won't be integrated nearly as tightly as you'd see in the A11/A12. Basically, they are limited by the bandwidth between the SoC and the external neural engine. So while it has higher performance (3 TOPS), it's doubtful that performance can be sustained.
I'm really looking forward to seeing the performance of the A12 neural engine.
I don't know for sure, but Huawei has a long history of falsifying benchmarks and throttling/boosting performance way over the device's normal TDP to win the numbers war.
You picked one 'benchmark' from Huawei but about eight were presented officially. Of those, some saw the Kirin 970 firing way past the A11 but others were closer.
There are no real benchmarks for NPUs yet.
The real point of NPUs is what you can do with them as well as how fast and at what efficiency cost.
The Kirin 970 was used to improve voice recognition, image stabilization, motion blur and recognition, system (hardware) efficiency, noise reduction etc. Mostly trained offline and made available to the NPU via upgrades.
As for the article, the 'home grown' versus 'off the shelf' argument doesn't really cut it nowadays. The only real difference is if you have a home grown chip that has no off the shelf equivalent.
The moment Apple opened up use of the NPU to outside developers, it became a de-facto off-the-shelf solution too. Just like Qualcomm, to a certain degree. Not forgetting Huawei, of course, which co-designed its NPU and opened it up to developers from the get-go (via standard Android APIs and its own in-house API).
All of them are using state of the art technology to great effect.
They cheat on benchmarks; that's a fact.
Such pure nonsense marketing.
So where are these 8 benchmarks "presented officially"? I could only find the one. I'm sure your employer will make them available to you so that you can post them up for all of us to see.
"As for the article, the 'home grown' versus 'off the shelf' argument doesn't really cut it nowadays."
Bullshit. Do you not understand the relationship between a programming language, a compiler and the processor? Nobody can control/optimize the entire stack like Apple can.
"All of them are using state of the art technology to great effect."
You mean like using off-the-shelf (and inferior) ARM cores instead of designing their own (for the Kirin 980)? So Huawei designed an NPU. Compared to a CPU, an NPU is pretty simple to design and implement. Which is probably why Huawei added an NPU but used ARM cores - it's all they're capable of.
LOL. How about during the official presentation of the SoC!
To be honest, I just didn't pay that much attention to the numbers and I am speaking purely from memory, but if you're willing to sit through the presentation you'll catch it.
"Bullshit. Do you not understand the relationship between a programming language, a compiler and the processor? Nobody can control/optimize the entire stack like Apple can."
LOL. 'Bullshit'? Do you know why the Kirin 970 didn't use the latest cores available at the time? Purely due to the maturity of the processors and optimisation. They went with stable, mature optimised tools and code for processors to avoid headaches. They deliberately passed on the latest technology.
The SoC is more than the cores. You really need to understand that. Everything has to be balanced. ARM and Qualcomm offer the tools to get the most out of their cores. Yes, Apple can control the design and code to it but when the finished product is opened up to developers, they are writing to an end product with all the limitations imposed by the manufacturer. This is exactly the same as Qualcomm and manufacturers that use their SoCs. Or Huawei for that matter with HiSilicon.
If Apple were offering something competitors weren't getting from the likes of Qualcomm, it would be different, but is that the case?
Has Qualcomm had problems delivering the features required by handset manufacturers? Even if it is an off-the-shelf solution? I ask because Apple isn't exactly performing that well in key areas - even with the new phones. For example, it is catching up on modem performance but isn't exactly making a song and dance about it!
Yes, that off-the-shelf solution was actually offering a better modem than Apple's.
In fact, off-the-shelf parts aren't uncommon in iPhones.
Wi-fi. Huawei has designed its own ultra fast wi-fi solution for the Kirin 980. That is probably beyond Apple's scope due to IP issues but Apple has wi-fi options even though it is an off the shelf part.
AR. We have been reading about the big AR advantage Apple has for a few years now. Nothing has actually come of it yet, and when it does, do you doubt that competing, off-the-shelf solutions will offer similar functionality?
"You mean like using off-the-shelf (and inferior) ARM cores instead of designing their own (for the Kirin 980)? So Huawei designed an NPU. Compared to a CPU, an NPU is pretty simple to design and implement. Which is probably why Huawei added an NPU but used ARM cores - it's all they're capable of."
An NPU is simple to design? Wow! I will let that stand (and fall) all by itself.
Huawei (HiSilicon) co-designed the NPU with Cambricon. I very much doubt it was as simple as you claim but if it was so simple, how did Huawei manage to squeeze more out of it than Apple?
Think about that for a moment.
Huawei marketing sometimes goes OTT but what exactly is false about the presentation of the Kirin 980? Battery tech? Wi-fi tech? Modem? DSP? ISP? NPU?
It must be noted, however, that the so-called performance mode will be opened up to users. It is somewhat similar to the performance throttling mess on iPhones and Apple's decision to put throttling behaviour in the hands of the user.
Most of the Pixel 2's inference computations are done on the Snapdragon SoC (HDR+, offline music recognition, etc.).
The PVC is currently only being used for applying HDR+ to 3rd party camera applications, though there will likely be more uses when the Pixel 3 launches next month.
All that Apple has done is put the use and control of this software, with added analytics, into the hands of the user.
https://www.geekbench.com/blog/2018/09/huawei-benchmark-boost/
Apple released an update that reduced the performance of many phones and failed to adequately communicate the change to users. After a major media backlash it came out with a statement and promised to return performance to users - if they wished - via a software switch.
Can you see any similarity now?
I would add that with iOS 12, everybody from the 5s on is getting a free performance upgrade, which, based on anecdotal evidence, is especially notable on the iPhone 6 models.
I noticed that my iPhone 7 Plus even seems to be snappier, thanks to the efforts of Apple's iOS team and iOS 12.
Would that 5-year-old Android devices did as well.