Apple's custom Neural Engine in iPhone XS about 'letting nothing get in your way'

Apple's insistence on custom design of chips like the Neural Engine in the iPhone XS, XS Max, and XR is about unchaining the company's other designers, according to the leader of its chip architecture team.

[Image: Apple iPhone XS]

"It's about owning the pieces that are critical and letting nothing get in your way," VP Tim Millet told Wired in an interview published on Tuesday. "The experiences we deliver through the phone are critically dependent on the chip."

Work on the first-generation Neural Engine, which appeared in the iPhone 8, 8 Plus, and X, reportedly began a few years ago with photography in mind. Engineers at the company thought iPhone cameras could be enhanced by machine learning, and some of the initial results included 2017's Portrait Lighting and Face ID technologies.

"We couldn't have done that [Face ID] properly without the Neural Engine," Millet said.

The second-generation Neural Engine in the 2018 iPhones can run 5 trillion operations per second and helps deliver more photo-related features, such as the ability to adjust depth of field after a photo has been taken, as well as better augmented reality. Apple is additionally opening up the chip for use by outside developers.
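For developers, that access comes through the Core ML framework rather than by programming the Neural Engine directly; the system decides which silicon actually runs a given model. Below is a minimal Swift sketch of how an app opts in to hardware acceleration, assuming a hypothetical compiled model named "FlowerClassifier.mlmodelc" bundled with the app:

```swift
import Foundation
import CoreML

// Ask Core ML (iOS 12 and later) to use any available compute unit --
// CPU, GPU, or Neural Engine -- when running a model. Which unit is
// actually chosen is up to the system, not the app.
let config = MLModelConfiguration()
config.computeUnits = .all

// "FlowerClassifier.mlmodelc" is a hypothetical compiled model in the app bundle.
if let url = Bundle.main.url(forResource: "FlowerClassifier", withExtension: "mlmodelc") {
    let model = try? MLModel(contentsOf: url, configuration: config)
    // model?.prediction(from:) would then run on whichever unit Core ML picked.
}
```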

Most non-Apple smartphones use off-the-shelf chip designs from companies like Qualcomm. While those can be powerful and are steadily advancing, Apple's in-house design work has allowed it to build tight hardware/software integration and achieve features that would otherwise have to wait.

Apple has been designing custom chips since the A4 processor used in 2010's iPhone 4, following the takeover of PA Semi. Actual manufacturing was for some time handled by Samsung, but is now thought to be the exclusive domain of TSMC.

The use of custom designs has spread beyond central processors to chips like the T2, which handles the Touch Bar and SSDs in Macs. Some third-party chips remain, such as cellular modems and Wi-Fi.




Comments

  • Reply 1 of 27
    radarthekat Posts: 3,842, moderator
    I have a feeling the ML engine can and will be applied to good effect in creating efficiencies in the dispatching of processes, such that iPhones will be able to perform better even as they age. Wouldn't that be a home run? Like self-driving cars that all learn from the edge cases encountered by each individual car, perhaps the neural engine can be put to use to evolve faster means of scheduling processes and allocating resources under a myriad of load/usage scenarios, with the most efficient means being preserved into a new generation of experimentation. It could all be taking place as we simply use our iPhones, reporting back (with each iPhone owner's permission) successful evolutionary branches.
    edited September 2018
  • Reply 2 of 27
    Here are some interesting numbers for you:

    The Kirin 970 claimed 1.92 trillion operations per second (TOPS).
    The A11 claimed 600 billion operations per second, or 0.6 TOPS.
    The Kirin 970 performs 3.2x as many TOPS as the A11.

    According to Huawei's own benchmark tests using ResNet50 (image inferences per second), we have the following results:

    Kirin 970 - 2,030 images per second.
    A11 - 1,458 images per second.
    The Kirin 970 performs 1.4x as many images as the A11.

    The question I have is this: how can a processor that claims to have 3.2x the performance in TOPS only manage 1.4x the performance on an actual task (image inferences)? It's a task that Huawei themselves picked to showcase their processor, so nobody can claim the test was biased toward the A11.
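    A quick sanity check on those ratios, using only the figures quoted above (an illustrative Swift sketch, not a benchmark): divide the measured speedup by the claimed peak advantage to see how much of the paper advantage actually shows up in the task.

```swift
import Foundation

// Figures exactly as quoted in this post.
let kirinPeakTOPS = 1.92, a11PeakTOPS = 0.6        // claimed peak throughput
let kirinImages = 2_030.0, a11Images = 1_458.0     // reported ResNet50 results

let claimedAdvantage = kirinPeakTOPS / a11PeakTOPS  // ~3.2x on paper
let measuredAdvantage = kirinImages / a11Images     // ~1.4x in the ResNet50 test
let realized = measuredAdvantage / claimedAdvantage // ~0.44: under half the paper advantage

print(String(format: "claimed %.1fx, measured %.1fx, realized %.0f%%",
             claimedAdvantage, measuredAdvantage, realized * 100))
```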


    Speaking of custom design, the Pixel 2 has its own neural engine as well. But since they don't design their own SoC they had to "tack it on" to the Snapdragon. Which means it won't be integrated nearly as tightly as you'd see in the A11/A12. Basically, they are limited by the bandwidth between the SoC and the external neural engine. So while it has higher performance (3 TOPS), it's doubtful that performance can be sustained.

    I'm really looking forward to seeing the performance of the A12 neural engine.
  • Reply 3 of 27

    The question I have is this: how can a processor that claims to have 3.2x the performance in TOPS only manage 1.4x the performance on an actual task (image inferences)? It's a task that Huawei themselves picked to showcase their processor, so nobody can claim the test was biased toward the A11.

    I don't know for sure, but Huawei has a long history of falsifying benchmarks and throttling/boosting performance way over the normal TDP of the device to win the numbers war.
  • Reply 4 of 27
    avon b7 Posts: 7,663, member
    [quoting Reply 2 above in full]
    There is more to it than operations per second.

    You picked one 'benchmark' from Huawei but about eight were presented officially. Of those, some saw the Kirin 970 firing way past the A11 but others were closer.

    There are no real benchmarks for NPUs yet.

    The real point of NPUs is what you can do with them, as well as how fast they do it and at what cost in efficiency.

    The Kirin 970 was used to improve voice recognition, image stabilization, motion blur and recognition, system (hardware) efficiency, noise reduction, etc. The models were mostly trained offline and made available to the NPU via updates.

    As for the article, the 'home grown' versus 'off the shelf' argument doesn't really cut it nowadays. The only real difference is if you have a home grown chip that has no off the shelf equivalent.

    The moment Apple opened up use of the NPU to outside developers, it became a de facto off-the-shelf solution too, just like Qualcomm's to a certain degree. And don't forget Huawei, which co-designed its NPU and also opened it up to developers from the get-go (via Android's standard APIs and its own in-house API).

    All of them are using state of the art technology to great effect.


  • Reply 5 of 27
    Soli Posts: 10,035, member
    If you can reduce security so you can let a Mac boot from USB by changing a setting in macOS Recovery, doesn't that mean the security is easily bypassed? Or, do you have to unlock those settings by first using your system password to unlock the drive?
    edited September 2018
  • Reply 6 of 27
    Soli Posts: 10,035, member
    avon b7 said:
    All of them are using state of the art technology to great effect.
    And Huawei is using state-of-the-art lying to great effect.
  • Reply 7 of 27
    tmay Posts: 6,328, member
    avon b7 said:
    [quoting Reply 4 above in full]

    I have very low confidence in anything that Huawei states as "fact", based on past and recent history of various PR statements, so "benchmark from Huawei" really needs to have an asterisk attached.

    They cheat on benchmarks; that's a fact.


  • Reply 8 of 27
    Soli Posts: 10,035, member
    tmay said:
    [quoting Reply 7 above, which concludes:]
    They cheat on benchmarks; that's a fact.
    Huawei is so corrupt they make Samsung seem honest.
  • Reply 9 of 27
    Does the name ‘Neural Engine’ annoy anyone else apart from me?

    Such pure nonsense marketing.
  • Reply 10 of 27
    Soli Posts: 10,035, member
    kkqd1337 said:
    Does the name ‘Neural Engine’ annoy anyone else apart from me?

    Such pure nonsense marketing.
    Why does it annoy you? How is being inspired by biology to make computers more intelligent a bad thing? You know Apple wasn't the first to start using 'neural' in relation to computing, right? I think it's at least as old as the 1960s, and I think it also follows on the heels of the term artificial intelligence.
  • Reply 11 of 27
    A TPU runs neural network computations... so 'Neural Engine' is a well-chosen name.
  • Reply 12 of 27
    avon b7 said:
    [quoting Reply 4 above in full]

    So where are these 8 benchmarks "presented officially"? I could only find the one. I'm sure your employer will make them available to you so that you can post them up for all of us to see.


    "As for the article, the 'home grown' versus 'off the shelf' argument doesn't really cut it nowadays."

    Bullshit. Do you not understand the relationship between a programming language, a compiler and the processor? Nobody can control/optimize the entire stack like Apple can.


    "All of them are using state of the art technology to great effect."

    You mean like using off-the-shelf (and inferior) ARM cores instead of designing their own (for the Kirin 980)? So Huawei designed an NPU. Compared to a CPU, an NPU is pretty simple to design and implement. Which is probably why Huawei added an NPU but used ARM cores - it's all they're capable of.
  • Reply 13 of 27
    tipoo Posts: 1,141, member
    The juxtaposition in the article makes me wonder when the Mac will get the Neural Engine. The T2 is based around something like the A10, so possibly as early as the next one.
  • Reply 14 of 27
    avon b7 Posts: 7,663, member
    [quoting Reply 12 above in full]

    "So where are these 8 benchmarks "presented officially"? I could only find the one. I'm sure your employer will make them available to you so that you can post them up for all of us to see."

    LOL. How about during the official presentation of the SoC!

    To be honest, I just didn't pay that much attention to the numbers and I'm speaking purely from memory, but if you're willing to sit through the presentation you'll catch it.

    "Bullshit. Do you not understand the relationship between a programming language, a compiler and the processor? Nobody can control/optimize the entire stack like Apple can."

    LOL. 'Bullshit'? Do you know why the Kirin 970 didn't use the latest cores available at the time? Purely due to the maturity of the processors and optimisation. They went with stable, mature optimised tools and code for processors to avoid headaches. They deliberately passed on the latest technology.

    The SoC is more than the cores. You really need to understand that. Everything has to be balanced. ARM and Qualcomm offer the tools to get the most out of their cores. Yes, Apple can control the design and code to it but when the finished product is opened up to developers, they are writing to an end product with all the limitations imposed by the manufacturer. This is exactly the same as Qualcomm and manufacturers that use their SoCs. Or Huawei for that matter with HiSilicon.

    If Apple were offering something competitors weren't getting from the likes of Qualcomm, it would be different, but is that the case?

    Has Qualcomm had problems delivering the features required by handset manufacturers? Even if it is an off-the-shelf solution? I ask because Apple isn't exactly performing that well in key areas - even with the new phones. For example, it is catching up on modem performance but isn't exactly making a song and dance about it!

    Yes, that off-the-shelf solution was actually offering a better modem than Apple's.

    In fact off the shelf parts aren't uncommon in iPhones.

    Wi-Fi: Huawei has designed its own ultra-fast Wi-Fi solution for the Kirin 980. That is probably beyond Apple's scope due to IP issues, but Apple still has Wi-Fi options even if they come as off-the-shelf parts.

    AR: We have been reading about the big AR advantage Apple has for a few years now. Nothing has actually come of it yet, and when it does, do you really doubt that competing, off-the-shelf solutions will offer similar functionality?

    "You mean like using off-the-shelf (and inferior) ARM cores instead of designing their own (for the Kirin 980)? So Huawei designed an NPU. Compared to a CPU, an NPU is pretty simple to design and implement. Which is probably why Huawei added an NPU but used ARM cores - it's all they're capable of."

    An NPU is simple to design? Wow! I will let that stand (and fall) all by itself.

    Huawei (HiSilicon) co-designed the NPU with Cambricon. I very much doubt it was as simple as you claim but if it was so simple, how did Huawei manage to squeeze more out of it than Apple?

    Think about that for a moment.
  • Reply 15 of 27
    avon b7 Posts: 7,663, member
    Soli said:
    avon b7 said:
    All of them are using state of the art technology to great effect.
    And Huawei is using state-of-the-art lying to great effect.
    Except that the SoC is from HiSilicon. Are you claiming that HiSilicon is lying?

    Huawei marketing sometimes goes OTT but what exactly is false about the presentation of the Kirin 980? Battery tech? Wi-fi tech? Modem? DSP? ISP? NPU?



  • Reply 16 of 27
    avon b7 Posts: 7,663, member
    tmay said:
    [quoting Reply 7 above, which concludes:]
    They cheat on benchmarks; that's a fact.


    That is a valid criticism of the PR department.

    It must be noted however that the so called performance mode will be opened up to users. It is somewhat similar to the performance throttling mess on iPhones and Apple's decision to put throttling behaviour in the hands of the user.
  • Reply 17 of 27
    KITA Posts: 392, member

    Speaking of custom design, the Pixel 2 has its own neural engine as well. But since they don't design their own SoC they had to "tack it on" to the Snapdragon. Which means it won't be integrated nearly as tightly as you'd see in the A11/A12. Basically, they are limited by the bandwidth between the SoC and the external neural engine. So while it has higher performance (3 TOPS), it's doubtful that performance can be sustained.
    The Pixel Visual Core is an entire SoC on its own. It has a single-core Cortex-A53 CPU, 512 MB of LPDDR4 DRAM, eight IPU cores, and MIPI and PCIe interfaces.

    Most of the Pixel 2's inference computations are done on the Snapdragon SoC (HDR+, offline music recognition, etc.).

    The PVC is currently only being used to apply HDR+ in third-party camera applications, though there will likely be more uses when the Pixel 3 launches next month.
  • Reply 18 of 27
    tmay Posts: 6,328, member
    avon b7 said:
    [quoting Reply 16 above, which concludes:]
    That is a valid criticism of the PR department.

    It must be noted however that the so called performance mode will be opened up to users. It is somewhat similar to the performance throttling mess on iPhones and Apple's decision to put throttling behaviour in the hands of the user.
    Huawei was found guilty of cheating on performance benchmarks as a marketing tactic for new products, and has been banned by Geekbench for that. This is not even "somewhat similar" to preventative software that Apple incorporated as default in its operating system to mitigate the complete loss of operation of a user's iPhone, due to an aged battery, with a lower performance mode for the SoC.

    All that Apple has done is put the use and control of this software, with added analytics, into the hands of the user.


    https://www.geekbench.com/blog/2018/09/huawei-benchmark-boost/

    "Some have asked me why this issue matters; if the hardware is clearly capable of performance like this, why should Huawei and HiSilicon not be able to present it that way? The higher performance results that 3DMark, GFXBench, and now Geekbench show are not indicative of the performance consumers get with their devices on real applications. The entire goal of benchmarks and reviews is to try to convey the experience a buyer would get for a smartphone, or anything else for that matter.

    If Huawei wanted one of its devices to offer this level of performance in games and other applications, it could do so, but at the expense of other traits. Skin temperature, battery life, and device lifespan could all be impacted – something that would definitely affect the reviews and reception of a smartphone. Hence, the practice of cheating in an attempt to have the best of both".

    edited September 2018
  • Reply 19 of 27
    avon b7 Posts: 7,663, member
    tmay said:
    [quoting Reply 18 above in full]

    Ok. I'll change it. It wasn't 'somewhat' similar. It was massively similar.

    Apple released an update that reduced the performance of many phones and failed to adequately communicate the change to users. After a major media backlash it came out with a statement and promised to return performance to users - if they wished - via a software switch.

    Can you see any similarity now?
  • Reply 20 of 27
    tmay Posts: 6,328, member
    avon b7 said:
    [quoting Reply 19 above, which concludes:]
    Ok. I'll change it. It wasn't 'somewhat' similar. It was massively similar.

    Apple released an update that reduced the performance of many phones and failed to adequately communicate the change to users. After a major media backlash it came out with a statement and promised to return performance to users - if they wished - via a software switch.

    Can you see any similarity now?
    No, I can't see the similarities, as the context is completely different, which you are obviously ignoring. 

    I would add that with iOS 12, everybody is getting a free performance upgrade from the iPhone 5s on, which is especially notable on the iPhone 6, based on anecdotal evidence.

    I noticed that my iPhone 7 Plus even seems to be snappier, thanks to the efforts of Apple's iOS team and iOS 12.

    Would that five-year-old Android devices did as well.


    edited September 2018