Apple executives explain Apple Silicon & Neural Engine in new interview

Posted in General Discussion
Apple executives recently talked about Apple Silicon in an interview, explaining the Neural Engine and the company's chip design process.

Apple leaders talk about Apple Silicon


Laura Metz, the Director of Product Marketing, Anand Shimpi of Hardware Engineering, and Tuba Yalcin from the Pro Workflow team join the interview to discuss the Apple Silicon design process and some of the newest Macs.

The 30-minute video with Andru Edwards covers various topics, from how the Pro Workflow team contributes to Apple's design process to how Apple fits more powerful chips into smaller devices.

Video: "How Apple Silicon Dominates With The M2 Chip!" (31:19, uploaded February 9, 2023)

They also discussed the Neural Engine found in Apple Silicon. Introduced in the A11 Bionic chip inside the iPhone 8 and iPhone X, the Neural Engine is a specialized computing component for specific tasks, as opposed to the general-purpose computing that the CPU delivers.

The Neural Engine is optimized to handle machine learning and neural networking tasks for things like photography. For example, the 16-core Neural Engine in the M2 Pro and M2 Max chips can handle 15.8 trillion operations per second.
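A quick back-of-the-envelope sketch of that throughput figure (the per-frame operation count below is a hypothetical workload size chosen for illustration, not an Apple number):

```python
# Throughput arithmetic for the 16-core Neural Engine in M2 Pro/M2 Max,
# using the 15.8 trillion ops/sec figure quoted above.
TOTAL_OPS_PER_SEC = 15.8e12
CORES = 16

per_core = TOTAL_OPS_PER_SEC / CORES
print(f"Per-core throughput: {per_core:.3e} ops/sec")

# A hypothetical vision model needing ~5 billion ops per frame could,
# in the ideal case, be evaluated at:
OPS_PER_FRAME = 5e9  # assumed workload size, for illustration only
print(f"Ideal ceiling: {TOTAL_OPS_PER_SEC / OPS_PER_FRAME:.0f} frames/sec")
```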

Next, they discuss transistors, the foundation of computer chip technology. As semiconductor processes become more advanced over time, manufacturers are able to add more transistors to chips during fabrication.

Moving to a smaller transistor technology is one of the ways that Apple can deliver more features, more performance, more efficiency, and better battery life. As transistors get smaller, manufacturers can add more of them, which can result in additional cores to the CPU and GPU.
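As a rough illustration of that scaling (idealized; modern "nanometer" marketing names no longer map directly to a physical feature size):

```python
# Idealized transistor-density scaling: shrink the linear feature size by
# a factor s and the number of transistors per unit area grows by s**2.
def density_gain(old_nm: float, new_nm: float) -> float:
    """Ideal area-density multiplier when moving to a smaller process node."""
    linear_shrink = old_nm / new_nm
    return linear_shrink ** 2

# e.g. an idealized 7nm -> 5nm move: (7/5)**2, or ~1.96x the transistors
# in the same die area (real-world gains differ from this ideal).
print(f"{density_gain(7, 5):.2f}x")
```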

Apple also has a special Pro Workflow team with experts across music, video, photography, 3D visual effects, and more. The team puts together complex workflows designed to test the limits of Apple Silicon and give feedback to the engineers.


Comments

  • Reply 1 of 5
    Where is the video posted?
  • Reply 2 of 5
    The first link in the article works for me: 
  • Reply 3 of 5
     Nope … def some odd linking issues. Most pronounced between AI’s app and the browser site.
  • Reply 5 of 5
    Here is a transcript of Anand Shimpi's technical contributions to this. For those who don't know, he is the founder of Anandtech, and was hired away from his own very successful tech site by Apple in 2014. It seems like he really is one of the driving forces behind Apple Silicon, and likely a major influence in (or his hire was a result of) the decision to bring it to the Mac. He says some interesting things in here (the MacBook Pro will always get the current generation, and that's also the goal for Macs in general), but overall it's a useful explanation of how Apple is approaching this, of what they are trying to do:

    [Intro questions]

    The silicon team doesn’t operate in a vacuum, right? … When these products are being envisioned and designed, folks on the architecture team, the design team, they’re there, they’re aware of where we’re going, what’s important, both from a workload perspective, as well as the things that are most important to enable in all of these designs.

    I think part of what you’re seeing now is this now decade-plus long, maniacal obsession with power-efficient performance and energy efficiency. If you look at the roots of Apple Silicon, it all started with the iPhone and the iPad. There we’re fitting into a very, very constrained environment, and so we had to build these building blocks, whether it’s our CPU or GPU, media engine, neural engine, to fit in something that’s way, way smaller from a thermal standpoint and a power delivery standpoint than something like a 16-inch MacBook Pro. So the fundamental building blocks are just way more efficient than what you’re typically used to seeing in a product like this. The other thing you’re noticing is, for a lot of tasks that maybe used to be high-powered use cases, on Apple Silicon they actually don’t consume that much power. If you look, compared to what you might find in a competing PC product, depending on the workload, we might be a factor of two or a factor of four times lower power. That allows us to deliver a lot of workloads that might have been high-power use cases on a different product in something that actually is a very quiet and cool and long-lasting sort of use case. The other thing that you’re noticing is that the single-thread performance, the snappiness of your machine, it’s really the same high-performance core regardless of if you’re talking about a MacBook Air or a 14-inch Pro or 16-inch Pro or the new Mac mini, and so all of these machines can accommodate one of those cores running full tilt, again we’ve turned a lot of those usages and usage cases into low-power workloads. You can’t get around physics, though, right? ... So if you light up all the cores, all the GPUs, the 14-inch system just has less thermal capacity than the 16, right? ... So depending on your workload, that might drive you to a bigger machine, but really the chips are across the board incredibly efficient.

    [Battery life question]

    You can look at how chip design works at Apple. You have to remember we’re not a merchant silicon vendor, at the end of the day we ship product. So the story for the chip team actually starts at the product, right? ... There is a vision that the design team, that the system team has that they want to enable, and the job of the chip is to enable those features and enable that product to deliver the best performance within the constraints, within the thermal envelope of that chassis, that is humanly possible. So if you look at kind of what we did going from the M1 family to the M2 Pro and M2 Max, at any given power point, we’re able to deliver more performance. If you look at, on the CPU we added two more efficiency cores, two more of our e-cores. That allowed us, or was part of what allowed us, to deliver more multi-thread performance, again, at every single power point where the M1 and M2 curves overlap we were able to deliver more performance at any given power point. The dynamic range of operations [is] a little bit longer, a little bit wider, so we do have a slight increase in terms of peak power, but in terms of efficiency, across the range, it is a step forward versus the M1 family, and that directly translates into battery life. The same thing is true for the GPU, it’s kind of counterintuitive, but a big GPU running a modest frequency and voltage, is actually a very efficient way to fill up a box. So that’s been our philosophy dating back to iPhone and iPad, and it continues on the Mac as well.
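The "more performance at every power point" claim describes one performance-versus-power curve sitting above another wherever they overlap. A toy sketch of that idea (curve shapes and constants are invented for illustration, not measured data):

```python
# Toy perf-vs-power curves with diminishing returns as power rises. The
# newer chip's curve is uniformly higher, so at any fixed wattage it is
# faster, and at any fixed performance target it draws less power.
def perf_old(watts: float) -> float:
    return 100 * watts ** 0.7  # invented curve, not measured data

def perf_new(watts: float) -> float:
    return 115 * watts ** 0.7  # same shape, uniformly ~15% higher

overlap = range(1, 41)  # wattages where both chips can operate
assert all(perf_new(w) > perf_old(w) for w in overlap)
print("newer curve dominates at every sampled power point")
```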

But really the thing that we see, that the iPhone and the iPad have enjoyed over the years, is this idea that every generation gets the latest of our IPs, the latest CPU IP, the latest GPU, media engine, neural engine, and so on and so forth, and so now the Mac gets to be on that cadence too. If you look at how we’ve evolved things on the phone and iPad, those IPs tend to get more efficient over time. There is this relationship, if the fundamental chassis doesn’t change, any additional performance you draw, you deliver has to be done more efficiently, and so this is the first time the MacBook Pro gets to enjoy that and be on that same sort of cycle.

On the silicon side, the team doesn’t pull any punches, right? … The goal across all the IPs is, one, make sure you can enable the vision of the product, that there’s a new feature, a new capability that we have to bring to the table in order for the product to have everything that we envisioned, that’s clearly something that you can’t pull back on. And then secondly, it’s do the best you can, right? ... Get as much down in terms of performance and capability as you can in every single generation. The other thing is, Apple’s not a chip company. At the end of the day, we’re a product company. So we want to deliver, whether it’s features, performance, efficiency. If we’re not able to deliver something compelling, we won’t engage, right? ... We won’t build the chip. So each generation we’re motivated as much as possible to deliver the best that we can.

    [Neural engine question]

    … There are really two things you need to think about, right? ... The first is the tradeoff between a general purpose compute engine and something a little more specialized. So, look at our CPU and GPU, these are big general purpose compute engines. They each have their strengths in terms of the types of applications you’d want to send to the CPU versus the GPU, whereas the neural engine is more focused in terms of the types of operations that it is optimized for. But if you have a workload that’s supported by the neural engine, then you get the most efficient, highest density place on the chip to execute that workload. So that’s the first part of it. The second part of it is, well, what kind of workload are we talking about? Our investment in the neural engine dates back years ago, right? The first time we had a neural engine on an Apple Silicon chip was A11 Bionic, right? ... So that was five-ish years ago on the iPhone. Really, it was the result of us realizing that there were these emergent machine learning models, where, that we wanted to start executing on device, and we brought this technology to the iPhone, and over the years we’ve been increasing its capabilities and its performance. Then, when we made the transition of the Mac to Apple Silicon, it got that IP just like it got the other IPs that we brought, things like the media engine, our CPU, GPU, Secure Enclave, and so on and so forth. So when you’re going to execute these machine learning models, performing these inference-driven models, if the operations that you’re executing are supported by the neural engine, if they fit nicely on that engine, it’s the most efficient way to execute them. The reality is, the entire chip is optimized for machine learning, right? ... So a lot of models you will see executed on the CPU, the GPU, and the neural engine, and we have frameworks in place that kind of make that possible. 
The goal is always to execute it in the highest performance, most efficient place possible on the chip.
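The dispatch behavior he describes, with frameworks routing a model to the most efficient engine that supports all of its operations, can be sketched roughly like this; the engine names, op names, and preference order are illustrative, not Apple's actual framework API:

```python
# Hypothetical sketch of efficiency-first engine selection: try the most
# efficient engine first, falling back to more general-purpose ones when
# the model contains ops the specialized engine doesn't support.
SUPPORTED_OPS = {
    "neural_engine": {"conv2d", "matmul", "relu"},
    "gpu": {"conv2d", "matmul", "relu", "custom_shader"},
    "cpu": {"conv2d", "matmul", "relu", "custom_shader", "anything_else"},
}
# Ordered from most to least efficient for ML inference (illustrative).
PREFERENCE = ["neural_engine", "gpu", "cpu"]

def pick_engine(model_ops: set) -> str:
    """Return the most efficient engine supporting every op in the model."""
    for engine in PREFERENCE:
        if model_ops <= SUPPORTED_OPS[engine]:
            return engine
    return "cpu"  # general-purpose fallback

print(pick_engine({"conv2d", "relu"}))           # fits the neural engine
print(pick_engine({"matmul", "custom_shader"}))  # needs the GPU
```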

    [Nanometer process question]

    … You’re referring to the transistor. These are the building blocks all of our chips are built out of. The simplest way to think of them is like a little switch, and we integrate tons of these things into our designs. So if you’re looking at M2 Pro and M2 Max, you’re talking about tens of billions of these, and if you think about large collections of them, that’s how we build the CPU, the GPU, the neural engine, all the media blocks, every part of the chip is built out of these transistors. Moving to a new transistor technology is one of the ways in which we deliver more features, more performance, more efficiency, better battery life. So you can imagine, if the transistors get smaller, you can cram more of them into a given area, that’s how you might add things like additional cores, which is the thing you get in M2 Pro and M2 Max—you get more CPU cores, more GPU cores, and so on and so forth. If the transistors themselves use less power, or they’re faster, that’s another method by which you might deliver, for instance, better battery life, better efficiency. Now, I mentioned this is one tool in the toolbox. What you choose to build with them, the underlying architecture, microarchitecture and design of the chip also contribute in terms of delivering that performance, those features, and that power efficiency. 
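The "little switch" picture can be made concrete: model a transistor as a switch and compose switches into a NAND gate, from which every other logic gate, and ultimately adders and cores, can be built. A minimal sketch:

```python
# A transistor modeled as a switch: two "switches" in series to ground
# pull the output low only when both inputs are on -- that's a NAND gate.
def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1

# NAND is universal: every other gate can be composed from it.
def not_(a): return nand(a, a)
def and_(a, b): return not_(nand(a, b))
def or_(a, b): return nand(not_(a), not_(b))

print([nand(a, b) for a in (0, 1) for b in (0, 1)])  # [1, 1, 1, 0]
```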

    If you look at the M2 Pro and M2 Max family, you’re looking at a second-generation 5 nanometer process. As we talked about earlier, the chip got more efficient. At every single operating point, the chip was able to deliver more performance at the same amount of power. 

    [Media engine question]

    … Going back to the point about transistors, taking that IP and integrating it on the latest kind of highly-integrated SOC and the latest transistor technology, that lets you run it at a very high speed and you get to extract a lot of performance out of it. The other thing is, and this is one of the things that is fairly unique about Apple Silicon, we built these highly-integrated SOCs, right? ... So if you think about the traditional system architecture, in a desktop or a notebook, you have a CPU from one vendor, a GPU from another vendor, each with their own sort of DRAM, you might have accelerators kind of built into each one of those chips, you might have add-in cards as additional accelerators. But with Apple Silicon in the Mac, it’s all a single chip, all backed by a unified memory system, you get a tremendous amount of memory bandwidth as well as DRAM capacity, which is unusual, right? ... In a machine like this a CPU is used to having large capacity, low bandwidth DRAM, and a GPU might have very low capacity, high bandwidth DRAM, but now the CPU gets access to GPU-like memory bandwidth, while the GPU gets access to CPU-like capacity, and that really enables things that you couldn’t have done before. Really, if you are trying to build a notebook, these are the types of chips that you want to build it out of. And the media engine comes along for the ride, right? ... The technology that we’ve refined over the years, building for iPhone and iPad, these are machines where the camera is a key part of that experience, and being able to bring some of that technology to the Mac was honestly pretty exciting. And it really enabled just a revolution in terms of the video editing and video workflows. 
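The unified-memory contrast he draws can be sketched as follows; the classes and the copy bookkeeping are a toy model for illustration, not how any real driver or memory system works:

```python
# Toy model: with separate CPU and GPU DRAM pools, data the GPU needs must
# first be copied across; with one unified pool, both engines reference
# the same buffer and no copy happens.
class DiscreteSystem:
    def __init__(self):
        self.cpu_ram, self.gpu_ram = {}, {}
        self.copies = 0
    def cpu_write(self, key, data):
        self.cpu_ram[key] = data
    def gpu_read(self, key):
        if key not in self.gpu_ram:          # must copy over the bus first
            self.gpu_ram[key] = self.cpu_ram[key]
            self.copies += 1
        return self.gpu_ram[key]

class UnifiedSystem:
    def __init__(self):
        self.ram = {}
        self.copies = 0                      # one pool: nothing to copy
    def cpu_write(self, key, data):
        self.ram[key] = data
    def gpu_read(self, key):
        return self.ram[key]

d, u = DiscreteSystem(), UnifiedSystem()
d.cpu_write("frame", b"pixels"); u.cpu_write("frame", b"pixels")
d.gpu_read("frame"); u.gpu_read("frame")
print(d.copies, u.copies)  # 1 0
```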

    The addition of ProRes as a hardware accelerated encode and decode engine as a part of the media engine, that’s one of the things you can almost trace back directly to working with the Pro Workflows team, right? ... This is a codec that it makes sense to accelerate to integrate into hardware for our customers that we're expecting to buy these machines. It was something that the team was able to integrate, and for those workflows, there’s nothing like it in the industry, on the market. 

    NOTE: I did this transcript. It may contain some mistakes. I also cut out some interjections, like "I think" and "kind of." But I tried to capture his talking style, especially the use of "..., right?" 
edited February 2023