Compared: M1 vs M1 Pro and M1 Max
The new MacBook Pro uses an evolution of the year-old Apple Silicon, bringing both more computing power and more graphical ability to the new models. Here's how the M1 Pro and M1 Max compare against the original M1, and how they affect the Mac lineup.
Apple's introduction of new Mac models using M1 chips, including the 13-inch MacBook Pro, MacBook Air, and Mac mini, heralded a sea change for the company, as it transitioned away from Intel processors. The launch, which started a two-year schedule for Apple to shift its entire Mac product line over to Apple Silicon, was a resounding success, with Apple's new chip faring extremely well against its competition.
In fall 2021, Apple tripled the number of Apple Silicon chip options it had, introducing the M1 Pro and M1 Max as the mid-range and premium options for its MacBook computing lineup.
While not a full second-generation release as you would see from an iPhone update from the A14 to the A15, for example, the upgrade could be likened more to previous A-series upgrades specifically for the iPad Pro lineup, such as the A12Z Bionic. Apple took what it learned from the M1 and expanded the concepts to create improved variants.
The M1 Pro and M1 Max offer considerable improvements over the original M1, which will most likely make Macs using the chips more attractive to the professional and creative market.
This is what's changed in the line between the original M1 release and Apple's new wave of chips.
Specifications
Specification | M1 (2020) | M1 Pro (2021) | M1 Max (2021) |
---|---|---|---|
CPU Cores (Total) | 8 | 8 or 10 | 10 |
CPU Performance Cores | 4 | 6 or 8 | 8 |
CPU Efficiency Cores | 4 | 2 | 2 |
GPU Cores | 7 or 8 | 14 or 16 | 24 or 32 |
Neural Engine Cores | 16 | 16 | 16 |
Transistors | 16 billion | 33.7 billion | 57 billion |
Foundry process | 5nm | 5nm | 5nm |
Unified Memory Capacities | 8GB, 16GB | 16GB, 32GB | 32GB, 64GB |
Memory Bandwidth | - | 200GB/s | 400GB/s |
Media Engine | - | Video decode engine, video encode engine, ProRes encode and decode engine | Video decode engine, 2 video encode engines, 2 ProRes encode and decode engines |
M1 vs M1 Pro vs M1 Max - Construction
Apple's systems-on-chip (SoCs) are produced by long-time foundry partner TSMC, the same source as Apple's A-series chips.
They're all produced on a 5-nanometer process, which provides several benefits. One is a physically smaller chip die, which means more chips can be cut from each wafer, reducing the cost per chip.
The size also enables lower power consumption compared to other processes, such as the 10-nanometer level that Intel employs in its 2021 chips, as well as potentially higher overall performance.
The benefits of a smaller die work both ways, as it can also enable a more complex chip to be produced in a specific footprint.
The M1 Max is certainly the biggest chip in the M1 family.
When Apple introduced the M1, it included 16 billion transistors, just above the 15 billion used in the A15 Bionic chip. For the new chips, Apple decided to take advantage of the size benefits and think bigger.
The M1 Pro is a chip with 33.7 billion transistors, more than double the original's count. The M1 Max has 57 billion transistors, roughly 70% more than the M1 Pro and about 3.6 times the count of the original.
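The ratios between these transistor counts can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the counts quoted in the article; the dictionary and function names are illustrative assumptions.

```python
# Transistor counts in billions, as quoted in the article.
TRANSISTORS_BILLIONS = {"M1": 16.0, "M1 Pro": 33.7, "M1 Max": 57.0}


def ratio(chip: str, baseline: str) -> float:
    """Return how many times the baseline's transistor count the chip has."""
    return TRANSISTORS_BILLIONS[chip] / TRANSISTORS_BILLIONS[baseline]


pro_vs_m1 = ratio("M1 Pro", "M1")
max_vs_pro = ratio("M1 Max", "M1 Pro")
max_vs_m1 = ratio("M1 Max", "M1")

print(f"M1 Pro vs M1:     {pro_vs_m1:.2f}x")
print(f"M1 Max vs M1 Pro: {max_vs_pro:.2f}x")
print(f"M1 Max vs M1:     {max_vs_m1:.2f}x")
```

Run as written, this works out to a little over 2.1 times for the M1 Pro against the M1, just under 1.7 times for the M1 Max against the M1 Pro, and a little under 3.6 times for the M1 Max against the M1.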
The number of transistors that a chip uses doesn't directly translate into performance, as a transistor can be used in varying ways in a chip design. For example, we know that the CPU elements of the M1 Pro and M1 Max are pretty similar, but they differ significantly in GPU core counts, among other alterations.
M1 vs M1 Pro vs M1 Max - CPU, Neural Engine, and Cores
The M1 was launched at the value end of Apple's Mac spectrum, in devices that aren't considered powerhouses, namely the MacBook Air, the 13-inch MacBook Pro, and the Mac mini. As it was framed as being more an entry-level chip, Apple erred towards efficiency and battery life, as well as keeping temperatures manageable for the fan-less MacBook Air.
As such, the M1 uses a total of 8 CPU cores, consisting of four "Firestorm" high-performance cores and four "Icestorm" energy-efficient cores. This enables the chip to switch between the low power consumption cores for menial tasks and the higher performance cores for more intensive workloads.
In the M1 Pro, Apple has two CPU configurations on offer, covering eight cores and ten cores.
The eight-core option includes two high-efficiency "Icestorm" cores along with six high-performance "Firestorm" cores. The ten-core version keeps the two efficiency cores but adds two extra high-performance cores to the existing six, making the total two "Icestorm" and eight "Firestorm" cores.
The M1 Max is longer, and has a higher memory bandwidth than the M1 Pro.
The M1 Max is only available with the ten-core configuration that the M1 Pro offers, of two efficiency and eight performance cores.
So far, benchmarks suggest that the performance of the individual cores is the same, though ultimate performance scores for individual devices will vary depending on other factors, like memory bandwidth and GPU cores, among other things.
However, the core counts do matter, as benchmarks have pointed to the increased number of available cores as being quite beneficial. In Geekbench tests performed by AppleInsider, all three chips hover around 1,760 in the single-core benchmark.
The M1 managed 1,752 points, the M1 Pro scored 1,760, and the M1 Max hit 1,769. This makes sense since it's the same high-performance core in the CPU being tested across the board.
In multi-core tests, things are different, as AppleInsider's benchmarking scored the M1 at 7,723. Meanwhile, the 10-core M1 Pro and M1 Max being tested managed 12,437 and 12,308 respectively, which is unsurprising given the makeup of the chips.
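To put the Geekbench figures above in proportion, the multi-core gain over the M1 and the ratio of multi-core to single-core score can be computed directly. This is a rough sketch in Python using only the scores quoted in the text; the helper names are assumptions made for illustration, and the multi-to-single ratio is only a crude indicator of how well each chip scales across its cores.

```python
# Geekbench scores quoted in the article (AppleInsider's testing).
GEEKBENCH = {
    "M1": {"single": 1752, "multi": 7723},
    "M1 Pro": {"single": 1760, "multi": 12437},
    "M1 Max": {"single": 1769, "multi": 12308},
}


def multi_core_gain(chip: str, baseline: str = "M1") -> float:
    """Multi-core score of `chip` relative to `baseline`."""
    return GEEKBENCH[chip]["multi"] / GEEKBENCH[baseline]["multi"]


def multi_to_single(chip: str) -> float:
    """Multi-core score divided by single-core score for one chip."""
    return GEEKBENCH[chip]["multi"] / GEEKBENCH[chip]["single"]


for chip in GEEKBENCH:
    print(
        f"{chip:<7} multi-core vs M1: {multi_core_gain(chip):.2f}x, "
        f"multi/single: {multi_to_single(chip):.2f}"
    )
```

On these numbers, the 10-core chips score roughly 1.6 times the M1 in multi-core work, while the single-core scores differ by about one percent.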
All three of the chips are equipped with the 16-core Neural Engine, Apple's machine learning-based chip element, to assist with tasks such as computational photography. ML accelerators in the CPU allow the Neural Engine to assist with computations that aren't necessarily best for CPUs.
M1 vs M1 Pro vs M1 Max - Unified Memory and Fabric
The introduction of the M1 also brought with it a new paradigm for memory. Apple's Unified Memory consisted of memory mounted to the SoC, minimizing the distance between memory and chip.
More importantly, Unified Memory used the principle of allowing all SoC components access to the same data store, rather than separating it into different CPU and GPU pools, for example. The idea is to help prevent the unneeded duplication of data in memory to service different SoC elements.
Sitting between the CPU, GPU, Neural Engine, the memory, and other components is Fabric, Apple's term for its connections between all of the components, enabling Unified Memory to work in the first place.
Since the integrated GPU uses the same memory as the CPU due to this non-duplication approach, any increases to the Unified Memory pool affect all components equally. Add more memory, and it's both the CPU and the GPU that will benefit.
The M1 Pro has more transistors and GPU cores than the M1.
For the M1, Apple included 8GB and 16GB memory options. With the M1 Pro, Apple started at 16GB and raised the maximum to 32GB, while the M1 Max includes 32GB and 64GB options.
It is unclear what precisely the theoretical memory limit could be for the new chips, but for the moment, that maximum is 64GB.
Along with memory capacities, Apple also upgraded Fabric in the newest chips to increase the bandwidth, effectively speeding up the accessing of memory by SoC components. The memory interface was increased to 200GB/s in the M1 Pro and 400GB/s in the M1 Max.
Apple didn't officially announce the memory bandwidth of the M1, but says the M1 Max offers "nearly 6x the memory bandwidth of M1." This, in theory, puts the M1 at a little over 66GB/s of peak memory bandwidth, since 400GB/s divided by six is about 66.7GB/s.
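The estimate for the M1 follows from simple division. Here is a minimal Python sketch of that reasoning, assuming the "nearly 6x" claim is treated as a ratio of at most six; the constant and function names are hypothetical and chosen only for this example.

```python
# Figures from the article: Apple's stated bandwidths and its "nearly 6x" claim.
M1_PRO_GBPS = 200.0
M1_MAX_GBPS = 400.0
CLAIMED_MULTIPLE = 6.0  # "nearly 6x", so the real ratio is slightly below this


def implied_m1_bandwidth(max_gbps: float = M1_MAX_GBPS,
                         multiple: float = CLAIMED_MULTIPLE) -> float:
    """Lower-bound estimate of M1 bandwidth implied by Apple's claim."""
    return max_gbps / multiple


m1_floor = implied_m1_bandwidth()
print(f"Implied M1 bandwidth: at least {m1_floor:.1f}GB/s")
print(f"M1 Pro relative to that estimate: about {M1_PRO_GBPS / m1_floor:.1f}x")
```

Because the real multiple is slightly under six, the M1's actual figure would sit slightly above the value computed here, and the M1 Pro would land at roughly three times the M1.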
In short, the newer chips offer more memory capacities and allow the CPU and other elements to access that memory faster, which can help improve performance.
M1 vs M1 Pro vs M1 Max - Graphics
Another relatively tangible performance area for chip comparisons is graphical performance. Apple does not rely on separate discrete GPUs for its new MacBook Pro models, and instead relies on high-core-count GPUs integrated in the SoC.
For the first version, Apple included either a 7-core or 8-core GPU in the M1. Only the MacBook Air had the option of the lesser-count SoC, with all other M1-equipped models using the 8-core GPU.
On the M1 Pro and M1 Max, Apple decided to spend the previously mentioned extended transistor budget on the GPU. The M1 Pro has 14-core and 16-core GPU options, while the M1 Max has 24-core and 32-core versions.
Compared to the M1, Apple says its 16-core M1 Pro is twice as fast. The M1 Max is said to be twice as fast again as the M1 Pro, making it four times faster than the M1 in its 32-core configuration.
Benchmarks that have surfaced seem to agree that the extra cores and the improvements to Fabric improve the graphical performance, which makes sense. With more memory, higher memory bandwidth, and more GPU cores, of course performance will go up compared to the M1.
In AppleInsider's testing, an M1 in a 13-inch MacBook Pro scored 21,425. The M1 Pro with a 16-core GPU managed 40,991, roughly double the M1's score. The M1 Max with a 32-core GPU scored 68,950, which is a 68% improvement over the M1 Pro score, and 3.2 times that of the M1.
Looking at Affinity Photo's benchmark results for single GPU Raster, the M1 reached 8,555, while the M1 Pro doubled the figure to 16,839. Again, the M1 Max topped the pile with 32,028.
This last test sure seems to confirm Apple's graphical boasting, and underlines what its new chips can offer.
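To see how closely these scores track Apple's claims of two and four times the M1's graphics performance, the ratios can be worked out directly. This is a rough Python sketch using the scores quoted above; the labels and helper names are assumptions made for illustration, and the first test is given a placeholder label because the text does not name the benchmark.

```python
# GPU scores quoted in the article. The first test is not named in the text,
# so its label here is only a placeholder.
GPU_SCORES = {
    "AppleInsider GPU test": {"M1": 21425, "M1 Pro": 40991, "M1 Max": 68950},
    "Affinity Photo single GPU Raster": {"M1": 8555, "M1 Pro": 16839, "M1 Max": 32028},
}

# Apple's stated speedups over the M1 for the 16-core Pro and 32-core Max.
APPLE_CLAIMS = {"M1 Pro": 2.0, "M1 Max": 4.0}


def relative_to_m1(test: str, chip: str) -> float:
    """Score of `chip` divided by the M1's score in the same test."""
    scores = GPU_SCORES[test]
    return scores[chip] / scores["M1"]


for test in GPU_SCORES:
    for chip, claim in APPLE_CLAIMS.items():
        measured = relative_to_m1(test, chip)
        print(f"{test}: {chip} is {measured:.2f}x the M1 (Apple's claim: {claim:.0f}x)")
```

On these figures, the first test puts the M1 Pro at about 1.9 times and the M1 Max at about 3.2 times the M1, while the Affinity Photo result comes closer to Apple's numbers at roughly 2.0 and 3.7 times.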
However, more CPU and GPU cores aren't the whole story for the new chips.
M1 vs M1 Pro vs M1 Max - Media Engine
With a mind to serving its video-centric customers, the M1 Pro and M1 Max introduce a new element to the SoC that the M1 lacks entirely.
The Media Engine is a section devoted to video processing that preserves battery life. In effect, it's a pile of hardware-accelerated encoding and decoding engines that can handle video more efficiently than the rest of the chip can.
With its video decode and encode engines, the Media Engine can handle H.264, HEVC, ProRes, and ProRes RAW content. It also has dedicated ProRes encode and decode engines to handle footage used in professional video productions.
AppleInsider benchmarked the M1, M1 Pro, and M1 Max in various ways, including Final Cut Pro exports.
The M1 Max goes one step further for its Media Engine, as it includes two video encode engines, rather than just one in the M1 Pro. It also has double the number of ProRes encode and decode engines.
In an AppleInsider Final Cut Pro export test, involving a 1.12GB 4K video file of a YouTube video, the M1 completed the task in 3 minutes and 53 seconds.
The M1 Pro took 3 minutes and 35 seconds for the same export, but the M1 Max achieved the same result in 2 minutes and 4 seconds.
Using a 13.5-gigabyte 4K video recorded in ProRes on an iPhone 13 Pro and rendered in ProRes 422, the M1 Max excelled, taking just 51 seconds to complete the task. The M1 Pro took 1 minute 30 seconds to perform the same task, and the M1 trailed behind at 1 minute 40 seconds.
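The export times are easier to compare once converted to seconds and expressed as a speedup over the M1. The following is a small Python sketch that does so with the timings quoted above; the test labels and function names are illustrative assumptions.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes-and-seconds timing into total seconds."""
    return minutes * 60 + seconds


# Final Cut Pro export times quoted in the article.
EXPORT_TIMES = {
    "1.12GB 4K YouTube video": {
        "M1": to_seconds(3, 53),
        "M1 Pro": to_seconds(3, 35),
        "M1 Max": to_seconds(2, 4),
    },
    "13.5GB 4K ProRes video": {
        "M1": to_seconds(1, 40),
        "M1 Pro": to_seconds(1, 30),
        "M1 Max": to_seconds(0, 51),
    },
}


def speedup(test: str, chip: str, baseline: str = "M1") -> float:
    """How many times faster `chip` finished than `baseline` (time ratio)."""
    times = EXPORT_TIMES[test]
    return times[baseline] / times[chip]


for test, times in EXPORT_TIMES.items():
    print(test)
    for chip, secs in times.items():
        print(f"  {chip:<7} {secs:>4d}s  {speedup(test, chip):.2f}x vs M1")
```

In both tests the M1 Pro finishes only around 10% faster than the M1, while the M1 Max, with its doubled encode engines, is close to twice as fast.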
With film and TV productions relying on premium hardware due to time being a crucial factor in editing, it makes sense for Apple to include these engines in the M1 Pro and M1 Max.
M1 vs M1 Pro vs M1 Max - Thunderbolt and Video Out
Two of the less critical changes to the chips relate to connectivity to other devices.
The M1 could support Thunderbolt 4, but the initial devices offered relatively few Thunderbolt ports to use the standard. Apple rectified the limitation for the M1 Pro and M1 Max by adding more integrated Thunderbolt 4 controllers to add more I/O bandwidth.
In brief, this allows you to connect more devices to your Mac, as there's more bandwidth available.
As a byproduct of the I/O bandwidth increase, Apple also made it possible for the new chips to deal with video output much better than before.
You can attach more external monitors to the M1 Pro and M1 Max than the limited M1.
The M1 could only handle two high-resolution displays, which meant the M1 Mac mini could output to two screens, one at 6K and one at 4K. Meanwhile, the M1 13-inch MacBook Pro could only deal with one external 6K display along with its built-in screen.
The M1 Pro improves external video support, allowing a MacBook Pro to drive up to two 6K displays. The M1 Max goes one step further, handling up to three 6K screens along with a fourth 4K monitor.
M1 vs M1 Pro vs M1 Max - Professional Powerhouses
It's pretty apparent that it's not simple to explain the differences between Apple's chip lineup.
In previous years, when Intel chips would vary in core counts, clock speeds, and generations, it was relatively simple to highlight what had changed. There are so many moving parts to an Apple Silicon chip that you can't just say one is faster than the other for just one specific reason.
Why are M1 Pro and M1 Max faster than the M1? A few more cores, a lot more GPU cores, more base memory, and faster memory as well. Then there's the Media Engine, though that will only really apply if you're doing video work.
Looking at the details doesn't simplify the differences. However, you can still say with quite a bit of certainty that the M1 Max is better than the M1 Pro, which is better than the M1.
Comments
Could make for some crazy powerful handheld devices and introduce new use cases for the Airlines, Military, etc.
What’s clear is that no M1s get their power efficient performance from raw compute so these benchmarks are conservative to the point of being misleading.
Better to go with application-based workloads and provide a spread of scenarios (though preferably not from an x86 systems vendor who pulled the AI test components when the M1 crapped all over its own products & seems to have confused ‘work done’ with screen refresh rate)
I would disagree with the statement “Apple erred towards efficiency and battery life, as well as keeping temperatures manageable for the fan-less MacBook Air,” though. I don’t see this as ‘erring.’ The M1 was their first laptop SOC and was designed for the MacBook Air. System and chip design always involve some compromises but laptop design especially does. The MBA is Apple’s lower level laptop where battery life matters more than performance. As such, the M1 is the perfect chip for the device.
Now I just wish Apple would make a 16” MBA. I don’t necessarily need the performance of the M1Pro but I do need a bigger screen to work on.
With dedicated media acceleration they don’t have a need for several low-power cores while playing back video, or even encoding it: most common desktop applications rarely make use of more than 2 separate threads at any given time, with one being for front end GUI interaction with the user, and a background thread for waiting on I/O, which, when the hardware is implemented correctly, is mostly waiting without much processor use anyway. Thus, it’s fully possible in most use-cases you can do all you need with just the 2 energy-efficient cores: Mail or Pages will be far more than fast enough with the 2 small cores. Web browsing, similar for I/O, but other things will cause greater power core usage, and a lot more threads.
The other reason you would use efficiency cores is for regularly scheduled background system tasks that don’t need much CPU power. Again, even 2 threads is more than a lot of those will use. I personally thought it was more an odd choice for the M1 to start out with 4 energy-efficient cores because of these reality-based considerations. Perhaps that was more a question of getting some real-world results achieved with lower risk, and getting real-world data of core usage in practice as to why they went with those decisions.
Me? If I’m typing up a storm using Pages while listening to my music and it doesn’t cause a slow-down/glitch in any of it while typing 90+ WPM with it doing spellchecking as I type, if it’s only using 2 efficient cores, that’s less wasted energy I can use later on other things.
I strongly suspect based on my experience as a developer and doing developer support that 4 energy-efficient cores is the most Apple will ever have on a system, because of what I explained above. There are only a few types of applications that make effective use of more than 2 cores, and other than properly-written games and media encoding, most of them are in the server realm. Making full use of a large number of cores isn’t nearly as easy or feasible as most would like to think in practice because too much sequential dependency is involved: this is why, unless you’re running a server, an M1 Max is far more useful with a smaller number of faster cores, than an AMD Threadripper with 64: all those cores are going to be slower, and most of the time, not used. Most users would be shocked to learn how little their M1 of any variety are fully utilized.
I just wish there was a way to un-disable a core that was actually working perfectly. Kind of like deneuralizing a "Men in Black."
When you double the accessed number of functions, you add another bit of address/selection lines, and even if the speed of what was accessed (once it was selected) is identical and remains in the selected state, that initial access due to the layers of combinatorial logic still takes more time because it requires more linear distance for signals to travel, and that means more resistance and capacitance of the signal lines, beyond the speed of each transistor used.
Look at the sizes of caches for CPUs: the smaller the caches are, the lower the latency, because of this as a factor. Also, as you increase the size of the caches, the distance grows (and thus time) for access, and generally it’s preferable to have constant-time/cycles for cache access (especially L1 instruction and data cache) to simplify and speed logic. Once a cache becomes too large, not even counting the heat/power issue, your system performance goes down, even if you can afford to make that large of a chip.
Now, look at the size of the M1 Pro versus the M1 Max: the Max is over twice as large for area. All of the GPU/CPU cache and those cores must be kept coherent, and that has a cost in space, time and energy, and waste heat. Bigger isn’t always better for performance, as alluded to above, and bigger systems tend to be slower due to sheer size: at the speeds of electrons in circuits (not as fast as light, but a good percentage) the differences in the sizes of the M1, M1 Pro and the M1 Max really add up. For the same reasons, it’s a HUGE design win to have the RAM on the same package, because the circuit traces are kept minimal for distance, which greatly affects power usage as well, making the system so darn power-efficient while also being fast.
Given all these constraints from physics, I’m very curious how they’re going to design and manufacture the SoCs used for the Pro desktop/workstations with more CPU and GPU cores with unified memory. Designing and building them as they are has practical limits for costs as well as scaling. I expect the next step is using chiplets on the package with an underlying total system bus on the main chip that is mostly system cache and I/O, with the CPUs and GPUs with their more local caches on chiplets with a number of cores each.