M5 Pro may separate out GPU and CPU for new server-grade performance

AppleInsider · December 23, 2024 3:57PM

Analyst Ming-Chi Kuo says that Apple will move away from its current processor designs that keep the CPU and GPU cores on the same chip -- and see a performance gain.

Close-up of a red three-dimensional sign with stylized letters, partially obstructed view, against a grid background.

TSMC has just announced an all-new chip production process called "A16"

One of the reasons for Apple Silicon's speed over the previous Intel processors has been that each M-series chip has been a single unit. This System-on-a-Chip (SoC) idea cuts bottlenecks by having all the processor's elements together on one chip package.

According to Kuo, however, Apple is going to change this for the M5 Pro, M5 Max, and M5 Ultra. Only the M5 will remain as a single unit.

Instead, the M5 Pro and other chips will use manufacturer TSMC's latest chip packaging process. Called the System-in-Integrated-Chips-Molding-Horizontal (SoIC-mH), it puts together different chips into one package.

Apple M5 series chip

1. The M5 series chips will adopt TSMC's advanced N3P node, which entered the prototype phase a few months ago. M5, M5 Pro/Max, and M5 Ultra mass production is expected in 1H25, 2H25, and 2026, respectively.
2. The M5 Pro, Max, and Ultra will utilize https://t.co/XIWHx5B2Cy
-- (Ming-Chi Kuo) (@mingchikuo)
December 23, 2024

The advantage, according to Kuo, is that this will produce "server-grade" packaging. Apple "will use 2.5D packaging" that has "separate CPU and GPU designs," and which will "improve production yields and thermal performance."

Kuo says that mass production is expected in 2H25 for the M5 Pro and the M5 Max, and then 2026 for the M5 Ultra. The M5 has reportedly been in the prototyping phrase for a few months, and mass production is believed to be planned for 1H25.

That M5 processor will be produced by TSMC using its N3P technology, which is expected to be seen first in the iPhone 18 range.

Kuo also claims that the M5 Pro processors will be used in Apple Intelligence servers. Specifically, it will be utilized for the company's Private Cloud Compute technology.

Rumor Score: Possible

Read on AppleInsider

sunman42 · December 23, 2024 4:35PM

If Apple wants to optimize Apple Intelligence servers it builds with its own chips, I can see the attraction to a roadmap that includes both "pure" CPU chips with the new packaging and chipsets that are tuned for maximum GPU performance on AI tasks. But for laptops/desktops.... that new packaging would have to be really nice to make such major design changes.

apple4thewin · December 23, 2024 5:15PM

Everyone currently trying to come out with a chip using SoC by the end of 2025 or 2026 but apple is already one step ahead. Although would this still share memory or will it go back to the dedicated ram for CPU and other for GPU?

programmer · December 23, 2024 5:38PM

apple4thewin said:

Everyone currently trying to come out with a chip using SoC by the end of 2025 or 2026 but apple is already one step ahead. Although would this still share memory or will it go back to the dedicated ram for CPU and other for GPU?

This would almost certainly still be a unified memory architecture. The chiplets will be interconnected via some kind of high speed in-package network or bus, much like current chips use an on-die interconnect. This gives manufacturing flexibility and improve yields. AMD has been aggressively using such techniques for years, and it has really just been a matter of time until Apple jumped on it as well.

netrox · December 23, 2024 6:30PM

Does that mean there will be no M4 Ultra, just M5 Pro/Max and Ultra later on? The rumors int he past suggest M4 Ultra will be processed on NPX process (designed for high end computing) and then after that, they start the M5 for base Macs using 2nm.

danox · December 23, 2024 6:56PM

If this new chip can be used in all the current devices that Apple makes laptops, Mac Studios without delay no problem, if it can’t or if there is a delay for example, the Mac Studio M4 Ultra isn’t coming in 2025 that is a problem. I am glad however the writing is on the wall, Apple needs to roll up it sleeves and make a in house server using Apple Silicon which has too many good characteristics to leave on the sideline.

Oh, and separating the memory from CPU/GPU appears to be a step backwards to UMA, whatever AMD is pursuing right now it isn’t better than Apple Silicon when it comes to wattage used efficiency and performance, yes they (AMD) are closer than Intel but Nvidia is just completely out of the ballpark when it comes to energy wasted, yes it’s faster for certain tasks now but 1000-1500 watts liquid cooled systems are an eventual dead end which Apple used at one time when they were at the heights of their G5 hell.

https://www.digitaltrends.com/computing/why-nvidia-rtx-4090s-are-melting/ butterfly keyboards, antenna gate, bend gate, power button location, or upside down mice are nothing to this….. I don’t think Apple will go away from power efficiency.

But the biggest bonus might be the software Apple will need to write to support networking multiple Mac servers together some of that software has got to filter down to many very smart end users (developers).

edited December 2024

entropys · December 23, 2024 7:28PM

Not sure it makes sense to have a unified M5 SOC with some GPU cores then M5 Pro SOC etc. without GPU cores. Perhaps they are all identical GPU wise (same GPU cores as part of the SOC across the board) with the pros also having the add on GPU chip for more cores and RAM.

rob53 · December 23, 2024 7:37PM

The title talks about server-grade performance but if I remember correctly servers only speed up certain processes. Desktop systems are capable of doing a bunch of things instead of server-type number crunching. At this point, I care less about AI than doing typical desktop things. Sure, having an Apple system that's server-grade for doing server-type processes (yes, I mean a consumer, local iCloud server) would be great. Splitting the CPU from the GPU might be nice if, as others have said, they continue to have a very high speed internal bus. On the other hand, once Apple splits the CPU from the GPU, then will they split the memory and storage as well? This sounds like we're going back to a component based system instead of a SoC. On the other hand, GPUs appear to be the faster computing device. For the mythical recreation of the Apple Server, an adequate CPU connected to a huge GPU farm with access to easily replaceable storage would be a great idea. The big question for Mac users is which applications make use of CPUs and which make use of GPUs? I know most applications make use of both but if the GPU can handle higher levels of computing (the top of the TOP500 supercomputers use tons of GPUs) but would it be beneficial if more macOS applications adjusted their applications to predominantly use GPUs instead of CPUs? PCIe 7.0 is supposed to be able to handle 128Gt/s (giga-transfers/sec). I believe (probably wrong) the M4 might use a PCIe 5 internal bus. Anyway, none of this really matter to 95% of Apple consumers, they just want everything to work and keep working for as long as possible.

melgross · December 23, 2024 7:38PM

I’ve been saying for some time that Apple could do this so that they could use larger GPU assemblies, but was struck down every time.

mikethemartian · December 23, 2024 9:10PM

Isn’t this the same thing the group that left Apple to start Nuvia wanted to do at Apple?

9secondkox2 · December 23, 2024 11:07PM

Doesn't make sense. It's already server-grade packaging. Perhaps better than other server-grade packaging.

I can think of 3 reasons this would make sense:

1. If Apple is combining their silicon with third party chips from Nvidia or semthing.

2. Alternatively, another way it could make sense is if Apple is looking to add more GPU cores to various iterations of its chips, without increasign CPU core counts. i.e. having multiple sets of max/Ultra chips - one set for laptops, one for Mac Studio, and another for Mac Pro.

or...

3. Appel could be redoing the way it tiers its chip lineup. CPU could be the same, but GPU would be different for each tier.

Interesting to see how this develops.

chelgrian · December 24, 2024 12:32AM

9secondkox2 said:

Doesn't make sense. It's already server-grade packaging. Perhaps better than other server-grade packaging.

I can think of 3 reasons this would make sense:

1. If Apple is combining their silicon with third party chips from Nvidia or semthing.

2. Alternatively, another way it could make sense is if Apple is looking to add more GPU cores to various iterations of its chips, without increasign CPU core counts. i.e. having multiple sets of max/Ultra chips - one set for laptops, one for Mac Studio, and another for Mac Pro.

or...

3. Appel could be redoing the way it tiers its chip lineup. CPU could be the same, but GPU would be different for each tier.

Interesting to see how this develops.

It’s much simpler than that, the top end chips Apple are currently producing are already at the reticle limit that is they are as large as you can go and still be a single chip.

To add more resources the only thing you can do is manufacture multiple chiplets each themselves at the reticle limit and package them together.

tht · December 24, 2024 2:03AM

9secondkox2 said:

Doesn't make sense. It's already server-grade packaging. Perhaps better than other server-grade packaging.

I can think of 3 reasons this would make sense:

1. If Apple is combining their silicon with third party chips from Nvidia or semthing.

2. Alternatively, another way it could make sense is if Apple is looking to add more GPU cores to various iterations of its chips, without increasign CPU core counts. i.e. having multiple sets of max/Ultra chips - one set for laptops, one for Mac Studio, and another for Mac Pro.

or...

3. Appel could be redoing the way it tiers its chip lineup. CPU could be the same, but GPU would be different for each tier.

Interesting to see how this develops.

As I understand it, the maximum sized chips that can be made with these more and more advanced nodes is decreasing.

Eg, TSMC can’t make a 400+ mm2 chip like the M4 Max at 2nm. It’s physical limit of the lithography machines. Moreover, costs per transistor won’t be scaling down as fast as transistors per mm2 is scaling up. SRAM in particular has already started seeing limited scaling at 5 nm. So challenges ahead.

As such, all chip designs are headed to stacking, silicon bridging or both. This particular tech looks to be a thru-silicon-vias (TSV), where the signaling between stacked chips is done through boring through the stack of chips and putting a wire down through it.

So basically, like HBM but with logic chips or mix and matching? There’s going to be chip scale heat transfer plates sandwiched between layers eventually.

canukstorm · December 24, 2024 3:56AM

mikethemartian said:

Isn’t this the same thing the group that left Apple to start Nuvia wanted to do at Apple?

Yes, and that's the irony isn't it. That group that left, wanted to do an ARM server processor while at Apple but supposedly, at that time, Apple decided not to pursue it. And now with the big AI initiative, Apple needs to do its own server processor.

toortog · December 24, 2024 6:03AM

Hope they also make a OSX Server version that is "lean and mean" for creating application servers.

michelb76 · December 24, 2024 8:41AM

canukstorm said:

mikethemartian said:

Isn’t this the same thing the group that left Apple to start Nuvia wanted to do at Apple?

Yes, and that's the irony isn't it. That group that left, wanted to do an ARM server processor while at Apple but supposedly, at that time, Apple decided not to pursue it. And now with the big AI initiative, Apple needs to do its own server processor.

Pretty sure these server processors have been in the cards for several years, as you don't just start doing this without having a solid base chip. Pretty sure the Nuvia guys just wanted it to happen faster, while Apple has to appease shareholders and milk the Mx iterations first. Also, unlikely Apple will sell these chips for use in datacenters, they will probably just exclusively use them themselves. No datacenter has plans to run on MLX.

michelb76 · December 24, 2024 8:42AM

danox said:

Oh, and separating the memory from CPU/GPU appears to be a step backwards to UMA, whatever AMD is pursuing right now it isn’t better than Apple Silicon when it comes to wattage used efficiency and performance, yes they (AMD) are closer than Intel but Nvidia is just completely out of the ballpark when it comes to energy wasted, yes it’s faster for certain tasks now but 1000-1500 watts liquid cooled systems are an eventual dead end which Apple used at one time when they were at the heights of their G5 hell.

https://www.digitaltrends.com/computing/why-nvidia-rtx-4090s-are-melting/ butterfly keyboards, antenna gate, bend gate, power button location, or upside down mice are nothing to this….. I don’t think Apple will go away from power efficiency.

But the biggest bonus might be the software Apple will need to write to support networking multiple Mac servers together some of that software has got to filter down to many very smart end users (developers).

Apple will have to go away from power efficiency if they want higher performance, unless we change how physics work. And connecting multiple macs together has been there for a while with MLX, and it works fantastic.

blastdoor · December 24, 2024 1:23PM

melgross said:

I’ve been saying for some time that Apple could do this so that they could use larger GPU assemblies, but was struck down every time.

I’ve thought the same thing. It also creates the potential for pro users to have more options in the ratio of cpu to gpu cores. When I look at an ultra, I’d rather see more cpu cores and fewer gpu cores. Others likely think the opposite. It would be great if more needs could be met.

icreate · December 24, 2024 3:19PM

I always thought they would have an SOC for consumer and a more traditional CPU for server/workstation. With all the AI hype, I wouldn't be surprised if they've created a dedicated CPU, and maybe standalone add in cards with some GPU/NPU combo for those that need it. Or at the very least just for internal private cloud compute workloads.

danox · December 25, 2024 5:36PM

The Mac Studio M2 Ultra is 3.3 times off the pace of a Nvidia 4080-4090 card equipped system, note. The Mac Studio has higher bandwidth and 192 gigs of memory it can use and needs only 107 watts of power run everything in the system, the Mac Studio M4 Ultra (if Apple releases it?) which is more powerful and is even more efficient and being two generations down the road is probably somewhere in the ballpark of that mythical Nvidia GPU graphics card which needs 353 W of power by itself and Nvidia recommends that you have a power supply of 750 to 1000 Watts just to boot up your computer just to use it.

I hope these rumors don’t mean Apple is not going to the release the M4 Studio Ultra in 2025 if so, that means they won’t be releasing any Studio Ultra until the middle to end of 2026 which would lead to further collapse in Mac desktop sales at the wrong time when many people were starting to look at big memory high bandwidth Macs to run AI models locally.

https://www.hardware-corner.net/guides/mac-for-large-language-models/

https://odysseyapp.io/blog/five-reasons-to-run-ai-models-locally

https://forums.macrumors.com/threads/so-happy-with-the-m4-pro-i-can-finally-use-ai-stuff-locally.2442964/

programmer · December 25, 2024 5:52PM

Multi-chiplet gives Apple options. The ultras so far have been dual chips of the same type, but that doesn’t have to be the case. And they might be able to do edge connectors on more than one edge. More expensive and higher power though, so mostly for desktops and servers. Imagine one chip with just CPUs, and one with just GPUs… and connectors on two edges instead of one. That would create a lot of permutations — all CPU, 3 GPU + 1CPU, 2+2. Put a core or two on the GPU chiplet and then you could have “GPU only” combination. Lots of options, which makes for some interesting possibilities for the pro desktop/server lineup.

None of which means that consumer machines wouldn’t still be single chip SoCs.

M5 Pro may separate out GPU and CPU for new server-grade performance

Comments