tenthousandthings

About

Username: tenthousandthings
Joined: June 2007
Visits: 179
Last Active: October 2024
Roles: member
Points: 2,055
Badges: 1
Posts: 1,068

Reactions

1.1K Like, 0 Dislike, 163 Informative

Future Mac Pro may use Apple Silicon & PCI-E GPUs in parallel

tenthousandthings

February 2023

This follows neatly on the Mac Pro discussion a few weeks ago about rumors on this same question, "New Mac Pro may not support PCI-E GPUs."
New Mac Pro may not support PCI-E GPUs

tenthousandthings

January 2023

Marvin said:

tht said:

cgWerks said:

tht said:
… The big issue: how do they get a class-competitive GPU? It will be interesting to see what they do. The current GPU performance in the M2 Max is intriguing, and if they can get 170k GB5 Metal scores with an M2 Ultra, that's probably enough. But it's probably going to be something like 130k GB5 Metal. Perhaps they will crank the clocks up to ensure it is more performant than the Radeon Pro W6900X …

The devil is in the details. I’ve seen M1 Pro/Max do some fairly incredible things in certain 3D apps that match high-end AMD/Nvidia, while at the same time, there are things the top M1 Ultra fails at so miserably, it isn’t usable, and is bested by a low-mid-end AMD/Nvidia PC.

I suppose if they keep scaling everything up, they’ll kind of get there for the most part. But remember, the previous Mac Pro could have 4x or more of those fast GPUs. Most people don’t need that, so maybe they have no intention of going back there again. But I hope they have some plan to be realistically competitive with more common mid-to-high-end PCs with single GPUs. If they can’t even pull that off, they may as well just throw in the towel and abandon GPU-dependent professional markets.

If only they can keep scaling up. Scaling GPU compute performance with more GPU cores has been the Achilles heel of Apple Silicon. I bet not being able to scale GPU performance is the primary reason why the M1 Mac Pro was not shipped or never got to the validation stage. On a per-core basis, each of the 64 GPU cores in the M1 Ultra performs at a little over half (GB5 Metal 1.5k points per core) of what a GPU core does in an 8 GPU core M1 (2.6k per core). It's basically half if you compare the Ultra to the A14 GPU core performance. And you can see the scaling efficiency get worse and worse when comparing 4, 8, 14, 16, 24, 32, 48 and 64 cores.

The GPU team inside Apple is not doing a good job with their predictions of performance. They have done a great job at the smartphone, tablet and even laptop level, but getting the GPU architecture to scale to desktops and workstations has been a failure. Apple was convinced that the Ultra and Extreme models would provide competitive GPU performance. This type of decision isn't based on some GPU lead blustering that this architecture would work. It should have been based on modeled chip simulations showing that it would work and what potential it would have. After that, a multi-billion-dollar decision would be made. So, something is up in the GPU architecture team inside Apple imo. Hopefully they will recover and fix the scaling by the time the M3 architecture ships. The M2 versions have improved GPU core scaling efficiency, but not quite enough to make a 144 GPU core model worthwhile, if the rumors of the Extreme model being canceled are true (I really hope not).

If the GPU scaling for the M1 Ultra was say 75% efficient, it would have scored about 125k in GB5 Metal. About the performance of a Radeon Pro 6800. An Extreme version with 128 GPU cores at 60% efficiency would be 200k in GB5 Metal. That's Nvidia 3080 territory, maybe even 3090. Both would have been suitable for a Mac Pro, but alas no. The devil is in the details. The Apple Silicon GPU team fucked up imo.
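As a rough check on the figures above, the projections follow from multiplying core count by the 8-core M1's per-core score (about 2.6k GB5 Metal) and an assumed efficiency. This is only a sketch: the function names are made up for illustration, and the inputs are the rounded numbers quoted in the post, not measurements.

```python
# Rounded per-core GB5 Metal score of an 8 GPU core M1, as quoted in the post.
BASE_PER_CORE = 2600


def projected_score(cores, efficiency, per_core=BASE_PER_CORE):
    """Score if every core delivered `efficiency` of the baseline per-core score."""
    return cores * per_core * efficiency


def scaling_efficiency(observed_per_core, per_core=BASE_PER_CORE):
    """Observed per-core score as a fraction of the baseline per-core score."""
    return observed_per_core / per_core


# Hypothetical cases from the post.
ultra_at_75 = projected_score(64, 0.75)     # 64-core Ultra at 75% efficiency
extreme_at_60 = projected_score(128, 0.60)  # 128-core Extreme at 60% efficiency

# Efficiency implied by the quoted ~1.5k points per core on the M1 Ultra.
ultra_observed = scaling_efficiency(1500)

print(f"Ultra @ 75%:    {ultra_at_75:,.0f}")
print(f"Extreme @ 60%:  {extreme_at_60:,.0f}")
print(f"Ultra observed: {ultra_observed:.0%} of baseline per-core score")
```

With those inputs the 64-core case lands at about 125k and the 128-core case at about 200k, in line with the estimates above, while the quoted 1.5k per core works out to roughly 58% efficiency.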

There are some tests where it scales better. Here they test 3DMark at 7:25:

Ultra gets 34k, Max gets 20k so 70% increase.

Later in the video at 20:00, they test TensorFlow Metal and the Ultra GPU scales to as high as 93% faster than the Max.

https://github.com/tlkh/tf-metal-experiments
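To put those two results on the same footing as the per-core numbers discussed earlier, a speedup can be turned into a scaling efficiency by dividing it by the ratio of core counts. A minimal sketch, assuming the Ultra tested has exactly twice the GPU cores of the Max (the helper name is hypothetical):

```python
def speedup_to_efficiency(speedup, core_ratio=2.0):
    """Fraction of ideal scaling: measured speedup divided by the core-count ratio."""
    return speedup / core_ratio


# 3DMark scores quoted above: Ultra ~34k, Max ~20k.
threedmark_speedup = 34_000 / 20_000
# TensorFlow Metal: "as high as 93% faster" means a 1.93x speedup.
tensorflow_speedup = 1.93

threedmark_eff = speedup_to_efficiency(threedmark_speedup)
tensorflow_eff = speedup_to_efficiency(tensorflow_speedup)

print(f"3DMark:     {threedmark_speedup:.2f}x speedup, {threedmark_eff:.0%} of ideal")
print(f"TensorFlow: {tensorflow_speedup:.2f}x speedup, {tensorflow_eff:.1%} of ideal")
```

Under that assumption, a 70% gain corresponds to about 85% of ideal scaling and a 93% gain to about 96.5%.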

Some GPU software won't scale well due to the CPU holding it back, especially if it's Rosetta translated.

It's hard to tell if it's a hardware issue or software/OS/drivers, likely a combination. If the performance gain is more consistent with M2 Ultra and doesn't get fixed with the same software on M1 Ultra, it will be clearer that it's been a hardware issue. The good thing is they have software like Blender now that they can test against a 4090 and figure out what they need to do.

The strange part is that the CPU is scaling very well, almost double in most tests. The AMD W6800X Duo scales well for compute too and those separate GPU chips are connected together in a slower way than Apple's chip. There are the same kind of drops in some tests but OctaneX is near double:

https://barefeats.com/pro-w6800x-duo-other-gpus.html

As you say, the GPU scaling issue could have been the reason for not doing an M1 Extreme. M2 Ultra and the next Mac Pro will answer a lot of questions.

An Extreme model wouldn't have to be 4x Ultra chips; the edges on the Max chips could both be connected to a special GPU chip that only has GPU cores. Given that the Max is around 70W, they can do 4x Max GPU cores just on the extra chip (152) plus 76 on the others for 228 cores. This would be 90 TFLOPs, which would be needed to rival a 4090, and this extra chip can have hardware RT cores. This should scale better as 2/3 of the GPU cores would be on the same chip.
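The core counts in that proposal can be checked quickly. This sketch assumes the figures are based on a 38 GPU core M2 Max, which is what 152 and 76 imply; the variable names are illustrative, and the per-core TFLOPs figure is simply derived from the post's own 90 TFLOPs estimate, not from a measurement.

```python
MAX_GPU_CORES = 38  # assumed: full M2 Max GPU core count implied by the post's numbers

gpu_only_chip_cores = 4 * MAX_GPU_CORES   # the dedicated GPU-only chip
max_chip_cores = 2 * MAX_GPU_CORES        # two Max chips flanking it
total_cores = gpu_only_chip_cores + max_chip_cores

# Share of all GPU cores that would sit on the single GPU-only chip.
share_on_gpu_chip = gpu_only_chip_cores / total_cores

# Per-core throughput implied by the post's 90 TFLOPs total.
implied_tflops_per_core = 90 / total_cores

print(f"GPU-only chip: {gpu_only_chip_cores} cores")
print(f"Max chips:     {max_chip_cores} cores")
print(f"Total:         {total_cores} cores")
print(f"Share on GPU-only chip: {share_on_gpu_chip:.1%}")
print(f"Implied TFLOPs per core: {implied_tflops_per_core:.3f}")
```

The totals come out to 152 + 76 = 228 cores, with two thirds of them on the GPU-only chip, as stated above.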

I wonder if the AMD Duo MPX modules, which use Infinity Fabric internally (to join two GPUs) in conjunction with the external "Infinity Fabric Link" Mac Pro interconnect to bypass PCIe (although PCIe remains) and create an Infinity Fabric quad (a total of four GPUs, despite the "duo" name), might serve as a model for something Apple could do, using UltraFusion (or the next generation of it) to link the base SoC to MPX GPUs. Imagine what Apple could do with that.

My understanding is that Infinity Fabric and UltraFusion are similar technologies, and if Apple and AMD can collaborate to create the Infinity Fabric Link, they can also coordinate the next generation(s). I guess that's really the thing that doesn't make sense to me -- that AMD and Apple would walk away from their alliance just as things start getting interesting, with RDNA 3 architecture on the AMD side, and Apple's own silicon with next-generation UltraFusion on the other. Metal 3 supports advanced AMD GPUs (i.e., those in the iMac Pro and the lattice Mac Pro), so until I hear otherwise, I'll continue to believe that Apple is going to leave the door open to AMD modules (even if only via PCIe, the status quo)...
New Mac Pro may not support PCI-E GPUs

tenthousandthings

January 2023

LOL, this is the opposite of what I said last night in the Intel Mac Pro versus M2 Pro Mac mini thread. All that beautiful thermal engineering, the MPX modules with Infinity Fabric Link, it all goes to waste, discontinued less than four years after launch (December 2019)? That would be a shame. I really don't think Apple is in the business of shooting itself in the foot like that.

It would, however, be very Apple-like to use an UltraFusion-like interconnect (which is similar to AMD's Infinity Fabric), so the MPX options at launch would be limited and expensive.
M2 Pro Mac mini vs Mac Pro - compared

tenthousandthings

January 2023

mjtomlin said:

keithw said:

The 2023(?) ASi "Mac Pro" must be able to reach the 166,946 GB5 GPU result, either with on-chip GPU cores or with a discrete graphics card like the existing Intel Mac Pro; otherwise, why bother to even release it?

I think this is why we haven't seen the new Mac Pro yet. The new GPU design for the A16 was supposed to see a huge performance increase (>50%, mainly due to implementing hardware-based ray tracing), but it had to be pulled because apparently it wasn't meeting efficiency standards*. So if I had to guess, the M3 is going to skip the A16 and be based on the A17 generation of cores, so we should see a fairly substantial performance increase in the M3 and finally get the ASi-based Mac Pro, which will be the first system with M3 generation SoCs with M3 Ultra and M3 Extreme. (Both the A17 and M3 will also use TSMC's N3 process, bringing further performance and efficiency enhancements.)

*This is the issue that's going to have to be addressed at some point in the future... trying to develop a single core for both mobile and desktop applications. More than likely, that new GPU would've been fine in a desktop system where thermal ceilings can be lifted with active cooling systems. I think Apple will eventually start "optimizing" actual CPU and GPU (more so) cores for their intended systems.

That leak about the A16 graphics for iPhone is believable, but it may miss the forest for the trees. An M2 Extreme could use the A16 foundation on N4 with new graphics, and let’s not forget we don’t actually know anything about the M2 Ultra. Your hypothetical optimized cores may be closer than we know…
M2 Pro Mac mini vs Mac Pro - compared

tenthousandthings

January 2023

keithw said:

blastdoor said:

Good comments and discussion of GPU performance.

In order for the Mac Pro to compete with 'pro' level Windows/Linux systems using high-end discrete GPUs, I wonder if Apple needs to either (1) continue to include high-end discrete GPUs in the Mac Pro (which kind of runs contrary to Apple's strongly expressed preference for sharing memory between CPU and GPU cores, but perhaps so be it) or (2) reconfigure Apple Silicon so that CPU and GPU cores sit on different pieces of silicon and are linked together via 'UltraFusion', thereby perhaps improving chip yields since CPU and GPU cores could be on separate dies.

I'm inclined to think that option 2 is more appealing technically, but I'm not sure about the business/economics side. If Apple puts CPU and GPU on different dies linked by UltraFusion (or whatever they want to call their 'glue'), they would likely need to do that across more product lines than just the Mac Pro. Maybe only integrate CPU and GPU on a single piece of silicon for the generic M#, but for Pro, Max, Ultra, etc, put CPU and GPU on different dies linked. That would allow independent scaling of CPU and GPU power to better target the needs of users who either need more CPU or more GPU (or both).

I can see no technical reason Apple can't have both a powerful on-chip set of GPUs and support for discrete PCIe GPU boards for people who need it. I've been running an AMD RX 6900 XT graphics card in a Thunderbolt 3-based eGPU enclosure made by Sonnettech, and getting similar GB5 Metal results to the current Mac Pro. This is with a 5-year-old iMac Pro.

I think if we step back, it’s evident the current Mac Pro must have been designed to house Apple Silicon. They introduced it only a year before the announcement (shipping only months before it) — the planning and preparation for such a consequential move must have been well underway in 2017 and 2018 when the current Mac Pro was created. If so, then the plan all along has been to support MPX modules with Apple Silicon, including GPUs.