tht

About

Username: tht
Visits: 167
Roles: member
Points: 6,906
Badges: 1
Posts: 5,452
  • New Mac Pro may not support PCI-E GPUs

    Marvin said:
    tht said:
    cgWerks said:
    tht said:
    … The big issue, how do they get a class competitive GPU? Will be interesting to see what they do. The current GPU performance in the M2 Max is intriguing and if they can get 170k GB5 Metal scores with an M2 Ultra, that's probably enough. But, it's probably going to be something like 130k GB5 Metal. Perhaps they will crank the clocks up to ensure it is more performant than the Radeon Pro 6900X …
    The devil is in the details. I’ve seen M1 Pro/Max do some fairly incredible things in certain 3D apps that match high-end AMD/Nvidia, while at the same time, there are things the top M1 Ultra fails at so miserably, it isn’t usable, and is bested by a low-mid-end AMD/Nvidia PC.

    I suppose if they keep scaling everything up, they’ll kind of get there for the most part. But, remember the previous Mac Pro could have 4x or more of those fast GPUs. Most people don’t need that, so maybe they have no intention of going back there again. But, I hope they have some plan to be realistically competitive with more common mid-to-high end PCs with single GPUs. If they can’t even pull that off, they may as well just throw in the towel and abandon GPU-dependent professional markets.
    If only they can keep scaling up. Scaling GPU compute performance with more GPU cores has been the Achilles heel of Apple Silicon. I bet not being able to scale GPU performance is the primary reason the M1 Mac Pro never shipped or even reached the validation stage. On a per-core basis, the 64 GPU cores in the M1 Ultra perform at a little over half (GB5 Metal ~1.5k points per core) of what a GPU core does in an 8-GPU-core M1 (~2.6k per core). It's basically half if you compare the Ultra to the A14 GPU core performance. And you can see the scaling efficiency get worse and worse when comparing 4, 8, 14, 16, 24, 32, 48 and 64 cores.

    The GPU team inside Apple is not doing a good job with their predictions of performance. They have done a great job at the smartphone, tablet and even laptop level, but getting the GPU architecture to scale to desktops and workstations has been a failure. Apple was convinced that the Ultra and Extreme models would provide competitive GPU performance. This type of decision isn't based on some GPU lead blustering that the architecture would work. It should have been based on modeled chip simulations showing that it would work and what potential it would have. After that, a multi-billion-dollar decision would be made. So, something is up in the GPU architecture team inside Apple imo. Hopefully they will recover and fix the scaling by the time the M3 architecture ships. The M2 versions have improved GPU core scaling efficiency, but not quite enough to make a 144 GPU core model worthwhile, if the rumors of the Extreme model being canceled are true (I really hope not).

    If the GPU scaling for the M1 Ultra were, say, 75% efficient, it would have scored about 125k in GB5 Metal. About the performance of a Radeon Pro 6800. An Extreme version with 128 GPU cores at 60% efficiency would be 200k in GB5 Metal. That's Nvidia 3080 territory, maybe even 3090. Both would have been suitable for a Mac Pro, but alas no. The devil is in the details. The Apple Silicon GPU team fucked up imo.
    There are some tests where it scales better. Here they test 3DMark at 7:25:



    Ultra gets 34k, Max gets 20k, so a 70% increase.

    Later in the video at 20:00, they test Tensorflow Metal and Ultra GPU scales to as high as 93% faster than Max.

    https://github.com/tlkh/tf-metal-experiments

    Some GPU software won't scale well due to the CPU holding it back, especially if it's Rosetta-translated.

    It's hard to tell if it's a hardware issue or software/OS/drivers, likely a combination. If the performance gain is more consistent with M2 Ultra and doesn't get fixed with the same software on M1 Ultra, it will be clearer that it's been a hardware issue. The good thing is they have software like Blender now that they can test against a 4090 and figure out what they need to do.

    The strange part is that the CPU is scaling very well, almost double in most tests. The AMD W6800X Duo scales well for compute too, and its two separate GPU chips are connected in a slower way than Apple's are. There are the same kinds of drops in some tests, but OctaneX is nearly double:

    https://barefeats.com/pro-w6800x-duo-other-gpus.html

    As you say, the GPU scaling issue could have been the reason for not doing an M1 Extreme. M2 Ultra and the next Mac Pro will answer a lot of questions.

    An Extreme model wouldn't have to be 4x Ultra chips; the edges on the Max chips could both be connected to a special GPU chip that only has GPU cores. Given that the Max is around 70W, they can do 4x Max GPU cores just on the extra chip (152) plus 76 on the others for 228 cores. This would be 90 TFLOPs, which would be needed to rival a 4090, and this extra chip can have hardware RT cores. This should scale better as 2/3 of the GPU cores would be on the same chip.
    I wonder if the AMD Duo MPX modules, which use Infinity Fabric internally (to join two GPUs) in conjunction with the external "Infinity Fabric Link" Mac Pro interconnect to bypass PCIe (although PCIe remains) and create an Infinity Fabric quad (a total of four GPUs, despite the "duo" name), might serve as a model for something Apple could do, using UltraFusion (or the next generation of it) to link the base SoC to MPX GPUs. Imagine what Apple could do with that.

    My understanding is that Infinity Fabric and UltraFusion are similar technologies, and if Apple and AMD can collaborate to create the Infinity Fabric Link, they can also coordinate the next generation(s). I guess that's really the thing that doesn't make sense to me -- that AMD and Apple would walk away from their alliance just as things start getting interesting, with RDNA 3 architecture on the AMD side, and Apple's own silicon with next-generation UltraFusion on the other. Metal 3 supports advanced AMD GPUs (i.e., those in the iMac Pro and the lattice Mac Pro), so until I hear otherwise, I'll continue to believe that Apple is going to leave the door open to AMD modules (even if only via PCIe, the status quo)...
    Yes, my best guess for what Apple is going to do for more compute performance in the Mac Pro is essentially an Ultra or an Extreme in an MPX module. The machine would have a master Ultra or Extreme SoC that boots the machine and controls the MPX modules. You could add 3 or 4 2-slot-wide MPX modules or 2 4-slot-wide MPX modules. Most software will allow you to select which MPX module, or modules, to use. I don't think an additional connection, like Infinity Fabric, over and above 16 lanes of PCIe is necessary, as most of the problems that can make use of these types of architectures are embarrassingly parallel.

    UltraFusion is something a bit different from AMD's Infinity Fabric (ex-Apple exec Mark Papermaster has a patent on IF!). Infinity Fabric is basically a serial connection between chips with about 400 GB/s of bandwidth. UltraFusion is a silicon bridge. It's a piece of silicon, not wire traces in the PCB with connection ports on the chip like AMD's Infinity Fabric, and it is overlaid on top of the edges of the Max chip silicon.

    Apple's chips have an on-die (read as silicon) fabric bus that connects the core components of the chips: CPU complexes, memory interfaces, GPU complexes, ML and media engines, SLC, PCIe, etc. The UltraFusion silicon bridge or interposer basically extends that fabric bus to a second Max chip, at 2500 GB/s, with low enough latency that 2 GPUs can appear as one. I don't know if there is a hit to GPU core scaling performance because of this. I bet there probably is.

    For the Extreme SoC, I was thinking the "ExtremeFusion" silicon bridge would be a double-length, somewhat wider version of the UltraFusion one, with a 4-port fabric bus switch and 32 lanes of PCIe in it. So 4 Max chips would behave like one chip, and it would have 32 lanes of PCIe 4 coming out of it for 8 slots. Memory would be LPDDR, no DIMM slots. And they could stack the LPDDR 4-high. 16 stacks of 24 GB LPDDR5, stacked 4-high, gets you 1.5 TB of memory. Hmm, just like the memory capacity of the 2019 Mac Pro.
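    A quick sanity check of that capacity figure; the stack count, stack height, and per-die capacity are just the assumptions from the paragraph above, not anything Apple has announced:

    ```python
    # Hypothetical "ExtremeFusion" memory configuration from the paragraph
    # above: 16 LPDDR5 stacks, 4 dies per stack, 24 GB per die (all assumptions).
    stacks, dies_per_stack, gb_per_die = 16, 4, 24
    total_gb = stacks * dies_per_stack * gb_per_die
    print(total_gb, "GB, i.e. about", total_gb / 1024, "TB")  # 1536 GB, ~1.5 TB
    ```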

    Tremendous expense for a machine that sells what, 100k units per year? This level of integration doesn't come cheap. Heck, I'd like Apple to make an MBP16 with an Ultra to amortize the costs. ;)
  • New Mac Pro may not support PCI-E GPUs

    mjtomlin said:
    1. If these new Mac Pros do not have support for all the PCIe cards that the current Intel Mac Pro supports, this system will fail. Most users interested in this system will be those looking to upgrade from their current Mac Pro and will want to bring their extremely expensive MPX modules (GPU cards) with them. The advantage of PCI slots isn't just expandability, it's also portability - moving those cards to another system.

    2. RAM is not on the SoC, it's on-package and can easily be moved to the motherboard. There's no reason Apple cannot do this - yes, there will be a performance hit.
    1. I think you are going to be disappointed. It's not going to support non-Apple GPUs. Apple has already told developers that Apple Silicon isn't going to support discrete GPU cards and isn't going to support AMD, Nvidia or Intel GPUs. If it were different, they would have supported eGPUs with Apple Silicon machines way back when.

    Now, they aren't holding up their end of the bargain by not offering Apple Silicon GPUs competitive with AMD and Nvidia graphics cards, but they surely know that. Sounds like it hasn't gotten bad enough for them to reverse the decision. The decision surely has huge ramifications for macOS design too.

    For other PCIe cards like IO cards, storage, audio, and whatnot, I bet they will be supported if the Mac Pro has PCIe slots, which I think it will.

    2. You literally stated a reason for Apple not to do it. The GPU is going to take a performance hit with less memory bandwidth. ;) Before Apple started showing their architecture, a lot of people were contemplating how they were going to have system memory feed the GPU. 4 to 8 stacks of HBM2e? Lots of GDDR memory? Their own memory stacking solution (memory cubes et al)? 8 to 12 DDR5 channels? Turns out they decided on a gazillion channels of commodity LPDDR. Perhaps their GPU scaling issue is really a latency issue with LPDDR, and they really need a higher-clocked memory solution (GDDR) to fix it? I don't know.
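    As a rough illustration of why that many LPDDR channels can feed a GPU, here is a back-of-the-envelope peak-bandwidth calculation; the bus widths and the LPDDR5-6400 data rate are the commonly cited figures for the M1 Pro/Max/Ultra, used here as assumptions rather than official specs:

    ```python
    # Peak bandwidth = (bus width in bytes) x (transfer rate).
    # LPDDR5-6400 runs at 6400 MT/s; bus widths below are the commonly
    # cited figures for the M1 Pro/Max/Ultra (assumptions, not Apple specs).
    TRANSFERS_PER_S = 6400e6  # LPDDR5-6400

    def peak_bw_gbs(bus_width_bits: int) -> float:
        """Theoretical peak bandwidth in GB/s for a given total bus width."""
        return bus_width_bits / 8 * TRANSFERS_PER_S / 1e9

    for name, width in [("M1 Pro", 256), ("M1 Max", 512), ("M1 Ultra", 1024)]:
        print(f"{name:9s} {width:5d}-bit bus -> ~{peak_bw_gbs(width):.0f} GB/s")
    # ~205, ~410 and ~819 GB/s: GDDR-class bandwidth, but at LPDDR
    # latencies, which is the trade-off speculated about above.
    ```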
  • New Mac Pro may not support PCI-E GPUs

    cgWerks said:
    tht said:
    … The big issue, how do they get a class competitive GPU? Will be interesting to see what they do. The current GPU performance in the M2 Max is intriguing and if they can get 170k GB5 Metal scores with an M2 Ultra, that's probably enough. But, it's probably going to be something like 130k GB5 Metal. Perhaps they will crank the clocks up to ensure it is more performant than the Radeon Pro 6900X …
    The devil is in the details. I’ve seen M1 Pro/Max do some fairly incredible things in certain 3D apps that match high-end AMD/Nvidia, while at the same time, there are things the top M1 Ultra fails at so miserably, it isn’t usable, and is bested by a low-mid-end AMD/Nvidia PC.

    I suppose if they keep scaling everything up, they’ll kind of get there for the most part. But, remember the previous Mac Pro could have 4x or more of those fast GPUs. Most people don’t need that, so maybe they have no intention of going back there again. But, I hope they have some plan to be realistically competitive with more common mid-to-high end PCs with single GPUs. If they can’t even pull that off, they may as well just throw in the towel and abandon GPU-dependent professional markets.
    If only they can keep scaling up. Scaling GPU compute performance with more GPU cores has been the Achilles heel of Apple Silicon. I bet not being able to scale GPU performance is the primary reason the M1 Mac Pro never shipped or even reached the validation stage. On a per-core basis, the 64 GPU cores in the M1 Ultra perform at a little over half (GB5 Metal ~1.5k points per core) of what a GPU core does in an 8-GPU-core M1 (~2.6k per core). It's basically half if you compare the Ultra to the A14 GPU core performance. And you can see the scaling efficiency get worse and worse when comparing 4, 8, 14, 16, 24, 32, 48 and 64 cores.

    The GPU team inside Apple is not doing a good job with their predictions of performance. They have done a great job at the smartphone, tablet and even laptop level, but getting the GPU architecture to scale to desktops and workstations has been a failure. Apple was convinced that the Ultra and Extreme models would provide competitive GPU performance. This type of decision isn't based on some GPU lead blustering that the architecture would work. It should have been based on modeled chip simulations showing that it would work and what potential it would have. After that, a multi-billion-dollar decision would be made. So, something is up in the GPU architecture team inside Apple imo. Hopefully they will recover and fix the scaling by the time the M3 architecture ships. The M2 versions have improved GPU core scaling efficiency, but not quite enough to make a 144 GPU core model worthwhile, if the rumors of the Extreme model being canceled are true (I really hope not).

    If the GPU scaling for the M1 Ultra were, say, 75% efficient, it would have scored about 125k in GB5 Metal. About the performance of a Radeon Pro 6800. An Extreme version with 128 GPU cores at 60% efficiency would be 200k in GB5 Metal. That's Nvidia 3080 territory, maybe even 3090. Both would have been suitable for a Mac Pro, but alas no. The devil is in the details. The Apple Silicon GPU team fucked up imo.
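    For what it's worth, a quick back-of-the-envelope sketch of those projections; the ~2.6k-per-core M1 baseline and the 75%/60% efficiency figures are the assumptions from the paragraph above, not measured results:

    ```python
    # Rough GB5 Metal scaling projections, using the ~2.6k points/core seen
    # on the 8-GPU-core M1 as the per-core baseline (assumption from above).
    M1_POINTS_PER_CORE = 2600

    def projected_score(cores: int, efficiency: float) -> float:
        """Projected GB5 Metal score if scaling were `efficiency` of perfect."""
        return cores * M1_POINTS_PER_CORE * efficiency

    # The two hypothetical scenarios discussed above:
    print(f"M1 Ultra, 64 cores @ 75%:  ~{projected_score(64, 0.75)/1000:.0f}k")   # ~125k
    print(f"Extreme, 128 cores @ 60%:  ~{projected_score(128, 0.60)/1000:.0f}k")  # ~200k

    # Versus what the M1 Ultra actually lands at, around 1.5k points/core:
    print(f"Actual M1 Ultra (~1.5k/core): ~{64 * 1500 / 1000:.0f}k")              # ~96k
    ```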
  • Intel just took the worst beating in earnings in over a decade

    ITGUYINSD said:
    The article doesn't really explain WHY Intel sales were so bad.  Especially during the holiday season.  Is new gear too expensive for consumers?  Are people keeping their computers longer?  Or are consumers moving towards more mobile devices?

    It doesn't help Intel that major sellers like Dell have raised their prices, making one think twice about buying or upgrading.
    Post-pandemic bust.

    MS reported a 40% decline in Windows revenue. PC sales declined by something like 35 to 40%. Crypto boomed during the pandemic. Crypto is now busted, so all those server GPU sales are declining. Services (Google, Amazon, MS, etc) boomed during the pandemic - work-from-home services etc - and now they have busted too.

    Apple will be lucky to hold the line at 20% declines imo. Three months ago they said Mac sales would decline. iPhone sales have slowed due to COVID lockdowns at various plants.
  • Apple still on track for iPad Pro revamp with OLED display in 2024

    charlesn said:
    A revamp with OLED? Wait, what? All of the display coverage lately has been about the eventual move, when prices come down, from OLED to mini-LED in computer applications because of several advantages. But the iPad 12.9 inch already has a mini-LED display... so we're gonna go backwards and replace it with OLED? This makes no sense. 
    Hmm, maybe you are confusing miniLED with microLED?

    LCD = monolithic backlight (iPA, iPad, MBA, MBP13, iMac24, ASD27)
    miniLED = discretized backlight (MBP14/16, iPP12.9, Pro Display XDR)
    OLED = the subpixels are emissive and are the lights, and organic (iPhones, Watch)
    microLED = the subpixels are emissive and are the lights, and inorganic

    The miniLED displays in the iPP12.9 and MBP14/16 have backlight zones that are about 0.25" x 0.25" - about 2,500 of them. The 32" XDR has about 650 backlight zones, so each zone is much larger. It's a large display, though, and expensive as a result. These miniLED displays trade very high brightness - 1600 nits for HDR content - for some blooming in dark rooms. And they are presumably much longer lived.
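    To put "much larger" in rough numbers, here's a quick per-zone area comparison; the panel dimensions are approximate published sizes and the zone counts are the rounded figures from the paragraph above:

    ```python
    # Rough per-dimming-zone area, using the approximate zone counts above and
    # published panel dimensions (assumed, rounded values -- not exact specs).
    ipad_area = 10.3 * 7.7      # iPad Pro 12.9" active area, square inches (approx.)
    xdr_area  = 27.9 * 15.7     # Pro Display XDR active area, square inches (approx.)
    ipad_zone = ipad_area / 2500    # ~2,500 dimming zones (figure from above)
    xdr_zone  = xdr_area / 650      # ~650 dimming zones (figure from above)
    print(f"iPad Pro 12.9 zone: ~{ipad_zone:.3f} sq in")
    print(f"XDR zone:           ~{xdr_zone:.3f} sq in")
    print(f"XDR zones are roughly {xdr_zone / ipad_zone:.0f}x larger in area")  # ~21x
    ```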

    OLEDs have no blooming, but are limited in brightness. iPhones have 1200 nits of brightness for HDR content, which is pretty darn good, but they are more fragile than miniLED or LCD panels. Using them for 10 years may be a problem due to the organic compounds deteriorating. You can use a computer monitor for 10 years. A phone, not so much.

    Apple has been trying to get Samsung and LG to make dual-layer OLED displays for Macs and iPads, and probably even external displays too. They'd be more robust, could be driven to higher brightness, and would have longer lifetimes - so probably good enough for iPads and MacBooks. These are probably what's driving the iPad and Mac OLED rumors. There's a price per OLED display that everyone is trying to meet; they just haven't gotten there yet. Probably in the next couple of years.

    MicroLED, which will be more robust because it uses inorganic compounds for the emissive subpixels, is going to take 4 to 5 years before it can be made affordable for 5 to 30 inch displays. An Apple Watch display in 2025? Maybe. 5 to 30 inch displays at 220 to 350 PPI? Probably another 5 years at least, and there's time for a 3 to 5 year cycle of OLEDs or miniLEDs before then.