Mac Studio gets an update to M4 Max or M3 Ultra

AppleInsider · March 5, 2025 2:06PM

Apple's update to the Mac Studio has arrived, bringing with it the performance of the M4 Max as well as the new M3 Ultra chip to provide desktop Mac users even more processing power.

Sleek silver rectangular device with rounded edges, featuring ports and a small light on the front, resting on a mesh grille base.

M4 Mac Studio - Image Credit: Apple

The Mac Studio was last updated in 2023 with an M2 chip upgrade, securing the desktop Mac's position as the one to get if you care about performance, but not enough to get a Mac Pro. Two years later, Apple has finally refreshed it.

The 2025 Mac Studio is Apple's typical spec-bump update, with the changes largely internal rather than externally visible. The design remains the same as the previous model, one that was good enough for the Mac mini to steal for itself.

Inside one configuration is the M4 Max chip, which first surfaced in the 14-inch MacBook Pro and 16-inch MacBook Pro. In the portables, it was offered in two variants, consisting of a 14-core CPU with ten performance cores and four efficiency cores and a 32-core GPU, or a 16-core CPU with two more performance cores and a 40-core GPU.

In the Mac Studio, both M4 Max variants are available, with a $300 price difference between them.

Two Apple chip logos on black background: M4 Max with purple gradient and M3 Ultra with a blue gradient.

The M4 Max and M3 Ultra feature in the Mac Studio - Image Credit: Apple

The other option, the M3 Ultra, takes the usual Ultra form of two M3 Max chips joined together. This doubles the CPU and GPU core counts, with one version having a 28-core CPU and 60-core GPU and the other a 32-core CPU and 80-core GPU.

Both M3 Ultra variants have a 32-core Neural Engine, double the core count of the non-Ultra chips.

When it comes to memory, the 14-core M4 Max model is limited to 36GB, with the 16-core version offering 48GB, 64GB, and 128GB capacities. The M3 Ultra versions include 96GB or 256GB for the 28-core CPU version, with an extra 512GB option for the top chip variant.

For storage, the M4 Max models start with 512GB SSDs, rising to configurations with 8TB. The M3 Ultra models start at 1TB and rise to 16TB.
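The memory and storage options above can be summarized as a small lookup table. The Python sketch below is illustrative only: the `CONFIGS` name, the dictionary layout, and the `chips_offering` helper are assumptions made for this example, and it records only the figures stated in this article.

```python
# Mac Studio (2025) options as described in the article above.
# The structure and names are illustrative, not an Apple API.
CONFIGS = {
    "M4 Max 14-core CPU / 32-core GPU": {
        "memory_gb": [36],
        "storage_start_gb": 512,
        "storage_max_tb": 8,
    },
    "M4 Max 16-core CPU / 40-core GPU": {
        "memory_gb": [48, 64, 128],
        "storage_start_gb": 512,
        "storage_max_tb": 8,
    },
    "M3 Ultra 28-core CPU / 60-core GPU": {
        "memory_gb": [96, 256],
        "storage_start_gb": 1000,
        "storage_max_tb": 16,
    },
    "M3 Ultra 32-core CPU / 80-core GPU": {
        "memory_gb": [96, 256, 512],
        "storage_start_gb": 1000,
        "storage_max_tb": 16,
    },
}


def chips_offering(memory_gb):
    """Return the chip variants that list the given memory capacity."""
    return [chip for chip, spec in CONFIGS.items() if memory_gb in spec["memory_gb"]]


print(chips_offering(512))
print(chips_offering(96))
```

Laid out this way, it is easier to see that 512GB is tied to the top M3 Ultra variant and that the 14-core M4 Max has a single memory option.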

Silver computer with perforated back panel, featuring multiple ports, including USB, HDMI, Ethernet, and power button, on the bottom section.

The rear of the new Mac Studio - Image Credit: Apple

When it comes to connectivity, there is a tiny change on the back panel in the form of four Thunderbolt 5 connections, supporting up to 120Gbps of bandwidth. The ports also handle USB 4 and DisplayPort 2.1 connections.

There are also two USB-A 3 connections, an HDMI 2.1 port, 10Gb Ethernet, and a headphone jack with high-impedance headphone support at the rear, as well as the power button.

At the front, the M4 Max model has two USB-C ports with 10Gbps of bandwidth, while the M3 Ultra model has two Thunderbolt 5 ports. Both versions also have an SDXC card slot with UHS-II support on the front.

Wireless connectivity consists of Wi-Fi 6E support and Bluetooth 5.3.

Available for preorder, the Mac Studio starts from $1,999 for the M4 Max variant and $3,999 for the M3 Ultra.

spliceguys · March 5, 2025 2:25PM

Came up super short without an Apple M4 ultra.

stabitha_christie · March 5, 2025 2:42PM

spliceguys said:

Came up super short without an Apple M4 ultra.

If I were to jump into the world of wild speculation, I’d wager that moving forward the Mac Pro will be the only device with the current version of the Ultra, as it is a good way to delineate the devices while still giving the Studio a chip that is faster than the current Max.

azarius · March 5, 2025 2:54PM

It features Thunderbolt *5*, not 4.

gumashow · March 5, 2025 3:06PM

Very confusing article, the way it’s written. Why would this have Thunderbolt 4 ports when the Mac mini Pro has Thunderbolt 5 ports? And the way it describes the RAM configurations, I’m lost.

netrox · March 5, 2025 3:12PM

The fact that the M3 Ultra now supports up to 512 GB RAM is pretty amazing. It's great for large-scale LLMs. The M2 Ultra only supported 192GB at max.

blastdoor · March 5, 2025 3:23PM

Would have been nice to get M4 Ultra, but the Studio with the base M3 Ultra looks pretty appealing to me.

I work for a private firm that has lost a lot of revenue due to arbitrary DOGE contract cancellations, so my job is currently at risk. But if I can hold onto my job, I might spring for that base Ultra.

9secondkox2 · March 5, 2025 3:46PM

Wow, EXTREMELY disappointed.

Instead of building a killer Mac Pro while the Studio forges ahead with an Nvidia-beating processor, we get a handicapped Mac Studio so the Pro can pretend to look better. What kind of crap is that?

The only saving grace might be ... if an iMac Pro comes out with the current version Ultra chipset. Would delineate Pro from Studio.

But still... incredibly bad move. Almost unbelievable. Like something a Samsung would do.

edited March 5

9secondkox2 · March 5, 2025 3:50PM

Also interesting is that the M3 did not have the interconnect, so Apple had to go back and make one? For this? Why not do a monolith or do so for the M4? Horrible to know all the potential for a big swing at Nvidia is there only to be flushed down the toilet. The only thing that rectifies this is if they're making a monolith M4 Ultra and it's just not there yet. Otherwise, this is kind of a joke.

king editor the grate · March 5, 2025 4:09PM

9secondkox2 said:

Like something a Samsung would do.

None of them have exploded … yet …

verne arase · March 5, 2025 4:21PM

This is the one I've been waiting for to replace my 2020 iMac 5K which has been glitching lately playing YouTube videos and consuming 100% CPU.

Getting this with a Studio Display, 64 GB RAM, a fully loaded M4 Max, and 4 TB SSD so this is basically a slide-in replacement for my iMac.

The Ultras have way more horsepower and multiprocessing than I need, and the faster M4 performance cores will make this the snappiest Mac yet for normal workloads.

With 4 Thunderbolt 5 ports on the rear, the Studio Display will take one and leave three, each with its own Thunderbolt controller behind it (unlike the 2020 iMac, which shared a single controller between its two ports). A new keyboard will add Touch ID support to speed up security interruptions.

This should be over 3x faster than my core-i9 iMac (which is still my daily driver due to its superior screen real estate), with six times faster graphics with hardware ray tracing - not to mention support for Apple Intelligence. The 2020 iMac will replace the old 2017 iMac 5K as the family computer (which no longer has current macOS support).

Splitting up the processor and display will make for cheaper and smoother upgrades in the future too - though given the power of this new hardware it's hard to see needing upgrades much in the future.

handsomesmitty · March 5, 2025 5:51PM

Wow, the balls on Apple, continuing their bait-and-switch tactics with limited memory options for the low-end Max chip on the Studio. Crazy criminal!

anonconformist · March 5, 2025 5:52PM

netrox said:

The fact that the M3 Ultra now supports up to 512 GB RAM is pretty amazing. It's great for large-scale LLMs. The M2 Ultra only supported 192GB at max.

Why anyone would dislike your comment is puzzling to me.

I bought a Surface Laptop 7 with 64 GB RAM (at a little discount, as I’m a Microsoft employee: these can only be bought directly from Microsoft) purely for the point of having a Windows machine to run larger LLMs and do AI experimentation at a reasonable budget, knowing there are better performance options if you have bottomless budgets.

For the price, it’s a great deal: not many machines can run LLMs that large. It’s not perfect, as memory bandwidth and thermals (running the LLMs purely on the CPU makes it a bit warm) appear to be the bottlenecks. Right now the NPU isn’t supported by LM Studio and others, and where you can use the NPU, most LLMs aren’t currently in the right format. It’s definitely an imperfect situation. But it runs 70-billion-parameter LLMs (sufficiently quantized), which you couldn’t do with Nvidia chips at a rational price, though you do need to be patient.
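As a rough illustration of why quantization matters here, the sketch below estimates the size of the weights alone for a 70-billion-parameter model at different bit widths. The function name `weights_gb` is made up for this example, and the estimate ignores the KV cache, activations, and runtime overhead, so real memory use will be higher.

```python
def weights_gb(params_billions, bits_per_weight):
    """Approximate size of the model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Compare full 16-bit weights with 8-bit and 4-bit quantization.
for bits in (16, 8, 4):
    print(f"70B parameters at {bits}-bit: about {weights_gb(70, bits):.0f} GB")
```

Under these assumptions, only the more heavily quantized versions leave headroom on a 64 GB machine, which is consistent with the "sufficiently quantized" caveat above.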

I’d love to have seen an M4 Ultra with all the memory bandwidth: with 512 GB RAM, presumably being able to use all the CPU cores, GPU cores and Neural Engine cores, it’s likely still memory bandwidth constrained. I would note: my laptop is still perfectly interactive at load, with only the 12 cores. I’d expect far more with one of the Mac Studio beasts.
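The memory bandwidth point can be sketched with a common rule of thumb: when a dense model generates one token at a time, roughly all of its weights are read from memory for each token, so bandwidth divided by model size gives a ceiling on tokens per second. The function name and the 800 GB/s figure below are placeholders for illustration, not specifications of any machine mentioned here.

```python
def max_tokens_per_second(bandwidth_gb_s, model_gb):
    """Upper bound on decode speed if each generated token reads all weights once.

    Assumes a dense model and a single request at a time; real throughput
    will be lower because of compute, caching behavior, and other overhead.
    """
    return bandwidth_gb_s / model_gb


# 800 GB/s is a placeholder value, not a measured or quoted figure.
ceiling = max_tokens_per_second(bandwidth_gb_s=800, model_gb=35)
print(f"Ceiling: {ceiling:.1f} tokens/s")
```

The useful takeaway is the shape of the relationship: doubling the model size halves the ceiling unless bandwidth grows with it, no matter how many cores are available.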

We finally have a viable reason mere mortals could make effective use of 512 GB RAM machines: LLMs. Resource constraints of current hardware are the biggest reason we can’t have a very user-friendly hybrid OS that uses LLMs for natural-language interaction with humans, with the traditional older-style OS serving as a super powerful device driver and terminal layer. The funny thing is that with powerful enough LLMs, you can describe what you need, and they can create applications that run within the context of the LLM itself to do it; they just need a little bit more access to the traditional OS to carry it out for the GUI. I know, because I’m doing that on my laptop: it’s not fast enough to run all LLMs locally at maximum efficiency for humans yet, but it does work, better than expected.

danox · March 5, 2025 5:54PM

spliceguys said:

Came up super short without an Apple M4 ultra.

Actually, your comment is super short, particularly since no one has their hands on one yet to test it. It is surprising (a curveball) that it’s an M3 Ultra variant instead of an M4 Ultra. However, the competition is not even close to 256 gigs or 512 gigs of UMA memory, operating as silently as the grave on your desk.

The important thing is whether or not Apple is using these as servers for Apple Intelligence, and I assume they are using (actively developing) software to allow the networking of two, four, six, or eight of these Mac Studios together as one. 512 gigs of memory with Thunderbolt 5 and big bandwidth (at low wattage): once again, where is the competition with that combination? The ultimate key to the success of the system is the software support Apple needs to develop for itself and for developers in the AI modeling area of computing going forward.

edited March 5

apple4thewin · March 5, 2025 6:00PM

anonconformist said:

We finally have a viable reason mere mortals could make effective use of 512 GB RAM machines: LLMs. Resource constraints of current hardware are the biggest reason we can’t have a very user-friendly hybrid OS that uses LLMs for natural-language interaction with humans, with the traditional older-style OS serving as a super powerful device driver and terminal layer. The funny thing is that with powerful enough LLMs, you can describe what you need, and they can create applications that run within the context of the LLM itself to do it; they just need a little bit more access to the traditional OS to carry it out for the GUI. I know, because I’m doing that on my laptop: it’s not fast enough to run all LLMs locally at maximum efficiency for humans yet, but it does work, better than expected.

Out of genuine curiosity, why would one need to run an LLM, especially on a maxed-out M3 Ultra? What are the use cases for such a local LLM?

danox · March 5, 2025 6:22PM

netrox said:

The fact that the M3 Ultra now supports up to 512 GB RAM is pretty amazing. It's great for large-scale LLMs. The M2 Ultra only supported 192GB at max.

The support for 512 gigs of UMA memory is the most important aspect of this new Mac Studio. I find it interesting that through most of the last year and a half, many tech people (YouTubers) complained about the lack of memory in some Apple systems, but now that they’re offering a hellacious amount of memory, the same bunch will still complain. And yes, that memory will cost you, but so what? The most important fact is that it’s available for power users if they choose to buy one or many.

edited March 5

applepoor · March 5, 2025 6:24PM

Go to Apple's Mac Studio site and do a compare with the new Ultra Studio and the M4 Max like I have in my new 16" MacBook Pro with 128GB of memory and 8TB SSD.

When you see where the numbers in the Ultra are double what they are in the M4 Max, I think that for marketing purposes Apple chose "M3 Ultra" so they have "M4 Ultra" for the Mac Tower later. The number that will verify this thought will be the single-core speed of the M3 Ultra vs the single-core speed of the M4 Max.

In both the M1 family and the M2 family of four trim lines, the single-core speed for each of the four trim lines was nearly identical. So we will have to see if the M3 Ultra has a single-core speed like the M3 family or the M4 family.

danox · March 5, 2025 6:35PM

apple4thewin said:

Out of genuine curiosity, why would one need to run an LLM, especially on a maxed-out M3 Ultra? What are the use cases for such a local LLM?

Don’t forget Apple is also using these new M3 Ultras behind the scenes as servers for Apple Intelligence. That may be a big reason why they decided to support up to 512 gigs of UMA memory. Most (me included) were speculating that Apple would go to 256-320 gigs, so it’s actually a big bonus that Apple is going to 512 gigs.

edited March 5

danox · March 5, 2025 6:44PM

HandsomeSmitty said:

Wow, the balls on Apple, continuing their bait-and-switch tactics with limited memory options for the low-end Max chip on the Studio. Crazy criminal!

Intel, AMD, and Nvidia are available on the marketplace… Good luck with the wattage. It seems Apple can’t win on the low end or the high end from a memory standpoint: on the low end they put too little, on the high end they put too much.

gwmac · March 5, 2025 6:57PM

This is what I and a lot of other 27-inch iMac owners have been waiting on. My only remaining decision is what monitor to buy. I want to get at least a 32-inch, so I was considering the Dell UltraSharp 32 4K Thunderbolt Hub Monitor - U3225QE.

Does this seem like a good monitor to pair with my new Studio? I get a $200 Dell credit every 6 months with my Amex Business Platinum card so that will knock it down to $749.

edited March 5

anonconformist · March 5, 2025 7:07PM

apple4thewin said:

Out of genuine curiosity, why would one need to run an LLM, especially on a maxed-out M3 Ultra? What are the use cases for such a local LLM?

That’s a perfectly fair question!

I do developer support at Microsoft. That involves using and creating quite a lot of complex tools, often during the progression of a single support case, whether that be writing working sample code in an Advisory case, or for the sake of debugging and analyzing Time Travel Debugging and other types of telemetry captured in a reasonable timeframe, to doing lots of research and analysis of documentation and source code to make sense of how things should work (right now, I’m stuck analyzing OS source code manually and that’s very time-consuming).

1. Privacy: there are various things you can’t afford to have exposed outside of your machine, or beyond as limited a group of people as possible. GDPR amplifies that, but this is also true for personal, confidential stuff you want to work on, for your formal employer, or you, if self-employed, or doing stuff outside of work.

2. When you have enough utilization, at some point it becomes more economical to have only your electric bill as the recurring expense. I live and work in the US where electricity is relatively cheap compared to some other locations in the world. But, the more tokens used, the more computer time used with online services, the more the equation tips towards it making sense. The observation is the more you can spend on those tokens, often the better the results you can get.

3. Rate-limiting factors: even if you have the budget for the remote LLM usage, you have a very real probability of being rate-limited. This may happen for many reasons, including the ultimate rate-limiting aspects of down servers or internet connections.

4. Online LLMs tend to be tuned to not provide output as you’d like it: they’re targeted to certain requirements for business reasons that don’t necessarily align with your needs. Especially working with code, this can be problematic. If all you need is a tiny application that can be created with a single prompt, that’s tiny, simple, and rather rare in the field. There’s a lot more to it than that. That’s great for YouTube demonstration videos, not much like creating more complex applications. Also, refer back to #3, this plays heavily there, too.

5. Every LLM has a personality. We’ve entered that realm that, only a few years ago, was something out of science fiction, and yet, here we are. I’ve worked with quite a few different LLMs, locally as well as remotely, all have a personality. They also each have different strengths and weaknesses, just like humans. There is not such a thing as one-size-fits-all as of now.

6. Smaller LLMs have the advantage of being more immediate for response, such as translating speech-to-text and text-to-speech, and topic-specific autocompletion at actual interactive speeds. You can’t reliably get that from online because of network latencies. It’s more than a question of which LLM (massive) you want to run, it’s more rational to consider how to optimize the right team of LLMs for your needs: tiny ones for maximum speed for interaction where it matters, huge, powerful ones for heavy-lifting tasks that live typing/speaking interaction speeds aren’t the biggest reason. Might as well have all of them in RAM at once, since memory bandwidth is generally the most limiting factor, but I’ve not had a chance to verify with powerful enough machines that I don’t have to worry about thermal throttling, that have more cores.

7. In my personal research OS inside an LLM (currently inside Grok 3 only, I have a bit of that interface layer to do) I can be extremely abstract and state what I want to do, and the LLM, via my applications, will utilize the insanely great levels of abstraction enabled by a distillation of all the human knowledge in the unsupervised training data to either identify the correct application to fulfill my need, or it will create a new application on the fly to do so, in a repeatable way, focused on that task. I had to do a double-take when I realized what I’d done. I made a mistake once and hit the return key before I intended to, and if I’m to believe what I was told, I accidentally created a productivity application to track projects. Oops! The craziest thing is you can create very powerful functionality using even a simple prompt, or even a whole page of a prompt, that would require a huge amount of code to make happen with traditional applications. The larger and more powerful the LLM in question, the shorter and more abstract that prompt can be. A very large LLM can be compared to the original Macintosh Toolbox in ROM that enabled great GUIs and powerful applications that ran off floppy disks, on a machine that started out with 128KB RAM. Note: buying super-fast SSDs isn’t a worthwhile tradeoff, as they’re still dreadfully slow compared to RAM, and you’ll thrash an SSD to death with swapping.

8. I’ve got a few disabilities that impact me: I can fully control how all this interacts to my wishes.

9. A system where you aren’t running up against arbitrary limits, particularly those outside your control, greatly improves your capacity to get into an incredibly effective state of flow. Autocomplete that slows you down is worse than no autocomplete at all, as one example, and very disruptive. Sudden rate-limiting, or network/service outage? There goes your deep context working state, you’ve easily lost half an hour of effectiveness, if you can get it back, and you don’t know when. If you’re not doing deep-thinking tasks that require you to intensely focus, it doesn’t matter much. Me? I’m autistic with ADHD, dyslexic, dyspraxic and some other things, the fewer things that get in my way, the better.

10. Censorship: what if you want to do things that online models don’t allow? It’s amazing just how much you can’t even discuss with online LLMs for various reasons.

11. The memory required for processing a given context size for LLMs isn’t linear, it’s worse.
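The economic argument in point 2 can be sketched as a simple break-even calculation. Everything here is a hypothetical illustration: the function name and the monthly figures are made up, and only the $10K hardware price echoes a number mentioned in this thread.

```python
def break_even_months(hardware_cost, monthly_cloud_spend, monthly_electricity):
    """Months until buying hardware costs less than continuing to pay a hosted service.

    Returns None if running locally does not save money each month.
    """
    monthly_saving = monthly_cloud_spend - monthly_electricity
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


# Hypothetical usage: $450/month on hosted tokens versus $50/month of electricity.
print(break_even_months(10_000, 450, 50))  # 25.0
```

The heavier the usage, the larger the monthly saving and the shorter the payback period, which is the tipping point described in point 2. The sketch ignores resale value, hardware depreciation, and differences in model quality.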

I hope this provides useful food for thought: I could possibly have left reasons out. If you know how to use them, and you have good use-cases for them, they’re a time/efficiency force-multiplier and it may make sense to buy a $10K Mac Studio. If all you’re doing is silly chatting, it doesn’t readily justify such expenditures. Right now, I’m both the upper end of AI and traditional OS/computer hardware power user territory, so I know how to make very good use of this, right now. But assuming they have the interest, others will start realizing there’s so much more they can do that they had no idea could be done, like in my experiment of creating a fantasy adventure novel, and asking my locally-run LLM to identify and name all the concepts in the story thus far and compare that against what is found in successful fantasy adventure novels. When you know how to keep them focused on a topic, the hallucination factor is extremely reduced, too, and they even run faster!
