The fact that the M3 Ultra now supports up to 512 GB of RAM is pretty amazing. It's great for large-scale LLMs. The M2 Ultra only supported 192 GB at most.
Why anyone would dislike your comment is puzzling to me.
I bought a Surface Laptop 7 with 64 GB of RAM (at a small discount, as I’m a Microsoft employee: these can only be bought directly from Microsoft) purely to have a Windows machine that can run larger LLMs and do AI experimentation on a reasonable budget, knowing there are better-performing options if you have a bottomless budget.
For the price, it’s a great deal: not many machines can run LLMs that large. It’s not perfect; memory bandwidth and thermals (running the LLMs purely on the CPU makes it a bit warm) appear to be the bottlenecks. Right now the NPU isn’t supported by LM Studio and others, and where you can use the NPU, most LLMs aren’t currently in the right format. It’s definitely an imperfect situation. But it runs 70-billion-parameter LLMs (sufficiently quantized) that you couldn’t run on Nvidia hardware at a rational price, though you do need to be patient.
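For a rough sense of why 64 GB is enough: the quantized weights dominate the footprint. A minimal back-of-envelope sketch (the function and numbers are my own illustration; real usage also includes the KV cache and runtime overhead):

    # Rough weight-memory estimate for a quantized LLM (illustrative only).
    def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
        total_bytes = params_billions * 1e9 * bits_per_weight / 8
        return total_bytes / 1e9

    print(weight_memory_gb(70, 4))   # ~35 GB at 4-bit: fits in 64 GB with room for the OS and KV cache
    print(weight_memory_gb(70, 16))  # ~140 GB at fp16: out of reach for consumer hardware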
I’d love to have seen an M4 Ultra with all that memory bandwidth: with 512 GB of RAM, and presumably the ability to use all the CPU cores, GPU cores, and Neural Engine cores, it’s likely still memory-bandwidth constrained. I would note that my laptop is still perfectly interactive at load, with only its 12 cores. I’d expect far more from one of the Mac Studio beasts.
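A crude way to see the bandwidth constraint: for memory-bound token generation, each generated token has to stream roughly the whole weight set through memory once, so tokens/sec is capped by bandwidth divided by weight size. A sketch under assumed bandwidth figures (laptop-class LPDDR5X versus the roughly 800 GB/s class of Ultra unified memory):

    # Upper-bound decode speed if generation is purely memory-bandwidth bound.
    # Bandwidth numbers are assumptions for illustration, not measurements.
    def max_tokens_per_sec(weights_gb: float, bandwidth_gb_s: float) -> float:
        return bandwidth_gb_s / weights_gb

    weights = 35  # ~4-bit 70B model, from the estimate above
    print(max_tokens_per_sec(weights, 135))  # laptop-class LPDDR5X (~135 GB/s assumed): ~4 tok/s ceiling
    print(max_tokens_per_sec(weights, 800))  # Ultra-class unified memory (~800 GB/s): ~23 tok/s ceiling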
We finally have a viable reason mere mortals could make effective use of 512 GB RAM machines: LLMs. Resource constraints of current hardware are the biggest reasons we can’t have a very user-friendly, natural human language interaction hybrid OS using LLMs to interact with humans, and the traditional older style OS as a super powerful traditional computer architecture device driver and terminal layer. The funny thing is with powerful enough LLMs, you can describe what you need, and they can create applications that run within the context of the LLM itself to do what you need, they’re just needing a little bit more access to the traditional OS to carry it out for the GUI. I know, because I’m doing that on my laptop: it’s not fast enough to run all LLMs locally at maximum efficiency for humans yet, but it does work, better than expected.
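To make the "LLM in front, traditional OS behind" idea concrete, here’s a minimal sketch of the shape of it, assuming a locally served model behind an OpenAI-compatible endpoint (the URL and model name are placeholders for whatever your local server exposes):

    # Sketch: ask a locally hosted LLM for a shell command, keep a human in the loop,
    # then let the traditional OS carry it out. Endpoint and model are assumed placeholders.
    import json, subprocess, urllib.request

    def ask_local_llm(prompt: str) -> str:
        payload = {
            "model": "local-70b-instruct",  # hypothetical local model name
            "messages": [
                {"role": "system",
                 "content": "Reply with a single shell command and nothing else."},
                {"role": "user", "content": prompt},
            ],
        }
        req = urllib.request.Request(
            "http://localhost:1234/v1/chat/completions",  # assumed local endpoint
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"].strip()

    cmd = ask_local_llm("Show the largest folders in my home directory.")
    print("Proposed command:", cmd)
    if input("Run it? [y/N] ").lower() == "y":
        subprocess.run(cmd, shell=True)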
Out of genuine curiosity, why would one need to run an LLM locally, especially on a maxed-out M3 Ultra? What are the use cases for such a local LLM?
Data privacy. Using Llama or any of the offline LLMs would finally use the Neural Engines, which at this stage (with Apple being 2-3 years behind the curve) are a future-proofing feature that in the interim will only really be used for local LLMs and things like Adobe’s generative AI for designers, video editors, and motion graphics folks.
This is what I and a lot of other 27-inch iMac owners have been waiting on. My only remaining decision is what monitor to buy. I want to get at least a 32-inch, so I was considering the
Does this seem like a good monitor to pair with my new Studio? I get a $200 Dell credit every 6 months with my Amex Business Platinum card, so that will knock it down to $749.
Nah. We've been waiting on a new 32" iMac.
That would be my preference, but the chances of that are slim to none at this point. I would have been happy with an M4 iMac 27", but that also doesn't seem to be in the cards. My advice is just get a new Studio like the rest of us.
I'm willing to part with some cash to separate the monitor from the computer.
I'll be replacing a 2017 iMac 5K with a 2020 iMac 5K for the family, and this will be the second perfectly good 5K display I'll have to abandon due to the associated compute hardware being obsolete.
M3 Ultra instead of M4 Ultra is likely because Apple was hard at work during 2024 making it happen and would have had to start over if it had skipped directly to an M4 Ultra.
Hopefully the lessons learned with the M3 Ultra can help accelerate an M4 Ultra.
They obviously had to reengineer the Thunderbolt controller for Thunderbolt 5 on the M3 Max, but the interconnect may be something that is much harder to insert into the SoC architecture - otherwise they would've probably made an M4 Ultra.
Maybe some of the high-order address lines were damaged in the fab process for a lot of the defective (binned) chips.
They'd almost certainly want to sell their astronomically priced unified memory if they could.