Genuinely curious, why not include higher capacity memory modules if there's demand for it? I imagine the pin outs are the same for every capacity Apple uses, so why not include more than 192 GB for those tasks that demand greater amounts of memory? Or are M Series chips unable to address that much memory?
Misses the other major benefit of SOC integrated memory. In a traditional system if I want to move an image from memory to a graphics card that image data has to be copied from the CPU's RAM to the RAM on the GPU, byte by byte. Similarly, if I want the GPU to perform some action on that image and return it then the result needs to be copied once more from the GPU back to the CPU.
In Apple's SOC design, you do little more than hand the address of the data in RAM to the GPU, which then can perform the operation in place. You get tremendous gains in throughput when you don't have to copy data back and forth.
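As a rough analogy for the copy-versus-share point above, here is a minimal Python sketch. It is plain CPU code, not GPU code: the two `brighten_*` functions and the 16-byte `image` buffer are hypothetical names made up for illustration, and the "copies" only stand in for the CPU-to-GPU transfers described in the comment.

```python
# Toy analogy only: ordinary Python buffers, no real GPU involved.
image = bytearray(b"\x10" * 16)  # stand-in for image data sitting in RAM


def brighten_with_copy(data: bytearray) -> bytearray:
    """Discrete-GPU style: copy in, process, copy the result back."""
    staged = bytearray(data)              # copy "CPU RAM -> GPU RAM"
    for i in range(len(staged)):
        staged[i] = min(staged[i] + 1, 255)
    return bytearray(staged)              # copy "GPU RAM -> CPU RAM"


def brighten_in_place(view: memoryview) -> None:
    """Unified-memory style: work directly on the caller's buffer."""
    for i in range(len(view)):
        view[i] = min(view[i] + 1, 255)


copied = brighten_with_copy(image)        # two full copies of the data
brighten_in_place(memoryview(image))      # no copies, only a reference
```

Both paths end with the same pixel values; the difference is that the first one moves every byte twice, which is the overhead the shared-memory design avoids.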
This is the main reason why the RAM is soldered. The article's mention of interrupts and related information is technically correct, but irrelevant: Apple's design still has a memory controller and still behaves very similarly to designs with outboard memory controllers (northbridge) or integrated memory controllers (current AMD and Intel processors).
The memory controller has exclusive access to the RAM. Everything else goes through the memory controller to request the contents of an address range. Apple's data sharing is possible because the CPU cores and GPU cores (and Neural Engine cores, etc.) all go through the same memory controller. The performance figures (such as 400 GB/s of memory throughput) are because Apple uses multiple LPDDR5 channels. Each channel gets 50 GB/s. The base M2 has two. The M2 Pro has four. The M2 Max has eight. The M2 Ultra has 16. Each of these channels could lead to a slot if Apple wanted. Yes, there would be a barely-measurable amount of latency added by the longer traces. They would also consume a barely-measurable amount of extra power. Those are insignificant next to the two main reasons Apple doesn't offer slots, though:
- Slots take up a lot more physical space than RAM chips soldered directly to the SoC package
- If you give users slots, they will put RAM in them with different capacities and performance characteristics
For the first one, four memory channels on the M2 Pro would mean four DIMM/SO-DIMM slots. That's not huge, but it's also not nothing. The M2 Max is more significant. Eight DIMM/SO-DIMM slots take up a lot of space. That many sticks of RAM would more than double the size of the MacBook Pro's logic board. 16 DIMM or SO-DIMM slots would roughly double the size of the Mac Studio.
For the second point, with soldered RAM, Apple can guarantee that each channel has the same amount of RAM and each channel's RAM performs the same. This removes a HUGE amount of situational logic to deal with seven sticks at 50 GB/s and one stick at 44.8 GB/s, or five slots populated with 8 GB each and three empty slots. The system can be designed to just assume certain things because Apple can guarantee at a manufacturing level that those assumptions will never be violated.
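The channel arithmetic in this comment can be sketched in a few lines of Python. The per-channel figure and channel counts are the ones quoted above. The `mixed_bandwidth` function rests on a loudly simplified assumption of mine, not anything Apple has documented: that interleaved channels all run at the pace of the slowest one.

```python
PER_CHANNEL_GBPS = 50  # per-channel figure quoted in the comment above

# Channel counts as stated in the comment above.
CHANNELS = {"M2": 2, "M2 Pro": 4, "M2 Max": 8, "M2 Ultra": 16}


def total_bandwidth(chip: str) -> int:
    """Aggregate bandwidth in GB/s when every channel is identical."""
    return CHANNELS[chip] * PER_CHANNEL_GBPS


def mixed_bandwidth(channel_speeds: list[float]) -> float:
    """ASSUMPTION: interleaved access is limited by the slowest channel."""
    return min(channel_speeds) * len(channel_speeds)


for chip in CHANNELS:
    print(f"{chip}: {total_bandwidth(chip)} GB/s")

# Seven sticks at 50 GB/s plus one at 44.8 GB/s, as in the example above.
print(mixed_bandwidth([50.0] * 7 + [44.8]))
```

Under that assumption, a single slower stick costs the whole system bandwidth, which is one way to picture the situational logic soldered RAM lets Apple skip.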
Genuinely curious, why not include higher capacity memory modules if there's demand for it? I imagine the pin outs are the same for every capacity Apple uses, so why not include more than 192 GB for those tasks that demand greater amounts of memory? Or are M Series chips unable to address that much memory?
Why not build a Homer?
People sure like to complain about the 192GB max, but the only honest ones are exceptions, the rest are just negative fucking nellies complaining about a limit they'll never reach, never need and have no skin in the game regarding.
Ok, so the writer gets it wrong, as so many others have when it comes to the M series RAM packaging. One would think that this simple thing would be well understood by now. So let me make it very clear - the RAM is NOT on the chip. It is NOT “in the CPU itself”. As we should all know by now, it’s in two packages soldered to the substrate, which is the small board that the SoC is itself soldered to. The lines from Apple’s fabric, which everything on the chip is connected with, extend to that substrate, to the RAM chips. Therefore, the RAM chips are separate from the SoC, and certainly not in the CPU itself.
As we also know, Apple offers several different levels of RAM for each M series they sell. That means that there is no limit to their ability to decide how much RAM they can offer, up to the number of memory lines that can be brought out. This is no different from any traditional computer. Every CPU and memory controller has a limit as to how much RAM can be used.
So, it seems to me that Apple could, if it wanted to, have sockets for those RAM packages, which add no latency, and would allow exchangeable RAM packages. Apple would just have to extend the maximum number of memory lines out to the socket. How many would get used would depend on the amount of RAM in the package. That’s nothing new. That’s how it’s done. Yes, under that scheme you would have to remove a smaller RAM package when getting a larger one, but that's also normal. The iMac had limited RAM slots and we used to do that all the time.
Apple could also add an extra two sockets, in addition to the RAM that comes with the machine. So possibly there would be two packages soldered to the substrate, and two more sockets for RAM expansion.
Remember that Apple sometimes does something a specific way, not because that’s the way it has to be done, but because they decided that this was the way they were going to do it. We don’t know where Apple is going with this in the future. It’s possible that the M2, which is really just a bump from the M1, is something to fill in the time while we’re waiting for the M3, which with the 3nm process it’s being built on, is expected to be more than just another bump in performance. Perhaps an extended RAM capability is part of that.
Actually, moving the memory further away from the CPU does add latency. Every foot of wire adds about a nanosecond of delay.
Then there is the issue of how many wires you run. When the memory is physically close to the CPU you can run more wires from the memory to the CPU, this allows you to get data to/from the CPU faster. It's not practical to run a large number of wires to a socket that might be a foot or more of cable run away. That means you transfer less data in each clock cycle.
Generally socketed memory is on an external bus. This lets various peripherals directly access memory. The bus arbitration also adds overhead.
Traditional CPUs try to overcome these memory bottlenecks by using multiple levels of cache. This can provide a memory bandwidth performance boost for chunks of recently accessed memory. However, tasks that use more memory than will fit in the cache, may not benefit from these techniques.
Apple's "System on a Chip" design really does allow much higher memory bandwidth. Socketing the memory really would reduce performance.
The trace lengths and widths of PCB memory busses and socket connections are substantially greater than those inside the chip carrier. That adds significant capacitance that must be driven by both the SoC on one end and the memory modules on the other. This takes considerable power. As trace lengths increase, timing skew increases. With clock frequencies approaching 6 GHz, a "bit" is approaching one inch in length (PCB signal propagation speed is about c/2). Matching trace lengths is much harder to do as trace length increases. Keeping signals on the PCB also becomes harder as trace lengths approach 1/4 wavelength, turning traces into antennae. There are very good reasons for Apple to take the path they're on. Phones, tablets and laptops are the bulk of Apple's business and those products don't want server farm memory architectures.
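The "one inch per bit" figure can be checked with a short calculation. This is a back-of-the-envelope sketch using only the numbers in the comment above (6 GHz, propagation at about half the speed of light); the function names are my own, and real boards vary with the dielectric.

```python
C = 299_792_458.0        # speed of light in vacuum, m/s
METERS_PER_INCH = 0.0254


def bit_length_inches(freq_hz: float, velocity_factor: float = 0.5) -> float:
    """Distance a signal travels during one clock period, in inches."""
    return (C * velocity_factor / freq_hz) / METERS_PER_INCH


def delay_ns_per_foot(velocity_factor: float = 0.5) -> float:
    """Propagation delay over one foot of trace, in nanoseconds."""
    return 0.3048 / (C * velocity_factor) * 1e9


one_bit = bit_length_inches(6e9)
print(f"bit length at 6 GHz: {one_bit:.2f} in")
print(f"quarter wavelength:  {one_bit / 4:.2f} in")
print(f"delay per foot:      {delay_ns_per_foot():.2f} ns")
```

At c/2 the bit length comes out just under one inch, so a trace only a quarter of an inch long is already a quarter wavelength at that frequency.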
Having the memory as close as possible to the CPU allows the memory to be clocked faster with less latency. It also minimizes other issues that slow down the system, like electromagnetic interference, signal integrity degradation, and the complexity of routing so many signals on a board. Moreover, every turn or via on a board, and every added solder joint from connectors and connections to memory boards, adds more delay and more places where the contact might fail due to corrosion, cracked solder joints, flex from heating and cooling, etc.
Routing on the package and die mounting allows the most density, faster speed, and greater reliability, and reduces cost, since you do not incur additional cost for extra memory packages, large packages with many pins, or complicated board routing, and cooling of the CPU and memory is simpler due to proximity. However, you give up flexibility for future expansion. Nevertheless, by the time you outgrow your system the CPU is so dated that you end up buying a new computer.
So every single PC or computer manufacturer can use modular RAM. Including servers, workstations, data centers, super computers.
But somehow the Apple chips cannot and are trying to convince us that it is not just plain & simple greed??
Vertical system integration doesn't necessarily equal greed. It is just another way of doing things. As the article points out, there are many benefits to having a tightly integrated system, as well as trade-offs. It would be greedy if the systems were capable of supporting modular RAM and Apple was simply preventing it. They're not. They have stated that their design goal for these SoCs was mostly about efficiency and the best way to achieve that was by integrating everything and not supporting off-SoC resources other than I/O.
No one is losing money or gaining money from Apple's systems having UMA. If you need more memory down the road, you can simply upgrade to a new system and sell the old one to make up for much of that cost. A vast majority of people never upgrade anything in their computers. When it gets old, they throw it out and get a new one. The PC market relies on this turnaround.
Food for thought: GPU cards have never had upgradable memory; you're stuck with what it came with. Why do you think that is?
Yes. It is common knowledge that RAM is not part of the actual SoC, it's on package with the SoC. People use the term "SoC" to describe the whole part ("M1", "M2") which includes the RAM.
Being able to control how much RAM is installed allows Apple to guarantee that all memory channels are filled and being utilized, maximizing performance.
Genuinely curious, why not include higher capacity memory modules if there's demand for it? I imagine the pin outs are the same for every capacity Apple uses, so why not include more than 192 GB for those tasks that demand greater amounts of memory? Or are M Series chips unable to address that much memory?
Do you have evidence of this demand that would yield a substantial profit to Apple?
So every single PC or computer manufacturer can use modular RAM. Including servers, workstations, data centers, super computers.
But somehow the Apple chips cannot and are trying to convince us that it is not just plain & simple greed??
1) Macs fall under the category of PC and computer.
2) Are you claiming that all WinPCs have slotted RAM for the GPU? If so, that's far from correct, and since this unified RAM is used by the GPU your attempt at an argument has no basis in reality.
Being able to control how much RAM is installed allows Apple to guarantee that all memory channels are filled and being utilized, maximizing performance.
Yes, and do not forget another point: scalability. Memory bandwidth scales up with the chips, from M2 to M2 Pro to M2 Max to M2 Ultra. The number of channels goes up with the chips.
According to AnandTech, the M1 had 8 memory channels, the M1 Pro 16, the M1 Max 32, and the M1 Ultra 64. If I understood correctly, a DIMM includes two memory channels. This means that to get similar performance, you would need 8 DIMMs in parallel for the M1 Pro, 16 for the M1 Max, and 32 for the Ultra. I suppose the M2 numbers are similar or a bit larger. It doesn't seem very practical to put 8 DIMMs in a Mac mini, or 16 in a Mac Studio or a MacBook, and that is before power considerations. I do not think there is a practical alternative for Apple.
For the Mac Pro, yes, there could be alternatives, like a NUMA architecture (SoC memory plus external memory). But that would require specific chips and specific extensions of the OS, to cover the market for which 192 GB of RAM is not enough. Does the size of this market justify the investment? I do not know.
For the author of the article: go back to your computer architecture 101 lessons. DMA is not for standard CPU memory access, computers have not used memory busses for at least 20 years, and interrupts are not used during normal memory access. DMA is used for I/O, to allow memory access without going through the CPU. And this was also true in the 6502 days; I think the situation is a little more complex today.
However, "It's why macOS finally feels snappy and responsive after feeling slightly rubbery for decades." is wrong. macOS on Intel has been snappy and responsive for quite a long time now.
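The DIMM count in this comment follows from simple division. A minimal sketch, taking the channel counts attributed to AnandTech and the two-channels-per-DIMM figure exactly as the commenter states them (neither is verified here):

```python
CHANNELS_PER_DIMM = 2  # as assumed in the comment above

# Channel counts as quoted in the comment above.
M1_CHANNELS = {"M1": 8, "M1 Pro": 16, "M1 Max": 32, "M1 Ultra": 64}


def dimms_needed(channels: int) -> int:
    """DIMMs required to expose every channel, rounding up."""
    return -(-channels // CHANNELS_PER_DIMM)  # ceiling division


for chip, channels in M1_CHANNELS.items():
    print(f"{chip}: {channels} channels -> {dimms_needed(channels)} DIMMs")
```

Even the base chip would need several modules on these numbers, which is the practicality problem the comment is pointing at.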
Genuinely curious, why not include higher capacity memory modules if there's demand for it? I imagine the pin outs are the same for every capacity Apple uses, so why not include more than 192 GB for those tasks that demand greater amounts of memory? Or are M Series chips unable to address that much memory?
Are you sure they don't already use the highest density memory chips (considering size constraints) on those packages!? They need 8 (24GB) chips to achieve that 800GB/s bandwidth (1 chip per 100GB/s controller). Another thing to consider is the amount of power drawn by the increase in memory. Having that much power draw in such a relatively small area might be limiting.
With the M2 Max topping out at 96GB (4x 24GB), I'm thinking the 24GB chips they're currently using are probably the highest capacity chips they could fit in the given area.
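The capacity and bandwidth ceilings described in these two comments are straight multiplication. A small sketch using only the commenter's figures (24 GB per chip, one chip per 100 GB/s controller); the chip counts are taken from the comments, not from Apple documentation:

```python
GB_PER_CHIP = 24         # largest chip size suggested in the comments above
GBPS_PER_CONTROLLER = 100  # one memory chip per 100 GB/s controller

# Number of memory chips per package, per the comments above.
CHIPS = {"M2 Max": 4, "M2 Ultra": 8}

for name, count in CHIPS.items():
    capacity = count * GB_PER_CHIP
    bandwidth = count * GBPS_PER_CONTROLLER
    print(f"{name}: {count} chips -> {capacity} GB, {bandwidth} GB/s")
```

On these assumptions, going past 192 GB means either denser chips or more controllers, not just a different part on the same pads.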
The Mac Pro could be made to allow the use of PCI-based RAM as 2nd-tier memory. Then the SoC RAM would act as a rather huge cache. Alternatively, code could be programmed to perform certain less time-sensitive operations in that 2nd-tier RAM. This would then give users the opportunity to upgrade their RAM with aftermarket hardware.
The Mac Pro could be made to allow the use of PCI-based RAM as 2nd-tier memory. Then the SoC RAM would act as a rather huge cache. Alternatively, code could be programmed to perform certain less time-sensitive operations in that 2nd-tier RAM. This would then give users the opportunity to upgrade their RAM with aftermarket hardware.
Why? Mac computers are not for you. Just buy a PC.
The Mac Pro could be made to allow the use of PCI-based RAM as 2nd-tier memory. Then the SoC RAM would act as a rather huge cache. Alternatively, code could be programmed to perform certain less time-sensitive operations in that 2nd-tier RAM. This would then give users the opportunity to upgrade their RAM with aftermarket hardware.
An interesting idea. I suspect that many would complain that Apple was purposely throttling the performance of the third party socketed memory.
In any case, that socketed memory would not provide the same level of performance as the main memory. This means it may not meet the needs of those who really need huge amounts of memory. The current M2 chips use the SSD as backing. This "virtual memory" scheme allows programs to address more memory than physically exists. Memory above and beyond what exists is stored on the SSD, and the actual memory is used as a "cache" for the memory currently in use.
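The "RAM as a cache for the SSD" idea can be illustrated with a toy pager. This is a minimal sketch, not how macOS works: the `PagedMemory` class is hypothetical, it uses a simple least-recently-used eviction policy that I chose for illustration, and a dictionary stands in for the SSD.

```python
from collections import OrderedDict


class PagedMemory:
    """Toy demand pager: a small 'RAM' backed by a larger 'SSD' store."""

    def __init__(self, ram_pages: int) -> None:
        self.ram_pages = ram_pages
        self.ram: OrderedDict[int, str] = OrderedDict()  # oldest entry first
        self.ssd: dict[int, str] = {}
        self.faults = 0

    def access(self, page: int) -> str:
        if page in self.ram:                 # hit: mark as most recently used
            self.ram.move_to_end(page)
            return self.ram[page]
        self.faults += 1                     # miss: fetch from backing store
        data = self.ssd.pop(page, f"page-{page}")
        if len(self.ram) >= self.ram_pages:  # RAM full: evict the LRU page
            victim, victim_data = self.ram.popitem(last=False)
            self.ssd[victim] = victim_data
        self.ram[page] = data
        return data


mem = PagedMemory(ram_pages=2)
for p in (1, 2, 1, 3):
    mem.access(p)
print(mem.faults, sorted(mem.ram), sorted(mem.ssd))
```

The program can touch three pages even though only two fit in "RAM"; the cost shows up as page faults, which is why this scheme does not replace real memory for workloads that need all of it at once.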
So every single PC or computer manufacturer can use modular RAM. Including servers, workstations, data centers, super computers.
But somehow the Apple chips cannot and are trying to convince us that it is not just plain & simple greed??
Where do I plug RAM into my iPad? iPhone?
Yeah, not a greed thing. It’s an architecture thing. One line of extremely high performing chips for their product line is just smart.
Charging $200 for an 8GB RAM upgrade is not a "greed thing"? Charging $200 for a 256GB SSD upgrade is not a "greed thing"? Those are up to 10X the cost a PC user with modular upgrades pays. It most certainly is part greed. But then, Apple couldn't make that $3T valuation, could they?
People sure like to complain about the 192GB max, but the only honest ones are exceptions, the rest are just negative fucking nellies complaining about a limit they'll never reach, never need and have no skin in the game regarding.
MacRumors would like a word with those people.
2) Are you claiming that all WinPCs have slotted RAM for the GPU? If so, that's far from correct, and since this unified RAM is used by the GPU your attempt at an argument has no basis in reality.
No.
Apple's architecture is delivering incredible performance for the price, but all you see is greed? Hmmm...