Originally posted by Amorph
The only way I can make sense of it is if the 970's companion chip has a bus for direct connection to other 970 companion chips. That could be ApplePI (eWeek) or HyperTransport (CNET) or - heck - both, if ApplePI is a protocol.
I don't think that's out of the question at all. It's probably one of the more elegant and efficient ways to handle the sort of grunt work that can cripple multiprocessor efficiency (such as keeping the caches coherent, which in plain English means making sure that every CPU is operating on the most current data). It ensures that the memory controller doesn't become any more of a Grand Central Station than it will already be if it has a robust DMA engine on board.
No way. Having both processor buses connected to a single memory controller chip is the most efficient method (in this instance). That chip can then also provide a large shared L3 cache (even one made of embedded DRAM would be fine). The logic for detecting coherency issues, i.e. whether data is dirty or not, benefits from being in one place and not having to compare notes with another chip.
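To show what I mean by keeping the dirty-data logic in one place, here is a toy sketch in Python. Everything in it (the CentralController class, the dirty_owner table, the writebacks counter) is made up for illustration. It says nothing about how the 970's companion chip actually works, and real hardware tracks cache lines and bus transactions, not dictionary entries.

```python
class CentralController:
    """Toy memory controller that keeps all coherency state in one place.

    Hypothetical model for illustration only, not any real chip's design.
    """

    def __init__(self, num_cpus=2):
        self.memory = {}                                  # addr -> value in main memory
        self.caches = [dict() for _ in range(num_cpus)]   # per-CPU cache: addr -> value
        self.dirty_owner = {}                             # addr -> CPU holding a modified copy
        self.writebacks = 0                               # how many dirty copies were flushed

    def _flush(self, addr):
        # If some CPU holds a modified copy, write it back to memory first.
        owner = self.dirty_owner.pop(addr, None)
        if owner is not None:
            self.memory[addr] = self.caches[owner][addr]
            self.writebacks += 1

    def read(self, cpu, addr):
        # A hit in the CPU's own cache is always current, because writes
        # by other CPUs invalidate this copy (see write below).
        if addr in self.caches[cpu]:
            return self.caches[cpu][addr]
        self._flush(addr)
        value = self.memory.get(addr, 0)
        self.caches[cpu][addr] = value
        return value

    def write(self, cpu, addr, value):
        # Flush another CPU's dirty copy, then invalidate all other copies.
        if self.dirty_owner.get(addr) != cpu:
            self._flush(addr)
        for other, cache in enumerate(self.caches):
            if other != cpu:
                cache.pop(addr, None)
        self.caches[cpu][addr] = value
        self.dirty_owner[addr] = cpu
```

The point is that a single lookup in dirty_owner answers "is anyone holding newer data?", so no second chip has to be asked before a read or write can proceed.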
If I were Apple (I know, I know, I'm not), I would design a very small dedicated memory controller chip that acted like a router, ensuring that the requested data gets to the correct CPU or peripheral chip. By making the chip as small as possible, you could leave room on board for some kind of cache, even if only as buffering. I would design the chip the same for uni- or dual-processor operation and have a high-speed HT link between it and the rest of the system. And I would buy the chip that supplies PCI, AGP, Ethernet, FireWire, USB (2), blah blah blah from someone else (nVidia?) and concentrate my resources on uniqueness.
But they specifically talk about a connection between processors, and we just don't know enough about the chip to do anything better than guess.