PC3200 and 970s?

Posted:
in Future Apple Hardware edited January 2014
Am I correct that only PC-3200 would come close to having enough bandwidth to feed a SINGLE 1.8 GHz 970? What options are on the horizon?

Comments

  • Reply 1 of 29
    amorph Posts: 7,112 member
    Yes. Furthermore, if Apple wants to extend their current architecture then they'd have to go dual channel, or quad for dual 970s.



    Why do that? Because for all that's called a hack, the current architecture has some real advantages. With QE, the graphics card needs to be fed as well as the CPU, so memory should be able to saturate both at once (at least insofar as the busses allow). Currently, it can. The fact that there's memory bandwidth enough for all the other I/O channels at once is a bonus. It means that the machine as a whole doesn't start choking when you really push it hard. I'd like to see Apple continue down this road, and their OS seems to indicate that they will.
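To put rough numbers on the question, here is a back-of-envelope sketch in Python. The PC3200 figure follows from the module name (3200 MB/s peak per 64-bit channel), and the 970 figure is the 6.4 GB/s quoted later in this thread for a 1.8 GHz part. The function and constant names are made up for illustration, and these are peak rates, not sustained throughput.

```python
import math

PC3200_GBPS = 3.2   # peak GB/s per 64-bit PC3200 channel (from the module name)
FSB_970_GBPS = 6.4  # GB/s quoted in this thread for a 1.8 GHz 970, both directions combined


def channels_needed(cpus, per_cpu_gbps=FSB_970_GBPS, per_channel_gbps=PC3200_GBPS):
    """Memory channels needed to match the combined peak FSB bandwidth."""
    return math.ceil(cpus * per_cpu_gbps / per_channel_gbps)


for cpus in (1, 2):
    print(cpus, "CPU(s):", channels_needed(cpus), "PC3200 channel(s)")
```

Under those assumptions the arithmetic lines up with the post above: two channels for one 970, four for a pair. It ignores the graphics card and other I/O, which would only push the requirement higher.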
  • Reply 2 of 29
    ompus Posts: 163 member
    If PC 3200 is the fastest memory we can expect in 2003, and if it can barely keep up with a SINGLE 970, then isn't it highly unlikely that we'll see DUAL 970s in 2003?
  • Reply 3 of 29
    amorph Posts: 7,112 member
    That depends on the architecture Apple settles on. Dual channel SDRAM (two banks running in parallel) is already deployed on Athlon-based motherboards. They could both feed into one (really fast) memory controller that in turn fed the processor bus (and whoever else needed to be fed). That's one option.



    The other is to give each CPU its own bank of RAM, and hook the CPU/RAM units together on a high-speed fabric. I wouldn't be surprised if Apple goes this way. This is the approach used in SGI workstations. It wasn't cost-effective in PCs until HyperTransport and RapidIO appeared, and even with those it won't be easy (because the system has to keep all those banks of RAM synchronized, and enable one CPU to grab a value from another's RAM). There would probably have to be another bank of RAM to sate AGP and PCI and I/O channels as well, if that's feasible.



    Caveat: I'm getting in over my head, here, so I fully expect one of the more hardware-minded folks here to correct something I've said. Don't tattoo this post on your forehead or anything.
  • Reply 4 of 29
    brussell Posts: 9,812 member
    [quote]Originally posted by Amorph:

    <strong> Don't tattoo this post on your forehead or anything. </strong><hr></blockquote>Damn, you could have made that your first sentence instead of your last. These things are permanent you know.
  • Reply 5 of 29
    I don't think we'll see the RAM-proc-"North Bridge" combo with IBM Power-PCs until the Power5 variant (PPC 980?) hits the streets. The Moto 7457-RM will also be of this design, and it really is the future, despite the interesting implications of a NUMA design for cache coherency and DMA access for peripherals.

    The 6.4 GB/s of the 970 @ 1.8 GHz is actually 2*3.2 GB/s (one 3.2 GB/s path in each direction), so it won't be as good as other solutions for some algorithms.

    According to IBM, the 970 can work in a shared FSB mode, but this seems rather strange considering the design (2*32 bit one-way busses), and has caused some speculation over at Ars (the perpetual future Apple CPU thread).
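A small sketch of why the 2*3.2 split matters, assuming the simplest possible model: two independent one-way links, each capped at 3.2 GB/s, and traffic that divides into a fixed read/write mix. The function name and the model itself are illustrative assumptions, not anything from IBM's documentation.

```python
def usable_gbps(read_fraction, per_direction_gbps=3.2):
    """Peak total bandwidth on two one-way links for a given read/write mix.

    read_fraction is the share of traffic flowing from memory to the CPU
    (0.0 to 1.0). The busier direction saturates first and caps the total.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0.0 and 1.0")
    busiest_share = max(read_fraction, 1.0 - read_fraction)
    return per_direction_gbps / busiest_share


for mix in (0.5, 0.75, 1.0):
    print(f"{mix:.0%} reads: {usable_gbps(mix):.2f} GB/s usable")
```

In this model only a perfectly balanced mix reaches the full 6.4 GB/s. A streaming read with no writes tops out at 3.2 GB/s, which is the "not as good for some algorithms" case.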
  • Reply 6 of 29
    wmf Posts: 1,164 member
    [quote]Originally posted by Amorph:

    <strong>The other is to give each CPU its own bank of RAM, and hook the CPU/RAM units together on a high-speed fabric.</strong><hr></blockquote>



    That sounds like it would involve a northbridge chip for each CPU (expensive) instead of one northbridge for the whole system as we have today. If the RAM attached directly to the CPU it would be different, but that's not the case for the 970.
  • Reply 7 of 29
    programmer Posts: 3,457 member
    There is actually good reason to have a dual northbridge system, and while it might be a bit pricey we might still see it appear.



    For starters, Apple will only want to design one chip so it either has 2 970-FSB ports, or they use two of them and interconnect (perhaps with RapidIO, and/or ApplePI). A single chip with two ports would be much larger, have many pins, and thus more expensive, whereas building two simpler ones gives you additional volume to help drive costs down.



    These chips communicate with the 970 via a very high speed bus -- 900 MHz in the case of the 1.8 GHz chip. This will require a high quality circuit board and the shortest traces they can manage. One way to do this and minimize costs is to mount the high speed parts on a daughtercard. The rest of the motherboard is much like the current ones, and the interconnect is via something like RapidIO or HyperTransport. With dual northbridges they have two options: put both processors & northbridges on the same daughtercard, or on separate daughtercards. If they are together they can use a high speed interconnect (ApplePI); if they are separate they can use something like RapidIO. Two single-processor cards have the advantage that single and dual processor machines use the same card... but it's slower.



    The third question is where the memory lives and how it is divided. Motherboard? Daughtercard? Managed separately by each northbridge (if there are two), or somehow shared? If separate, a NUMA architecture is needed... and again this is something RapidIO is designed for. If shared, then somehow both northbridges need to talk to the controller, but HyperTransport can do this (with multiple ports on the memory controller).



    Lots of options.
  • Reply 8 of 29
    nevyn Posts: 360 member
    [quote]Originally posted by Programmer:

    <strong>Lots of options.</strong><hr></blockquote>



    Can you (or anyone) definitively say the PPC 970's 2x one-way buses aren't a variant of RapidIO or HT? As far as I could tell (which isn't saying much) the description of the 970's FSB uses pretty much the same terminology, and could pretty easily be an understatement of that one aspect. Or are they possibly set up the way the Power4's bus lines were, such that 4 of the chips could be assembled into a larger module by just rotating each 90 degrees relative to the previous chip?



    It just seems odd that Apple would end up transitioning to a whole new set of northbridges for just a year or so until their suppliers switch over to RIO. And once Apple had a dual RIO CPU setup, it should be possible to step to more processors with off the shelf RIO switches...



    I don't know - just wild-assed guessing. The whole RIO thing screams "Supercomputer's switched fabric backplane in very thin disguise" to me.
  • Reply 9 of 29
    powerdoc Posts: 8,123 member
    I doubt that Apple will ever use PC 3200 memory.

    First of all, this type of memory is not officially recognized by the consortium of producers.

    PC 3200 memory is not officially supported in the PC world by motherboard makers.

    PC 3200 lacks stability and performance. CL2 PC 2700 performs better than CL2.5 PC 3200.



    In the future it will be better to use DDR 2 memory, or double channel PC 2700.
  • Reply 10 of 29
    wmf Posts: 1,164 member
    [quote]Originally posted by Programmer:

    <strong>There is actually good reason to have a dual northbridge system, and while it might be a bit pricey we might still see it appear.



    For starters, Apple will only want to design one chip so it either has 2 970-FSB ports, or they use two of them and interconnect (perhaps with RapidIO, and/or ApplePI). A single chip with two ports would be much larger, have many pins, and thus more expensive, whereas building two simpler ones gives you additional volume to help drive costs down.</strong><hr></blockquote>



    My only concern with that approach is that a northbridge with one 970 FSB and one other high-speed port doesn't sound any simpler than a northbridge with two 970 FSBs.



    And who says Apple's suppliers are switching to RapidIO? Once you have the 970 FSB, why switch?
  • Reply 11 of 29
    barto Posts: 2,246 member
    QDR! Go QDR! In production now, cheaper than DDR-II and backwards compatible with DDR! Go QDR! Woohoo!



    Barto



    PS I hope the Power Mac G5 uses quad channel QDR400.
  • Reply 12 of 29
    bacon Posts: 15 member
    Regardless of the ultimate solution, a dual 970 system would necessitate some pricey components. Such a system would be as expensive as it would be fast.
  • Reply 13 of 29
    Programmer, these diagrams were made by "I Want X" on the Ars forums, and show what you're saying with respect to having a "North Bridge" on the daughter board, and then RIO/HT links to other daughter boards.

    For a single proc system:

    [diagram not preserved]

    and for a dual proc system:

    [diagram not preserved]

    Now of course with these, the North Bridges could be the same, and it could even be possible to put all of the processors and "companion" chips on to daughter boards, and then have a RIO/HT link to the mother, meaning that the same motherboard could be used for all the systems.
  • Reply 14 of 29
    [quote]Originally posted by wmf:

    <strong>My only concern with that approach is that a northbridge with one 970 FSB and one other high-speed port doesn't sound any simpler than a northbridge with two 970 FSBs.



    And who says Apple's suppliers are switching to RapidIO? Once you have the 970 FSB, why switch?</strong><hr></blockquote>



    Different busses are designed for very different purposes. The 970 FSB is designed for very short path, point-to-point, high speed communication. This is similar to HyperTransport, but it isn't HT.



    RapidIO, on the other hand, is being pushed by Motorola, IBM and others. It is designed to connect together many chips in a kind of on-board packet network. It is slower (currently maxing out at around 2 GB/sec in the fastest implementation, IIRC). It uses fewer, slower traces (well, slower than the 970 FSB anyhow).



    The reason to use both is to separate the high speed components from the lower-speed components, which allows costs to be minimized. The "companion chip" has to run at the FSB's speed, but the I/O system doesn't. If you integrate them all into one chip, however, then the whole thing has to be able to run fast and yields will be lower.



    I notice that my terminology doesn't match those diagrams -- above I was using "northbridge" to mean the "companion chip", whereas they are called out separately on the diagrams from Ars. I'll switch to using "companion chip".



    [ 01-11-2003: Message edited by: Programmer ]
  • Reply 15 of 29
    No, you're right. Bad Andy pointed out that traditionally, the North Bridge is just whatever is connected to the processor and contains the memory controller. It is just that in recent times we in the Macintosh community have become used to a kitchen sink style NB, which contains all high speed peripherals. What we may be seeing come back is a NB which just has a memory controller and a RIO/HT output, up from the traditional PCI connection.
  • Reply 16 of 29
    [quote]Originally posted by Programmer:

    <strong>



    Different busses are designed for very different purposes. The 970 FSB is designed for very short path, point-to-point, high speed communication. This is similar to HyperTransport, but it isn't HT.



    RapidIO, on the other hand, is being pushed by Motorola, IBM and others. It is designed to connect together many chips in a kind of on-board packet network. It is slower (currently maxing out at around 2 GB/sec in the fastest implementation, IIRC). It uses fewer, slower traces (well, slower than the 970 FSB anyhow).

    </strong><hr></blockquote>



    Up to a point, Lord Copper.

    Actually current RapidIO runs at a base frequency of 500MHz, with DDR to give an effective 1GHz signal rate, so it is clocked faster than the 970's bus. It is, however, 16 bit each way instead of 32 bit each way, so fewer traces, but gets to 4GB/s bidirectional bandwidth, 2GB/s unidirectional. RapidIO and HyperTransport are also both point to point.

    This confusion arises because of marketing people's preference for stating the highest number when quoting speeds. DDR266 is, of course, a 133MHz bus, and the P4's 533MHz is actually also a 133MHz bus using QDR signalling. The 970's bus will actually clock at twice the speed of the P4's upcoming (so-called 800MHz) bus.

    I cannot be certain that the 970 bus uses low voltage differential signalling, but strongly suspect it does, and if that is added to IBM's wave pipelining techniques and deskewing circuitry, it may be that the 970 bus is actually easier to route than RapidIO or HyperTransport. If the bus is not differential then very short traces will be required.



    michael
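The "marketing MHz" point can be written out as arithmetic. This is a minimal sketch using only the clocks and widths mentioned in the post above (133 MHz rounded from 133.33, and a 200 MHz base assumed for the so-called 800MHz P4 bus); the function names are made up for illustration.

```python
def effective_mhz(base_mhz, transfers_per_clock):
    """Quoted ("marketing") rate: base clock times transfers per clock."""
    return base_mhz * transfers_per_clock


def peak_gbps(base_mhz, transfers_per_clock, width_bits):
    """Peak one-way bandwidth in GB/s for a link of the given width."""
    return effective_mhz(base_mhz, transfers_per_clock) * width_bits / 8 / 1000


print("DDR266:", effective_mhz(133, 2), "MHz effective on a 133 MHz clock")
print("P4 '533':", effective_mhz(133, 4), "MHz effective on a 133 MHz clock")
print("P4 '800':", effective_mhz(200, 4), "MHz effective on a 200 MHz clock")
print("RapidIO 16-bit:", peak_gbps(500, 2, 16), "GB/s each way")
```

With a 500 MHz base clock, DDR signalling, and 16 bits each way, the RapidIO line works out to 2 GB/s per direction, matching the 2GB/s unidirectional and 4GB/s bidirectional figures in the post.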
  • Reply 17 of 29
    outsider Posts: 6,008 member
    Another thing to consider is the engineering involved in making a 900MHz 32-bit (times 2) bus travel through the physical connection between the daughterboard and motherboard. So putting the main memory controller and IO connector on the daughter card connecting to the main peripheral controller on the motherboard using a bus designed for this like HT or RIO would be the ideal solution. But riddle me this: Unless you plan on putting the memory slots on the daughter card (unlikely but conceivable), you would still need a 64-bit interface to the slots on the motherboard coming out of the physical connector between processor card and motherboard. That's not too bad but with a NUMA style architecture you'll need a second set of traces to the motherboard via the connector. So either slots on the daughter-card so you can design the mobo to accept dual or single processor modules, or the mobo needs to be designed specifically for dual machines and only for duals. Unless the controller can auto-sense the number of processors and assign all slots or just half to each processor. Interesting to see how they will work this out.
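The "auto-sense" idea at the end of that post could look something like the sketch below. It is purely hypothetical: the slot names, the function, and the even-split policy are all assumptions made for illustration, not a description of any real controller.

```python
from typing import Dict, List


def assign_slots(slots: List[str], cpu_count: int) -> Dict[int, List[str]]:
    """Split memory slots evenly among the detected processors.

    One processor gets every slot; two processors get half each, as a
    NUMA-style layout would need.
    """
    if cpu_count < 1:
        raise ValueError("need at least one processor")
    if len(slots) % cpu_count != 0:
        raise ValueError("slots cannot be divided evenly among processors")
    per_cpu = len(slots) // cpu_count
    return {cpu: slots[cpu * per_cpu:(cpu + 1) * per_cpu]
            for cpu in range(cpu_count)}


slots = ["A", "B", "C", "D"]  # hypothetical slot labels
print(assign_slots(slots, 1))
print(assign_slots(slots, 2))
```

The bookkeeping is the easy part. The hard part the post is pointing at is physical: each processor's half still needs its own set of traces through the connector.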
  • Reply 18 of 29
    programmer Posts: 3,457 member
    [quote]Originally posted by mmicist:

    <strong>

    Actually current RapidIO runs at a base frequency of 500MHz, with DDR to give an effective 1GHz signal rate, so it is clocked faster than the 970's bus. It is, however, 16 bit each way instead of 32 bit each way, so fewer traces, but gets to 4GB/s bidirectional bandwidth, 2GB/s unidirectional. RapidIO and HyperTransport are also both point to point.

    </strong><hr></blockquote>



    Well I don't want to get into an argument about the details of bus design (I'll lose), but I'll point out some differences...
    • RapidIO is packet based and thus has only data lines. Addresses are sent as part of the packet. The system is designed to be part of a fabric switched network.

    • The 970's FSB has an address bus (46 lines or so?) and operates as a more traditional pipelined, split transaction bus.

    • RIO will eventually scale to almost 8 GB/sec raw bandwidth (according to the RIO FAQs). At the moment, however, they are well short of that. The packet-oriented nature of the bus exacts a 5-25% overhead from its theoretical transmission speed.

    • The 970 FSB is synchronous, locked at half the processor's clock rate. This means its highest initial rate is 900 MHz, but as the 970's speed increases (and I suspect we'll see it reach at least 2.5 GHz) the bus speed will increase (to 1.25 GHz if I'm right about the 2.5 GHz). That rate applies to both the data and address busses.

    • The 970 FSB will have a much lower potential latency than the packet oriented RIO bus.

    • RIO is designed to be scalable from slow implementations to very fast ones, whereas the 970 is pretty much just always fast.
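Two of the points in the list above reduce to simple arithmetic, sketched here. The half-clock ratio and the 5-25% packet overhead come from the list itself; the 8 GB/s raw figure stands in for the "almost 8 GB/sec" ceiling, and the function names are made up for illustration.

```python
def fsb_mhz(cpu_mhz):
    """970 FSB rate, locked at half the processor clock."""
    return cpu_mhz / 2


def rio_effective_gbps(raw_gbps, overhead):
    """RapidIO payload bandwidth after packet overhead (0.05 to 0.25)."""
    return raw_gbps * (1.0 - overhead)


for cpu in (1800, 2500):
    print(f"{cpu} MHz CPU -> {fsb_mhz(cpu):.0f} MHz FSB")

for overhead in (0.05, 0.25):
    print(f"8 GB/s raw at {overhead:.0%} overhead -> "
          f"{rio_effective_gbps(8.0, overhead):.1f} GB/s")
```

So even at its eventual ceiling, RapidIO would deliver roughly 6.0 to 7.6 GB/s of payload under these assumptions, while the FSB rate scales with every processor speed bump.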

  • Reply 19 of 29
    amorph Posts: 7,112 member
    I'm also hoping - perhaps foolishly - that they'll find a way to ensure that there's enough memory bandwidth left over to keep everything else on the motherboard happy. That will help them squeeze every last bit of performance out of their hardware, which is especially useful considering that they're beginning to rely on that (Quartz Extreme).



    Given the complexity that's already involved in implementing a traditional NUMA architecture with the technology the 970's built around, I'm not sure how they'd go about doing that. But some part of me is sure that they will, and I wouldn't be surprised if the board looks nothing like anything that's ever come out of a PC vendor as a result.
  • Reply 20 of 29
    [quote]Originally posted by Amorph:

    <strong>I'm also hoping - perhaps foolishly - that they'll find a way to ensure that there's enough memory bandwidth left over to keep everything else on the motherboard happy. That will help them squeeze every last bit of performance out of their hardware, which is especially useful considering that they're beginning to rely on that (Quartz Extreme).



    Given the complexity that's already involved in implementing a traditional NUMA architecture with the technology the 970's built around, I'm not sure how they'd go about doing that. But some part of me is sure that they will, and I wouldn't be surprised if the board looks nothing like anything that's ever come out of a PC vendor as a result.</strong><hr></blockquote>



    Oh, I'm pretty sure that'll be true (the "it'll look like nothing from a PC vendor before" part).



    I'm doubtful that all the PowerMacs will have enough memory bandwidth to feed the 970 (or two of them) and all the I/O ports, and the GPU. At least not when all of them are going full tilt. That really doesn't matter, however, because it gives Apple room to maneuver & improve their designs. Right now they've maxed out the MPX bus and don't really have any options. In the new scheme the busses have legs so Apple has options. And we know they like options.