Programmer, are you sure about the separate address bus? Page 12 of the mpf .pdf says that address, data, and control are multiplexed in. Also, the 6.4 GB/s is usable bandwidth, brought down from 7.2 GB/s because of address and protocol overhead. The pdf says that there is a separate side bus for cache snooping and ACK (acknowledge?).
I do not think that the 970 will reach 2.5 GHz on the .13 process, especially considering that the same pdf only gives 1.4-1.8 GHz, and does not describe those as initial speeds.
Well I don't want to get into an argument about the details of bus design (I'll lose), but I'll point out some differences...
- RapidIO is packet based and thus has only data lines. Addresses are sent as part of the packet. The system is designed to be part of a fabric switched network.
- The 970's FSB has an address bus (46 lines or so?) and operates as a more traditional pipelined, split transaction bus.
- RIO will eventually scale to almost 8 GB/sec raw bandwidth (according to the RIO FAQs). At the moment, however, they are well short of that. The packet-oriented nature of the bus exacts a 5-25% overhead from its theoretical transmission speed.
- The 970 FSB is synchronous, locked at half the processor's clock rate. This means its highest initial rate is 900 MHz, but as the 970's speed increases (and I suspect we'll see it reach at least 2.5 GHz) the bus speed will increase (to 1.25 GHz if I'm right about the 2.5 GHz). That rate applies to both the data and address busses.
- The 970 FSB will have a much lower potential latency than the packet oriented RIO bus.
- RIO is designed to be scalable from slow implementations to very fast ones, whereas the 970 is pretty much just always fast.
</strong><hr></blockquote>
Actually, I believe the 970's bus is also packet based, which explains the 12% overhead, reducing the useful bandwidth from 7.2 to 6.4 GB/s for the 900 MHz version. The 970 has 42-bit addressing but no address lines.
At higher frequencies the 970 may drop down to a lower divisor for the bus (e.g. a 2.4 GHz 970 using 2400 / 6 * 2 = 800 MHz bus), but I'd be disappointed by that. Of course I haven't seen any proof of this yet either, but here follows a quote from <a href="http:////www.realworldtech.com/page.cfm?AID=RWT101502203725" target="_blank">David Wang</a>, who was at the presentation.
[quote] David Wang
<strong>Packet Based System Interconnect
One of the more interesting aspects of the PowerPC 970 processor is the system interconnect. Unlike the bi-directional processor busses seen on Intel IA-32 and IA-64 processor, or even the bi-directional point to point interconnects used on Alpha EV6 and AMD Athlon processors, the system interconnect of the PowerPC 970 processor are uni-directional, point to point, source synchronous interconnects that do not have to worry about bus loading factors or bus turn around times, and the interconnect can wave-pipeline multiple number of bits of data on the wires concurrently. The most difficult part of such high frequency system interconnect may be the deskewing circuitry that would be required. In this case, the PowerPC 970 appears to have benefitted well from the POWER4 lineage, where the deskewing circuitry for a wavepipelined interconnect was previously disclosed by IBM.
The system interconnect on the PowerPC 970 has been designed to operate at an integer fraction of the CPU core frequency. At a CPU core frequency of 1.8 GHz, the system interconnect will operate at a frequency of 900 MHz. With two unidirectional 32 bit wide interconnects, one from the CPU to the companion system controller chip, the other from the companion system controller chip back to the CPU, the system interconnects can provide 3.6 GB/s of raw system bandwidth on each direction for an aggregate bandwidth of 7.2 GB/s. However, the unidirectional links must multiplex address and control information onto the same interconnects, and when these overheads are taken into considerations, IBM claims an effective peak data bandwidth of 6.4 GB per second.
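For anyone who wants to check the numbers in the quote, here is a minimal sketch of the arithmetic in Python. The function name `link_bandwidth_gb_s` is made up for illustration, 1 GB is taken as 10^9 bytes, and the 6.4 GB/s figure is simply IBM's claim as reported above, not something derived here.

```python
def link_bandwidth_gb_s(width_bits: int, rate_mhz: float) -> float:
    """Raw bandwidth of one unidirectional link, in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * rate_mhz * 1e6 / 1e9


# Figures taken from the quoted article: two 32-bit links at 900 MHz.
raw_per_direction = link_bandwidth_gb_s(32, 900)
aggregate = 2 * raw_per_direction

# IBM's claimed effective peak, as reported in the quote (an input, not computed).
effective = 6.4

# Share of the raw aggregate that goes to address and control overhead.
overhead_fraction = 1 - effective / aggregate

print(f"raw per direction: {raw_per_direction:.1f} GB/s")
print(f"aggregate:         {aggregate:.1f} GB/s")
print(f"overhead:          {overhead_fraction:.1%} of raw bandwidth")
```

Measured against the raw 7.2 GB/s the overhead comes out at roughly 11%; measured against the 6.4 GB/s that is left it is 12.5%, which is presumably where the "12%" mentioned earlier in the thread comes from.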
<strong>Programmer, are you sure about the separate address bus? Page 12 of the mpf .pdf says that address, data, and control are multiplexed in. Also, the 6.4 GB/s is usable bandwidth, brought down from 7.2 GB/s because of address and protocol overhead. The pdf says that there is a separate side bus for cache snooping and ACK (acknowledge?).
I do not think that the 970 will reach 2.5 GHz on the .13 process, especially considering that the same pdf only gives 1.4-1.8 GHz, and does not describe those as initial speeds.</strong><hr></blockquote>
Heh, well I was right about one thing -- I would lose. I went back and looked again, and you guys are quite right. That David Wang quote is particularly interesting. I had looked at the diagram and misinterpreted the control path as an address bus. Whoops.
I didn't mean that the 970 would reach 2.5 GHz on the 0.13 micron process. The 0.09 micron version will likely have the same bus interface though.
Hmmm... so the 970 FSB is much like a paired uni-directional double-wide RIO bus. It doesn't look like it would be scalable downward, however, so I'd still contend that they'd likely be using RIO or HT to the motherboard I/O chip, and leave the FSB on a daughtercard for communicating between processor(s) and the memory controller(s). Or perhaps that's just all wrong and rather than a motherboard / daughterboard setup they'll do something radically different. I/O on a daughtercard?
[quote]
who knows who cares, 970 is the future, it will be blazing fast<hr></blockquote>
I do -- the details are important, and I'd rather be corrected than spewing incorrect information. Thanks to the guys for correcting me.
I don't know; given the details about the 970, 2.5 GHz seems like it's approaching the upper limit of the 130nm process. I think it may scale to that. Also, going by the Wang comment that mmicist showed and what I heard about in an internal IBM doc, the bus of the 970 is not as simple as a 2:1 ratio of the processor speed: there is an internal multiplier, like (1800/4)*2 = 900 MHz, and the internal multiplier can change, although the final number is always at a 2:1 ratio. So something like (2250/5)*2 = 900 MHz is also conceivable, although for 2 GHz the bus should be able to hit 1 GHz. We'll see soon.
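The divider arithmetic being tossed around in this thread can be sketched in a few lines of Python. To be clear, this only reproduces the (core / divider) * 2 formula as posters have written it; `bus_rate_mhz` is a made-up helper, and nothing here is a confirmed description of how the 970 actually derives its bus clock.

```python
def bus_rate_mhz(core_mhz: float, divider: int) -> float:
    """Speculative bus rate: core clock divided by an integer, then doubled.

    This mirrors the formula quoted in the thread, e.g. (1800 / 4) * 2 = 900.
    """
    if divider <= 0:
        raise ValueError("divider must be a positive integer")
    return core_mhz / divider * 2


# Examples mentioned by posters in this thread (core MHz, divider).
examples = [(1800, 4), (2250, 5), (2400, 6), (2000, 4), (2500, 4)]

for core, div in examples:
    bus = bus_rate_mhz(core, div)
    ratio = core / bus  # how many core clocks per bus clock
    print(f"{core} MHz core, divider {div}: {bus:.0f} MHz bus ({ratio:g}:1)")
```

With a divider of 4 the result is always half the core clock, which matches the 900 MHz, 1 GHz, and 1.25 GHz figures; the larger dividers (5 and 6) give a bus slower than half the core clock.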
<strong>I don't know; given the details about the 970, 2.5 GHz seems like it's approaching the upper limit of the 130nm process. I think it may scale to that. Also, going by the Wang comment that mmicist showed and what I heard about in an internal IBM doc, the bus of the 970 is not as simple as a 2:1 ratio of the processor speed: there is an internal multiplier, like (1800/4)*2 = 900 MHz, and the internal multiplier can change, although the final number is always at a 2:1 ratio. So something like (2250/5)*2 = 900 MHz is also conceivable, although for 2 GHz the bus should be able to hit 1 GHz. We'll see soon.</strong><hr></blockquote>
I don't think anybody would object to a 2.5 GHz 0.13 implementation, but I wonder if IBM would even bother with the 0.09 implementation so close behind? A fast 0.13 would run very hot.
I hope they don't turn down the frequency multiplier of the bus, as that seems to be a key feature of the 970. Using the FSB as a system wide interconnect doesn't seem (to me) to be as compelling as using RapidIO because of Motorola's support of RapidIO. The 970 is an attractive high end part, but I won't rule out Motorola pushing the G4 line to the 7457-RM (integral memory controller, RapidIO, and ~1.8 GHz -- i.e. a very compelling low-end part). If that happens then having both processor subsystems using RapidIO simplifies Apple's system design, and RapidIO's scalability fits well with their need to design a whole line of portable and desktop (and other?) computers.
Hmmm... so the 970 FSB is much like a paired uni-directional double-wide RIO bus. It doesn't look like it would be scalable downward, however, so I'd still contend that they'd likely be using RIO or HT to the motherboard I/O chip, and leave the FSB on a daughtercard for communicating between processor(s) and the memory controller(s). Or perhaps that's just all wrong and rather than a motherboard / daughterboard setup they'll do something radically different. I/O on a daughtercard?
</strong><hr></blockquote>
I quite agree. I think that we need to know exactly what the protocol is going to be on the 970 bus, but it may be very simple in order to minimise latency of memory accesses, leaving a more capable, higher latency, bus for the I/O (and remote memory accesses if using NUMA).
I wouldn't be so sure about that. Everybody thinks that this chip is going to be THE messiah that will save Apple and give us 10% marketshare.
Back to the real world guys! </strong><hr></blockquote>
No, we think this chip will free us from the chains that hold the G4 back from competing well with the leading x86 processors. It may increase marketshare by virtue of realizing the pent-up demand for a fast PowerMac, but I don't think it's the answer to give flight to Apple's market volumes.
Having 2 different processors does suddenly open a much larger gap between Apple's low and high end machines, however. Suddenly Apple would be able to ship low end machines with leading edge G4s without fear of cannibalizing their PowerMac market. Buy a single 1.4 GHz G4 tower/slab/cube cheap (i.e. sub $1K), or buy a dual 1.8 GHz 970 monster tower for their usual Pro price point.