Please Help Put the 970 in Context

jeromba · March 27, 2003 8:03AM

If you look at the Intel motherboards, in May they will have the following:

FSB 800 MHz, AGP 8X, Serial ATA, Dual DDR 333/400.

Apple must at least have these specs too. And at best they could add PCI-Express*! Look at this article! This reminds me of the "think outside of the box" from Mr NSX on Ars!

What do you think? Will we see an "Apple mix" of PCI-Express, HyperTransport and RapidIO?

*PCI Express architecture retains the PCI usage model and software interfaces for investment protection and smooth development migration. The technology is aimed at multiple market segments in the computing and communication industries, and supports chip-to-chip, board-to-board and adapter solutions at an equivalent or lower cost structure than existing PCI designs. PCI Express currently runs at 2.5Gbps, or 250MBps per lane in each direction, providing a total bandwidth of 16GBps in a 32-lane configuration. Future frequency increases will scale up total bandwidth to the limits of copper and significantly beyond that via other media without impacting any layers above the Physical Layer in the protocol stack. PCI Express provides I/O attach points for high-performance graphics, 1394b, USB 2.0, InfiniBand Architecture, Gigabit networking and so on.
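For anyone who wants to check the per-lane and 32-lane figures in that blurb, here is a minimal Python sketch. It assumes the 2.5 figure is the raw per-lane signalling rate in gigabits per second and that the link uses 8b/10b encoding (10 bits on the wire per 8 bits of payload); the encoding is not stated in the quoted text, and the function names are made up for illustration.

```python
# Rough check of the PCI Express bandwidth figures quoted above.
# Assumption (not in the quoted text): 8b/10b line encoding.

RAW_MBITS_PER_LANE = 2500  # 2.5 Gbit/s raw signalling rate, per lane, per direction


def lane_bandwidth_mb_per_s(raw_mbits=RAW_MBITS_PER_LANE):
    """Payload bandwidth of one lane in one direction, in MB/s."""
    payload_mbits = raw_mbits * 8 // 10  # strip the assumed 8b/10b overhead
    return payload_mbits // 8            # bits -> bytes


def total_bandwidth_gb_per_s(lanes, directions=2):
    """Aggregate payload bandwidth over all lanes and both directions, in GB/s."""
    return lane_bandwidth_mb_per_s() * lanes * directions / 1000


if __name__ == "__main__":
    print(lane_bandwidth_mb_per_s(), "MB/s per lane, per direction")
    print(total_bandwidth_gb_per_s(32), "GB/s for 32 lanes, both directions")
```

Under those assumptions the numbers line up with the blurb: 250 MB/s per lane each way, and 16 GB/s only when both directions of all 32 lanes are counted together.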

If you want more info here is a PDF file.

Edit: oh yeah, and Apple is a member of PCI-SIG and IBM is president of the board of directors.

willywalloo · March 27, 2003 8:41AM

Currently it looks like the 970 will debut at 1.7GHz. But that may change, as it isn't due for a while.

Personally I hope it will debut at the original 2.5GHz that was stated in the press release on IBM's website a few weeks ago.

blelech,

-walloo.

socinean1 · March 27, 2003 9:23AM

Is the 970 a 32bit data pipe or a 64bit data pipe? I know it has 64bit memory addressing, and allows for 64bit integers, but I have been told that it is still only a 32bit data pipe. Does anyone know?

tomb of the unknown · March 27, 2003 9:34AM

Quote:

Originally posted by willywalloo

Currently it looks like the 970 will debut at 1.7GHz.

According to who?

tomb of the unknown · March 27, 2003 9:36AM

Quote:

Originally posted by Socinean1

Is the 970 a 32bit data pipe or a 64bit data pipe? I know it has 64bit memory addressing, and allows for 64bit integers, but I have been told that it is still only a 32bit data pipe. Does anyone know?

It is 32 bit asynchronous. In other words it has two data pipes, one to send and one to receive, each 32 bits wide for a grand total of 64 bits.

jeromba · March 27, 2003 9:39AM

Quote:

Originally posted by Tomb of the Unknown

According to who?

That's right. The only info we have _from IBM_ is this:

PPC970 @ 1.0 / 1.2 / 1.4 / up to 1.8 GHz, and briefly on their site up to 2.5 GHz

amorph · March 27, 2003 10:01AM

Quote:

Originally posted by Programmer

If the FSB is a ring then the output of the chipset goes to proc1's input, the output of proc1 goes to proc2's input, and the output of proc2 goes to the chipset's input. All data goes all the way around the ring and is thus transmitted (in turn) on 3 output buses and 3 input buses. This means the total bandwidth available is 3.2 GB/sec for all processors. A star topology means each processor gets 3.2 GB/sec in each direction.

There might be one problem with this particular approach: I'm given to understand that for various reasons the GigaBus is going to be quite short. The longer it gets, the more intractable the fast clock and the skewed signals become. So it seems like setting the CPUs up in a token ring using the GigaBus would result in a curious layout with the 970s' companion chips nestled right up against each other. This is not impossible, but it's not exactly scalable or flexible, unless I'm missing something.

Since a dual channel memory controller would have a baffling amount of bandwidth to deal with (more than doubling the complexity of the RAM architecture for one thing), I'm currently entertaining the idea of a series of cards, each with a single 970 with a bank of RAM, linked into a token ring network over HyperTransport. (I have to admit here that I might well be imposing a network topology on top of a protocol that already has one — I haven't looked at HT in any depth. I'm just thinking aloud.)

programmer · March 27, 2003 10:10AM

Quote:

Originally posted by Amorph

There might be one problem with this particular approach: I'm given to understand that for various reasons the GigaBus is going to be quite short. The longer it gets, the more intractable the fast clock and the skewed signals become. So it seems like setting the CPUs up in a token ring using the GigaBus would result in a curious layout with the 970s' companion chips nestled right up against each other. This is not impossible, but it's not exactly scalable or flexible, unless I'm missing something.

Distance is a problem in any topology. In a ring each element needs to be next to both its neighbours. In a star they must all be close to the companion. IBM's FSB design uses their new "elastic bus" technology which deals with skew between the bus' lines and different lengths.

Quote:

Since a dual channel memory controller would have a baffling amount of bandwidth to deal with (more than doubling the complexity of the RAM architecture for one thing), I'm currently entertaining the idea of a series of cards, each with a single 970 with a bank of RAM, linked into a token ring network over HyperTransport. (I have to admit here that I might well be imposing a network topology on top of a protocol that already has one — I haven't looked at HT in any depth. I'm just thinking aloud.)

Dual PC2700 would only be 5.4 GB/sec maximum theoretical... hardly "baffling". It would probably fall considerably short of the theory too. The network-like topology you are describing is probably better implemented using RapidIO than HyperTransport, and it doesn't use a ring-style topology.
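The 5.4 GB/sec figure can be reproduced in a few lines of Python. This is only a sketch: the function name is invented for illustration, and it assumes a 64-bit data path per channel and treats DDR333 (the chips on a PC2700 module) as 333 million transfers per second.

```python
# Peak theoretical DDR bandwidth. Names and defaults are illustrative assumptions.

def ddr_peak_mb_per_s(mega_transfers_per_s, channels=1, bus_width_bits=64):
    """Peak bandwidth in MB/s: transfers per second x bytes per transfer x channels."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels


if __name__ == "__main__":
    single = ddr_peak_mb_per_s(333)            # one channel of DDR333 (PC2700)
    dual = ddr_peak_mb_per_s(333, channels=2)  # dual channel
    print(single, dual)
    # The module name rounds the single-channel figure up to 2700 MB/s,
    # which is where 2 x 2.7 = 5.4 GB/sec comes from.
    print(2 * 2700 / 1000)
```

Computed from the transfer rate, dual channel comes out slightly under the nominal 5.4 GB/sec, which only reinforces the point that this is a theoretical ceiling.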

ericj551 · March 27, 2003 10:11AM

Quote:

Originally posted by kroehl

Put it on grill for a few minutes and you get a nice crispy crust on your proof. Don't overdo it or your proof will be hard to swallow!

The proof is in the pudding anyway.

Kroehl

For best results, use a Dutch Oven.

netromac · March 27, 2003 12:57PM

Quote:

Originally posted by ericj551

For best results, use a Dutch Oven.

I'll have to call my Dutch friend Gert then and ask him to send me one.

Sigh! Making "proof" isn't as easy as I thought.

outsider · March 27, 2003 1:05PM

Quote:

Originally posted by ericj551

For best results, use a Dutch Oven.

You mean when you fart under the covers? eww

amorph · March 27, 2003 1:15PM

Quote:

Originally posted by Programmer

Distance is a problem in any topology. In a ring each element needs to be next to both its neighbours. In a star they must all be close to the companion. IBM's FSB design uses their new "elastic bus" technology which deals with skew between the bus' lines and different lengths.

Right, but what I meant was that distance seems to be a particularly acute problem for the GigaBus. RIO and HT both seem to be designed for longer traces.

Quote:

Dual PC2700 would only be 5.4 GB/sec maximum theoretical... hardly "baffling".

Sorry, that paragraph wasn't clear. I said two things in one sentence: that, assuming a memory controller feeding two 970s, the total amount of bandwidth it would have to handle would be baffling (12.8 GB/s GigaBus bandwidth + RAM bandwidth), and that RAM would have to be installed in a significantly more complicated arrangement (multiple channels) in order to come anywhere near sating two GigaBuses.
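To put a number on "significantly more complicated", here is a back-of-the-envelope Python sketch. It assumes 6.4 GB/s per GigaBus (so 12.8 GB/s for two 970s, as above) and the nominal 2.7 GB/s per PC2700 channel; it ignores every real-world inefficiency, and the function name is hypothetical.

```python
import math

# Figures taken from this thread; everything else is an assumption for illustration.
GIGABUS_GB_PER_S = 6.4   # claimed bandwidth of one 970 bus
PC2700_GB_PER_S = 2.7    # nominal peak of one PC2700 channel


def channels_needed(cpus, bus_gb_per_s=GIGABUS_GB_PER_S,
                    channel_gb_per_s=PC2700_GB_PER_S):
    """Smallest number of RAM channels whose combined peak covers all CPU buses."""
    total_bus = cpus * bus_gb_per_s
    return math.ceil(total_bus / channel_gb_per_s)


if __name__ == "__main__":
    for cpus in (1, 2):
        print(cpus, "x 970 ->", channels_needed(cpus), "PC2700 channels")
```

Even on paper, two buses would want five channels of PC2700 to be saturated, against the two channels of a dual-channel controller.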

To get around these issues, I posited single 970s with dedicated RAM, hooked up via some other technology. I said HT, but I meant RIO.

I'm probably still wrong, but I at least want to be clear about the way in which I'm wrong.

kupan787 · March 27, 2003 2:21PM

Quote:

Originally posted by jeromba

If you look at the Intel motherboards, in May they will have the following:

FSB 800 MHz, AGP 8X, Serial ATA, Dual DDR 333/400.

Apple must at least have these specs too.

How much difference does it make that the P4 has a quad pumped bus (200x4) vs the 970's double pumped (450x2)? Is the final value all that matters (800 vs 900), or is there more overhead in a quad pumped bus vs double pumped bus? How much GB/sec does Intel claim with their bus? How does it stack up to the 970's bus?

Also what kind of bus is the P4's? I have seen mention about speculation on the 970 bus (Programmer mentioned a ring based bus I think). Is the 970's bus similar to the P4's, more advanced, worse?

Also, isn't it true that the 970's bus is a ratio of the processor speed (4:1)? Meaning that as the processor scales, so does the bus? So a 2.0GHz 970 would have a 1GHz bus (really 500MHz double pumped), and a 2.5GHz 970 would have a 1.25GHz bus.
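Spelled out as a small Python sketch, the scaling being asked about looks like this. It assumes the 4:1 ratio is CPU clock to base bus clock and that the bus is double pumped; both are speculation in this thread rather than confirmed figures, and the function name is made up.

```python
# Bus speed as a fixed ratio of CPU speed. The 4:1 divider and the
# double pumping are assumptions taken from the question above.

def bus_speeds_mhz(cpu_mhz, clock_divider=4, transfers_per_clock=2):
    """Return (base bus clock, effective transfer rate) in MHz."""
    base_clock = cpu_mhz / clock_divider
    effective = base_clock * transfers_per_clock
    return base_clock, effective


if __name__ == "__main__":
    for cpu in (1800, 2000, 2500):
        base, effective = bus_speeds_mhz(cpu)
        print(f"{cpu} MHz CPU -> {base:.0f} MHz clock, {effective:.0f} MHz effective")
```

So "4:1" describes the underlying clock, while the effective (double-pumped) rate is half the CPU speed.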

telomar · March 27, 2003 3:13PM

Quote:

Originally posted by jeromba

And at best they could add PCI-Express*!

I wouldn't be praying for PCI-Express unless you want to wait until 2004 for a new PowerMac.

placebo · March 27, 2003 4:45PM

Why??? I'm not the card slot nerd, but I know that...I know nothing.

Nevermind.

rbr · March 27, 2003 5:14PM

Quote:

Originally posted by Telomar

I wouldn't be praying for PCI-Express unless you want to wait until 2004 for a new PowerMac.

What I have seen recently seems to be a battle between the PCI-X crowd which wants the backward compatibility versus the PCI-Express crowd which wants the clean break to serial connections and higher speed without the synchronization problems of parallel connections. Fall of '04 seems to be the timeframe now being mentioned and that may slip as the two sides fight it out.

Did you notice that the AGP 8X specification allows a second 8X slot? That could provide for some interesting possibilities until PCI-Express gets on board if only Apple would get busy and do it.

hmurchison · March 27, 2003 5:14PM

Well, from what I've heard, PCI Express still has a few bugs that need to be stamped out. Intel would have added it to Canterwood and Springdale if it were ready.

The potential is nice... let's hope the reliability is there.

I'd like to see Apple start augmenting the motherboard with specialized chips. Get things off the current PCI bus and try to prevent high-bandwidth items from choking.

Since these computers are designed from the ground up with OSX in mind, I expect to see Apple branch out and take advantage of things that weren't available in OS9. It's time to take a leap forward.

zapchud · March 27, 2003 5:28PM

Quote:

Originally posted by kupan787

How much difference does it make that the P4 has a quad pumped bus (200x4) vs the 970's double pumped (450x2)? Is the final value all that matters (800 vs 900), or is there more overhead in a quad pumped bus vs double pumped bus? How much GB/sec does Intel claim with their bus? How does it stack up to the 970's bus?

AFAIK there is more overhead in a quad-pumped bus than in a double-pumped bus. The double-pumped one can send addresses at half the rate of the data, which is double the fraction a quad-pumped bus manages. I don't have any numbers to show a documented difference though.

I also read (on anandtech.com) that the Intel 800MHz bus will transfer addresses at 400MHz, which makes me think that it is a double-pumped bus. Maybe I misread something... Intel will probably claim (have they claimed anything yet?) a full 6.4GB/s of bandwidth on their 800MHz bus, but that number is only a theoretical one, which differs from IBM's claimed 6.4GB/s effective bandwidth. The theoretical maximum bandwidth on the 900MHz PPC970 bus is 7.2GB/s.
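Those theoretical maxima are just transfer rate times width. Here is a minimal Python sketch, assuming a single shared 64-bit bus for the P4 and two 32-bit one-way links for the 970 (the widths mentioned elsewhere in this thread); the function name is invented for illustration.

```python
# Theoretical peak bus bandwidth = transfers per second x bytes per transfer x links.
# Bus widths are the ones mentioned in this thread, not vendor-confirmed figures.

def peak_gb_per_s(mega_transfers_per_s, width_bits, links=1):
    """Peak bandwidth in GB/s, summed over all links."""
    return mega_transfers_per_s * (width_bits // 8) * links / 1000


if __name__ == "__main__":
    p4 = peak_gb_per_s(800, 64)               # one shared 64-bit bus
    ppc970 = peak_gb_per_s(900, 32, links=2)  # two 32-bit one-way links
    one_way = peak_gb_per_s(900, 32)          # a single direction of the 970 bus
    print(p4, ppc970, one_way)
    # Share of the 970's theoretical peak that the claimed 6.4 GB/s represents.
    print(round(6.4 / ppc970, 3))
```

On these assumptions the 970's 7.2GB/s is the sum of both directions, so the ceiling in any one direction is 3.6GB/s.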

Quote:

Originally posted by kupan787

Also what kind of bus is the P4's? I have seen mention about speculation on the 970 bus (Programmer mentioned a ring based bus I think). Is the 970's bus similar to the P4's, more advanced, worse?

The P4's bus topology is star-based (as opposed to ring-based), which is the same as I expect the 970 to use. A big difference is that the bus of the P4 is bidirectional, and the 970's is unidirectional. This means that the P4 can send data at full speed in either direction, while the 970 can only send data at half speed (it's really two 32-bit 900MHz buses, versus 64-bit on the P4), either send or receive, which means it only has 3.2GB/s (@ 1.8GHz/900MHz FSB) of bandwidth when sending, or when receiving. Luckily, it can both send and receive at the same time, unlike the P4 bus, which does one thing at a time. The good thing about having a unidirectional bus is that the effective bandwidth is much higher (the effective bandwidth is always lower than the theoretical maximum bandwidth you can calculate by multiplying bus clock frequency and bus width), as the overhead of switching the bus between send and receive mode isn't there as it is on a bidirectional bus, and because of other factors that more enlightened heads than mine can explain.

Quote:

Originally posted by kupan787

Also, isn't it true that the 970's bus is a ratio of the processor speed (4:1)? Meaning that as the processor scales, so does the bus? So a 2.0GHz 970 would have a 1GHz bus (really 500MHz double pumped), and a 2.5GHz 970 would have a 1.25GHz bus.

That's what they say, but it isn't confirmed by any source yet. We might be that lucky, or we might end up with a constant 900MHz bus and a multiplier that varies.

tht · March 27, 2003 5:49PM

Quote:

Originally posted by kupan787

How much difference does it make that the P4 has a quad pumped bus (200x4) vs the 970's double pumped (450x2)?

Any sort of difference would only manifest itself in situations where the advantages and disadvantages of the two buses can be exploited. Otherwise, 95% of the time, it will depend on the memory used (dual-channel DDR SDRAM or RAMBUS, single channel, etc.) since the buses already have more than, or up to twice, the bandwidth of today's memory. If the same memory tech is used, and the processor bus has bandwidth to spare, there likely won't be much difference at all.

Quote:

Is the final value all that matters (800 vs 900), or is there more overhead in a quad pumped bus vs double pumped bus? How much GB/sec does Intel claim with their bus? How does it stack up to the 970's bus?

The 900 MHz bus of a 1.8 GHz PPC 970 is about equivalent to an 800 MHz P4 bus because the 970 bus also transmits memory addresses on the same lines as the data. Thus, its data rate is about equivalent to that of an 800 MHz bus. Both will be about 6.4 Gbyte/s.

Quote:

Also what kind of bus is the P4's? I have seen mention about speculation on the 970 bus (Programmer mentioned a ring based bus I think). Is the 970's bus similar to the P4's, more advanced, worse?

The P4 bus is a parallel bi-directional multi-drop broadcast bus. It is similar to the Moto G4 bus or the G3 bus in concept, except that it has more features, primarily the quad data rate.

In a bus, each bit of data needs a corresponding "wire trace" on the motherboard to represent it. So if a bus is 64 bits wide, it needs 64 traces for data. In the P4/G4/G3 buses there are also 32 traces for memory address information, plus some more for clocks, ground, and others. This is a bidirectional bus, so info travels in both directions along the wire. It's a parallel bus, so all of the traces must have the signals traveling through synchronously. The signals must arrive at virtually the same time. Forcing the signals to arrive at the same time makes it very hard to clock the bus at high rates. The longer the trace, the harder it is to reach high bus clock rates.

The big advantage of the PPC 970 bus is that it mitigates these problems. It also adds the "elastic" feature. The 970 bus is a unidirectional (data only travels in one direction) elastic bus. The 970 essentially has one bus for outgoing data and one bus for incoming data. Each bus only has 32 traces for data and addresses (plus ground, sidebands, command, clocks). So it has far fewer traces, and the signals aren't as sensitive to being out of sync. The two benefits of this are longer trace lengths and faster bus clocks, at the cost of reducing the width of the bus. The elastic feature in the bus essentially means that a succeeding signal doesn't have to wait for the preceding one to get to its destination before being sent. That is, it ensures that the bus traces on the motherboard don't all have to be nearly the same length.

Overall I think it'll come out as a wash, but there are situations where the P4's 800 MHz bus bandwidth will be better, I think. It initially has all of 6.4 Gbyte/s going to the processor, while the 970 only has half. Once the P4 begins to write to memory, it'll probably start to even out.

The speculation Programmer is talking about is for multiprocessor topologies. I think Apple will only go dual at most and will just use a system/memory controller that supports 2 PPC 970s directly. Anything more would involve a switched fabric. A ring topology would have some weird latency issues, and we don't even know if the bus would physically support such a configuration.

Quote:

Also, isn't it true that the 970's bus is a ratio of the processor speed (4:1)? Meaning that as the processor scales, so does the bus? So a 2.0GHz 970 would have a 1GHz bus (really 500MHz double pumped), and a 2.5GHz 970 would have a 1.25GHz bus.

Yes. Yes.

zapchud · March 27, 2003 5:56PM

Oh shite, I really messed up my post! Switched bi- and unidirectional totally.

Consistency is good though

Listen to THT, not me!

Edit: Fixed the mess in the post, I think it's ready to be slaughtered by disagreements and corrections now
