[quote]As far as I can discern, this all means that the 7450 - and more broadly, the MPX bus - don't support clock doubling, although a system built around a 7450 (or, in particular, more than one 7450) could still benefit in certain applications from DDR RAM.[/quote]
From my understanding of the documents on Moto's website, this is not true: the two processors not only share the memory bandwidth (in which case the individual processors' bandwidth requirements would simply add up), but also share the MPX bus itself (like the Pentium does, but unlike the Athlon), so the bus itself becomes the bottleneck, not the memory controller.
Bye,
RazzFazz
(EDIT: Oops, redundant... Guess I should read the entire thread before posting...)
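As a rough back-of-envelope illustration of the shared-bus point above, here is a sketch comparing a shared single-data-rate front-side bus with a point-to-point topology. The 133 MHz / 64-bit figures are just the commonly quoted MPX-class numbers, used purely for illustration:

[code]
# Illustrative only: peak bus bandwidth and how it divides between CPUs.
# The clock/width figures below are assumptions for the sake of the example.

def peak_gb_per_s(clock_mhz: float, width_bits: int, transfers_per_clock: int = 1) -> float:
    """Peak bandwidth of a parallel bus in GB/s."""
    return clock_mhz * 1e6 * (width_bits / 8) * transfers_per_clock / 1e9

shared_bus = peak_gb_per_s(133, 64)          # one MPX-style bus shared by all CPUs
print(f"shared 133 MHz x 64-bit bus: {shared_bus:.2f} GB/s total")
print(f"  per CPU with 2 CPUs:       {shared_bus / 2:.2f} GB/s")

point_to_point = peak_gb_per_s(133, 64, 2)   # EV6-style DDR link, one per CPU
print(f"point-to-point DDR link:     {point_to_point:.2f} GB/s per CPU")
[/code]

Whatever the exact figures, the point is that on a shared bus the CPUs contend for one pipe, while point-to-point links scale with the CPU count (until the memory controller itself saturates).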
[quote]Particularly in light of the rumor that the G5 will feature an on-die memory controller. Assuming for the sake of discussion that this is true, the CPU should have a much fatter pipe to the memory controller - that being the point of putting the controller on die, as far as I can tell - and then the only potential bottleneck becomes the bus to memory. Correct?[/quote]
From what I understand, the main advantage of putting the memory controller onto the processor die is lower latency, not increased bandwidth. Also, keep in mind that the latter is only useful if the memory can keep up.
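A toy model of the latency point: an on-die controller mostly removes a chip-to-chip hop from every cache miss, which helps even when peak bandwidth is unchanged. All numbers below are invented for illustration:

[code]
# Toy average-memory-access-time model. The on-die controller is modeled as
# simply shaving some fixed nanoseconds off every cache miss; all figures
# are invented placeholders, not real part specs.

def avg_access_ns(hit_rate: float, hit_ns: float, miss_ns: float) -> float:
    return hit_rate * hit_ns + (1 - hit_rate) * miss_ns

external = avg_access_ns(0.95, 9.0, 150.0)   # miss pays FSB + northbridge + DRAM
on_die   = avg_access_ns(0.95, 9.0, 110.0)   # same DRAM, northbridge hop removed

print(f"external memory controller: {external:.1f} ns average access")
print(f"on-die memory controller:   {on_die:.1f} ns average access")
[/code]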
[quote]It would be possible, but rather complicated and expensive, to build a memory controller (and motherboard) that had multiple MPX busses... but Apple is unlikely to do it.[/quote]
IIRC, this is exactly how it works with SMP Athlons.
[quote]From what I understand, the main advantage of putting the memory controller onto the processor die is lower latency, not increased bandwidth. Also, keep in mind that the latter is only useful if the memory can keep up.[/quote]
Yes, but since the discussion centers around whether the bus can keep up with the memory, I felt it was a relevant question.
Lower latency is good. Anything that keeps the processor fed is good.
As far as Rambus goes, I have heard that it's especially difficult to engineer their technology onto motherboards - lots of interference and crosstalk and very low tolerance for noise and timing errors. That information has aged a year or so, so they might have worked out the kinks by now, but it's something to consider. Otherwise, the streaming bandwidth sounds great. Maybe it'll take off when a disgusted judge throws all their IP into the public domain.
[quote]In my experience there isn't much to choose between RamBus and DDR when doing media processing work (i.e. running sequentially through large data sets), but both are much faster than SDRAM. I've seen some algorithms run almost twice as fast.[/quote]
Even dual-channel Rambus solutions v single channel PC2100?
It occurs to me, somewhat belatedly, that the processors this was run on were probably unable to utilize much more than the DDR bandwidth on the algorithms in question. For faster processors, or less work per data element, RamBus may have shown better throughput.
When the specs first came out (~1995) I was impressed by RamBus too... but after using it (especially on the PlayStation2 where the processor has lousy cache) I hate the stuff. I agree with the faster/narrower bus concept and all the theoretical advantages, but in my experience it hasn't delivered yet and the costs are very high.
[quote]It occurs to me, somewhat belatedly, that the processors this was run on were probably unable to utilize much more than the DDR bandwidth on the algorithms in question. For faster processors, or less work per data element, RamBus may have shown better throughput.[/quote]
Most of the P4 dual-channel Rambus v P4 DDR SDRAM v Athlon DDR SDRAM benchmarks I've seen show the dual-channel Rambus setup dominating memory bandwidth applications. And for pure memory throughput, Rambus utterly crushes DDR SDRAM. I don't think it's a "may". What would hurt it is random I/O style memory accesses where the latency would kill it, but that style of memory accessing shouldn't be as prevalent as sustained reads and writes for most modern apps.
[quote]I agree with the faster/narrower bus concept and all the theoretical advantages, but in my experience it hasn't delivered yet and the costs are very high.[/quote]
Reducing costs is mostly up to Intel. I think Rambus truly delivers on memory throughput in multi-channel configurations.
[quote]From my understanding of the documents on Moto's website, this is not true: the two processors not only share the memory bandwidth (in which case the individual processors' bandwidth requirements would simply add up), but also share the MPX bus itself (like the Pentium does, but unlike the Athlon), so the bus itself becomes the bottleneck, not the memory controller.[/quote]
Agreed: the Athlon's EV6 bus is a point-to-point topology, as opposed to the G4 and also the Pentium III, which both share a single bus between CPU and northbridge.
That means that G4s and PIIIs in a multi-CPU configuration are connected to the northbridge over one shared bus.
Athlons in a dual-CPU (MPX) configuration are each connected to the northbridge with a separate bus, which is why the northbridge needs a great number of pins.
Moto will address these problems in the G5 design by incorporating an on-die memory controller, omitting the L3 cache, and adding a RapidIO switch. At 8-bit width, RapidIO only adds 8 Tx + 8 Rx plus some control and clock pins, which also makes interconnecting multiple RapidIO devices easy to design and implement.
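Taking the post's own rough figures at face value, here is a quick pin-count comparison between an 8-bit RapidIO port and a 64-bit parallel front-side bus. Both totals are coarse assumptions (real parts use differential pairs and varying control-signal counts), so treat them as order-of-magnitude only:

[code]
# Coarse pin-count comparison, using the rough figures from the post above.
# The exact counts are assumptions; real parts differ (differential signaling
# doubles data pins, and control-signal counts vary by design).

rapidio_pins = 8 + 8 + 4          # 8 Tx + 8 Rx + a handful of clock/control pins (assumed)
parallel_fsb_pins = 64 + 36 + 20  # 64-bit data + 36-bit address + assorted control (assumed)

print(f"8-bit RapidIO port:             ~{rapidio_pins} pins")
print(f"64-bit parallel front-side bus: ~{parallel_fsb_pins} pins")
[/code]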
Due to the fact that the specifications at rapidio.org take care of the full implementation of SMP, it is in my opinion very unlikely that we will ever see a multicore G5.
rooster
[quote]Due to the fact that the specifications at rapidio.org take care of the full implementation of SMP, it is in my opinion very unlikely that we will ever see a multicore G5.
rooster[/quote]
I don't follow your logic here -- a multicore processor can have much wider and faster busses between the elements on the die, and can share many resources that separate chips simply cannot share efficiently. The use of RapidIO or HyperTransport doesn't really affect that.
THT, thanks for posting the down-to-earth explanation about how the system bus is broken down and how those parts must change to support DDR.
http://e-www.motorola.com/collateral/SNDFH1112.pdf (page 6)
If the 74xx doesn't support a FSB higher than 133MHz / 64 bits, where is the 250MHz DDR (effective 500MHz) L3 cache connected? I assume that this connection cannot be used for DRAM.
[quote]If the 74xx doesn't support a FSB higher than 133MHz / 64 bits, where is the 250MHz DDR (effective 500MHz) L3 cache connected?[/quote]
The L3 interface has its own dedicated pins on the 745x. (The lack of those is what allows the 744x's package to be smaller and have fewer pins.)
[quote]I assume that this connection cannot be used for DRAM.[/quote]
Not that I know of.
(The 745x explicitly allows you to use the cache as a (damn fast) main memory replacement ("direct mapped address space"), but that's only really applicable to the embedded space, since it's limited to 2MB.)
I wonder if Mot would just expand the L3 tags for up to 36-bit addressing and just use MPX to communicate with peripherals. The RAM slots would have to be REAL close to the processor, but it does have a 256-bit bus to the processor core.
Do you have any idea how expensive SRAM is? And it is very low density compared to SDRAM. It would be easier and much cheaper to add DDR capability to MAXBus, I think.
[quote]I don't follow your logic here -- a multicore processor can have much wider and faster busses between the elements on the die, and can share many resources that separate chips simply cannot share efficiently. The use of RapidIO or HyperTransport doesn't really affect that.[/quote]
Programmer, you are right, and I agree with you.
I just wanted to say that, since a multiprocessor G5 system would be easy to implement (if the final incarnation of the G5 really has all the stuff mentioned at rapidio.org), in my opinion Moto will not develop a multicore version of the G5 because of the extra development time and cost.
according to tomshardware, rdram does much better. they show a pentium 4 at 2.6ghz (or so) using rdram beating out a pentium 4 at 3.0ghz using ddr in quite a few benchmarks (especially the sisoft synthetic ones, as well as many others that seem more memory bandwidth reliant, like multimedia stuff and of course quake3). here's linkage:
http://www6.tomshardware.com/cpu/02q1/020128/index.html
(it compares an athlon 2300+ with intel at 3k, but again, the 2.6 seems to perform very well)
also, in terms of price, according to pricewatch..
256 megs:
rdram = $70 a stick
ddr = $49 a stick
(so about a $21 difference per stick, which works out to roughly $63 more for 768 megs - not much at all)
128 megs:
rdram = $35
ddr = $29
again, just a couple of bucks. if we're buying $2000 machines, this isn't very much at all.
again, more prices..
rdram = $140
ddr = $112
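For what it's worth, a quick check of the premium arithmetic using the pricewatch figures quoted above (the 768 MB total is just the example capacity from the post):

[code]
# Quick sanity check of the RDRAM-vs-DDR premium, using the pricewatch
# figures quoted in the post above.

prices = {            # capacity in MB: (rdram $, ddr $) per stick
    256: (70, 49),
    128: (35, 29),
}

target_mb = 768       # example system capacity from the post
for capacity, (rdram, ddr) in prices.items():
    sticks = target_mb // capacity
    premium = (rdram - ddr) * sticks
    print(f"{target_mb} MB from {capacity} MB sticks: RDRAM costs ${premium} more")
[/code]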
personally (i actually heard this from someone else first) i think that rdram got its worst press when intel introduced it with the pentium 3. There, rdram was super expensive and it didn't provide a performance benefit even over sdram. I mean, how would one feel if they paid a premium of 500 dollars for no performance increase? well, that just gave rdram a very poor rap and people stayed away from it. when paired with a chip that seems to be designed to exploit the faster memory, it performs very well.
DRDRAM also will scale to PC1200 (600MHz x 2) by the end of the year, and PC1000 (500MHz x 2) is expected this summer. Sure, it only runs a 16-bit wide path, but the brute force of it should more than make up for that. And remember, our favorite peripheral connector (FireWire) works on the same principle: a super-fast serial connection instead of a slow parallel connection.
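To put rough numbers on the "brute force" comparison, here are the peak theoretical bandwidths of the parts discussed in this thread (simple width-times-data-rate figures, not measured throughput; the PC1200 entry assumes the 600MHz x 2 rate mentioned above):

[code]
# Peak theoretical bandwidth of the memory types discussed in the thread.
# These are simple width x data-rate figures, not measured throughput.

def peak_gb_per_s(width_bits: int, megatransfers_per_s: float) -> float:
    return width_bits / 8 * megatransfers_per_s * 1e6 / 1e9

parts = {
    "PC2100 DDR SDRAM (64-bit, 266 MT/s)":              peak_gb_per_s(64, 266),
    "PC800 RDRAM, single channel (16-bit, 800 MT/s)":   peak_gb_per_s(16, 800),
    "PC800 RDRAM, dual channel":                        2 * peak_gb_per_s(16, 800),
    "PC1200 RDRAM, single channel (16-bit, 1200 MT/s)": peak_gb_per_s(16, 1200),
}

for name, bandwidth in parts.items():
    print(f"{name}: {bandwidth:.2f} GB/s")
[/code]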
[quote]I wonder if Mot would just expand the L3 tags for up to 36-bit addressing and just use MPX to communicate with peripherals. The RAM slots would have to be REAL close to the processor, but it does have a 256-bit bus to the processor core.[/quote]
No, the L3 only has a 64-bit wide bus. The 256-bit wide one you're talking about is the one to the L2.
Regarding RDRAM vs DDR RAM, a lot depends on the memory controller and how efficiently it uses the bandwidth. E.g., the early Socket A DDR controllers were BS, giving a 5-10% speed increase. However, the Intel i845-D DDR controller gives better performance than the i850 RDRAM controller, which has 100MHz x quad-pumped x dual-channel for HUGE bandwidth.
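A tiny illustration of that point: what matters is peak bandwidth times controller efficiency, so a good controller on slower memory can beat a poor one on faster memory. The efficiency numbers below are invented placeholders, not measurements of any real chipset:

[code]
# Effective bandwidth = peak bandwidth x controller efficiency.
# Efficiency figures are invented placeholders to illustrate the point;
# they are not measurements of any real chipset.

configs = {
    "DDR, efficient controller (2.1 GB/s peak)":            (2.1, 0.85),
    "DDR, poor controller (2.1 GB/s peak)":                 (2.1, 0.55),
    "dual-channel RDRAM, poor controller (3.2 GB/s peak)":  (3.2, 0.55),
}

for name, (peak, efficiency) in configs.items():
    print(f"{name}: ~{peak * efficiency:.2f} GB/s effective")
[/code]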
Also, latency is HORRIBLE for RDRAM, so I bet DSP people would HATE it.
If Apple can design a decent DDR controller for the G5 then it will be far preferable to expensive, messy RDRAM.
Barto
PS I know RDRAM has gone down in price, but it would double again if Apple gave 'em a foothold.
[quote]Also, latency is HORRIBLE for RDRAM, so I bet DSP people would HATE it.[/quote]
While RDRAM latency is horrible, DSP algorithms are the ones which will usually do well with it. Signal processing usually works on long sequential streams of data, which is exactly what you want in a high-latency system. It's all the other algorithms out there that do badly on high-latency memory.
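A toy model of why long sequential streams tolerate high latency while small random accesses do not (the latency and bandwidth figures are illustrative assumptions, not RDRAM specs):

[code]
# Toy model: touch 1 MB of data either as one long sequential stream
# (pay the access latency roughly once, then run at full bandwidth) or as
# many independent 64-byte random accesses (pay the latency every time).
# Latency and bandwidth values are illustrative assumptions only.

LATENCY_NS = 50.0               # assumed full access latency
BANDWIDTH_BYTES_PER_S = 3.2e9   # assumed streaming bandwidth

def stream_time_us(total_bytes: int) -> float:
    transfer_ns = total_bytes / BANDWIDTH_BYTES_PER_S * 1e9
    return (LATENCY_NS + transfer_ns) / 1e3

def random_time_us(total_bytes: int, access_bytes: int = 64) -> float:
    per_access_ns = LATENCY_NS + access_bytes / BANDWIDTH_BYTES_PER_S * 1e9
    return (total_bytes // access_bytes) * per_access_ns / 1e3

n = 1 << 20   # 1 MB
print(f"sequential stream:        {stream_time_us(n):.0f} us")
print(f"64-byte random accesses:  {random_time_us(n):.0f} us")
[/code]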
RDRAM has come down in price, but it still has issues which are ugly... the need for "blanks" to fill empty slots, changing path lengths affecting performance, and RamBus' stupid behaviour. Among other things, they tried to acquire patents based on material being shared in good faith between the memory manufacturers for the purposes of developing standards, and then they used those patents to demand royalties on memory designs other than their own! This artificial manipulation of the playing field for their own advantage is against the interests of the industry and the customers of the industry. We (the consumers) ought to be angry with RamBus for engaging in this kind of activity because in the end we're the ones that pay for it.