# Power5

Posts: 222member
[quote]This is a fairly exotic case, but I saw a real-world example with the RC5 client (if memory serves): A DP machine ran the program just a bit over twice as fast as a SP machine at the same clock speed. <hr></blockquote>

I think what you might be referring to is a "superlinear speedup". I distinctly remember discussing this concept with Chris Cox from Adobe (as well as another PPC programmer).

Basically, this term is more or less *techno-babble* that means: "more apparent acceleration than can be explained by a simple clock speed difference". For example, 1000MHz is 1.25 times 800MHz, but some programs that use AltiVec effectively and efficiently can sometimes be accelerated by more than 25%. Programmer can correct me if he feels that I misunderstood the term, but I think it's what you're referring to ;-) Another term that people might hear from time to time is "orthogonal".

In programming and code optimization, "orthogonal" means that two things (i.e., performance gains) are independent of each other in a specific and distinct sense. In the case of the G4/AltiVec, the bonuses due to AltiVec and other bonuses due to MP (let's say) are independent of each other. In other words, the respective speedups add up fully, with no loss due to possible interactions between them.

Perhaps Programmer could elaborate a bit more.

--

Ed M.
Posts: 360member
[quote]Originally posted by Ed M.:

<strong>

I think what you might be referring to is a "superlinear speedup" ...</strong><hr></blockquote>

In parallel computing parlance, "superlinear speedup" would be focused on the number of CPUs while holding the Hz constant.

Case A: 1x 1.8GHz CPU.

Case B: 2x 1.8GHz CPU.

If case B = 2x case A -> perfectly linear speedup.

If case B > 2x case A -> superlinear speedup.

If case B < 2x case A -> sublinear speedup (the normal case).

If there was a case C with 4x CPU, then the comparisons would be to 4x case A.

Starting from an inherently parallelizable program, it's quite easy to get B = (0.98)x2xA or so. Getting "superlinear speedup" is a lot tougher, unless the code is 'pathological': excess swapping around or whatnot in case A that just isn't present in case B or C.

There are also monolithic programs that are _not_ amenable to parallelization, to the point that B = (0.51)x2xA or so. That sucks. The extra CPU is then just running the OS & twiddling its thumbs.
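The case A / case B bookkeeping above can be written down in a few lines of Python. This is only a sketch: the function name `classify_speedup`, the tolerance, and the run times are made up for illustration and are not measurements from any real machine.

```python
def classify_speedup(time_one_cpu, time_n_cpus, n_cpus, tol=1e-9):
    """Compare an N-CPU run against the 1-CPU baseline at the same clock.

    Returns (speedup, efficiency, label), where efficiency is the speedup
    divided by the CPU count (1.0 means perfectly linear).
    """
    speedup = time_one_cpu / time_n_cpus
    efficiency = speedup / n_cpus
    if abs(efficiency - 1.0) <= tol:
        label = "linear"
    elif efficiency > 1.0:
        label = "superlinear"
    else:
        label = "sublinear"
    return speedup, efficiency, label


# Hypothetical run times in seconds (not measurements).
print(classify_speedup(100.0, 50.0, 2))  # case B exactly 2x case A
print(classify_speedup(100.0, 45.0, 2))  # case B more than 2x case A
print(classify_speedup(100.0, 98.0, 2))  # program that barely parallelizes
```

The efficiency value plays the role of the 0.98 and 0.51 factors above: the last call has an efficiency of roughly 0.51, so the second CPU contributes almost nothing.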

I've seen textbooks that claim superlinear speedup is impossible - but that only applies if you are in charge of writing the 'scheduler' also. If problem1 and problem2 are both compute heavy problems, and they keep trading off (as they would under a desktop OS in case A), then there's a lot of 'wasted' time. Very easy to turn around and get superlinear speedup, since the amount of process swapping would drop dramatically with more CPUs. In the textbook, they'd have you write things to completely finish 'problem1' _then_ 'problem2'. Very tough to get superlinear speedup (on the same case A & case B setup!) if they're scheduled that way. (Assuming the OS is doing nothing much at all.)
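A toy model makes the scheduling argument concrete. Everything here is an assumption for illustration: the function names, the idea that every time slice costs one fixed-price context switch, and the numbers themselves. Real schedulers and cache effects are far messier.

```python
def time_sliced_one_cpu(work_per_job, n_jobs, slice_len, switch_cost):
    """One CPU, jobs trading off: each time slice ends with a switch."""
    total_work = work_per_job * n_jobs
    n_slices = -(-total_work // slice_len)  # ceiling division on integers
    return total_work + n_slices * switch_cost


def run_to_completion_one_cpu(work_per_job, n_jobs):
    """One CPU, 'textbook' schedule: finish problem1, then problem2."""
    return work_per_job * n_jobs


def one_cpu_per_job(work_per_job):
    """As many CPUs as jobs: each job runs undisturbed on its own CPU."""
    return work_per_job


# All numbers are hypothetical, in milliseconds.
work, jobs, slice_len, switch_cost = 10_000, 2, 10, 1

case_b = one_cpu_per_job(work)
print(time_sliced_one_cpu(work, jobs, slice_len, switch_cost) / case_b)
print(run_to_completion_one_cpu(work, jobs) / case_b)
```

With these made-up numbers, the time-sliced baseline gives a ratio above 2 (superlinear), while the run-to-completion baseline gives exactly 2. The "extra" speedup is just the switching overhead that case A paid and case B did not.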

Hoping for Case D.

[ 02-19-2003: Message edited by: Nevyn ]
Posts: 110member
[quote]Originally posted by Nevyn:

<strong>

...

Case A: 1x 1.8GHz CPU.

Case B: 2x 1.8GHz CPU.

... case C with 4x CPU...

</strong><hr></blockquote>

- PowerMac

- PowerMac Dual

- Good

- Better

- Best

- 1 x price

- 1.5 x price

- 2 x price

[ 02-19-2003: Message edited by: Kendoka ]
Posts: 2,383member
[quote]"Two years between chips is enough time to run away and do some dramatic things."

<hr></blockquote>

So, why have we been stalling on the G4 for nearly four years?

Lemon Bon Bon [Hmmm]
Posts: 477member
Regarding either a single 970 or dual G4s in the towers: from the arrival of the PCI PPC in 1995 to the B&W G3 generation, Mac users were quite happy with the performance compared to PCs.

When Motorola's miscarriage slid out on stage in late '99 the protests roared, and they have not abated. The Mac got way behind then, and while the dual CPUs and OS X have lessened the blow, it still is.

For Apple to replace, say, the dual 1.25 G4 with a single 1.4 970 today would bring them on par with a 2.6 GHz P4: a good thing, but hardly much to brag about. A dual 1.4 that would beat the 3 GHz P4 soundly, topped off with a dual 1.6 just to increase the pleasure, would be so much sweeter.

Apple has had the tower buyers quite unhappy for several years. Introducing the 970 in a pro-range workstation series with 6-figure price tags would just make many presumptive tower customers stay on hold and wait until the 970 trickles down.

With its UNIX underpinnings and applications in place I can imagine Apple serving up a Tower Extreme with 4 and 8 CPUs and what have you. But in its fourth year after the miserable G4 introduction, Apple has to get its current lineup of towers in good shape. By that I do not mean tossing in a faster G4! If the Motorola G4 were a good thing it would have shown signs of catching up with the x86 crowd during these four years!

As far as desktop CPUs go, the G4 is a dead end. It might scale up during the years to come, but nothing short of a miracle will make it competitive. In notebooks and embedded applications OK, but desktops [oyvey]

Lastly, to veer into the nominal topic: if the Power 5 is anywhere close to the price of the Power 4, I cannot imagine what Apple would do with it.
Posts: 174member
[quote]Originally posted by Kendoka:

<strong>

</strong><hr></blockquote>

So they're quad-processor 68k Macs now? Maybe Motorola has resumed the development of the 68060 for Apple, to give us the blazing speed of a 100MHz CISC processor. [Laughing]
Posts: 2,383member
Describing the G4 as a 'miscarriage'.

Ouch.

Line up, line up...can anybody top that?

Lemon Bon Bon
Posts: 3,453member
I find it funny when people bandy about terms like "super-linear speedup" like they have some deep meaning. It is really very simple -- a linear relationship is one where one quantity varies directly by a fixed constant with another. If performance and clock rate are related by a linear relationship, then increasing one of them by a certain percentage will increase the other by the same percentage.

Super-linear means that the second number (performance in this case) increases by a larger percentage than the first. Sub-linear means it increases by a smaller percentage.

See? No magic.
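For anyone who wants it spelled out as arithmetic, here is a small sketch of that definition. The function name `scaling_kind` and the performance scores are hypothetical; only the 800 MHz to 1000 MHz clock step comes from earlier in the thread.

```python
def scaling_kind(old_clock, new_clock, old_perf, new_perf, tol=1e-9):
    """Compare the fractional gain in performance with that in clock rate."""
    clock_gain = new_clock / old_clock - 1.0
    perf_gain = new_perf / old_perf - 1.0
    if abs(perf_gain - clock_gain) <= tol:
        label = "linear"
    elif perf_gain > clock_gain:
        label = "super-linear"
    else:
        label = "sub-linear"
    return clock_gain, perf_gain, label


# 800 MHz -> 1000 MHz is a 25% clock increase; the scores are made up.
print(scaling_kind(800, 1000, 100.0, 125.0))  # performance also up 25%
print(scaling_kind(800, 1000, 100.0, 140.0))  # performance up more than 25%
print(scaling_kind(800, 1000, 100.0, 110.0))  # performance up less than 25%
```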

Heh. The Macintosh Quadra was called that because of the '4' in 68040. It only had one processor -- back in that day multiprocessing was far less effective than it is now even with the G4's limited bus bandwidth.

As for the POWER5, I expect "low-end" is probably still way more expensive than Apple will pay -- look at the prices of the IA-64 and Xeon processors from Intel. We have every indication, however, that after the 970 will come a process-shrunk 970+ and then a 980. The 980 will be based on the POWER5 and will have VMX added (if the POWER5 itself doesn't have it). My guess is that the 970+ will hit production sometime around May 2004 and the 980 will hit production sometime around May 2005. No doubt there will be a 990 based on the POWER6 as well. The clock rates should ramp steadily through the lifetime of each processor so the 970, for example, might climb over 2 GHz by the time the 970+ arrives (which is purported to target at least 2.5 GHz).

I also suggest you go and look at Tom's Hardware's article on the last 65 x86 processors. If you do the math you'll notice that the performance over the last few years has been decidedly "sub-linear" with clock rate. The 100 MHz Pentium got twice as much done per clock as the latest P4 despite all the improvements in system, bus, etc. This trend is likely to continue -- pushing hard on any system parameter is always going to result in diminishing returns. Other facets of processor performance become more important (SIMD, superscalar, bandwidth, etc etc etc). Apple may never have the highest quoted clock rate, but people don't need clock rate they need performance.
Posts: 12,408member
Excellent Point Programmer!
Posts: 777member
[quote]Originally posted by DrBoar:

<strong>

Apple has had the tower buyers quite unhappy for several years. Introducing the 970 in a pro-range workstation series with 6-figure price tags would just make many presumptive tower customers stay on hold and wait until the 970 trickles down.

With its UNIX underpinnings and applications in place I can imagine Apple serving up a Tower Extreme with 4 and 8 CPUs and what have you. But in its fourth year after the miserable G4 introduction, Apple has to get its current lineup of towers in good shape. By that I do not mean tossing in a faster G4! If the Motorola G4 were a good thing it would have shown signs of catching up with the x86 crowd during these four years!

As far as desktop CPUs go, the G4 is a dead end. It might scale up during the years to come, but nothing short of a miracle will make it competitive. In notebooks and embedded applications OK, but desktops [oyvey]

Lastly, to veer into the nominal topic: if the Power 5 is anywhere close to the price of the Power 4, I cannot imagine what Apple would do with it.</strong><hr></blockquote>

Good points. Apple has not been able to address the needs of the pros with their tower offerings in the last 12-24 months due to Motorola's problems with scaling the G4. However, at some point Apple does need to address the performance gap that has arisen throughout the lifetime of the G4 processor line for their core professional customers. It would be to Apple's advantage to produce dual 970s for these markets and, if/when processor supplies allow it, to add a new class of computers above the PowerMac range with 4 or more processors (Xstations?).
Posts: 1,626member
[quote]Originally posted by Programmer:

<strong>...Apple may never have the highest quoted clock rate, but people don't need clock rate they need performance....</strong><hr></blockquote>

Amen.

It still baffles me how people are comparing the 970 to Intel's PIV. I'm guessing most of this speculation is based on IBM's announced approximate SPEC #'s. But real world performance may be slower than or faster than assumptions based on SPEC. My bet is on faster when considering the 970, maybe much faster.
Posts: 2,383member
[quote]It would be to Apple's advantage to produce dual 970s for these markets and, if/when processor supplies allow it, to add a new class of computers above the PowerMac range with 4 or more processors (Xstations?).

<hr></blockquote>

Looking forward, if Apple buys into the Powerlite roadmap...I couldn't care less about Intel's mhz. Why? Because I'd know I'd be getting great performance and bandwidth.

The Powerlite roadmap would give Apple huge options for its desktop line over the next couple of years. The misery of the megahurts years will be outright transformed as the battle field moves to 64 bit.

Rickag. I'm still on a private bet that the fact that the 970 does so well in Spec...is a conservative estimate of the 970's REAL performance. In real world tests? I think it will storm the gates.

My glass table awaits...

Lemon Bon Bon :cool:

Yay. Ed M learned to quote! Whoop, whoop...

[ 02-19-2003: Message edited by: Lemon Bon Bon ]
Posts: 3,453member
[quote]Originally posted by rickag:

<strong>

Amen.

It still baffles me how people are comparing the 970 to Intel's PIV. I'm guessing most of this speculation is based on IBM's announced approximate SPEC #'s. But real world performance may be slower than or faster than assumptions based on SPEC. My bet is on faster when considering the 970, maybe much faster.</strong><hr></blockquote>

I'm looking forward to seeing how fast a 1.8 GHz w/ full DDR333 (or better) memory bandwidth really is when running AltiVec code.

Smmmmmmokin'!
Posts: 360member
[quote]Originally posted by Programmer:

<strong>

I'm looking forward to seeing how fast a 1.8 GHz w/ full DDR333 (or better) memory bandwidth really is when running AltiVec code.</strong><hr></blockquote>

...and double precision floating point. The combination is a heck of a lot of FP.

Read: Lightwave will become something worth demoing.

The part I'm interested in is: will they spring for two banks of memory (or something more exotic even.) The "Pro" groups can certainly find a use for more sockets, and that one feature could have a scary impact on the Dual's performance.

An expensive feature for a \$1600 computer perhaps... but there's plenty of room in the \$3999 models for a feature like that.

[Edit: braces]

[ 02-19-2003: Message edited by: Nevyn ]
Posts: 1,224member
I'm looking forward to the time when Stevie pulls one of his PS bake-offs and before he can look up at the big screen the FlowerMac is already done.
Posts: 6,008member
[quote]Originally posted by Nevyn:

<strong>The part I'm interested in is: will they spring for two banks of memory (or something more exotic even.) The "Pro" groups can certainly find a use for more sockets, and that one feature could have a scary impact on the Dual's performance.

An expensive feature for a \$1600 computer perhaps... but there's plenty of room in the \$3999 models for a feature like that.

</strong><hr></blockquote>

In the beginning Apple should stick to a simple, easy setup using cheaper standard RAM like PC2700 or maybe even DDR-II. Dual bank is fine, but the requirements of the chipset and motherboard real estate would make the price skyrocket further. KISS. I think a fast Rambus setup may even be beneficial, and with Apple endorsing it on some level, it can legitimize the technology rather than the focus on the company policy of litigation (Rambus that is).

The POWER4 systems IBM now has use a proprietary memory card that runs at 400MHz DDR (200 x 2) and is dual ported with a wide bus out. This would be overkill for the PowerMac. I'm sure the memory setup for the POWER5 will be equally and linearly impressive, not to mention expensive also.
Posts: 79member
[quote]Originally posted by Outsider:

<strong>

In the beginning Apple should stick to a simple, easy setup using cheaper standard RAM like PC2700 or maybe even DDR-II. Dual bank is fine, but the requirements of the chipset and motherboard real estate would make the price skyrocket further. KISS. I think a fast Rambus setup may even be beneficial, and with Apple endorsing it on some level, it can legitimize the technology rather than the focus on the company policy of litigation (Rambus that is).

The POWER4 systems IBM now has use a proprietary memory card that runs at 400MHz DDR (200 x 2) and is dual ported with a wide bus out. This would be overkill for the PowerMac. I'm sure the memory setup for the POWER5 will be equally and linearly impressive, not to mention expensive also.</strong><hr></blockquote>

Since the 970 is supposed to have a 7.2 GBps FSB giving 6.4 GBps typical at 1.8GHz, even DDR400 (3.2 GBps) in a single-channel setup would not be able to keep the processor supplied with data. This is why Intel went to dual-channel RDRAM with the 850 chipset: the P4 had a 3.2 GBps FSB and SDRAM was at 0.8 GBps. Dual-channel PC800 RDRAM (1.6 GBps per channel) gave them 3.2 GBps...
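The arithmetic behind those bandwidth figures, as a rough sketch. The helper name `peak_gb_per_s` is made up, the 8-byte width is the standard 64-bit DDR DIMM channel, and these are theoretical peak rates rather than sustained throughput.

```python
def peak_gb_per_s(mega_transfers_per_s, bus_bytes=8, channels=1):
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes) for a memory interface."""
    return mega_transfers_per_s * bus_bytes * channels / 1000.0


FSB_TYPICAL_GB_PER_S = 6.4  # figure quoted above for a 1.8 GHz 970

for channels in (1, 2):
    # DDR400 performs 400 million transfers per second on a 64-bit channel.
    bandwidth = peak_gb_per_s(400, channels=channels)
    print(channels, bandwidth, bandwidth >= FSB_TYPICAL_GB_PER_S)
```

On paper, a single DDR400 channel covers only half of the quoted bus figure, and two channels are needed to match it.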

So I predict that Apple's first 970 system will have dual-channel DDR support. This is not much more expensive than single channel. Look at the NVidia nForce chipset, which has dual DDR. It is expensive for a chipset but is <~\$70 and includes good graphics and great sound. Pull out the sound and graphics and you have, say, a ~\$45 chipset. Not expensive when considering a system is >\$2000...

MM