Intel at 4 GHz??? Come on Motorola...

onlooker · March 13, 2002 7:08PM

Damn this is a great thread! :eek:

eric d.v.h · March 13, 2002 11:13PM

[quote]Originally posted by G-News:

No, I wasn't thinking about the McKinley. I was speaking of the Itanium, and only saying that, contrary to a previous post claiming it never shipped in systems, it had already shipped in systems.

That included no statement on its performance, design or efficiency whatsoever.<hr></blockquote>

[quote]Originally posted by Rickag:

I think what THT is referring to is a couple of posts I made on Ars Technica's Macintoshian Achaia message board.<hr></blockquote>

Okay. I just made that comment because it sounded like G-News thought that _I_ said the Itanium wasn't shipping yet. My reply was basically meant to say:

Of course it's shipping, but it's still a piece of garbage.

Eric,

randycat99 · March 13, 2002 11:49PM

Well that's a side of THT I have not seen here before! If this is some sort of gig, I must have missed it, cuz I don't know what is going on.

eric d.v.h · March 14, 2002 1:53AM

[quote]Originally posted by rickag:

I became somewhat sarcastic in what I hope was a good natured way and made some jokes concerning building my own computer, but needing some tools. I hope I didn't offend anyone. <a href="http://arstechnica.infopop.net/OpenTopic/page?a=tpc&s=50009562&f=8300945231&m=3360953743&r= 6040919843#6040919843" target="_blank">My feeble attempt at sarcasm</a>

Anyway, I contacted all the "where to buy" sites and none of them listed the MPC8540. I also stumbled across a list of all available processors manufactured by Motorola for the 1st quarter of 2002 and the MPC 8540 is not listed. [ 03-13-2002: Message edited by: rickag ]<hr></blockquote>

[quote]Originally posted by THT:

Yes, the Motorola MPC 8540 is not shipping yet. No one can find a seller for it and no one can find one person that has one. And Motorola themselves say it won't be shipping until 2H 02. That's if Motorola is lucky. One wonders if it has even hit first silicon yet!

Moreover, I'm tired of the 8540, and similarly IBM's 440, used as evidence of a hypothetical processor for Apple's supposed Power Mac G5. They are Book E processors. That's nothing for Apple to crow about, and IBM and Moto call the architecture "Book E" for a reason. Guess what the "E" stands for. We all better hope to whatever power one subscribes to that Apple does not use the 8540 in the desktops and portables.

Otherwise, the 8540 is an embedded processor with no FPU and no AltiVec that clocks from 0.6 to 1 GHz on a 0.13 micron process. That's a 0.13 micron process. It specs out at 2300 mips and is not expected to ship in volume until 2H 02. That is not something for Apple to crow about unless they've got some super secret Darwin based PPC PDA or Tablet. Even then, I think the IBM 750fx might be a better choice.

Feh, the current 7455 G4 processor specs in at 2300 mips at 1 GHz on a 0.18 micron SOI process. It has the same integer performance as a 1 GHz 8540, has actual FPU and AltiVec, and is shipping in the current Power Macs only a mere year earlier than any probable 8540 hardware. Apple gains nothing with the 8540 chip and architecture.<hr></blockquote>

Hmmm? I wasn't saying that Apple should actually use the 8540 _specifically_ in upcoming Macs. I just meant that it was a sign that Motorola was done with the 85xx line's design, and has now or soon will present Apple with a chip to use in motherboard designs. I already knew that the 8540 is one of those wacky embedded chips like the 6xxx and 82xx, but I didn't know that the 8540 wasn't shipping yet. but

[quote]Originally posted by THT:

Motorola and Apple have some straightforward options. The easy way out is to add a second FPU to the 7455,<hr></blockquote>

Absolutely not. Apple needs to go 64-bit as soon as possible.

[quote]Originally posted by THT:

modify the MPX bus for DDR, perhaps bumping one of the integer units to do multiply, divide et al,<hr></blockquote>

I think we can all agree that DDR is bleeding obvious. As for bumping an IPU for another FPU: no. Another FPU can be added in clean, like the way the AltiVec VPU was slid into the G3 core for the G4 (after all, what else is a G4 but a G3 with a VPU?).

[quote]Originally posted by THT:

increasing the backside cache size, and increasing the issue and completion width to 4. Further down the road, modifying the cache design (I think super fast and huge (8 to 128 MB) backside cache is the way to go),<hr></blockquote>

Backside cache is meant to be a small cache, using SRAM, that runs at or near the clock frequency of the CPU. This dumbed-down DDR SDRAM they're using now is moronic.

[quote]Originally posted by THT:

modifying AltiVec to do 64 bit precision ops,<hr></blockquote>

AltiVec is just fine the way it is (although adding another vector unit, similar to the dual FPUs in some CPUs, would be a good idea).

[quote]Originally posted by THT:

extending the pipeline to 10 to 14 stages,<hr></blockquote>

Adding pipeline stages is against the principles of the PowerPC, and RISC itself (hint: I didn't like the 7450 _one bit_, and think they should heave ho the extra stages). The smarter thing would be to add extra pipelines, like in the Alpha.

[quote]Originally posted by THT:

and adding SMT.<hr></blockquote>

Speaking of the Alpha, let's have a moment of silence for the EV8 that never was.

.......................

Ahem. It took DEC a tremendous amount of time, effort and expertise to think of and implement SMT. I seriously doubt that Motorola (or anyone else for that matter?) is going to get their hands on SMT, especially now that Intel owns the EV8's mortal remains (if only these DEC engineers hadn't signed away their intellectual property rights).

[quote]Originally posted by rickag:

The only thing worthwhile in the 8540 I see is the Ocean switched fabric, and in Apple's case, it should be used in a core logic chipset rather than wasting the precious die acreage of the CPU.<hr></blockquote>

Yuck! I hate "Fabric"s. If you're going to bother with extra connections, total interconnect (like in IBM's POWER4 or SGI's HyperCube Cluster) is the only way to go.

Eric,

[ 03-14-2002: Message edited by: Eric D.V.H ]

eric d.v.h · March 14, 2002 2:48AM

[quote]Originally posted by Transcendental Octothorpe:

This really isn't true. Sure, sand is cheap. But it costs a bundle to make it into a 99.99999999% pure single crystal the size of a punching bag. And that's just the beginning. The manufacture process is hugely expensive, in the chemicals they use, the capital equipment they must buy, the protocols they must follow, and the power (both man and energy) to run it all.

Believe me. I walk the testing floor at Big Blue every day, and we have some of the test equipment here in our office. You would not believe how much it costs to run that stuff for even a month.

However, I do agree with you that CD duplication is dirt cheap. The analogy just doesn't hold for Semi Manufacturing, however.<hr></blockquote>

[quote]Originally posted by Eskimo:

Eric, you could not be more wrong. The semiconductor industry would kill to have a manufacturing structure as easy as the automotive industry. Do you have any idea of how semiconductors are manufactured? I wrote up a little thread a while back to try to help out people who had no experience in the field that you might want to check out. <a href="http://forums.appleinsider.com/cgi-bin/ultimatebb.cgi?ubb=get_topic&f=10&t=000022" target="_blank">http://forums.appleinsider.com/cgi-bin/ultimatebb.cgi?ubb=get_topic&f=10&t=000022</a>

The Semiconductor industry is by far the most capital intensive industry in the world. It is also the most technically advanced and precise. That technology and precision does not come cheap. The tools used inside a fab cost millions a piece, and in order to produce in any volume you require multiple pieces of each. A leading edge fab today costs $5 BILLION most of which is in capital equipment. And before you somehow try to rationalize that amount remember that leading edge in the semiconductor industry lasts 5 years at the max. GM and Ford are still producing cars in the same plants they have for decades.

Tools change every year, they get more advanced and at the same time more expensive. The technology to produce the chips of 2005 does not even exist yet. It is being worked on in R&D labs which represent billions of dollars in investment. The tools of today did not exist in 1998. In addition to tools you have all the material costs. As Transcendental Octothorpe mentioned we use silicon that has been made ultra pure by various companies, we then pay companies to form this ultra pure Si into ingots and saw those ingots into wafers for us. A starting 8" wafer will at the minimum cost $50-$100. Then you have to buy and maintain some of the most pure chemicals and solvents in the world on a massive scale.

Semiconductor fabs use HUGE amounts of water and electricity, so much in fact that many towns are wary about new fab construction nowadays and semi companies have to carefully scout for new water sources. The largest user of electricity and water in Austin, TX by far is Motorola, followed by AMD and Samsung, all using these resources for their fabs. Not only do the tools require megawatts of electricity, but it must be delivered reliably with absolutely no voltage variation. This requires individual substations, investments in large voltage regulation systems and backup emergency power. These factories run 24 hours a day and employ thousands of very well paid employees.

Now I know how much it costs to produce some chips but I can't say due to being under NDA and having some professional courtesy for competitors. But suffice to say when you say pennies you are a long ways off. And don't try to compare the chip industry with the CD duplication industry, you do a great deal of highly skilled, highly gifted people a disservice.<hr></blockquote>

I read through that link. Although I already knew a fairly large amount of it, there were some very key parts missing from my knowledge, and clearly still are. Thanks _a lot_ for the information though.

The expenses involved in the production of individual chips _are_ apparently greater than I suspected, but I still doubt that the total yearly costs of developing and producing CPUs come anywhere near enough to contribute, say, $70 of real expenses per chip, assuming 3,000,000 marketable chips per year of output.

I just want to know whether or not it exceeds about $70 per chip averaged over a year, as I'm simply curious as to a top price inflation for a mid-volume chip like the 744x. I won't mind _at all_ if you decline to answer for your aforementioned reasons, and won't ask any more questions on the matter.
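As a rough sanity check, the $70 threshold can be turned into a yearly total. This is only a sketch. The $70 and 3,000,000 figures are from this post, and the $5 billion fab cost and 5 year lifetime are the figures Eskimo quoted. The function names, and the idea of spreading the fab cost evenly over its lifetime, are assumptions made for illustration. How much of a fab is attributable to any one chip line isn't stated anywhere in this thread, so the last step only solves for the share that would be needed.

```python
# Figures taken from the thread; names and the even-spread assumption are illustrative.
THRESHOLD_PER_CHIP = 70.0        # dollars of real expense per chip
CHIPS_PER_YEAR = 3_000_000       # marketable chips per year
FAB_COST = 5_000_000_000.0       # leading-edge fab cost in dollars
FAB_LIFETIME_YEARS = 5           # years a fab stays leading edge


def yearly_total(dollars_per_chip: float, chips_per_year: int) -> float:
    """Yearly spending that averages out to `dollars_per_chip` per chip."""
    return dollars_per_chip * chips_per_year


def average_per_chip(yearly_cost: float, chips_per_year: int) -> float:
    """Average cost per marketable chip for a given yearly spending."""
    return yearly_cost / chips_per_year


threshold_total = yearly_total(THRESHOLD_PER_CHIP, CHIPS_PER_YEAR)

# Assumption: the fab cost is spread evenly over its leading-edge lifetime.
fab_cost_per_year = FAB_COST / FAB_LIFETIME_YEARS

# Share of one fab's yearly cost that would, by itself, add up to the
# threshold. The real share for any particular chip is unknown here.
share_needed = threshold_total / fab_cost_per_year

print(f"threshold total: ${threshold_total:,.0f} per year")
print(f"fab cost per year: ${fab_cost_per_year:,.0f}")
print(f"share of fab needed to reach threshold: {share_needed:.0%}")
```

Plugging any other yearly estimate into `average_per_chip` gives the matching per-chip figure, so the question comes down to what the real yearly total is.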

Eric,

g-news · March 14, 2002 3:37AM

You must not forget that, depending on the production yields, one chip has to pay the cost for its 3 stillborn brothers as well.

With G4 yields down around 15% some time ago, one G4 would cost roughly 7x the price it cost to produce the working chip.
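As a quick sketch of that arithmetic: the cost of every die started gets divided over only the dies that work. The function name is made up for illustration, and the unit cost of 1.0 is a placeholder so the result reads as a multiplier. The 25% case matches one working chip out of four, and the 15% case is the G4 figure mentioned above.

```python
def cost_per_good_chip(cost_per_die: float, yield_fraction: float) -> float:
    """Spread the cost of all dies started over the fraction that work."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    return cost_per_die / yield_fraction


# Placeholder unit cost of 1.0, so the output is a cost multiplier.
for yield_fraction in (0.25, 0.15):
    multiplier = cost_per_good_chip(1.0, yield_fraction)
    print(f"yield {yield_fraction:.0%}: {multiplier:.2f}x")
```

At 15% the multiplier works out to a bit under 7, which is where the "roughly 7x" comes from.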

If you cook a meal and spoil it, it still cost the energy, work and ingredients you put into it, even if you can't sell it to a customer anymore.

Most industry branches opt for as few "failures" as possible. The IT industry is one of the few industries where error correction is to quite some degree beyond human capabilities.

There's just no way to stop acid from burning out a little too much silicon, if the silicon has a little flaw here and there etc...

I'm pretty sure that, out of a range of speeds on a certain chip, the lowest speeds are very, very near production cost. The money is made on the high speed chips that typically cost up to 6x more than the lowest end models (yet are based on the same production process).

G-news

eric d.v.h · March 14, 2002 4:48AM

[quote]Originally posted by IntlHarvester:

I didn't know that Apple was so involved in the design, and thought they were coming at it from more of a customer perspective (I've heard they looked hard at Sparc, Alpha, and even x86) -- thanks for the info.<hr></blockquote>

You're welcome.

[quote]Originally posted by IntlHarvester:

Nope -- Several years before CHRP, IBM and Moto released a spec called PReP, designed to support NT, OS/2, and AIX. Apple did not participate at all and apparently there were some fundamental incompatibilities with MacOS. These machines actually shipped, although IBM backed off when they realized that OS/2-PPC was never going to get off the ground.<hr></blockquote>

Hmmm? It appears you're absolutely correct.

<a href="http://home1.gte.net/res008nh/nt/ppc/default.htm" target="_blank">IBM support page for MS Windows NT/PPC</a>

<a href="http://www.ora.com/reference/dictionary/terms/P/PowerPC_Reference_Platform.htm" target="_blank">Dictionary definition of PReP</a>

[quote]Originally posted by IntlHarvester:

From all appearances, at the launch of PowerPC, Apple was insistent on their old strategy of being a single source provider. Maybe a year or two later, in desperation, Apple did a 180 and decided to support cloning. Only problem was that their OS wasn't properly abstracted from the hardware, which meant that the 'clone' companies had to buy PowerMac board designs from Apple. CHRP was the solution, and Apple planned to include CHRP support in the upcoming "Copland" release.<hr></blockquote>

Quite right, but actually, Apple was working on a <a href="http://www.applefritter.com/macosppc/starmax/index.html" target="_blank">System 7.6 CHRP enabler</a> too.

[quote]Originally posted by IntlHarvester:

What happened? Motorola finally gave up selling Windows-based PPC machines, Copland never shipped, and Apple did another 180 and canned their clone strategy. CHRP support became pointless because you couldn't buy a CHRP board if you wanted to, and Apple wouldn't sell you the OS (for business reasons) anyway. For all we know, OS X has kick-ass CHRP support...<hr></blockquote>

I really can't think of what happened. Copland was practically done (for those who would refute me, I point to the fact that, mere months after being fired, the Copland team regrouped, formed Be, made BeOS, a world class and highly polished OS if ever there was one, and sold it on the BeBox, which they also built from scratch. I find it highly unlikely that a bunch of stumbling dullards that make such a mess as Copland is popularly portrayed as could pull off such a feat as Be), CHRP actually WAS done, and OpenDoc was already out. They (Apple) had it all set up, and I honestly can't think of a logical reason why they blew it to bits. I guess that's one for the ages.

[quote]Originally posted by IntlHarvester:

Bottom line is that AIM was predicting 25-50% market penetration for PowerPC in the desktop space, and hopefully it's obvious that they weren't counting on MacOS to deliver all of that. They ended up with 4%, and that dictates a very different investment strategy in the architecture.<hr></blockquote>

Yeah. Apple should have first nailed Intel with CHRP, and then gone mano a mano with Microsoft.

[quote]Originally posted by IntlHarvester:

Conspiracy theories are interesting, and no doubt there was quite a bit of conspiracy going on.<hr></blockquote>

Just call me "Mr. paranoiac" <img src="graemlins/lol.gif" border="0" alt="[Laughing]" /> .

[quote]Originally posted by IntlHarvester:

However, my opinion is that it's much more plausible that the massive investment in RISC chips in the early 90s was based on an incorrect business assumption, and that there were perfectly good reasons why this investment slacked off.<hr></blockquote>

The "Massive investment in RISC chips in the early 90s" was based on the same thing as the massive investment in CISC chips in the early 60s. RISC was just the "New thang" for CPUs back then. The CPU industry players were merely attempting to continue their earlier efforts into the next decade. Some win and some lose. It's a basic part of the free market (even the corporately distorted fascist version in the US). The anomaly in this history is, of course, the artificial, unethical and often illegal impeding and delaying of better products, and the preservation and continuation of older, clearly inferior technologies, by Intel. This is why we hate them, and why they must be stopped (but then, you already knew that).

[quote]Originally posted by IntlHarvester:

Alpha, MIPS, and PowerPC just never made it into the low end (NT) markets, primarily because Intel scaled up much further than anyone thought they would.<hr></blockquote>

I think you already know how I think Intel "Acquired" that lead.

[quote]Originally posted by IntlHarvester:

In the case of DEC, they bet the company on Alpha, and lost. Where should that leave Alpha?<hr></blockquote>

DEC's failure? Aside from the later legal beating Intel gave them, DEC's failure mostly stemmed from their so called "Stealth marketing" (supposedly even better than Apple's :eek: ). One famous joke among DEC-ites of the era was:

[quote]From a 1997 DECUS symposium keynote:

Bruce Claflin's keynote presentation was preceded by opening remarks from DECUS US Chapter President Joe Pollizzi, who noted that DECUS members "continue to buy DEC products despite DEC's best efforts to the contrary." Once the laughter died down, the man responsible for ensuring that customers continue to buy DEC gear took the podium.<hr></blockquote>

Much like Apple, a large amount of their bad condition could have been solved by some slightly more aggressive (or perhaps existent) advertising and sales.

As for the Alpha, Compaq should have either tried a _little_ bit harder to promote it, or sold it to someone better suited for the awesome stewardship of the Alpha.

[quote]Originally posted by IntlHarvester:

Hopefully you got the irony. The real question is did Intel just **** up, or did they deliberately mislead people in order to get RISC investment to slack off earlier than it should have? The companies without a backup plan (Compaq and HP) missed a lot of $$$ during the dotcom days.<hr></blockquote>

Who knows, but if you ask me, I think it was all a massively orchestrated plot of darkness, evil and FUD by Intel and possibly Microsoft, to SCARE those involved with RISC into hesitating long enough for Intel to come at their throats, as well as to bring such fear into some of them that they would come crawling and begging on their knees (read: HP) to Intel for mercy. But that's just my opinion. Also, most of this transpired before the internet really took off (1993-1998).

Eric,

eric d.v.h · March 14, 2002 5:12AM

[quote]Originally posted by G-News:

You must not forget that, depending on the production yields, one chip has to pay the cost for its 3 stillborn brothers as well.

With G4 yields down around 15% some time ago, one G4 would cost roughly 7x the price it cost to produce the working chip.

If you cook a meal and spoil it, it still cost the energy, work and ingredients you put into it, even if you can't sell it to a customer anymore.

Most industry branches opt for as few "failures" as possible. The IT industry is one of the few industries where error correction is to quite some degree beyond human capabilities.

There's just no way to stop acid from burning out a little too much silicon, if the silicon has a little flaw here and there etc...<hr></blockquote>

Note my usage of the term "Marketable chips per year of output".

[quote]Originally posted by G-News:

I'm pretty sure that, out of a range of speeds on a certain chip, the lowest speeds are very, very near production cost. The money is made on the high speed chips that typically cost up to 6x more than the lowest end models (yet are based on the same production process).<hr></blockquote>

Yeah. I recall that sleazy garbage started when Apple had to deal with IBM (I certainly don't recall them calling it the Quadra 950/33). Apple should have gone with ANYONE else (look at my list of alternatives in my earlier post), or just skipped it and gone with a straight Motorola 8xxxx.

Eric,

rickag · March 14, 2002 8:31AM

[quote]Eric D.V.H

quote:

------------------------------------------------------------------------

Originally posted by rickag:

The only thing worthwhile in the 8540 I see is the Ocean switched fabric, and in Apple's case, it should be used in a core logic chipset rather than wasting the precious die acreage of the CPU.

------------------------------------------------------------------------<hr></blockquote>

Uh, I actually didn't say this, I think THT did, I don't have anywhere near the expertise to make such a statement.

That said, if you wander over to Ars Technica, and read some of BadAndy's posts, I tend to think he would agree with THT. I say "tend to think" because a lot of what you, THT, RazzFazz, Eskimo and the many knowledgeable people post is quite beyond my knowledge.

If I'm not mistaken, BadAndy mentioned his dream machine using switched fabric with 2 controllers for 8 MPC8540 cores (8 cores, 2 controllers?). I'll apologize to BadAndy in advance if I screwed this up. <a href="http://arstechnica.infopop.net/OpenTopic/page?a=tpc&s=50009562&f=8300945231&m=1770949283&r= 6200924593#6200924593" target="_blank">The Link</a>

programmer · March 14, 2002 9:49AM

Copland wasn't nearly done when Apple killed it -- a bunch of the technology was established and functioning at the 90% level, but it wasn't anything close to being a shippable system for Apple. The compatibility stuff that had been planned, and various bits and pieces, still had a fair way to go. There is a rule of thumb in the software industry that the last 10% of a project takes 90% of the effort, and Copland was so disorganized that it wasn't making progress on that last 10%.

tht · March 14, 2002 12:12PM

Originally posted by Eric D.V.H:

Hmmm? I wasn't saying that Apple should actually use the 8540 _specifically_ in upcoming Macs. I just meant that it was a sign that Motorola was done with the 85xx line's design, and has now or soon will present Apple with a chip to use in motherboard designs.

That's quite a leap of speculation. Oh, I forgot this was AI. Nevermind.

Originally posted by THT:

Motorola and Apple have some straightforward options. The easy way out is to add a second FPU to the 7455

Absolutely not. Apple needs to go 64-bit as soon as possible.

Hmm... brand new 64 bit PowerPC chip versus modifying current 7455? Which one would be the easy way out? It seems fairly obvious.

If Motorola comes out with a 64 bit CPU with multiple FPUs, not just one, all the more power to them and great for Apple. I hope they got started in 1999 if they want to deliver by 2003.

Originally posted by THT:

modify the MPX bus for DDR, perhaps bumping one of the integer units to do multiply, divide et al,

I think we can all agree that DDR is bleeding obvious. As for bumping an IPU for another FPU: no. Another FPU can be added in clean, like the way the AltiVec VPU was slid into the G3 core for the G4

The 7450 based G4s have 4 integer units. The G3 and 7400 based G4s (the G3 with AltiVec) have 2 integer units. Of the 4 in the 7450 based G4s, 1 IU can do all integer ops including multiplies and divides and certain logical and register ops. The other 3 can do all integer ops except multiplies and divides and certain logical and register ops. When I say "bumping one of the integer units", it doesn't mean making it an FPU. It means bumping 1 of the 3 simpler IUs to do multiplies and divides.

The more execution resources the better, which is also the basic idea for adding another scalar FPU to the 7455.

Originally posted by THT:

increasing the backside cache size, and increasing the issue and completion width to 4. Further down the road, modifying the cache design (I think super fast and huge (8 to 128 MB) backside cache is the way to go),

Backside cache is meant to be a small cache, using SRAM, that runs at or near the clock frequency of the CPU. This dumbed-down DDR SDRAM they're using now is moronic.

I foresee that large cache designs are the only way to go. It should be fairly obvious. Large memory architectures will never keep up with CPU clock rates or performance. With increasingly large software loads, CPU designs have no choice but to have large on-die cache and large backside cache.
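The reasoning can be sketched with the usual average memory access time formula: the cost at each cache level is its hit time plus its miss rate times the cost of the next level out. This is a minimal sketch, not a model of any real chip. Every latency and miss rate below is hypothetical, picked only to show the trend, and it assumes the cache hit times stay fixed in CPU cycles while main memory latency stays fixed in nanoseconds.

```python
def average_access_cycles(cache_levels, memory_cycles):
    """Average memory access time, in CPU cycles.

    cache_levels: list of (hit_cycles, miss_rate) pairs, nearest cache first.
    memory_cycles: cost in cycles of going all the way to main memory.
    """
    total = float(memory_cycles)
    # Work from main memory back toward the CPU:
    # cost at a level = hit time + miss rate * (cost of the next level out).
    for hit_cycles, miss_rate in reversed(cache_levels):
        total = hit_cycles + miss_rate * total
    return total


def latency_in_cycles(latency_ns, clock_ghz):
    """A fixed memory latency costs more cycles as the CPU clock rises."""
    return latency_ns * clock_ghz


# Hypothetical numbers, chosen only for illustration.
on_die = (3, 0.05)             # (hit cycles, miss rate)
small_backside = (15, 0.40)    # smaller and faster, but misses more often
large_backside = (25, 0.10)    # bigger and slower, but misses less often

for clock_ghz in (1.0, 2.0):
    dram = latency_in_cycles(100, clock_ghz)   # assumed 100 ns main memory
    small = average_access_cycles([on_die, small_backside], dram)
    large = average_access_cycles([on_die, large_backside], dram)
    print(f"{clock_ghz} GHz: small {small:.2f} cycles, large {large:.2f} cycles")
```

With these made-up numbers the larger, slower backside cache comes out ahead, and the gap widens as the clock goes up. Different miss rates could flip the result, so whether a big cache pays off depends on the software load.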

The DDR SDRAM thing in the new Power Macs seems to be a marketing blunder to me. The 7455 technical documents say that it supports "MSUG2 dual data rate (DDR) synchronous Burst SRAMs, PB2 pipelined synchronous Burst SRAMs, and pipelined (register-register) Late Write synchronous Burst SRAMs". So until someone pops open their new Power Mac, takes the heat sink off the CPU module and takes a picture of the DDR SDRAM, I won't believe it's DDR SDRAM.

Originally posted by THT:

modifying AltiVec to do 64 bit precision ops,

AltiVec is just fine the way it is (although adding another vector unit, similar to the dual FPUs in some CPUs, would be a good idea).

The more execution resources the better, like having more than one FPU and double precision AltiVec.

Originally posted by THT:

extending the pipeline to 10 to 14 stages,

Adding pipeline stages is against the principles of the PowerPC, and RISC itself (hint: I didn't like the 7450 _one bit_, and think they should heave ho the extra stages). The smarter thing would be to add extra pipelines, like in the Alpha.

The Alpha 21264 has a 7 stage execution pipeline just like the 7450. And it has two FPUs. Wait, didn't I suggest that for the 7450?

And in today's world of 10+ million transistor CPUs being the low end, ISA has very little relevance to microarchitecture anymore. The only thing the PowerPC ISA does is give assembly programmers less of a headache.

Ahem. It took DEC a tremendous amount of time, effort and expertise to think of and implement SMT. I seriously doubt that Motorola (or anyone else for that matter?) is going to get their hands on SMT.

SMT is not the exclusive IP of DEC, now Intel. Only their particular circuit designs are. If it were, one wonders how Sun did it with the MAJC, a shipping processor, and will do it for the upcoming generation of UltraSPARC CPUs.

Yuck! I hate "Fabric"s. If you're going to bother with extra connections, total interconnect (like in IBM's POWER4 or SGI's HyperCube Cluster) is the only way to go.

What extra connections? The Ocean design is an on-die fabric between all the IO on the 8540. Apple can put one in a core logic chipset to bridge all the IO they have. Sort of like now, but better.

tht · March 14, 2002 12:20PM

Originally posted by Eric D.V.H:

For those who would refute me, I point to the fact that, mere months after being fired, the Copland team regrouped, formed Be, made BeOS, a world class and highly polished OS if ever there was one, and sold it on the BeBox, which they also built from scratch. I find it highly unlikely that a bunch of stumbling dullards that make such a mess as Copland is popularly portrayed as could pull off such a feat as Be

I would very much like for you to prove this statement. In particular: "I point to the fact that, mere months after being fired, the Copland team regrouped, formed Be, made BeOS."

tht · March 14, 2002 1:21PM

Originally posted by Randycat99:

Well that's a side of THT I have not seen here before! If this is some sort of gig, I must have missed it, cuz I don't know what is going on.

Oh come on, can't a person have some fun at satirizing the pomposity of it all, including my own? It was meant to be funny.

I'm not really as serious as I appear to be.

And Transcendental Octothorpe, no I don't think I'm high. Last time I checked, my elevation is only about 30 ft above sea level. If a big hurricane comes through, I'm fish food instead of worm food. Now, Eskimo on the other hand is like almost up there in the North pole.

Seriously, Eskimo, if you didn't really appreciate it, my apologies.

randycat99 · March 14, 2002 11:26PM

[quote]Originally posted by THT:

Oh come on, can't a person have some fun at satirizing the pomposity of it all, including my own? It was meant to be funny.

I'm not really as serious as I appear to be.<hr></blockquote>

Oh, I don't mind your episode there- just never seen you cut loose like that before.

I'm elated to hear your suggestion for bigger on-die caches. It was good to hear you say it, because I was saying something similar to that here:

<a href="http://forums.appleinsider.com/cgi-bin/ultimatebb.cgi?ubb=get_topic&f=10&t=001074" target="_blank">http://forums.appleinsider.com/cgi-bin/ultimatebb.cgi?ubb=get_topic&f=10&t=001074</a>

Maybe the context I bring it up in is a bit bogus as a justification for larger caches, but I certainly do agree with your point.

eric d.v.h · March 15, 2002 4:10AM

[quote]Originally posted by rickag:

Uh, I actually didn't say this, I think THT did, I don't have anywhere near the expertise to make such a statement.

That said, if you wander over to Ars Technica, and read some of BadAndy's posts, I tend to think he would agree with THT. I say "tend to think" because a lot of what you, THT, RazzFazz, Eskimo and the many knowledgeable people post is quite beyond my knowledge.

If I'm not mistaken, BadAndy mentioned his dream machine using switched fabric with 2 controllers for 8 MPC8540 cores (8 cores, 2 controllers?). I'll apologize to BadAndy in advance if I screwed this up. <a href="http://arstechnica.infopop.net/OpenTopic/page?a=tpc&s=50009562&f=8300945231&m=1770949283&r= 6200924593#6200924593" target="_blank">The Link</a><hr></blockquote>

Sorry about that. I misquoted it (I assemble my longer posts in SimpleText, and do the UBB code through a mixture of pasting in from the "Instant UBB Code" buttons and hand-writing in SimpleText, so I must've forgotten which post it was from).

Eric,

eric d.v.h · March 15, 2002 5:47AM

[quote]Originally posted by Programmer:

Copland wasn't nearly done when Apple killed it -- a bunch of the technology was established and functioning at the 90% level, but it wasn't anything close to being a shippable system for Apple. The compatibility stuff that had been planned, and various bits and pieces, still had a fair way to go. There is a rule of thumb in the software industry that the last 10% of a project takes 90% of the effort, and Copland was so disorganized that it wasn't making progress on that last 10%.<hr></blockquote>

[quote]Originally posted by THT:

I would very much like for you to prove this statement. In particular: "I point to the fact that, mere months after being fired, the Copland team regrouped, formed Be, made BeOS."<hr></blockquote>

AHWOOOOOOPS <img src="graemlins/surprised.gif" border="0" alt="[Surprised]" /> . I was TOTALLY WRONG. Let's start from the top. Here's what _actually_ happened:

Be Inc.:

In 1990, Jean-Louis Gassée left Apple to form Be Inc. They started work on BeOS. It originally ran only on AT&T's Hobbit (a code optimized chip, like Sun's MAJC, only it was optimized for C instead of Java). After several years, Gassée decided to port it to the PowerPC, and in August of 1996, the BeOS and BeBox that we know and love today were unveiled.

Copland:

In late September of 1996 (just one month after the BeBox was announced, just by _sheer_ coincidence of course) the lid came off of Copland's troubles, as this was when Apple started to speak of chopping Copland into bits instead of just waiting for one awe inspiring release. Basically, this was the true end of Copland as originally envisioned. Oh well.

Although I wonder as to the true fate and final condition of Copland right up to the point of Hancock's massacre (that's a topic for another thread :cool: ), there's a fairly high likelihood that Programmer's guess is correct.

(PS: It was incredibly difficult to locate any information on BeOS's origins and Copland's final days, and utterly impossible to find out _anything_ about Be Inc's first four years aside from its founding date. I'd appreciate some more info/links from anyone out there in the crowd.)

Eric,

eric d.v.h · March 15, 2002 10:07AM

[quote]Originally posted by THT:

That's quite a leap of speculation. Oh, I forgot this was AI. Nevermind.<hr></blockquote>

Yup. like I said. it's a sign.

[quote]Originally posted by THT:

Hmm... brand new 64 bit PowerPC chip versus modifying current 7455? Which one would be the easy way out? It seems fairly obvious.

If Motorola comes out with a 64 bit CPU with multiple FPUs, not just one, all the more power to them and great for Apple. I hope they got started in 1999 if they want to deliver by 2003.<hr></blockquote>

I was just saying that "The easy way out" should be avoided like the plague by Apple and Motorola. and I doubt it will take THAT long.

[quote]Originally posted by THT:

The 7450 based G4s have 4 integer units. The G3 and 7400 based G4s (the G3 with AltiVec) have 2 integer units. Of the 4 in the 7450 based G4s, 1 IU can do all integer ops including multiplies and divides and certain logical and register ops. The other 3 can do all integer ops except multiplies and divides and certain logical and register ops. When I say "bumping one of the integer units", it doesn't mean making it an FPU. It means bumping 1 of the 3 simpler IUs to do multiplies and divides.<hr></blockquote>

Oh. okay.

[quote]Originally posted by THT:

The more execution resources the better, which is also the basic idea for adding another scalar FPU to the 7455.<hr></blockquote>

Hmmm? in my opinion. all of the respective processing units should be able to do all operations related to their role.

[quote]Originally posted by THT:

I foresee that large cache designs are the only way to go. It should be fairly obvious. Large memory architectures will never keep up with CPU clock rates or performance. With increasingly large software loads, CPU designs have no choice but to have large on-die cache and large backside cache.<hr></blockquote>

Such a design would be absurd. if you're going to have large caches. you might as well just speed up the main RAM more(And don't think this would require any development. there are plenty of fast RAM technologies waiting in the high end as we speak. er? type). as CPUs NEED a _fast_, medium sized cache(L3). as for a compromise. this whole thing reminds me of an old idea I had a couple of years ago.

Why not have something between main RAM and CPU cache? it would use extremely fast. (Quadruple Data Rate. or "Quad pumped") SDRAM chips. which would be faster than the main RAM(Which would be DDR DRAM). and slower than the CPU cache(Which would be extremely fast SRAM). to balance it out. you might want to slow down the main RAM a bit. and speed up the CPU cache a bit. an example system would be:

CPU: 1GHz(Let's just use a slightly modified 7455 in our example. for simplicity's sake)

On chip L1: 256k SRAM, 128-bit(To match AltiVec), 1GHz

On chip L2: 1MB(In two 512k pieces) SRAM, 128-bit(To match AltiVec), 1GHz

Off chip, upgradable, backside L3 cache: 2-8MB SRAM, 128-bit(To match AltiVec), 1GHz

CaRAMche : 16-128MB interleaved QDR SDRAM, 128-bit, 250MHz(Effectively 1GHz. due to quadruple data rate)

Main RAM: 128-8,192MB interleaved DDR SDRAM, 64-bit, 166MHz(Effectively 332MHz. due to double data rate)
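For a rough sense of what those numbers imply, here is a minimal sketch that computes the theoretical peak bandwidth of each tier as bus width times clock times transfers per clock. The function name and the tier labels are just for illustration, and the figures are the hypothetical ones from the list above, not measurements of any real part. Peak bandwidth also says nothing about latency, which is where SRAM and SDRAM really differ.

```python
def peak_bandwidth_gb_s(bus_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_bits            -- data bus width in bits
    clock_mhz           -- base clock in MHz
    transfers_per_clock -- 1 for single data rate, 2 for DDR, 4 for QDR
    """
    bytes_per_transfer = bus_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# Hypothetical tiers taken from the example system in the post.
TIERS = [
    ("L1/L2/L3 SRAM", 128, 1000, 1),
    ("CaRAMche (QDR SDRAM)", 128, 250, 4),
    ("Main RAM (DDR SDRAM)", 64, 166, 2),
]

for name, bits, mhz, pump in TIERS:
    print(f"{name}: {peak_bandwidth_gb_s(bits, mhz, pump):.2f} GB/s peak")
```

On paper the quad-pumped middle tier matches the SRAM tiers in peak bandwidth, so any advantage of the SRAM caches in this example would have to come from lower latency rather than raw throughput.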

I don't think it's even possible to fit more than ten megabytes of storage in a CPU without it ending up the size of a VHS cassette(Like the Merced).

[quote]Originally posted by THT:

The DDR SDRAM thing in the new Power Macs seems to be a marketing blunder to me. The 7455 technical documents say that it supports "MSUG2 dual data rate (DDR) synchronous Burst SRAMs, PB2 pipelined synchronous Burst SRAMs, and pipelined (register-register) Late Write synchronous Burst SRAMs". So until someone pops open their new Power Mac, takes the heat sink off the CPU module and takes a picture of the DDR SDRAM, I won't believe it's DDR SDRAM.<hr></blockquote>

[quote]From a reply of mine to a similar statement by G-News:

I fervently wish you were right. but I sincerely doubt it. after <a href="http://www.apple.com/quicktime/qtv/mwsf02/" target="_blank">Steve Jobs</a>, Apple's <a href="http://www.apple.com/powermac/architecture.html" target="_blank">G4 page</a>. and even their own <a href="http://developer.apple.com/techpubs/hardware/Developer_Notes/Macintosh_CPUs-G4/PowerMacG4/pmg4.pdf" target="_blank">developer notes</a> all say "DDR SDRAM". the chance of it being a mistake is rather slim.<hr></blockquote>

[quote]Originally posted by THT:

The more execution resources the better, like having more than one FPU and double precision AltiVec.<hr></blockquote>

Yeah.

[quote]Originally posted by THT:

The Alpha 21264 has a 7 stage execution pipeline just like the 7450. And it has two FPUs. Wait, didn't I suggest that for the 7450?

And in today's world of 10+ million transistor CPUs being the low end, ISA has very little relevance to microarchitecture anymore. The only thing the PowerPC ISA does is give assembly programmers less of a headache.<hr></blockquote>

The G4 ALREADY HAS dual FPUs. as for adding pipelines. I _still_ think they should shrink the pipeline back down to five stages. and instead add parallel(But still short) pipelines. RISC was a good idea back then. and it's still a good idea now. all this extra baggage is precisely the reason it takes so long for AIM to design PPCs now.

[quote]Originally posted by THT:

SMT is not the exclusive IP of DEC, now Intel. Only their particular circuit designs are. If it was, one wonders how Sun did it with the MAJC, a shipping processor, and will do it for the upcoming generation of UltraSPARC CPUs.<hr></blockquote>

I wasn't saying that DEC owned some kind of exclusive domain over SMT. I was just pointing out that even the finest engineers on earth(The people of Project Alpha), working on the finest CPU on earth(The EV8). STILL required an _immense_ amount of time, effort and expertise to achieve such an engineering marvel(Sun sort of cheated with MAJC. as it was a very simple chip. as opposed to the full blown CPU the EV8 was. and MAJC was kinda icky too. mostly due to Java. which is total garbage). and for someone like AIM to do so as well. within the foreseeable future. would require a great deal of prior effort. which I think we'd have noticed by now if they'd been doing it.

[quote]Originally posted by THT:

What extra connections? The Ocean design is an on-die fabric between all the IO on the 8540. Apple can put one in a core logic chipset to bridge all the IO they have. Sort of like now, but better.<hr></blockquote>

What I'm saying is that fabrics, switches, chains and loops are _all_ inferior to total interconnect(I wish it had a snappy name that just rolled off the tongue like "Fabric").

The distinguishing feature of total interconnect is that every single device in the interconnect has a dedicated. two way connection to every other device in the interconnect. so a four device total interconnect would have six connections, an eight device total interconnect would have twenty-eight connections and so on and so forth. total interconnect can get expensive very rapidly if you have too many devices. but it is VERY fast. and simply can't be beat in a simple network of devices.
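The six and twenty-eight figures follow from counting device pairs: n devices need n(n-1)/2 dedicated links. A small sketch of that arithmetic is below. The function names are made up for illustration, and the comparison with a single central switch (one link per device) is an assumed baseline, not a description of any particular product.

```python
def total_interconnect_links(devices: int) -> int:
    """Dedicated two-way links needed so every pair of devices is directly connected."""
    if devices < 0:
        raise ValueError("device count must be non-negative")
    # Each of the n devices pairs with the other n - 1; each link is shared by two devices.
    return devices * (devices - 1) // 2


def central_switch_links(devices: int) -> int:
    """Assumed baseline: each device has one link to a single central switch."""
    if devices < 0:
        raise ValueError("device count must be non-negative")
    return devices


for n in (2, 4, 8, 16, 32):
    print(f"{n:>2} devices: {total_interconnect_links(n):>3} direct links, "
          f"{central_switch_links(n):>2} switch links")
```

The link count grows quadratically with the number of devices, which is why the approach gets expensive so quickly beyond a handful of devices.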

Eric,

[ 03-15-2002: Message edited by: Eric D.V.H ]

programmer · March 15, 2002 10:29AM

[quote]Originally posted by THT:

Motorola and Apple have some straight forward options. The easy way out is to add a second FPU to the 7455, modify the MPX bus for DDR, perhaps bumping one of the integer units to do multiply, divide et al, increasing the backside cache size, and increasing the issue and completion width to 4. Further down the road, modifying the cache design (I think super fast and huge (8 to 128 MB) backside cache is the way to go), modifying AltiVec to do 64 bit precision ops, extending the pipeline to 10 to 14 stages, and adding SMT.<hr></blockquote>

So they should really take the G4 and...

Add another complex integer unit (or two)

Add another FPU unit (or two)

Extend the integer registers to 64-bit

Extend all the integer units to 64-bit

Lengthen all the pipelines again to 10 or 14 to ensure higher clock rates

Double the number of register reservation stations

Double the dispatch and completion unit capacity

Add an on-die memory controller or high speed bus interface

Put the largest L2 cache possible on-die

Hmmm. That sounds just like a G5 to me, no matter what you call it or what it was based on originally. No single bullet item is that hard, but they add up to quite a bit of work. Potentially Moto could add the bullet items one at a time because they are all relatively independent (although I'm sure the chip layout people would disagree!)... but that leads to an awfully large number of chips to put through the development & production pipeline. The last round of rumours (7560 & 7500) could actually be this strategy -- take half my list and put it into the 7560, and add the other half in the 7500.

The 8540 isn't evidence of the "G5" (i.e. the next chip Apple will use), and neither is the existence of "E Book". There is no evidence of the next thing except that way back just post-604 / pre-G3 they were talking about how they had started their Core2K project, and the fact that big R&D companies always overlap their product development cycles so that the interval between products isn't so long. Clearly if they have had it in development for that long it has hit a few bumps in the road and taken some missteps, because it is late by any account.

I'm not a big fan of longer pipes, either, but if it's an overall win then it makes sense. It isn't inherently counter to RISC's philosophy, but super-scalar was the original tenet of POWER. Still, if going from a 7-stage to a 10-stage pipe gives them more than a 50% improvement in clock rate then it might make sense. Certainly more super-scalar is interesting, but even with the 604 they were running into the problem of how to keep all those units busy -- Intel's HyperThreading idea can definitely address this, although it sounds like the first version of that isn't living up to expectations (at least without recompiling the threads to pretend that they know about fewer execution units). Having a lot more execution units would likely make HyperThreading work better (BTW: is this what SMT means?).

I don't think extending AltiVec to double precision is a good idea. Doubling the register size would make the context switching hugely expensive, and having only 2 doubles in a vector isn't really worth the effort to vectorize the code. It would also be troublesome to avoid affecting existing AltiVec code. Better to just make the scalar double precision twice as fast (via higher clock rate & super scalar), and then use AltiVec's cache controls to improve memory access. Improving the scalar unit(s) will also benefit the SPECmarks.

I don't think we'll see truly huge caches. Off-chip caches are better replaced by a faster memory system (i.e. HyperTransport or RapidIO to fast memory), and there are limits to how big you want to push the L2 cache. If you make it huge then it will cause your die size to increase and this will decrease your yields. As a cache's size increases its effect on performance diminishes, so there is a real "sweet spot" that currently seems to be about 0.5 megabytes for the onchip L2. With the AltiVec cache streaming it is possible to ensure your data is headed for the cache well before you need it.
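To illustrate the diminishing-returns point, here is a toy sketch using the standard average memory access time formula (hit time plus miss rate times miss penalty). Everything numeric in it is an assumption chosen for illustration: the 4% miss rate at 256 KB, the 10-cycle hit time, the 150-cycle miss penalty, and especially the rule of thumb that miss rate falls with the square root of cache size. None of these are measurements of a G4 or any other chip.

```python
import math


def miss_rate(size_kb, base_rate=0.04, base_kb=256):
    """Assumed rule of thumb: miss rate scales with 1/sqrt(cache size).

    base_rate at base_kb is a made-up starting point, not a measured value.
    """
    return base_rate * math.sqrt(base_kb / size_kb)


def amat(size_kb, hit_cycles=10, miss_penalty_cycles=150):
    """Average memory access time in cycles: hit time + miss rate * miss penalty."""
    return hit_cycles + miss_rate(size_kb) * miss_penalty_cycles


previous = None
for size_kb in (256, 512, 1024, 2048, 4096):
    current = amat(size_kb)
    gain = "" if previous is None else f" (saves {previous - current:.2f} cycles)"
    print(f"{size_kb:>5} KB L2: {current:.2f} cycles average{gain}")
    previous = current
```

Under these assumptions each doubling of the cache saves fewer cycles than the one before, while the die area it costs keeps doubling, which is the trade-off behind the idea of a sweet spot.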

programmer · March 15, 2002 10:48AM

[quote]Originally posted by Eric D.V.H:



The G4 ALREADY HAS dual FPUs. as for adding pipelines. I _still_ think they should shrink the pipeline back down to five stages. and instead add parallel(But still short) pipelines. RISC was a good idea back then. and it's still a good idea now. all this extra baggage is precisely the reason it takes so long for AIM to design PPCs now.

<hr></blockquote>

You're going to have to back that one up with some documentation from Motorola -- I've been waiting for dual FPU execution units since the 604, and I'm pretty sure I haven't missed its arrival!

 [quote]

Why not have something between main RAM and CPU cache?

<hr></blockquote>

Because it's too complex and would result in diminishing returns? It might make sense to replace the L3 cache, however. A few years ago I read a paper that suggested that it might make more sense to treat the off-chip high speed memory as a virtual memory paging space, rather than as a cache. The idea here is to have a larger amount of memory (8-32 mbytes), and treat the main memory as a backing store for the virtual memory system which then would page into the fast memory. Clearly, however, this depends greatly on your access patterns. If you are just touching small pieces of memory spread across many more megabytes than is contained in the fast memory you are going to thrash the system (just like virtual memory thrashes to the hard disk at times).

tht · March 15, 2002 11:17AM

[quote]Originally posted by Eric D.V.H:

I was just saying that "The easy way out" should be avoided like the plague by Apple and Motorola. and I doubt it will take THAT long.<hr></blockquote>

In the early and mid 90s, chip development times were less. But in today's world of 30+ million transistors, development times have doubled. I don't think there is a new desktop processor today that takes less than 4 years to ship.

[quote]I don't think it's even possible to fit more than ten megabytes of storage in a CPU without it ending up the size of a VHS cassette<hr></blockquote>

I said "CPU designs have no choice but to have large on-die cache and large backside cache". On-die cache would be in the area of 1 MB and large backside cache would be in the area of 8 to 128 MB. Essentially, I was contemplating how a quad short channel Rambus solution would help increase performance. It can provide 32 to 128 MB (8 to 32 MB per channel) of memory at 6 to 10 GByte/s of bandwidth.

[quote]The G4 ALREADY HAS dual FPUs. as for adding pipelines. I _still_ think they should shrink the pipeline back down to five stages. and instead add parallel(But still short) pipelines. RISC was a good idea back then. and it's still a good idea now. all this extra baggage is precisely the reason it takes so long for AIM to design PPCs now.<hr></blockquote>

The G4 has one scalar FPU and one vector FPU. It needs a second scalar FPU. If you think AltiVec is fine at single precision and Programmer thinks that double precision is too expensive, then a second scalar FPU is needed.

PPC development hasn't stagnated because of baggage. It's a very clean design that can be modified fairly easily. The problem is that IBM and Moto have no economic stimulus to do so. That's why I've been saying for the longest time, almost 5 years now, that Apple has to take control of PPC development.

[quote]I was just pointing out that even the finest engineers on earth(The people of Project Alpha), working on the finest CPU on earth(The EV8). STILL required an _immense_ amount of time, effort and expertise to achieve such an engineering marvel<hr></blockquote>

You're overstating the case. If it's such a tremendous effort, one wonders how an even smaller company can implement SMT:

<a href="http://www.clearwaternetworks.com/news_events_2.html" target="_blank">Clearwater Networks, Formerly XStream Logic, Unveils CNP810SP™ Network Services Processor</a>

Processor Uses Simultaneous Multithreading (SMT) to Provide Advanced Services to Edge Routers, Networking Appliances, and Network Attached Storage (NAS) Devices

... The CNP810SP network services processor is based on the CNP810 core, an 8-thread, 10-issue SMT processor that can process up to 8 packets simultaneously. SMT, or Simultaneous Multithreading is a major advancement over traditional superscalar RISC architectures that are used by other network processor vendors. Instead of working on a single packet at one time, as typical RISC processors do, the CNP810 core processes multiple packets at the same time within a single core. ...

[quote]The distinguishing feature of total interconnect is that every single device in the interconnect has a dedicated. two way connection to every other device in the interconnect.<hr></blockquote>

This is what a switched fabric is. The Ocean on-die fabric allows "concurrent load and store any port to any port" and has "Full Duplex port connections 128Gb/s concurrent throughput".
