I want you to know that I had a great response to this post yesterday, but the server apparently started having problems. So here is an attempt at a condensed version.
Originally posted by Programmer
This doesn't jibe with my understanding. If a Cell chip is a main PowerPC core with 8 vector cores this means it will be executing 9 different instruction streams: 1 stream of PowerPC instructions and 8 streams of whatever instructions the vector cores use (probably not PowerPC). Each vector core will be able to do operations similar in nature to what AltiVec currently does on the G4 or G5.
What is clear is that I really don't have much information here. The issue is that I'd be surprised if those 8 vector cores are as wide as the AltiVec unit in the PPC. What I see is 8 cores of very modest width (maybe one word) that are used together in various combinations.
The problem as I see it is that having that many cores of the AltiVec type will take up a huge amount of room, not just for the cores but for the supporting caches and communications logic. I just don't see current processes supporting that many AltiVec-type units well on one chip.
The other problem is the efficiency of that sort of implementation. Making full use of that many wide vector units would be a problem. Thus the thought that the vector units would be narrow devices, very much like many of the DSPs available today.
In other words the Cell will be (theoretically) capable of 9 times the computation if running at the same clock rate (and the Cell's clock rate will likely be substantially higher). To support this level of computation they have developed a memory scheme that is different from the traditional PowerPC model. It also goes without saying that existing code won't just work on the Cell's vector cores... but this is just as well since these vector cores aren't likely to be out-of-order superscalar processors like the 970 is, so careful coding and algorithmic redesign will be required to run at all, never mind get peak performance.
I read this and think that maybe you know more about Cell than you are willing to let on.
Even so, the implied simple execution units lead me back to thinking narrow vector units.
Given its target market in the embedded and console space, I don't think this is a requirement. Also, with a substantially different memory addressing scheme it might not be necessary to go to 64-bit pointers to achieve larger memory sizes. The scalar integer units may well be 64-bit, however -- at least on the main core. The cost in terms of processor complexity is not that high.
64 bits will be huge in the future. In the case of Cell I think it would simplify things more than anything. As you say, the cost isn't that high.
On the other hand, the embedded and console market is heading towards 64-bit machinery. It is simply a matter of getting costs under control, with memory being the big issue. So we are talking a year or two before large memory systems are cost effective. I can't see this team designing a chip that is only going to be competitive for a year or two.
On the other hand they still provide a considerable amount of system services which use hardware acceleration internally. The OpenGL shaders, CoreAudio, CoreImage, QuickTime, vector library, network stack, etc could all be re-optimized over time to take advantage of specialized hardware. Nonetheless I tend to think that Apple would rather stick with the traditional PowerPC w/ AltiVec programming model and start adding cores. I don't actually expect to see Cell in Apple's future.
I see Cell in Apple's future, but maybe not in the way you see it. I see Cell as an opportunity on IBM's part to optimize the execution units within the PPC line. What would be really neat is if Cell led to hand-crafted execution units to replace the dense sea of logic that is the 970 we all know and love. The idea here is to replace some of the hot stuff within the 970s.
So hopefully Cell becomes a proving ground or development framework for things that can be extended to the 970 series.
Cell cannot "emulate" AltiVec. They are two different beasts.
I'm not sure you meant to say that. Obviously Cell can emulate anything it wants to emulate; that is simply a matter of writing the right code.
What I was getting at, and this is only a consideration if Apple is interested in Cell, is whether there has been an attempt to design the Cell vector units so they could efficiently emulate, or work in place of, the AltiVec units.
Well I'm glad somebody got the joke.
AMD isn't at 90 nm yet and they recently pushed back their scheduled move to that process. I think Apple's use of water cooling on the 2.5 GHz machines was a direct result of a deep desire to keep the machines quiet and deal with the very significant heat density issues that result from being able to suddenly expend such a huge amount of power from such a small area. These G5s can go from very low power consumption to very high power consumption very quickly and water has the specific heat capacity to absorb that initial spike without having to continuously keep fans blowing at full speed. If the darn unit didn't look so... so... automotive then it might actually be a compelling piece of technology.
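To put a rough number on the specific heat point, here is a back-of-envelope sketch in Python. It only applies ΔT = P·t / (m·c) to a lump of material that soaks up a power spike with no heat being carried away. The specific heat values are standard approximate textbook figures; the 50 W spike, 10 second duration, and 100 g mass are made-up illustrative numbers, not anything measured on or specified for a G5.

```python
# Back-of-envelope: how much does a lump of material warm up when it
# absorbs a sudden power spike, ignoring any heat removal?
# The power, duration, and mass used in the demo are hypothetical values,
# chosen only for illustration. They are NOT G5 specifications.

# Approximate specific heat capacities in J/(g*K).
SPECIFIC_HEAT_J_PER_G_K = {
    "water": 4.18,
    "aluminum": 0.90,
    "copper": 0.385,
}


def temp_rise_kelvin(power_w, seconds, mass_g, material):
    """Temperature rise = energy absorbed / (mass * specific heat)."""
    c = SPECIFIC_HEAT_J_PER_G_K[material]
    return power_w * seconds / (mass_g * c)


if __name__ == "__main__":
    # Hypothetical spike: 50 W for 10 s into 100 g of each material.
    for material in SPECIFIC_HEAT_J_PER_G_K:
        rise = temp_rise_kelvin(50.0, 10.0, 100.0, material)
        print(f"{material}: {rise:.2f} K")
```

Gram for gram, water warms several times more slowly than the metals a heatsink is made of, which is consistent with the idea that a water loop can ride out a short spike before the fans need to ramp up. A real loop also moves heat to a radiator, which this sketch ignores.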
That automotive look is technology nonetheless. Old technology, yes, but well understood and reliable.
The G5s, though, are just the opposite. Very new technology that frankly has not met anybody's expectations. That is not to say that the 970s don't work, just that they haven't gotten to where people (JOBS) had expected. What is worse, they didn't get there, and where they are now is problematic.
It is all well and nice that IBM has a 90nm process, but we shouldn't forget that the process needs a lot of work. Instead of claiming to have hit a wall, IBM really should be saying they are working on breaching the wall. Otherwise one is left with the impression that PPC doesn't mean much to them. IBM needs a leading edge 90nm process, not a trailing edge process.