970GX and low power 970s for PowerBooks

rhumgod · November 12, 2004 10:30PM

From Think Secret:

Since our July report on Antares, the forthcoming dual-core PowerPC 970MP processor, sources have provided Think Secret with additional notes regarding IBM's PowerPC development and the direction it might take Apple.

The biggest news is that Antares will also be available in a single-core version, code-named AntaresSP, which is expected to be named the PowerPC 970GX. At present, Apple's dual-2.5GHz Power Mac G5 uses the PowerPC 970FX processor. Like Antares, the 970GX will initially come in at speeds around 3GHz and is said to feature 1MB of L2 cache, double what the 970FX processor sports. Like the 970FX, however, the processor will not have any L3 cache.

At present, sources suggest that the 970GX might be ready around the first or second quarter of 2005, while the expected availability of the 970MP remains unknown at this point.

Low-power versions of the PowerPC 970 intended for use in the PowerBook G5 remain in development at speeds between 1.6GHz and 1.8GHz, but little else is known. Apple's current PowerBook line-up hasn't been updated in seven months, however, suggesting a revised model could arrive as soon as Macworld Expo San Francisco in January.

If the PowerBook G5 isn't ready then, sources say Apple may turn to Freescale's PowerPC 7448, which is said to be almost finished. The processor is pin compatible with the PowerPC 7447A used in current PowerBooks, will exceed 1.5GHz, and feature 1MB of L2 cache, double the amount of the 7447A. The PowerPC 7448 will also be manufactured on 90nm silicon-on-insulator process technology, delivering improved power savings over the 130nm 7447A.

Linky

powerdoc · November 13, 2004 3:26AM

Interesting.

However, I doubt that this chip will be clocked around 3 GHz.

IBM has huge problems producing the 2.5 GHz G5; worse, this chip has to be water-cooled. Some sources said that without water cooling the 2.5 GHz G5 reaches temperatures above 90 °C.

marcuk · November 13, 2004 4:57AM

Maybe they added a few extra pipelines. Any news on 256bit Altivec 2?

powerdoc · November 13, 2004 5:55AM

Quote:

Originally posted by MarcUK

Maybe they added a few extra pipelines. Any news on 256bit Altivec 2?

The problem with 256-bit AltiVec is finding a way to feed that beast. I don't think many memory controllers are able to feed such a thing.
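To put a rough number on "feeding that beast", here is a minimal back-of-the-envelope sketch. It assumes one full-width vector load per clock cycle, which is an illustrative assumption and not a figure from this thread or from IBM; the 3 GHz clock is the speed rumoured in the first post. The function name is made up for this example.

```python
# Rough upper bound on the operand bandwidth a vector unit could consume.
# Assumption (not from the thread): one full-width vector load per clock cycle.

def peak_load_bandwidth_gb_s(register_bits: int, clock_ghz: float,
                             loads_per_cycle: int = 1) -> float:
    """Return peak load bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_load = register_bits // 8
    # bytes per cycle * (10**9 cycles per second) = GB per second
    return bytes_per_load * loads_per_cycle * clock_ghz


if __name__ == "__main__":
    for bits in (128, 256):
        rate = peak_load_bandwidth_gb_s(bits, 3.0)
        print(f"{bits}-bit vectors at 3 GHz: up to {rate:.0f} GB/s")
```

Under that assumption, doubling the register width doubles the bandwidth needed to keep the unit busy, which is the concern raised above. Real code reuses data held in registers and caches, so sustained demand would normally be lower than this bound.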

eugene · November 13, 2004 8:28AM

Quote:

Originally posted by Powerdoc

Interesting.

However, I doubt that this chip will be clocked around 3 GHz.

IBM has huge problems producing the 2.5 GHz G5; worse, this chip has to be water-cooled. Some sources said that without water cooling the 2.5 GHz G5 reaches temperatures above 90 °C.

Without any cooling, it would reach well above 90C.

The water-cooling unit is designed rather poorly as a strict CPU cooling device, but that's not its only function...

dfiler · November 13, 2004 10:28AM

Quote:

Originally posted by Eugene

The water-cooling unit is designed rather poorly as a strict CPU cooling device, but that's not its only function...

Please elaborate!

programmer · November 13, 2004 10:39AM

Quote:

Originally posted by MarcUK

Any news on 256bit Altivec 2?

I hope it doesn't exist.

As I just posted in another thread, introducing an AltiVec variation is mostly a bad idea. It fragments the software base (which program supports what?) and adds uncertainty for developers when aiming for their target platform. The transition to all machines having AltiVec took many years, and even still a lot of software that could and should take advantage of it doesn't because it is not present in all PowerPCs.

Expanding the registers to 256bit is not a particularly worthwhile exercise, either. First of all, dealing with the granularity of the SIMD registers is always an issue and making that size larger means it is less likely you can skirt the issue. Second, the context size increases dramatically. Third, the size of the hardware register file becomes enormous (especially if you want to grow the rename pool or mirror the file between units). Fourth, most of the advantage of larger registers can be gained by simply retaining the current size and adding execution units to allow double the instruction throughput... plus this works with existing code and is more flexible in what work it can be doing at the same time. 256bit registers with double precision support would allow 4-way double vectors, but the hardware cost to support that would be huge.
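The context-size and lane-count points can be made concrete with a small sketch. AltiVec defines 32 architected 128-bit vector registers; the 256-bit variant below is purely hypothetical, and the class and names are invented for illustration. Rename registers and status registers are deliberately left out, so this counts only the architected register state an OS would save on a context switch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorFile:
    num_registers: int  # architected registers visible to software
    register_bits: int  # width of each register

    def context_bytes(self) -> int:
        """Bytes of architected vector state to save and restore per thread."""
        return self.num_registers * self.register_bits // 8

    def lanes(self, element_bits: int) -> int:
        """How many elements of the given size fit in one register."""
        return self.register_bits // element_bits


ALTIVEC = VectorFile(num_registers=32, register_bits=128)
WIDE = VectorFile(num_registers=32, register_bits=256)  # hypothetical "AltiVec 2"

if __name__ == "__main__":
    for name, vf in (("128-bit", ALTIVEC), ("256-bit", WIDE)):
        print(f"{name}: {vf.context_bytes()} bytes of context, "
              f"{vf.lanes(32)} floats or {vf.lanes(64)} doubles per register")
```

The 4-way double vectors mentioned above only appear at 256 bits, and they come with twice the architected state per thread before any growth in the rename pool is counted.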

powerdoc · November 13, 2004 3:43PM

Quote:

Originally posted by Eugene

Without any cooling, it would reach well above 90C.

The water-cooling unit is designed rather poorly as a strict CPU cooling device, but that's not its only function...

Well, you are right: without cooling the chip will simply burn. What I wanted to say is that even cooled, this chip is pretty hot. With the current fabbing process the only way to have a 3 GHz PPC 970FX is to cryogenise it.

jzroback · November 13, 2004 5:16PM

So tempted to buy a new PowerBook. Should I wait, or buy the 1.5GHz G4?

aphelion · November 13, 2004 5:48PM

Quote:

Originally posted by Powerdoc

... the only way to have a 3 GHz PPC 970FX is to cryogenise it

Ummn... How about using this:

Quote:

Introduction...

Peltier devices, also known as thermoelectric (TE) modules, are small solid-state devices that function as heat pumps. A "typical" unit is a few millimeters thick by a few millimeters to a few centimeters square. It is a sandwich formed by two ceramic plates with an array of small Bismuth Telluride cubes ("couples") in between. When a DC current is applied heat is moved from one side of the device to the other - where it must be removed with a heatsink. The "cold" side is commonly used to cool an electronic device such as a microprocessor or a photodetector. If the current is reversed the device makes an excellent heater.

As with any device, TE modules work best when applied properly. They are not meant to serve as room air conditioners. They are best suited to smaller cooling applications, although they are used in applications as large as portable picnic-type coolers. They can be stacked to achieve lower temperatures, although reaching cryogenic temperatures would require great care. They are not very "efficient" and can draw amps of power. This disadvantage is more than offset by the advantages of no moving parts, no Freon refrigerant, no noise, no vibration, very small size, long life, capability of precision temperature control, etc.

powerdoc · November 14, 2004 2:03AM

Quote:

Originally posted by Aphelion

Ummn... How about using this:

I have a small Peltier fridge in my office: no noise at all. But be careful not to touch the back of this fridge: it's insanely hot.

marcuk · November 14, 2004 5:18AM

Quote:

Originally posted by Programmer

I hope it doesn't exist.

As I just posted in another thread, introducing an AltiVec variation is mostly a bad idea. It fragments the software base (which program supports what?) and adds uncertainty for developers when aiming for their target platform. The transition to all machines having AltiVec took many years, and even still a lot of software that could and should take advantage of it doesn't because it is not present in all PowerPCs.

Expanding the registers to 256bit is not a particularly worthwhile exercise, either. First of all, dealing with the granularity of the SIMD registers is always an issue and making that size larger means it is less likely you can skirt the issue. Second, the context size increases dramatically. Third, the size of the hardware register file becomes enormous (especially if you want to grow the rename pool or mirror the file between units). Fourth, most of the advantage of larger registers can be gained by simply retaining the current size and adding execution units to allow double the instruction throughput... plus this works with existing code and is more flexible in what work it can be doing at the same time. 256bit registers with double precision support would allow 4-way double vectors, but the hardware cost to support that would be huge.

I hope it does exist!

Altivec has already changed 3 times: the original 7400, the 7450 and the 970 have different architectures. IF a 256bit Altivec was backwards compatible, I see no problem. Intel's SSE has changed 3 times.

4 way 64 bit vectors would be insanely cool for 3d modelling and gaming, even if the bandwidth was the limiting factor.

We're talking progress, nothing stands still in computing. We'll have the room at 65nm. It's just up to the programmers to work out how to use it. Glad I'm not a programmer.

aphelion · November 14, 2004 5:18AM

The drawback to a peltier cooling system is that it works so well at pumping heat from the "cold side" to the "hot side" that it can cause heat accumulation problems in normal heat sinks. This is why most peltier equipped computer systems must have water cooling to remove the excess heat.

Of course the Powermac already has the water cooling needed to take care of this problem. Adding a peltier system to the top of the line Powermac would be a trivial engineering exercise for Apple.

If Apple wants a 3 GHz 970FX Powermac they can do it anytime they want.

marcuk · November 14, 2004 5:20AM

Quote:

Originally posted by Powerdoc

Well, you are right: without cooling the chip will simply burn. What I wanted to say is that even cooled, this chip is pretty hot. With the current fabbing process the only way to have a 3 GHz PPC 970FX is to cryogenise it.

You're forgetting that Intel's latest 3.8 chucks out far more heat than a 970, and is air cooled. Apple does it mostly for the quietness and elegance. 970s could be air-cooled if we accepted the noise of a 120mm fan.

programmer · November 14, 2004 11:21PM

Quote:

Originally posted by MarcUK

Altivec has already changed 3 times: the original 7400, the 7450 and the 970 have different architectures. IF a 256bit Altivec was backwards compatible, I see no problem. Intel's SSE has changed 3 times.

The implementation changed, but the interface did not. AltiVec is still essentially unchanged from its original definition. There are lots of problems with extending an ISA -- just look at the horrible mess Intel has with its 3 versions of SSE... that's a perfect example of why it's a bad idea.

Quote:

4 way 64 bit vectors would be insanely cool for 3d modelling and gaming, even if the bandwidth was the limiting factor.

We're talking progress, nothing stands still in computing. We'll have the room at 65nm. It's just up to the programmers to work out how to use it. Glad I'm not a programmer.

The 64-bit vectors aren't as "insanely cool" as you seem to think, especially when you consider the cost of what you're giving up by adding this extension.

Progress is important, but random and whimsical progress is just expensive and disruptive. Not everything needs to change, and if you avoid changing interface (i.e. ISA) then you avoid alienating your legacy and your developers... which is a good thing.

It's obvious you're not a developer. Software is always trailing behind hardware, and partially because of that non-standard hardware features have a strong tendency to get ignored. One of Apple's strengths has been its relatively stable hardware platform -- it is far more consistent than the huge breadth of machines in the x86 world. Their market is too small to risk fragmenting it unless the potential reward is massive. Double precision and/or double width registers is not nearly compelling enough to warrant the cost of development and deployment.

We should probably take any further AV2 discussion into the other thread where it is being discussed.

aphelion · November 15, 2004 8:19AM

It would seem that AMD has just applied for a patent to "cryogenise" (as Powerdoc coined the term) their future CPUs:

AMD Drives Integrated Cooling into Chips

Quote:

Advanced Micro Devices, one of the world's leading makers of central processing units, has patented a technology that would allow the chipmaker to use so-called Peltier cooler with its future microprocessors for better heat dissipation and more efficient cooling of future chips.

"Various embodiments of a semiconductor-on-insulator substrate incorporating a Peltier effect heat transfer device and methods of fabricating the same are provided. In one aspect, a circuit device is provided that includes an insulating substrate, a semiconductor structure positioned on the insulating substrate and a Peltier effect heat transfer device coupled to the insulating substrate to transfer heat between the semiconductor structure and the insulating substrate," says an abstract description of U.S. Patent number 6,800,933 submitted by AMD.

My take on this AMD patent is that it is for embedding the Peltier cooling element into the chip itself! Powered from the chip's onboard circuitry, it would just need an appropriate heat sink (read that as very efficient) to be implemented by system designers.

The patent further mentions "islands" of discrete circuits on the chip that run hotter than others; AMD's patented approach is to concentrate Peltier nodes over the problem areas.

To me this seems to be an effective counter to the "hot spots" that have plagued the 970FX, and prevented the "legs" that Steve Jobs alluded to when he announced the G5.

IBM and AMD have partnered on development of advanced CPUs and I'll bet that they (IBM) can use this technology if they wish.

More on Peltier Coolers

aphelion · November 15, 2004 9:53PM

More on the cryonization of future chips from OSnews :

Quote:

... Another possibility would be to use a technique Intel plan to use for the next Itanium "Montecito," which includes two peltiers in the heat sink. Peltiers actually consume quite a bit of power themselves but reducing the CPU temperature reduces transistor leakage, this lowers the power consumed by the CPU itself allowing boosts in clock frequency which might not otherwise be possible.

Montecito is expected to consume 100 Watts but its heat sink requires a further 75 watts. The end effect is overall power consumption does not change (it may even go up) as part is moved to the heat sink but the CPU itself does not get so hot when working. AMD have filed a patent on an on-chip peltier so they're evidently considering similar technology.

I don't know if the 9x0 will be so hot as to require such aggressive cooling but things are heading that way. "Power density" is becoming a problem and will seemingly only get worse in the future. Power density is the heat generated in a specific area; as CPUs get ever smaller the heat is generated in a smaller area and thus the unit becomes progressively more difficult to cool. The 970FX used in Apple's PowerMacs actually uses less power than the previous 970 but liquid cooling was added because of the higher power density.

It was news to me that the "Montecito" was designed to use two Peltier coolers in the heat sink. I think the integrated Peltier just patented by AMD is a better solution.

IP sharing agreements between AMD and IBM could bring this technology to a Mac near you in the future.
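A quick sketch of the arithmetic behind the quoted Montecito figures and the "power density" remark. The 100 W and 75 W values are the ones quoted above; the die areas and wattages in the density example are made-up numbers chosen only to show the effect, not measurements of any real 970 or 970FX. The function names are invented for this example.

```python
def hot_side_heat_w(cpu_heat_w: float, peltier_input_w: float) -> float:
    """Heat the heat sink must reject: the heat pumped away from the CPU
    plus the electrical power the Peltier element itself consumes."""
    return cpu_heat_w + peltier_input_w


def power_density_w_per_mm2(power_w: float, die_area_mm2: float) -> float:
    """Heat generated per unit of die area."""
    if die_area_mm2 <= 0:
        raise ValueError("die area must be positive")
    return power_w / die_area_mm2


if __name__ == "__main__":
    # Figures quoted in the OSnews excerpt above.
    print("Heat sink load:", hot_side_heat_w(100, 75), "W")

    # Hypothetical dies: the smaller one draws less power but runs denser.
    old_die = power_density_w_per_mm2(power_w=50, die_area_mm2=100)
    new_die = power_density_w_per_mm2(power_w=45, die_area_mm2=60)
    print(f"older die {old_die:.2f} W/mm^2, shrunk die {new_die:.2f} W/mm^2")
```

The second half shows how a chip can use less total power after a die shrink and still be harder to cool, because the heat is concentrated in a smaller area.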

wizard69 · November 16, 2004 2:45AM

Quote:

Originally posted by Programmer

The implementation changed, but the interface did not. AltiVec is still essentially unchanged from its original definition. There are lots of problems with extending an ISA -- just look at the horrible mess Intel has with its 3 versions of SSE... that's a perfect example of why it's a bad idea.

SSE is not a perfect example of anything. On the other hand, the many generations of x86 processors from both Intel and AMD show that it is perfectly possible to extend an ISA and move that ISA forward. This idea that adding new instructions to AltiVec, or adding additional register width, will suddenly cause huge problems for developers is unwarranted. I mean, it is like saying that the move to PPC caused problems for developers. Yeah, maybe a little, but the payoffs have been worthwhile.

Quote:

The 64-bit vectors aren't as "insanely cool" as you seem to think, especially when you consider the cost of what you're giving up by adding this extension.

Again, one has to wonder what is being given up here?

Quote:

Progress is important, but random and whimsical progress is just expensive and disruptive. Not everything needs to change, and if you avoid changing interface (i.e. ISA) then you avoid alienating your legacy and your developers... which is a good thing.

Again we have a big pile of disinformation. Extended properly, nothing would change from the developer's standpoint except for the additional capability. Frankly everything needs to change. Look at it this way: has TI or any of the other DSP suppliers given up on improving DSP chips?

Quote:

It's obvious you're not a developer. Software is always trailing behind hardware, and partially because of that non-standard hardware features have a strong tendency to get ignored. One of Apple's strengths has been its relatively stable hardware platform -- it is far more consistent than the huge breadth of machines in the x86 world. Their market is too small to risk fragmenting it unless the potential reward is massive. Double precision and/or double width registers is not nearly compelling enough to warrant the cost of development and deployment.

One aspect of the supposed frequency scaling problems is that manufacturers have to look at enhancements to their processors to derive additional performance. They cannot ignore ideas with potential big payoffs just to keep a certain element in the customer base happy. Luddites are all around us; it is a shame that they have infiltrated the programming world.

Quote:

We should probably take any further AV2 discussion into the other thread where it is being discussed.

programmer · November 16, 2004 9:53AM

Quote:

Originally posted by wizard69

SSE is not a perfect example of anything. On the other hand, the many generations of x86 processors from both Intel and AMD show that it is perfectly possible to extend an ISA and move that ISA forward. This idea that adding new instructions to AltiVec, or adding additional register width, will suddenly cause huge problems for developers is unwarranted. I mean, it is like saying that the move to PPC caused problems for developers. Yeah, maybe a little, but the payoffs have been worthwhile.

PPC transition was a huge headache, but the payoff has been enormous. The 680x0 had no future, had fallen way behind x86, and it brought IBM into the mix. If these advantages hadn't existed the platform would have died if Apple tried to force the transition onto developers.

You're looking at the extensions to x86 and seeing that, yes, they successfully added instructions. Whoopy-do. I'm not saying that instructions can't be added, I'm saying that they have a negative effect on the software development for the platform. If a developer wants to use the new instructions (and that's why you put them there) then they either have to abandon the installed base (typically not a wise move) or they have to build, test & support two or more versions of the software (a pain even in the easiest case, real agony in the case of carefully crafted streaming vector code). Or you don't use the extensions, which is the normal developer response if the win isn't big enough. If nobody uses the extensions then the hardware developer just wasted a potentially huge amount of chip real-estate on something nobody is using and that could have been used for something that benefits everyone.
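The "build, test & support two or more versions" cost can be sketched as follows. This is an illustration only: the feature names and kernels are hypothetical stand-ins written in plain Python, not real AltiVec or SSE code, and a real program would query the OS or CPU for its capabilities instead of being handed a set of strings.

```python
from typing import Callable, FrozenSet, List, Sequence

Kernel = Callable[[Sequence[float], Sequence[float]], List[float]]


def add_scalar(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Baseline path that runs on every machine in the installed base."""
    return [x + y for x, y in zip(a, b)]


def _add_chunked(a: Sequence[float], b: Sequence[float], width: int) -> List[float]:
    """Process `width` elements per step, mimicking a fixed-width vector loop."""
    out: List[float] = []
    for i in range(0, len(a), width):
        out.extend(x + y for x, y in zip(a[i:i + width], b[i:i + width]))
    return out


def add_vec128(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Stand-in for a hand-tuned 128-bit path (4 floats per step)."""
    return _add_chunked(a, b, 4)


def add_vec256(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Stand-in for a path using a hypothetical 256-bit extension (8 floats per step)."""
    return _add_chunked(a, b, 8)


# Ordered from most to least preferred. Every entry is one more code path
# that has to be written, tested and supported.
KERNELS = [("vec256", add_vec256), ("vec128", add_vec128), ("scalar", add_scalar)]


def select_kernel(features: FrozenSet[str]) -> Kernel:
    """Pick the best kernel the machine reports support for."""
    for name, kernel in KERNELS:
        if name == "scalar" or name in features:
            return kernel
    raise RuntimeError("unreachable: the scalar path is always available")


if __name__ == "__main__":
    a, b = [1.0, 2.0, 3.0, 4.0, 5.0], [10.0, 20.0, 30.0, 40.0, 50.0]
    for machine in (frozenset(), frozenset({"vec128"}), frozenset({"vec128", "vec256"})):
        print(sorted(machine), select_kernel(machine)(a, b))
```

Each new ISA revision adds a row to `KERNELS`, and every row has to give identical results on every input, including awkward lengths that do not divide evenly into the vector width. Skipping the new row is the "don't use the extensions" option described above.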

Quote:

Frankly everything needs to change. Look at it this way: has TI or any of the other DSP suppliers given up on improving DSP chips?

What is this obsession with changing everything? Should next year's cars all come with 5 pedals and square steering wheels? Change should happen for a reason and benefit, not because you have some compulsion.

Quote:

One aspect of the supposed frequency scaling problems is that manufacturers have to look at enhancements to their processors to derive additional performance. They cannot ignore ideas with potential big payoffs just to keep a certain element in the customer base happy.

Absolutely, and I'm not saying they should ignore ideas. Any design change and hardware feature has a real and significant cost, and this must be weighed against the potential benefits. What I am telling you is that the win from the AltiVec2 enhancements commonly suggested (256bit & double precision) doesn't come anywhere near justifying their hardware expense and the impact on the platform. Much more effective changes can be made without changing the instruction set -- SMT, IMC, bigger caches, more execution units, more cores, etc etc etc. Adding expensive hardware features forces software developers to change and it forces support of these features on all future processors of that lineage.

wizard69 · November 16, 2004 12:10PM

Quote:

Originally posted by Programmer

PPC transition was a huge headache, but the payoff has been enormous. The 680x0 had no future, had fallen way behind x86, and it brought IBM into the mix. If these advantages hadn't existed the platform would have died if Apple tried to force the transition onto developers.

You leave one with the impression that AltiVec should not have been developed in the first place. After all, it is an addition to the original PPC ISA.

Quote:

You're looking at the extensions to x86 and seeing that, yes, they successfully added instructions. Whoopy-do. I'm not saying that instructions can't be added, I'm saying that they have a negative effect on the software development for the platform. If a developer wants to use the new instructions (and that's why you put them there) then they either have to abandon the installed base (typically not a wise move) or they have to build, test & support two or more versions of the software (a pain even in the easiest case, real agony in the case of carefully crafted streaming vector code). Or you don't use the extensions, which is the normal developer response if the win isn't big enough. If nobody uses the extensions then the hardware developer just wasted a potentially huge amount of chip real-estate on something nobody is using and that could have been used for something that benefits everyone.

The message being delivered in the above paragraph is just a little too much. I can make a very good argument that developers who do not incorporate new technologies into their code bases soon see those code bases go the way of the dino. Your logic implies that a developer should never have developed that streaming vector code in the first place and should have just stuck with the main ALU. It is unfortunate, but the reality is that a developer has two choices. One is to concentrate on the installed base and eventually have no one to sell to; the other is to keep his software competitive and desirable in the marketplace.

AltiVec has clearly progressed to the point where it benefits everyone. Sure it takes time, just as it took time for developers to make good use of the high performance FPUs that PPC gave us.

The point is that AltiVec can be extended without impacting the installed base. It would not be a losing proposition any more so than the improvements that have been implemented in the FPU.

Quote:

What is this obsession with changing everything? Should next year's cars all come with 5 pedals and square steering wheels? Change should happen for a reason and benefit, not because you have some compulsion.

Good reasons abound, not the least of which is the slowing ability to scale processor performance via the traditional ratcheting of frequency. If you can't get the processor to run significantly faster, then it either has to do more per cycle or do more complex operations per cycle. There is a lot of potential in the AltiVec unit, so why not try to improve it? Would you take the same attitude with the main ALU and just have stuck with the same number of execution units that were in the 601? I would hope not; the changes made to the G4 and G5 have moved performance forward, and the same can be done for the vector unit.

Quote:

Absolutely, and I'm not saying they should ignore ideas. Any design change and hardware feature has a real and significant cost, and this must be weighed against the potential benefits. What I am telling you is that the win from the AltiVec2 enhancements commonly suggested (256bit & double precision) doesn't come anywhere near justifying their hardware expense and the impact on the platform.

Well, this can be debated to no end and, like most things, depends on the software. If we ignore the wider registers and focus on new instructions, I don't see where your concerns about hardware expense are justified.

Quote:

Much more effective changes can be made without changing the instruction set -- SMT, IMC, bigger caches, more execution units, more cores, etc etc etc. Adding expensive hardware features forces software developers to change and it forces support of these features on all future processors of that lineage.

Let's see: three of those items (SMT, more cores, etc.) require that a developer make some pretty significant changes to his code to get any advantage out of them. More execution units are what we are talking about here anyway, so I'm not sure how you can use that on both sides of the argument. Further, some of the items you suggest here are far more expensive, die real-estate wise, than enhancements to AltiVec. Just SMT itself required IBM to implement a bit of logic on POWER5; in effect you double the number of hardware registers, so with AltiVec it would be as expensive as 256 bit wide registers.

There is nothing wrong with getting developers to change; without change we would all be staring at a Windows 3.1 screen. As to future processors, many of the "features" you suggest would also end up in all derived processors. So is that really a problem?

dave

programmer · November 16, 2004 9:12PM

Quote:

Originally posted by wizard69

You leave one with the impression that AltiVec should not have been developed in the first place. After all, it is an addition to the original PPC ISA.

You would only get that impression if you didn't bother reading what I wrote. AltiVec was an excellent addition because they did it right and they did it once. The cost/benefit of adding AV was very good because Apple invested in it heavily (as have other key developers), and it was well designed and implemented.

Quote:

Your logic implies that a developer should never have developed that streaming vector code in the first place and should have just stuck with the main ALU. It is unfortunate, but the reality is that a developer has two choices. One is to concentrate on the installed base and eventually have no one to sell to; the other is to keep his software competitive and desirable in the marketplace.

Where do you get that? I thought I was quite clear in saying that developers need to weigh the cost against the benefits. AltiVec has been a substantial performance win from the day it was introduced, and Apple's continuing commitment to it has raised confidence in its longevity (unlike numerous other technologies). As a result more and more developers start using it, and it gains momentum. If there were multiple versions to support (yes, even counting extensions) that process would have to start all over again for each revision.

Quote:

There is a lot of potential in the AltiVec unit, so why not try to improve it? Would you take the same attitude with the main ALU and just have stuck with the same number of execution units that were in the 601? I would hope not; the changes made to the G4 and G5 have moved performance forward, and the same can be done for the vector unit.

Because changing the instruction set isn't necessary. The 601 used the original PPC instruction set and it essentially hasn't changed since then, yet we have the massively faster 970. AltiVec has room to improve without changing the ISA.

Quote:

Let's see: three of those items (SMT, more cores, etc.) require that a developer make some pretty significant changes to his code to get any advantage out of them. More execution units are what we are talking about here anyway, so I'm not sure how you can use that on both sides of the argument. Further, some of the items you suggest here are far more expensive, die real-estate wise, than enhancements to AltiVec. Just SMT itself required IBM to implement a bit of logic on POWER5; in effect you double the number of hardware registers, so with AltiVec it would be as expensive as 256 bit wide registers.

Again you are just ignoring what I'm writing. SMT/more cores doesn't require anything different than supporting Apple's existing dual processor machines, and even if you don't do that the user still benefits from multiple threads due to OSX. More execution units doesn't change the ISA. And I'm not talking about saving IBM work... you'll note that POWER5 didn't change the ISA either. Adding SMT to the POWER5 sped up existing software. All existing software.
