Comments
Originally posted by Amorph
asymmetrical SMT;
You've said that twice now -- what does it mean and what info is it based on?
weak (or no) branch prediction;
I see this said a lot too, but I haven't seen any actual evidence of it. STI said nothing about this core's branch predictor, they only talked (a little) about the SPE's lack of dynamic branch prediction. Given the likely expense of a branch on this Power core, you'd think they'd invest a little in an effective predictor.
Meanwhile, Freescale is about to remove the G4's bus bottleneck. Without that, and with 64 bit support (also on the menu) and a dual-core variant (coming) it will be a difficult competitor at low wattage (12W and below). Freescale's major advantage here is that they've had years to hand-tune and hand-refine this core.
I think Freescale's name was chosen to reflect the time axis on their roadmaps. "about to remove" is a rather loose term when applied to Freescale (nee Motorola Semiconductor). I'll believe it when I see it.
The benefits from process improvements are shrinking rapidly. At this point, nobody expects significant power savings (or enhanced clock speed) from the transition to 65nm. You get more chips per wafer, which means cheaper chips (yields being equal—and yields on 90nm were initially dismal, so there's no reason to expect they'll start out good on 65nm either). Unless some engineer somewhere gets really lucky and breaks through a wall, I wouldn't expect much here.
Only somebody silly or not risking much would be counting on shipping 65 nm in volume any time soon. Especially if there are no heat/power benefits (and you're right, there probably aren't). Cost is only going to be a benefit after you get yields under control. There is an obvious conclusion to be drawn from this.
Originally posted by snoopy
If I correctly interpret what Programmer said a while back, the Cell's PPE core is roughly half as fast as the 970FX, at the same clock rate.
I'd be very surprised if it got even close to a quarter the performance at the same clock speed given half the execution resources are stripped out and it has massive amounts of logic removed.
Originally posted by Telomar
I'd be very surprised if it got even close to a quarter the performance at the same clock speed given half the execution resources are stripped out and it has massive amounts of logic removed.
IBM is dropping those extra logic resources because they aren't worth it on properly optimized code and measured from a performance/power point of view. These cores are, currently anyhow, only appearing in game consoles and media-oriented embedded environments so the code tends to be well suited to this kind of processor plus more optimization can be applied to it since there aren't variations in the target platform. You may be right about the performance on unoptimized code, however, and only time will tell if this core gets used in such a situation.
Originally posted by Programmer
. . . These cores are, currently anyhow, only appearing in game consoles and media-oriented embedded environments so the code tends to be well suited to this kind of processor plus more optimization can be applied to it since there aren't variations in the target platform. . .
Could you help educate those of us with minimal CS training? I must have read what you say here several times, but just now realized I don't know how code gets optimized. Is this something that mostly takes time and attention of programmers, to write clean and efficient code? Or can having better compilers take care of this job for clueless or lazy programmers?
What occurred to me is that this simple core may be very practical once superior compilers are developed, which make programming more foolproof. I know this approach has repugnant overtones, where sloppy code may run as well as well written code. The compiler could be designed to insult writers of terrible code, then make them go back and change it.
Originally posted by Programmer
You've said that twice now -- what does it mean and what info is it based on?
I see this said a lot too, but I haven't seen any actual evidence of it. STI said nothing about this core's branch predictor, they only talked (a little) about the SPE's lack of dynamic branch prediction. Given the likely expense of a branch on this Power core, you'd think they'd invest a little in an effective predictor.
OK, you've caught me Arsing. Sue me.
Was Hannibal wrong? Or can you not talk about that?
I think Freescale's name was chosen to reflect the time axis on their roadmaps. "about to remove" is a rather loose term when applied to Freescale (nee Motorola Semiconductor). I'll believe it when I see it.
Since becoming Freescale, they've been somewhat more reliable. I'm not expecting miracles, but I am expecting them to not suck so much as the albatross of Motorola management is lifted from their necks.
In retrospect, their conservative schedule for adopting 90nm, and their decision to wait until they had their full process tech migrated over, both look pretty wise. In fact, at this point I consider them hardly less trustworthy than IBM, whose hype about the 970 and 90nm manufacture blew up in their faces, and who stumbled through a distinctly Motorolan year of pathetic yields on their much-vaunted new fab.
Only somebody silly or not risking much would be counting on shipping 65 nm in volume any time soon.
Which is why I said WWDC '06 or later. After the 90nm debacle, it's understandable that nobody's in a hurry to make the jump. On the other hand, as far as I can see, everyone is going to make it eventually. Some may have more conservative timelines than others, but they're all going to move.
The jump down to 65nm will force IBM to do whatever tweaking is necessary to make their chips' heat signatures as even as possible. That will certainly help in any laptop-like scenario.
Originally posted by hypoluxa
OK, from what I can understand it is this: the Xbox 360 has a PPC chip running at 3+ GHz, and it has 3 cores? I'm guessing this chip was specifically designed by IBM just for gaming, otherwise Apple would probably have gotten some. Right? Why hasn't IBM gotten around to putting this puppy into the Macs yet??
When it comes out and we learn more about it, it will become clear whether it is worth putting in Macs or not.
Originally posted by snoopy
Could you help educate those of us with minimal CS training? I must have read what you say here several times, but just now realized I don't know how code gets optimized. Is this something that mostly takes time and attention of programmers, to write clean and efficient code? Or can having better compilers take care of this job for clueless or lazy programmers?
Better compilers help a little, but for this kind of multimedia optimization it mostly takes time and attention of programmers. The more exotic the architecture, the more time and attention it takes and the less compilers will help. And triple core PPCs w/ VMX128 or the Cell are both plenty exotic.
What occurred to me is that this simple core may be very practical once superior compilers are developed, which make programming more foolproof. I know this approach has repugnant overtones, where sloppy code may run as well as well written code. The compiler could be designed to insult writers of terrible code, then make them go back and change it.
Don't count on it. OoOE was created to deal with crappy code, so eschewing it isn't going to make the crap smell like a rose.
Originally posted by Amorph
OK, you've caught me Arsing. Sue me.
Yeah, I kinda thought you were slumming again.
Was Hannibal wrong? Or can you not talk about that?
I like the article over at RealWorldTech better.
Originally posted by Programmer
. . . Don't count on it. OoOE was created to deal with crappy code, so eschewing it isn't going to make the crap smell like a rose.
Thanks. It was an uneducated guess on my part. I'm glad to see there is still a need for talent and brains.
Whether Apple intends to use this in the portable for one more 32-bit revision is an interesting question. My guess at the moment is that we are likely to see the 7448 in an Apple portable soon. That might not be the top-of-the-line portable, but it is truly hard to say. Personally I think Apple wants nothing more than to deliver an SMP 64-bit portable ASAP. That happening, though, is not something they control 100%.
Dave
Originally posted by PB
I agree that's almost a given. The 7448 is due this autumn, and this too would be a substantial improvement over the 7447B we have today.
Originally posted by wizard69
If you really wanted a 7448-based machine you could get one today. Embedded manufacturers have been advertising the chip for over a month now as available on their boards.
Oh, could you provide some link? This means that the 7448 is well ahead of schedule (2H-05, see page 41). And perhaps it would in turn mean that a dual-core 8641D-based PowerBook is not as far off as I thought. If production of the 8641 goes as it did for the 7448, then a dual-core Apple notebook could be ready for this November or so. Apple Expo Paris?
Personally I think Apple wants to do nothing more than to deliver a SMP 64 bit portable ASAP.
Why?