Epic game developer calls iPad 2 graphics leap "astonishing," doubts Android can compete

Comments

  • Reply 41 of 68
    cgc0202 Posts: 624, member
    Quote:
    Originally Posted by Menno View Post


    The first test had the xoom within an hour of the ipad (less than 10%). The second was a lot lower (4 hours). I'm not sure why they didn't have a video test in the xoom review. Other links I gave still gave similar results between the two platforms (they lasted similar times) when comparing video playback, so it would be interesting to see what the difference was. Were they using a third party player on the xoom? And I've read of people getting bframes to to work fine on the xoom (some using third party players there)



    While others have the xoom pegged at 10 hour battery life: http://news.consumerreports.org/elec...tery-life.html (the ipad2 is at 12, still higher, but not as big a gap)





    And you're right, display tech is different. But so is screen resolution, web load times, Contrast ratio, backgrounding capabilities, etc. You can't have totally "equal" comparisons because they're operating different hardware AND software.



    I do not know whether you actually used the Xoom and the iPad 2 long enough to compare them and form reliable (if anecdotal) impressions. Your arguments are all over the place: when it suits you, you use information comparing the Xoom to the original iPad, but conveniently ignore the evidence comparing the Xoom to the iPad 2. See the links you first provided, before they were challenged with relevant comparisons between the Xoom and the iPad 2.



    That is also true with reviews from various sources: you focused on the one with the least difference between the two devices. Did you even take the time to read more carefully about the nuances of each comparison?



    To rely solely on reviews and what you read on the internet as the basis for your conclusions, and then to cast doubt on the statements of others, i.e., the game developer in question, because you believe they "have a stake on it" (which may or may not be true), is rather disingenuous, if not outright ridiculous.



    It may help if you buy a Xoom, and then show us your results.



    But then again, how can we be sure that you will not do a DaHarder act? He claimed to own both the Xoom and the iPad*** but preferred the Xoom, saying the iPad was gathering dust (or something to that effect). His credibility can be questioned when he claims to own so many Apple products, yet the overall impression you get from his posts is that he does not like Apple products much and is always touting a better product by some other company.



    I do not know about you or anyone else, but if I tried a company once or twice over the years and they continued to give me a bad experience, why the heck would I keep on buying their products?



    CGC



    ***The flaw there, again, was a comparison between the Xoom and the original iPad. One's credibility may be called into question if one claims to have actually been comparing it with the iPad 2, because if memory serves me, the post was made before the iPad 2 went on sale. And even if the iPad 2 were already on sale, why would someone line up to buy one if they had a not-so-good experience with the original iPad, let alone an overall mediocre experience with Apple products?
  • Reply 42 of 68
    bigmac2 Posts: 639, member
    Quote:
    Originally Posted by Menno View Post


    I refuse to pay for Consumer Reports' paywall, so I don't know what methodology they used. I do know that for their tests Cnet used a 720p version of a movie with a third-party movie player on the Xoom, and an iPad-optimized version of the movie on the iPad; pretty sure that will have significant weight when it comes to battery life. It's been a while since I read Mossberg's review, but I'm sure I saw other reviewers commenting on his review saying his battery results were not what they were seeing.



    Battery-life tests based on video playback do not give a good picture of CPU power consumption. As you point out, optimized codecs limit the CPU's role in decoding, and playback is a mostly lightweight task for the GPU, since it only needs to process a steady stream with relatively simple functions.



    This is not about Xoom vs. iPad or Android vs. iOS; it's all about the CPU. I still can't find anything on the net giving real proof of Tegra 2 "awesomeness". None of the tests I've seen so far gives a direct measure of Tegra 2's performance and efficiency, which is very strange for an off-the-shelf product. Every chip maker besides Nvidia publishes its TDP. If the TDP of Tegra 2 is in the 4-5 watt range, it is more in the Intel Atom league than in the below-1-watt league of Samsung, Qualcomm, and Apple.
  • Reply 43 of 68
    d-range Posts: 396, member
    Quote:
    Originally Posted by tipoo View Post


    An ARM Cortex A8 at 1GHz equivalent to a single 360 processor core? I'm giving that a [citation needed]





    http://en.wikipedia.org/wiki/Instruc...ons_per_second



    The whole tri-core processor outputs 19,200 MIPS. An A8 like the iPad 1's outputs 2,000 MIPS at peak. And it's 6 instructions per cycle vs. 2. Even if you divide the former score by three, you're nowhere close.



    And yes, I do know MIPS aren't a perfect indicator of performance, but they should give you a general sense of where things are.



    I know many posters before me have tried to dismiss your statement as uninformed, saying they'd rather trust Epic and that these numbers don't tell the whole story, but IMO you are absolutely right. There is NO WAY a Cortex A8 at 1 GHz is equivalent to a single (3 GHz) core as in the 360. No way AT ALL, not even nearly so. I don't know why someone at Epic would say something like this, but a single core of the 360 CPU is in a whole different league of performance. It has 3x the clock speed, a much faster memory bus, a much more sophisticated, application-controllable cache architecture, much higher floating-point performance, much higher IPC; it's faster in every way imaginable. Even a single 360 core running at 1 GHz would beat the pants off a Cortex A8 in terms of CPU speed. The CPUs in current consoles, even though they are already a few years old, are actually pretty damn fast. The gap with PCs is still mostly in GPU performance.
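The quoted MIPS arithmetic can be sketched in a few lines. The figures are the ones cited in this thread, and (as both posters note) MIPS is only a rough indicator, so treat the ratio as a ballpark:

```python
# Rough per-core MIPS comparison using the figures quoted above.
# These are peak numbers, not measured performance.

XENON_TOTAL_MIPS = 19_200   # Xbox 360 tri-core, all three cores
XENON_CORES = 3
A8_PEAK_MIPS = 2_000        # 1 GHz Cortex-A8 (iPad 1 class)

xenon_per_core = XENON_TOTAL_MIPS / XENON_CORES   # one 360 core
ratio = xenon_per_core / A8_PEAK_MIPS             # per-core gap

print(f"Xenon per-core: {xenon_per_core:.0f} MIPS")
print(f"Per-core ratio vs. Cortex-A8: {ratio:.1f}x")
```

Even divided by three, the quoted peak numbers leave a sizable per-core gap.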



    Maybe he was making some kind of imaginary, nonsensical comparison of the theoretical performance of an A8 core including the GPU part, just adding FLOPS together to get a number that compares with the performance of 1/3 of an Xbox 360, but that makes no practical sense at all, since the 360 also has a GPU, and it also has faster RAM and most likely a much better SDK and compiler that generate more efficient code than Apple's gcc on iOS (gcc-generated ARM code has never been great, and until Apple makes LLVM the default compiler backend, it can't hold a candle to the code generated by Microsoft's compilers in terms of efficiency).



    I'd say a dual-core Cortex A9 starts getting close to a single core of a 360 CPU, but I'd estimate it would still be slower. We're talking about an ARM SoC that's almost twice as fast as a typical Cortex A8.



    I'm not a professional game developer myself so I'm not pretending to be an expert, but I do have a background in processor design, and I did do my fair share of research on the architecture of the 360 and the PS3, and I'm _one hundred_ percent sure that a Cortex A8 at 1 Ghz is NOWHERE near the performance of 1/3rd of the CPU in a 360.
  • Reply 44 of 68
    melgross Posts: 33,510, member
    Quote:
    Originally Posted by Menno View Post


    LG has released Optimus devices with Tegra 2. The Atrix is a Tegra 2 device (the battery issues there relate to Blur, not the processor). The Bionic should be out within a month, as will other devices.



    There is a LOT of stuff out there demonstrating the capabilities of Tegra devices, both in battery life and in graphics processing. Again, I don't know where you've been reading otherwise.



    If you're looking for a closer comparison to the iPad, you'll have to wait for the Asus Transformer, because that is also an IPS display device (though with higher pixel density).



    But again, I'm not seeing what you're trying to say here. On one hand you're saying that there's next to no information out there on what a Tegra device can do, and on the other you're dismissing Tegra as being inferior. You cannot hold both positions.



    We have very good information as to what its capabilities are in a real design, and that is the test comparison between the Xoom and the iPad 2 on Anandtech, where the Tegra 2 design got thoroughly pounded by the iPad 2, and was even given a run for its money in some areas by the iPad 1.



    The problem for the Tegra 2 was that all of its good press came from the specs Nvidia gave out, and the ASSUMPTION that since Nvidia is primarily a graphics chip design firm, its graphics capabilities would be better than anything else around. These reporters conveniently forgot about Imagination. Not any more. The graphics of the Tegra are now acknowledged to be much inferior to Imagination's, an inferiority that isn't believed to be something Nvidia can make up this year. Of course, Imagination isn't sitting still either.
  • Reply 45 of 68
    shrike Posts: 494, member
    Quote:
    Originally Posted by d-range View Post


    There is NO WAY a Cortex A8 at 1 Ghz is equivalent with a single (3 Ghz) core as in the 360. No way AT ALL, not even nearly so. I don't know why someone at Epic is saying something like this, but a single core of the 360 CPU is in a whole different league of performance. It has 3x the clock speed, much faster memory bus, much more sophisticated, application controllable cache architecture, much higher floating point performance, much higher IPC, it's faster in every way imaginable. Even a single 360 core running at 1 Ghz would beat the pants off a Cortex A8 in terms of CPU speed.



    It'll depend on the load. The PowerPC-based PPE core inside the Xenon and the Cell basically traded away good branch prediction (not to mention OOOE) for a deep pipeline to enable the high clock speed. In other words, for streaming, media loads, it's great. For AI and gameplay code, it sucks.



    So, from a game developer's perspective (people who have to develop FPS-style games that require good gameplay and AI), I can definitely see how a 1 GHz Cortex-A8 could be about equivalent to a 3.2 GHz PPE core. The Cell PPE is a pretty imbalanced architecture.



    Developers were complaining at the time that their code was running at 1/3 to 1/10 of the expected speed!



    The A8 doesn't have OOOE or a huge branch predictor either, but its branch misprediction penalties are smaller too. It has 2 ALUs. It's fairly clean. The Cell PPE, on the other hand, has a lot of weird latencies in the architecture that can hamstring it: 2-cycle latency on instruction issue? 64-bit precision stalls the pipeline by 6 cycles? Penalties associated with branch mispredictions?
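A toy cycles-per-instruction model shows why a deep-pipeline in-order core at 3.2 GHz can lose much of its clock advantage on branchy code. All the numbers below are illustrative assumptions for the sake of the sketch, not measured figures for the PPE or the Cortex-A8:

```python
# Toy CPI model: effective throughput under branch-misprediction stalls.
# Every parameter value here is an assumed, illustrative number.

def effective_mips(clock_mhz, base_cpi, branch_frac, mispredict_rate, penalty):
    """Instructions/sec (in MIPS) given a mispredict penalty in cycles."""
    cpi = base_cpi + branch_frac * mispredict_rate * penalty
    return clock_mhz / cpi

# Deep pipeline, issues every other cycle, large mispredict penalty:
ppe_like = effective_mips(3200, base_cpi=2.0, branch_frac=0.2,
                          mispredict_rate=0.3, penalty=24)
# Shorter pipeline, dual-issue, smaller mispredict penalty:
a8_like = effective_mips(1000, base_cpi=0.6, branch_frac=0.2,
                         mispredict_rate=0.3, penalty=13)

print(f"PPE-like: {ppe_like:.0f} MIPS, A8-like: {a8_like:.0f} MIPS")
```

With these assumed penalties, the 3.2x clock advantage shrinks to well under 1.5x on the branchy mix, which is the shape of the argument being made here.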



    So, don't buy into the marketing hype that IBM, Sony and MS built here (3.2 GHz, XXX GFLOPS, etc.). They knew that the PPE would suck at certain types of loads.
  • Reply 46 of 68
    shrike Posts: 494, member
    Quote:
    Originally Posted by melgross View Post


    We have very good information as to what its capabilities are in a real design, and that in the test comparison between the Xoom and the iPad2 on Anandtech, where the Tegra2 design got thoroughly pounded by the iPad2, and was even being given a run for its money in some areas by the iPad1.



    The problem for the Tegra 2 was that all of it's good press came from the specs Nvidia gave out, and the ASSUMPTION that as Nvidia is primarily a graphics chip design firm its graphics capabilities would be better than anything else around. These reporters conveniently forgot about Imagination. Not any more. The graphics of the Tegra is now acknowledged to be much inferior to Imagination's, an inferiority that isn't believed to be able to be made up this year. Of course, imagination isn't sitting still either.



    Yup. There are still Android fans holding on to Nvidia's marketing of Tegra 2. Remember the 16 hours of HD video claims? Hehe.



    Seriously, it's pretty much proven that Tegra 2 is running last or second to last in CPU and GPU performance among the 2011 ARM SoCs: Apple A5, Samsung Exynos, TI OMAP 44x0, and Qualcomm Snapdragon MSM8xx0. My guess is that out of this group, Tegra 2 is last or second to last in aggregate performance, though it's a really tight grouping. The Apple A5's SGX543MP2 is the only clear winner as the best GPU of the bunch; CPU-wise, it's going to be a pretty tight grouping.
  • Reply 47 of 68
    d-range Posts: 396, member
    Quote:
    Originally Posted by Shrike View Post


    It'll depend on the load. The PowerPC-based (Cell) PPE core inside the Xenon and the Cell basically traded good branch prediction capabilities (not to mention OOOE either) for a deep pipeline to enable the high GHz. In other words, for streaming, media loads, it's great. For AI and gameplay code, it sucks.



    So, I can definitely see from a game developers perspective, people who have to develop FPS-style games that require good gameplay and AI, a 1 GHz Cortex-A8 could be about equivalent to a 3.2 GHz PPE core. Cell PPE is a pretty imbalanced architecture.



    Developers were complaining at the time that their code were running 1/3 to 1/10 slower!



    The A8 doesn't have OOOE or have huge branch predictions either, but its branch misprediction penalties are smaller too. It has 2 ALUs. It's fairly clean. Cell PPE on the other hand, there are a lot of weird latencies with the architecture that can hamstring it. 2-cycle latency on instruction issue? 64-bit precision stalls the pipeline by 6 cycles? Penalties associated with branch misprediction



    So, don't buy into the marketing hype that IBM, Sony and MS built here. (3.2 GHz, XXX GFLOPS, etc). They knew that the PPE would suck at certain types of load.



    Even with all these penalties and pipeline stalls, a 1 GHz A8 is not going to come close to a single Xenon core or Cell PPE, not even on heavily branched code such as AI. The difference in clock speed, FPU throughput, cache architecture and memory bandwidth is simply too big. I know the PPC cores in the 360 are incomparable in performance to, e.g., a G5, but we're comparing against an ARM core about as fast as a single-core Atom at 1 GHz, which is hideously slow.



    I'm really the first person to admit ARM designs are making huge inroads in terms of performance, and a dual-core Cortex A9 is really starting to look very interesting compared to low-end x86 chips, but a Cortex A8 beating a dual-threaded chip running at 3x the clock speed with pretty crazy FPU performance, at its own game, is really a bridge too far.
  • Reply 48 of 68
    InsideOut
    Quote:
    Originally Posted by d-range View Post


    Even with all these penalties and pipeline stalls a 1Ghz A8 is not going to come close to a single Xenon core or Cell PPE, not even on heavily branched code such as AI. The difference in clock speed, FPU throughput, cache architecture and memory bandwidth is simply too big. I know the PPC cores in 360's are incomparable in performance to e.g. a G5, but we're comparing against an ARM core about as fast as a single core Atom at 1 Ghz, which is hideously slow.



    I'm really the first person to admit ARM designs are making huge inroads in terms of performance, and a dual core Cortex A9 is really starting to look very interesting compared to low end x86 chips, but a Cortex A8 beating a dual-threaded chip running at 3x the clock speed and pretty crazy FPU performance, on it's own game, that's really a bridge too far.



    You seem to have missed one important thing.



    "Last year's A4 CPU used in the iPhone 4 and iPad is roughly "comparable to a single Xbox 360 core" Sweeney estimated."



    He never mentioned the Cortex A8; he was talking about the whole system-on-a-chip, which makes all your arguments meaningless.
  • Reply 49 of 68
    bigmac2 Posts: 639, member
    Quote:
    Originally Posted by d-range View Post


    Even with all these penalties and pipeline stalls a 1Ghz A8 is not going to come close to a single Xenon core or Cell PPE, not even on heavily branched code such as AI. The difference in clock speed, FPU throughput, cache architecture and memory bandwidth is simply too big. I know the PPC cores in 360's are incomparable in performance to e.g. a G5, but we're comparing against an ARM core about as fast as a single core Atom at 1 Ghz, which is hideously slow.



    I'm really the first person to admit ARM designs are making huge inroads in terms of performance, and a dual core Cortex A9 is really starting to look very interesting compared to low end x86 chips, but a Cortex A8 beating a dual-threaded chip running at 3x the clock speed and pretty crazy FPU performance, on it's own game, that's really a bridge too far.



    One thing is for sure: an SoC can beat any conventional motherboard architecture on latency. It can easily beat it on RAM and CPU-GPU bandwidth too. I haven't seen any spec for the real memory bandwidth of the A5 vs. the Xbox 360 CPU, but in terms of cache architecture and memory bandwidth, the Cell PPE may be a stream-processing beast, yet there is nothing to prevent the current generation of SoCs from getting similar memory bandwidth and better latency by sitting the GPU and the RAM on top of the CPU. The Xbox 360 is still a six-year-old design.
  • Reply 50 of 68
    d-range Posts: 396, member
    Quote:
    Originally Posted by InsideOut View Post


    You seem to have missed one important thing.



    "Last year's A4 CPU used in the iPhone 4 and iPad is roughly "comparable to a single Xbox 360 core" Sweeney estimated."



    He never mentioned the Cortex A8, he was talking about the whole system-on-a-chip which makes all your arguments meaningless.



    He's talking about 'the CPU'; to me, that means the CPU, not the CPU plus the GPU plus the RAM. Comparing a CPU core to a complete SoC doesn't make sense anyway; that would be along the lines of saying a car can do 300 mph because the engine can run it at 150 mph and the metal bits around it can do 150 mph when pushed off a cliff. Talk about meaningless.
  • Reply 51 of 68
    d-range Posts: 396, member
    Quote:
    Originally Posted by BigMac2 View Post


    One thing for sure, SoC chip could beat any conventional motherboard architecture in latency. It can easily beat RAM and CPU-GPU bandwidth too. I haven't seen any spec for the real memory bandwidth of the A5 vs the Xbox 360 cpu, but in term of Cache architecture and memory bandwidth, the Cell PPE may be a stream processing beast, but there is nothing to prevent a current generation of SoC from getting the similar memory bandwidth and better latency while sitting the GPU and the RAM on top of the CPU. The Xbox 360 still a 6 years old design.



    No, that's not true; RAM latency doesn't depend on how close the RAM is located to the CPU. The 360 runs its RAM at 3 GHz, by the way, so it's going to be much faster just because of that fact. Also, the 360 CPU has a much more flexible cache architecture; for example, it allows developers to lock cache lines and use them as ultra-fast micro-memory. All in all, most architectural aspects of the 360 CPU are more advanced: everything runs at higher clocks, with faster RAM and higher FPU throughput.



    Edit: I double-checked because I wasn't sure, but the 360 has 700 MHz GDDR3 RAM, so it's not running at the CPU clock like the PS3's. Still a lot faster than the RAM on the A4, though.
  • Reply 52 of 68
    bigmac2 Posts: 639, member
    Quote:
    Originally Posted by d-range View Post


    No, that's not true, the latencies in RAM don't depend on how close the RAM is located to the CPU. The 360 runs it's RAM at 3 Ghz by the way, so it's going to be much faster just because of that fact. Also, the 360 CPU has a much more flexible cache architecture, for example it allows developers to lock cache lines and use them as ultra fast micro-memory. All in all, most architectural aspects of the 360 CPU are more advanced, everything runs at higher clocks, with faster RAM, higher FPU throughput.



    The 360 has a conventional north/south-bridge motherboard, so every read from RAM needs an extra hop compared to Intel's newer Core architecture, which has the RAM controller inside the CPU; an SoC gets the same benefit. As for cache locking, every modern multi-core CPU with cache sharing between cores needs those features, so we can assume that any dual-core ARM has them right now. The 360 CPU design dates from 2003, so there is nothing exceptional here.



    BTW, check your specs again: the 360 runs its GDDR3 RAM at 700 MHz, not at the CPU frequency; nor does any system on the market.

    ref: http://hardware.teamxbox.com/article...cifications/p1
  • Reply 53 of 68
    shrike Posts: 494, member
    Quote:
    Originally Posted by d-range View Post


    He's talking about 'the CPU', to me, that means the CPU and not the CPU plus the GPU plus the RAM. ...



    Agree with you here. Sweeney is strictly talking about the CPU proper between the two devices. There is no other way to interpret the quote.
  • Reply 54 of 68
    menno Posts: 854, member
    Quote:
    Originally Posted by cgc0202 View Post


    I do not know if you actually used the Xoom and the iPad2 long enough to compare them, and make reliable anecdotal impressions. Your arguments are all over the place, when it suits you, you use information that compares a Xoom to the original iPad (but conveniently ignore the evidence comparing the Xoom to iPad2), see your original links first provided before they were challenged with relevant links between Xoom and iPad2.



    That is also true with reviews from various sources, you focused more on the one with least difference between the two devices. Did you even take the time to read more carefully about the nuanced of each comparison?



    To rely solely on reviews and what you read in the internet, as a basis of your conclusions, and then casting doubt on the statement of others, i.e., the game developer in question, because you believe they "have a stake on it" (which may or may not be true) is rather disingenuous, if not outright ridiculous.



    It may help if you buy a Xoom, and then show us your results.



    But then again, how can we be sure that you will not do a DaHarder act, where he claimed to own both the Xoom and the iPad*** but preferred the Xoom and the iPad is gathering dust (or something to that effect). And then, the credulity can be questioned when he claims he owns so many Apple products, but the holistic impression you get from his posts is that he does not like Apple products much and always touting a better product by some other company.



    I do not know about you or someone else, but if I tried a company once or twice, over the years, and they continue to give me a bad experience, why the heck will I keep on buying their product?



    CGC



    ***The flaw there again was a comparison between the Xoom and the "original iPad". Ones credulity may be put to question if one claims that he was actually comparing it with the iPad2 because if memory serves me, the post was before the iPad2 was yet to be on sale. And, even if iPad2 was already on sale, why would someone be lining up to buy an iPad2, if they had a not so good experience with the original iPad, let alone an overall mediocre experience with Apple products?



    I'll make this simple for you, OK? The links I posted were for Xoom reviews. When the Xoom came out, the iPad 2 wasn't around for testing, hence its absence from the data I linked. The links sophilism posted were for iPad 2 reviews. Since I have zero interest in owning an iPad, I wouldn't have seen those stats. It's really not a hard concept. Why would I link to an iPad review if someone asked me for information about Xoom battery life?



    I know people who do have both (they're tech reviewers/junkies) and they say that both can easily get them through a day or two. That's what I base my "practically comparable" statement on.



    Do you have a Xoom? I thought not. Don't tell me not to rely on what's read on the internet when that is ALL that you are doing. I think Apple makes some of the best computers around; I just don't like iOS. I do have an iPod touch (over 2 years old), so yes, I'm familiar with the OS.



    For the record, I don't own an iPad or a Xoom. If I get a tablet, it will most likely be an Android device, because that's what all my apps are wrapped up in and I prefer the more desktop-like experience it gives. I have used both the Xoom and the iPad 2 (though nowhere near long enough to test the battery).



    I know this is hard for you to believe, but if you look at it, ALL MY LINKS WERE TO XOOM REVIEWS. And even in the material comparing the Xoom to the iPad 2, the only place it seems to fall short is 720p video playback. Browsing and standard definition are very comparable (to the point where the average user would NOT notice a difference).
  • Reply 55 of 68
    bigmac2 Posts: 639, member
    Quote:
    Originally Posted by Shrike View Post


    Agree with you here. Sweeney is strictly talking about the CPU proper between the two devices. There is not other way to interpret the quote.



    True, and while it looks like I defend the A5 a lot, I still consider the 360 CPU to be much more powerful than any mobile SoC.



    But from what I can see, and what this article is all about, SoCs will be the future of mobile and gaming platforms for the next few years, and at some point they could be the next generation of desktop computing.
  • Reply 56 of 68
    d-range Posts: 396, member
    Quote:
    Originally Posted by BigMac2 View Post


    The 360 got a conventional North-south bridge motherboard, so every reading from the RAM need extra step to reach if you compare to Intel newer Core architecture with the ram controller within the CPU, SoC got the same benefit.



    It's not as simple as that. Either way, it doesn't matter much; from a practical point of view, the RAM in the 360 is much faster than the LPDDR1 on the A4.



    Quote:

    As for the cache locking, every moderne multi-core CPU with cache sharing between cores need those feature, so we can assume that any dual core ARM have it right now.



    Maybe; I'm not sure, as I don't know much about the newer dual-core ARMs. I do know that the A8 pretty much lacks any form of cache control that could help multithreaded performance. He was talking about the A4, and that's a single-core A8.



    Quote:

    The 360 CPU have been design since 2003, so there is nothing exceptional here.



    I agree with you there. Nothing exceptional, just brute force. You can't compare it to a ~10W mobile part, even though it's 7 years old.



    Quote:

    BTW, check back your specs, the 360 is running his DDR3 ram @ 700Mhz... not at CPU frequency... neither any system on the market.

    ref: http://hardware.teamxbox.com/article...cifications/p1



    As you can see, I already corrected myself right before you mentioned it. The PS3's RAM does run at the CPU clock, though.
  • Reply 57 of 68
    bigmac2 Posts: 639, member
    Quote:
    Originally Posted by d-range View Post


    As you can see I already corrected myself right before you mentioned it . The PS3 RAM does run at CPU clock though.



    True, the PS3's XDR RAM runs at the CPU clock, and the VRAM at 700 MHz. Strangely enough, the specs say the memory bandwidth is 25 GB/s for the XDR and 22 GB/s for the VRAM, which puzzles me given the big frequency difference between the VRAM and the XDR RAM on the PS3.



    And I grant you the point about the brute force of the 50+ watt 360 CPU; for now it is still presumptuous to compare Cell and ARM CPUs. But if you consider the whole system, mobile platforms like the iPad have some strengths that put them within reach of true gaming platforms, and in some limited ways they exceed current gaming consoles and even desktop computers (e.g., NAND flash storage gives them SSD-like performance). In a way, the iOS ecosystem looks a lot like a gaming console's: Apple has the same level of control over its devices and developers as Sony, Nintendo and Microsoft, which has proven very successful for the whole gaming industry.
  • Reply 58 of 68
    shrike Posts: 494, member
    Quote:
    Originally Posted by d-range View Post


    Even with all these penalties and pipeline stalls a 1Ghz A8 is not going to come close to a single Xenon core or Cell PPE, not even on heavily branched code such as AI. The difference in clock speed, FPU throughput, cache architecture and memory bandwidth is simply too big. I know the PPC cores in 360's are incomparable in performance to e.g. a G5, but we're comparing against an ARM core about as fast as a single core Atom at 1 Ghz, which is hideously slow.



    I'm really the first person to admit ARM designs are making huge inroads in terms of performance, and a dual core Cortex A9 is really starting to look very interesting compared to low end x86 chips, but a Cortex A8 beating a dual-threaded chip running at 3x the clock speed and pretty crazy FPU performance, on it's own game, that's really a bridge too far.



    Sweeney didn't say that. He said "comparable to a single Xbox 360 core." That doesn't mean beating. It may even mean a little slower (10%, 20%, maybe even 30% slower).



    I think you're putting too much credence in the "marketing" specs of the PPE. For single-threaded, branchy code, the performance can be as "bad" as or worse than a 1 GHz Cortex-A8. Developer rumors were 1/10 the performance. The penalties can be that bad. Instructions can only be issued every other clock cycle! A 1-1.5 GHz PowerPC G4 was comparable to a 3.2 GHz PPE. No wonder they implemented 2-way SMT. For streaming loads, a 3.2 GHz PPE would obviously be 2x or 3x faster on single-threaded code.



    On aggregate, over a game with a mixed load of instructions, something a developer like Sweeney would see, comparable may be the right word. With 3 cores and the other hardware in the Xbox 360, it obviously has more horsepower, 2x-3x, than a Cortex-A8 system. But for core-to-core comparisons, I think it makes sense.



    In retrospect, the era that produced NetBurst, Cell, and the PPE was a market-driven dead end. Atom is similar, and is suffering for it. Even EPIC has a philosophical similarity (I'll get to this). Back then, clock frequency was the number-one parameter in the performance and marketing of CPUs, to the point that designs reached the economic, practical limits of power consumption for personal computing. They sacrificed single-threaded, spaghetti-code performance for streaming performance because the GHz, GFLOPS, and GB/s were so much more marketable.



    Cell and the PPE were perhaps the zenith of the imbalance. These processor architectures were essentially hoping that the compiler would save them, that somehow software designers would make their code beneficially multithreaded. The PPE and Atom essentially require multiple threads to keep their pipelines fed with instructions. EPIC went even lower level: it expected compilers and software design to extract parallelism out of the instruction stream, let alone having multithreaded code.



    That was a bad bet, as parallelism is a decades-long problem. CPU architecture? It can cycle every four years. Gradual improvements in single-threaded performance could be made every 2 years. We're caught in the same trap with multi-core CPUs. After the Cortex-A15, single-threaded performance improvement is going to slow down while core counts ramp up. But software that takes advantage of the increasing cores will be scarce.
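The point that extra cores help little without parallel software is Amdahl's law in a nutshell: if only a fraction p of a program parallelizes, speedup on any number of cores is capped at 1 / (1 - p). A quick sketch:

```python
# Amdahl's law: speedup on n cores when a fraction p of the work
# is parallelizable. The p values below are arbitrary examples.

def amdahl_speedup(p, n):
    """Overall speedup: serial part (1 - p) runs as-is, parallel part p / n."""
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.3, 0.5, 0.95):
    print(f"p={p}: 4 cores -> {amdahl_speedup(p, 4):.2f}x, "
          f"16 cores -> {amdahl_speedup(p, 16):.2f}x")
```

Even with 95% of the code parallel, 16 cores deliver well under 16x; with typical "spaghetti" game code the ceiling is far lower, which is the trap described above.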
  • Reply 59 of 68
    hiro Posts: 2,663, member
    Quote:
    Originally Posted by tipoo View Post


    Ehh? Why would I count the GPU when I'm specifically talking about the CPU comparison he made? He made a per-core processor comparison. Not GPU. I'm not talking about Apple's 9x graphical performance claim.



    Now you are the one making the bogus claim: Sweeney said A4, YOU said Cortex-A8 core. An A4 includes the GPU, a memory controller and on-package memory. It's an A4 vs. one of the Xbox Xenon cores, a Xenon core with very little cache and no onboard memory controller.



    Is the A4 "fast" by a MIPS benchmark? No. Could the A4, with all of the designed-in synergy and latency mitigation of on-package and on-die placement, become as computationally powerful? Not such a big stretch.



    Most non-CS types have no idea how pathetic most CPU utilization is, how much of the chip's computational power is wasted by less-than-wonderful systems engineering and programming. The intervening 10 years since the original PPC POWER4 core shipped and eventually morphed into the derived Xenon core have allowed a lot of total-systems engineering to happen.



    It is quite naive to discount those 10 years of engineering, even if a MIPS snapshot of a single operating mode in both CPUs doesn't measure up to a division by three. The problem is that the MIPS benchmark is specifically designed to ignore as much of that systems engineering as possible! MIPS is just a single component of a benchmarking suite that needs some serious interpretation to get right. Cherry-pick any single component and you are guaranteed to get the big picture wrong.
  • Reply 60 of 68
    shrike Posts: 494, member
    Quote:
    Originally Posted by d-range View Post


    I'm really the first person to admit ARM designs are making huge inroads in terms of performance, and a dual core Cortex A9 is really starting to look very interesting compared to low end x86 chips, but a Cortex A8 beating a dual-threaded chip running at 3x the clock speed and pretty crazy FPU performance, on it's own game, that's really a bridge too far.



    Tried to find a good benchmark. There are none. The best I could find is this.



    Linux Dhrystone 2 benchmarks





    Machine                 Result                    Index
    G4/1250 (Mac Mini)      3,896,391.2               174.2
    Thunderbird/1400        3,161,216.2               141.3
    Xbox 360 ("Xenon")      3,015,837.2 (estimated)   134.8
    Celeron M/1500          2,759,615.6               123.4
    iBook G3/700            1,537,968.9               68.8





    My memory vaguely recalls the discussions at the time of the IBM PPE in the Cell and the Xenon/Waternoose. Remember, in that time frame, 2003-2005, there was a lot of noise about what Apple was going to do, as the PPC 970fx and 970mp were running out of steam and reaching a bad perf/watt state. There was a lot of angst about this on Apple boards/forums. One of the wishful thoughts at the time was that IBM could adapt the PPE for Apple Macs.



    My vague recollection was that, after many posts about architecture differences, a 4 GHz PPE would be about like a 2 GHz 970fx, maybe even a 2 GHz PPC G4, core to core, leaving Apple none the better.



    The simplistic, single-threaded integer benchmarks above basically show this. But once you take into account spaghetti, branchy code, things get ominous. Good multithreaded code, which these types of architecture are good at, is obviously still decades off, except for the embarrassingly parallel problems (solving differential equations, streaming ops, multimedia, etc.).



    Obviously, going Intel was the best move.