
Speed of Apple Intel dev systems impress developers - Page 3

post #81 of 134
Quote:
Originally posted by T'hain Esh Kelch
I was just about to quote all those guys who actually saw the irony and write "They got it!"

Damn newbs.

Irony and sarcasm rear their heads in a ton of posts here, people, and TEK just expressed his. Time to move on...
...we have assumed control
post #82 of 134
Quote:
Originally posted by Apple

Time to move on...

*sniff, sniff* Are you breaking up with me? - IBM
-wt-
post #83 of 134
Quote:
Originally posted by 00100011
*sniff, sniff* Are you breaking up with me? - IBM


Apple to IBM "I need some space. Just wanted to tell you it's not you but me"
post #84 of 134
Quote:
Originally posted by Brendon
What is needed is to deliver the perfect balance of integer to FPU to SIMD performance, and maybe Intel knows this.

No such thing. There are as many "perfect" balances as there are uses for these machines.

Quote:
Maybe Apple can see that the Intel processors are better balanced.

Apple's transition is based on an evaluation of future Intel processors vs. future IBM processors. This will have included all aspects of machine performance, including power consumption and cost(s). The better integer performance is certainly a big plus, and the inferior FPU/VPU performance is a minus (unless Intel's roadmap demonstrates those are better in the future too). Intel claims they are making FPU improvements to the Pentium-M chip line (which Apple will adopt).
Providing grist for the rumour mill since 2001.
post #85 of 134
Quote:
Originally posted by THT
Have to violently disagree with you. If an app has 4 threads, a dual 970mp PowerMac using the current G5 motherboard will destroy a Yonah desktop in performance; all of the 970mp cores each have 1 MB of on-die L2 cache, after all.

I think we're starting to talk past each other-- I don't care all that much about the capabilities of the processors' execution units. Today's single-processor Pentium M 2.1GHz chips have 2MB of L2 cache per chip, and Intel has shown a great deal of expertise in doing good cache implementations both on-chip and on motherboard. Today's single-processor G5 has 512KB of L2 cache, and Apple has a history of comparably bad motherboard caching and chipsets. If I were a betting man, I'd put my money on Intel to create an architecture that can feed more cores with data for their next generation of chips. Today's CPUs tend to be slightly starved as it is... adding 2 more cores without a dramatic redesign is going to simply offer no benefits.

As for the next generation-- Yonah vs. 970mp-- and how they compare, it will be interesting. Intel is pulling more desktop technology into Yonah, and shoring up some of the Pentium M's weak points, while at the same time lowering power consumption and voltage. The G5 will be boosted to 1MB of cache per core, while the Yonah will have 2MB of cache shared between the cores, which will probably offer a significant win for the Yonah without VERY careful thread affinity algorithms on the G5. (In the worst-case scenario of two threads working on the same data, the G5 has effectively half the L2 cache of the Yonah and twice the memory bus overhead.)
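The worst-case cache argument above can be put into a rough back-of-the-envelope model. This is only a sketch: the function name and the `overlap` parameter are made up for illustration, and it ignores L1 caches, coherency traffic, and replacement policy entirely.

```python
def unique_capacity_mb(cores, private_mb_per_core, shared_mb, overlap):
    """Rough amount of *distinct* data the L2 level can hold.

    overlap is a hypothetical parameter: the fraction (0.0 to 1.0) of each
    core's cached data that the other cores are also working on.
    """
    private_total = cores * private_mb_per_core
    # With private caches, data used by every core is stored once per core,
    # so every copy beyond the first is wasted capacity.
    duplicated = (cores - 1) * private_mb_per_core * overlap
    return shared_mb + private_total - duplicated


# 970MP as described in the thread: 2 cores, 1 MB private L2 each.
g5_same_data = unique_capacity_mb(2, 1.0, 0.0, overlap=1.0)
g5_disjoint = unique_capacity_mb(2, 1.0, 0.0, overlap=0.0)

# Yonah as described in the thread: 2 cores, one shared 2 MB L2.
yonah_same_data = unique_capacity_mb(2, 0.0, 2.0, overlap=1.0)

print(g5_same_data, g5_disjoint, yonah_same_data)
```

Under this toy model the private-cache design only reaches its full 2 MB when the two threads touch completely different data, which is why thread affinity matters so much in the argument.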

In summary, I suspect the real limitation of multi-core performance will be memory busses and caching, where the Yonah appears to be ahead of the 970mp. I don't think anyone can predict whether a 2-core Yonah will beat a 2-core 970mp, a dual 2-core 970mp, or neither. I don't think any statements there are "ludicrous", although obviously someone's going to be wrong.

Description of Yonah cache:
http://www.tomshardware.com/hardnews...02_143758.html
post #86 of 134
Quote:
Originally posted by Booga
Multithreading - I would expect the Yonah to win. Intel now has way more experience with desktop dual-core than anyone else, except perhaps AMD. The quad-core G5 is a nice thought, but it would very much depend on the cache architecture to overcome the anemic RAM configuration, and Apple has shown no evidence to be clueful in this regard. (a single threaded app, in contrast, will not take advantage of multiple CPUs, so if you are just running that one single threaded app, then whether you have 1, 2, or 4 cores is irrelevant)

Vector operations - I would expect the G5 to win for scientifically-targeted vector operations, and the Yonah to win for games-oriented vector operations. A lot of opportunities to vectorize games that exist in SSE3 aren't as easy to use (ie. slower) with Altivec with all the data massaging required. On the other hand, Altivec really does scream for certain imaging and scientific apps.

Floating point - G5 definitely wins from what we know so far. It's a super chip for floating point.

Memory bandwidth - Yonah, hands down. Intel is 667MHz DDR2 now, while Apple is stuck at 400MHz DDR. Bidirectional pipes to the CPU are nifty and all, but to then bottleneck them with slow RAM is silly.

Disk access - I don't see how either CPU has an advantage here. Disk speeds will be the bottleneck for virtually any bus or CPU. However, PCI-E is sure as heck better than Apple's northbridge/southbridge at shuffling the data between motherboard components here.

Graphics Card acceleration - Yonah win, easily. AGPx8 is a slow, slow bus compared to PCI-E.

Multithreading: You appear to have misunderstood my point. I am not talking about Hyperthreading, multicores, or anything hardware related. I am simply saying that a multithreaded app will take advantage of multiple CPUs (or multiple cores). Therefore, a well threaded app running on 4 G5 cores clocked at 2.5 GHz or higher will easily outrun the same app running on 2 Pentium-M cores clocked around 2 GHz (the targeted clock speed for Yonah's release).

While it may be technically true, it is also disingenuous to state that Intel has more dual core experience than anyone else. What is that, two or three months of experience? IBM has been making dual core chips since they released Power4 in 2001 - they are the pioneers of dual core technology!

Finally, if you think by "quad-core G5" I mean 4 cores on a single die, that is wrong. A G5 machine with four cores will have two 970MP processors, with two cores per processor. A dual-core Powerbook will have one dual-core Yonah.

Vector operations - Games oriented vector operations, blah blah blah. Get real, pretty much any game that seriously challenges a 2+ GHz G5 or Pentium-M is going to also require a good graphics card. We don't know exactly what GPU technology will be available next year, but if you look at what's out there today, the fastest chipset that could realistically fit in a Powerbook is the Mobility X700 (the Mobility X800 and the Geforce Go 6800 are only designed for 10+ lb "desktop replacement" laptops...they suck way too much power for a Powerbook). The Mobility X700 isn't bad, but it will be completely slaughtered, chewed up, and spit out by high end desktop cards like the Radeon X850 or GeForce 7800. If you think a 5 lb laptop GPU is ever going to come close to competing with a high end desktop GPU, you are smoking some pretty strong stuff.

Memory bandwidth - Yonah's FSB is expected to be 667 MHz (167 MHz quad-pumped). EACH 970 MP will have a 1250 MHz E-Bus (625 MHz double-pumped), so even when only one core is being used AND all of the memory usage is in only one direction, the G5 system will still have similar bandwidth to Yonah (5.2 GB/sec for Yonah, 4.9 GB/sec for the G5). Of course, that is the worst case scenario. In general usage, the G5 system will have up to FOUR TIMES the FSB bandwidth of the Yonah system (9.8 GB/sec per 970MP times two 970MPs).
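The bandwidth figures above can be reproduced with simple arithmetic. In this sketch the transfer rates come from the post, but the bus widths (a 64-bit shared FSB for Yonah, 32 bits in each direction for the G5's E-Bus) are assumptions added here for illustration, as is the guess that the post's GB figures divide MB by 1024.

```python
# Effective transfer rates, as quoted in the post (millions of transfers/sec).
YONAH_FSB_MHZ = 667   # 167 MHz quad-pumped
G5_EBUS_MHZ = 1250    # 625 MHz double-pumped

# ASSUMED bus widths in bytes per transfer (not stated in the post).
YONAH_FSB_BYTES = 8          # 64-bit bus shared by reads and writes
G5_EBUS_BYTES_PER_DIR = 4    # 32 bits in each direction


def peak_mb_per_s(rate_mhz, bytes_per_transfer):
    """Theoretical peak in MB/s; real-world throughput will be lower."""
    return rate_mhz * bytes_per_transfer


yonah = peak_mb_per_s(YONAH_FSB_MHZ, YONAH_FSB_BYTES)
g5_one_dir = peak_mb_per_s(G5_EBUS_MHZ, G5_EBUS_BYTES_PER_DIR)
g5_both_dirs = 2 * g5_one_dir   # read and write links used at once
g5_system = 2 * g5_both_dirs    # two 970MP chips, each with its own bus
ratio = g5_system / yonah

print(f"Yonah FSB:           {yonah / 1024:.1f} GB/s")
print(f"970MP, one direction: {g5_one_dir / 1024:.1f} GB/s")
print(f"Two 970MPs vs Yonah:  {ratio:.2f}x")
```

With these assumptions the ratio comes out a little under four, so "up to four times" reads as a rounded theoretical peak rather than a measured number.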

Furthermore, most laptop chipsets do not use dual channel memory. There is no guarantee the Powerbook will.

As for DDR vs DDR2, you are assuming that the G5 chipset will not support DDR2 by next year. You are also ignoring the latency issues with DDR2 that make DDR2/533 benchmark worse than DDR400 in most cases.

Disk access - Hint: it's not in the CPU. Remember, the original poster was claiming that a 2006 PowerMac 970MP is going to be slower than a 2006 Yonah Powerbook. This is demonstrably untrue for many, many important applications.

The PowerMac can use a 4 channel RAID, or even XServe RAID, for hundreds of MB/sec of bandwidth. The Powerbook will at best have a single, small 7200 rpm mobile HD which might max out at 40 or 50 MB/sec at most. Again, the Yonah Powerbook gets slaughtered. And PCI-E is irrelevant...the limiting factor is clearly the drive (and after that, the drive interface), not 133 MHz PCI vs. PCI-E.

Graphics Card acceleration - Okay, now you're just being completely nuts. You really do believe a POS Mobility X700 is going to outrun an AGP X850 or AGP GeForce 6800GT just because the former has PCI-E. WRONG. The X850 and 6800GT are going to ABSOLUTELY DESTROY the Mobility chip. To think otherwise is to have absolutely no understanding of how GPUs work and what AGP/PCI-E actually do. (let me guess, you also think the PCI-E Radeon X300 will beat an AGP Radeon X850)

Clearly, all of my original points stand. I will assume that you are not completely ignorant, but that you simply had no idea what systems were actually being compared. Next time READ THE THREAD before you respond to someone's post. It will save you some embarrassment!
post #87 of 134
Quote:
Originally posted by THT

It's almost like saying a Yonah laptop will be faster than the dual 2.7 GHz PowerMac G5 Apple currently ships. That's pretty crazy when we know Yonah will only be at 2.2 GHz.

Hmm, that's a good point. Except he is trying to compare a 4 core G5 to a 2 core Powerbook, so what he's really saying is that a single core 2 GHz Dothan lightweight laptop should be outrunning a two core Dual G5/2.5.

And yet strangely, if we assume that a 2 GHz Dothan is about equivalent to a 3 GHz P4 (actually, the Dothan is probably a bit slower for the class of applications we're talking about, which benefit from vector, floating point, and Hyperthreading enhancements on the P4, but it won't be a huge difference) we can see that the G5 would be:

2.4 times faster in After Effects rendering
1.9 times faster in Cinema 4D rendering
1.9 times faster in Photoshop MP actions
1.4 times faster in Photoshop SP actions
2.1 times faster in Bryce (actually, I think the Dothan would do somewhat better than the P4 here, but it's not going to change anything in the end)

Source is http://www.barefeats.com/macvpc.html

In other words, the Dual G5 wins in every single benchmark, often by 2x or more.

The Dothan would also lose in games. Obviously there is no Dothan laptop gaming equivalent on Barefeats, but you can compare the Dual G5 FPS numbers to Dothan/Mobility X700 numbers from other sites

(for example, http://www.hothardware.com/viewartic...eid=637&cid=10 )

Halo: G5/X800 gets 57 FPS, Dothan/X700 gets 30 FPS
Doom3: G5/X800 gets 20 FPS, Dothan/X700 gets 16 FPS.
Unreal 2K4: G5/X800 gets 60/137 FPS (flyby/botmatch), Dothan/X700 gets 20 FPS (31 FPS with no AA). (this one from PC Mag)

Note that these numbers seriously overstate the Dothan/X700 performance, because the G5 is always tested at 1600x1200 (4x AA where applicable), whereas the Dothans are tested at 1400x1050 for Halo and 1280x1024 for Doom and Unreal.
post #88 of 134
Quote:
Originally posted by Booga

I don't think anyone can predict whether a 2-core Yonah will beat a 2-core 970mp, a dual 2-core 970mp, or neither.

Watch me. I will predict that the fastest 4 core 970MP G5 system shipping in summer 2006 (assuming Apple releases such a beast) with the best Mac ATI or NVidia card will beat the fastest Yonah-based Powerbook that Apple ships by summer 2006 (assuming that Apple sticks to its Intel Mac shipping schedule), in a majority of the following applications:

Photoshop (say, an average of the Barefeats MP and SP filter tests)
Cinebench
After Effects
Doom3 at any resolution of 1280x1024 or above, AA or no AA (both machines running same resolution/settings)
Halo at any resolution of 1280x1024 or above, AA or no AA (both machines running same resolution/settings)

and a whole bunch more, but these ones we can be pretty sure that we will see benchmarks from at Barefeats.

If you are so sure that it is very hard to predict whether 970 MP PowerMac will beat Yonah Powerbook, then you should be happy to take this bet at odds better than 50/50. In fact, I will offer you THREE-TO-ONE odds in your favor, so this should be a great bet for you. I'd be gladly willing to wager $30,000 on this (i.e., if you win you get $30,000, if I win I get $10,000). How much would you be willing to wager?
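For anyone weighing the offer, the odds work out as follows. A tiny sketch; the function names are made up for illustration, and the dollar amounts are the ones from the post.

```python
def break_even_probability(win_amount, lose_amount):
    """Win probability at which taking the bet has zero expected value."""
    return lose_amount / (win_amount + lose_amount)


def taker_expected_value(p_win, win_amount, lose_amount):
    """Expected payoff for the person accepting the bet."""
    return p_win * win_amount - (1 - p_win) * lose_amount


# Terms from the post: the taker wins $30,000 or loses $10,000.
p_needed = break_even_probability(30_000, 10_000)
ev_if_coin_flip = taker_expected_value(0.5, 30_000, 10_000)

print(f"Break-even win probability: {p_needed:.0%}")
print(f"Expected value at 50/50:    ${ev_if_coin_flip:,.0f}")
```

In other words, anyone who truly believes the outcome is a coin flip should find three-to-one odds attractive, which is the point of the challenge.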
post #89 of 134
Quote:
Originally posted by bigmig
Watch me. I will predict that the fastest 4 core 970MP G5 system shipping in summer 2006 (assuming Apple releases such a beast) with the best Mac ATI or NVidia card will beat the fastest Yonah-based Powerbook that Apple ships by summer 2006 (assuming that Apple sticks to its Intel Mac shipping schedule), in a majority of the following applications:

Photoshop (say, an average of the Barefeats MP and SP filter tests)
Cinebench
After Effects
Doom3 at any resolution of 1280x1024 or above, AA or no AA (both machines running same resolution/settings)
Halo at any resolution of 1280x1024 or above, AA or no AA (both machines running same resolution/settings)

and a whole bunch more, but these ones we can be pretty sure that we will see benchmarks from at Barefeats.

If you are so sure that it is very hard to predict whether 970 MP PowerMac will beat Yonah Powerbook, then you should be happy to take this bet at odds better than 50/50. In fact, I will offer you THREE-TO-ONE odds in your favor, so this should be a great bet for you. I'd be gladly willing to wager $30,000 on this (i.e., if you win you get $30,000, if I win I get $10,000). How much would you be willing to wager?

What if the 970 MPs never get released? Do you automatically lose?
post #90 of 134
Quote:
Originally posted by Brendon
I would like to propose an observation that could be supported by the dev kits. We have for years been saying that the Mac is tuned for everyday life stuff but not for bench testing. Is it possible that we could look at the Intel processors as being somewhat the same, in that they may be well balanced for desktop processing?

I'm fairly confident that the development kits are faster or seem faster primarily for 2 things. The user interface performance (app startup, window creation, filesystem) is driven by integer performance and burst memory performance, and the Pentium 4 660 in the dev kits is very good at that. Intel in general does very well in burst memory performance.

If they start benching Photoshop, FCP and whatnot, a 2.7 GHz 970fx should be about the same as the 3.6 GHz P4 660. If there was a GUI speed benchmark, I think the P4 660 would win hands down. A 2 GHz Dothan should win hands down against a 2, maybe 2.2 GHz 970fx.
post #91 of 134
Quote:
Originally posted by bigmig
Photoshop (say, an average of the Barefeats MP and SP filter tests)
Cinebench
After Effects
Doom3 at any resolution of 1280x1024 or above, AA or no AA (both machines running same resolution/settings)
Halo at any resolution of 1280x1024 or above, AA or no AA (both machines running same resolution/settings)

I suspect the G5 will win in Photoshop. I'm not sure why you limit the Doom3 and Halo to 1280x1024 or above... do you think the Yonah PowerBook would win with lower resolutions? In any case, I think the Yonah will be competitive on the games, even considering laptop GPU suckage. Don't know much about Cinebench or After Effects characteristics, so can't say.

And what about browsing, checking Mail, using iPhoto, copying data over a network, opening/closing/resizing windows, doing searches, or any of a hundred other things the typical user will want to do with their laptop?

I also have no clue what sort of chipset Apple will go with. From history, they could easily hobble their PowerBooks simply to maintain market segmentation. In short, I think the technology will be comparable, but I'd say the odds are worse than 3-to-1 that Apple will screw it up again and squander the opportunity. You also specify the best ATI or nVidia graphics card available... which Apple never ships as a factory configuration anymore.

So I'm not going to take your bet, because 1. I'm not the betting type, and 2. as I maintained earlier, I have no idea whatsoever how these things will benchmark against each other and I don't want to lose money over such a silly argument.
post #92 of 134
Quote:
Originally posted by THT
If there was a GUI speed benchmark, I think the P4 660 would win hands down.

I'm curious about the real profiles here -- the slow part of the MacOS X GUI has always (apparently) been Quartz 2D. Quartz 2D is a pixel rasterizer built around a floating point display model. This means it is most likely going to be primarily a combination of FPU and vector work, and moving a lot of memory. The Pentium4 will do pretty well here because this stuff pipelines nicely to take full advantage of clock rate, and the P4's fast bus and caches will be a big benefit. The (existing) Pentium-M will look a bit weak by comparison. Depending on the balance between memory usage and FPU/VPU computation the G5 should be strong, and the 970MP with its bigger caches ought to do noticeably better. Without real profiling data it isn't possible to know...
Providing grist for the rumour mill since 2001.
post #93 of 134
Quote:
Originally posted by THT
I'm fairly confident that the development kits are faster or seem faster primarily for 2 things. The user interface performance (app startup, window creation, filesystem) is driven by integer performance and burst memory performance, and the Pentium 4 660 in the dev kits is very good at that. Intel in general does very well in burst memory performance.

If they start benching Photoshop, FCP and whatnot, a 2.7 GHz 970fx should be about the same as the 3.6 GHz P4 660. If there was a GUI speed benchmark, I think the P4 660 would win hands down. A 2 GHz Dothan should win hands down against a 2, maybe 2.2 GHz 970fx.

I know that a test at Tom's Hardware shows that Yonah can be faster than the fastest Itanium, and Pentiums. So I think that a dual Yonah machine will do just fine against a 970MP. When you stated that a 2.7 GHz 970fx is comparable to a P4 660, were you talking about a single processor? From what I have seen, the dual 970s run neck and neck with the single Pentiums and Itaniums. I don't think that the current Pentiums and Itaniums were designed with dual configurations in mind, but that will change as Intel switches to Yonah and beyond.
Please consider throwing extra cycles at better understanding Alzheimer's, Mad Cow (CJD), ALS, and Parkinson's disease go here http://folding.stanford.edu/
post #94 of 134
Quote:
Originally posted by THT
I'm fairly confident that the development kits are faster or seem faster primarily for 2 things. The user interface performance (app startup, window creation, filesystem) is driven by integer performance and burst memory performance, and the Pentium 4 660 in the dev kits is very good at that. Intel in general does very well in burst memory performance.

If they start benching Photoshop, FCP and whatnot, a 2.7 GHz 970fx should be about the same as the 3.6 GHz P4 660. If there was a GUI speed benchmark, I think the P4 660 would win hands down. A 2 GHz Dothan should win hands down against a 2, maybe 2.2 GHz 970fx.

Since this 'old-tech' Dothan easily clocks up to 2.5 GHz as shown here http://www.tomshardware.com/cpu/2005...ntium4-10.html you will note that for all REAL benchmarks (rar, ripping, etc.) it wins hands down over current gen P4 and AMD chips. So I believe in most daily comp chores this lil champ would be great. If the next gen delivers with an uprated SIMD and possible future on-die mem controller, we can stay cool in both thermal and performance.
'If these words were people, I would embrace their genocide.' - Maddox
post #95 of 134
Quote:
Originally posted by Programmer
I'm not at all understanding the statement "most computing tasks that are bandwidth limited are unidirectional". I can't imagine bandwidth sensitive applications where you simply suck vast amounts of data IN to a processor and have little to no output. I'd welcome education on that.

That's easy: it's not true.

<snipping the rest which is not at issue.>

I would strenuously say it actually is true. What are you looking at right now? That display has been created by many times more code and data read into the processor over the FSB than ever goes out. Any type of quality encoding reads many times more input data than it writes as encoded output. Your forte, gaming code, will ship tens of times more code upstream directly to the graphics card over the bus due to textures. The poly info shipped downstream off the CPU and over to the graphics card pales in comparison to the bytes shipped upstream there. From there almost everything goes out to the monitor or local storage (VRAM), not back to primary storage (RAM). Searches? Tons of data comes in. Does it match? No, throw it away unchanged. Yes, keep that one little slice of everything you looked through. I could go on for a long time here.

The POWER series and 970 style balanced elastic bus is superior for server type transactions where you expect to ship large amounts of data off the CPU, but not much comparison or computation, and can use the clear one way downstream bus to avoid clobbering the upstream bus when shipping those packets out. In an aside though, those outgoing packets only go part-way down that bus to RAM, they get diverted at the DMA module to the appropriate I/O interface, usually one of the networks. But in computation intensive applications you typically consume far more data (and instructions which only ever go upstream) than finished data you ship back to RAM or other downstream devices.
.
post #96 of 134
Quote:
Originally posted by ZoranS
Since this 'old-tech' Dothan easily clocks up to 2.5 GHz as shown here http://www.tomshardware.com/cpu/2005...ntium4-10.html you will note that for all REAL benchmarks (rar, ripping, etc.) it wins hands down over current gen P4 and AMD chips. So I believe in most daily comp chores this lil champ would be great. If the next gen delivers with an uprated SIMD and possible future on-die mem controller, we can stay cool in both thermal and performance.

So a Yonah chip is basically two Dothans on one die, with improvements. So a MB design that uses two Yonahs is a system that could be faster than a MB utilizing 4 Pentium 660s.
Please consider throwing extra cycles at better understanding Alzheimer's, Mad Cow (CJD), ALS, and Parkinson's disease go here http://folding.stanford.edu/
post #97 of 134
Quote:
Originally posted by skatman
I'm curious if OSX for Intel supports hyperthreading?

OS X is multithreaded and supports multi-threaded applications, as it has from the beginning. Hyper-threading (SMT) will just make this support more valuable.
.
post #98 of 134
Quote:
OS X is multithreaded and supports multi-threaded applications, as it has from the beginning. Hyper-threading (SMT) will just make this support more valuable.

Hyperthreading (within a single core) is actually closer to SMP than SMT, but the scheduling is different from both.
I was wondering if OSX supports single hyperthreading. Maybe this is a non-issue with Yonah.
post #99 of 134
Quote:
Originally posted by Brendon
I know that a test at Tom's Hardware shows that Yonah can be faster than the fastest Itanium, and Pentiums.

Let's stop including Itaniums as part of the discussion. In fact, let's just stop including Itanium in any discussion involving Apple altogether. It's not proper until they enter the enterprise server market.

Quote:
So I think that a dual Yonah machine will do just fine against a 970MP.

That wasn't the point of contention. Remember the original comment: a Yonah laptop will outrun a quad-PPC PowerMac, and will be the first Mac laptop in years to be faster than desktops. This essentially means a 2.2 GHz Yonah laptop versus a dual 2.5 GHz 970mp. It's an apples and oranges comparison at best. For things that Apple would ship a quad-PPC PowerMac for, it will crush a Yonah laptop at those things, and most of those things would involve multithreaded apps.

One of the things that I think we all agree on is that Yonah will perform single threaded spaghetti integer code better.

Quote:
When you stated that a 2.7 GHz 970fx is comparable to a P4 660, were you talking about a single processor?

Single processor, assuming properly tuned code, and "comparable" meaning that for some things one processor will do much better than the other, but on the whole, roughly comparable.

The dual 2.7 970fx PowerMac is a competitive machine against dual x86 machines at its price point for Apple's market.
post #100 of 134
Quote:
Originally posted by ZoranS
Since this 'old-tech' Dothan easily clocks up to 2.5 GHz as shown here http://www.tomshardware.com/cpu/2005...ntium4-10.html you will note that for all REAL benchmarks (rar, ripping, etc.) it wins hands down over current gen P4 and AMD chips.

It's not surprising, but I will also note that a ~2.5 GHz Dothan or Yonah doesn't really exist. Dothan is limited to ~2.1 GHz, Yonah to ~2.2 GHz.

If Intel ships a ~2.5 GHz Sossaman, with appropriate core logic chipset support for desktop machines, then sure, Apple should use them. I'm sure they would rather switch earlier than later.

Quote:
So I believe in most daily comp chores this lil champ would be great.

Who said that it wouldn't?

Quote:
If the next gen delivers with an uprated SIMD and possible future on-die mem controller, we can stay cool in both thermal and performance.

Merom-based Conroe, the 8th generation IA-32 architecture (w/EM64T), looks to be when Apple will start switching the Power Macs to the Intel platform. We're all hoping it will be great. Heck, Sossaman could be good enough.
post #101 of 134
Quote:
Originally posted by Brendon
So a Yonah chip is basically two Dothans on one die, with improvements. So a MB design that uses two Yonahs is a system that could be faster than a MB utilizing 4 Pentium 660s.

Just a tiny nuance we should remember about Yonah and Dothan. Yonah is basically two Dothan "cores" on a die with 2 MB of shared L2, as far as we know. Dothan has 1 core with 2 MB of L2. Text illustration:

Code:


Dothan:

 ----------------------
| ---------            |
|| Dothan  |           |
|| core    |           |
| ---------            |
| -------------------- |
||                    ||
||   2 MB L2 cache    ||
||                    ||
| -------------------- |
 ----------------------

Yonah:

 ----------------------
| ---------  --------- |
|| Dothan  || Dothan  ||
|| core+   || core+   ||
| ---------  --------- |
| -------------------- |
||                    ||
||   2 MB L2 cache    ||
||                    ||
| -------------------- |
 ----------------------



Whether a dual Yonah system is faster than a quad P4 660 system is all dependent on the clock rate of the CPUs. If Yonah is limited to 2.2 GHz, probably not. Competitive in some things, but on the whole, probably not.

Back to the Yonah vs 970mp comparison. Consider the improvement from the 970fx to the 970mp: 2x the cache per processor. The 970fx has 512k cache, but each core of the 970mp has 1 MB of dedicated L2. Grand total of 2 MB on the die. Twice the cache is equivalent to adding about 5 to 10% in clock rate (within limitations), so each core of the 970mp could be about as fast as a 2.7 GHz 970fx. That's pretty good. If Apple used dual channel DDR2-667 for a dual 2.5 GHz 970mp machine, that would be some good performance improvements for streaming memory apps.
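The "about as fast as a 2.7 GHz 970fx" estimate follows from applying the 5 to 10% rule of thumb to a 2.5 GHz core. A quick sketch of that arithmetic; the function name is hypothetical, and the rule of thumb itself is the poster's, not a measured result.

```python
def equivalent_clock_ghz(clock_ghz, cache_gain):
    """Clock speed an otherwise identical core would need to match the
    performance gained from the larger cache (cache_gain as a fraction)."""
    return clock_ghz * (1.0 + cache_gain)


# 2.5 GHz 970mp core, with the doubled L2 assumed to be worth 5% to 10%.
low = equivalent_clock_ghz(2.5, 0.05)
high = equivalent_clock_ghz(2.5, 0.10)

print(f"Roughly equivalent to a {low:.2f} to {high:.2f} GHz 970fx")
```

That range of roughly 2.6 to 2.75 GHz is where the 2.7 GHz comparison comes from.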

Yonah compared to Dothan, on the other hand, is more complicated. Single-threaded performance will be like Dothan, perhaps a little better due to the "digital media boost" improvements Intel is doing. But multithreaded performance, hmm, the cores will be a little bit slower because they have to share the same cache. A Yonah core would be almost like a Dothan with only 1 MB of L2, and so slightly slower than the real 2 MB L2 Dothan. So, it's not exactly like 2 Dothans on one die.

The multicore speedup factor for Yonah compared to Dothan won't be, or shouldn't be, as good as the multicore speedup factor for a 970mp compared to the 970fx.
post #102 of 134
Quote:
Originally posted by skatman
Hyperthreading (within a single core) is actually closer to SMP than SMT, but the scheduling is different from both. I was wondering if OSX supports single hyperthreading. Maybe this is a non-issue with Yonah.

If OS X needs to support "Hyperthreading", I'm sure Apple will provide the support for it. It should be quite easy.

Hyperthreading is SMT, simultaneous multithreading. It is simply Intel's marketing term for SMT. When using the term, it only applies to a single processor or a single core. A processor with SMT will present itself to the operating system as a multiprocessor system, as if it were an SMP system.

Once again, or perhaps for the first time:

Thread: operating system term for a stream of instructions. An application is typically a stream of instructions, a single stream of instructions, and hence is single threaded. Applications with more than one stream of instructions are multithreaded.

A non-SMT non-CMP non-SMP CPU can only execute one thread at a time. It cannot have 2 threads mixed in the execution pipeline at the same time. Multitasking operating systems empty the execution pipeline (I think) every time they switch threads. It only appears to be executing at the same time because the switching occurs over microsecond time spans, if not less.

SMP: acronym for symmetric multiprocessor. A machine that has 2 processors like Apple's dual CPU machines. The term came about before the advent of multicore processors, and is not in much use anymore. These machines can execute 2 threads simultaneously, in parallel, and hence, will provide good performance enhancements for multithreaded applications.

Way back, 6 years ago, people argued the merits of SMP and ASMP (I think), asymmetric multiprocessor. Mac OS 8/9 could only run on one CPU and used the second CPU as a dedicated processor, as it were. Mac OS X could run on both processors and is able to run apps/threads on any processor it chooses, independently; i.e., the processors are symmetric.

CMP: acronym for chip multiprocessor. A CPU that has multiple processors on the same die. People now refer to these as dual-core, quad-core or multi-core processors. Multiple cores mean multiple threads can be executed at the same time, in parallel. A CMP CPU appears to the operating system as an SMP machine.

SMT: acronym for simultaneous multithreading. A CPU can execute multiple threads at the same time. The instructions from multiple threads could be in the same execution pipeline of the processor. The operating system would typically see a SMT processor as a SMP machine. Hyperthreading is merely Intel's marketing name for it, like Velocity Engine is for AltiVec.

There are some variations in hardware multithreading, such as vertical and horizontal microthreading, which I don't really remember the details of anymore. But the general essence of hardware multithreading is the ability of a single CPU or core to have multiple threads executing, mixed, in parallel, in its pipeline.
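
A quick way to see the "appears to the operating system as an SMP machine" point is to ask the OS how many processors it thinks it has. This is only a small sketch using Python's standard `os.cpu_count()`; the `logical_processors` wrapper name is my own. The number it reports is logical processors, so separate CPUs, cores, and SMT hardware threads all get lumped into one count.

```python
import os

def logical_processors():
    """Return the number of logical processors the OS reports, or None if unknown."""
    return os.cpu_count()

count = logical_processors()
if count is None:
    print("The OS did not report a processor count.")
else:
    # Cores and SMT hardware threads are not distinguished here.
    print(f"The OS sees {count} logical processor(s).")
```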
post #103 of 134
Quote:
Originally posted by THT
If OS X needs to support "Hyperthreading", I'm sure Apple will provide the support for it. It should be quite easy.

It seems that the current x86 version of Mac OS X supports HyperThreading.
JLL

95% percent of the boat is owned by Microsoft, but the 5% Apple controls happens to be the rudder!
post #104 of 134
Quote:
Originally posted by THT
Just a tiny nuance we should remember about Yonah and Dothan. Yonah is a basically two Dothan "cores" on a die with 2 MB of shared L2, as far as we know. Dothan has 1 core with 2 MB of L2. Text illustration:
/snip/

Back to the Yonah vs 970mp comparison. Consider the improvement from the 970fx to the 970mp: 2x the cache per processor. The 970fx has 512k cache, but each core of the 970mp has 1 MB of dedicated L2. Grand total of 2 MB on the die. Twice the cache is equivalent to adding about 5 to 10% in clock rate (within limitations), so each core of the 970mp could be about as fast as a 2.7 GHz 970fx. That's pretty good. If Apple used dual channel DDR2-667 for a dual 2.5 GHz 970mp machine, that would bring some good performance improvements for streaming memory apps.
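
The 2.7 GHz figure is just the 5 to 10% rule of thumb applied to a 2.5 GHz part. A minimal sketch of that arithmetic, assuming a 2.5 GHz base clock (the speed mentioned for the hypothetical dual 970mp machine); the `equivalent_clock` helper is a made-up name:

```python
def equivalent_clock(base_ghz, gain_fraction):
    """Clock a smaller-cache chip would need to match, given a fractional gain from cache."""
    return base_ghz * (1.0 + gain_fraction)

BASE_GHZ = 2.5  # assumed 970mp clock

for gain in (0.05, 0.10):
    print(f"{gain:.0%} gain -> about {equivalent_clock(BASE_GHZ, gain):.3f} GHz equivalent")
```

That brackets the estimate between roughly 2.6 and 2.75 GHz, so "about 2.7 GHz" sits inside the range.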

Yonah compared to Dothan, on the other hand, is more complicated. Single-threaded performance will be like Dothan, perhaps a little better due to the "digital media boost" improvements Intel is doing. But multithreaded performance, hmm, the cores will be a little bit slower because they have to share the same cache. A Yonah core is almost like a Dothan with only 1 MB L2, and slightly slower than the real 2 MB L2 Dothan. So, it's not exactly like 2 Dothans on one die.

The multicore speedup factor for Yonah compared to Dothan won't be, or shouldn't be, as good as the multicore speedup factor for a 970mp compared to the 970fx.

Whether a dual Yonah system is faster than a quad P4 660 system is all dependent on the clock rate of the CPUs. If Yonah is limited to 2.2 GHz, probably not. Competitive in some things, but on the whole, probably not.

From THG: "Our Pentium M 770 was rock stable at 2.48 GHz." And: "We overclocked the Pentium M 770 to 2.56 GHz by increasing the frequency of the front side bus to 160 MHz, without even having to raise the core voltage."
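
For anyone wondering how the FSB number turns into a core clock: core clock is the FSB base clock times the CPU multiplier. The sketch below backs the multiplier out of the quoted 2.56 GHz at 160 MHz figures; it assumes the multiplier stays fixed while overclocking, and the function names are just for illustration.

```python
def core_clock_mhz(fsb_mhz, multiplier):
    """Core clock = front side bus base clock x CPU multiplier."""
    return fsb_mhz * multiplier

def implied_fsb_mhz(core_mhz, multiplier):
    """FSB base clock needed to reach a given core clock at a fixed multiplier."""
    return core_mhz / multiplier

# Multiplier implied by the quoted overclock (2560 MHz at a 160 MHz FSB).
multiplier = 2560 / 160

print(f"Implied multiplier: {multiplier:g}")
print(f"FSB needed for 2.48 GHz: {implied_fsb_mhz(2480, multiplier):g} MHz")
print(f"Check: 160 MHz x {multiplier:g} = {core_clock_mhz(160, multiplier):g} MHz")
```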

This page LINK shows that the stock 2.13GHz Dothan is running above the Pentium 660 in this test. Of course the overclocked version is running faster but with little extra heat. So I think that with some improvements a 2.2GHz core will run about the same as the Pentium 660.

As far as the "original statement" (which was the 7th post in this thread) goes, which is it: "The speed of the Intel dev kits are impressing developers" or some crap about an iBook outrunning a PowerMac? Yeah, I read that and I thought it was funny. That's it, right, a joke? My statements are pointing to something else: the idea of transition times for the PowerMac. Everyone here appears to believe that it will be very late in '06 or early in '07. I believe it depends on the speeds, and from THG we can see that a Yonah can run faster than the Pentium 660, which is what is in those dev kits. If Apple wanted, they could switch sooner rather than later: one Yonah for the Books and little Macs and two for the PowerMacs, depending on how Intel does at getting the performance up. That is where I am heading. Also note that from what I have read, Rosetta is running fine, so once the big major apps and anything deemed a necessity have been ported, Apple is in the driver's seat and can pick the time of their choosing to switch.
Please consider throwing extra cycles at better understanding Alzheimer's, Mad Cow (CJD), ALS, and Parkinson's disease. Go here: http://folding.stanford.edu/
post #105 of 134
Quote:
Originally posted by skatman
Hyperthreading (within single core) is actually closer to SMP than SMT, but the scheduling is different than both.
I was wondering if OSX supports single hyperthreading. Maybe this is a non-issue with Yonnah.

Hyper-threading within a single core IS SMT, both by Intel's definition and by the definition of SMT [page 3, top 2 full paragraphs right column]. Hyper-threading is just Intel's registered trademark for it, like Moto's Altivec for the G4 SIMD engine and IBM's VMX for a G5 implementation of that standard. Registered trademarks don't change the underlying technology; they just give the marketing department a unique term to tout that the competitors can't use.

Guess THT beat me to it, all of it--even the Altivec stuff! Maybe I should read to the end first?
.
post #106 of 134
grrr. wrong button.
.
post #107 of 134
Quote:
Originally posted by Hiro
I would strenuously say it actually is true. What are you looking at right now? That display has been created by many times more code and data read into the processor over the FSB than ever goes out.

Except that most of the operations read the frame buffer, modify it, and write it back.

Quote:
Any type of quality encoding uses many times the amount of input data to the encoded output. Your forte, gaming code, will ship 10's of times more code upstream directly to the graphics card over the bus due to textures.

In the case of 3D graphics, most of the data explosion happens in the GPU not in the CPU. The CPU is usually just copying between buffers, if it touches the data at all.

Quote:
Searches? Tons of data comes in--does it match? No, throw it away unchanged. Yes, keep that one little slice of everything you looked through. I could go on for a long time here.

And I could go through a whole list of algorithms which have both an input and output stream. Sure, if you're spending all your time doing searches then you're right, but most apps aren't doing that.

Quote:
The POWER series and 970 style balanced elastic bus is superior for server type transactions where you expect to ship large amounts of data off the CPU, but not much comparison or computation, and can use the clear one way downstream bus to avoid clobbering the upstream bus when shipping those packets out.

The 970 bus is designed so that the direction of the packets never needs to be "turned around". This saves a considerable amount of overhead compared to a single bi-directional bus. Also, I didn't say input and output are perfectly balanced, I just denied that most bandwidth intensive algorithms are almost completely without either input or output streams. Even if all you are doing is reading, the requests for what to read go down the write bus so it is handling some of the traffic. It also allows cache write back operations to happen in parallel to read operations.
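
Here is a toy model of the turnaround argument, for what it's worth. Everything in it is an assumption made for illustration: one bus cycle per transfer on the full-width shared bus, two cycles per transfer on each half-width one-way link, and a 2-cycle turnaround penalty. None of these are real 970 or P4 bus parameters.

```python
def bidirectional_cycles(transfers, turnaround):
    """Shared full-width bus: 1 cycle per transfer, plus idle cycles on each direction change.

    transfers: string of 'R' (read) and 'W' (write) in issue order.
    """
    cycles = 0
    previous = None
    for direction in transfers:
        if previous is not None and direction != previous:
            cycles += turnaround  # bus must be turned around before reuse
        cycles += 1
        previous = direction
    return cycles

def split_cycles(transfers):
    """Two half-width one-way links in parallel: 2 cycles per transfer, no turnaround.

    Total time is set by whichever link carries more traffic.
    """
    return 2 * max(transfers.count("R"), transfers.count("W"))

for pattern in ("RWRWRWRW", "RRRRRRRR"):
    print(pattern, bidirectional_cycles(pattern, turnaround=2), split_cycles(pattern))
```

In this model the split bus wins clearly on mixed read/write traffic and loses on pure reads, which is more or less the disagreement in this thread: it comes down to what the traffic mix really looks like.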
Providing grist for the rumour mill since 2001.
post #108 of 134
Quote:
Originally posted by kim kap sol
What if the 970 MPs never get released? Do you automatically lose?

Of course not - that's why I said, assuming they are released. If they're not released, the bet would be off, since there is nothing to compare with. Same thing for the Intel Powerbooks...if they're not released, the bet would be off.
post #109 of 134
Quote:
Originally posted by THT
If there was a GUI speed benchmark, I think the P4 660 would win hands down. A 2 GHz Dothan should win hands down against a 2, maybe 2.2 GHz 970fx.

Didn't the P4 Macs do surprisingly well in the (emulated) XBench UI test? I forget the numbers that were being thrown around...not all the reports seemed to be consistent with each other.
post #110 of 134
Quote:
Originally posted by Booga
I suspect the G5 will win in Photoshop. I'm not sure why you limit the Doom3 and Halo to 1280x1024 or above... do you think the Yonah PowerBook would win with lower resolutions? In any case, I think the Yonah will be competitive on the games, even considering laptop GPU suckage. Don't know much about Cinebench or After Effects characteristics, so can't say.

And what about browsing, checking Mail, using iPhoto, copying data over a network, opening/closing/resizing windows, doing searches, or any of a hundred other things the typical user will want to do with their laptop?

I also have no clue what sort of chipset Apple will go with. From history, they could easily hobble their PowerBooks simply to maintain market segmentation. In short, I think the technology will be comparable, but I'd say the odds are worse than 3-to-1 that Apple will screw it up again and squander the opportunity. You also specify the best ATI or nVidia graphics card available... which Apple never ships as a factory configuration anymore.

So I'm not going to take your bet, because 1. I'm not the betting type, and 2. as I maintained earlier, I have no idea whatsoever how these things will benchmark against each other and I don't want to lose money over such a silly argument.

Doom3/Halo: The reason to limit to reasonably high resolutions (1280x1024 and above) is that those are the resolutions that people buying high end machines want to play at.

However, I think that at a low enough resolution (maybe 800x600), the Yonah Powerbook could match or beat the G5 in FPS. Many games are single threaded, so 2 vs 4 cores is irrelevant, and if the game is ported from x86 to PPC, or if it is heavily dependent on L2 cache size (the P4 Extreme has a big L2 cache and is targeted in large part at serious gamers), then I could easily see the Yonah PB producing higher framerates at very low resolutions (where the GPU doesn't matter).

Cinebench (and After Effects) rendering scales very well across multiple CPUs, so the 4-core G5 will easily beat the 2-core Yonah. I believe they also make substantial use of FPU power.
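
Amdahl's law gives a rough feel for what "scales very well" buys you. This is only a sketch: the 95% parallel fraction is a number I picked for illustration, not a measurement of Cinebench or After Effects, and it ignores per-core speed differences between the chips.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

PARALLEL_FRACTION = 0.95  # assumed, not measured

for cores in (2, 4):
    speedup = amdahl_speedup(PARALLEL_FRACTION, cores)
    print(f"{cores} cores: about {speedup:.2f}x over a single core")
```

The closer the parallel fraction is to 1, the closer 4 cores get to being twice as fast as 2, which is why renderers favor the quad.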

Browsing, checking mail, and UI I think will be faster on the Yonah PB, in some cases. However, there are two big caveats here. First, I am typing on a 1.2 GHz iBook right now, and honestly the mail client, browser, and UI run fast enough that I wouldn't care much if they ran any faster. The limiting factors are generally my own response time (for the UI and Mail), and the bandwidth of my broadband connection (for Safari and Mail attachments). The only time I get serious slowdowns is when I start paging to disk (I only have 768 MB RAM). Obviously, this is a laptop problem that will also affect the Yonah Powerbook...it's just not as easy/cheap to cram a couple GB of RAM into a laptop.

Second, what you are talking about is basically perceived speed - how fast does the system feel doing routine tasks. I have personally found that perceived speed/responsiveness depends a lot on hard drive speed. Obviously, the PowerMac G5 will have a much faster drive than the Yonah Powerbook. So I wouldn't be surprised if the G5 still "feels" faster in a lot of routine tasks, even if Safari benchmarks a bit faster on the Powerbook.

As for iPhoto, it leverages Altivec and the GPU, so it will probably be faster on the quad G5 than the Yonah PB.
post #111 of 134
Quote:
Originally posted by bigmig
Didn't the P4 Macs do surprisingly well in the (emulated) XBench UI test? I forget the numbers that were being thrown around...not all the reports seemed to be consistent with each other.

Wasn't it concluded here that XBench was not a great application for benchmarking?
"You think I'm an arrogant [expletive] who thinks he's above the law, and I think you're a slime bucket who gets most of his facts wrong." - Steve Jobs
post #112 of 134
Quote:
Originally posted by DHagan4755
Wasn't it concluded here that XBench was not a great application for benchmarking?

Damn straight!
post #113 of 134
Quote:
Originally posted by DHagan4755
Wasn't it concluded here that XBench was not a great application for benchmarking?

Never said it was. In fact, I think it's terrible. But that doesn't mean that the information it provides isn't better than nothing at all, if you know how to interpret it...
post #114 of 134
Quote:
Originally posted by bigmig
Never said it was. In fact, I think it's terrible. But that doesn't mean that the information it provides isn't better than nothing at all, if you know how to interpret it...

Except that it's inaccurate, unreliable, and never gives the same numbers twice. Interpret it how you will.
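
If a benchmark never gives the same number twice, the usual way to deal with it is to run it several times and look at the spread rather than any single result. A minimal sketch with Python's standard `timeit` and `statistics` modules; the `measure` helper and the toy workload are made up here and have nothing to do with how XBench itself works.

```python
import statistics
import timeit

def measure(func, repeats=5, number=1000):
    """Time func several times and summarize the run-to-run spread (in seconds)."""
    samples = timeit.repeat(func, repeat=repeats, number=number)
    return {
        "samples": samples,
        "min": min(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),  # needs at least 2 samples
    }

result = measure(lambda: sum(range(1000)))
print(f"min {result['min']:.6f}s  mean {result['mean']:.6f}s  stdev {result['stdev']:.6f}s")
```

If the standard deviation is large compared to the difference between two machines, the comparison isn't telling you much.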
post #115 of 134
All I know is that anything will be faster than my 550 MHz G4 PowerBook.
MacBook Pro Core 2 Duo 2.33GHZ 3GB Ram
120GB HDD, 6x DL SuperDrive
15.4" Glossy Screen
Mac OS 10.5.1

Counting the cars on the freeway below... lost in the music... all the foolishness of our lives...
post #116 of 134
Quote:
Originally posted by Programmer
Except that most of the operations read the frame buffer, modify it, and write it back.

And most of that is in the GPU, not the CPU for basic GUI related stuff.

Quote:
In the case of 3D graphics, most of the data explosion happens in the GPU not in the CPU. The CPU is usually just copying between buffers, if it touches the data at all.



That's my point.

Quote:
And I could go through a whole list of algorithms which have both an input and output stream. Sure, if you're spending all your time doing searches then you're right, but most apps aren't doing that.



We both can, but 50 years of CS Operating systems research & development pretty much says upstream gets much more traffic in the long haul.

Quote:
The 970 bus is designed so that the direction of the packets never needs to be "turned around". This saves a considerable amount of overhead compared to a single bi-directional bus. Also, I didn't say input and output are perfectly balanced, I just denied that most bandwidth intensive algorithms are almost completely without either input or output streams. Even if all you are doing is reading, the requests for what to read go down the write bus so it is handling some of the traffic. It also allows cache write back operations to happen in parallel to read operations.

We pretty much agree here.
.
post #117 of 134
Quote:
Originally posted by Programmer
Most users will be pleased with the performance of x86 Intel Macs because it is good at what most users do most of the time. Those few who care more about FPU or VPU performance are going to want to hang onto those PPC Macs for a little longer.

No doubt. I wouldn't be surprised if savvy video shops hold onto G5 PowerMacs long after the newer Intel PowerMacs are available. Especially with H.264, HD-DVD etc. on the horizon.
- - - - - - - -

- J B 7 2 -
post #118 of 134
1. Okay, now everyone who actually makes their living with After Effects and Cinema 4D raise their hands?

2. Photoshop is much slower on the Mac according to tomshardware, if you try it out on some of his slightly less biased tests.

3. At least with Intel chips, when Apple announces a new PowerMac you'll actually be able to buy it instead of waiting 6 months for it to show up only to be outdated.

4. I've never judged the speed of my computer by how long it took a Photoshop filter to run. I've judged it by the wait time: moving files around, opening windows, copying data, checking email, loading web pages, and navigating the file browser.... Currently this means a beachball for part of your day... Cough cough, on the Intel DEV system I have yet to see the beach ball (and yes, I have plenty of clean installed super fast G5s). GHz does matter for that "fast snappy" response of the OS. I don't care so much if I wait 2 seconds longer for a filter to render... I care if things are snappy...


Quote:
Originally posted by bigmig
Hmm, that's a good point. Except he is trying to compare a 4 core G5 to a 2 core Powerbook, so what he's really saying is that a single core 2 Ghz Dothan lightweight laptop should be outrunning a two core Dual G5/2.5

And yet strangely, if we assume that a 2 Ghz Dothan is about equivalent to a 3 Ghz P4 (actually, the Dothan is probably a bit slower for the class of applications we're talking about, which benefit from vector, floating point, and Hyperthreading enhancements on the P4, but it won't be a huge difference) we can see that the G5 would be:

2.4 times faster in After Effects rendering
1.9 times faster in Cinema 4D rendering
1.9 times faster in Photoshop MP actions
1.4 times faster in Photoshop SP actions
2.1 times faster in Bryce (actually, I think the Dothan would do somewhat better than the P4 here, but it's not going to change anything in the end)

Source is http://www.barefeats.com/macvpc.html

In other words, the Dual G5 wins in every single benchmark, often by 2x or more.

The Dothan would also lose in games. Obviously there is no Dothan laptop gaming equivalent on Barefeats, but you can compare the Dual G5 FPS numbers to Dothan/Mobility X700 numbers from other sites

(for example, http://www.hothardware.com/viewartic...eid=637&cid=10 )

Halo: G5/X800 gets 57 FPS, Dothan/X700 gets 30 FPS
Doom3: G5/X800 gets 20 FPS, Dothan/X700 gets 16 FPS.
Unreal 2K4: G5/X800 gets 60/137 FPS (flyby/botmatch), Dothan/X700 gets 20 FPS (31 FPS with no AA). (this one from PC Mag)

Note that these numbers seriously overstate the Dothan/X700 performance, because the G5 is always tested at 1600x1200 (4x AA where applicable), whereas the Dothans are tested at 1400x1050 for Halo and 1280x1024 for Doom and Unreal.
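
To put a rough number on how much the resolution difference matters, you can compare pixels drawn per second instead of raw FPS. This is a crude sketch: it assumes rendering cost scales linearly with pixel count and ignores AA settings and any CPU-bound part of the frame. The FPS and resolution figures are the ones quoted above; `pixels_per_second` is just a helper name.

```python
def pixels_per_second(fps, width, height):
    """Pixels rendered per second at a given frame rate and resolution."""
    return fps * width * height

# (game, G5 fps at 1600x1200, Dothan fps, Dothan width, Dothan height)
results = [
    ("Halo", 57, 30, 1400, 1050),
    ("Doom3", 20, 16, 1280, 1024),
]

for game, g5_fps, dothan_fps, width, height in results:
    g5 = pixels_per_second(g5_fps, 1600, 1200)
    dothan = pixels_per_second(dothan_fps, width, height)
    print(f"{game}: raw FPS ratio {g5_fps / dothan_fps:.2f}x, "
          f"pixel throughput ratio {g5 / dothan:.2f}x")
```

Under that assumption the gap in favor of the G5/X800 gets wider than the raw FPS numbers suggest.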
post #119 of 134
If we're going to talk about Photoshop, then the truth is that most work does not require the fastest cpu on the block.

If you work with file sizes smaller than about 50 MB, as long as you have enough memory and a good clean scratch drive, you should be fine. There are few operations that can't be handled quickly enough on a decent G4 system. A 9800 Radeon is fine for this as well.

As you go up in size, it becomes more of a hassle to deal with Gaussian Blurs and rotations, but until you start to get to 100 MB or so in size, it's not a hardship.
post #120 of 134
Quote:
Originally posted by webmail
2. Photoshop is much slower on the Mac according to tomshardware, if you try it out on some of his slightly less biased tests.

The Ars Technica Forum PS7 Benchmark compilations show that the G5 machines provide approximately equivalent performance to x86 competitors. For a while, the dual 2.5 GHz Power Mac G5 was the top performer on the forum. In Photoshop, the G5 has about the same clock-for-clock performance as Opteron/Athlon and performs like a P4 clocked about 1 GHz higher.

Photoshop is definitely an area where the G5 competes very well, and it is at bang/buck parity with x86 on the high end systems.

Quote:
3. At least with Intel chips, when Apple announces a new PowerMac you'll actually be able to buy it instead of waiting 6 months for it to show up only to be outdated.

Don't count your chickens before they hatch. There is only one thing we are relatively sure about right now: Apple will be building some nice computers in some nice form factors with the Intel platform, and performance will be at approximate parity with Windows/Intel machines.

Other than that, there isn't much else. Both AMD and Intel preannounce availability of their high end processors, and those are subject to long delays in shipment. It will depend on how they play their cards.

Quote:
4. I've never judged the speed of my computer by how long it took a Photoshop filter to run. I've judged it by the wait time: moving files around, opening windows, copying data, checking email, loading web pages, and navigating the file browser.... Currently this means a beachball for part of your day... Cough cough, on the Intel DEV system I have yet to see the beach ball (and yes, I have plenty of clean installed super fast G5s). GHz does matter for that "fast snappy" response of the OS. I don't care so much if I wait 2 seconds longer for a filter to render... I care if things are snappy...

So the UI (launching apps, window server, and filesystem) is faster or appears faster on the dev kits?

I'm fairly confident that's because of better burst memory and integer performance, not really GHz.