Why go all Dualies in the PM line?


Comments

  • Reply 21 of 43
    Yevgeny,



    Would you please give us your opinion as to why the Barefeats performance tests show no significant improvement between the old and the new dual 1 GHz, despite the 25% increase in bus speed?



    Thanks,



    Eirik



    PS. This is really an open question to all, of course.
  • Reply 22 of 43
    programmer Posts: 3,458 member
    It's possible that Apple is pulling a fast one and running the MPX bus asynchronously from the memory bus, and claiming a "system bus speed of 167 MHz" by quoting the memory bus speed. That would be a cheap and sleazy marketing trick, but it would explain how they managed it so quickly (no need for a new processor from Moto, and Moto hasn't announced a new processor yet).



    We'll have to wait and see... either proper technical docs will come out or somebody will run a "real" memory benchmark. The Barefeats results are suspicious because they are so close... the new machines have smaller/faster L3 cache so you'd expect some difference.
  • Reply 23 of 43
    steves Posts: 108 member
    [quote]Originally posted by Brussel:
    The hype comes from people who say "X will automagically speed up everything you do because the OS is MP-aware!" That's nonsense.
    [/quote]





    True, but on a dual machine you could be running a PS filter, playing an MP3, and checking your e-mail without any significant loss of performance or responsiveness. While it's true that not every task will automatically be faster, your SYSTEM will automatically be faster in OS X than the same setup in OS 9. This is the point most people are trying to make when discussing OS X.
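
    (To make that concrete, here's a minimal sketch of what an MP-aware OS can exploit: two completely independent CPU-bound tasks run as separate threads, and a dual-CPU OS X box is free to put each one on its own processor. The workload and loop counts below are made up for illustration; this is plain C++ with pthreads, not anyone's actual benchmark.)

    #include <pthread.h>
    #include <cstdio>

    // One independent task, e.g. an MP3 decode or a filter pass.
    static void* busyWork(void* label)
    {
        volatile double x = 0.0;
        for (long i = 0; i < 200000000; ++i)
            x += i * 0.5;                       // pure CPU work, no shared data
        std::printf("%s done\n", (const char*)label);
        return 0;
    }

    int main()
    {
        pthread_t a, b;
        // Two unrelated workloads: the scheduler may run each thread on its own
        // CPU, so together they finish in roughly the time one takes alone,
        // instead of back to back as on a single-CPU machine.
        pthread_create(&a, 0, busyWork, (void*)"task A");
        pthread_create(&b, 0, busyWork, (void*)"task B");
        pthread_join(a, 0);
        pthread_join(b, 0);
        return 0;
    }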



    [quote]Originally posted by Eirik Iverson:
    BRussell, I do not contest the idea that multiprocessors can improve performance. I am arguing that for big, main memory intensive jobs their value will be constrained by the 167 MHz bus bottleneck.
    [/quote]



    I'll be the first to agree that bandwidth is an issue for performance. I'd also be the first to point out that this issue is way overrated in forums like this. It's almost used as an excuse for poor CPU performance, i.e. "my G4 would be as fast as your P4 if only it ran on a faster bus." This is pure BS! The vast majority of applications are not significantly bandwidth limited. For example, a couple of months ago PC World compared the performance of several P4 systems in the 2 to 2.2 GHz range. Systems with SDR, DDR, and Rambus memory were compared. Across a variety of applications the performance difference was generally less than 10%. I'd argue that Macs, running at lower clock rates, would see even less of a difference in overall speed, especially when you factor in the L3 cache, which, for most apps, basically nullifies the argument for faster buses.



    In short, yes, faster buses are a good thing. However, they're nowhere near the issue people make them out to be. Outside of a synthetic benchmark such as STREAM, I challenge anyone to show that bus speed is the bottleneck it's made out to be in this forum, across a variety of applications, especially with L3 cache in the mix.
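
    (For the curious, a genuinely memory-bound test along the lines SteveS mentions is easy to sketch. The snippet below streams two arrays much larger than any L3 cache through main memory and reports a rough copy bandwidth. The 64 MB array size and the use of gettimeofday are just illustrative choices, not part of STREAM itself.)

    #include <sys/time.h>
    #include <cstdio>

    int main()
    {
        // 8M doubles per array = 64 MB each, far beyond a 1-2 MB L3 cache.
        const long N = 8 * 1024 * 1024;
        double* a = new double[N];
        double* b = new double[N];
        for (long i = 0; i < N; ++i) { a[i] = 1.0; b[i] = 0.0; }  // touch every page first

        timeval t0, t1;
        gettimeofday(&t0, 0);
        for (long i = 0; i < N; ++i)
            b[i] = a[i];                       // one read + one write per element
        gettimeofday(&t1, 0);

        double sec = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        double megabytes = 2.0 * N * sizeof(double) / (1024.0 * 1024.0);
        std::printf("copy: %.1f MB/s (b[0] = %.0f)\n", megabytes / sec, b[0]);

        delete[] a;
        delete[] b;
        return 0;
    }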



    Steve
  • Reply 24 of 43
    kecksy Posts: 1,002 member
    The old dual 1GHz had 2MB L3 cache.

    The new dual 1GHz has 1MB L3 cache.



    Perhaps the smaller L3 cache nullifies the performance advantage of a 167MHz bus.



    This would explain why the old dual 1GHz is faster than the new dual 1GHz in Photoshop.



    Looks like an extra megabyte of L3 cache gives you the same performance boost as a 167MHz bus. Apple really should have put more memory in these machines.



    DDR SDRAM doesn't do squat either. I guess it's just there for marketing purposes.



    What the next PowerMacs need is RapidIO. Either that or 128-bit MPX. Damn you, Motorola!



    [ 08-15-2002: Message edited by: Kecksy ]
  • Reply 25 of 43
    The Barefeats test results appear to bear out SteveS's statement about bus speed being overrated, though I did ask Rob over at Barefeats to run tests on bigger files.



    The largest of them was 30 MB. I'd have guessed that would stress the bus enough to show a difference.



    As for the L3 cache difference, maybe that is what makes the difference. Perhaps the larger cache lets the application keep more of its instructions cached, which leaves the main bus that much freer to carry only data.



    Plus, I imagine an instruction fetch incurs considerably more latency than just another gulp of data.



    I still intend to buy a PowerMac soon, BTW. I want to get into the digital era and make more use of my digital camcorder. I don't want to learn this on a WinDon't machine. I never want to have to **** with the WinDon't registry or confusing I/O settings ever, ever again. Sorry for the tangent; had to say it.



    Eirik
  • Reply 26 of 43
    programmer Posts: 3,458 member
    [quote]Originally posted by Eirik Iverson:
    The Barefeats test results appear to bear out SteveS's statement about bus speed being overrated, though I did ask Rob over at Barefeats to run tests on bigger files.

    The largest of them was 30 MB. I'd have guessed that would stress the bus enough to show a difference.

    As for the L3 cache difference, maybe that is what makes the difference. Perhaps the larger cache lets the application keep more of its instructions cached, which leaves the main bus that much freer to carry only data.

    Plus, I imagine an instruction fetch incurs considerably more latency than just another gulp of data.

    I still intend to buy a PowerMac soon, BTW. I want to get into the digital era and make more use of my digital camcorder. I don't want to learn this on a WinDon't machine. I never want to have to **** with the WinDon't registry or confusing I/O settings ever, ever again. Sorry for the tangent; had to say it.

    Eirik
    [/quote]



    For these tests all of the performance intensive code will fit in the L1 instruction cache, so that shouldn't be a factor at all.



    Increasing the size of the data set probably won't help the results -- he was already using a 10 MB image, so that is well beyond the cache size. I don't know the code that is running on a per-pixel basis in these tests, however, so I can't say whether this machine would be CPU or bandwidth limited. We need a test that is definitely memory bound before passing judgment.
  • Reply 27 of 43
    yevgeny Posts: 1,148 member
    [quote]Originally posted by Eirik Iverson:
    Yevgeny,

    Would you please give us your opinion as to why the Barefeats performance tests show no significant improvement between the old and the new dual 1 GHz, despite the 25% increase in bus speed?
    [/quote]



    I don't know why the new bus shows no speed improvement. Given that all the machines are MP machines, there should be some speed difference; the new machines do have a 167 MHz MPX bus. The fact that there are tests where the new machines are faster seems to indicate that something is up with the testing.
  • Reply 28 of 43
    nevyn Posts: 360 member
    [quote]Originally posted by Eirik Iverson:
    Yevgeny,

    Would you please give us your opinion as to why the Barefeats performance tests show no significant improvement between the old and the new dual 1 GHz, despite the 25% increase in bus speed?
    [/quote]



    Since the result is _exactly_ the same, I'd guess that it isn't a bus-limited operation. If there's enough calculating on each piece of the image that the memory bus never maxes out, it looks to be a CPU-bound operation.



    I don't know exactly what their Photoshop actions are for their benchmark, but that's my guess.



    Is there an easy memory-bandwidth test we could aim BareFeats way?



    [Edited to add the following:]



    Or to put it more simply:

    The CPUs are identical; only the bus is different. Not everything is bus-limited; some things are CPU-limited. Whichever one is the limit determines how fast things go.
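
    (A rough way to apply that: estimate how long the bus needs to move the data versus how long the CPUs need to crunch it; the larger number sets the wall-clock floor. Every figure below -- image size, ops per pixel, peak bus rate -- is an illustrative guess, not a measurement of the Barefeats workload.)

    #include <cstdio>

    int main()
    {
        // All numbers are back-of-envelope guesses, for illustration only.
        double trafficBytes = 30e6 * 4.0;   // a 30 MB image read/written a few times
        double busBytesPerS = 1.3e9;        // rough peak of a 64-bit, 167 MHz bus
        double pixels       = 10e6;         // ~10 million pixels in a 30 MB image
        double opsPerPixel  = 400.0;        // a "heavy" filter, pure guess
        double cpuOpsPerS   = 2.0e9;        // two 1 GHz CPUs, one op per cycle each

        double memSeconds = trafficBytes / busBytesPerS;
        double cpuSeconds = pixels * opsPerPixel / cpuOpsPerS;
        std::printf("bus needs %.2f s, CPUs need %.2f s -> %s bound\n",
                    memSeconds, cpuSeconds,
                    cpuSeconds > memSeconds ? "CPU" : "bus");
        return 0;
    }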



    [ 08-15-2002: Message edited by: Nevyn ]
  • Reply 29 of 43
    yevgeny Posts: 1,148 member
    [quote]Originally posted by Nevyn:
    Is there an easy memory-bandwidth test we could aim BareFeats way?
    [/quote]



    Sure, it looks like this:



    void BigJob()
    {
        // 100 million longs = 400 MB written, far more than any cache holds.
        long* pLong = new long[100000000];
        // start timing code
        for (long i = 0; i < 100000000; i++)
            pLong[i] = i;
        // end timing code
        delete[] pLong;
    }



    Have them load up a copy of Metrowerks, open the "Hello World" sample, and copy/paste the code from this function into it. Find the timing libraries and use cout to output the elapsed time. (Make sure to build in release mode, not debug mode.)



    OR, if they are scared of writing code, they could run any one of a number of commercially available benchmarking programs, instead of some ad hoc test platform.
  • Reply 30 of 43
    brussell Posts: 9,812 member
    [quote]Originally posted by Eirik Iverson:
    BRussell, I do not contest the idea that multiprocessors can improve performance. I am arguing that for big, main memory intensive jobs their value will be constrained by the 167 MHz bus bottleneck.

    The tests by Barefeats may NOT be very memory intensive. I haven't looked at the details yet. But I believe that the fractal test is NOT main memory intensive.

    Maybe I'm being unnecessarily defensive, but I've acknowledged your first two points several times in earlier posts in this thread. Again, I'm talking about big jobs that are memory intensive. These things have real, tangible benchmarks. Snappiness does not.

    As for the Barefeats tests, I haven't looked at the details yet: size of test files, detailed description of the algorithm, etc.

    I don't claim to be an EE on these matters. However, I'd like to know in what real-world QUANTIFIABLE ways the extra CPU benefits. Saying that it makes your system more snappy isn't all that compelling.

    Eirik
    [/quote]

    Well, I'm getting mixed messages from your posts, then. You're saying that you don't trust the people who say "it feels snappier," and I couldn't agree more. But I showed a few empirical examples of where duals do dramatically improve performance.



    So now can you show an example of where duals don't improve performance in an MP-aware app? I'd just like to see some empirical evidence that duals don't help because of bandwidth limitations.



    BTW, could you please provide a link to where someone from Motorola said they had ceased Apple-related development?

    :eek:
  • Reply 31 of 43
    g-news Posts: 1,107 member
    Apple also had dual systems back in the 604 days, when there was NO L3 cache and a measly 50 MHz (sometimes even 45 MHz) 60x bus. Do you think that wasn't saturated too? And what about the Daystar machines with quad 604e processors on a 50 MHz bus?

    But even then there were MP-aware apps, and they DID benefit.

    Today we even have SMP in the OS, so we also benefit in multitasking, which was not the case under 7.5.x-9.2.2 (or only very slightly).



    Seeing PS, SoundJam, and Q3 gain around 90% with duals compared to singles, I can understand Apple quite clearly.



    Plus, it helps the marketing.



    I'll be glad when my DP 1.25 GHz arrives, and I will notice the second CPU, even though both share a 167 MHz bus.



    And yes, I think the future lies with IBM.



    G_News
  • Reply 32 of 43
    There's no direct link from irumors.net so I'm pasting it below:



    "Bye bye Motorola - it was good while it lasted!

    Posted July 30 2002



    Within the next few months, Apple will have ditched Motorola processors for its high end computers. Indeed, after a discussion with Motorola Canada president Frank Maw on July 25, he quite happily told our sources that product development has already ceased on non-embedded PowerPC processors. What does this mean? Quite simply, it means that Motorola have stopped upgrading the processors we use in our Macs.



    Apple have used Motorola G4 processors for one main reason - AltiVec. However, with processor development ceasing, it is not known what will happen. Will they use upcoming IBM processors for new PowerMacs and continue using Motorola for their lower end computers?



    Motorola, of course, will continue to ship current processors if there is a need for them. But Apple wants faster chips, which Motorola cannot provide.



    Apple have been put in a very tough position. Will they be able to get out of this mess without losing sales? We hope so."




    However, there is a follow-up to the article, which I hadn't seen, that waters down their original so-called quote from a Motorola corporate officer:



    "A few things have come to light in the Motorola saga

    Posted July 31 2002



    We would like to expand on what we told you yesterday regarding Motorola and their ceasing chip development. First up, we would like to point you to a link that covers more on the topic. View this here.



    Secondly, we would like to tell you that we do not necessarily agree with what we posted yesterday. This information came from an email rumor submission. After looking over other evidence, we would also like to point out that Motorola will not be fully abandoning the Mac. Frank Maw, however, says that there will be no sure timeline for chip production."






    Well, this more recent report by irumors.net takes much of the wind out of the first one. Supposedly someone from CNET spoke with Frank Maw, but I haven't seen it on CNET. That reference from a board somewhere may have been an error.



    As for the benchmarks, and the rationale for why faster or additional CPUs shouldn't improve performance on big, memory-intensive jobs, I'll refer you to a post or two on Ars Technica later.



    I am looking for benchmarks, BTW. Specifically, I'm looking for a comparison of a single 800MHz PPC 7455 versus a dual 800MHz PPC 7455. This should provide some insight into single versus dual performance, provided that the cache levels are comparable. Different cache levels would just complicate things a bit.
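
    (One way to read any single-vs-dual numbers that do turn up: Amdahl's law says the gain depends on how much of the job actually runs in parallel. Here's a tiny sketch, with made-up parallel fractions, of what a second CPU can buy before any bus effects are counted.)

    #include <cstdio>

    // Amdahl's law for two CPUs: if a fraction p of the work runs in parallel,
    // overall speedup = 1 / ((1 - p) + p / 2).
    int main()
    {
        double fractions[] = { 0.5, 0.8, 0.95 };   // illustrative parallel fractions
        for (int i = 0; i < 3; ++i)
        {
            double p = fractions[i];
            double speedup = 1.0 / ((1.0 - p) + p / 2.0);
            std::printf("parallel fraction %.2f -> %.2fx on a dual\n", p, speedup);
        }
        return 0;
    }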
  • Reply 33 of 43
    This Ars Technica thread has some quality explanations/speculation about the so-called bandwidth bottleneck and questions the value of more or faster CPUs for improving performance on main-memory-intensive applications. It also, BTW, rips the Barefeats tests for some methodology deficiencies.



    ArsTechnica Thread on PM G4 bottleneck & benchmarks: http://arstechnica.infopop.net/OpenTopic/page?a=tpc&s=50009562&f=8
  • Reply 34 of 43
    kuku Posts: 254 member
    I'm tempted to use some unkind slang just about now.



    People who may not even be computer engineers by profession, with no real-world results and not even theoretical results, are trying their hands at bashing duals.



    I've just finished playing with a new dual 1.25 GHz model.



    Apple's standard config at the Apple Store.



    Thanks to the power of Unix and its utilities, I can realistically say that BOTH CPUs are getting a full-tilt workout on even trivial things.



    So, naysayers of SMP: the processors are getting close to that 2.5 GHz factor.



    And even a single 2.5 GHz computer doesn't scale perfectly in real-world results; things happen, depending on the situation, that keep it from producing linear results.



    After playing with the computer, I'm a believer in the dual 2x1.25 GHz being a 2.5 GHz machine. The processors are kept fed well enough, unless you're a prick who plays the "look, for that split second the 2nd CPU is idle" game.



    And I wasn't lax in playing with it; the Apple Store keeps its showcase computers packed with programs.



    ~Kuku



    [ 08-15-2002: Message edited by: Kuku ]
  • Reply 35 of 43
    [quote]Originally posted by Kuku:
    I've just finished playing with a new dual 1.25 GHz model.

    Apple's standard config at the Apple Store.

    Thanks to the power of Unix and its utilities, I can realistically say that BOTH CPUs are getting a full-tilt workout on even trivial things.

    So, naysayers of SMP: the processors are getting close to that 2.5 GHz factor.

    And even a single 2.5 GHz computer doesn't scale perfectly in real-world results; things happen, depending on the situation, that keep it from producing linear results.

    After playing with the computer, I'm a believer in the dual 2x1.25 GHz being a 2.5 GHz machine. The processors are kept fed well enough, unless you're a prick who plays the "look, for that split second the 2nd CPU is idle" game.

    And I wasn't lax in playing with it; the Apple Store keeps its showcase computers packed with programs.

    ~Kuku

    [ 08-15-2002: Message edited by: Kuku ]
    [/quote]



    Kuku,



    This sounds interesting. Could you possibly elaborate on the programs you ran and the files, particularly the file sizes, that you employed?



    I don't think anyone disputes that the dual 1.25 will be very snappy/responsive in casual usage. There's an interesting discussion in a few threads on Ars Technica, with some semiconductor engineers speculating about how the dual 1.25 and dual 1.00 will handle memory-intensive jobs.



    Eirik
  • Reply 36 of 43
    g-news Posts: 1,107 member
    I'll tell you guys in 6 weeks when that baby arrives here.

    "Going to scream, it is."



    G-News
  • Reply 37 of 43
    Perhaps Apple is doing this as a favor to Motorola. Going all dualies means almost twice as many chips sold. As odd as this may sound (Apple helping Motorola despite the mess Motorola put them in), it makes sense. If Apple is using another vendor, *IBM*, for its next-generation chip, Motorola surely knows and might need some incentive to keep producing G4s instead of stopping production and cutting its losses. Apple will still need G4s for the foreseeable future for its PowerBook/iMac lines. This might be a bargaining chip to keep Motorola doing G4 work (the little of it that they are doing) until Apple can switch to a new vendor. Just a thought.
  • Reply 38 of 43
    eliahu Posts: 71 member
    Apple went all dual processors to remain competitive and try to spur sales. Do you know what you get for your money on the PC side right now? Dell has the Dimension 8200 for under $1K. This box boasts a 2GHz P4 with a 400 MHz bus.



    Apple has been stuck with tiny incremental processor speed bumps and they're still using an archaic bus. If they want Wintel users to switch, they had better be offering something more compelling than great style for prices significantly higher than what PC users are accustomed to.



    Apple went all dual because it had no choice. Tower sales have been sad for a while now. The magic words "DDR" and dual processors will get many excited and should improve the bottom line until they can make the hardware on par with PCs.
  • Reply 39 of 43
    hmurchison Posts: 12,425 member
    [quote]This box boasts a 2GHz P4 with a 400 MHz bus.[/quote]



    Geez, that's descriptive. And it's a 100 MHz bus, quad pumped.
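
    (For anyone keeping score, the peak numbers work out roughly like this; both are 64-bit data paths, and these are theoretical peaks rather than sustained figures.)

    #include <cstdio>

    int main()
    {
        // P4 front-side bus: 100 MHz clock, 4 transfers per clock, 8 bytes wide.
        double p4Peak  = 100e6 * 4 * 8;   // 3.2 GB/s
        // G4 MPX bus: 167 MHz clock, 1 transfer per clock, 8 bytes wide.
        double mpxPeak = 167e6 * 1 * 8;   // ~1.34 GB/s
        std::printf("P4 bus peak:  %.2f GB/s\n", p4Peak / 1e9);
        std::printf("MPX bus peak: %.2f GB/s\n", mpxPeak / 1e9);
        return 0;
    }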
  • Reply 40 of 43
    amorph Posts: 7,112 member
    [quote]Originally posted by eliahu:
    Apple went all dual because it had no choice. Tower sales have been sad for a while now. The magic words "DDR" and dual processors will get many excited and should improve the bottom line until they can make the hardware on par with PCs.
    [/quote]



    Do you mean on par with a Dell Dimension in real terms, or in marketing-speak? The Dimension series is designed to look good on a spec sheet and that's about it. There's a reason it costs under $1K.