Speculation: Vector Unit in Apple's Intel chip


Comments

  • Reply 21 of 52
    carniphage Posts: 1,984 member
    Quote:

    Originally posted by Programmer

    On-processor vector units are exactly what Cell is.





    The problem with GPUs is that they are quite focussed on graphics, and are nowhere near "general-purpose". Yes they are programmable, but the programming model is quite constrained... which is how they are made so powerful. A GPU can do a lot more in certain applications than a PPC+VMX, but it also has to handle the system's graphics. Something like Cell + GPU is the best combination -- the problem with Cell from Apple's point of view is the weakness of the Power core in the first Cell relative to the 970 and Intel's Pentium-M offerings.




    Like I say I am not convinced.



    If I am programming for graphics, a GPU is a better solution than Altivec. End of story. And graphics is the most intensive float munching task.



    I am prepared to accept that audio programming (which is less demanding) is better served by Altivec, at the moment - but another couple of years of GPU development will probably end up favouring the GPU solution for audio too. I am certain that it is already possible to write very fast and very sophisticated audio tools on a GPU. But treating samples as pixels is probably a bit counterintuitive.



    And yes, I am equally unconvinced by Cell.

    Yes Cell has got gigaflops in abundance. But imagine a Mac with a slow CPU and 8 riced-up Altivec units. It isn't going to run Microsoft Word any faster.



    And as a game developer, I am happy to go on record to state that Cell isn't going to run Grand Theft Auto any faster either. Yes, it will make some computationally expensive tasks (like physics) much more affordable. But I don't believe it will revolutionize game development and it certainly will not reduce development costs. Gaming performance is often more about IFs than FLOPs.



    Carni
  • Reply 22 of 52
    splinemodel Posts: 7,311 member
    Quote:

    Originally posted by PB

    So, are you suggesting that Intel will re-use existing CPU technology (Pentium + Itanium) to make a strong desktop CPU?



    I'm suggesting that Apple will need something that's more than an evolutionary step forward from Intel's current Pentium-M technology in order to deliver a product that runs its own software (iTunes, iMovie, iPhoto, FCP, etc.) faster than the current top-of-the-line PPC systems.
  • Reply 23 of 52
    macchine Posts: 295 member
    When Apple first claimed the record for ripping a CD faster than any other computer (pre-announced, but not quite shipped yet), Compaq shipped a machine, which I saw selling only at Costco, that listed in its specs a CD rip speed a little faster than Apple's claim.



    But then those machines disappeared and Compaq was sold to HP, and I saw no other PCs with those claims, although I was not looking.



    And that was in the days when PPCs were faster than Intel chips.



    So my guess is Apple does have the best software for ripping MPEG 2 onto CDs.





    When the debates were hot and heavy over how good Altivec was, everyone called it a Super-scalar instruction set, which is what it's called on the generic side, so if you want something to compare Altivec to, look for that.



    The reason it's Super-scalar is that the instructions are very long, working on large 128-bit chunks of data at a time, and it has special hooks to do many of these in rapid succession.



    But nowadays we have 64-bit processors with parallel busses (in the recent past, busses had a serial configuration). These kinds of busses make Altivec much less valuable, because they can process data at nearly the same speeds.



    Maybe half as fast, but the type of data that Altivec worked with was not ideal for the average programmer.



    I have not looked at Intel's processor specs enough to see if they have this kind of bus, but the first IBM G5s had it, and I assume Intel has copied them by now.



    Altivec required optimization that seldom got done well; the new processors would only require a standard type of optimization, so that will get done well.



    So the bottom line is standard processors are now almost as fast as Altivec.









    I once was in a long argument on MacCentral with a programmer who claimed that Altivec sucked.



    He gave an example and we went back and forth for a long time, then he seemed to WIN the argument by pointing out at the end that the number returned by Altivec would be wrong because of the way that it did calculations.



    And that was true: it returns the answers to calculations with rounding error, so long equations can give highly skewed results.



    But after I got a chance to think about it, I finally realized that he had in fact LOST the argument, because the example was a good one: it definitely applied to the type of calculations Altivec was built for.



    The example was calculating a color: doing a transformation on colors when 24-bit color (17 million colors) was set on the monitor.



    So in the end Altivec selected the wrong color because of rounding error.



    But this actually DID NOT MATTER. I realized that the human eye can only see 16,000 colors; Altivec would have to have been off by 1064 colors for anyone to see the difference, and it was only off by about 5.
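
    As a rough check of the arithmetic in this post, the sketch below takes its figures at face value. The 16,000-color perception figure and the error of about 5 colors are the poster's claims, not verified facts, and the function names are made up for illustration. The post rounds the 24-bit palette to 17 million, which is where a number near 1064 comes from; the exact palette size gives a slightly smaller step.

```python
# Back-of-the-envelope check using the figures claimed in the post.
# Nothing here measures real AltiVec behavior; the inputs are assumptions.

def displayable_colors(bits_per_pixel):
    """Number of distinct colors a display mode can represent."""
    return 2 ** bits_per_pixel

def colors_per_visible_step(bits_per_pixel, perceivable_colors):
    """How many displayable colors fall into one 'perceivable' color,
    if the perceivable colors were spread evenly (a big simplification)."""
    return displayable_colors(bits_per_pixel) / perceivable_colors

def error_is_visible(error_in_colors, bits_per_pixel, perceivable_colors):
    """True if the claimed error spans at least one perceivable step."""
    return error_in_colors >= colors_per_visible_step(
        bits_per_pixel, perceivable_colors
    )

CLAIMED_PERCEIVABLE = 16_000  # figure from the post, not verified
CLAIMED_ERROR = 5             # "off by about 5", also from the post

step = colors_per_visible_step(24, CLAIMED_PERCEIVABLE)
print(displayable_colors(24), step,
      error_is_visible(CLAIMED_ERROR, 24, CLAIMED_PERCEIVABLE))
```

    Under those assumptions an error of 5 colors is far below one perceivable step, which is the argument being made here.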



    So the bottom line was Altivec was and is very useful but VERY FEW PROGRAMMERS ARE ABLE TO PERCEIVE ITS VALUE, and few are able to program for it so it does not get used as much as it should.



    We are better off with fast standard processors.

  • Reply 24 of 52
    xool Posts: 2,460 member
    Notice that iTunes is one of the few consumer Apple apps that hasn't been converted to a Universal Binary. My bet is that that feature is part of the 5.0 release, so why waste time doing it for 4.8 or 4.9?



    Once iTunes is fully Universal, I think we can really start making comparisons between iTunes PPC, iTunes Intel, and iTunes for Windows.
  • Reply 25 of 52
    splinemodel Posts: 7,311 member
    Quote:

    Originally posted by MACchine

    So the bottom line was Altivec was and is very useful but VERY FEW PROGRAMMERS ARE ABLE TO PERCEIVE ITS VALUE, and few are able to program for it so it does not get used as much as it should.



    We are better off with fast standard processors.




    Thanks to Cell, high level programmers are going to have to start respecting the low level a bit more. Cell also makes it clear that you are wrong about "us" being better off with faster, standard processors, in the sense that "standard" processors have hit a performance wall. That's why we see all of the multi-core designs popping up.



    It is also painfully obvious, at least to me, that FPU and SIMD have very appreciated places in the standard workflow of the modern consumer. You can claim that I can't compare iTunes Mac to iTunes Windows, but we're talking about an enormous disparity over standardized code that's licensed from a 3rd party. Apple did add altivec functionality, but the fact that an old powerbook can hang with a late-model p4 clearly means that something is up. Regardless of how it's handled, it DOES need to be dealt with. And I don't think that it's going to be solved by thoughts and prayers of Apple intentionally holding back the PC version of iTunes.
  • Reply 26 of 52
    macchine Posts: 295 member
    Quote:

    Originally posted by Splinemodel

    Thanks to Cell, high level programmers are going to have to start respecting the low level a bit more. Cell also makes it clear that you are wrong about "us" being better off with faster, standard processors, in the sense that "standard" processors have hit a performance wall. That's why we see all of the multi-core designs popping up.



    It is also painfully obvious, at least to me, that FPU and SIMD have very appreciated places in the standard workflow of the modern consumer. You can claim that I can't compare iTunes Mac to iTunes Windows, but we're talking about an enormous disparity over standardized code that's licensed from a 3rd party. Apple did add altivec functionality, but the fact that an old powerbook can hang with a late-model p4 clearly means that something is up. Regardless of how it's handled, it DOES need to be dealt with. And I don't think that it's going to be solved by thoughts and prayers of Apple intentionally holding back the PC version of iTunes.




    You may very well be correct, and we will all know for certain once Apple ships the real Mac/ OS X. Well, that is IF Cell is shipping by then.



    Now that Mac/ OS X is being bootlegged freely it seems that optimizations or acceleration must be part of the game.



    If Sony really does put Blu-ray on their P3P systems (or whatever they are called), that will make for a very interesting test.



    How long will it take to burn an entire disk, 5 hours ???



    People laugh at me talking about free Mac/ minis, but a P3P with Blu-ray and burning built in for $300, now that's nuts !!!
  • Reply 27 of 52
    My guess is Apple is simply relying on a combination of a significant jump in clock speed (from a 1.25-1.67 GHz G4 to a 2-2.5 GHz Yonah) and dual-core. Since almost all Apple software is now multi-threaded (and OS X is too), going from single 1.67 (or 1.8) GHz to dual-core 2.5 GHz PowerBooks should provide a sufficient speed boost to keep most people happy.
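
    The clock-plus-dual-core reasoning above can be put into rough numbers with Amdahl's law. This is only a sketch under loud assumptions: it pretends performance scales linearly with clock speed, which does not hold across different architectures such as a G4 and Yonah, and the parallel fraction of any real application is unknown. The function name and its parameters are hypothetical.

```python
# Rough upper-bound estimate, NOT a benchmark.
# Assumption 1: work per clock is identical on both chips (unrealistic).
# Assumption 2: Amdahl's law with a guessed parallel fraction.

def estimated_speedup(old_ghz, new_ghz, cores, parallel_fraction):
    """Speedup from a clock bump plus extra cores.

    parallel_fraction is the share of run time that can use all cores
    (0.0 = fully serial, 1.0 = perfectly parallel).
    """
    clock_ratio = new_ghz / old_ghz
    amdahl = 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)
    return clock_ratio * amdahl

# Figures from the post: single 1.67 GHz G4 to dual-core 2.5 GHz.
for fraction in (0.0, 0.5, 1.0):
    print(fraction, round(estimated_speedup(1.67, 2.5, 2, fraction), 2))
```

    With these inputs the estimate ranges from about 1.5x for fully serial code to about 3x for perfectly parallel code, so how much of the second core an application actually uses matters as much as the clock bump.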
  • Reply 28 of 52
    programmer Posts: 3,461 member
    Quote:

    Originally posted by Carniphage

    Like I say I am not convinced.



    If I am programming for graphics, a GPU is a better solution than Altivec. End of story. And graphics is the most intensive float munching task.



    There are plenty of non-graphics tasks which are extremely float intensive, and in the future you may find those in games. A GPU is also not ideal for all graphics tasks, only those which fit into its vertex & pixel shader model. If your definition of graphics fits only that model, it is a fairly limited notion of graphics.



    Quote:

    I am prepared to accept that audio programming (which is less demanding) is better served by Altivec, at the moment - but another couple of years of GPU development will probably end up favouring the GPU solution for audio too. I am certain that it is already possible to write very fast and very sophisticated audio tools on a GPU. But treating samples as pixels is probably a bit counterintuitive.



    Plus your GPU is busy doing graphics, and it can't afford to take a time out on a ~100 Hz clock to do some audio mixing.



    Quote:

    And yes, I am equally unconvinced by Cell.

    Yes Cell has got gigaflops in abundance. But imagine a Mac with a slow CPU and 8 riced-up Altivec units. It isn't going to run Microsoft Word any faster.



    If most of Word's time is in Quartz 2D and font rendering, then yes it will. There is also the question about whether Word needs to run any faster, and why it is running slow. There are all sorts of issues that a faster CPU will not correct. And without knowing what is slowing down Word, you cannot predict what will speed it up.



    Quote:

    And as a game developer, I am happy to go on record to state that Cell isn't going to run Grand Theft Auto any faster either. Yes, it will make some computationally expensive tasks (like physics) much more affordable. But I don't believe it will revolutionize game development and it certainly will not reduce development costs. Gaming performance is often more about IFs than FLOPs.



    Heh, well don't let your shareholders hear that -- they may decide that your competition is going to hand you your hat and show you to the door. I'll be perfectly happy if GTA doesn't improve on Cell.
  • Reply 29 of 52
    skatman Posts: 609 member
    Quote:

    SSE2 is not nearly as robust as altivec. Don't kid yourself. Secondly, apple didn't even write the encoding code. It was licensed, and it's the standard.



    You seem to be an expert in the subject. Could you please tell me how exactly is Altivec more "robust" than SSE2 as related to audio encoding?



    Secondly, who cares if Apple didn't write it? Apple puts their name on it. That means they consider the quality to be high enough to put their name on it and to sell it as if they wrote it. A company is only as good as its subcontractors are.
  • Reply 30 of 52
    webmail Posts: 639 member
    GRR. There will be no Altivec on any future Intel Mac. I can confirm this 100%. Apple will take advantage of other technologies, like, oh, a computer that's 2 GHz faster! There is also Intel's multimedia unit, which they will make use of.



    BUT READ MY LIPS: NO ALTIVEC. It's not important either; by the time apps are ported to Intel, the PPC version will look so slow compared to how it runs on Intel.





    Quote:

    Originally posted by Mr. Me

    Intel has not even hinted that it will design processors specifically for Apple. The opposite is true. Intel said that Apple will be able to choose from the wide array of processors offered by the company. I believe that Apple engineers will play a significant role in the development of future Intel processors. However, these processors will be available to everyone.



  • Reply 31 of 52
    splinemodel Posts: 7,311 member
    Quote:

    Originally posted by skatman

    You seem to be an expert in the subject. Could you please tell me how exactly is Altivec more "robust" than SSE2 as related to audio encoding?



    SSE(1) was a joke. It used the regular FPU. IIRC, SSE2 fixed this, becoming a more independent unit on the CPU core, but it still doesn't have any instructions that I know of that are meant to operate on points within a single vector. So it requires more overhead, more clocks (supposedly 4x, but who really knows) and much more clever programming to match Altivec's speed. Then consider that most G4s and G5s have more than one Altivec core per PPC core. SSE3 is better than SSE2. Perhaps SSE4 or 5 will match Altivec. It's certainly possible, and I'd like to see it as a 256-bit unit rather than today's 128-bit.
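
    The point about operating on elements within a single vector can be illustrated with a toy model. This is plain Python with lists standing in for 128-bit registers of four floats; it calls no real AltiVec or SSE instructions, the function names are invented for the sketch, and the instruction counts are only those of this model, not measurements of either unit.

```python
# Toy model: a list of four floats stands in for one 128-bit SIMD register.

def vertical_add(a, b):
    """Lane-wise add of two registers: the operation SIMD units do natively."""
    return [x + y for x, y in zip(a, b)]

def rotate(v, n):
    """Stand-in for a shuffle/permute that rotates lanes left by n."""
    return v[n:] + v[:n]

def horizontal_sum(v):
    """Sum the four lanes of ONE register using only rotate + vertical add.

    Returns (total, modeled_instruction_count). Without a dedicated
    across-the-lanes instruction, the sum needs extra shuffle steps.
    """
    ops = 0
    step = vertical_add(v, rotate(v, 2))         # [a+c, b+d, c+a, d+b]
    ops += 2                                     # one rotate, one add
    total = vertical_add(step, rotate(step, 1))  # every lane now holds a+b+c+d
    ops += 2
    return total[0], ops

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(vertical_add(a, b))   # one modeled instruction
print(horizontal_sum(a))    # four modeled instructions for one scalar result
```

    In this model the lane-wise add costs one step while reducing a single register costs four, which is the kind of overhead being described for code that has to combine values inside one vector.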



    Quote:

    Originally posted by skatman

    Secondly, who cares if Apple didn't write it? Apple puts their name on it. That means they consider the quality to be high enough to put their name on it and to sell it as if they wrote it. A company is only as good as its subcontractors are.



    My point is that Apple is not intentionally delivering a slow encoder to iTunes Windows. So I'm inclined to agree with you here. Well said.





    Quote:

    Originally posted by webmail

    GRR. There will be no Altivec on any future Intel Mac. I can confirm this 100%. Apple will take advantage of other technologies, like, oh, a computer that's 2 GHz faster! There is also Intel's multimedia unit, which they will make use of.



    BUT READ MY LIPS: NO ALTIVEC. It's not important either; by the time apps are ported to Intel, the PPC version will look so slow compared to how it runs on Intel.




    I agree with you too, that we won't see Altivec on Intel. As long as SSE4/5 doesn't suck, and there are lots of them on a core, that's fine. I'm not married to Altivec -- just burly SIMD, whatever shape or form. And I do think that we will see dual or quad core SSE4/5 per "Mactel" core.



    Beyond that, your use of "confirmed" is a little frowned upon 'round these parts. Pretty much everything that has ever been "confirmed" has never actually happened. Likewise, clock rates on Intel chips circa 2007 probably won't be higher than 3GHz, nor will they perform faster than a current DP 2.7GHz G5 at DSP tasks if they don't have burly SIMD. I thought that I demonstrated a pretty good analogy: the results of a 2.5 year old Mac vs. a 0.5 year old PC should give us a good example of what to expect of a modern Mac vs. a 2007 PC.
  • Reply 32 of 52
    matsu Posts: 6,558 member
    Are we discounting what Apple brings to Intel? SSE hasn't been great, but...



    who cares how you get there so long as the results are good?



    there are two cores, possibly hyperthreading technology will re-appear, bringing the total to four logical units, maybe not, but two cores is still good. IBM doesn't promise anything similar for portable use...



    As for SSE, nothing we know says that it won't improve substantially in the near term. I think part of the problem with SSE has been that it hasn't been fully supported with the different variants available from Intel and AMD... With Apple that problem might be avoided...
  • Reply 33 of 52
    wmf Posts: 1,164 member
    The AltiVec spec has been out there for years. Intel is up to SSE3. If AltiVec is so much better than SSE, why hasn't Intel fixed it after all this time?
  • Reply 34 of 52
    Quote:

    SSE(1) was a joke. It used the regular FPU.



    You're thinking of MMX. (Yay, I actually know something about Apple hardware now!) Edit: No, I don't. It just doesn't use the same registers like MMX.



    Quote:

    As for SSE, nothing we know says that it won't improve substantially in the near term. I think part of the problem with SSE has been that it hasn't been fully supported with the different variants available from Intel and AMD... With Apple that problem might be avoided...



    Boom. Apple is not interested in custom chips or special functionality from Intel. That's precisely what they were getting away from financing. Any improvements on SSE will be to the benefit of all Intel customers, but with tight hardware/software integration, Apple could gain the most from any new functionality.
  • Reply 35 of 52
    For what it's worth, Ryan Gordon commented on the whole SSE/Altivec situation on IMG at IMG mac game dev reaction.



    The long and short of it is that Ryan feels that SSE3 will not only offer fair performance; its application will also be broader reaching, allowing for more widespread SIMD acceleration.



    In the end, time will tell; but even if SSE3 is not as fast as Altivec, simply having much greater bandwidth to feed data to the SIMD core in laptops will make a huge difference, and I think it will allow Apple to be less reliant on the GPUs, which currently suffer a latency hit compared to on-core processing (until relevant data is cached on the card, of course).



    I'm not as bullish on CoreImage for Photoshop-like image processing as some, simply because I can't imagine the GPU caching 150-1500MB images as effectively as a CPU solution.
  • Reply 36 of 52
    gspotter Posts: 342 member
    Regarding the power of altivec: Why is a PC faster in these media creation tests ?
  • Reply 37 of 52
    telomar Posts: 1,804 member
    Quote:

    Originally posted by GSpotter

    Regarding the power of altivec: Why is a PC faster in these media creation tests ?



    After Effects isn't Altivec enabled I don't believe.



    Altivec is very good, if you can use it. SSE is far more all purpose.
  • Reply 38 of 52
    unixpoet Posts: 41 member
    Quote:

    Originally posted by Programmer

    There are plenty of non-graphics tasks which are extremely float intensive, and in the future you may find those in games. A GPU is also not ideal for all graphics tasks, only those which fit into its vertex & pixel shader model. If your definition of graphics fits only that model, it is a fairly limited notion of graphics.





    If the "limited model" is good enough for Carmack and HL2 then it's good enough for me.



    GPUs are very, very fast vector units. Graphics happens to need these kinds of chips but so do other things. Physics is one - indeed there are demos of physics being done on a GPU.



    BTW, Carmack had this to say (Taken from Slashdot post ):

    Quote:

    Carmack



    We work with Apple, ATI, and Nvidia to make everything run as well as possible. Doom 3 had AltiVec code in it, and there were driver changes to make things work better. The bottom line is that the compiler / cpu / system / graphics card combinations available for macs has just never been as fast as the equivalent x86/windows systems. The performance gap is not a myth or the result of malicious developers trying to make your platform of choice look bad.



    Yes, it is always possible to make an application faster, but expecting developers to work harder on the mac platform than on windows is not reasonable. The xbox version of Doom required extensive effort in both programming and content to get good performance, but it was justified because of the market. In hindsight, we probably should have waited and ported the xbox version of the game to the mac, which would have played on a broader range of hardware. Of course, then we would have taken criticism for only giving the mac community the "crippled, cut down version".





    Listening to you people, it's as if Altivec is God's own gift to the processor world. It's a good implementation but you lot are missing the big picture. And in the big picture Altivec is not important/relevant to 80% of the applications out there.
  • Reply 39 of 52
    Quote:

    Originally posted by UnixPoet

    If the "limited model" is good enough for Carmack and HL2 then it's good enough for me.



    So, if a limited model is good enough for people who only need a limited model, then it's good enough for you. Can't really argue with that logic.



    Quote:



    GPUs are very, very fast vector units. Graphics happens to need these kinds of chips but so do other things. Physics is one - indeed there are demos of physics being done on a GPU.




    Well, since superbuffers got canned, we're stuck with a model where everything on the GPU must be a polygon or a fragment/pixel. The result is that any physics that can be done on a GPU are extraordinarily limited (no collision detections and such).



    As much of a hubbub is made about GPGPU's, they're not here yet.



    Quote:



    BTW, Carmack had this to say (Taken from Slashdot post ):





    Carmack's just being egotistical-- it's his design decisions that made it slow on the Mac. Pretty much every other non-Doom benchmark proves him wrong.



    Quote:



    Listening to you people, it's as if Altivec is God's own gift to the processor world. It's a good implementation but you lot are missing the big picture. And in the big picture Altivec is not important/relevant to 80% of the applications out there.




    Right, it's only relevant to the remaining 20% that are all performance heavy apps that professionals depend on daily.
  • Reply 40 of 52
    junkyard dawg Posts: 2,801 member
    Confirmed: Intel will not dump SSE(x), invested in by hordes of Wintel developers great and small, in favor of VMX/Altivec/Velocity Engine, which is used by a single, small, and new x86 adopter.



    Intel will not return to the drawing boards to customize their latest CPU designs to please Apple.



    Intel does not secretly wish they could band together with Apple to take on the world of Wintel computers.



    It may be healthy for us Apple zealots to keep in mind that Apple is merely a customer buying CPUs from Intel, CPUs which are already for sale and are regularly purchased by Apple's competitors in quantities Apple could only dream of using. Intel may value Apple a bit more than their purchasing volumes would predict, but after the initial honeymoon cools off, Apple will be left to deal with a CPU supplier far more responsive to the needs of a corporation like Dell that moves silicon like a Tatooine sandstorm. The relationship between Apple and Intel will basically be Apple taking what is given to them, no more, no less.



    No more, no less is the key to understanding Apple and Intel. As long as Apple gets their CPUs from Intel, hardware performance ceases to be a variable in the competition between OS X and Windows, which is of course awesome for Apple since PPC has normally lagged behind x86 performance. Now if a Mac lags against comparable x86 systems in hardware performance, it's because Apple wants it to lag and hobbled the CPU with cheap mobo components.



    Back to the SIMD unit - while Intel will never adopt Altivec, it seems plausible that they may add improvements to SSE, while maintaining backwards compatibility, that bring its performance up to the level of Altivec. Of course both Apple and Microsoft, and any developer, would be free to exploit an enhanced SSE unit, but this is where Apple finally gets to showcase their programming chops to the whole world by making SSE sing on OS X!