multiprocessor & altivec speedup

Posted:
in macOS edited January 2014
I read a while back about some programming tools for NVIDIA cards that helped programmers implement high-end effects that were previously difficult to code. I think it was called "CygN" or something like that?



Does Apple have anything like that to make it easier for developers to implement AltiVec and multiprocessing in their apps?

Comments

  • Reply 1 of 8
    amorph Posts: 7,112 member
    Yes, and they've had something like that since the G4 first appeared. AltiVec is the only SIMD unit on the market, to my knowledge, where you can use a macro language rather than bare assembly to write code for it.



    As for multiprocessing, they have several thread libraries, actually, to target different programming models and environments. They've made it about as easy as it can be with the currently prevailing technologies, which is not saying much.
  • Reply 2 of 8
    moki Posts: 551 member
    There also is a tool out there that will automagically AltiVec-optimize ordinary source code -- though how good it actually is at it I don't know firsthand.



    AltiVec is pretty easy to use, though -- it's actually quite fun.
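For a sense of what writing AltiVec code in C rather than assembly can look like, here is a minimal sketch of a multiply-add loop. It is an illustration, not code from anyone in this thread. The function name `saxpy4` is hypothetical, and the sketch assumes the array length is a multiple of 4 and, on the AltiVec path, that both arrays are 16-byte aligned. `vec_ld`, `vec_madd`, and `vec_st` are the standard AltiVec C operations from `<altivec.h>`. On a compiler without AltiVec support, the block falls back to a plain scalar loop so it still builds and runs.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Computes y[i] = a * x[i] + y[i] for i in [0, n).
 * Assumptions of this sketch: n is a multiple of 4, and on the
 * AltiVec path x and y are 16-byte aligned. */
void saxpy4(float a, const float *x, float *y, size_t n)
{
    assert(n % 4 == 0);
#ifdef __ALTIVEC__
    /* Build a vector holding four copies of a by loading an aligned array. */
    float splat[4] __attribute__((aligned(16))) = { a, a, a, a };
    vector float va = vec_ld(0, splat);

    for (size_t i = 0; i < n; i += 4) {
        vector float vx = vec_ld(0, x + i);      /* load four floats of x */
        vector float vy = vec_ld(0, y + i);      /* load four floats of y */
        vec_st(vec_madd(va, vx, vy), 0, y + i);  /* store a*x + y, four at a time */
    }
#else
    /* Scalar fallback for compilers or targets without AltiVec. */
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
#endif
}
```

Calling `saxpy4(2.0f, x, y, 4)` with `x = {1, 2, 3, 4}` and `y = {10, 20, 30, 40}` leaves `y` as `{12, 24, 36, 48}` on either path.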
  • Reply 3 of 8
    sc_markt Posts: 1,401 member
    I've heard mention of AltiVec II and double-precision AltiVec. Is there any room for improving AltiVec in future releases? In other words, can it be made to run faster?
  • Reply 4 of 8
    amorph Posts: 7,112 member
    There's an autovectorization tool called VAST, but apparently the AltiVec code it spits out leaves something to be desired (then again, how different is that from, say, gcc's PowerPC output?).



    CodeWarrior has a limited autovectorization capability as well.



    AltiVec actually took a great leap forward from the 7400 to the 745x, but the improvement was basically mooted by the lack of bus bandwidth (i.e., it only mattered in code that did a lot of work on each piece of data). It actually took a step back from the 7455 to the 970, but the 970 does a lot to fix the bandwidth issue and pumps up the clock rate, so there should be a performance improvement in most AltiVec code despite that.



    There's been talk of a "double wide" 256-bit AltiVec unit for some time now. On the G4 it just didn't make any sense because such a beast would tax bandwidth even more, and bandwidth is the G4's Achilles' heel. On the 970, the vector unit is still bus limited, just not nearly as limited, so a 256-bit AltiVec unit doesn't make too much sense there either. Also, the 970 will be able to do 64-bit FP much faster than the G4 can, so the ability to process 64-bit FP data in the vector unit isn't as important.



    As a general rule, twice as wide doesn't mean twice as fast. The issues are a lot more complicated than that. Sometimes things run a bit slower, sometimes they run more than twice as fast.
  • Reply 5 of 8
    sc_markt Posts: 1,401 member
    Thanks Amorph.



    Didn't know that "the vector unit is still bus limited" on the 970. Just how much bandwidth can AltiVec handle?
  • Reply 6 of 8
    1337_5l4xx0r Posts: 1,558 member
    WAG: on a 1.8 GHz 970, AltiVec would need, say, 27 GB/sec of throughput, whilst the 970 would only have 6.4 GB/sec.
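The guess above can be sanity-checked with simple arithmetic. The sketch below is not the poster's calculation. It assumes the vector unit pulls in one full 128-bit (16-byte) vector per clock cycle, which is a simplification, and the function names are hypothetical. Under that assumption a 1.8 GHz part would want 1.8 × 16 = 28.8 GB/sec, in the same ballpark as the 27 GB/sec figure, and about 4.5 times the 6.4 GB/sec quoted for the bus.

```c
#include <assert.h>

/* Peak demand in GB/sec if the vector unit consumed
 * bytes_per_cycle bytes on every clock cycle.
 * GHz * bytes/cycle = 1e9 bytes/sec = GB/sec. */
double vector_demand_gb_per_s(double clock_ghz, double bytes_per_cycle)
{
    assert(clock_ghz > 0.0 && bytes_per_cycle > 0.0);
    return clock_ghz * bytes_per_cycle;
}

/* How many times larger the demand is than what the bus supplies. */
double demand_to_bus_ratio(double demand_gb_per_s, double bus_gb_per_s)
{
    assert(bus_gb_per_s > 0.0);
    return demand_gb_per_s / bus_gb_per_s;
}
```

With `vector_demand_gb_per_s(1.8, 16.0)` fed into `demand_to_bus_ratio(..., 6.4)`, the ratio comes out at roughly 4.5. Real code rarely needs a fresh vector every cycle, so this is an upper bound on demand, not a measurement.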
  • Reply 7 of 8
    junkyard dawg Posts: 2,801 member
    Altivec is a monster!
  • Reply 8 of 8
    netromac Posts: 863 member
    Quote:

    Originally posted by Junkyard Dawg

    Altivec is a monster!



    Yes, it will be interesting to see how the increased bandwidth will help speed up AltiVec-enabled applications. I think it will help a lot.