Speculation: Vector Unit in Apple's Intel chip
There are two things that normal Apple customers do with any regularity these days that take a lot of CPU cycles and actually force waiting: video work (iMovie) and CD ripping (iTunes). Even cheap computers are so fast these days that no one likes waiting for the computer to . . . well . . . compute.
In iTunes, my TiBook with a 1GHz G4, 1GB of RAM, and a 133MHz bus rips 192kb AAC faster than my P4 3.2GHz with 2GB of RAM and an 800MHz bus. Altivec works.
It's total speculation, but I'd put money on Intel integrating a decent SIMD unit into the 2007 Mactel. Preferably, at least two per core.
The whole i-suite just will not be faster on the 2007 Intel Macs than it is on today's G5s unless there's some answer to Altivec.
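To put the Altivec point in concrete terms, here is a minimal C sketch of the kind of inner loop an AAC encoder spends its time in — a dot product — written once scalar and once hand-unrolled four wide, which is the shape a single Altivec (vmaddfp) or SSE (mulps/addps) instruction executes in one go. The function names are illustrative, not from any real encoder.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar dot product: one multiply-add per iteration. */
float dot_scalar(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* 4-wide version: each loop body mirrors what one Altivec vmaddfp or
 * SSE mulps+addps pair does -- four floats per instruction.
 * n is assumed to be a multiple of 4. */
float dot_vec4(const float *a, const float *b, size_t n) {
    float acc[4] = {0, 0, 0, 0};
    for (size_t i = 0; i < n; i += 4) {
        acc[0] += a[i + 0] * b[i + 0];   /* lane 0 */
        acc[1] += a[i + 1] * b[i + 1];   /* lane 1 */
        acc[2] += a[i + 2] * b[i + 2];   /* lane 2 */
        acc[3] += a[i + 3] * b[i + 3];   /* lane 3 */
    }
    return acc[0] + acc[1] + acc[2] + acc[3];
}
```

With a wide enough vector unit, the second loop retires a quarter of the instructions of the first — which is roughly why a 1GHz G4 can out-rip a 3.2GHz P4 on this kind of code.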

Comments
Originally posted by Splinemodel
There are two things that normal Apple customers do with any regularity these days that take a lot of CPU cycles and actually force waiting: video work (iMovie) and CD ripping (iTunes). Even cheap computers are so fast these days that no one likes waiting for the computer to . . . well . . . compute.
In iTunes, my TiBook with a 1GHz G4, 1GB of RAM, and a 133MHz bus rips 192kb AAC faster than my P4 3.2GHz with 2GB of RAM and an 800MHz bus. Altivec works.
It's total speculation, but I'd put money on Intel integrating a decent SIMD unit into the 2007 Mactel. Preferably, at least two per core.
Intel has not even hinted that it will design processors specifically for Apple. The opposite is true. Intel said that Apple will be able to choose from the wide array of processors offered by the company. I believe that Apple engineers will play a significant role in the development of future Intel processors. However, these processors will be available to everyone.
Originally posted by ikDigital
How does Hyper-Threading compare/differ from Altivec?
Is HT something Apple will incorporate into the MacTels?
I think you're trying to compare SMT (Hyper-Threading) with SIMD (Altivec) - they are different kettles of fish altogether.
Will Apple incorporate them into their computers? If they go with Pentium M-lineage chips then the answer is no because Pentium M chips don't have SMT/Hyper-threading.
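To make the SMT-versus-SIMD distinction concrete, here is a rough C sketch — purely illustrative, not real vector or threading code — of the two models: SIMD is one instruction stream operating on several data lanes at once, while SMT is several independent instruction streams sharing one core's execution units.

```c
#include <assert.h>
#include <stddef.h>

/* SIMD (Altivec/SSE): ONE instruction stream, FOUR data lanes.  This
 * unrolled body mirrors what a single vector multiply-add does. */
static void saxpy_lanes(float a, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; i += 4) {    /* n assumed a multiple of 4 */
        y[i + 0] += a * x[i + 0];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
}

/* SMT (Hyper-Threading): TWO independent instruction streams sharing a
 * core's execution units.  Here the two "threads" are just interleaved
 * statements -- unrelated floating-point and integer work side by side,
 * which is what keeps an SMT core's idle units busy. */
static void smt_interleaved(float a, const float *x, float *y,
                            int *hits, size_t n) {
    for (size_t i = 0; i < n; i++) {
        y[i] += a * x[i];   /* stream 1: FP work      */
        hits[i % 2]++;      /* stream 2: integer work */
    }
}
```

Same arithmetic either way; the difference is whether the parallelism comes from wide data (SIMD) or from multiple threads of control (SMT).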
Mr. Me is right...there is no sign that Intel is interested in bolting Altivec onto their designs. The Universal Binary site on the Apple developer pages talks about porting Altivec instructions to Intel's MMX, SSE, SSE2, and SSE3 multimedia functions rather than saying 'Hold on, keep optimizing for Altivec...it's going to still be there'.
Even if Apple and Intel wanted to use Altivec on Intel, it's still unclear whether they have the intellectual property to do so.
Having said that, as Arsians have alluded to over the past week, Apple may inspire Intel to have another go at multimedia functions, perhaps adding an SSE4 or SSE5 in the next 2 to 3 years to Merom, Conroe, and other processors on the drawing board.
Originally posted by a j stev
Apparently, Yonah also has enhanced multimedia capabilities. This from Eweek (http://www.eweek.com/article2/0,1759,1823185,00.asp):
"Digital Media Boost, a Yonah feature which aims to increase performance, is a combination of updates to SSE (Streaming Single Instruction Multiple Data Extensions), a set of special instructions for executing multimedia, and the chip's Floating Point or number-crunching unit, the chip maker said".
The belief is that the Pentium M suffers on FP calculations compared with the Athlon 64 (Socket 939) and Pentium 4. DMB is obviously a plan to combat that. Would Apple have any involvement in that development? Ask Intel.
I've read that Yonah is getting SSE3, as well as an enhanced FPU. I think SSE3 is already on the newer Pentium 4s. They'll help, but probably still won't match the FPU or Altivec on the 970. Oh well. That's why the PowerMac will surely be the last to be switched. Unfortunately, I imagine there are some serious IP issues preventing Intel from adapting, much less adopting, Altivec. These plans must pre-date Apple's involvement, anyway - they probably go back to whenever Intel realized the Pentium M was the future of their company.
Apple would be wise to invest in a technology which allows the CPU to offload compositing & rendering to the extraordinary power of the new programmable GPUs.
Hey wait....
Didn't they just do that?
In fact, as GPUs get more and more programmable, it makes sense to offload intensive math tasks to the GPU and forget about bolting a math processor onto the CPU.
Roll on Core Math
Carni
Originally posted by Carniphage
Although audio is a beneficiary of Altivec - there is a better place for video to go: The GPU.
Using the GPU requires a lot of overhead. Using an on-die VPU, which is very similar architecturally to a GPU, is much cheaper.
SSE is weak sauce. There are two glaring problems with the x86: the FPU and the lack of a good SIMD unit. Intel has great FPU and SIMD technology, it's just not present in the current line of x86 chips. To me it seems clear that Apple is waiting until 2007 because it won't be until then that Intel will have anything that can compete with IBM in these critical areas.
Originally posted by Splinemodel
Using the GPU requires a lot of overhead. Using an on-die VPU, which is very similar architecturally to a GPU, is much cheaper.
I disagree. In recent years off-die GPUs have produced an astonishing improvement in performance, while SSE units have at best resulted in modest gains. AMD's 3DNow! presented itself as a solution to a problem which was solved much better by hardware T&L.
I think an on-processor vector unit is not a terrible idea but I think the benefit of that technology is declining. Specifically, I am just not sure what current computing applications it is particularly essential for.
The most intensive float-bashing task imaginable is video/CG - and the right place for that computing power to go is on the GPU alongside your pixels and your vertices.
Yes, it is expensive for the CPU to send data to the GPU - but if your application's main bottleneck is bashing massive arrays of floats, the CPU should not be involved in the task in the first place.
Having spent some time with the PS2 hardware, I can say the way to get optimal performance is to avoid using the CPU as much as possible.
The general-purpose nature of modern GPUs means that more and more numerically complex tasks can and should be offloaded into the GPU.
The real trick is making that process simple for the programmer. Never tried it, but I am fairly sure that a modern GPU could give Altivec a kicking with audio apps, video encoding, cryptography, physics simulation and just about everything else.
Put another way: which is the more useful computer? A 1GHz G3 with Core Image and a fast GPU, or a 1GHz G4 with no GPU and just Altivec?
Carni
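Carniphage's GPU argument boils down to the fragment-program model: a pure per-element function mapped over a big array, with no cross-element dependencies, which is exactly what lets the hardware run thousands of instances in parallel. A plain-C analogy of that shape (the names are made up; real GPU code would be a shader):

```c
#include <assert.h>
#include <stddef.h>

/* A GPU fragment program is essentially a pure function applied
 * independently to every element -- no element can see its neighbours,
 * which is the constraint that makes massive parallelism possible. */
typedef float (*kernel_fn)(float);

/* Example "kernel": boost an audio sample by 6dB (double it). */
static float gain_6db(float sample) { return sample * 2.0f; }

/* The "driver" maps the kernel over the whole array; on a real GPU this
 * dispatch is the expensive CPU-to-GPU round trip the posts above
 * mention, so it only pays off for large arrays. */
static void run_kernel(kernel_fn k, const float *in, float *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = k(in[i]);
}
```

Anything that fits this mold — pixels, samples, vertices — is a candidate for offload; anything with serial dependencies between elements is not, which is one reason the GPU is not a drop-in Altivec replacement.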
Originally posted by Carniphage
I disagree. In recent years off-die GPUs have produced an astonishing improvement in performance, while SSE units have at best resulted in modest gains. AMD's 3DNow! presented itself as a solution to a problem which was solved much better by hardware T&L.
I think an on-processor vector unit is not a terrible idea but I think the benefit of that technology is declining. Specifically, I am just not sure what current computing applications it is particularly essential for.
The most intensive float-bashing task imaginable is video/CG - and the right place for that computing power to go is on the GPU alongside your pixels and your vertices.
Yes, it is expensive for the CPU to send data to the GPU - but if your application's main bottleneck is bashing massive arrays of floats, the CPU should not be involved in the task in the first place.
Having spent some time with the PS2 hardware, I can say the way to get optimal performance is to avoid using the CPU as much as possible.
The general-purpose nature of modern GPUs means that more and more numerically complex tasks can and should be offloaded into the GPU.
The real trick is making that process simple for the programmer. Never tried it, but I am fairly sure that a modern GPU could give Altivec a kicking with audio apps, video encoding, cryptography, physics simulation and just about everything else.
Put another way: which is the more useful computer? A 1GHz G3 with Core Image and a fast GPU, or a 1GHz G4 with no GPU and just Altivec?
Carni
The reason that Apple is evangelizing Xcode so strongly is to encourage developers to use the various high-level APIs, especially the CoreXXXX sets. If developers use the tools available to them, then they won't have to worry about the underlying hardware.
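As a sketch of what those high-level APIs buy you: the application calls one stable entry point, and the library dispatches to whichever backend the machine actually supports — Altivec, SSE, or plain scalar. Everything below is hypothetical stand-in code illustrating the pattern, not Apple's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* All backends share one contract: o[i] = a[i] + b[i]. */
typedef void (*add_impl)(const float *, const float *, float *, size_t);

static void add_scalar(const float *a, const float *b, float *o, size_t n) {
    for (size_t i = 0; i < n; i++) o[i] = a[i] + b[i];
}

/* Stand-in for an Altivec- or SSE-tuned path; a real library would use
 * vector intrinsics here, but the observable result is identical. */
static void add_vector(const float *a, const float *b, float *o, size_t n) {
    for (size_t i = 0; i < n; i++) o[i] = a[i] + b[i];
}

/* In reality this would probe CPUID (x86) or the PVR (PowerPC). */
static int cpu_has_simd(void) { return 1; }

/* The only function application code ever sees; recompiling the app for
 * a different CPU never changes this call site. */
static void api_add(const float *a, const float *b, float *o, size_t n) {
    add_impl f = cpu_has_simd() ? add_vector : add_scalar;
    f(a, b, o, n);
}
```

That indirection is the whole point: code written against the API survives an Altivec-to-SSE transition untouched, because only the library's backends change.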
Originally posted by Carniphage
I disagree. In recent years off-die GPUs have produced an astonishing improvement in performance, while SSE units have at best resulted in modest gains. AMD's 3DNow! presented itself as a solution to a problem which was solved much better by hardware T&L.
I think an on-processor vector unit is not a terrible idea but I think the benefit of that technology is declining. Specifically, I am just not sure what current computing applications it is particularly essential for.
On-processor vector units are exactly what Cell is.
The general-purpose nature of modern GPUs means that more and more numerically complex tasks can and should be offloaded into the GPU.
The real trick is making that process simple for the programmer. Never tried it, but I am fairly sure that a modern GPU could give Altivec a kicking with audio apps, video encoding, cryptography, physics simulation and just about everything else.
Put another way: which is the more useful computer? A 1GHz G3 with Core Image and a fast GPU, or a 1GHz G4 with no GPU and just Altivec?
The problem with GPUs is that they are quite focused on graphics, and are nowhere near "general-purpose". Yes they are programmable, but the programming model is quite constrained... which is how they are made so powerful. A GPU can do a lot more in certain applications than a PPC+VMX, but it also has to handle the system's graphics. Something like Cell + GPU is the best combination -- the problem with Cell from Apple's point of view is the weakness of the Power core in the first Cell relative to the 970 and Intel's Pentium M offerings.
DHagan: I don't see how Intel will be able to outdo a DP 2.7GHz G5 without revamping the FPU architecture AND adding a vector unit along the lines of Altivec. The Itanium has a best-in-class FPU and vector performance, so it's not like Intel has to go back to the drawing board from scratch.
Originally posted by Splinemodel
DHagan: I don't see how Intel will be able to outdo a DP 2.7GHz G5 without revamping the FPU architecture AND adding a vector unit along the lines of Altivec. The Itanium has a best-in-class FPU and vector performance, so it's not like Intel has to go back to the drawing board from scratch.
So, are you suggesting that Intel will re-use existing CPU technology (Pentium + Itanium) to make a strong desktop CPU?
In iTunes, my TiBook with a 1GHz G4, 1GB of RAM, and a 133MHz bus rips 192kb AAC faster than my P4 3.2GHz with 2GB of RAM and an 800MHz bus. Altivec works.
This fact has zero to do with the Pentium's SSE2 being somehow inferior to Altivec and 100% to do with Apple not being able to write decent software for the PC.
Originally posted by skatman
This fact has zero to do with the Pentium's SSE2 being somehow inferior to Altivec and 100% to do with Apple not being able to write decent software for the PC.
SSE2 is not nearly as robust as Altivec. Don't kid yourself. Secondly, Apple didn't even write the encoding code. It was licensed, and it's the standard.