970 Questions

rogue master · November 28, 2002 8:29AM

I've looked at the various information available on the 970, but have a few questions that someone here might be able to answer.

If the 970 is a 64-bit chip, does the VMX allow for two 64-bit values or is it still limited to 4 32-bit, 8 16-bit, and 16 8-bit values? Assuming it doesn't, wouldn't it have been easy for IBM to include instructions to support 2 64-bit values? This would give many FFT users double precision, which is sorely lacking from AltiVec. Will the performance of add/subtract in VMX be substantially better than it is in AltiVec?

As I understand it, the size of an int (in bits) is the same as the size of a GPR. Does this mean that an int on the 970 is 64 bits? If not, is a long 64 bits? What is a long long? What is a short? What about floats and doubles? Is the pending introduction of the 970 related to Apple's deprecation of vector signed long and vector unsigned long in OS X 10.2?

Looking for answers....

stoo · November 28, 2002 8:47AM

[quote]This would give many FFT users double precision, which is sorely lacking from Altivec[/quote]

Why not put double precision FP through the standard FPU? I suspect that "double precision lacking in Altivec" is because the G4 could do with some more Hz/scalar FPU units to challenge the Athlon.

Why does the P4 put double precision through its vector unit? Any advantages for the P4? Do these translate to 970 advantages?

programmer · November 28, 2002 9:56AM

The VMX unit is identical to the AltiVec unit in terms of functionality. The power of this unit comes from holding 4 to 16 times as many values per register as a scalar register does and doing the same operation on all of them at once. Going to 2 elements per register is quite a minor gain over just using the scalar units, and often turns out not to be a gain at all, since you usually have to restructure your code so that it can operate on multiple data elements at the same time. The circuit complexity to support 64-bit SIMD operands would be very substantial, and I would argue that it is not "sorely" lacking: there are good reasons for the SIMD unit to be limited to 32-bit operands and 128-bit registers. SIMD units which are only 2-way are quite marginal in value in my experience.
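As a rough illustration of the lane counts discussed above, here is a minimal C sketch. It assumes only the 128-bit register width mentioned in the post; `lanes_per_register` and `VECTOR_REGISTER_BITS` are made-up names for this example, not part of any AltiVec or VMX API.

```c
#include <assert.h>

/* Width of one vector register in bits, as stated in the post. */
enum { VECTOR_REGISTER_BITS = 128 };

/* How many elements of a given width fit in one vector register.
   Hypothetical helper for illustration only. */
unsigned lanes_per_register(unsigned element_bits)
{
    /* The element width must be non-zero and divide the register evenly. */
    assert(element_bits > 0 && VECTOR_REGISTER_BITS % element_bits == 0);
    return VECTOR_REGISTER_BITS / element_bits;
}
```

Calling it with 8, 16, and 32 gives 16, 8, and 4 lanes, while a hypothetical 64-bit element would give only 2, which is the 2-way case described as marginal.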

Double precision FFT users currently have their code in scalar form. The 970 can run that code up to twice as fast (at the same clock rate) thanks to its 2 FPUs. That should be about as fast as the suggested enhancement to VMX, without having to rewrite the code.

An integer register is 64 bits wide on the 970. In the compilers this will probably be represented as a "long long", but I might be wrong about this since Apple hasn't announced its 64-bit OS APIs yet. "int" will likely stay 32 bits, but "long" may go to 64 bits. If it does then "long long" will likely be promoted to 128 bits.

IEEE 754 defines single precision ("float") as 32 bits and double precision ("double") as 64 bits, and the PowerPC execution model defines the FPU registers to be the size of a double.

"vector signed long" and "vector unsigned long" may have been eliminated to remove confusion about this issue going forward. They certainly were redundant.
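The float and double sizes above can be checked on a given compiler with a small C sketch like the following. It is only an illustration: `has_ieee_float_sizes` is a hypothetical helper name, and the 24-bit and 53-bit significand widths are the IEEE 754 single and double precision values, which the post does not spell out.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Returns 1 if float and double look like IEEE 754 single and double
   precision on this compiler, 0 otherwise. Hypothetical helper. */
int has_ieee_float_sizes(void)
{
    /* FLT_MANT_DIG and DBL_MANT_DIG count digits in base FLT_RADIX,
       so the comparison below only makes sense for a binary format. */
    assert(FLT_RADIX == 2);
    return sizeof(float) * CHAR_BIT == 32 && FLT_MANT_DIG == 24
        && sizeof(double) * CHAR_BIT == 64 && DBL_MANT_DIG == 53;
}
```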

BTW: The P4 only puts double precision through its "vector" unit because the x86 FPU architecture is so incredibly stupid that Intel can't figure out how to get decent performance out of it. Rather than trying, they added scalar double support to the SSE2 unit, avoiding the whole problem with the FPU register stack and how it is shared with the MMX unit. It also means that AMD needs to implement SSE2 if it wants to stay compatible with P4-optimized code.

stoo · November 28, 2002 2:05PM

Mmmm... non stack based FPU

Thanks programmer.

wmf · December 4, 2002 10:13PM

In 32-bit mode, PowerPC uses an "ILP32" model where int, long, and void* are 32 bits. In 64-bit mode it uses an "LP64" model where int is 32 bits and long and void* are 64 bits. I'm not sure about long long.
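One way to see which of these two models a given compiler and set of flags produces is to look at the type widths directly. The sketch below is an assumption-laden illustration: the enum and function names are invented for this example, and anything that is neither ILP32 nor LP64 is simply reported as "other".

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The two models named in the post, plus a catch-all. Hypothetical names. */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_OTHER };

/* Classify a data model from the widths (in bits) of int, long, and void*. */
enum data_model classify_data_model(size_t int_bits, size_t long_bits,
                                    size_t ptr_bits)
{
    /* C requires long to be at least as wide as int. */
    assert(int_bits > 0 && long_bits >= int_bits);
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32)
        return MODEL_ILP32;
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64)
        return MODEL_LP64;
    return MODEL_OTHER;
}

/* Classify whatever model the current compiler and flags are using. */
enum data_model host_data_model(void)
{
    return classify_data_model(sizeof(int) * CHAR_BIT,
                               sizeof(long) * CHAR_BIT,
                               sizeof(void *) * CHAR_BIT);
}
```

Building the same file in 32-bit and 64-bit mode and comparing the result of `host_data_model()` shows the difference without relying on documentation for any one compiler.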

programmer · December 4, 2002 10:17PM

[quote]Originally posted by wmf:

In 32-bit mode, PowerPC uses an "ILP32" model where int, long, and void* are 32 bits. In 64-bit mode it uses an "LP64" model where int is 32 bits and long and void* are 64 bits. I'm not sure about long long.[/quote]

This is actually a function of the C/C++ compiler. Different compilers (or even different settings on the same compiler) could present different size data types.

wmf · December 4, 2002 11:21PM

[quote]Originally posted by Programmer:

This is actually a function of the C/C++ compiler. Different compilers (or even different settings on the same compiler) could present different size data types.[/quote]

True, but AFAIK there is some kind of standard for C on PowerPC, so all the compilers do it the same way.
