VMX 2


Comments

  • Reply 21 of 114
    powerdoc Posts: 8,123 member
    Thanks for your answer Programmer



    I now have a clear picture: VMX 2 will support more instructions than the current implementation, and its architecture will be more parallel than the previous one.
  • Reply 22 of 114
    smalM Posts: 677 member
    Quote:

    Originally posted by Programmer

    The 970 VMX unit already has dual VPUs which can process vectors in parallel, effectively acting as a 256-bit vector unit like you allude to.



    Could you explain what you mean?

    The 970 VMX consists of a permute unit fed by one queue, and a simple integer unit, complex integer unit, and floating-point unit fed by a second queue.

    How can I let this act as a 256-bit unit?
  • Reply 23 of 114
    Programmer Posts: 3,458 member
    Quote:

    Originally posted by smalM

    Could you explain what you mean?

    The 970 VMX consists of a permute unit fed by one queue, and a simple integer unit, complex integer unit, and floating-point unit fed by a second queue.

    How can I let this act as a 256-bit unit?




    No, I can't explain what I mean because I don't understand how I could say that either (at least not and be correct at the same time). I can offer excuses like "it was early in the morning" or "it was late at night". You're quite right, of course. I had "two instructions per clock" stuck in my head.



    Now, the simple addition of another dispatch queue in an improved 970 design, with extra execution units to back it up, would effectively give you a 256-bit vector unit from a performance point of view, while retaining the flexibility and context costs of the 128-bit unit. Still wouldn't give you double precision, of course.



    Sorry. Bad programmer, bad.
  • Reply 24 of 114
    Tomb of the Unknown
    Quote:

    Originally posted by smalM

    Could you explain what you mean?

    The 970 VMX consists of a permute unit fed by one queue, and a simple integer unit, complex integer unit, and floating-point unit fed by a second queue.

    How can I let this act as a 256-bit unit?




    You might be interested in this PDF file from Apple's developer site. It talks about 256-bit FP operations and how to get the Altivec unit to do them.



    http://developer.apple.com/hardware/ve/pdf/oct3a.pdf
  • Reply 25 of 114
    powerdoc Posts: 8,123 member
    Quote:

    Originally posted by Tomb of the Unknown

    You might be interested in this PDF file from Apple's developer site. It talks about 256-bit FP operations and how to get the Altivec unit to do them.



    http://developer.apple.com/hardware/ve/pdf/oct3a.pdf




    It's quite different from what we were discussing here. It's a discussion of how to do 256-bit FP operations on an AltiVec unit, rather than a 256-bit AltiVec, or a more parallel AltiVec unit (the current one has only 4 execution units, all different).
  • Reply 26 of 114
    Programmer Posts: 3,458 member
    Quote:

    Originally posted by Tomb of the Unknown

    You might be interested in this PDF file from Apple's developer site. It talks about 256-bit FP operations and how to get the Altivec unit to do them.



    http://developer.apple.com/hardware/ve/pdf/oct3a.pdf




    This is the difference between operating on a single very large number, and acting on 256-bits worth of small numbers (i.e. 8 32-bit numbers). SIMD is far better at doing the same operation to many small numbers than it is at doing an operation on a big number... that's what it is designed to do after all: process "vectors" of numbers.
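    A rough way to see the distinction in plain C (a sketch, not any particular chip's instruction set): the element-wise add maps directly onto independent SIMD lanes, while the single wide add forces carries to ripple across all the words.

```c
#include <stdint.h>

/* Element-wise add: what SIMD hardware does well -- eight independent
   32-bit lanes, with no carries crossing lane boundaries. */
static void add_8x32(uint32_t r[8], const uint32_t a[8], const uint32_t b[8]) {
    for (int i = 0; i < 8; i++)
        r[i] = a[i] + b[i];          /* each lane wraps independently */
}

/* One 256-bit integer add: the carry must ripple across all eight
   words, which serializes the lanes -- exactly what SIMD is NOT
   designed for. */
static void add_256(uint32_t r[8], const uint32_t a[8], const uint32_t b[8]) {
    uint64_t carry = 0;
    for (int i = 0; i < 8; i++) {    /* word 0 is least significant */
        uint64_t s = (uint64_t)a[i] + b[i] + carry;
        r[i] = (uint32_t)s;
        carry = s >> 32;
    }
}
```

    The dependency chain through `carry` is the whole difference: the first loop parallelizes trivially, the second does not.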
  • Reply 27 of 114
    Yevgeny Posts: 1,148 member
    Quote:

    Originally posted by Programmer

    This is the difference between operating on a single very large number, and acting on 256-bits worth of small numbers (i.e. 8 32-bit numbers). SIMD is far better at doing the same operation to many small numbers than it is at doing an operation on a big number... that's what it is designed to do after all: process "vectors" of numbers.



    I don't even know what data type takes 256 bits. The only reason why you would have a 256 bit register is for parallel vector operations.



    I only know of one 128 bit data type, a MS GUID. When you have to compare lots of them (e.g. in a COM QI call), it takes quite a bit of time. My coworkers and I have frequently wished that Intel chips had some way to do a 128 bit compare.
  • Reply 28 of 114
    Programmer Posts: 3,458 member
    Quote:

    Originally posted by Yevgeny

    I don't even know what data type takes 256 bits. The only reason why you would have a 256 bit register is for parallel vector operations.



    I only know of one 128 bit data type, a MS GUID. When you have to compare lots of them (e.g. in a COM QI call), it takes quite a bit of time. My coworkers and I have frequently wished that Intel chips had some way to do a 128 bit compare.




    The need for numbers that large (or precise) is very rare. Some scientific uses, and the like.



    There are relatively quick ways to do a 128-bit GUID compare on a 32-bit machine (of course having enough registers to load 8 words at the same time always helps!). You shouldn't be doing this in assembly, and in a high level language just write the function once. I can't imagine that it would be your performance bottleneck -- if it is you have some serious design issues.
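    A sketch of the kind of quick compare meant here, in plain C with a hypothetical GUID layout (real GUIDs have a more structured internal layout; only the word-wise comparison matters):

```c
#include <stdint.h>

typedef struct { uint32_t w[4]; } Guid128;   /* hypothetical layout */

/* Word-wise equality with short-circuit evaluation: four 32-bit
   compares in the worst case, and usually just one, since two random
   GUIDs almost always differ in the first word checked. */
static int guid_equal(const Guid128 *a, const Guid128 *b) {
    return a->w[0] == b->w[0] && a->w[1] == b->w[1] &&
           a->w[2] == b->w[2] && a->w[3] == b->w[3];
}
```

    Written once in a high-level language like this, the compiler keeps all eight words in registers on any machine with registers to spare.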
  • Reply 29 of 114
    Mr. Me Posts: 3,221 member
    Quote:

    Originally posted by Programmer

    The need for numbers that large (or precise) is very rare. Some scientific uses, and the like.



    ....




    I find this comment amusing. Computers in the scientific world have three uses. One is to analyze data from experiments. Another is to simulate the real world. The third is to control instrumentation. A 128-bit number is relevant if you can measure two quantities differing by only the last digit. But the fact is that there is no instrument constructed by the hand of man (or woman) that approaches that kind of accuracy. There is no quantity in any system of measurements that is defined to 128 bits of accuracy. If there is nothing defined to 128-bit accuracy and nothing that can measure it anyway, then it most certainly makes no sense to simulate anything to that kind of accuracy. And finally, there is no instrument on the face of the Earth that requires control by 128-bit strings.



    Is it possible to map 128-bit numbers into something useful scientifically? Probably, but that doesn't mean that a scientific problem is naturally expressed in 128-bit terms or that 128-bit numbers are the most efficient way to do it. Isn't it true that 128-bits are more accurate than 32-bits or even 64-bits? And if that is the case, then what is the harm in using 128-bit numbers? Yes, it is true. However, the major use of large-precision numbers is to hide bad algorithms. Bad algorithms waste time. Calculations performed using large-precision numbers also waste time and memory. That is time better spent making simulations more representative of the real world.
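    One concrete illustration of "better algorithms beat brute precision" is compensated summation: a single-precision Kahan sum recovers most of the accuracy that switching to a wider float would have bought. A minimal sketch in C:

```c
#include <math.h>

/* Naive running sum in single precision: rounding error grows with n. */
static float naive_sum(const float *x, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; i++) s += x[i];
    return s;
}

/* Kahan compensated sum: carries the lost low-order bits in `c`,
   so the result is nearly as accurate as a wider accumulator. */
static float kahan_sum(const float *x, int n) {
    float s = 0.0f, c = 0.0f;
    for (int i = 0; i < n; i++) {
        float y = x[i] - c;
        float t = s + y;
        c = (t - s) - y;   /* the part of y that didn't make it into s */
        s = t;
    }
    return s;
}
```

    Same data, same hardware precision; the algorithm alone decides how much of it survives.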
  • Reply 30 of 114
    powerdoc Posts: 8,123 member
    Quote:

    Originally posted by Mr. Me

    I find this comment amusing. Computers in the scientific world have three uses. One is to analyze data from experiments. Another is to simulate the real world. The third is to control instrumentation. A 128-bit number is relevant if you can measure two quantities differing by only the last digit. But the fact is that there is no instrument constructed by the hand of man (or woman) that approaches that kind of accuracy. There is no quantity in any system of measurements that is defined to 128 bits of accuracy. If there is nothing defined to 128-bit accuracy and nothing that can measure it anyway, then it most certainly makes no sense to simulate anything to that kind of accuracy. And finally, there is no instrument on the face of the Earth that requires control by 128-bit strings.



    Is it possible to map 128-bit numbers into something useful scientifically? Probably, but that doesn't mean that a scientific problem is naturally expressed in 128-bit terms or that 128-bit numbers are the most efficient way to do it. Isn't it true that 128-bits are more accurate than 32-bits or even 64-bits? And if that is the case, then what is the harm in using 128-bit numbers? Yes, it is true. However, the major use of large-precision numbers is to hide bad algorithms. Bad algorithms waste time. Calculations performed using large-precision numbers also waste time and memory. That is time better spent making simulations more representative of the real world.




    For the measurement part, I agree.

    For the second part, I disagree. Mathematicians have demonstrated that a lack of large-precision numbers can lead to huge errors during a simulation.

    A very small error can increase exponentially in some types of calculations. For most simulations this is not an issue, but for some others it's important.
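    A small C sketch of this effect, using the chaotic logistic map as a stand-in for such a calculation: the tiny rounding difference between single and double precision is amplified (on average doubled) on every step, so the two trajectories soon disagree completely.

```c
#include <math.h>

/* Iterate the chaotic logistic map x -> 4x(1-x).  Any rounding
   difference between the two precisions grows exponentially, so after
   ~100 iterations the single- and double-precision results bear no
   resemblance to each other, even from the same starting value. */
static double logistic_double(double x, int n) {
    for (int i = 0; i < n; i++) x = 4.0 * x * (1.0 - x);
    return x;
}

static float logistic_float(float x, int n) {
    for (int i = 0; i < n; i++) x = 4.0f * x * (1.0f - x);
    return x;
}
```

    This is exactly the class of calculation where extra precision buys real time before the answer degrades, and where too little precision silently destroys the result.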
  • Reply 31 of 114
    Yevgeny Posts: 1,148 member
    Quote:

    Originally posted by Programmer

    The need for numbers that large (or precise) is very rare. Some scientific uses, and the like.



    There are relatively quick ways to do a 128-bit GUID compare on a 32-bit machine (of course having enough registers to load 8 words at the same time always helps!). You shouldn't be doing this in assembly, and in a high level language just write the function once. I can't imagine that it would be your performance bottleneck -- if it is you have some serious design issues.




    Yes, generally speaking, there isn't really much of a need for 128 bit numbers.



    In particular, it would be possible to check the portion of the GUID that corresponds to the time created, then check the portion of the GUID that corresponds to the IP address. To be equal, you would need to run all four 32-bit tests, but inequality could be established pretty easily, within the first or second test. IP address isn't that reliable a way of differentiating GUIDs, because programmers have a tendency to generate GUIDs on the same machine (the box they program on).



    When you do lots of COM interface programming, GUID comparisons are actually an irritating bottleneck (the software design forces them on you, but they are slow in comparison to the actual work you have to get done). It is commonly the case that you must obtain an interface on an object that is almost the last interface that is queried for (in one case, the interface that I regularly needed was the 15th of 17 interfaces on an object!). Being able to do a GUID check in one clock cycle is something that we wish for (a quick check through our code base turned up 5 million lines of code implementing 6,611 unique COM interfaces on 9,641 coclasses in 510 DLLs). At least the architecture is clean, and this is what is important, because we expose all of our internal code to 3rd-party developers so that they can extend our software using the same objects that we use. When you want to expose millions of lines of C++ code to developers, software engineering sometimes trumps speed.
  • Reply 32 of 114
    Programmer Posts: 3,458 member
    Quote:

    Originally posted by Yevgeny

    When you do lots of COM interface programming, GUID comparisons are actually an irritating bottleneck (the software design forces them on you, but they are slow in comparison to the actual work you have to get done). It is commonly the case that you must obtain an interface on an object that is almost the last interface that is queried for (in one case, the interface that I regularly needed was the 15th of 17 interfaces on an object!). Being able to do a GUID check in one clock cycle is something that we wish for (a quick check through our code base turned up 5 million lines of code implementing 6,611 unique COM interfaces on 9,641 coclasses in 510 DLLs). At least the architecture is clean, and this is what is important, because we expose all of our internal code to 3rd-party developers so that they can extend our software using the same objects that we use. When you want to expose millions of lines of C++ code to developers, software engineering sometimes trumps speed.



    It's not necessarily an either-or situation. I've seen a COM implementation that does a much better job of QueryInterface than Microsoft's does, and it still gives you all of the abstraction power that interface-based programming offers.
  • Reply 33 of 114
    Programmer Posts: 3,458 member
    Quote:

    Originally posted by Mr. Me

    Is it possible to map 128-bit numbers into something useful scientifically?



    Yes, there are situations where this is necessary. Is it necessary 99.9999% of the time? No. I agree that many coders squander the available precision unnecessarily, but there are problems out there which require this level of precision. It is not worth the circuits to implement this directly in commodity desktop processors, but that doesn't mean that occasionally it isn't useful. This is why Apple published a paper on doing 128-bit floating-point arithmetic with the AltiVec unit. It's an interesting read, but they wouldn't have done it if it weren't useful to somebody.
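    The standard trick behind such papers is to represent one high-precision value as an unevaluated sum of two machine doubles ("double-double" arithmetic). Its core building block, Knuth's TwoSum, can be sketched in a few lines of C (this is the general technique, not Apple's exact AltiVec code):

```c
/* Knuth's TwoSum: after the call, s + err is EXACTLY a + b, where s is
   the rounded double sum and err holds the bits rounding discarded.
   Chaining such (s, err) pairs is the usual software route to roughly
   quad precision on ordinary 64-bit FP hardware. */
static void two_sum(double a, double b, double *s, double *err) {
    double sum = a + b;
    double bb  = sum - a;                     /* b as actually absorbed */
    *err = (a - (sum - bb)) + (b - bb);       /* exact rounding error   */
    *s = sum;
}
```

    Note that this costs six ordinary additions per extended-precision add, which is why it is worth doing in a vector unit.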
  • Reply 34 of 114
    Yevgeny Posts: 1,148 member
    Quote:

    Originally posted by Programmer

    It's not necessarily an either-or situation. I've seen a COM implementation that does a much better job of QueryInterface than Microsoft's does, and it still gives you all of the abstraction power that interface-based programming offers.



    True, it isn't an either-or situation. I thought of my own speed boost to QI that would use a hash table at the base object in the interface inheritance to reduce QI runtime from O(n) to O(1) (amortized). Of course, we would have to implement this in all our objects... actually, it turns out that you can usually get around most QI problems by simply rearranging the order in which you compare a requested interface to the interfaces that a given object supports. Place the frequently requested interfaces at the front of the list of interfaces to query. Anyhow, I am sure that effective COM programming isn't the best use of a future-hardware topic on VMX. It suffices to say that some people have a use for 128-bit operations that sadly are not doable on the x86 instruction set.



    It doesn't surprise me that there are better COM implementations out there. The MIDL compiler barely qualifies as a usable product in my mind...
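    Yevgeny's cheaper fix above, ordering the interface table by request frequency, can be sketched in C with hypothetical IID and table types (real COM compares with IsEqualIID and returns HRESULTs, not pointers):

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { uint32_t w[4]; } Iid;              /* hypothetical IID */
typedef struct { Iid iid; void *itf; } IidEntry;

/* Linear QueryInterface scan over a table ordered so that the most
   frequently requested interfaces come first.  The worst case is still
   O(n), but the expected cost drops toward O(1) with no new data
   structure and no change to the object's layout. */
static void *query_interface(const IidEntry *table, size_t n, const Iid *iid) {
    for (size_t i = 0; i < n; i++) {
        const uint32_t *a = table[i].iid.w, *b = iid->w;
        if (a[0] == b[0] && a[1] == b[1] && a[2] == b[2] && a[3] == b[3])
            return table[i].itf;
    }
    return NULL;   /* E_NOINTERFACE in real COM */
}
```

    The hash-table version buys a guaranteed O(1) lookup, at the cost of touching every object implementation; reordering the table is the change you can actually ship.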
  • Reply 35 of 114
    wizard69 Posts: 13,377 member
    I have to disagree with this on several counts. You miss what is possibly the most important use for a computer in science, and that is as a communications tool. Floats don't often come into play here unless we are talking about visualization applications.



    Second, resolution in measurements does not constrain the need for resolution in processing.



    The advent and use of large-number processing libraries more or less discounts your assertion that large-number capabilities are not needed. Much of what science deals with is not the product of mankind anyway.



    Your statement about the definition of quantities is a bit ridiculous. Many an engineering math handbook has definitions for constants out past 20 digits. It is never a good thing to lose information simply because the current number system you are using can't handle it.



    You want to use the largest number sizes that are consistent with not losing information. If your applications would work better with 68-bit floats, then the next logical size is 128-bit floats, though I suppose one could argue for 96-bit floats. Besides, there are even applications outside of science that can make use of large floats; just look at the national budget.



    Thanks

    Dave





    Quote:

    Originally posted by Mr. Me

    I find this comment amusing. Computers in the scientific world have three uses. One is to analyze data from experiments. Another is to simulate the real world. The third is to control instrumentation. A 128-bit number is relevant if you can measure two quantities differing by only the last digit. But the fact is that there is no instrument constructed by the hand of man (or woman) that approaches that kind of accuracy. There is no quantity in any system of measurements that is defined to 128 bits of accuracy. If there is nothing defined to 128-bit accuracy and nothing that can measure it anyway, then it most certainly makes no sense to simulate anything to that kind of accuracy. And finally, there is no instrument on the face of the Earth that requires control by 128-bit strings.



    Is it possible to map 128-bit numbers into something useful scientifically? Probably, but that doesn't mean that a scientific problem is naturally expressed in 128-bit terms or that 128-bit numbers are the most efficient way to do it. Isn't it true that 128-bits are more accurate than 32-bits or even 64-bits? And if that is the case, then what is the harm in using 128-bit numbers? Yes, it is true. However, the major use of large-precision numbers is to hide bad algorithms. Bad algorithms waste time. Calculations performed using large-precision numbers also waste time and memory. That is time better spent making simulations more representative of the real world.




  • Reply 36 of 114
    Nevyn Posts: 360 member
    Quote:

    Originally posted by wizard69

    Besides, there are even applications outside of science that can make use of large floats; just look at the national budget.



    ???

    If you're running out of bits in double precision floating point math doing anything related to money, fire your programmer.



    Yes, the US national debt exceeds $4 trillion; it is still NOWHERE near something that needs floating-point sizes beyond double precision. Yes, I grok compound interest and other methods of getting "partial cents" etc., but _in_the_end_ it all has to be rounded (because we don't cut coins into pieces anymore). The other aspect is that something like the debt isn't really a 'single item'. It is a collection of a very large number of smaller items, each of which has its own individual interest rate, maturity, etc. Each of those is _individually_ calculable with 100% accuracy on hand calculators (once you acknowledge the inherent discreteness and rounding rules).
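    The usual way to make this concrete is fixed-point money: store cents in a 64-bit integer (which tops out around 9.2*10^18 cents, roughly $92 quadrillion) and round explicitly at each step. A sketch, with the rate expressed in basis points:

```c
#include <stdint.h>

/* Money as integer cents: exact decimal arithmetic, no floating point.
   Interest is applied in basis points (1 bp = 0.01%) with explicit
   round-half-up on the discarded fraction -- the rounding happens
   where the rules say it must, not wherever the FPU decides. */
static int64_t add_interest(int64_t cents, int64_t basis_points) {
    return cents + (cents * basis_points + 5000) / 10000;
}
```

    Each individual instrument is computed exactly like this, and the totals are just sums of exact integers.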



    On another note, the text in front of me tabulates only e & pi beyond 16 digits. Well, and pi/2, and a slew of other silliness like that.



    You don't really mean "If you've got the more precise info, you're insane to chuck it, all calculations must proceed with all available information" do you? You really mean "I've assessed 1) the number of significant bits I need in the end, and 2) I've assessed how my algorithm will spread my error bars/reduce the number of significant bits, and 3) I've measured as accurately as I need to, with a fair chunk of extra accuracy"



    Because if you really mean "never use approximations when better data is available", please call me when you've accurately and precisely calculated the circumference of a circle exactly 1 meter in diameter. In units of _meters_ please, not multiples of pi. Oh, and here's the first million digits of pi or so.
  • Reply 37 of 114
    Yevgeny Posts: 1,148 member
    Quote:

    Originally posted by wizard69

    I have to disagree with this on several counts. You miss what is possibly the most important use for a computer in science, and that is as a communications tool. Floats don't often come into play here unless we are talking about visualization applications.

    Second, resolution in measurements does not constrain the need for resolution in processing.

    The advent and use of large-number processing libraries more or less discounts your assertion that large-number capabilities are not needed. Much of what science deals with is not the product of mankind anyway.

    Your statement about the definition of quantities is a bit ridiculous. Many an engineering math handbook has definitions for constants out past 20 digits. It is never a good thing to lose information simply because the current number system you are using can't handle it.

    You want to use the largest number sizes that are consistent with not losing information. If your applications would work better with 68-bit floats, then the next logical size is 128-bit floats, though I suppose one could argue for 96-bit floats. Besides, there are even applications outside of science that can make use of large floats; just look at the national budget.



    Thanks

    Dave




    You can express the national budget in terms of doubles. Because all computer numbers are discrete, you can never have perfect precision for infinitely repeating numbers like the one produced by 1/3. A 64-bit double can represent integers exactly up to about 9*10^15 (2^53 = 9,007,199,254,740,992), or floating-point values from about 2.2*10^-308 to 1.8*10^308. That is quite a range of values, and you still get roughly 15 decimal digits of accuracy.



    64-bit doubles are sufficiently accurate to describe the location of anything in the solar system to within a few millimeters. Any scientist who thinks his numbers are that accurate in the first place is an idiot. The error in measurement is larger than the error introduced by the inaccuracy of a 64-bit number.
  • Reply 38 of 114
    Amorph Posts: 7,112 member
    Quote:

    Originally posted by wizard69

    You want to use the largest number sizes that are consistent with not losing information. If your applications would work better with 68-bit floats, then the next logical size is 128-bit floats, though I suppose one could argue for 96-bit floats.



    Actually, IEEE specifies an 80 bit floating point type, which was implemented in the 68040 and lost in the transition to PowerPC.



    For the national debt, you'd want one of those big IBMs that can do fixed-point math in hardware.



    None of this is really relevant to a revision of VMX - G4s have always been able to do 64 bit floating point just fine (if a bit slowly). The issue is whether it's worth the extra bandwidth and silicon to handle 64 bit values in vectors, and currently the answer appears to be no. SSE2 can do 2x64 bit FP, but that's only to make up for the fact that the x86's built-in floating point unit is hilariously bad. That's not the case on any PowerPC (that has an FP unit in the first place).
  • Reply 39 of 114
    AirSluf Posts: 1,861 member
    Kickaha and Amorph couldn't moderate themselves out of a paper bag. Abdicate responsibility and succumb to idiocy. Two years of letting a member make personal attacks against others, then stepping aside when someone won't put up with it. Not only that, but go ahead and shut down my posting privileges but not the one making the attacks. Not even the common decency to abide by their warning (after three days of absorbing personal attacks with no mods in sight), just shut my posting down and then say it might happen later if a certain line is crossed. Bullshit flag is flying; I won't abide by lying and coddling of liars who go off-site, create accounts differing in a single letter from my handle with the express purpose to deceive, and then claim here that I did it. Everyone be warned, kim kap sol is a lying, deceitful poster.



    Now I guess they should have banned me rather than just shut off posting privileges, because Kickaha and Amorph definitely aren't going to like being called to task when they thought they had it all ignored *cough* *cough* I mean under control. Just a couple o' tools.



    Don't worry, as soon as my work resetting my posts is done I'll disappear forever.
  • Reply 40 of 114
    Tomb of the Unknown
    Quote:

    Originally posted by Powerdoc

    It's quite different from what we were discussing here. It's a discussion of how to do 256-bit FP operations on an AltiVec unit, rather than a 256-bit AltiVec, or a more parallel AltiVec unit (the current one has only 4 execution units, all different).



    I know what the paper was about, after all, that's what I said. What I didn't know was what was smalM's application. So I posted the link in the hopes that it would help him if his application was related to the paper.



    As to the need for the level of precision provided by 256-bit calculations: if you are looking for gross physical proof of quantum-level events, then you are likely to need extreme precision in your calculations just to get a sense of the scale of the phenomenon you seek. Recent discussions about the gravitational lensing effects of relatively small Newtonian bodies (Jovian class) demonstrated the need for extremely accurate calculations just so a determination could be made as to whether the anticipated effect could be measured, as I recall.