
Macintoshes coming next week ? month ? year ?

post #1 of 36
Thread Starter 
My first new topic in FH! Hope it won't be locked.

Knowing current partnerships :
AMD/Apple/nVidia
Apple/IBM/Moto

The next pro line :
XMac, XBook, XServe
new MB
RapidIO/hypertransport
nforce2
Altivec/SIMD
DDR 333/400/666
G5/POWER4/POWER5/Clawhammer/SledgeHammer
1,2,4,8-processors

The next consumer line :
iMac, iBook (improved current PowerMac/PowerBook specs all G4)

The next education line :
eMac, eBook (improved current iMac/iBook specs all G4)


Make your choice...

>>>>>>>>>>>>> WHEN WILL WE HAVE THEM, STEVE? <<<<<<<<<<<<<


[Hmmm]

Aw
post #2 of 36
Hey..I just want Double Precision Altivec without losing speed.
He's a mod so he has a few extra vBulletin privileges. That doesn't mean he should stop posting or should start acting like Digital Jesus.
- SolipsismX
post #3 of 36
[quote]Originally posted by hmurchison:
<strong>Hey..I just want Double Precision Altivec without losing speed.</strong><hr></blockquote>

But Altivec is only 128 bits wide, and wouldn't be able to perform operations on more than 2 double-precision FP values per cycle.

But looking at the op-code for Altivec, it _is_ designed to be extended to 256 bit. Only a few operations might be problematic, like bit shifting. But even when you need to bit-shift by that much, it can be resolved by using two bit-shift ops instead of one.
post #4 of 36
[quote]Originally posted by blabla:
<strong>

But Altivec is only 128 bits wide, and wouldn't be able to perform operations on more than 2 double-precision FP values per cycle.

But looking at the op-code for Altivec, it _is_ designed to be extended to 256 bit. Only a few operations might be problematic, like bit shifting. But even when you need to bit-shift by that much, it can be resolved by using two bit-shift ops instead of one.</strong><hr></blockquote>


Because apps that need number crunching and require a double-precision FPU won't see any advantage from Altivec. Developers for POV-Ray and E-On have mentioned this, and it probably affects a large number of potential Altivec'd apps out there.

I just want speeeeeeeeeeeeeeeeeeed
post #5 of 36
[quote]Originally posted by hmurchison:
<strong>


Because apps that need number crunching and require a double-precision FPU won't see any advantage from Altivec. </strong><hr></blockquote>

I don't think 64-bit FP is used much in DSP algorithms, at least not in the market Motorola is aiming for.
post #6 of 36
[quote]Originally posted by blabla:
<strong>

I don't think 64-bit FP is used much in DSP algorithms, at least not in the market Motorola is aiming for.</strong><hr></blockquote>

<a href="http://mac.povray.org/support/faq.html#what_aboutg4" target="_blank">Things about Altivec U should know</a>

<a href="http://www.macuarium.com/macuarium/actual/especiales/2002_08_02_e-ongb.shtml" target="_blank">E-On Software interview </a>

[quote] Macuarium . Respecting hardware, Altivec is well known among 2D and digital video professionals because its power -for instance Final Cut pro 3 on Mac OS X-, does it have the same value in 3D world?

e-on . Unfortunately not; at least, not for our kind of application. Vue 4 is optimised for Altivec, but the performance gain is not where you’d expect it to be.

In terms of rendering speed, the problem is that Altivec only optimizes single precision floats, and we mostly use double precision. So Altivec only yields a 1% gain in performance (funnily enough, the very first Altivec optimised version of Vue 4 we did was actually running slower than the non optimised version!)

But Altivec really shines in other aspects of the application. For instance, the terrain editor is up to 440% faster thanks to Altivec optimisation.

<hr></blockquote>

Can I have my cake...and eat it too??

[ 08-06-2002: Message edited by: hmurchison ]
post #7 of 36
[quote]Originally posted by hmurchison:
<strong>
Because apps that need number crunching and require a double-precision FPU won't see any advantage from Altivec. Developers for POV-Ray and E-On have mentioned this, and it probably affects a large number of potential Altivec'd apps out there.</strong><hr></blockquote>

In the Photoshop plug-in I'm currently working on, we did an experiment, rewriting some of our code to employ Altivec, and found that there was no performance advantage. It turns out that our bottleneck is memory bandwidth. So moving Altivec to double-precision wouldn't help us any, but improving the memory bandwidth certainly would. There are probably many applications which fall into the same boat as us.
post #8 of 36
[quote]Originally posted by Appleworm:
XBook<hr></blockquote>
Not gonna happen.

<a href="http://www.accessmicro.com/laptopsystem/index.phtml" target="_blank">http://www.accessmicro.com/laptopsystem/index.phtml</a>

 

“The nitrogen in our DNA, the calcium in our teeth, the iron in our blood, the carbon in our apple pies were made in the interiors of collapsing stars. We are made of starstuff.” 
-Sagan
post #9 of 36
New Macintoshes never. Steve Jobs has officially announced that Apple is no longer going to produce Macintosh computers. Sorry, folks.. No new G4's now or never. Pity, Pity... Oh yea **CONFIRMED** [Laughing]
post #10 of 36
I've said it before and I'll say it again: an AltiVec-II extension (which supports 256 bit registers and possibly double precision) would require rewriting the existing AltiVec code and the existing double-precision math code in order to take advantage of it. We all know how slow the AltiVec adoption rate has been, do we really want to go through it again while waiting for AV-II to be taken advantage of?

On the other hand, there is plenty of room to improve the G4 without changing the programming model. Add another FPU and suddenly your existing double precision code is (roughly) twice as fast... on the day that the hardware is introduced! Add another VPU or more IPUs and the existing code gets faster.

Look at the mess Intel & AMD are in -- MMX, SSE1, SSE2, 3DNow!, and serious differences in optimizing code from 486 -> Pentium -> PII/III -> PIV -> Athlon. Writing optimal code for the general market on the x86 side is pretty much impossible without shipping 4-5 versions of the code. Most developers just pick one and optimize for that. Apple / IBM / Moto can't afford this level of confusion and should strive to "keep it simple, stupid". They already complain about how people don't write optimal code for the PPC... if they start acting like the x86 guys it'll just get worse.
Providing grist for the rumour mill since 2001.
post #11 of 36
[quote]Originally posted by hmurchison:
<strong>

Can I have my cake...and eat it too??

[ 08-06-2002: Message edited by: hmurchison ]</strong><hr></blockquote>

Implementing 64 bit FP support is a waste of time as long as Altivec stays at 128 bit. At the same time, we know double FP performance sucks, not so much because of no Altivec support, but more because of a weak scalar FP implementation. Adding more scalar FP units would probably be better.

If I had to design next-gen PPC with limited resources, I wouldn't waste it on Altivec 64 bit FP support.
post #12 of 36
(Oops... I misunderstood Programmer... sorry.)

[ 08-06-2002: Message edited by: blabla ]
post #13 of 36
Thanks for clearing that up. I think I'd prefer the additional FPU.
post #14 of 36
Double-precision FPU... I thought I read on Ars Technica that the G4 did double-precision, and that the Pentium 4 didn't. Did I read something wrong?

Matthew
post #15 of 36
[quote]Originally posted by BR:
<strong>
Not gonna happen.

<a href="http://www.accessmicro.com/laptopsystem/index.phtml" target="_blank">http://www.accessmicro.com/laptopsystem/index.phtml</a></strong><hr></blockquote>

That's what we said about iPhoto.
post #16 of 36
iBook was also the trademarked name of a product before Apple released theirs.
"...within intervention's distance of the embassy." - CvB

Original music:
The Mayflies - Black earth Americana. Now on iTMS!
Becca Sutlive - Iowa Fried Rock 'n Roll - now on iTMS!
post #17 of 36
I'm also going to wonder aloud whether E-On really needs to use double precision in Vue D'Esprit.

I have a funny feeling that if they went SP their renderer would suddenly get legs, and the impact on accuracy would be negligible when all was said and done. I believe NewTek made the same discovery a while back.

Just a hunch.

[ 08-06-2002: Message edited by: Amorph ]
post #18 of 36
[quote]Originally posted by SuperMatt:
<strong>Double-precision FPU... I thought I read on Ars Technica that the G4 did double-precision, and that the Pentium 4 didn't. Did I read something wrong?
</strong><hr></blockquote>

Yes.

G4 and P4 both support double precision in their scalar FPUs. G4 only supports single precision in the SIMD unit, whereas I believe the P4 supports double precision (2 way) in the SSE2 unit.

Amorph's comment about a particular product not really needing more than single precision is quite possibly correct -- many times code is written without thinking about how to "preserve" the available precision which means a lot of it gets squandered for no particular reason other than the laziness of the programmers. I cannot say for certain one way or the other in this particular case -- I don't even know what the product does, I don't know what algorithms are used, etc. All I'm saying is that often single will do even though the developer had a precision "problem"... and they took the easy way out by switching to doubles.

This isn't always the case though... sometimes you just need doubles. Consider a model of the solar system where you need to know the position of something within 1m, but the space in question is 20 billion meters across. 24 bits of mantissa just isn't going to cut it.
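A rough way to check the solar-system example is to measure how far apart adjacent single-precision values are at that scale. This is only a sketch: it assumes coordinates are measured in metres from one edge of the space, so values reach 2e10, and the helper names `to_f32` and `f32_spacing` are made up for illustration.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_spacing(x: float) -> float:
    """Gap between x (rounded to single) and the next larger single value."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - to_f32(x)

position = 2.0e10         # metres; assumed coordinate at the far edge
moved = position + 1.0    # the same object, one metre further out

print(f32_spacing(position))               # 2048.0 metres between neighbours
print(to_f32(moved) == to_f32(position))   # True: single precision cannot see the move
print(moved == position)                   # False: double precision can
```

With a 24-bit mantissa the representable positions near 2e10 m are 2048 m apart, so a 1 m resolution is out of reach in single precision without rescaling or using local coordinates.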
post #19 of 36
[quote]Originally posted by Programmer:
<strong>This isn't always the case though... sometimes you just need doubles. Consider a model of the solar system where you need to know the position of something within 1m, but the space in question is 20 billion meters across. 24 bits of mantissa just isn't going to cut it. </strong><hr></blockquote>

Apparently NASA just switches to yards when they run into these little problems.

Engineer Different.
The plural of 'anecdote' is not 'data'.
post #20 of 36
post #21 of 36
[[[Because apps that need number crunching and require Double Precision FPU won't see any advantages from Altivec. Developers for POV Ray and E-On have mentioned this and it probably affects a large number of potential Altivec'd apps out there. ]]]

This is kinda a myth... Double-precision is certainly not NEEDED everywhere. It's a biased benchmark to begin with. Oh, and just so you know, my friend David K. Every worked with the fellow that developed POV-Ray way-back-when, and it wasn't really written with the Mac in mind. BTW, you should all hop on over to Dave's site and check out his articles discussing benchmarks. They are a great read.

Anyway, I posted a little something on a different thread back in early July. I'll repost parts of it here. It talks about double precision and why YOU WOULD NOT WANT DOUBLE PRECISION IN THE VECTOR UNIT!

<snip>

How many are familiar with the old excuse:

"Um, but you see, it's not *our* fault the speed is underwhelming; there are just some things that AltiVec simply cannot be used for".

How often have we all heard this? The fact is that the "some things" always turned out to be ONE thing or specific things... "Our apps require double precision and AltiVec cannot be used in any way to perform double precision calculations"

Again, consumers were feeling disappointed and annoyed at Apple. As usual, I snooped around and found some interesting tidbits that many people fail to notice, then I checked them for accuracy and validity by asking some legitimate sources... Many users and marketing-types absolutely swear by the "quality" of renders that a double-precision calc would produce. I notice that these claims fail to mention any threshold with respect to human limitations of sight and vision. There is a point where the human eye, no matter how good your vision, will not be able to discern/resolve any increase in resolution/quality even if it was there. And since we are talking about full-motion animated 3D scenes shot on current monitors and TVs, many tricks can be played out on human vision, even the best of it.

From what I've discovered, it's reasonable to believe that you don't *need* double precision for 3D, unless you are really, really sloppy with your algorithms (and code).

Double precision calcs are usually employed because you can get away with a lot more slop in your coding. Here is a small rant about this endless nonsense about double precision in the vector unit. I obtained the info from a trusted source -- a Ph.D. who happens to be a PPC/AltiVec programmer... I decided to cut and paste the info so I could reply more quickly to this forum discussion.

[[[Q: Is an updated double precision-centric AltiVec unit the way to go?

A: No.

This is why:

The vector registers have room for four single precision floats to fit in each one. So for single precision, you can do four calculations at a time with a single AltiVec instruction. AltiVec is fast because you can do multiple things in parallel this way.

Most AltiVec single precision floating point code is 3-4 times faster than the usual scalar single precision floating point code for this reason. The reason that it is more often only three times faster and not the full four times faster (as would be predicted by the parallelism in the vector register I just mentioned) is that there is some additional overhead for making sure that the floats are in the right place in a vector register, that you don't have to deal with in the scalar registers. (There is only one way to put a floating point value in a scalar register.)

Double precision floating point values are twice as big (take up twice as many bytes) as single precision floating point values. That means you can only cram two of them into the vector register instead of four. If our experience with single precision floating point translates to double precision floating point, then the best you could hope to get by having double precision in AltiVec is a (3 to 4)/2 = 1.5 to 2 times speed up.

Is that enough to justify massive new hardware on Motorola's or Apple's part?

In my opinion, no.

This is especially true when one notes that using the extra silicon to instead add a second or third scalar FPU could probably do a better job of getting you a full 2x or 3x speed up, and the beauty part of this is that it would require absolutely no recoding for AltiVec. In other words, it would be completely backwards compatible with code written for older machines, give *instant speedups everywhere* and require no developer retraining whatsoever. This would be a good thing.

Even if you still think that SIMD with only two way parallelism is better than two scalar FPU's, you must also consider that double precision is a lot more complicated than single precision. There is no guarantee that pipeline lengths would not be a lot longer. If they were, that 1.5x speed increase might evaporate -- Quickly.

Yes, Intel has SSE2, which has two doubles in a SIMD unit. Yes, it is faster -- for Intel. It makes sense for Intel for a bunch of reasons that have to do with shortcomings in the Pentium architecture and nothing to do with actual advantages with double precision in SIMD.

To begin with Intel does not have a separate SIMD unit like PowerPC does. If you want to use MMX/SSE/SSE2 on a Pentium, you have to shut down the FPU. That is very expensive to do. As a work around, Intel has added Double precision to its SIMD so that people can do double precision math without having to restart the FPU. You can tell this is what they had in mind because they have a bunch of instructions in SSE2 that only operate on one of the two doubles in the vector. They are in effect using their vector engine as a scalar processing unit to avoid having to switch between the two. Their compilers will even recompile your scalar code to use the vector engine in this way because they avoid the switch penalty.

Okay, so Intel has double precision in their vector unit and despite what I have said, you still think that is absolutely wonderful. But do they Really have a double precision vector unit? The answer is not so clear. Their vector unit actually does calculations on the two doubles in the vector in a similar "one at a time fashion" to the way an ordinary scalar unit would. They only can get one vector FP op through [every two cycles] for this reason. AltiVec has no such limitation.

AltiVec can push through one vector FP op per cycle, doing four floating point operations simultaneously (up to 20 in flight concurrently). AltiVec also has a MAF core, which in many cases does two FP operations per instruction. This is the reason why despite large differences in clock frequency, AltiVec can meet and often beat the performance of Intel's vector engine.

The other big dividend that they get from double precision SIMD is the fact that they can get two doubles into one register. When you only have eight registers this is a big deal! [PowerPC has 32 registers for each of scalar floating point and AltiVec!] In 90% of the cases, we programmers don't need more space in there and the registers the PPC provides are just fine.

Simply put (from a developer's position), we just don't need double precision in the vector engine, and we wouldn't derive much benefit from it if we had it. The worst thing that could possibly happen for Mac developers is that we get it, because that would mean that the silicon could not be used to make some other part of the processor faster and more efficient, and a lot of code would need to be rewritten for little to no performance benefit. It wouldn't be a logical tradeoff.

The only way this would be worthwhile would be to double the width of the vector register so that we get 4x parallelism for double precision FP arithmetic.

And with respect to 3D apps *requiring* double precision...

Most 3D rendering apps do not NEED double precision everywhere. They just need it in a few places, and often (if they really decide to look) they may find that there are more robust single-precision algorithms out there that would be just as good. In the end they should be using those algorithms anyway, because the speed benefits for SIMD are twice as good for single precision as they are for double precision.

Apps like that can get a lot more mileage out of the PowerPC if they just increase the amount of parallelism as much as possible in their data processing. Don't just take one square root at a time, do four etc. And this isn't even taking into account multiprocessing just yet or even AltiVec for that matter. The scalar units alone, by virtue of their pipelines, are capable of doing three to five operations simultaneously! However if you don't give them 3-5 things to do at every given moment, this power goes unused. Unfortunately, this can be noticed in quite a few Mac applications already on the market where performance doesn't seem to be as solid as it should be. What is baffling is why many Mac developers aren't taking advantage of this power. What it boils down to is that most of these apps just do one thing at a time (for the most part), and in turn are wasting 60-80% of the CPU cycles. That's a lot of waste. What's nice is that the AltiVec unit is also pipelined, so it is important to do a lot in parallel there too. The only problem is that developers actually have to make a conscious effort to use the processor the way it was designed to be used. ]]] - (Anonymous source)

<end snip>
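The lane arithmetic in the quoted answer can be written out as a few lines of Python. This is only a back-of-the-envelope sketch: the function names are invented here, and the throughput inputs (one vector op per cycle with a multiply-add for AltiVec, one two-double op every two cycles for SSE2) are the quoted source's claims, not measurements.

```python
REGISTER_BITS = 128  # vector register width discussed in the thread

def lanes(element_bits: int, register_bits: int = REGISTER_BITS) -> int:
    """How many elements of a given size fit in one vector register."""
    return register_bits // element_bits

def scaled_speedup(observed_sp: float, sp_lanes: int, dp_lanes: int) -> float:
    """Scale an observed single-precision speedup by the lane ratio."""
    return observed_sp * dp_lanes / sp_lanes

def peak_flops_per_cycle(n_lanes: int, flops_per_lane: int, ops_per_cycle: float) -> float:
    """Peak floating-point operations per cycle under the stated assumptions."""
    return n_lanes * flops_per_lane * ops_per_cycle

sp = lanes(32)   # 4 single-precision floats per register
dp = lanes(64)   # 2 doubles per register

low = scaled_speedup(3.0, sp, dp)    # 1.5
high = scaled_speedup(4.0, sp, dp)   # 2.0
wide_dp = lanes(64, 256)             # 4 doubles if the register were 256 bits wide

altivec_sp = peak_flops_per_cycle(sp, 2, 1.0)  # multiply-add counted as 2 flops
sse2_dp = peak_flops_per_cycle(dp, 1, 0.5)     # one vector op every two cycles

print(sp, dp, low, high, wide_dp, altivec_sp, sse2_dp)
```

The 1.5x to 2x range is just the observed 3x to 4x single-precision gain halved, which is why the source argues that only a 256-bit register would restore four-way parallelism for doubles.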

Anyway, I hope that cleared up a few things that have been on the minds of some Mac users. Again, I'm not an expert, but I do research these things. I ask experts and professors and PPC designers and programmers all in an attempt to gain a better understanding of what's really going on.

Some of you might find this bit of info interesting. Check it:

<a href="http://developer.apple.com/hardware/ve/acgresearch.html" target="_blank">http://developer.apple.com/hardware/ve/acgresearch.html</a>

The two titles to "zero-in" on are:

- Vector implementation of multiprecision arithmetic

- Octuple-precision floating-point on Apple G4

and:

102 - Mac OS X Performance Optimization with Velocity Engine
<a href="http://developer.apple.com/adctv/descriptions.html" target="_blank">http://developer.apple.com/adctv/descriptions.html</a>

Best

--
Ed M.
post #22 of 36
[quote]Originally posted by Ed M.:
<strong>


"To begin with Intel does not have a separate SIMD unit like PowerPC does. If you want to use MMX/SSE/SSE2 on a Pentium, you have to shut down the FPU. That is very expensive to do. As a work around, Intel has added Double precision to its SIMD so that people can do double precision math without having to restart the FPU. You can tell this is what they had in mind because they have a bunch of instructions in SSE2 that only operate on one of the two doubles in the vector. They are in effect using their vector engine as a scalar processing unit to avoid having to switch between the two. Their compilers will even recompile your scalar code to use the vector engine in this way because they avoid the switch penalty."
</strong><hr></blockquote>

Now, that's interesting.
post #23 of 36
blabla...

Keep reading... ;-)

--
Ed
post #24 of 36
The x86 need to "turn off" the FPU applies only to the MMX unit, I believe, and it is because the MMX unit actually uses the FPU's register "stack" as the 8 MMX registers. The SSE/SSE2 added their own set of 8 128-bit registers in addition, so they can be used at the same time as the FPU. The problem with the FPU (and the reason why you might want to use SSE2 for doubles instead) is that it is a hideous stack-register based design that was outdated in 1990! The PowerPC FPU programming model is hugely superior to the x86 model, and that is why both AMD and Intel are trying to replace it with something better.
post #25 of 36
[quote] I have a funny feeling that if they went SP their renderer would suddenly get legs, and the impact on accuracy would be negligible when all was said and done. I believe NewTek made the same discovery a while back. <hr></blockquote>

No. NewTek discovered that SP (and hence Altivec) could be useful for modeling and previewing, but not for final rendering.

[quote] And with respect to 3D apps *requiring* double precision...

Most 3D rendering apps do not NEED double precision everywhere. They just need it in a few places, and often (if they really decide to look) they may find that there are more robust single-precision algorithms out there that would be just as good. In the end they should be using those algorithms anyway, because the speed benefits for SIMD are twice as good for single precision than they are for double precision. <hr></blockquote>

No. While it's true that there are places where SP can be used (transformations, hidden surface removal, light intensity calculations, etc.), the lion's share of the calculations in a high quality render MUST be DP. If you tried to substitute SP for DP, you'd find that the precision error would quickly affect the render, which is VERY BAD. It could even get so bad as to eclipse the actual signal; it depends on the complexity of the render. In a scene with millions of polygons, dozens of lights, dozens (at least) of textures, and using radiosity with the ray bouncing a couple dozen times, calculating a single pixel gets extremely complex, requiring hundreds or thousands of operations (perhaps tens of thousands?). The precision error adds up QUICKLY, so it's best to keep it as small as possible from the get-go.
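How quickly rounding error piles up over a long chain of operations can be illustrated with a toy accumulation. This is not a renderer, just repeated addition, and it assumes single precision can be emulated in Python by rounding to an IEEE single after every add; `to_f32` and `accumulate` are names invented for this sketch.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def accumulate(step: float, count: int, single: bool) -> float:
    """Add `step` to a running total `count` times in the chosen precision."""
    if single:
        step = to_f32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_f32(total)  # emulate a single-precision accumulator
    return total

COUNT = 100_000
expected = 10_000.0  # 100,000 steps of 0.1 in exact arithmetic

err_single = abs(accumulate(0.1, COUNT, single=True) - expected)
err_double = abs(accumulate(0.1, COUNT, single=False) - expected)

print("single-precision error:", err_single)
print("double-precision error:", err_double)
```

Whether a given renderer actually behaves this badly depends on its algorithms, which is exactly the disagreement in this thread; the sketch only shows that the gap between the two precisions grows with the number of chained operations.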

[quote] The SSE/SSE2 added their own set of 8 128-bit registers in addition, so they can be used at the same time as the FPU. <hr></blockquote>

Hmmmm... I don't think that's quite right. IIRC, SSE reduced the overhead in switching from SSE to x87 compared to MMX (from 50 clock ticks to 1, or something like that), but it still uses the same register set, as does SSE2. With SSE2, it shouldn't make any difference; Intel designed SSE2 to completely replace x87, so there shouldn't be a reason to mix x87 and SSE2 code.

It's been a while since I've looked at this stuff, so I could just be remembering things wrong. It's happened before, who knows.
post #26 of 36
[quote]Originally posted by Gamblor:
<strong>Hmmmm... I don't think that's quite right. IIRC, SSE reduced the overhead in switching from SSE to x87 compared to MMX (from 50 clock ticks to 1, or something like that), but it still uses the same register set, as does SSE2. With SSE2, it shouldn't make any difference; Intel designed SSE2 to completely replace x87, so there shouldn't be a reason to mix x87 and SSE2 code.

It's been a while since I've looked at this stuff, so I could just be remembering things wrong. It's happened before, who knows.</strong><hr></blockquote>


<a href="http://x86.ddj.com/articles/sse_pt1/simd1.htm" target="_blank">http://x86.ddj.com/articles/sse_pt1/simd1.htm</a>

to quote:
"A major difference between MMX and SSE is that no new registers were defined for MMX, while eight new registers have been defined for SSE. Each of the registers for SSE is 128 bits long and can hold four single-precision floating-point numbers (each being 32 bits long). The arrangement of the floating-point numbers in the new data type handled by SSE is illustrated in Figure 1. "


The original Pentium MMX has a fairly hefty switch cost to/from MMX mode -- about 50-60 cycles, IIRC. Starting with the Pentium II, however, that cost was reduced to almost nothing. Nonetheless you can't really intermingle MMX and FPU instructions in the same way that the PowerPC can with FPU and VMX instructions. Even more important, the AltiVec unit handles both integer and float data types whereas MMX does integer and SSE does floating point. I think SSE2 addresses this a little, but still doesn't really compare to the AltiVec unit.
post #27 of 36
I was kinda right

From <a href="http://www.intel.com/support/processors/pentium4/sb/1059772772898408-prd483.htm" target="_blank">http://www.intel.com/support/processors/pentium4/sb/1059772772898408-prd483.htm</a>:

"Streaming SIMD Extensions 2 (SSE2) extends the MMX(TM) technology and SSE technology with the addition of 144 new instructions that deliver performance increases across a broad range of applications. The SIMD integer instructions introduced with MMX technology have been extended from 64 to 128 bits, doubling the effective execution rate of SIMD integer type operations.

New double-precision floating point SIMD instructions allow for two floating-point operations to be simultaneously executed in the SIMD format, providing support for double-precision operations that help accelerate content creation, financial, engineering, and scientific applications. "

So, basically, SSE2 extends both MMX & SSE. Since it's supposed to replace x87 code, Intel has resolved the conflict of sharing MMX with x87 registers by enhancing SSE to the point where it can effectively replace the x87 unit.

Or something like that.
post #28 of 36
[quote]Originally posted by Gamblor:
<strong>I was kinda right

From <a href="http://www.intel.com/support/processors/pentium4/sb/1059772772898408-prd483.htm" target="_blank">http://www.intel.com/support/processors/pentium4/sb/1059772772898408-prd483.htm</a>:

"Streaming SIMD Extensions 2 (SSE2) extends the MMX(TM) technology and SSE technology with the addition of 144 new instructions that deliver performance increases across a broad range of applications. The SIMD integer instructions introduced with MMX technology have been extended from 64 to 128 bits, doubling the effective execution rate of SIMD integer type operations.

New double-precision floating point SIMD instructions allow for two floating-point operations to be simultaneously executed in the SIMD format, providing support for double-precision operations that help accelerate content creation, financial, engineering, and scientific applications. "

So, basically, SSE2 extends both MMX & SSE. Since it's supposed to replace x87 code, Intel has resolved the conflict of sharing MMX with x87 registers by enhancing SSE to the point where it can effectively replace the x87 unit.

Or something like that. </strong><hr></blockquote>

Ah yes -- I didn't realize that SSE2 had expanded the MMX registers. So now they have 8 integer registers and 8 fpu registers? I suppose that's an improvement.
post #29 of 36
post #30 of 36
[quote]Originally posted by AirSluf:
Hmmm, now why do we ABSOLUTELY NEED double precision in renderers?<hr></blockquote>

Being neither a mathematician nor a computer programmer, I can only think of one good reason.

"Just because"

[ 08-07-2002: Message edited by: rickag ]
just waiting to be included in one of Apple's target markets.
Don't get me wrong, I like the flat panel iMac, actually own an iMac, and I like the Mac mini, but...........
post #31 of 36
[quote] But ALAS! Most programmers who have heard of quaternions are scared sh1tless of them because you have to understand the math first, and they don't want to make that time or effort. <hr></blockquote>

How do you know quaternions aren't used extensively in professional 3D renderers already? I don't imagine it would be something they'd advertise on the box.

(Certainly an interesting subject, though. It's been about a decade since I wrote any rendering code, and I don't recall learning about quaternions. In a quick Google search for them I stumbled across this web page: <a href="http://www.javaworld.com/javaworld/jw-08-1998/jw-08-step-p4.html" target="_blank">http://www.javaworld.com/javaworld/jw-08-1998/jw-08-step-p4.html</a>.
Notice what data type is used for all of the elements. I think I'm going to do some more digging on this. Thanks for the inadvertent tip!)
post #32 of 36
Amazing thread
Anyway, I grabbed this link from the yahoo AAPL board.

<a href="http://developer.apple.com/hardware/ve/pdf/oct3a.pdf" target="_blank">http://developer.apple.com/hardware/ve/pdf/oct3a.pdf</a>

It discusses how to implement a 256-bit FP (!!) library using Altivec. [Chilling]
post #33 of 36
post #34 of 36
Interestingly enough, my cursory search at Google on quaternions turned up the fact that 3DStudio MAX's file format supports them in some capacity. There's got to be something to that... If the code monkeys at Autodesk have caught on to them, then they must be in wide use all over the industry by now.

Thanks for the link! I'm going to be getting back into a bit of 3D programming in the next week or so (yay!), but I probably won't have a need for quaternions, which is a bummer (it'll just be a straightforward format converter, in this case VRML->Lightwave objects, along with some triangle decimation code, or whatever the latest equivalent is.)
post #35 of 36
I'm really surprised that quaternions don't have wider use. The math is actually very easy, which makes them so powerful. The alternative, Euler angles, are really a pain to deal with, and they also have singularities (tears) in the state space. Differential forms and Clifford algebras are other gadgets that are very powerful, simple to use, and much better than what is already in use, but not used much because people just aren't familiar with them.
post #36 of 36
i^2 = j^2 = k^2 = ijk = -1
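That identity is easy to check numerically. The sketch below assumes quaternions stored as plain (w, x, y, z) tuples and uses the Hamilton product; `qmul`, `conjugate` and `rotate` are illustrative names, not taken from any library mentioned in the thread.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def conjugate(q):
    """Negate the vector part; for a unit quaternion this is its inverse."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(vector, axis, angle):
    """Rotate a 3D vector about a unit axis by `angle` radians via q v q*."""
    half = angle / 2.0
    s = math.sin(half)
    q = (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)
    _, x, y, z = qmul(qmul(q, (0.0,) + tuple(vector)), conjugate(q))
    return (x, y, z)

I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

# i^2 = j^2 = k^2 = ijk = -1
print(qmul(I, I) == qmul(J, J) == qmul(K, K) == qmul(qmul(I, J), K) == MINUS_ONE)

# Quarter turn about the z axis takes the x axis to (approximately) the y axis.
print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))
```

A rotation here is two quaternion products with no trigonometry beyond building `q`, and there is no special case at any orientation, which is the contrast with Euler angles drawn in the previous post.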