Quote:
Originally Posted by
nht 
I don't think the 320M can do PhysX.
It should: the requirements are 32 cores and 256MB of VRAM, and the 320M has 48 cores. PhysX used to work on the 9400M before NVidia bumped up the requirements; I was able to run it with Unreal Tournament 3. It allows destructible environments in certain games:
http://www.youtube.com/watch?v=QD8tA1Vo-XU
The environment destruction in that video is more or less how it plays on the 320M.
Quote:
Originally Posted by
nht 
I also don't quite understand the contention that you can't do advanced shader techniques OpenGL via the OpenGL Shading Language. Graphics is not my area of specialty so perhaps you could explain what I'm missing. OpenCL is more open to general implementation but I would expect GLSL to be able to do any shader effect that a GPU is physically capable of doing.
GLSL seems to be similar to RSL (RenderMan), where shader code gets executed at certain stages of the graphics pipeline, e.g. per vertex or per pixel. That normally means you store intermediate data in predefined caches such as image buffers, and you have to use shadeops (which is roughly what OpenCL is like) to extend the capabilities. Some of the main advantages of OpenCL over GLSL given in the following SIGGRAPH paper are:
Scattered writes
Local memory
Thread synchronization
Atomic memory operations
http://sa09.idav.ucdavis.edu/docs/SA09_GL_interop.pdf
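To make the "scattered writes" and "atomic memory operations" points a bit more concrete, here is a rough CPU-side sketch in plain Python. It is only a toy model of the two styles, not real GLSL or OpenCL, and the function names are made up for illustration. A histogram is a handy example because each input element decides for itself which output slot to touch.

```python
# Toy CPU model of the two programming styles; not real GLSL or OpenCL.
# Function names are hypothetical and exist only for this illustration.

def histogram_gather(values, bins):
    """Shader-style: each output slot is computed on its own and may only
    write to its own location, so every slot re-reads the whole input."""
    return [sum(1 for v in values if v == b) for b in range(bins)]


def histogram_scatter(values, bins):
    """Compute-style: each input element writes to an address it works out
    itself. On a real GPU many threads would do this at once, so the
    increment would have to be an atomic operation."""
    out = [0] * bins
    for v in values:
        out[v] += 1  # scattered write to a data-dependent address
    return out


if __name__ == "__main__":
    data = [0, 1, 1, 3, 3, 3]
    print(histogram_gather(data, 4))
    print(histogram_scatter(data, 4))
```

Both give the same answer, but the gather version does bins x N reads while the scatter version does one pass over the input, which is the sort of thing that gets awkward when you can only write to a pre-specified location.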
From NVidia:
"What are the advantages of CUDA vs. graphics-based GPGPU?
CUDA is designed from the ground-up for efficient general purpose computation on GPUs. Developers can compile C for CUDA to avoid the tedious work of remapping their algorithms to graphics concepts.
CUDA exposes several hardware features that are not available via graphics APIs. The most significant of these is shared memory, which is a small (currently 16KB per multiprocessor) area of on-chip memory which can be accessed in parallel by blocks of threads. This allows caching of frequently used data and can provide large speedups over using textures to access data. Combined with a thread synchronization primitive, this allows cooperative parallel processing of on-chip data, greatly reducing the expensive off-chip bandwidth requirements of many parallel algorithms. This benefits a number of common applications such as linear algebra, Fast Fourier Transforms, and image processing filters.
Whereas fragment programs in graphics APIs are limited to outputting 32 floats (RGBA * 8 render targets) at a pre-specified location, CUDA supports scattered writes - i.e. an unlimited number of stores to any address. This enables many new algorithms that were not possible using graphics APIs to perform efficiently using CUDA.
Graphics APIs force developers to store data in textures, which requires packing long arrays into 2D textures. This is cumbersome and imposes extra addressing math. CUDA can perform loads from any address.
CUDA also offers highly optimized data transfers to and from the GPU."
http://forums.nvidia.com/index.php?s...0&#entry478583
The NVidia post does note that you don't write directly to the framebuffer, so when you do graphics operations you have to make sure you don't introduce overhead compared with GLSL.
OpenCL is also device agnostic, so the same code can target the CPU or the GPU, whichever is the better fit for the job.
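Going back to the points in the NVidia quote about "packing long arrays into 2D textures" and the 32-float output limit, here is a small Python sketch of the arithmetic involved. The helper names and the row-major layout are my own assumptions for illustration; the 8 render targets and 4 channels are the figures from the quote.

```python
# Sketch of the "extra addressing math" needed when a long 1D array is
# stored in a 2D texture. Row-major layout and helper names are assumptions.

def index_to_texel(i, width):
    """Map a linear array index to (x, y) in a texture `width` texels wide."""
    return i % width, i // width


def texel_to_index(x, y, width):
    """Inverse mapping: (x, y) texel coordinate back to the linear index."""
    return y * width + x


# Output limit from the quoted FAQ: RGBA channels times 8 render targets.
CHANNELS_PER_TARGET = 4
MAX_RENDER_TARGETS = 8
MAX_FLOATS_PER_FRAGMENT = CHANNELS_PER_TARGET * MAX_RENDER_TARGETS

if __name__ == "__main__":
    x, y = index_to_texel(10, 4)
    print(x, y, texel_to_index(x, y, 4), MAX_FLOATS_PER_FRAGMENT)
```

With CUDA or OpenCL you just index the buffer directly, so none of this conversion is needed.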
You're right though: GLSL is very powerful and can do complex effects. OpenCL is just a different design that lets you go beyond GLSL's capabilities. The following site has some experiments:
http://machinesdontcare.wordpress.co.../10/07/opencl/
He recently reached an interesting conclusion:
http://machinesdontcare.wordpress.co...asariley-glsl/
"Well, after all that messing around with OpenCL kernels, it turns out its much easier, and more importantly, much faster, to do it in GLSL."
The following demo also shows GLSL being used for physics computation and it seems to run faster than OpenCL:
http://www.youtube.com/watch?v=anNClcux4JQ
In the end though, being able to work without the limits of the graphics APIs is better, as it lets you do things like the GPU video encoding you see with Badaboom. That can't be done nearly as fast on the CPU and can't be done at all with GLSL.
Quote:
Originally Posted by
nht 
Yah, it's basically Intel grabbing low hanging fruit to say "See...that GPGPU stuff isn't worth the complexity". I think for some market segments that might be true.
Users of Motion, FCE, etc probably aren't going to be happy with any IGP. Users of iMovie on a MacBook probably aren't going to miss the OpenCL performance delta if the new MacBook is faster than the old MacBook.
Given that GLSL can handle the bulk of graphics computation, and fixed-function encoding hardware can encode popular codecs much faster than even GPGPU currently can, whether we need OpenCL at the low end is certainly a valid question. But still, Intel's GPUs have always shown a significant lack of OpenGL support and performance, so even GLSL etc. runs poorly vs NVidia/AMD.
Preview benchmarks of the Sandy Bridge GPU came out slower than the NVidia GPU we have now, so gamers take a step back while people using the CPU take a step up. Kind of a zero-sum game, because you just end up pissing off a different group of people who wonder why their upgrade is slower and less compatible with 3D apps and games.
I think Apple are going to have no choice but to go with i-series + dedicated. They can go with an i3 and it would be an improvement over C2D in terms of power consumption. It won't be a huge jump like an i5 but it differentiates it from the 15". I originally thought they'd do this just for the marketing:
i3 = 13", i5=15", i7=17"
I think it has to be cost more than anything that rules out the i5 + 330M. Sony put this combo in a 13" along with a Blu-Ray drive and quad RAID-0 SSD, but it was expensive (~$2000). Apple starts the MBP line much lower than that.
If they started $100 higher at $1299 and pushed out the 13" MBA but kept a similar design, that might work out ok. I think it would be the most sensible option going forward, giving consumers the best value while maintaining competitive hardware.