Intel Chipsets all the Way with OpenCL


Comments

  • Reply 101 of 123
    Marvin Posts: 15,326, moderator
    Quote:
    Originally Posted by dutch pear View Post


    So your argument is that hard disks are big enough and people wanting more space should just select and compress the stuff they want to keep on their computers. I just don't see the logic in that:



    Selecting takes time and effort, and you run the risk of throwing too much away and having regrets later.



    What about the regret that you have 2TB of junk that you don't have the time to sort and it takes ages to back up? Take physical home space. By the same argument, why aren't we all continually buying bigger and bigger houses? It's because there is a physical limit to how much useful stuff people tend to buy and collect. It applies differently in digital space but there are still limits to the amount of information that people generally collect.



    A more appropriate example is the human brain. Why isn't everyone constantly reading books about stuff that has no relevance to what we do day to day? Because we don't need to use the capacity of the human brain that way. Every day, we all selectively store the information that is most relevant to us. We all have the capacity to use more but we choose not to for the sake of information management. Stuff that we don't reference often is appropriately forgotten.



    This does not mean that people who use more are automatically storing irrelevant information, as there will be cases where exceeding 256GB is necessary, like huge amounts of raw scientific data. There are just a few examples that require that much space on the main drive.



    Quote:
    Originally Posted by dutch pear View Post


    No I won't compress it, as to me RAW is the digital equivalent of a negative, and once you compress that and throw away the original you can never go back and do it over again.



    I'm not advocating that there be a limit to storage. I'm saying that for most people, 256GB is enough so a switch to SSD wouldn't be a bad thing. People can buy standard HDs if they decide they need more space and SSD will grow in size over time.



    Quote:
    Originally Posted by mdriftmeyer


    I deploy on Linux and OS X and have on Windows. Please spare me the boasting.



    What boasting? You're just being very childish. I was telling you that I develop websites and deploy to Linux servers, which is what they are good for.



    Quote:
    Originally Posted by mdriftmeyer


    Having worked for two operating system companies, I know the way they operate and where they are being used.



    ^ this is boasting.



    Quote:
    Originally Posted by mdriftmeyer


    Spare me the armchair 95% Windows+OS X market share.



    Windows has peaked and it's all down hill from here.



    This is pretty much the typical Linux user stance but you need to look outside at what the real world is doing. Linux is very popular in server space but is hardly used on the desktop end at all. Just wanting it to be different doesn't make it so.



    Linux needs vendor hardware and software support. Until it gets it, users will be put off migrating to it. Until users do this, vendors don't see an urgency for the extra hardware and software support. Linux is stuck in a catch 22.



    Quote:
    Originally Posted by Hiro


    Take your hoity toity you're all f___ed up because you don't archive to non-interactive formats and get the hell off my wavelength.



    I'm not sure what that wavelength is but I can assure you I'm nowhere near it. What do you mean by non-interactive format? Lossless and lossy compressions are all interactive. I do archive to lossless when I need to but I don't keep these on my working drive and I haven't said it's f'd up to do so, simply that it isn't the best use of space.



    That's just plain logic. If you have an event that took place years ago and wasn't important, you could choose to keep the images in RAW format until the end of time but chances are you will never look at them again in your lifetime.



    Quote:
    Originally Posted by Hiro


    You just keep coming up with more and more ridiculous ways to try and maintain some sort of artificial size limit on what others are allowed to have on hard drives before they become The Unwashed.



    I have never said that you would only ever be allowed to store some artificial limit. I am pointing out that most computer users get by just fine with around 256GB of storage therefore a mainstream switch to SSD in the near future is not a problem.



    Quote:
    Originally Posted by Hiro


    I once had a boss in my young-un days who said "If you are in a room and everyone else say's your s___ is all f___ed up, you better go take a damn good look at it, because it awful damn unlikely they are all wrong."



    Well the chance all of us are wrong about the usefulness of hard drive sizes approached zero asymptotically a long time ago. Stop fearing the TeraByte!



    Did this boss ever tell you to listen thoroughly to what people say before spinning off on some wild tangent? I'll clarify things a bit. I didn't say larger hard drives are not useful but large hard drives for most people are not necessary. This does not mean that 1.75TB out of 2TB of space is of no use. It means that most people would have only needed to use 256GB or thereabouts.



    Typical usage where I work is that everyone except the film department uses under 150GB. The film dept use HDV and have maybe 5-10 projects at a time and they get by with 300GB scratch disks. A full HDV film is only about 20GB (same as DV). Once they are done, they get archived to tape or a 1TB hard drive.



    You may say aha, so you do need more than 256GB. Of course, but this is for archival use. Our 1TB drives are hardly ever switched on because it is only very rarely that clients come back to us and ask us to continue working on a job that has been invoiced for and stored. I think it's only ever happened once or twice in a number of years.



    But again, it's not about a size, the point is about the storage medium. If flash goes to 1TB then there's no problem and it's faster storage. Why would you continue down the route of mechanical storage simply in the interests of capacity? There will always be a limit on how fast a mechanical drive head can move across a disk platter and how fast it can spin.



    By saying that drive space is always needed, therefore hard drives will always be better, you are implying that faster drive performance is unnecessary. Hard drives are the biggest bottlenecks in your machine. Most of the time the beach ball appears, it's because your drive isn't fast enough to handle your VM paging.



    All I'm saying is that I believe SSD is the future of storage and it is now at a point where the capacity is not a major problem. I have no doubt that I'll be using hard drives for the next 2-3 years for archival purposes but I could buy that 128GB SSD I posted a link to tomorrow and be very content with my main drive space and performance and I bet most computer users would too.
  • Reply 102 of 123
    wizard69 Posts: 13,377, member
    Quote:
    Originally Posted by Marvin View Post


    That's hardly typical usage and the storage isn't well managed. My brother is a keen photographer and still has under 20GB of photos.



    The first thing I have to offer is that you simply don't know what counts as acceptable disk management for anyone else.



    As to your brother, sadly he must not be much of a photographer if he only has 20GB of photos. That is not much at all when you look at modern cameras outputting RAW files from high-megapixel-count sensors. Even if your goal is the replacement of 35mm film, you still need to hit 15 megapixels for approximately equal resolution.

    Quote:

    Old events are archived to DVD. 20GB is enough to store 80,000 photos compressed to 250k.



    Who in the hell would compress picture data to that extent and call it an archive? In any event, a single DVD is not an archiving program; you would be using multiple DVDs and hopefully monitoring their aging. Further, archiving isn't the issue here; online access wherever your laptop is, is.

    Quote:

    Do you know how long it takes to look through even 1,000 images? Keeping 80,000 images, or insanely high resolution images, on a main drive is just unnecessary and a waste of space.



    Buy a good piece of software and set up a slide show. In any event, any reasonable person groups their pictures in a manner that makes sense to them, which, by the way, isn't any different from film-based management: some have shoe boxes and others have fully cataloged file cabinets.



    As to high resolution, how else would you keep your data on disk? RAW files are in effect the negative from which you generate your working files. They are necessary and certainly can't be considered wasted space. Furthermore, one project could quickly generate 20GB of data.



    What I believe your problem is, is putting too little value on your data and the handling of it. People search for and buy large hard drives simply because it improves the utility of their machine and simplifies workflow, no matter what activity they may be focused on.

    Quote:



    Your kid producing GBs of animation isn't common among kids and renders to uncompressed formats usually aren't archived - you archive the final render with composites and the original scene files, then if you need to go back to them in future, you can render them again or adjust a single uncompressed render.



    Yeah, like Hollywood re-renders a movie every time they want to generate a video tape. Or say you are at a customer's site and they want to see a CAD rendering from a project a year or two ago. What are you going to do, say: sure, just let me regen that, it should only take half a day on a laptop?



    There are any number of reasons why you would want to keep the final data around; what you suggest is not viable in all businesses. Further, you expose yourself to the problem of software updates leading to regressions in output.

    Quote:



    I doubt those animations will be longer than 5 minutes (if it's 3D, 5 minutes of animation would take about 15-30 hours to render) and 5 minutes anamorphic DVD uncompressed is under 4GB (depends on the content of course).



    Now you are simply throwing out what would best be described as wild-ass guesses about a third party's usage and data generation. The above is simply useless in this discussion.

    Quote:



    The 120GB extra was a figure pulled out of nowhere. I would highly doubt that your entire family will top 50GB of music combined. If so, the collection needs sorting because that's about 15,000 tracks. The OS size will be reduced and is currently only 15GB.



    And just what would sorting do for them? Assuming they invested in each of those tracks, would it be wise to throw them out simply because they need more space? More so, what if somebody throws out a preferred track that somebody else wanted?

    Quote:



    Other stuff, we're talking about small movie clips, word documents etc - combined about 20GB. This leaves plenty of overhead.



    256GB - 15GB OS - 100GB photos - 10GB software - 40GB music - 30GB uncompressed renders - 20GB miscellaneous files leaves 40GB of free space and that is still fairly badly managed storage.



    How can you imply that it is badly managed? Have you seen the directory and partition layout, do you know what a partition is? Are you really sure about that 10GB for software or are you one of the people that never install any apps on their PC?

    Quote:



    I could understand if people want to run multiple systems and they want storage split between them but even at that, the space required in total won't exceed 256GB for most people.



    Maybe what we need to do is to simply say you don't know what you are talking about. The phrase "most people" kinda highlights that. Do you really believe that everybody does and should use their PC in the same way?

    Quote:



    If 256GB is too low, go to 512GB, which will be available next year in SSD. SSD will increase in space too. I just don't see a point where storage is needed to keep on increasing. The popularity of Time Machine shows people are more concerned about data safety and SSD is more reliable.



    Well we could discuss the reliability of SSDs at length, but I believe everybody can agree they need backing up just as much as anything else.



    But what I find really perplexing is your statement about there being no need for storage to keep on increasing!!! I just can't fathom how you came to that conclusion when one of the biggest issues people have is the storage of all their data.

    Quote:





    You clearly want to get into a whole Linux thing here but hardly any production software runs natively on Linux. Tell me an industry standard NLE like Final Cut or Avid that runs natively on Linux:



    Again, this is a stark indication that you are pouring forth gibberish not based on any facts. For example, what OS do you think Hollywood uses for most of their render farms? On top of that, do you believe that all production streams require Final Cut or Avid? But to pull this part of the thread somewhat back on course, how much disk storage space do you think those Hollywood render farms use?

    Quote:



    http://community.avid.com/forums/t/5...px?PageIndex=1



    How can it be used in production for this field when it doesn't run the software? Go to any store and note down all the boxes that have Linux support written on them.



    Go to the store to buy Linux software - that is a good one. Better yet you should get a grip on the idea that you might not be going to the store to buy any software for any OS in the future.

    Quote:



    When I see demos of Linux window managers, they are often concerned with trying to make their interfaces like aqua. GTK+ is only going to bring native software like GIMP, which falls way short of Photoshop. Trolltech Qt has been on OS X for ages.



    Obviously you don't know Linux or have been corrupted by Ubuntu.

    Quote:





    These companies didn't 'get smart'. They caved into pressure from people like you who demand that Linux be recognised as a viable alternative. They don't even advertise it clearly that they offer it as an option. It's not listed beside XP or Vista in BTO because people will see it as equal to Windows, buy it and then flood the seller with complaints saying that hardly any software works properly with it.



    Very few OEMs sell it because it's not a good OS for desktop use. Machines like the EEEPC are different because it's a cut down version of an OS like the gOS Google Linux system.



    It is interesting here that you try to imply that a stripped down OS offers more than a full install.



    It should be noted here that I ran Linux on a desktop machine for years and never had an issue with usability. That was running Fedora, which is a bit bleeding edge. My whole interest in the Mac revolves around two things: one was access to iTunes, the other was the coming iPhone.



    Now what is really neat is that the Mac OS runs many of the programs I used on the Linux desktop machine. So do you think my MBP is less of a machine, for the desktop, because it can run open source software? Software is like a pair of shoes: either it fits your needs or it leaves you in pain until you find a new pair.

    Quote:







    Ok so Windows + OS X has about 95% of the desktop market. If you're telling me that website developers take up that other 5% (if it is even that much), I'm not the one who's full of it.







    In server space not desktop space. Linux is great for server use and I deploy websites on Linux servers, I wouldn't even consider using Windows servers.



    Now isn't that perspective as valuable as the ones from people saying they have had great success with Linux on the desktop?



    I know there are a lot of responses to your postings, mostly negative, so I wonder: doesn't this make you think a bit about the display you are putting on?



    Dave
  • Reply 103 of 123
    wizard69 Posts: 13,377, member
    Quote:
    Originally Posted by Marvin View Post


    What about the regret that you have 2TB of junk that you don't have the time to sort and it takes ages to back up? Take physical home space. By the same argument, why aren't we all continually buying bigger and bigger houses?



    That has pretty much been the pattern in the USA for some time. Buy a starter home, hatch a few kids, move to a bigger house, and rinse and repeat until no more kids. If you are lucky and have done your homework right, when the kids are gone you have a big sale, buy a sailboat and wander the oceans, never to be seen again.

    Quote:

    It's because there is a physical limit to how much useful stuff people tend to buy and collect. It applies differently in digital space but there are still limits to the amount of information that people generally collect.



    Sure it is different in the digital space in one sense, but not in the issue of how people value things. Try telling a woman that the pics of her first child's first birthday aren't valuable. No, really, try it, as that would certainly remove you from this thread for a while.

    Quote:



    A more appropriate example is the human brain. Why isn't everyone constantly reading books about stuff that has no relevance to what we do day to day?



    Actually I see this every day at work, in the parks and other places. This might surprise you: some people actually read books more than once.

    Quote:

    Because we don't need to use the capacity of the human brain that way. Every day, we all selectively store the information that is most relevant to us. We all have the capacity to use more but we choose not to for the sake of information management. Stuff that we don't reference often is appropriately forgotten.



    Ahh yes, I'm past forty so I know all about forgetting things.



    The difference with the computer and its hard drive is that when you need to reference that data there are tools there to let you find it. At least with modern OS'es there are. With the human brain you may never find the data again.

    Quote:



    This does not mean that people who use more are automatically storing irrelevant information as there will be cases where exceeding 256GB is necessary like huge amounts of raw scientific data. There are just few examples that require that much space in the main drive.



    So you say. What many of us are saying here, though, is that you don't seem to grasp that needs vary greatly, and one person's storage needs can differ widely from another's.

    Quote:





    I'm not advocating that there be a limit to storage. I'm saying that for most people, 256GB is enough so a switch to SSD wouldn't be a bad thing. People can buy standard HDs if they decide they need more space and SSD will grow in size over time.



    It might not be a bad thing, but what is fairly obvious to me is that SSDs right now are too far behind on the capacity/price metric for consideration. Right now there is more importance in capacity than in just about anything else.

    Quote:





    This is pretty much the typical Linux user stance but you need to look outside at what the real world is doing. Linux is very popular in server space but is hardly used on the desktop end at all. Just wanting it to be different doesn't make it so.



    Yes but expressing the above doesn't make it so either.

    Quote:



    Linux needs vendor hardware and software support. Until it gets it, users will be put off migrating to it. Until users do this, vendors don't see an urgency for the extra hardware and software support. Linux is stuck in a catch 22.



    Isn't software a part of a platform's buying decision?

    Quote:







    That's just plain logic. If you have an event that took place years ago and wasn't important, you could choose to keep the images in RAW format until the end of time but chances are you will never look at them again in your life time.



    It doesn't really matter how long, as long as they are still there.

    Quote:





    .







    Dave
  • Reply 104 of 123
    mjteix Posts: 563, member
    Quote:
    Originally Posted by Marvin View Post


    .



    I'm lost, what was the subject of the thread?

  • Reply 105 of 123
    mdriftmeyer Posts: 7,503, member
    Quote:
    Originally Posted by mjteix View Post


    I'm lost, what was the subject of the thread?





    Exactly. OpenCL is going to create innovation across OS X, Linux and for those CAD companies who care about OpenGL for Windows, even on Vista.



    The best will come from Apple.
  • Reply 106 of 123
    Quote:
    Originally Posted by mdriftmeyer View Post


    Exactly. OpenCL is going to create innovation across OS X, Linux and for those CAD companies who care about OpenGL for Windows, even on Vista.



    The best will come from Apple.



    Since you mention Windows, I should point out that Microsoft will be pushing its own GPGPU API as part of DX11. That and OpenCL will effectively shut down CUDA and whatever ATI's proprietary plan is.
  • Reply 107 of 123
    programmer Posts: 3,458, member
    Quote:
    Originally Posted by FuturePastNow View Post


    Since you mention Windows, I should point out that Microsoft will be pushing its own GPGPU API as part of DX11. That and OpenCL will effectively shut down CUDA and whatever ATI's proprietary plan is.



    Which is a good thing because the proprietary solutions aren't good for the industry. The trouble arises when standards are run by committee. Microsoft has the advantage there with DX which has been evolving rapidly and driving GPU development. OpenGL has been playing a slow game of catch-up. OpenCL may succeed because Apple is offering it almost fait accompli to the standards committee and isn't likely to allow whatever the committee does to derail them shipping it in Snow Leopard. The DX11 compute shader stuff looks somewhat more limited than CUDA (and OpenCL if that resembles CUDA as much as is implied).
  • Reply 108 of 123
    mdriftmeyer Posts: 7,503, member
    Quote:
    Originally Posted by FuturePastNow View Post


    Since you mention Windows, I should point out that Microsoft will be pushing its own GPGPU API as part of DX11. That and OpenCL will effectively shut down CUDA and whatever ATI's proprietary plan is.



    Sorry to burst your bubble, but ATI is moving Openstreams to OpenCL compliance; in engineering speak, they are dropping their proprietary solution for OpenCL.



    http://www.amd.com/us-en/Corporate/V...127451,00.html



    I suspect they will offer quite a solution for Windows and OpenGL 3.x with OpenCL.
  • Reply 109 of 123
    Marvin Posts: 15,326, moderator
    Quote:
    Originally Posted by wizard69 View Post


    Yeah, like Hollywood re-renders a movie every time they want to generate a video tape. Or say you are at a customer's site and they want to see a CAD rendering from a project a year or two ago. What are you going to do, say: sure, just let me regen that, it should only take half a day on a laptop?



    In future it will be possible but that's not what I said. You picked that part out of the sentence to make an argument.



    Quote:
    Originally Posted by wizard69 View Post


    Now you are simply throwing out what would best be described as wild-ass guesses about a third party's usage and data generation. The above is simply useless in this discussion.



    So has everyone else.



    Quote:
    Originally Posted by wizard69 View Post


    For example what OS do you think hollywood uses for most of their render farms?



    Those aren't desktops.



    Quote:
    Originally Posted by wizard69 View Post


    On top of that do you believe that all production streams require Final Cut or AVID?



    You can use Premiere but that's not on Linux either. But yes I believe all film production workflows require film cutting and editing. Kinda obvious really.



    Quote:
    Originally Posted by wizard69 View Post


    But to pull this part of the thread somewhat back on course, how much disk storage space do you think those Hollywood render farms use?



    Maybe read the topic of the thread or maybe just read in general because I didn't put a cap on storage either. But what you'll find is that the render farm storage isn't that much. The assets are retrieved over a high bandwidth network - they even have high bandwidth networks between offices over huge geographic distances.



    http://www.cgw.com/ME2/dirmod.asp?si...8FFFD86E9E8B5A



    "The data comes from a model farm that's typically 3TB to 4TB in size."



    "With Cars, because we were doing raytracing, the number of reads needed to calculate a frame increased dramatically"

    (need higher performance over storage)



    "The data needed for the renderfarm to do its work at any point in time is usually between 100GB and 200GB."



    They have a 3,000-CPU farm. Put even 128GB in each and you get 384TB of high-performance space.



    Quote:
    Originally Posted by wizard69


    It might not be a bad thing but what is fairly obvious to me is that SSD right now are too far behind on the capacity / price metric for consideration.



    So you've changed your mind completely since post 79?



    Quote:
    Originally Posted by mjteix


    I'm lost, what was the subject of the thread?



    Why are you asking me? wizard69 was the first to mention SSD and mdriftmeyer went off on some Linux thing.



    SSD was about what Apple would take a hit on margins for, which vaguely ties in with whether they will spend it on better GPUs or specialized processors so I guess it had implications for the actual thread topic. I think in the interests of 10.6, the latter two are far more likely.



    Quote:
    Originally Posted by Programmer


    OpenCL may succeed because Apple is offering it almost fait accompli to the standards committee and isn't likely to allow whatever the committee does to derail them shipping it in Snow Leopard.



    My worry is what happens after that. Does Apple still control the developments in OpenCL, kind of like how they push forward WebKit, or will it slow down to the pace of OpenGL? I think it will probably differ from OpenGL by forcing GPU makers to comply with the language instead of adding extensions to the language to support newer hardware features.



    CUDA won't give any advantage as OpenCL uses the same interface as CUDA on Nvidia chips so OpenCL should replace it eventually. I'm still interested in how this impacts OpenGL though. Will it be the case where the development progress won't be slow any more as new features on GPUs can simply be supported via OpenCL?



    This is what happens in Leopard with the LLVM except the code is executed on the CPU. In some ways this will mean OpenGL development will be slower but it won't be a bad thing. It will allow OpenGL to be used how it was intended, as a graphics library not a computation library.



    This makes it much easier to improve graphics quality, which is the point behind OpenGL development, as shaders can be programmed much more openly to allow a wider variety of techniques for faking certain effects such as refraction, light dispersion, occlusion, caustics and subsurface scattering (SSS).



    I just hope the distinction is made clearly enough so that people know when to use which language. A lot of this stuff can be done now with OpenGL code. Will OpenCL code simply not work on older graphics hardware or go so slow as to be unusable on the CPU?
  • Reply 110 of 123
    Quote:
    Originally Posted by Marvin View Post


    In future it will be possible but that's not what I said. You picked that part out of the sentence to make an argument.







    So has everyone else.







    Those aren't desktops.







    You can use Premiere but that's not on Linux either. But yes I believe all film production workflows require film cutting and editing. Kinda obvious really.







    Maybe read the topic of the thread or maybe just read in general because I didn't put a cap on storage either. But what you'll find is that the render farm storage isn't that much. The assets are retrieved over a high bandwidth network - they even have high bandwidth networks between offices over huge geographic distances.



    http://www.cgw.com/ME2/dirmod.asp?si...8FFFD86E9E8B5A



    "The data comes from a model farm that's typically 3TB to 4TB in size."



    "With Cars, because we were doing raytracing, the number of reads needed to calculate a frame increased dramatically"

    (need higher performance over storage)



    "The data needed for the renderfarm to do its work at any point in time is usually between 100GB and 200GB."



    They have a 3,000 CPU farm. Put even 128GB in each and you get high performance 384TB of space.







    So you've changed your mind completely since post 79?







    Why are you asking me? wizard69 was the first to mention SSD and mdriftmeyner went off on some Linux thing.



    SSD was about what Apple would take a hit on margins for, which vaguely ties in with whether they will spend it on better GPUs or specialized processors so I guess it had implications for the actual thread topic. I think in the interests of 10.6, the latter two are far more likely.







    My worry is what happens after that. Does Apple still control the developments in OpenCL kind of like how they push forward webkit or will it slow down to the pace of OpenGL. I think it will probably differ from OpenGL by forcing GPU makers to comply with the language instead of adding extensions to the language to support newer hardware features.



    CUDA won't give any advantage as OpenCL uses the same interface as CUDA on Nvidia chips so OpenCL should replace it eventually. I'm still interested in how this impacts OpenGL though. Will it be the case where the development progress won't be slow any more as new features on GPUs can simply be supported via OpenCL?



    This is what happens in Leopard with the LLVM except the code is executed on the CPU. In some ways this will mean OpenGL development will be slower but it won't be a bad thing. It will allow OpenGL to be used how it was intended, as a graphics library not a computation library.



    This makes it much easier to improve graphics quality, which is the point behind OpenGL development as shaders can be programmed much more openly to allow a wider variety of techniques for faking certain effects such as refraction, light dispersion, occlusion, caustics, SSS.



    I just hope the distinction is made clearly enough so that people know when to use which language. A lot of this stuff can be done now with OpenGL code. Will OpenCL code simply not work on older graphics hardware or go so slow as to be unusable on the CPU?



    Apple pushes OpenCL and brings a rich standard that gets certified, endorsed and implemented by the OpenGL consortium [Nvidia, AMD, et al.].



    Apple then continues to extend OpenCL for its own needs, and whether or not these extensions get endorsed by the greater community is up to them. Meanwhile, Apple's chipsets that leverage these features on Apple hardware make their solutions even more compelling.
  • Reply 111 of 123
    Marvin Posts: 15,326, moderator
    Quote:
    Originally Posted by mdriftmeyer View Post


    Apple then continues to extend OpenCL for its own needs, and whether or not these extensions get endorsed by the greater community is up to them. Meanwhile, Apple's chipsets that leverage these features on Apple hardware make their solutions even more compelling.



    Problem is, will software developers use those Apple-specific changes? Multi-threaded OpenGL was supposed to be some big thing a while ago and I've only heard of Blizzard using it in World of Warcraft.
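
    For context, opting a Mac app into the multithreaded GL engine is a very small code change. A minimal sketch, assuming the CGL call Apple documents for this (kCGLCEMPEngine), which suggests the slow uptake is more about testing effort than porting effort:

    // Minimal sketch: ask for Apple's multithreaded OpenGL engine on the current context.
    #include <OpenGL/OpenGL.h>
    #include <stdio.h>

    static void enable_multithreaded_gl(void)
    {
        CGLContextObj ctx = CGLGetCurrentContext();    // assumes a GL context already exists
        if (ctx == NULL)
            return;

        CGLError err = CGLEnable(ctx, kCGLCEMPEngine); // opt in to the multithreaded engine
        if (err != kCGLNoError)
            printf("Multithreaded GL engine not available (error %d)\n", (int)err);
    }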



    According to the OpenCL brief from one of the presenters at Siggraph, OpenCL will be approachable but targeted at expert developers - no convenience functions - although they then go on to say that it will have a rich set of built-in functions.
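
    To give a flavour of what "built-in functions" means, here is a minimal kernel sketch; the names (get_global_id, dot, fmax) are taken from the OpenCL C spec as it later shipped, so treat the details as assumptions rather than something from the Siggraph brief:

    // Sketch: per-element diffuse term, leaning only on built-in work-item and math functions.
    __kernel void diffuse(__global const float4 *normals,
                          __global float *out,
                          float4 light_dir)
    {
        size_t i = get_global_id(0);                     // built-in work-item index
        out[i] = fmax(dot(normals[i], light_dir), 0.0f); // built-in geometric/math functions
    }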



    Then we have to add DX11 into the mix unfortunately. Will supporting Microsoft's compute shaders make porting this code more difficult - similar issue porting SSE to AltiVec or vice versa ( http://developer.apple.com/documenta...section_1.html )? They released presentations on their compute shader stuff for DX11 4 days ago:



    http://www.microsoft.com/downloads/d...DisplayLang=en



    They also want to jump on the multithreading bandwagon:



    http://www.microsoft.com/downloads/d...displaylang=en



    The first presentation talks about target uses being raytracing, radiosity, physics, AI. It supports tessellation/displacement too.



    This is where it gets interesting because developers seem to be fairly unhappy with the OpenGL 3 release compared to DX:



    http://www.gamedev.net/community/for...opic_id=504547

    http://www.opengl.org/discussion_boa...243193&fpart=1

    http://www.intology.com/computers-in...-with-directx/



    Khronos says:



    "Just as importantly, OpenGL 3.0 sets the stage for a revolution to come ? we now have the roadmap machinery and momentum in place to rapidly and reliably develop OpenGL ? and are working closely with OpenCL to ensure that OpenGL plays a pivotal role in the ongoing revolution in programmable visual computing."



    So OpenCL isn't planned as an addition to OpenGL but rather an integral part of its development, in order to rival what DirectX has without breaking backwards compatibility (for all the CAD people).



    As always it will be down to the implementation and how it is used by developers that decides how successful the strategy is. What happens if Microsoft makes a more accessible API though? Just look what's happened with the XBox 360.



    They need something that's extensible, well supported, powerful and easily accessible. Microsoft can satisfy more of these requirements because they define what they need. With so many members contributing to and demanding from OpenGL, it's a case of too many cooks spoiling the broth.



    With Apple defining OpenCL, they can improve on this and OpenGL can remain fairly static. The key I think is in how closely the two will work together. We could be left with an API for graphics that stagnates and an API for GPGPU computing few people will use. Meanwhile, DirectX with a more unified approach will still get more advanced features and graphics. I see no suggestion that DX11 compute shaders will be more limited than OpenCL.



    I don't ever want to see a day when Windows apps run significantly faster than the Mac equivalent. DirectX 11 should be out late 2009 and OpenCL in Summer 2009. The problem I see is OpenCL will likely be accessible on Macs only around that time and yet again the Windows marketshare opens the doors to more developers for DX11. It will take time for both to be adopted but it would be nice if Apple could somehow get a more significant head start.
  • Reply 112 of 123
    programmer Posts: 3,458, member
    Quote:
    Originally Posted by Marvin View Post


    I see no suggestion that DX11 compute shaders will be more limited than OpenCL.



    They are much more limited.
  • Reply 113 of 123
    Quote:
    Originally Posted by Marvin View Post


    Problem is, will software developers use those Apple-specific changes? Multi-threaded OpenGL was supposed to be some big thing a while ago and I've only heard of Blizzard using it in World of Warcraft.



    According to the OpenCL brief from one of the presenters at siggraph, OpenCL will be approachable but targeted at expert developers - no convenience functions. Although they then go on to say that it will have a rich set of built-in functions.



    Then we have to add DX11 into the mix unfortunately. Will supporting Microsoft's compute shaders make porting this code more difficult - similar issue porting SSE to AltiVec or vice versa ( http://developer.apple.com/documenta...section_1.html )? They released presentations on their compute shader stuff for DX11 4 days ago:



    http://www.microsoft.com/downloads/d...DisplayLang=en



    They also want to jump on the multithreading bandwagon:



    http://www.microsoft.com/downloads/d...displaylang=en



    The first presentation talks about target uses being raytracing, radiosity, physics, AI. It supports tessellation/displacement too.



    This is where it gets interesting because developers seem to be fairly unhappy with the OpenGL 3 release comparing to DX:



    http://www.gamedev.net/community/for...opic_id=504547

    http://www.opengl.org/discussion_boa...243193&fpart=1

    http://www.intology.com/computers-in...-with-directx/



    Khronos says:



    "Just as importantly, OpenGL 3.0 sets the stage for a revolution to come – we now have the roadmap machinery and momentum in place to rapidly and reliably develop OpenGL – and are working closely with OpenCL to ensure that OpenGL plays a pivotal role in the ongoing revolution in programmable visual computing."



    So OpenCL isn't planned as an addition to OpenGL but rather an integral part of its development in order to rival what DirectX have without breaking backwards compatibility (for all the CAD people).



    As always it will be down to the implementation and how it is used by developers that decides how successful the strategy is. What happens if Microsoft makes a more accessible API though? Just look what's happened with the XBox 360.



    They need something that's extensible, well supported, powerful and easily accessible. Microsoft can satisfy more of these requirements because they define what they need. With so many members contributing to and demanding from OpenGL, it's a case of too many cooks spoiling the broth.



    With Apple defining OpenCL, they can improve on this and OpenGL can remain fairly static. The key I think is in how closely the two will work together. We could be left with an API for graphics that stagnates and an API for GPGPU computing few people will use. Meanwhile, DirectX with a more unified approach will still get more advanced features and graphics. I see no suggestion that DX11 compute shaders will be more limited than OpenCL.



    I don't ever want to see a day when Windows apps run significantly faster than the Mac equivalent. DirectX 11 should be out late 2009 and OpenCL in Summer 2009. The problem I see is OpenCL will likely be accessible on Macs only around that time and yet again the Windows marketshare opens the doors to more developers for DX11. It will take time for both to be adopted but it would be nice if Apple could somehow get a more significant head start.



    I'm betting that Apple providing a rich programming environment for Parallel Computing, combined with OpenGL and leveraging ALL CORES in 10.6 will definitely be the kick in the ass developers have wanted for a long time.



    No one wants to invest resources to later see Apple go in a completely different direction on such a time intensive investment as Parallel Computing.



    OpenCL will be at least 3 quarters ahead of DirectX 11 and OpenCL has been a project for several years at Apple. It's already mature. This isn't an idea that was brought to the table and quickly snatched up by the OpenGL Consortium. It's a complete solution, ready to go. This maturity is what makes it a no-brainer for AMD and Nvidia to migrate to it, while I see Nvidia taking longer to do so.



    I also don't see CUDA disappearing but being adapted to work with OpenCL.
  • Reply 114 of 123
    wizard69 Posts: 13,377, member
    Quote:
    Originally Posted by mdriftmeyer View Post


    I'm betting that Apple providing a rich programming environment for Parallel Computing, combined with OpenGL and leveraging ALL CORES in 10.6 will definitely be the kick in the ass developers have wanted for a long time.



    No one wants to invest resources to later see Apple go in a completely different direction on such a time intensive investment as Parallel Computing.



    OpenCL will be at least 3 quarters ahead of DirectX 11 and OpenCL has been a project for several years at Apple. It's already mature. This isn't an idea that was brought to the table and quickly snatched up by the OpenGL Consortium. It's a complete solution, ready to go. This maturity is what makes it a no-brainer for AMD and Nvidia to migrate to it, while I see Nvidia taking longer to do so.



    I also don't see CUDA disappearing but being adapted to work with OpenCL.



    I guess my question is which cores: the ones in the video card, the main CPU complex, or something new. You see, this thread is interesting because of certain things Apple has said recently, namely about system performance and the impact on financials. Maybe all of that concern expressed by Apple is related to the iPod family - who knows. What I want to know is if there is a tie-in between the warnings and OpenCL.



    Frankly I just see the GPU as a poor place for Apple to rely upon for competitive power, the simple reason being that everybody has access to the GPU. Sure, if they get a jump on MS they get a bit of an advantage, but MS will catch up and the software vendors have an alternative already. So is OpenCL really the part of the equation that addresses the alleged performance increases?



    That is why I tuned into the speculation about Apple introducing its own vector hardware. Further, because so little has been heard about such a possibility and it is a low probability, I've thrown out other ideas that may impact earnings in the next quarter. In a nutshell, I'm not convinced that OpenCL is the big dog without some hardware wagging its tail.



    Especially considering my impression that OpenCL is a ways off.



    What can I say, what Apple says and what seems possible short term just leaves me wondering what is up at Apple.
  • Reply 115 of 123
    Marvin Posts: 15,326, moderator
    Quote:
    Originally Posted by mdriftmeyer View Post


    This maturity is what makes it a no-brainer for AMD and Nvidia to migrate to it, while I see Nvidia taking longer to do so.



    I also don't see CUDA disappearing but being adapted to work with OpenCL.



    According to an admin on the AMD forum, the current AMD APIs won't be deprecated; DX11 and OpenCL are "simply additional programming entry points for developers for whom it makes more sense":



    http://forums.amd.com/forum/messagev...threadid=98534



    This sounds like what NVidia plan too. The Nvidia guy said that OpenCL uses the CUDA driver stack, which implies their CUDA software will continue but OpenCL will simply use it as a bridge to the hardware.



    Here's how I imagine it will look:



    High level code

    ---

    compiled code using LLVM

    ---

    CPU --- CUDA interface to NVidia hardware --- CAL interface to AMD hardware



    At the lowest step, LLVM decides which interface to use so it runs best on each piece of hardware but the NVidia and AMD SDKs will give you direct access to one piece of hardware and it won't be portable. OpenCL is the obvious choice but I can see it being a C vs assembler deal (although they're all using C). Assembler is less portable but can on occasions give you better performance.



    Quote:
    Originally Posted by wizard69


    Frankly I just see the GPU as a poor place for Apple to rely upon for competive power. The simple reason being that everybody has access to the GPU. Sure if they get a jump on MS they get a bit of an advantage, but MS will catch up and the software vendors have an alternative already.



    Yeah, I can see the downside of everyone else having access to GPUs, and not only that, they typically use faster ones than Apple does. But one thing that stands out in Aaftab Munshi's OpenCL Siggraph paper is having to query, select and initialize compute devices in the system, then create a compute context and execute a compute kernel (a small unit of code, like a function).



    It almost seems like you pick the hardware to run your code.



    // create a compute context with GPU device
    context = clCreateContextFromType(CL_DEVICE_TYPE_GPU);
    // create a work-queue
    queue = clCreateWorkQueue(context, NULL, NULL, 0);
    // allocate the buffer memory objects
    memobjs[0] = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                sizeof(float)*2*num_entries, srcA);
    memobjs[1] = clCreateBuffer(context, CL_MEM_READ_WRITE,
                                sizeof(float)*2*num_entries, NULL);
    // create the compute program
    program = clCreateProgramFromSource(context, 1, &fft1D_1024_kernel_src, NULL);
    // build the compute program executable
    clBuildProgramExecutable(program, false, NULL, NULL);
    // create the compute kernel
    kernel = clCreateKernel(program, "fft1D_1024");
    // create N-D range object with work-item dimensions
    global_work_size[0] = n;
    local_work_size[0] = 64;
    range = clCreateNDRangeContainer(context, 0, 1, global_work_size, local_work_size);
    // set the args values
    clSetKernelArg(kernel, 0, (void *)&memobjs[0], sizeof(cl_mem), NULL);
    clSetKernelArg(kernel, 1, (void *)&memobjs[1], sizeof(cl_mem), NULL);
    clSetKernelArg(kernel, 2, NULL, sizeof(float)*(local_work_size[0]+1)*16, NULL);
    clSetKernelArg(kernel, 3, NULL, sizeof(float)*(local_work_size[0]+1)*16, NULL);
    // execute kernel
    clExecuteKernel(queue, kernel, NULL, range, NULL, 0, NULL);



    Here is an example kernel:



    // This kernel computes FFT of length 1024. The 1024 length FFT is decomposed into
    // calls to a radix 16 function, another radix 16 function and then a radix 4 function
    __kernel void fft1D_1024 (__global float2 *in, __global float2 *out,
                              __local float *sMemx, __local float *sMemy) {
        int tid = get_local_id(0);
        int blockIdx = get_group_id(0) * 1024 + tid;
        float2 data[16];
        // starting index of data to/from global memory
        in = in + blockIdx; out = out + blockIdx;
        globalLoads(data, in, 64); // coalesced global reads
        fftRadix16Pass(data); // in-place radix-16 pass
        twiddleFactorMul(data, tid, 1024, 0);
        // local shuffle using local memory
        localShuffle(data, sMemx, sMemy, tid, (((tid & 15) * 65) + (tid >> 4)));
        fftRadix16Pass(data); // in-place radix-16 pass
        twiddleFactorMul(data, tid, 64, 4); // twiddle factor multiplication
        localShuffle(data, sMemx, sMemy, tid, (((tid >> 4) * 64) + (tid & 15)));
        // four radix-4 function calls
        fftRadix4Pass(data); fftRadix4Pass(data + 4);
        fftRadix4Pass(data + 8); fftRadix4Pass(data + 12);
        // coalesced global writes
        globalStores(data, out, 64);
    }



    "OpenCL is designed to efficiently share with OpenGL.

    Efficient queuing of OpenCL and OpenGL commands"



    http://s08.idav.ucdavis.edu/munshi-opencl.pdf
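
    A rough sketch of what that sharing could look like in practice, using the GL interop entry points from the OpenCL spec as it eventually shipped (clCreateFromGLBuffer and the acquire/release calls) rather than the draft API quoted above, so the exact names here are an assumption:

    // Sketch: wrap an existing OpenGL vertex buffer so a compute kernel can write into it
    // without copying data between the two APIs.
    #include <OpenCL/opencl.h>   // on OS X this pulls in the CL/GL sharing declarations

    static cl_mem share_gl_buffer(cl_context ctx, cl_command_queue queue, cl_GLuint vbo)
    {
        cl_int err = CL_SUCCESS;
        cl_mem buf = clCreateFromGLBuffer(ctx, CL_MEM_READ_WRITE, vbo, &err);
        if (err != CL_SUCCESS)
            return NULL;

        // GL must be finished with the buffer before CL touches it, and vice versa.
        clEnqueueAcquireGLObjects(queue, 1, &buf, 0, NULL, NULL);
        // ... enqueue compute kernels that read/write buf here ...
        clEnqueueReleaseGLObjects(queue, 1, &buf, 0, NULL, NULL);
        return buf;
    }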



    There are a few limitations in the language it points out too, like recursion. Not sure if it eliminates recursion, but that would limit its usefulness in certain calculations:



    http://en.wikipedia.org/wiki/Recursi...puter_science)



    Makes sense in a way because you can't make that process parallel anyway.
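
    To make the recursion point concrete, here is a minimal illustration (not from the paper) of how a recursive reduction flattens into the loop form a compute kernel would need anyway; a recursive call chain needs a per-call stack, which GPU-style hardware doesn't really provide:

    // Recursive form: disallowed if the language rules out recursion.
    float sum_recursive(const float *a, int n)
    {
        if (n == 0)
            return 0.0f;
        return a[0] + sum_recursive(a + 1, n - 1);
    }

    // Iterative form: same result, no call stack needed, and the shape a kernel would use.
    float sum_iterative(const float *a, int n)
    {
        float s = 0.0f;
        for (int i = 0; i < n; i++)
            s += a[i];
        return s;
    }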



    All this points towards it being used on the GPU IMO. At the very least we know it's a language designed for highly parallel computation and Apple would simply have to make a chip with at least the equivalent of 32 cores running at 450MHz to rival the 8600M GT they already have in the MBP. They could say throw a quad 1GHz ARM or whatever into the mix that might run more code and be closer to the CPU but why wouldn't they use maybe an 8800M GTS with 64 cores instead? They can underclock those if they want but they've just doubled their processors for very little extra cost (this seems to go against the hit to the margins but not if they make their computers cheaper, which is what they implied with reference to being competitive). OpenCL is running specially written compute kernels so it doesn't matter if they are just stream processors.



    I think the advantage that Apple will get is by putting OpenCL into the core of the OS. This ensures anybody on 10.6 gets access to the new features. Microsoft can't do this even with Windows 7 - DirectX 11 is optional now because it means too little compatibility. Apple on the other hand have the freedom to cut out the entire PPC platform.



    If there was another processor, Apple will still be using GPUs. This means they have the cost of a GPU + the cost of a processor component. Why would they add this in if they have to manually initialize this vector compute device separately from the GPU? Larrabee I can understand because it would replace the GPU so you only have two working contexts to deal with, the GPU and the CPU. This makes it no more difficult than OpenGL programming where you do the same. You create an OpenGL context before drawing to it.



    It also means that people who have invested in Apple hardware see a huge benefit upgrading to 10.6. Usually I like to be on the bleeding edge of things but I still haven't upgraded to 10.5 as I don't really see anything I'm missing. If my video encodes run 3 times faster then 10.6 is very interesting but having to buy new hardware again to see this benefit would kinda put me off and I'm sure it would to others - such as Mac Pro owners hoping to have a machine to last them 3 years with a current high end 8800 GT.



    OpenCL isn't going to be a simple code recompile; it'll be like AltiVec and will heavily rely on people who understand parallel computing using it before you see any performance benefit. Again, this is why it being in the OS is good, because Apple can use it for all the core libraries they already have that developers use, and developers won't have to think about a lot of the tasks as the libraries will do this for them.
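
    The existing Accelerate framework is an example of that library pattern today: the app calls one routine and the library decides how to vectorise it behind the scenes (whether such routines end up OpenCL-backed is an assumption on my part). A minimal sketch using vDSP:

    // Sketch: element-wise add of two signals; vDSP picks the vectorised path internally.
    #include <Accelerate/Accelerate.h>

    static void add_signals(const float *a, const float *b, float *result, int n)
    {
        // result[i] = a[i] + b[i], unit strides, n elements
        vDSP_vadd(a, 1, b, 1, result, 1, (vDSP_Length)n);
    }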



    This gives them at least a 6 month lead on Microsoft, as Windows 7 isn't coming until 2010 at the earliest and details on DX11 compute shaders aren't widely known, so we don't know what limitations they have yet. They sound like extensions to a shader language, which isn't entirely flexible. Whether they will be less flexible than OpenCL compute kernels we'll have to see. Microsoft are fully aware of OpenCL but their focus seems to be gaming as opposed to parallel computing.
  • Reply 116 of 123
    C99 Specification: http://www.open-std.org/JTC1/SC22/WG...docs/n1256.pdf



    Selection from 6.5.2.2 Function Calls.



    Quote:

    10 Recursive function calls shall be permitted, both directly and indirectly through any chain of other functions.



    Apple is well aware of the uses for recursion so I'd expect them to address this need.
  • Reply 117 of 123
    Quote:
    Originally Posted by Marvin View Post


    According to an admin on the AMD forum, the current AMD APIs won't be deprecated but that DX11 and OpenCL are "simply additional programming entry points for developers for whom it makes more sense":



    http://forums.amd.com/forum/messagev...threadid=98534



    This sounds like what NVidia plan too. The Nvidia guy said that OpenCL uses the CUDA driver stack, which implies their CUDA software will continue but OpenCL will simply use it as a bridge to the hardware.



    Here's how I imagine it will look:



    High level code

    ---

    compiled code using LLVM

    ---

    CPU --- CUDA interface to NVidia hardware --- CAL interface to AMD hardware



    At the lowest step, LLVM decides which interface to use so it runs best on each piece of hardware but the NVidia and AMD SDKs will give you direct access to one piece of hardware and it won't be portable. OpenCL is the obvious choice but I can see it being a C vs assembler deal (although they're all using C). Assembler is less portable but can on occasions give you better performance.







    Yeah, I can see the downside of everyone else having access to GPUs and not only that, they typically use faster ones than Apple but one thing that stands out in Aaftab Munshi's OpenCL siggraph paper is having to query, select and initialize compute devices in the system then create a compute context and execute a compute kernel (a small unit of code like a function).



    It almost seems like you pick the hardware to run your code.



    // create a compute context with GPU device
    context = clCreateContextFromType(CL_DEVICE_TYPE_GPU);
    // create a work-queue
    queue = clCreateWorkQueue(context, NULL, NULL, 0);
    // allocate the buffer memory objects
    memobjs[0] = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                sizeof(float)*2*num_entries, srcA);
    memobjs[1] = clCreateBuffer(context, CL_MEM_READ_WRITE,
                                sizeof(float)*2*num_entries, NULL);
    // create the compute program
    program = clCreateProgramFromSource(context, 1, &fft1D_1024_kernel_src, NULL);
    // build the compute program executable
    clBuildProgramExecutable(program, false, NULL, NULL);
    // create the compute kernel
    kernel = clCreateKernel(program, "fft1D_1024");
    // create N-D range object with work-item dimensions
    global_work_size[0] = n;
    local_work_size[0] = 64;
    range = clCreateNDRangeContainer(context, 0, 1, global_work_size, local_work_size);
    // set the args values
    clSetKernelArg(kernel, 0, (void *)&memobjs[0], sizeof(cl_mem), NULL);
    clSetKernelArg(kernel, 1, (void *)&memobjs[1], sizeof(cl_mem), NULL);
    clSetKernelArg(kernel, 2, NULL, sizeof(float)*(local_work_size[0]+1)*16, NULL);
    clSetKernelArg(kernel, 3, NULL, sizeof(float)*(local_work_size[0]+1)*16, NULL);
    // execute kernel
    clExecuteKernel(queue, kernel, NULL, range, NULL, 0, NULL);



    Here is an example kernel:



    // This kernel computes FFT of length 1024. The 1024 length FFT is decomposed into
    // calls to a radix 16 function, another radix 16 function and then a radix 4 function
    __kernel void fft1D_1024 (__global float2 *in, __global float2 *out,
                              __local float *sMemx, __local float *sMemy) {
        int tid = get_local_id(0);
        int blockIdx = get_group_id(0) * 1024 + tid;
        float2 data[16];
        // starting index of data to/from global memory
        in = in + blockIdx; out = out + blockIdx;
        globalLoads(data, in, 64); // coalesced global reads
        fftRadix16Pass(data); // in-place radix-16 pass
        twiddleFactorMul(data, tid, 1024, 0);
        // local shuffle using local memory
        localShuffle(data, sMemx, sMemy, tid, (((tid & 15) * 65) + (tid >> 4)));
        fftRadix16Pass(data); // in-place radix-16 pass
        twiddleFactorMul(data, tid, 64, 4); // twiddle factor multiplication
        localShuffle(data, sMemx, sMemy, tid, (((tid >> 4) * 64) + (tid & 15)));
        // four radix-4 function calls
        fftRadix4Pass(data); fftRadix4Pass(data + 4);
        fftRadix4Pass(data + 8); fftRadix4Pass(data + 12);
        // coalesced global writes
        globalStores(data, out, 64);
    }



    "OpenCL is designed to efficiently share with OpenGL.

    Efficient queuing of OpenCL and OpenGL commands"



    http://s08.idav.ucdavis.edu/munshi-opencl.pdf



    There are a few limitations in the language it points out too like recursion. Not sure if it eliminates recursion but that would limit it's usefulness in certain calculations:



    http://en.wikipedia.org/wiki/Recursi...puter_science)



    Makes sense in a way because you can't make that process parallel anyway.



    All this points towards it being used on the GPU IMO. At the very least we know it's a language designed for highly parallel computation and Apple would simply have to make a chip with at least the equivalent of 32 cores running at 450MHz to rival the 8600M GT they already have in the MBP. They could say throw a quad 1GHz ARM or whatever into the mix that might run more code and be closer to the CPU but why wouldn't they use maybe an 8800M GTS with 64 cores instead? They can underclock those if they want but they've just doubled their processors for very little extra cost (this seems to go against the hit to the margins but not if they make their computers cheaper, which is what they implied with reference to being competitive). OpenCL is running specially written compute kernels so it doesn't matter if they are just stream processors.



    I think the advantage Apple will get comes from putting OpenCL into the core of the OS. That ensures anybody on 10.6 gets access to the new features. Microsoft can't do this even with Windows 7: DirectX 11 has to stay optional because requiring it would leave too many machines behind. Apple, on the other hand, have the freedom to cut out the entire PPC platform.



    If there were another processor, Apple would still be using GPUs, which means they carry the cost of a GPU plus the cost of the extra processor component. Why would they add that if developers have to initialize this vector compute device separately from the GPU? Larrabee I can understand, because it would replace the GPU, so you only have two working contexts to deal with: the GPU and the CPU. That makes it no more difficult than OpenGL programming, where you do the same thing: you create an OpenGL context before drawing to it.
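
    For what it's worth, the host-side setup for that extra working context is about as involved as creating a GL context. Here's a minimal sketch, assuming the final API keeps entry points along the lines of clGetPlatformIDs / clGetDeviceIDs / clCreateContext / clCreateCommandQueue (naming could still shift before the spec is final), with error handling left out:

    /* Minimal, illustrative host setup: pick a GPU device, create a compute
       context for it, then a queue to feed it commands. Error checks omitted. */
    #include <CL/cl.h> /* presumably <OpenCL/opencl.h> in Apple's framework */

    cl_command_queue make_gpu_queue(cl_context *ctx_out)
    {
        cl_platform_id platform;
        cl_device_id device;
        cl_int err;

        clGetPlatformIDs(1, &platform, NULL);
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

        /* the compute context plays the role a GL context plays for drawing */
        cl_context context = clCreateContext(NULL, 1, &device, NULL, NULL, &err);

        /* kernel launches and memory copies get queued against one device */
        cl_command_queue queue = clCreateCommandQueue(context, device, 0, &err);

        *ctx_out = context;
        return queue;
    }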



    It also means that people who have already invested in Apple hardware see a huge benefit from upgrading to 10.6. Usually I like to be on the bleeding edge of things, but I still haven't upgraded to 10.5 as I don't really see anything I'm missing. If my video encodes run 3 times faster then 10.6 is very interesting, but having to buy new hardware again to see that benefit would put me off, and I'm sure it would put off others too, such as Mac Pro owners hoping a machine with a current high-end 8800 GT will last them 3 years.



    OpenCL isn't going to be a simple code recompile; like AltiVec, it will rely heavily on people who understand parallel computing before you see any performance benefit. Again, this is why having it in the OS is good: Apple can use it in all the core libraries developers already rely on, so developers won't have to think about a lot of these tasks because the libraries will do it for them.
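
    To make the "not a simple recompile" point concrete, here's a deliberately trivial, made-up example of the kind of restatement involved (the scale_add names are hypothetical). The serial loop gets no faster just from recompiling; someone has to re-express it so each array element becomes its own work-item, and that restatement is exactly what Apple's libraries could hide behind the same old function call:

    /* plain C today: one thread walks the whole array */
    void scale_add(float *y, const float *x, float a, int n)
    {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    /* the data-parallel restatement (this part would live in the kernel
       source string): one work-item per element, launched n at a time
       by the host code */
    __kernel void scale_add_kernel(__global float *y,
                                   __global const float *x,
                                   float a)
    {
        int i = get_global_id(0);
        y[i] = a * x[i] + y[i];
    }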



    This gives them at least a 6-month lead on Microsoft, as Windows 7 isn't coming until 2010 at the earliest, and details on DX11 compute shaders aren't widely known, so we don't know what limitations those have yet. They sound like extensions to a shader language, which isn't entirely flexible; whether they will be less flexible than OpenCL compute kernels we'll have to see. Microsoft are fully aware of OpenCL, but their focus seems to be gaming as opposed to general parallel computing.



    Perhaps they'll offer a BTO PCI Express x16 (1.1 and 2.0) specialty card for existing customers and put it on-board for a future generation of machines, à la the ZIF socket.
  • Reply 118 of 123
    wizard69wizard69 Posts: 13,377member
    Quote:
    Originally Posted by Marvin View Post




    "OpenCL is designed to efficiently share with OpenGL.

    Efficient queuing of OpenCL and OpenGL commands"



    That is interesting, because I had the view that OpenCL was more of a replacement for OpenGL. If OpenCL is focused on massively parallel processing in a more general sense, then it really needs to be able to support hardware other than the GPU, and from what I can see in your example that is exactly their intention. Unfortunately it doesn't look like it will dynamically support different hardware.

    Quote:



    http://s08.idav.ucdavis.edu/munshi-opencl.pdf



    There are a few limitations it points out in the language too, like recursion. I'm not sure whether it rules out recursion entirely, but that would limit its usefulness in certain calculations:



    http://en.wikipedia.org/wiki/Recursi...puter_science)



    Makes sense in a way because you can't make that process parallel anyway.



    What would be interesting to know is where the limitation on recursion comes from. Is it the hardware that OpenCL would target, such as a GPU? I know everybody gets starry-eyed thinking about the compute power in a GPU, but let's face it, the hardware there isn't exactly general purpose.

    Quote:



    All this points towards it being used on the GPU, IMO. At the very least we know it's a language designed for highly parallel computation, and Apple would have to make a chip with at least the equivalent of 32 cores running at 450MHz just to rival the 8600M GT they already have in the MBP.



    You just made me feel real good about that MBP I got this year :0



    On a slightly more serious note, let's imagine that Apple contracted with Intel to deliver a special version of Larrabee that sits in the second or fourth processor socket on a motherboard. This would be a Larrabee without the video hardware, more or less a dense vector processor. Would that not be acceptable to people? More to the point, it would be able to speed up a larger variety of software than most GPUs. With Intel's process technology you likely wouldn't need massive numbers of cores either, as you could just run them fast.



    Yeah, I know everyone says Intel doesn't do custom, but that is always subject to change. Further, Apple owns enough IP that Intel could easily justify making it Apple-only on the basis of that IP. This seems like more of a possibility than Apple going whole hog into the coprocessor fray. Mind you, I'd like to see Apple keep a little pressure on both Intel and AMD to innovate, but developing a standalone vector processor to sit on an Intel bus isn't child's play.

    Quote:

    They could, say, throw a quad 1GHz ARM or whatever into the mix, which might run more general code and sit closer to the CPU, but why wouldn't they use something like an 8800M GTS with 64 cores instead? They can underclock those if they want, but they've just doubled their processing power for very little extra cost (this seems to go against the hit to margins, but not if it lets them make their computers cheaper, which is what they implied with the reference to being competitive). OpenCL runs specially written compute kernels, so it doesn't matter that they are just stream processors.



    ARM is certainly a possibility, but frankly I think Apple would want to go with at least some x86 support. Maybe not the whole instruction set, but enough to keep the processor on the bus, so to speak.



    ARM extended with parallel AltiVec-like hardware does sound very interesting, though. This of course brings us back to the idea that Apple could scale the platform to work on everything from the iPhone on up. I know ARM has its own vector-processing solution, but frankly I know nothing about its performance, so it's a question of how much Apple would really have to do here. Even now I tend to believe that a dual-core ARM would be very interesting in a Touch-sized device; that should easily do away with the external video processor. I'm already impressed with the fact that my iPhone (when it doesn't crash) is faster than at least half the computers I've ever owned.

    Quote:



    I think the advantage Apple will get comes from putting OpenCL into the core of the OS. That ensures anybody on 10.6 gets access to the new features. Microsoft can't do this even with Windows 7: DirectX 11 has to stay optional because requiring it would leave too many machines behind. Apple, on the other hand, have the freedom to cut out the entire PPC platform.



    My only hope is that they debug it before release. Apple's track record hasn't been great in this respect lately.

    Quote:



    If there were another processor, Apple would still be using GPUs, which means they carry the cost of a GPU plus the cost of the extra processor component. Why would they add that if developers have to initialize this vector compute device separately from the GPU? Larrabee I can understand, because it would replace the GPU, so you only have two working contexts to deal with: the GPU and the CPU. That makes it no more difficult than OpenGL programming, where you do the same thing: you create an OpenGL context before drawing to it.



    I think the advantage to Apple, if they can pull it off, would be that the same compute complex would be available on all platforms. Let's face it, it will be a long time before Apple ships all of its hardware with the same GPU, or even GPUs from the same family. A compute complex closely allied to the main CPU means they have only one acceleration environment to target. If that complex scaled from, say, two cores in a Touch-type device to 64 in a Mac Pro, I think developers would be doing backflips to get on the platform. Sure, there will still be vast differences in performance, but that's life, and you get what you pay for.

    Quote:



    It also means that people who have already invested in Apple hardware see a huge benefit from upgrading to 10.6. Usually I like to be on the bleeding edge of things, but I still haven't upgraded to 10.5 as I don't really see anything I'm missing. If my video encodes run 3 times faster then 10.6 is very interesting, but having to buy new hardware again to see that benefit would put me off, and I'm sure it would put off others too, such as Mac Pro owners hoping a machine with a current high-end 8800 GT will last them 3 years.



    I entered the fray at 10.5 and frankly am rather pleased. Yes, Mac OS still has its soft points, but it is not that bad, though I believe Apple needs to work on two things: one is overall software quality, the other is keeping the open-source software up to date.

    Quote:

    OpenCL isn't going to be a simple code recompile; like AltiVec, it will rely heavily on people who understand parallel computing before you see any performance benefit. Again, this is why having it in the OS is good: Apple can use it in all the core libraries developers already rely on, so developers won't have to think about a lot of these tasks because the libraries will do it for them.



    It certainly might help some of the core libraries, but maybe not as much as we would like. As you imply, one needs to know what one is doing to leverage parallel processing.

    Quote:



    This gives them at least a 6-month lead on Microsoft, as Windows 7 isn't coming until 2010 at the earliest, and details on DX11 compute shaders aren't widely known, so we don't know what limitations those have yet. They sound like extensions to a shader language, which isn't entirely flexible; whether they will be less flexible than OpenCL compute kernels we'll have to see. Microsoft are fully aware of OpenCL, but their focus seems to be gaming as opposed to general parallel computing.



    If OpenCL is a more general solution than I first thought, then I do believe Apple has something hot on their hands. My first impression was that OpenCL was a way to realize parallel performance in OpenGL; if it is more than that, then life will be very interesting in Mac land.



    Dave





    PS



    By the way, if Apple wants to realize this advantage at the low end as well as the high end, then ARM-based vector processing would seem to be the only way to go. The only problem is the number of cores ARM supports in SMP mode, which I believe is rather limited. It might be adequate for machines up to iMac class but become a limitation on the Mac Pro. Of course, this doesn't mean Apple couldn't take an ARM core and munge it for highly parallel operation.



    Dave



    PPS

    Does anyone here think that Apple engineers tune in to this thread just to get a laugh at the wild imaginings going on?



    D
  • Reply 119 of 123
    Quote:
    Originally Posted by wizard69 View Post


    PPS

    Does anyone here think that Apple engineers tune in to this thread just to get a laugh at the wild imaginings going on?



    D



    Apple engineers generally read rumor sites to find out about projects they're not working on, rather than the ones they are.
  • Reply 120 of 123
    Quote:
    Originally Posted by mdriftmeyer View Post


    C99 Specification: http://www.open-std.org/JTC1/SC22/WG...docs/n1256.pdf



    Selection from 6.5.2.2 Function Calls.



    Apple is well aware of the uses for recursion so I'd expect them to address this need.





    Just because it's in the C99 spec doesn't mean it'll be in OpenCL. They "based it on" C99; it's not a C99 implementation. GPUs can't do recursion, so OpenCL very likely won't allow it -- I'm pretty sure CUDA doesn't.







    Quote:

    ... talk about basing vector cores on ARM



    You guys need to think farther outside the box. I wouldn't expect custom hardware like this to be at all related to conventional CPUs. You're right, there isn't much point if Apple is just going to try to do what nVidia or Intel are doing... so they ought to do something different that maximizes what they get from the other vendors.