AppleInsider AppleInsider Forums


Old 09-17-2009, 06:26 PM   #1
AppleInsider
Kasper's Automated Slave
 
Join Date: Nov 1997
Posts: 6,169
Snow Leopard's Grand Central, OpenCL boost app by 50%

A developer reports seeing a 50% jump in real-world performance after adding initial support for two new Snow Leopard technologies: Grand Central Dispatch and OpenCL.

As reported by the French site HardMac, MovieGate developer Christophe Ducommun found his app jumped from 104 frames per second encoding under Leopard to 150 fps performance on the same hardware under Snow Leopard after implementing support for the new features.
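
For reference, the quoted frame rates can be turned into a percentage directly. The short Python sketch below (the function name `percent_gain` is purely illustrative and has nothing to do with MovieGate) suggests the measured jump works out to roughly 44%, so the 50% headline figure appears to be a generous rounding:

```python
def percent_gain(before_fps: float, after_fps: float) -> float:
    """Relative improvement of after_fps over before_fps, in percent."""
    return (after_fps - before_fps) / before_fps * 100.0

# Figures quoted from the HardMac report:
# 104 fps under Leopard, 150 fps under Snow Leopard.
gain = percent_gain(104, 150)
print(f"{gain:.1f}% faster")
```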

Grand Central Dispatch helps developers make efficient use of the multiple processors available on the system, and OpenCL lets applications tap the latent power of the GPUs on available video cards.

In addition to an overall performance boost in output, Ducommun also reported that CPU utilization for MPEG-2 encoding with the open source ffmpeg library leapt from 100% to 130% on his quad-core Mac Pro, indicating a significant improvement in tapping its multicore potential.

At the same time, CPU utilization for decoding operations dropped from 165% to 70% under Snow Leopard because a significant amount of the work could be delegated to the machine's GPU.

The observations illustrate how Snow Leopard's Grand Central Dispatch and OpenCL combine to improve performance both in raw computing power and in increased use of otherwise idle hardware.

In particular, they indicate the potential for GPU delegation to speed things up while reducing the load on the CPU for some operations, particularly video playback. The battery life of mobile devices stands to benefit from such optimizations.
Old 09-17-2009, 06:53 PM   #2
NanoAkron
Registered User
 
Join Date: Jan 2007
Posts: 49
Interesting news...

Sounds awesome.

Be good to hear (or see) this outcome for a number of other apps too...

Maybe AppleInsider could keep a list online?
Old 09-17-2009, 06:57 PM   #3
camroidv27
Registered User
 
Join Date: Nov 2006
Location: Arizona
Posts: 334
This type of thing is the wave of the future. GPUs are quite powerful chips too, so I'm glad to see this.

I was encoding a video file today on my PC (with specs a little better than the lowest Mac Pro, including the Xeon processor) and with CUDA functions enabled along with multi-threading, I was amazed at how fast this thing cranked through video! And that's just on *Gasp* Vista! If my PC can go that fast, I imagine the Mac must be....
... well, probably about the same speed, maybe a little faster. But still, it should be amazingly blinding fast too!


openSuSe 11.2, 32 and 64 bit, for Mac and PC!
"Shiny capt'n. Everything thing is A-Okay."
Old 09-17-2009, 07:12 PM   #4
vinney57
Registered User
 
Join Date: Nov 2001
Location: The UK of Englandshire
Posts: 985
I've seen pure compute demos using scientific mapping software that resulted in a 50X speed increase. Many algorithms aren't suited to massive-parallelism but when they are you can see huge gains. Apple have done some seriously good work in Snow Leopard that should attract a lot of previously ambivalent developers.
Old 09-17-2009, 07:26 PM   #5
Marvin
Global Moderator
 
Join Date: Feb 2006
Posts: 5,251
Quote:
Originally Posted by AppleInsider View Post
Ducommun also reported that CPU utilization for MPEG-2 encoding with the open source ffmpeg library leapt from 100% to 130% on his quad-core Mac Pro, indicating a significant improvement in tapping its multicore potential.
Not close to the 400% it could be though. It would be nice if video compressors encoded groups of pictures in separate threads in parallel, then just cached the output so it would write to file in order.
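
The idea of encoding groups of pictures (GOPs) in parallel and then writing them out in order can be sketched in a few lines of Python. This is only a toy under stated assumptions: `encode_gop` is a hypothetical stand-in for a real encoder, the frame labels are made up, and threads are used for simplicity (a real encoder would do its heavy lifting in native code or in separate processes).

```python
from concurrent.futures import ThreadPoolExecutor

def encode_gop(gop):
    """Hypothetical stand-in for a real encoder: 'compress' one group of pictures."""
    return "".join(frame[0] for frame in gop).encode()

def encode_in_order(gops, workers=4):
    """Encode GOPs concurrently, returning results in the original order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() runs the calls concurrently but hands results back in
        # input order, buffering any chunk that finishes early.
        return list(pool.map(encode_gop, gops))

# Three made-up GOPs, each starting with an I-frame.
gops = [["I0", "P1", "B2"], ["I3", "P4"], ["I5", "B6", "P7"]]
output = b"".join(encode_in_order(gops))
print(output)
```

The ordering guarantee of `map()` plays the role of the output cache here: a chunk that finishes early simply waits until everything before it has been handed back.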

The OpenCL speedup was quite low considering even the 9400M can rival an 8-core Mac Pro CPU in some cases.

I'm sure it will take time to get to grips with these technologies in the best ways though and it's good to see real-world uses.
Old 09-17-2009, 09:57 PM   #6
Gazoobee
Registered User
 
Join Date: Feb 2009
Location: Somewhere in the Cheese
Posts: 464
Quote:
Originally Posted by AppleInsider View Post
... Ducommun also reported that CPU utilization for MPEG-2 encoding with the open source ffmpeg library leapt from 100% to 130% on his quad-core Mac Pro, ...
Sounds good, but Handbrake on my Mac Pro uses between 500% and 750% of the CPU. Unless we are talking different things, 130% seems low.


It was a widely held belief by the smartest people in late 1400's Europe that human knowledge and indeed civilisation itself, had advanced to such a nearly complete and perfect state, that the "end times" were certainly almost upon them.
Old 09-17-2009, 10:06 PM   #7
brucep
Registered User
 
Join Date: Jan 2007
Location: methane seas of neptune
Posts: 1,488
Quote:
Originally Posted by vinney57 View Post
I've seen pure compute demos using scientific mapping software that resulted in a 50X speed increase. Many algorithms aren't suited to massive-parallelism but when they are you can see huge gains. Apple have done some seriously good work in Snow Leopard that should attract a lot of previously ambivalent developers.
Yes, and Apple has just begun down this new fast, lean road of integrated CPUs and dual GPUs all working together under GCD and OpenCL. If I said this right.


Change your company's name. Not that big of a deal.

The  Beatles .
Old 09-17-2009, 10:40 PM   #8
coolfactor
Registered User
 
Join Date: Jul 2004
Location: Van Isle, BC, Canada
Posts: 209
Quote:
Originally Posted by Gazoobee View Post
Sounds good, but Handbrake on my Mac Pro uses between 500% and 750% of the CPU. Unless we are talking different things, 130% seems low.
The 130% figure illustrates that the application was able to split across two processors (parallelism), rather than max out just one, but some of the work was also offloaded to the GPUs, which would not be reported in the CPU utilization. The 130% alone is *not* the full processing power being tapped.

There are two technologies being used here:
- Grand Central Dispatch - utilizing multiple CPUs/cores
- OpenCL - utilizing graphics processors
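
To put the 130% figure in context, here is a small Python sketch. It assumes the usual Mac convention that 100% means one fully busy core, and it takes the quad-core count from the article; the function names are illustrative only. GPU work does not show up in either number.

```python
def cores_in_use(cpu_percent: float) -> float:
    """Convert a per-process CPU percentage into busy cores (100% = one core)."""
    return cpu_percent / 100.0

def share_of_machine(cpu_percent: float, total_cores: int) -> float:
    """Fraction of the machine's total CPU capacity that is busy."""
    return cpu_percent / (100.0 * total_cores)

# 130% on the quad-core Mac Pro from the article.
print(cores_in_use(130))
print(share_of_machine(130, 4))
```

On those assumptions the encoder keeps about 1.3 cores busy, or roughly a third of what the four cores could deliver.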
Old 09-17-2009, 10:48 PM   #9
lilgto64
Registered User
 
Join Date: Apr 2005
Location: The Northcoast
Posts: 127
how many CPUs on the test machine?

Quote:
Originally Posted by coolfactor View Post
The 130% figure illustrates that the application was able to split across two processors (parallelism), rather than max out just one, but some of the work was also offloaded to the GPUs, which would not be reported in the CPU utilization. The 130% alone is *not* the full processing power being tapped.

There are two technologies being used here:
- Grand Central Dispatch - utilizing multiple CPUs/cores
- OpenCL - utilizing graphics processors
Did I miss it - did the article or that comment mention how many CPUs were in the test machine? When I read that I was thinking dual-core machine where before only 1 core could be tapped and now both cores are used for a 30% boost in performance. I was thinking this because aren't the majority of Intel Macs out there dual core? Core 2 or Core 2 Duo? iMac, Mac Mini, MacBook, MacBook Pro? Sure there are 8 core desktops and xServe - and that is more likely where you would expect to see these apps - just saying that the configuration of the test platform was not obvious to me.
Old 09-17-2009, 10:58 PM   #10
IQatEdo
Registered User
 
Join Date: Jun 2003
Location: Where East meets West
Posts: 230
Quote:
Originally Posted by camroidv27 View Post
This type of thing is the wave of the future. GPUs are quite powerful chips too, so I'm glad to see this.

I was encoding a video file today on my PC (with specs a little better than the lowest Mac Pro, including the Xeon processor) and with CUDA functions enabled along with multi-threading, I was amazed at how fast this thing cranked through video! And that's just on *Gasp* Vista! If my PC can go that fast, I imagine the Mac must be....
... well, probably about the same speed, maybe a little faster. But still, it should be amazingly blinding fast too!
Hi

Have you tried Windows 7 on your system yet? Just interested in a speed comparison.

Looking forward to the new iMacs, interesting to see where they are going.


Where are we on the curve? We'll know once it goes asymptotic!
Old 09-17-2009, 11:03 PM   #11
IQatEdo
Registered User
 
Join Date: Jun 2003
Location: Where East meets West
Posts: 230
Quote:
Originally Posted by Marvin View Post
Not close to the 400% it could be though. It would be nice if video compressors encoded groups of pictures in separate threads in parallel, then just cached the output so it would write to file in order.

The OpenCL speedup was quite low considering even the 9400M can rival an 8-core Mac Pro CPU in some cases.

I'm sure it will take time to get to grips with these technologies in the best ways though and it's good to see real-world uses.
Apple aficionados knew that SL was an 'under-the-hood' revolution - it'll be nice now to hear of and to witness the pay-off.


Old 09-17-2009, 11:20 PM   #12
manonthemove
Registered User
 
Join Date: Sep 2009
Posts: 2
good but could be better....

Just so you guys know, a factor of 10-30 in performance is not uncommon for scientists who use CUDA. Given the similarities between OpenCL and CUDA, we should (hopefully) see a lot of improvement in the near future. Here is a link (http://sussi.megahost.dk/~frigaard/) to a standard piece of scientific code G2X (it does N-body calculations) modified to use CUDA by C. FRIGAARD (go to the bottom) which gets a factor of 30 for one subroutine and a factor of 10 overall.
Old 09-17-2009, 11:21 PM   #13
Cubert
Registered User
 
Join Date: Jun 2005
Location: Philadelphia
Posts: 472
Once again proving that Snow Leopard is more about positioning the Mac platform for the future than trying to drum up massive sales (which is happening anyway). And, I guess, it's also about "encouraging" people to upgrade their hardware, too.
Old 09-17-2009, 11:27 PM   #14
IQatEdo
Registered User
 
Join Date: Jun 2003
Location: Where East meets West
Posts: 230
Quote:
Originally Posted by Cubert View Post
Once again proving that Snow Leopard is more about positioning the Mac platform for the future than trying to drum up massive sales (which is happening anyway). And, I guess, it's also about "encouraging" people to upgrade their hardware, too.
Just doubling my MBP (3,1) RAM to 4 GB and replacing the 160 GB HD with a 500 GB one made a huge difference after installing SL. An old dog taught new tricks - love it.


Old 09-17-2009, 11:50 PM   #15
BertP
Registered User
 
Join Date: May 2009
Posts: 82
Quote:
Originally Posted by lilgto64 View Post
Did I miss it - did the article or that comment mention how many CPUs were in the test machine? When I read that I was thinking dual-core machine where before only 1 core could be tapped and now both cores are used for a 30% boost in performance. I was thinking this because aren't the majority of Intel Macs out there dual core? Core 2 or Core 2 Duo? iMac, Mac Mini, MacBook, MacBook Pro? Sure there are 8 core desktops and xServe - and that is more likely where you would expect to see these apps - just saying that the configuration of the test platform was not obvious to me.
I checked Hardmac and came up with "Mac Pro 2007 (Quad Core 2.66 GHz with a GeForce 8800 GT)". I'm not sure if the CPU is hyper-threaded or not, but I would guess 'Yes'. I think the Mac Pro has Xeon CPUs.

I very much approve of Apple doing a separate Mac OS X release as a foundational release without the distraction of new features orientated toward the consumer. This takes courage, and has been a good software development strategy in my opinion.

Edit: searched apple.com/support and came up with Xeon 5400 Series, not hyper-threaded. So the test would be on 4 CPU threads.


Last edited by BertP; 09-18-2009 at 12:31 AM.. Reason: Info on number of threads.
Old 09-17-2009, 11:50 PM   #16
LighteningKid
Registered User
 
Join Date: Jun 2009
Location: Canada
Posts: 20
A bit off topic...

A bit off topic, but something this "performance boost" talk reminded me of: I seem to remember a couple of years ago (probably here on AI) there being mention that Leopard was bloated because some sort of developer files were left in the OS that should have been removed when it went to Golden Master.

Does anyone know what I'm talking about? If you do, do you know if the 6GB freed up in Snow Leopard is a true improvement, or does it come just from cleaning up the bloat that should never have been there in the first place?

Or maybe I'm getting my info totally crossed
Old 09-18-2009, 12:59 AM   #17
FattyMcButterpants
Registered User
 
Join Date: Sep 2009
Location: Columbus, OH
Posts: 6
Quote:
Originally Posted by LighteningKid View Post
A bit off topic, but something this "performance boost" talk reminded me of: I seem to remember a couple of years ago (probably here on AI) there being mention that Leopard was bloated because some sort of developer files were left in the OS that should have been removed when it went to Golden Master.

Does anyone know what I'm talking about? If you do, do you know if the 6GB freed up in Snow Leopard is a true improvement, or does it come just from cleaning up the bloat that should never have been there in the first place?

Or maybe I'm getting my info totally crossed
You're thinking of all the talk about debug code being left in OS X. Back when OS X was slow (remember how Finder windows would not move around cleanly?), lots of people were saying Apple had left debugging code in the system. That was a bunch of bull.

The 6GB of disk space reclaimed in Snow Leopard was mostly from removing PowerPC support.

The performance improvements in SL come from some massive under-the-hood optimizations, not from removing PPC support: the code that executed in Universal apps was determined at run time, and the support for the other CPU simply remained on disk.
Old 09-18-2009, 04:11 AM   #18
Denmaru
Registered User
 
Join Date: Oct 2004
Location: Vienna
Posts: 182
Oh my god... we need an OpenCL and GrandCentral ready version of Handbrake...


Now running on a 20" aluminium iMac (Fall 2008), as well as a MacBook Pro 13" (mid 2009) and an iPhone.
Old 09-18-2009, 05:37 AM   #19
nvidia2008
Registered User
 
Join Date: Feb 2007
Posts: 3,706
Quote:
Originally Posted by Denmaru View Post
Oh my god... we need an OpenCL and GrandCentral ready version of Handbrake...
Totally. The PC equivalent using ATI's Stream (and MediaEspresso something) is fast but buggy. http://badaboomit.com/ for Nvidia cards on PC promises fast encoding, haven't tried it as I have an ATI 4830 512MB in my PC.

Freeware DVD and BluRay transcoder, using OpenCL and GrandCentral would equal big Win.
Old 09-18-2009, 07:11 AM   #20
hdasmith
Registered User
 
Join Date: Jun 2005
Location: UK
Posts: 114
Anyone know how much work had to be put into this developer's application to take advantage of these technologies? I'm guessing it's not as simple as checking a couple of check boxes.
Old 09-18-2009, 07:32 AM   #21
shadow
Registered User
 
Join Date: Feb 2005
Posts: 347
OpenCL performance

I had a previous post regarding OpenCL performance on a MacBook Pro. Anyone who has access to the developer tools and sample code could try this on a Mac Pro and report back.

Here is the link to the post with more details.

The other important aspect of OpenCL, which seems misunderstood, is that it is not GPGPU only. It is a technology which takes advantage of all compute resources available, including CPUs, GPUs, DSPs (digital signal processors) or any custom encoding/decoding chips available. I am not sure Apple has any ideas beyond CPU and GPU right now, but this is what COULD be done with OpenCL.

Both OpenCL and GCD add flexibility for future architectures, e.g. Cell-like processors or the Larrabee architecture. Considering the fact that Apple controls the hardware, this could be a great advantage for Apple if a new powerful architecture emerges.
Old 09-18-2009, 07:39 AM   #22
shadow
Registered User
 
Join Date: Feb 2005
Posts: 347
Quote:
Originally Posted by hdasmith View Post
Anyone know how much work had to be put into this developer's application to take advantage of these technologies? I'm guessing it's not as simple as checking a couple of check boxes.
Definitely not as simple as checking a couple of check boxes. GCD has support at Cocoa level, which makes its use much simpler. Also, Cocoa applications can enjoy a "free lunch" sometimes: Apple says that CoreImage was re-written using OpenCL and got a 30% (AFAIR) boost on average. Some of the Leopard classes (NSOperation?) take advantage of GCD without a recompile. But the rest of the code needs to change. And OpenCL may need changes to the software architecture. OpenCL is good for a relatively limited number of tasks, N-body calculations being the showcase example.
Old 09-18-2009, 07:40 AM   #23
cwingrav
Registered User
 
Join Date: Oct 2008
Posts: 14
Quote:
Originally Posted by manonthemove View Post
Just so you guys know, a factor of 10-30 in performance is not uncommon for scientists who use CUDA. Given the similarities between OpenCL and CUDA, we should (hopefully) see a lot of improvement in the near future. Here is a link (http://sussi.megahost.dk/~frigaard/) to a standard piece of scientific code G2X (it does N-body calculations) modified to use CUDA by C. FRIGAARD (go to the bottom) which gets a factor of 30 for one subroutine and a factor of 10 overall.
Just to clarify a bit about optimization by multiple cores, it all really depends on the algorithm. N-body calculations are pretty much the ABSOLUTE BEST case for multi-core optimization as the calculations are small and are not dependent upon each other's results, until the next timestep (and even that can be fudged). I don't know too much about video optimization but would guess that it too is pretty open for optimization w/ multiple cores but there are quite a few dependencies between frames and the like that make it more complicated and tricky. I think the best test will be for something like a word processor and the like, more typical of desktop applications. However, video and pictures are applications that have more computational needs so maybe they are the best realistic benchmark?
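
The point about N-body calculations being independent within a timestep can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: a one-dimensional system, a gravitational constant of 1, made-up positions and masses, and a thread pool standing in for whatever GCD or OpenCL would actually provide.

```python
from concurrent.futures import ThreadPoolExecutor

def acceleration(i, positions, masses, g=1.0):
    """Net 1-D gravitational acceleration on body i from all other bodies."""
    total = 0.0
    for j, (xj, mj) in enumerate(zip(positions, masses)):
        if j != i:
            dx = xj - positions[i]
            total += g * mj * dx / abs(dx) ** 3
    return total

def step_accelerations(positions, masses, workers=4):
    """Compute every body's acceleration for one timestep concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task only reads the shared positions, so no task needs
        # another task's result until the next timestep.
        return list(pool.map(lambda i: acceleration(i, positions, masses),
                             range(len(positions))))

positions = [0.0, 1.0, 3.0]   # made-up values
masses = [1.0, 1.0, 1.0]
print(step_accelerations(positions, masses))
```

Because the tasks share no mutable state, the work can be split across as many cores as there are bodies. Video encoding lacks that property wherever one frame depends on another.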
Old 09-18-2009, 07:57 AM   #24
mikemcfarlane
Registered User
 
Join Date: Jun 2009
Posts: 2
Let's hope Apple applies all this new technology to a rewrite of their fantastic quality, but fantastically slow Aperture app. Having just spent £130 on it and having to stop using it (except for my portfolio grade RAW conversions) as it is so slow to process RAW files (and I know I'm not alone in this problem), I really hope there is some scope for still image as well as video improvements.
Old 09-18-2009, 08:36 AM   #25
shadow
Registered User
 
Join Date: Feb 2005
Posts: 347
These new technologies require a paradigm shift from developers. The future seems to be heading there and Apple skates "where the puck is going to be". An unexpected breakthrough in semiconductor/processor technology could bring the free ride on core speed back, however. Very unlikely, but possible.
Old 09-18-2009, 08:37 AM   #26
shadow
Registered User
 
Join Date: Feb 2005
Posts: 347
Quote:
Originally Posted by mikemcfarlane View Post
Let's hope Apple applies all this new technology to a rewrite of their fantastic quality, but fantastically slow Aperture app. Having just spent £130 on it and having to stop using it (except for my portfolio grade RAW conversions) as it is so slow to process RAW files (and I know I'm not alone in this problem), I really hope there is some scope for still image as well as video improvements.
I am with you here. Fingers crossed for Aperture
Old 09-18-2009, 09:45 AM   #27
Roc Ingersol
Registered User
 
Join Date: Oct 2008
Location: Detroit, MI
Posts: 123
Quote:
Originally Posted by cwingrav View Post
I don't know too much about video optimization but would guess that it too is pretty open for optimization w/ multiple cores but there are quite a few dependencies between frames and the like that make it more complicated and tricky.
Slightly. You should be able to split the video into chunks by keyframes. Then dependencies aren't a problem. And while the process isn't maximally parallelized, it's a fairly straightforward change for a huge benefit that can be implemented before a serious re-write that gets in deeper.
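
Splitting at keyframes can be sketched as follows in Python. This is only an illustration: frames are represented by their type letters ('I' for a keyframe), the function name is made up, and it assumes closed GOPs, i.e. that no frame refers to anything on the other side of a keyframe.

```python
def split_at_keyframes(frame_types):
    """Split frame indices into chunks that each start at a keyframe ('I')."""
    chunks = []
    for index, kind in enumerate(frame_types):
        if kind == "I" or not chunks:
            chunks.append([])      # a keyframe opens a new independent chunk
        chunks[-1].append(index)
    return chunks

# A made-up frame sequence with keyframes at positions 0, 4 and 9.
frames = list("IPPBIPBBPIPP")
for chunk in split_at_keyframes(frames):
    print(chunk)
```

Each resulting chunk could then be handed to a separate core and the outputs joined in order afterwards.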
Old 09-18-2009, 10:05 AM   #28
manonthemove
Registered User
 
Join Date: Sep 2009
Posts: 2
Quote:
Originally Posted by cwingrav View Post
Just to clarify a bit about optimization by multiple cores, it all really depends on the algorithm. N-body calculations are pretty much the ABSOLUTE BEST case for multi-core optimization as the calculations are small and are not dependent upon each other's results, until the next timestep (and even that can be fudged). I don't know too much about video optimization but would guess that it too is pretty open for optimization w/ multiple cores but there are quite a few dependencies between frames and the like that make it more complicated and tricky. I think the best test will be for something like a word processor and the like, more typical of desktop applications. However, video and pictures are applications that have more computational needs so maybe they are the best realistic benchmark?
I agree with you that everyday tasks matter more (especially those involving video, pictures, and music) and that the optimization is algorithm dependent. However, the ability to get an order of magnitude in performance is there. I hope developers will figure out better algorithms for everyday tasks so they can get more than just a factor of 2.
Old 09-18-2009, 10:19 AM   #29
Tauron
Registered User
 
Join Date: Jun 2008
Posts: 888
Quote:
Originally Posted by IQatEdo View Post
Hi

Have you tried Windows 7 on your system yet? Just interested in a speed comparison.

Looking forward to the new iMacs, interesting to see where they are going.
Windows 7 will now require 4 GB of RAM as a minimum and applications running on Windows 7 will use 30% more CPU cycles to keep it from crashing.
Old 09-18-2009, 11:28 AM   #30
DESuserIGN
Registered User
 
Join Date: Apr 2007
Location: Chicago
Posts: 82
Quote:
Originally Posted by coolfactor View Post
The 130% figure illustrates that the application was able to split across two processors (parallelism), rather than max out just one, but some of the work was also offloaded to the GPUs, which would not be reported in the CPU utilization. The 130% alone is *not* the full processing power being tapped.

There are two technologies being used here:
- Grand Central Dispatch - utilizing multiple CPUs/cores
- OpenCL - utilizing graphics processors
Yes. But I would add that GCD not only orchestrates the use of available cores/CPUs but also other resources such as DSPs as well as OpenCL-capable GPUs. I guess their close association is why folks confuse/conflate them.
Old 09-18-2009, 11:36 AM   #31
Gazoobee
Registered User
 
Join Date: Feb 2009
Location: Somewhere in the Cheese
Posts: 464
Quote:
Originally Posted by coolfactor View Post
The 130% figure illustrates that the application was able to split across two processors (parallelism), rather than max out just one, but some of the work was also offloaded to the GPUs, which would not be reported in the CPU utilization. The 130% alone is *not* the full processing power being tapped.

There are two technologies being used here:
- Grand Central Dispatch - utilizing multiple CPUs/cores
- OpenCL - utilizing graphics processors
I don't pretend to know about the details of this kind of stuff. I was just thinking that since I'm using very similar (or perhaps even the exact same) hardware, and since Handbrake is also performing the same kind of work, a direct comparison could be made.

I'm taking the 130% (and my 750%) as direct indications of how many cores are in use (1.3 and 7.5), but as I say I'm not totally sure of that.

It's nice to see real world implementations of this sort of thing so soon either way. Too often someone invents a really cool and much better way to do things and yet it's never implemented because of some foolish capitalist or legal reason that has no bearing on the technology itself. Hopefully the uptake on these technologies will be better than that.


Old 09-18-2009, 11:39 AM   #32
DESuserIGN
Registered User
 
Join Date: Apr 2007
Location: Chicago
Posts: 82
Quote:
Originally Posted by lilgto64 View Post
Did I miss it - did the article or that comment mention how many CPUs were in the test machine? When I read that I was thinking dual-core machine where before only 1 core could be tapped and now both cores are used for a 30% boost in performance. I was thinking this because aren't the majority of Intel Macs out there dual core? Core 2 or Core 2 Duo? iMac, Mac Mini, MacBook, MacBook Pro? Sure there are 8 core desktops and xServe - and that is more likely where you would expect to see these apps - just saying that the configuration of the test platform was not obvious to me.
Yes. Some Core 2s (the Core 2 Solo, I think only in some early Minis) are single core.
Old 09-18-2009, 11:48 AM   #33
DESuserIGN
Registered User
 
Join Date: Apr 2007
Location: Chicago
Posts: 82
Quote:
Originally Posted by FattyMcButterpants View Post
You're thinking of all the talk about debug code being left in OS X. Back when OS X was slow (remember how Finder windows would not move around cleanly?), lots of people were saying Apple had left debugging code in the system. That was a bunch of bull.

The 6GB of disk space reclaimed in Snow Leopard was mostly from removing PowerPC support.

The performance improvements in SL come from some massive under-the-hood optimizations, not from removing PPC support: the code that executed in Universal apps was determined at run time, and the support for the other CPU simply remained on disk.
I've read that some of the reclaimed space comes from the move to decimal file sizes.
1 GB (binary) = 1024 MB = 1024*1024*1024 B ≈ 1.074 GB (decimal), or a 7% "gain" in disk size.
I assume files also store a little more efficiently with the smaller byte-sized blocks?
Anyone know what the true skinny is on this?
Old 09-18-2009, 11:59 AM   #34
DESuserIGN
Registered User
 
Join Date: Apr 2007
Location: Chicago
Posts: 82
Quote:
Originally Posted by shadow View Post
Definitely not as simple as checking a couple of check boxes. GCD has support at Cocoa level, which makes its use much simpler. Also, Cocoa applications can enjoy a "free lunch" sometimes: Apple says that CoreImage was re-written using OpenCL and got a 30% (AFAIR) boost on average. Some of the Leopard classes (NSOperation?) take advantage of GCD without a recompile. But the rest of the code needs to change. And OpenCL may need changes to the software architecture. OpenCL is good for a relatively limited number of tasks, N-body calculations being the showcase example.
Definitely not simple, but also definitely not so hard either. Minor changes in the code can enable parallelization by GCD (at least for the CPU cores). I'm not a coder, but here is a link to what I have read:
http://www.macresearch.org/cocoa-sci...-grand-central


Last edited by DESuserIGN; 09-18-2009 at 01:06 PM..
Old 09-18-2009, 12:18 PM   #35
Lukeskymac
Registered User
 
Join Date: Apr 2009
Posts: 80
Quote:
Originally Posted by camroidv27 View Post
This type of thing is the wave of the future. GPUs are quite powerful chips too, so I'm glad to see this.

I was encoding a video file today on my PC (with specs a little better than the lowest Mac Pro, including the Xeon processor) and with CUDA functions enabled along with multi-threading, I was amazed at how fast this thing cranked through video! And that's just on *Gasp* Vista! If my PC can go that fast, I imagine the Mac must be....
... well, probably about the same speed, maybe a little faster. But still, it should be amazingly blinding fast too!
No, it has the potential to be a lot faster, because while OpenCL may be similar to CUDA, there is nothing available to Windows that works like Grand Central Dispatch.
Old 09-18-2009, 12:34 PM   #36
solipsism
Registered User
 
Join Date: Apr 2006
Location: The Ansible
Posts: 11,895
Quote:
Originally Posted by Denmaru View Post
Oh my god... we need an OpenCL and GrandCentral ready version of Handbrake...
It’s open source so have at it. There hasn’t been a new update for a year so I am doubtful that we’ll see anyone take the ball and run with it at this point.


Quote:
Originally Posted by DESuserIGN View Post
I've read that some of the reclaimed space comes from the move to decimal file sizes.
1 GB (binary) = 1024 MB = 1024*1024*1024 B ≈ 1.074 GB (decimal), or a 7% "gain" in disk size.
I assume files also store a little more efficiently with the smaller byte-sized blocks?
Anyone know what the true skinny is on this?
Technically speaking, the change from BASE-2 to BASE-10 does not alter the space used. A Byte is a Byte. You’ll get back at least 7GB but many are reporting 20GB, because of other software installed, but mostly because of the BASE change. I’m glad they made the change and everyone else needs to get on board. There is no reason why the user needs to be doing binary calculations when decimal is natural. Let the computer deal with binary; it’s good at it.

As for your 7%, that is only for 1GB. With the Terabyte nomenclature (a common size now) that discrepancy jumps to 10%. Apple should have gone a step further and used the Kibi-, Mebi-, Gibi- and Tebibyte of the IEC standard so that it’s not confused with the SI standard of Kilo-, Mega-, Giga-, and Terabyte now that they are using BASE-10 in the OS UI. I can’t think of anything else that uses exactly the same notation to represent two similar but very distinct quantities in math. It’s fraught with issues.
http://www.iec.ch/zone/si/si_bytes.htm
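
The 7% and 10% figures fall straight out of the ratio between powers of 1024 and powers of 1000. A quick Python sketch (the function name is illustrative only) shows how the gap widens with each prefix:

```python
PREFIXES = ["kilo", "mega", "giga", "tera"]

def binary_excess_percent(power: int) -> float:
    """How much larger 1024**power is than 1000**power, in percent."""
    return (1024 ** power / 1000 ** power - 1) * 100

for power, name in enumerate(PREFIXES, start=1):
    print(f"{name}: {binary_excess_percent(power):.1f}%")
```

The gap is about 2.4% at the kilobyte level, 4.9% for megabytes, 7.4% for gigabytes and just under 10% for terabytes. None of that is reclaimed space; it is the same number of bytes reported in a different unit.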
Old 09-18-2009, 01:03 PM   #37
DESuserIGN
Registered User
 
Join Date: Apr 2007
Location: Chicago
Posts: 82
Quote:
Originally Posted by solipsism View Post
Technically speaking, the change from BASE-2 to BASE-10 does not alter the space used. A Byte is a Byte. You’ll get back at least 7GB, but many are reporting 20GB, partly because of other software installed but mostly because of the BASE change. I’m glad they made the change and everyone else needs to get on board. There is no reason why the user needs to be doing binary calculations when decimal is natural. Let the computer deal with binary; it’s good at it.

As for your 7%, that is only at the Gigabyte level. With the Terabyte nomenclature (a common size now) that discrepancy jumps to 10%. Apple should have gone a step further and used the Kibi-, Mebi-, Gibi- and Tebibyte of the IEC standard so that it’s not confused with the SI standard of Kilo-, Mega-, Giga-, and Terabyte now that they are using BASE-10 in the OS UI. I can’t think of anything else that uses exactly the same notation to represent two similar but very distinct quantities in math. It’s fraught with issues.
http://www.iec.ch/zone/si/si_bytes.htm
Sorry I was not clear. Is there any "real" regained disk space, or is it all just imagined by people who were not aware of the switch to decimal MB and GB? Presumably Apple would not be deceptive about this. The reported disk savings I have heard posted seem more in line with the 7-10% one would expect from the decimal change.

Also my other thought was that there might be savings due to a change in block sizes (? not sure if that's the right term) on the disk. Smaller blocks might be slightly more efficient at storing some kinds of files (but this is totally wild ass uninformed speculation on my part.)
Old 09-18-2009, 01:09 PM   #38
solipsism
Registered User
 
Join Date: Apr 2006
Location: The Ansible
Posts: 11,895
Quote:
Originally Posted by DESuserIGN View Post
Sorry I was not clear. Is there any "real" regained disk space, or is it all just imagined by people who were not aware of the switch to decimal MB and GB? Presumably Apple would not be deceptive about this. The reported disk savings I have heard posted seem more in line with the 7-10% one would expect from the decimal change.

Also my other thought was that there might be savings due to a change in block sizes (? not sure if that's the right term) on the disk. Smaller blocks might be slightly more efficient at storing some kinds of files (but this is totally wild ass uninformed speculation on my part.)
Yes, at least 7GB is real space that is freed up. Most of the additional space is just a difference in reporting from binary to decimal, which grows with partition size; that is why you see reports of 20GB and more.
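To make the split between real and reported savings concrete, here is a small Python sketch. The numbers are made up for illustration (150 binary GB free before the upgrade, 7 decimal GB actually freed), and `apparent_gain` is a hypothetical helper, not anything from the OS.

```python
GIB = 1024 ** 3  # bytes in one binary "GB", as shown before the change
GB = 1000 ** 3   # bytes in one decimal GB, as shown after the change


def apparent_gain(free_before_bytes: int, freed_bytes: int) -> tuple[float, float]:
    """Split the jump in displayed free space into (real, reporting) parts.

    Both parts are expressed in displayed GB.
    """
    shown_before = free_before_bytes / GIB               # old binary display
    shown_after = (free_before_bytes + freed_bytes) / GB  # new decimal display
    real = freed_bytes / GB
    reporting = shown_after - shown_before - real
    return real, reporting


if __name__ == "__main__":
    # Hypothetical disk: 150 binary GB free, upgrade frees 7 decimal GB.
    real, reporting = apparent_gain(150 * GIB, 7 * GB)
    print(f"real: {real:.1f} GB, reporting change: {reporting:.1f} GB")
```

With those assumed numbers the display jumps by about 18, of which only 7 is space that was actually freed. The more free space there was to begin with, the bigger the reporting part gets.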
Old 09-18-2009, 01:55 PM   #39
cjones051073
Registered User
 
Join Date: Sep 2009
Posts: 4
Quote:
Originally Posted by Gazoobee View Post
Sounds good, but Handbrake on my Mac Pro uses between 500% and 750% of the CPU. Unless we are talking different things, 130% seems low.
Don't expect to see any improvement in Handbrake due to these technologies.

Handbrake can already efficiently use all cores on multicore machines, since the x264 library it uses supports this, and has done for some time, long before Grand Central Dispatch came along. So nothing to gain there.

Moreover, the x264 devs have already looked into OpenCL/CUDA and (from memory) concluded there is not much they can gain from it. GPUs may well be fast, but they have some serious limitations that make them less than ideal for H.264 encoding (note I said encoding, not decoding...).

Last but not least, Handbrake is multi-platform. They support Linux and Windows as well as OS X, so they are unlikely to start a widespread rewrite for some new technology only available on one platform.

Chris
Old 09-18-2009, 02:21 PM   #40
Munch
Registered User
 
Join Date: Jun 2003
Location: BC Canada
Posts: 6
Space saving is more than the loss of PPC code

The space saving in 10.6 is a lot more than stripping PPC code and redefining a GB. Apple actually optimized a ton of code AND implemented compression across the board under the hood. If you haven't read the Ars review, here is a full page detailing it: http://arstechnica.com/apple/reviews...s-x-10-6.ars/3
All times are GMT -5. The time now is 11:19 AM.