Comments
Originally posted by Programmer
... The problem isn't the compilers; that is a symptom. The problem (as you astutely point out) is the languages, and this is much deeper than just the type/length of intermediates. This is going to become more obvious going forward.
Well, what can we do about that? As a former assembler guy, I've looked at generated assembly from C and didn't like it much; C++ is far worse due to all the dynamic pointers. But the world is going off in the direction of more complex languages, not to mention the scripting and bytecode stuff. It's almost like the software world is deliberately wasting the processor advances.
There ought to be a way to design an HLL that generates efficient code. But first, efficiency will have to become a priority, and I don't see that happening.
Originally posted by cubist
Well, what can we do about that? As a former assembler guy, I've looked at generated assembly from C and didn't like it much; C++ is far worse due to all the dynamic pointers. But the world is going off in the direction of more complex languages, not to mention the scripting and bytecode stuff. It's almost like the software world is deliberately wasting the processor advances.
There ought to be a way to design an HLL that generates efficient code. But first, efficiency will have to become a priority, and I don't see that happening.
Yep.
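To make the "dynamic pointers" complaint above concrete, here is a minimal, hypothetical C++ sketch (not from either poster; the class names and figures are invented purely for illustration) contrasting a direct call, which a compiler can inline away entirely, with a virtual call through a base-class pointer, which typically compiles to a vtable load plus an indirect branch:

```cpp
#include <cstdio>

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;      // dynamic dispatch: resolved through the vtable at run time
};

struct Circle : Shape {
    double r;
    explicit Circle(double radius) : r(radius) {}
    double area() const override { return 3.14159265358979 * r * r; }
};

// Direct (static) call: the compiler knows the target and can inline it completely.
inline double circle_area(double r) { return 3.14159265358979 * r * r; }

double total_area_virtual(const Shape* const* shapes, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += shapes[i]->area();          // typically: load vtable pointer, load slot, indirect call
    return sum;
}

double total_area_direct(const double* radii, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += circle_area(radii[i]);      // typically inlined down to a couple of multiplies and an add
    return sum;
}

int main() {
    Circle a(1.0), b(2.0);
    const Shape* shapes[] = { &a, &b };
    double radii[] = { 1.0, 2.0 };
    std::printf("%f %f\n", total_area_virtual(shapes, 2), total_area_direct(radii, 2));
    return 0;
}
```

Whether the indirect call actually costs anything in a given program depends on the compiler, the branch predictor and whether the call can be devirtualized, so treat this as an illustration of the kind of indirection being complained about, not as a benchmark.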
Originally posted by Jootec from Mars
A while ago I ran the Cinebench benchmark on two PCs and my Mini. The last test, the scene render, took the following times:
Mac Mini 1.42GHz 1GB - 201 seconds.
Pentium-M 1.4GHz 256MB - 141 seconds.
Pentium 4 3GHz 2GB - 80 seconds.
All Cinebench shows is that Apple's OpenGL implementation sucks. If they don't improve their OpenGL implementation in the Intel Macs, then we'll have equally embarrassing scores there too.
Come on, if a quad G5 with a high-end nVidia card can get lower scores than some old 9600-equipped Pentium 4, then you know it's not the CPU that's the issue. For whatever reason, Apple's graphics are slow, and the OS shoves a lot more through a graphics card than Windows does, hence the appearance of it being slow. In the Cinebench tests, rendering the image to the display slows down the calculation.
Originally posted by aegisdesign
All Cinebench shows is that Apple's OpenGL implementation sucks. If they don't improve their OpenGL implementation in the Intel Macs, then we'll have equally embarrassing scores there too.
Come on, if a quad G5 with a high-end nVidia card can get lower scores than some old 9600-equipped Pentium 4, then you know it's not the CPU that's the issue. For whatever reason, Apple's graphics are slow, and the OS shoves a lot more through a graphics card than Windows does, hence the appearance of it being slow. In the Cinebench tests, rendering the image to the display slows down the calculation.
The render scene is entirely CPU limited. The G4 is just a terrible processor.
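For what it's worth, a rough per-clock reading of the numbers quoted above supports that (my arithmetic, and it assumes the render really is CPU-bound and scales linearly with clock speed, which is only approximately true): 201 s × 1.42 GHz ≈ 285 billion cycles for the Mac Mini's G4, 141 s × 1.4 GHz ≈ 197 billion for the Pentium-M, and 80 s × 3 GHz ≈ 240 billion for the Pentium 4. On that crude measure the G4 needs roughly 45% more cycles than the Pentium-M to render the same scene, so the gap is not just clock speed.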
OpenGL (and any 3D API) is a complex beast, and there are many ways to use it and many ways to implement it. Each implementation involves trade-offs that affect how it behaves at a system level, how it responds to what the individual applications do with it, and how it interacts with the actual CPU and GPU hardware. Plus you need to consider how well the drivers themselves are implemented.
I haven't personally seen Apple's implementation code, nor the ATI & nVidia drivers. I haven't seen the application code for these various poorly performing applications. I have seen many, many examples of application authors who don't have a clue about efficient hardware utilization, drivers built on hacked and maligned code that has evolved over many years (rather than a clean re-implementation for a new CPU/GPU/OS combination), and graphics APIs which have been optimized for particular use cases and not others. One weak link can result in crappy performance. More than one can result in crappy performance that is hard to explain.
I have also seen Apple's OpenGL implementation, pushed by well-written applications, smoke the PC version on what should be equivalent hardware. This is not a simple situation, so try not to oversimplify assessments of it.
Originally posted by Existence
The render scene is entirely CPU limited. The G4 is just a terrible processor.
The core itself isn't too bad. The SDR bus should have been dealt with four years ago.
Originally posted by BenRoethig
The core itself isn't too bad. The SDR bus should have been dealt with four years ago.
Well, the core has a few issues too, but really they should have gone to an on-chip memory controller / DMA unit plus some IO. Freescale has some SoC technology and they could build an interesting laptop chip, but they are too busy preserving MPX compatibility. That's probably because Apple won't pay for it.
Originally posted by Programmer
I have also seen Apple's OpenGL implementation, pushed by well-written applications, smoke the PC version on what should be equivalent hardware. This is not a simple situation, so try not to oversimplify assessments of it.
OK, but in the real world, where applications share the same code between Windows and Mac, anything that seems to be dual platform is getting smoked by Windows OpenGL, from games to 3D applications. So I'd suggest that Apple would be wise to spend some time working out why the Windows versions of common applications run OpenGL so much quicker.
Originally posted by Programmer
Well, the core has a few issues too, but really they should have gone to an on-chip memory controller / DMA unit plus some IO. Freescale has some SoC technology and they could build an interesting laptop chip, but they are too busy preserving MPX compatibility. That's probably because Apple won't pay for it.
This is of course what they have planned for the 8641D, which has an onboard DDR2 controller PER CORE, four Gigabit Ethernet controllers, RapidIO and PCI Express onboard, and supposedly comes in at 25W typical for a 1.5GHz chip. It has an FSB capable of 9.6GB/s, so easily quicker than even Intel's.
I'd guess Apple were not keen on the lateness (shipping at the end of 2006), RapidIO, the power consumption or the low clock speed, however.
Originally posted by aegisdesign
OK, but in the real world, where applications share the same code between Windows and Mac, anything that seems to be dual platform is getting smoked by Windows OpenGL, from games to 3D applications. So I'd suggest that Apple would be wise to spend some time working out why the Windows versions of common applications run OpenGL so much quicker.
More correctly, anything that is dual platform and originates on the PC will perform better than on the Mac. If code is optimized for Mac then ported to PC, you are likely to see the situation reversed. Unfortunately that rarely happens. The situation will get more interesting with Windows Vista, because there Microsoft will be aiming for the same level of OS functionality and robustness as Apple has, so the playing field will be somewhat more level. The main problem will be that MS is de-emphasizing OpenGL relative to Direct3D, and that will mean a big performance hit for OGL on Windows.
Originally posted by aegisdesign
This is of course what they have planned for the 8641D, which has an onboard DDR2 controller PER CORE, four Gigabit Ethernet controllers, RapidIO and PCI Express onboard, and supposedly comes in at 25W typical for a 1.5GHz chip. It has an FSB capable of 9.6GB/s, so easily quicker than even Intel's.
I'd guess Apple were not keen on the lateness (shipping at the end of 2006), RapidIO, the power consumption or the low clock speed, however.
Yeah, I'm well aware of Freescale's plans... but their slow rate of advance and emphasis on the embedded rather than the laptop space has no doubt contributed to Apple's switch to Intel.
The most recent 7448 spec on the Freescale site suggests that a 1.7GHz processor sucks down 21-25W at peak.
This still doesn't hide the fact that something must have happened... they are still not in Apple laptops. Either Apple has already 'jumped' or F'scale can't produce it in high enough volume.
This discussion seems to be a repeat of June this year... I have a strange feeling of déjà vu reading this thread.
As for the evolution of the G4, Ars has had a number of threads talking about why it didn't progress further than it has. One poster suggested that the FSB hadn't progressed past 167MHz because Apple hadn't asked for it or wasn't prepared to pay for improvements. That's not even going into the debate as to whether F'scale wanted Apple to take RapidIO as the next FSB and run with it as the next interconnect. Personally, this doesn't make any sense, but... I'm not in any position to comment otherwise.
Intel is it...viva no difference
Originally posted by Programmer
More correctly, anything that is dual platform and originates on the PC will perform better than on the Mac. If code is optimized for Mac then ported to PC, you are likely to see the situation reversed. Unfortunately that rarely happens.
Such days may come.
Originally posted by Programmer
More correctly, anything that is dual platform and originates on the PC will perform better than on the Mac. If code is optimized for Mac then ported to PC, you are likely to see the situation reversed. Unfortunately that rarely happens.
Which is why I think there's some mileage in Apple working out how to get cross-platform apps which may have started out on Windows running faster. And for that matter, if they want developers to work on Macs, removing the disparity the other way too.
Originally posted by Programmer
The situation will get more interesting with Windows Vista, because there Microsoft will be aiming for the same level of OS functionality and robustness as Apple has, so the playing field will be somewhat more level. The main problem will be that MS is de-emphasizing OpenGL relative to Direct3D, and that will mean a big performance hit for OGL on Windows.
Microsoft seem to be de-emphasizing everything in the OS unless you've got pretty high-end kit, whereas Apple allows most effects even on my 8MB-VRAM-equipped iBook G3. I was actually quite stunned they still had Exposé and Dashboard working on it.
I thought Microsoft only provided a stub implementation of OpenGL, though, and the actual implementation was up to the card vendors? Whereas with Apple, it's mostly Apple that has to do the work?
Originally posted by aegisdesign
I thought Microsoft only provided a stub implementation of OpenGL, though, and the actual implementation was up to the card vendors? Whereas with Apple, it's mostly Apple that has to do the work?
MS actually supports two variations, the high-performance one being called the ICD (IIRC), and it has ownership of the entire OGL implementation from the API calls down to the hardware. That tends to give too much control to the GPU vendor, however, and limits what MS can do... which is why they are deprecating it in Vista. The other model is so rarely used I can't remember what it's called, and for that reason it is likely worse than Apple's OGL implementation in all respects.
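For context on the ICD model: on Windows the application links against Microsoft's opengl32.dll, but once a hardware-accelerated pixel format and context are created, opengl32.dll hands the GL entry points over to the vendor's installable client driver, which then owns the whole path down to the GPU. A minimal, hypothetical Win32/WGL sketch of where that hand-off happens (error handling trimmed, the function name create_gl_context is invented for illustration; link against opengl32.lib and gdi32.lib):

```cpp
#include <windows.h>
#include <GL/gl.h>

// Create a basic hardware OpenGL context for an existing window.
bool create_gl_context(HWND hwnd)
{
    HDC hdc = GetDC(hwnd);

    PIXELFORMATDESCRIPTOR pfd = {};
    pfd.nSize      = sizeof(pfd);
    pfd.nVersion   = 1;
    pfd.dwFlags    = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL | PFD_DOUBLEBUFFER;
    pfd.iPixelType = PFD_TYPE_RGBA;
    pfd.cColorBits = 32;
    pfd.cDepthBits = 24;

    // If the vendor's ICD exposes a matching accelerated format, this picks it up.
    int format = ChoosePixelFormat(hdc, &pfd);
    if (format == 0 || !SetPixelFormat(hdc, format, &pfd))
        return false;

    // From here on, opengl32.dll routes GL calls to the ICD (or to the
    // "GDI Generic" software renderer if no ICD claimed the format).
    HGLRC ctx = wglCreateContext(hdc);
    if (!ctx || !wglMakeCurrent(hdc, ctx))
        return false;

    // GL_VENDOR reads e.g. "NVIDIA Corporation" or "ATI Technologies Inc." on an
    // ICD, and "Microsoft Corporation" if you fell back to the software renderer.
    const char* vendor = reinterpret_cast<const char*>(glGetString(GL_VENDOR));
    return vendor != nullptr;
}
```

Nothing in this sketch is specific to the thread; it just shows that after wglCreateContext the driver vendor, not Microsoft, decides how every GL call behaves, which is the control issue being described above.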