Faster FSBs instead of faster CPUs

Posted in Future Apple Hardware (edited January 2014)
The next revision of PowerMacs may not have a faster processor, but what if Apple kept working on improving all the smaller stuff? The FSB is at 1.25GHz; if they boosted it to 1.75GHz, performance would increase no matter how fast the processor is, correct? If Apple could get their FSB to 2GHz, that would be quite impressive. They could also implement faster RAM (550MHz RAM is already out as PC4400), unidirectional traffic between the I/O and system controller, and a unidirectional PCIe x16 slot.



If Apple increased all this other stuff besides the processor, they would be setting up a rock-solid, powerful base for when we get that extra 1GHz boost from a dual 3GHz system. The new 3GHz chips would absolutely love all that bandwidth. 16GB/s of aggregate FSB bandwidth is incredible, and all creative pros would love it.





The 2.5GHz G5 chip is powerful enough for now, IMO; we could get dual-3GHz performance if the RAM was upgraded to 550MHz and the FSB to 2GHz.
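
To put rough numbers on that, here is a back-of-the-envelope sketch (assuming a 970-style bus: two unidirectional 32-bit links per processor, clocked as stated; Apple's marketing figures may count this differently). Under these assumptions, the oft-quoted 16GB/s aggregate figure corresponds to a pair of 1GHz buses:

```python
# Rough peak FSB bandwidth for a dual-CPU G5-style machine.
# Assumes two unidirectional 32-bit (4-byte) links per CPU, so
# peak per-CPU bandwidth = clock * 4 bytes * 2 directions.
BYTES_PER_LINK = 4   # each link is 32 bits wide
DIRECTIONS = 2       # separate inbound and outbound paths
CPUS = 2

for fsb_ghz in (1.0, 1.25, 1.75, 2.0):
    per_cpu = fsb_ghz * BYTES_PER_LINK * DIRECTIONS  # GB/s
    print(f"FSB {fsb_ghz:.2f}GHz: {per_cpu:.1f}GB/s per CPU, "
          f"{per_cpu * CPUS:.1f}GB/s aggregate")
```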

Comments

  • Reply 1 of 19
    zapchud Posts: 844 member
    What if increasing core frequency is easier than increasing FSB frequency? And cheaper?



    What if increasing FSB bandwidth didn't actually increase performance in any substantial (or measurable) way?



    How about "simply" ditching the FSB like AMD did? That would do the job instead, because it actually does work (as in increasing performance).



    How about reducing the latency of the FSB?



    Faster RAM would also work, because the FSBs are already too fast for the RAM they're currently using.



    There are too many good alternatives to increasing FSB bandwidth/frequency. :-)
  • Reply 2 of 19
    hmurchison Posts: 12,419 member
    I'm sure you could engineer a faster FSB, but it comes to a point where you'll hit diminishing returns. The question we need to ask is: why do we need an FSB anyway?



    On-die memory controllers seem to be the most sensible way. You increase the pin count and complexity of the CPU, but those costs are somewhat ameliorated by eschewing the need for an external memory controller.



    DDR2 has additional latency issues because it pumps the bus more than DDR. It's pretty much the same path the FSB saw when we moved from double-pumped buses to quad-pumped.



    To me, it's looking like DDR2-667 and beyond will be best mated with an on-die memory controller, which should offset the increase in memory latency thanks to its efficiency.



    Then we just connect the CPUs via something like HyperTransport 2.0 and find an efficient way to delegate control over snooping the caches, perhaps a bit more efficient than the Opteron MP systems in which one CPU handles that (not sure what type of performance hit this entails).
  • Reply 3 of 19
    zapchud Posts: 844 member
    Quote:

    Originally posted by hmurchison

    I'm sure you could engineer a faster FSB, but it comes to a point where you'll hit diminishing returns. The question we need to ask is: why do we need an FSB anyway?



    I bet we see plenty of diminishing returns already, at least until there is RAM fast enough to supply the FSBs (if there ever is).
  • Reply 4 of 19
    If I'm not mistaken, the G5's FSB always operates at half the processor's frequency. That would put the ball in IBM's court. Besides, engineering isn't holding back G5 speed bumps; it's manufacturing.
  • Reply 5 of 19
    merovingian Posts: 436 member
    Quote:

    Originally posted by Altivec_2.0

    The next revision of PowerMacs may not have a faster processor, but what if Apple kept working on improving all the smaller stuff? The FSB is at 1.25GHz; if they boosted it to 1.75GHz, performance would increase no matter how fast the processor is, correct? If Apple could get their FSB to 2GHz, that would be quite impressive. They could also implement faster RAM (550MHz RAM is already out as PC4400), unidirectional traffic between the I/O and system controller, and a unidirectional PCIe x16 slot.



    If Apple increased all this other stuff besides the processor, they would be setting up a rock-solid, powerful base for when we get that extra 1GHz boost from a dual 3GHz system. The new 3GHz chips would absolutely love all that bandwidth. 16GB/s of aggregate FSB bandwidth is incredible, and all creative pros would love it.





    The 2.5GHz G5 chip is powerful enough for now, IMO; we could get dual-3GHz performance if the RAM was upgraded to 550MHz and the FSB to 2GHz.




    You're right. If the system bus and main memory were faster, computers in general would be faster today. Over the years, we have concentrated on getting CPUs faster and faster, and the following has happened:

    The disparity between memory and CPU speed has increased dramatically... m.
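
    To put rough numbers on that disparity, here is a sketch with assumed but typical figures (DRAM access latency has hovered near 100ns while core clocks climbed; the exact latency is an assumption, not a measurement):

    ```python
    # Cost of a trip to main memory, in CPU cycles, as clocks rise.
    # Assumes DRAM latency stays roughly flat at ~100ns (a typical
    # ballpark figure, not a spec for any particular machine).
    DRAM_LATENCY_NS = 100.0

    for cpu_ghz in (0.5, 1.0, 2.0, 2.5, 3.0):
        cycles = DRAM_LATENCY_NS * cpu_ghz  # GHz = cycles per ns
        print(f"{cpu_ghz:.1f}GHz CPU: a DRAM access costs ~{cycles:.0f} cycles")
    ```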
  • Reply 6 of 19
    Building faster FSBs is hard to do and would lead to huge heat dissipation from the chipset (actually, in the PMac G5, the chipset dissipates more heat than the 970 itself).



    Breaking the memory wall is the solution for sure...
  • Reply 7 of 19
    programmer Posts: 3,457 member
    Quote:

    Originally posted by The One to Rescue

    Building faster FSBs is hard to do and would lead to huge heat dissipation from the chipset (actually, in the PMac G5, the chipset dissipates more heat than the 970 itself).



    Do you have any concrete proof of that? It is accepted that the chipset is quite hot, but that's a long way from saying that it generates more heat than the processor. The heat pipe on the G5 chipset is nothing like the monster heat sinks on the processors, so I'm extremely skeptical of your statement unless you can produce some technical document proving it.







    I don't think we can break the "memory wall", as you called it. Latency relative to the speed of computation is increasing all the time, even within a single chip. We can't increase the physical speed at which we can send signals (due to the immutable wall of the speed of light), and most of the techniques we use for speeding up systems cause the length of the path to increase in favour of bandwidth. There are a few architectural things that can be done to the G5, in particular, which have been done in other systems (on-chip memory controller being the leading example) that will temporarily improve the situation, but not without cost. And then the situation will worsen in the next generation of hardware as computational speed and bandwidth improve again, but latency doesn't.
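
    For a sense of scale on the speed-of-light point, one can compute how far light travels in a single clock cycle; this is a hard upper bound, and real signals in copper or silicon are considerably slower:

    ```python
    # Maximum distance any signal can cover in one clock cycle,
    # at lightspeed in vacuum (real on-chip signals are slower).
    C = 299_792_458.0  # metres per second

    for ghz in (1.0, 2.5, 3.0):
        cm = C / (ghz * 1e9) * 100
        print(f"{ghz:.1f}GHz: at most {cm:.1f}cm per cycle")
    ```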





    As for the original topic of the thread... I don't think trying to improve the FSB speed relative to processor speed (for the G5) would help noticeably. If the memory subsystem could be sped up, especially in terms of latency, that would be a much bigger deal. The biggest gains in that area are out of Apple's hands, however... either IBM's processor needs to talk to memory directly, or the memory has to get faster. Apple is probably working with IBM on the former, and waiting for the industry on the latter.
  • Reply 8 of 19
    Quote:

    Originally posted by Programmer

    Do you have any concrete proof of that? It is accepted that the chipset is quite hot, but that's a long way from saying that it generates more heat than the processor. The heat pipe on the G5 chipset is nothing like the monster heat sinks on the processors, so I'm extremely skeptical of your statement unless you can produce some technical document proving it.







    I don't think we can break the "memory wall", as you called it. Latency relative to the speed of computation is increasing all the time, even within a single chip. We can't increase the physical speed at which we can send signals (due to the immutable wall of the speed of light), and most of the techniques we use for speeding up systems cause the length of the path to increase in favour of bandwidth. There are a few architectural things that can be done to the G5, in particular, which have been done in other systems (on-chip memory controller being the leading example) that will temporarily improve the situation, but not without cost. And then the situation will worsen in the next generation of hardware as computational speed and bandwidth improve again, but latency doesn't.





    As for the original topic of the thread... I don't think trying to improve the FSB speed relative to processor speed (for the G5) would help noticeably. If the memory subsystem could be sped up, especially in terms of latency, that would be a much bigger deal. The biggest gains in that area are out of Apple's hands, however... either IBM's processor needs to talk to memory directly, or the memory has to get faster. Apple is probably working with IBM on the former, and waiting for the industry on the latter.




    As much as I agree with some of the sentiments in this thread, it seems like people need to look at the bigger picture to see what the bottlenecks are now. Hard drives are really slow compared to almost anything else in the system. That's expected, since they're basically the only mechanical device left (yes, along with optical drives, which are even slower, but those don't really limit the effective speed of the overall system). I'm really surprised that the mass storage industry hasn't been embracing radical new technology and research directions. I feel like WD, Maxtor, Hitachi, etc. are all stuck in a snowstorm with blinders on. Those are the companies that should be dedicating a lot of R&D towards new technology to REPLACE mechanical hard drives, not just to improve the existing technology. They don't seem to understand that they're in the MASS STORAGE industry, not the HARD DRIVE industry. Advances in all types of memory are pretty exciting these days, but nothing compared to what might be possible if a bunch of the mass storage guys put their waffles on the plate.



    [/rant]
  • Reply 9 of 19
    wizard69 Posts: 13,377 member
    Here is my take which can be digested at will.



    At this point it would not be much of a performance improvement to increase the FSB speed. This would especially be the case at the odd frequencies mentioned. Running 1:1 may offer marginal improvements, but there are far too many other bottlenecks to contend with.



    Probably the two most important things that could be done to improve performance at this point are to implement an integrated memory controller and to increase the cache size.



    A larger cache size, and other improvements there, are absolutely required; it does make one wonder about the 970FX and why Apple and IBM did nothing more than a die shrink. The problem is that memory is not getting faster at a rate that really matters, and this has a significant impact as clock rates increase: it requires proportionally larger caches to maintain good performance. At this point, though, the modest increase from 2 to 2.5GHz does not seem to have had a major impact on scaling, judging from the few bits of performance information seen so far.
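
    To see why in numbers, here is a textbook average-memory-access-time sketch; every figure below is assumed for illustration, not measured on any G5:

    ```python
    # AMAT = hit_time + miss_rate * miss_penalty. A fixed ~100ns DRAM
    # penalty costs more core cycles at higher clocks, so the miss rate
    # (roughly, the cache size) must improve just to hold AMAT steady.
    DRAM_NS = 100.0   # assumed memory latency
    HIT_CYCLES = 3    # assumed cache hit time, in core cycles

    def amat_cycles(cpu_ghz, miss_rate):
        miss_penalty = DRAM_NS * cpu_ghz  # convert ns to core cycles
        return HIT_CYCLES + miss_rate * miss_penalty

    for ghz in (2.0, 2.5, 3.0):
        for mr in (0.05, 0.02):  # smaller vs. larger cache (assumed rates)
            print(f"{ghz:.1f}GHz, {mr:.0%} misses: "
                  f"AMAT ~{amat_cycles(ghz, mr):.0f} cycles")
    ```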



    As far as people loving an increase in bandwidth, that is only of importance if the rest of the system can make use of it. To that end, the current G5s are a rather poor example of computer technology actually making use of the bandwidth available. Even if the next-generation G5 went with PCI Express video and a RAID controller for disk I/O, there is still the issue of actually using that bandwidth. Memory is the modern choke point, and everything, data-wise, eventually goes there.



    It should be noted that one very promising new technology from Apple is the use of the GPU. This has the potential to reduce demand on main memory, thus freeing up bandwidth to this overloaded subsystem. Of course, this then loads another subsystem. The point is that there is no free lunch; any increase in bandwidth has to be followed by other improvements to be worthwhile. Bandwidth in and of itself is not a feature, even though Apple may be marketing it that way. All the bandwidth in the world will not make up for slow components elsewhere.



    Dave





    Quote:

    Originally posted by Altivec_2.0

    The next revision of PowerMacs may not have a faster processor, but what if Apple kept working on improving all the smaller stuff? The FSB is at 1.25GHz; if they boosted it to 1.75GHz, performance would increase no matter how fast the processor is, correct? If Apple could get their FSB to 2GHz, that would be quite impressive. They could also implement faster RAM (550MHz RAM is already out as PC4400), unidirectional traffic between the I/O and system controller, and a unidirectional PCIe x16 slot.



    If Apple increased all this other stuff besides the processor, they would be setting up a rock-solid, powerful base for when we get that extra 1GHz boost from a dual 3GHz system. The new 3GHz chips would absolutely love all that bandwidth. 16GB/s of aggregate FSB bandwidth is incredible, and all creative pros would love it.





    The 2.5GHz G5 chip is powerful enough for now, IMO; we could get dual-3GHz performance if the RAM was upgraded to 550MHz and the FSB to 2GHz.




  • Reply 10 of 19
    Quote:

    Originally posted by Programmer

    Do you have any concrete proof of that? It is accepted that the chipset is quite hot, but that's a long way from saying that it generates more heat than the processor. The heat pipe on the G5 chipset is nothing like the monster heat sinks on the processors, so I'm extremely skeptical of your statement unless you can produce some technical document proving it.



    That's just what our CPU architecture teacher told us in class back in March. Since the guy has worked for IBM and is now working on POWER-based motherboard design, I assumed that his statement was true.

    Anyway, apart from the notes I took in his class, I have absolutely no concrete proof of that, so feel free to believe it or not.



    However, what I was saying is that the chipset is already running pretty hot, and increasing the speed of the FSB wouldn't make it better... would it?
  • Reply 11 of 19
    Quote:

    Those are the companies that should be dedicating a lot of R&D towards new technology to REPLACE mechanical hard drives, not just to improve the existing technology. They don't seem to understand that they're in the MASS STORAGE industry, not the HARD DRIVE industry.



    I might be wrong on this one, but I am pretty sure those companies do make R&D efforts to find solutions to replace mechanical hard drives; in terms of the storage/access-time ratio, though, I'm not sure they have found a better solution than mechanics when it comes to mass storage.



    Anyway, when the apps are well programmed, the hard drive ends up being used as seldom as possible, so that reduces the impact of the bottleneck...
  • Reply 12 of 19
    programmer Posts: 3,457 member
    Quote:

    Originally posted by The One to Rescue

    That's just what our CPU architecture teacher told us in class back in March. Since the guy has worked for IBM and is now working on POWER-based motherboard design, I assumed that his statement was true.

    Anyway, apart from the notes I took in his class, I have absolutely no concrete proof of that, so feel free to believe it or not.



    However, what I was saying is that the chipset is already running pretty hot, and increasing the speed of the FSB wouldn't make it better... would it?




    Nah, I'm afraid I don't buy it. Perhaps you misinterpreted his statement or something like that, but I'm not going to believe it without some real evidence. You're right though -- bumping its clock rate further would increase its heat problems for minimal performance gain.







    The mass storage question is an interesting one. Lots of companies are doing R&D on alternatives, but most of the money is going into pushing the envelope on the known quantity. The capacity/speed/cost parameters of a hard disk are very compelling and alternative technologies just aren't there yet. Perhaps one day soon something will come along that can compete, but clearly that hasn't happened yet.
  • Reply 13 of 19
    mmmpie Posts: 628 member
    The 'northbridge', or main logic chip, runs at some speed, x.

    The part of the northbridge that interfaces with the processor's FSB has to run, at a minimum, at the speed of the FSB. It may have to run faster, depending on latency and functionality requirements (e.g. in the olden days the northbridge managed off-chip cache, and in the G5 it is responsible for integrating memory requests from two CPUs).



    Heat is a function of speed, so if you increase the FSB you have to increase the speed of the northbridge, and hence its power requirements.
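
    The usual first-order model here is that dynamic power is roughly proportional to capacitance times voltage squared times frequency, so power grows linearly with clock at a fixed voltage, and quadratically with any voltage bump needed to get there. The numbers below are purely illustrative; real chipset figures aren't public:

    ```python
    # First-order dynamic power: P ~ activity * capacitance * V^2 * f.
    def scaled_power_w(base_w, base_ghz, new_ghz, base_v=1.2, new_v=1.2):
        return base_w * (new_ghz / base_ghz) * (new_v / base_v) ** 2

    BASE_W = 15.0  # assumed dissipation of the bus interface at 1.25GHz
    for ghz in (1.25, 1.75, 2.0):
        print(f"{ghz:.2f}GHz FSB: ~{scaled_power_w(BASE_W, 1.25, ghz):.0f}W "
              f"at the same voltage")
    ```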



    I'm sure the northbridge could go faster, but it is certainly getting into the realm of CPU speeds in the case of the G5, so making it go faster will put a greater load on the cooling system.



    Just increasing the FSB isn't the answer to computing speed. Increasing a system's speed is a matter of balancing competing demands for resources (typically versus cost). Increasing the FSB speed probably won't buy you much when memory speeds haven't increased. You also can't just increase cache size (increasing cache size decreases cache speed; compare Athlon and P4 cache architectures and speeds, an exercise left to the reader).



    Bringing the memory controller on-chip is probably a better use of space than increasing cache, in the context of consumer-class hardware. It also makes sense to me to dump the current G5 bus in favor of AMD's HyperTransport. Then Apple could leverage off-the-shelf HyperTransport hardware for AGP, PCI-X, PCIe, etc.



    Other components that are 'obvious' candidates for improvement suffer from the same issues. There isn't much point increasing HD speeds when typical machines are limited to a PCI bandwidth of about 100MB/s (real world). So that has to improve (witness PCIe). Of course, you can get faster drives, or use RAID to improve speed. The immediate problems of HD speed can, however, be solved by buying more RAM, so HD manufacturers don't feel heavily pressured to change their model.
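
    For scale, classic 32-bit/33MHz PCI tops out at 133MB/s on paper, and about 100MB/s in practice; the drive figure below is an assumed sustained rate for a fast disk of the day:

    ```python
    # Classic PCI versus the disks that hang off it.
    pci_peak_mbs = 33.33e6 * 4 / 1e6  # 33.33MHz * 4 bytes = ~133MB/s
    pci_real_mbs = 100.0              # commonly quoted real-world ceiling
    drive_mbs = 60.0                  # assumed sustained rate, one fast drive

    print(f"PCI: {pci_peak_mbs:.0f}MB/s peak, ~{pci_real_mbs:.0f}MB/s real world")
    print(f"Two-drive RAID 0: ~{2 * drive_mbs:.0f}MB/s, already past the bus")
    ```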



    There are real improvements waiting at the software level. It would be nice to have actual hardware acceleration for Quartz (not just accelerated compositing in Quartz Extreme); maybe Core Image will help, but it doesn't look like it.
  • Reply 14 of 19
    hmurchison Posts: 12,419 member
    Quote:

    There are real improvements waiting at the software level. It would be nice to have actual hardware acceleration for Quartz (not just accelerated compositing in Quartz Extreme); maybe Core Image will help, but it doesn't look like it.



    Enter Quartz 2D



    Quartz 2D is the powerful 2D graphics engine in Mac OS X, with advanced features such as transparency, anti-aliasing, and PDF support. Exciting new developments in Quartz 2D will be discussed in this session along with a focus on best practices you should follow to get the most out of Quartz 2D. This session is a must see for all WWDC attendees who use 2D graphics in their applications.



    Word on the street is a lot more functions are going to be accelerated via Q2D.



    Check for Thaen's post halfway down



    Quote:

    4) Quartz2D has been accelerated. It DOES REQUIRE some modification by application developers who use custom views, but not much. After modification, general UI speedup is 2x. Processor utilization goes way down.



    Looks like Apple is taking care of some of the weaker areas of OSX.
  • Reply 15 of 19
    wizard69 Posts: 13,377 member
    The only thing I take exception to is your statement below, and even that is subject to perspective.



    There is certainly a potential for slowing down the cache subsystem when expanding the size of the unit. It doesn't have to happen, but it may very well be an issue. At some point, though, losing a cycle on cache accesses will be preferable to going to main memory. When that happens, and what sort of application mix will trigger the need for a larger cache, is a bit up in the air. Certain server workloads could use the cache today.



    Even an integrated memory controller is not the ultimate solution if the ratio between CPU cycle time and memory access time continues to rise. Without the ability to scale memory subsystem performance with the processor, the integrated memory controller only offers an incremental, one-time gain. You are then back to either a larger cache or improvements to the memory system, such as wider paths. It is indeed interesting to see how AMD is addressing this very issue.



    The software mix running on the machine obviously has some impact on the utility of any system enhancements. It will be very interesting to see how the 2.5 GHz machine performs against the old 2 GHz machine when they are out. We should be able to see if 500MHz is enough to stress the current cache in any one workload.





    Quote:

    Originally posted by mmmpie

    You also can't just increase cache size (increasing cache size decreases cache speed; compare Athlon and P4 cache architectures and speeds, an exercise left to the reader).



  • Reply 16 of 19
    hmurchison Posts: 12,419 member
    Quote:

    Even an integrated memory controller is not the ultimate solution if the ratio between CPU cycle time and memory access time continues to rise. Without the ability to scale memory subsystem performance with the processor, the integrated memory controller only offers an incremental, one-time gain. You are then back to either a larger cache or improvements to the memory system, such as wider paths. It is indeed interesting to see how AMD is addressing this very issue.



    I don't foresee that being too much of a problem for the next few years. There will probably be DDR2-800 (PC2-6400) by the end of 2005. Run that in a dual-channel configuration and you have 12.8GB/s of throughput per processor. Hot damn that's fast, and each CPU with its on-die controller would have its own dedicated connection. I'm not as concerned with the ratio between CPU and memory if I'm allowed to scale with additional CPUs or CPU clock speed.
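
    The arithmetic behind that figure (peak, not sustained):

    ```python
    # Peak bandwidth of dual-channel DDR2-800:
    # 800 million transfers/s * 8 bytes per 64-bit channel * 2 channels.
    transfers_per_sec = 800e6
    bytes_per_transfer = 8
    channels = 2
    peak = transfers_per_sec * bytes_per_transfer * channels / 1e9
    print(f"Dual-channel DDR2-800 peak: {peak:.1f}GB/s")  # -> 12.8GB/s
    ```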



    However, AMD has options. Athlons and Opterons have the ability to shut down the on-die controller in favor of an external controller. IBM needs to go this route if they can't stretch the elastic bus any further.
  • Reply 17 of 19
    programmer Posts: 3,457 member
    Quote:

    Originally posted by wizard69

    The software mix running on the machine obviously has some impact on the utility of any system enhancements. It will be very interesting to see how the 2.5 GHz machine performs against the old 2 GHz machine when they are out. We should be able to see if 500MHz is enough to stress the current cache in any one workload.



    Early benchmarks show a linear performance increase (i.e. 25%).
  • Reply 18 of 19
    onlooker Posts: 5,252 member
    Quote:

    Originally posted by hmurchison

    I don't foresee that being too much of a problem for the next few years. There will probably be DDR2-800 (PC2-6400) by the end of 2005. Run that in a dual-channel configuration and you have 12.8GB/s of throughput per processor. Hot damn that's fast, and each CPU with its on-die controller would have its own dedicated connection. I'm not as concerned with the ratio between CPU and memory if I'm allowed to scale with additional CPUs or CPU clock speed.



    However, AMD has options. Athlons and Opterons have the ability to shut down the on-die controller in favor of an external controller. IBM needs to go this route if they can't stretch the elastic bus any further.



    DDR3 is already being used in PNY Nvidia graphics cards.



    Quote:

    Originally hosted at Nvidia.com

    PNY GeForce 6800 Ultra


    PNY is offering the 256MB, DDR3 GeForce 6800 Ultra bundled with a beach cooler. Drench yourself in the powerful and elegant graphics of NVIDIA's most powerful GPU, featuring support for DX9 Shader Model 3.0, a superscalar 16-pipe architecture, and 64-bit texture filtering and blending.




  • Reply 19 of 19
    hmurchison Posts: 12,419 member
    That's actually GDDR3 (Graphics DDR), and it's insanely fast.



    http://www.xbitlabs.com/news/memory/...213210313.html





    GDDR3 on today's cards has bandwidth of 15+ GB/s. The article above talks about a batch of memory Nvidia tested at a 50GB/s peak!
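
    That bandwidth is just bus width times transfer rate. The pairs below are ballpark guesses for 2004-era cards, assuming a 256-bit bus, not exact specs:

    ```python
    # Graphics memory bandwidth = bus width in bytes * effective rate.
    def gpu_bandwidth_gbs(bus_bits, rate_gts):
        return bus_bits / 8 * rate_gts

    print(f"256-bit @ 1.1GT/s: ~{gpu_bandwidth_gbs(256, 1.1):.0f}GB/s")  # high-end card
    print(f"256-bit @ 1.6GT/s: ~{gpu_bandwidth_gbs(256, 1.6):.0f}GB/s")  # the ~50GB/s batch
    ```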



    If money was no object, GDDR3 would make for one hell of a boost for system RAM.