Simultaneous Multithreading

Comments

  • Reply 21 of 33
    programmer Posts: 3,461 member
    [quote]Originally posted by mmicist:

    <strong>

    Rename registers aren't visible to the program...

    </strong><hr></blockquote>



    Yes, I understand all that but the number of rename registers isn't always a direct function of the number of instructions in-flight (at least it doesn't seem to be in the processor specs I've looked at). The number of registers needed by each instruction in-flight varies, especially in the face of instruction cracking or decoding. Some sequences of instructions will require more rename registers than others. If a rename register isn't available then you'll get a stall. So why not provide the worst-case number of rename registers? They are expensive in terms of circuitry so at some point the space is best spent on something other than eliminating rare stalls in those unusual cases where there aren't enough rename registers.



    FYI: the PPC970 has 80 GPR, 80 FPR, and 80 VPR rename registers, and a maximum of about 200 in-flight instructions. The division between integer, floating point and vector instructions isn't 1/3, 1/3, 1/3 however.
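
    The stall-versus-area trade-off described above can be sketched with a toy model. Everything here is an assumption made for illustration (the in-order core that dispatches one instruction per cycle, the fixed `lifetime`, the `NEEDS` pattern, and the function name `count_rename_stalls`); it does not describe the PPC970 or any real pipeline.

```python
from collections import deque

# Hypothetical per-instruction rename-register demand; a 2 stands in for a
# "cracked" instruction that needs more than one destination register.
NEEDS = [1, 1, 2, 0, 1, 2, 1, 0] * 50


def count_rename_stalls(needs, rename_regs, window, lifetime):
    """Count cycles in which dispatch is blocked only by a lack of rename registers."""
    if needs and max(needs) > rename_regs:
        raise ValueError("an instruction needs more registers than exist")
    free = rename_regs
    in_flight = deque()  # (retire_cycle, registers_held), oldest first
    stalls = 0
    cycle = 0
    next_instr = 0
    while next_instr < len(needs):
        # Retire finished instructions and return their rename registers.
        while in_flight and in_flight[0][0] <= cycle:
            free += in_flight.popleft()[1]
        if len(in_flight) < window:
            need = needs[next_instr]
            if need <= free:
                free -= need
                in_flight.append((cycle + lifetime, need))
                next_instr += 1
            else:
                stalls += 1  # the window has room, but no rename register is free
        cycle += 1
    return stalls


for regs in (4, 8, 12, 16):
    stalls = count_rename_stalls(NEEDS, rename_regs=regs, window=8, lifetime=12)
    print(f"{regs:2d} rename registers -> {stalls} stall cycles")
```

    With `window * max(needs)` registers (16 in this run) the model can never stall on renaming; any smaller pool trades circuitry for some number of stall cycles.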
  • Reply 22 of 33
    mmicist Posts: 214 member
    [quote]Originally posted by Programmer:

    <strong>



    Yes, I understand all that but the number of rename registers isn't always a direct function of the number of instructions in-flight (at least it doesn't seem to be in the processor specs I've looked at). The number of registers needed by each instruction in-flight varies, especially in the face of instruction cracking or decoding. Some sequences of instructions will require more rename registers than others. If a rename register isn't available then you'll get a stall. So why not provide the worst-case number of rename registers? They are expensive in terms of circuitry so at some point the space is best spent on something other than eliminating rare stalls in those unusual cases where there aren't enough rename registers.



    FYI: the PPC970 has 80 GPR, 80 FPR, and 80 VPR rename registers, and a maximum of about 200 in-flight instructions. The division between integer, floating point and vector instructions isn't 1/3, 1/3, 1/3 however.</strong><hr></blockquote>



    Yes, there is a trade-off between expected instruction throughput and chip complexity here; I was trying to simplify a little.



    The point I was trying to make, though, was that adding rename registers shouldn't change the algorithms used, which see only the architectural registers. Adding a second thread doesn't involve doubling the number of rename registers, though it might imply a slight increase, and adding further threads will never take the number of rename registers beyond the maximum possible number of instructions in flight.
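
    As a back-of-the-envelope check of that bound, here is a small sketch. It assumes each in-flight instruction writes at most `dests_per_instr` registers and that all threads share one in-flight window. The 32 architectural registers per thread (as in the PowerPC GPR file) and the 200-instruction window (borrowed from the PPC970 figure quoted above) are example inputs, and `register_budget` is a made-up helper.

```python
def register_budget(threads, arch_regs_per_thread, max_in_flight, dests_per_instr=1):
    """Return (architectural registers, upper bound on useful rename registers)."""
    # Each hardware thread needs its own copy of the architectural state.
    architectural = threads * arch_regs_per_thread
    # Rename registers are only consumed by in-flight results, so the shared
    # window caps how many can ever be in use, however many threads there are.
    rename_upper_bound = max_in_flight * dests_per_instr
    return architectural, rename_upper_bound


for threads in (1, 2, 4):
    arch, rename_cap = register_budget(threads, 32, 200)
    print(f"{threads} thread(s): {arch} architectural, at most {rename_cap} rename")
```

    Only the architectural count grows with the thread count; the rename bound stays tied to the window size.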



    michael
  • Reply 23 of 33
    airsluf Posts: 1,861 member
  • Reply 24 of 33
    programmer Posts: 3,461 member
    [quote]Originally posted by mmicist:

    <strong>



    Yes, there is a trade-off between expected instruction throughput and chip complexity here; I was trying to simplify a little.



    The point I was trying to make, though, was that adding rename registers shouldn't change the algorithms used, which see only the architectural registers. Adding a second thread doesn't involve doubling the number of rename registers, though it might imply a slight increase, and adding further threads will never take the number of rename registers beyond the maximum possible number of instructions in flight.

    </strong><hr></blockquote>



    Right -- I wasn't saying anything about the architectural registers though, just discussing the design tradeoffs when building the internal structure of the processor. While the number of rename registers will probably never be taken beyond the maximum number that could possibly be used, with multi-threading it makes a lot more sense to push the rename register count closer to that theoretical maximum because of the better utilization. That was the whole gist of my discussion.
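
    The utilization argument can be illustrated with a deliberately crude sketch. It assumes one dispatch slot per cycle and made-up repeating patterns of cycles in which each thread has an instruction ready; `dispatch_utilization`, `THREAD_A`, and `THREAD_B` are hypothetical names, and no real scheduler is this simple.

```python
def dispatch_utilization(ready_patterns, cycles):
    """Fraction of cycles in which at least one thread can dispatch."""
    busy = 0
    for cycle in range(cycles):
        if any(pattern[cycle % len(pattern)] for pattern in ready_patterns):
            busy += 1
    return busy / cycles


# True means the thread has an instruction ready that cycle (assumed patterns).
THREAD_A = [True, True, False, False]
THREAD_B = [False, True, True, False]

print("one thread :", dispatch_utilization([THREAD_A], 400))
print("two threads:", dispatch_utilization([THREAD_A, THREAD_B], 400))
```

    A busier dispatch slot keeps more instructions in flight on average, which is why a larger rename pool is more likely to pay for itself on a multithreaded core.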
  • Reply 25 of 33
    klinux Posts: 453 member
    [quote]Originally posted by Powerdoc:

    <strong>

    Thanks i understand HT is HypersuckingMT

    </strong><hr></blockquote>



    Uh yeah, very convincing argument. If Apple/Moto could do that with currently shipping CPUs we'd all be cheering now. Intel deserves credit for being the only one, not AMD, Motorola, or IBM, shipping a consumer CPU that supports this functionality.



    In any case, while I do not think Intel's implementation of HT is particularly great, it does wring more performance out of a single processor on certain tasks than running without hyperthreading. See THG's take on this at <a href="http://www6.tomshardware.com/cpu/20021114/p4_306ht-24.html" target="_blank">http://www6.tomshardware.com/cpu/20021114/p4_306ht-24.html</a>, and also the video showing the difference. Very interesting.
  • Reply 26 of 33
    hmurchison Posts: 12,431 member
    [quote] Uh yeah, very convincing argument. If Apple/Moto can do that with currently shipping CPUs we'd all be cheering now. Intel deserves credit for being the only one, not AMD, Motorola, or IBM, shipping a consumer CPU that supports this functionality. <hr></blockquote>



    Yes, but it's damn near Intel's ethos to over-engineer their processors in one area to overcome some glaring deficiency in another. Hyperthreading is par for the course for Intel. I await seeing SMT done RIGHT the first time.
  • Reply 27 of 33
    programmer Posts: 3,461 member
    [quote]Originally posted by hmurchison:

    <strong>Yes but it's damn near Intel's Ethos to over engineer their processors in one area to overcome some glaring deficiency in another. Hyperthreading is par for the course for Intel. I await seeing SMT done RIGHT the first time.</strong><hr></blockquote>



    Actually it's much more "Intelian" to toss in a half-baked feature to see if it sinks or swims at the customer's expense, and then revise it a couple of times until it's "good enough" -- naturally forcing software to be rewritten and confusing the heck out of the hardware installed base. MMX/SSE/SSE2 being a case-in-point which survived the process. HyperThreading looks successful enough that it'll get rev'd a couple of times and they'll eventually have a decent implementation of it. The PowerPC guys seem to think about the problems a bit more and plan ahead -- then deliver a good-to-excellent solution first time out of the gate... of course since they're spending far less money on processor development that generally means we get things last (VMX, 64-bit, SMT, SMP being cases-in-point).
  • Reply 28 of 33
    klinux Posts: 453 member
    True, I agree with you, Programmer.



    However, will the PowerPC guys overcome the marketing edge that Intel has when IBM comes out with the better-engineered solution and Intel is already touting that 1) they had it first and have had it for the past 12 months and 2) now it features 'SuperUltraCentrinoKinetic (S.U.C.K) Hyperthread threading II'?
  • Reply 29 of 33
    nevyn Posts: 360 member
    I'd just like to point out what a good thing for _Apple_ it is that Intel has SMT going into consumer level stuff.



    To _really_ eke out the benefits, the programs have to be written in a 'multi-threaded' fashion, so there's something better to do in the 'other thread' in the CPU.



    This is one of those areas where there's been a lot of room for improvement. A single-threaded app on a dual-CPU Mac doesn't get anywhere near the improvement of an app that takes full advantage of the second CPU.



    So the Mac-v-PC benchmarks can show _wild_ variances depending on if the app is multi-threaded. We should be delighted if major PC apps/games/whatnot are redesigned to work well in a multi-threaded environment.



    The other nice thing is that the _tough_ step is getting things to go from 1-CPU to 2-CPU. It's a lot easier to take an app designed for a dual & make it run well on a dual-cored, hyperthreaded, dual CPU box at full speed, than it is to go from single-threaded to multi-threaded.



    IOW, I think there should be a net increase in apps that run _well_ on Macs.
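
    A minimal sketch of that design step, in Python: once the work is split into independent chunks, the worker count becomes a parameter rather than a redesign. The names `chunked` and `parallel_sum_of_squares` are invented for this example, and in standard CPython builds the global interpreter lock limits the speedup for pure-Python arithmetic, so read it for the structure rather than the timing.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def chunked(items, parts):
    """Split items into `parts` contiguous slices of nearly equal length."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The per-thread work: it touches no shared state, so no locking is needed."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=None):
    """Fan the chunks out to a thread pool, then combine the partial results."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunked(values, workers)))
    return sum(partials)


values = list(range(1000))
print(parallel_sum_of_squares(values, workers=2) == sum_of_squares(values))
```

    The hard part is making `sum_of_squares` independent of the other chunks; after that, going from 2 workers to 4 or 8 is just a different `workers` argument.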
  • Reply 30 of 33
    programmer Posts: 3,461 member
    [quote]Originally posted by Nevyn:

    <strong>I'd just like to point out what a good thing for _Apple_ it is that Intel has SMT going into consumer level stuff.



    To _really_ eke out the benefits, the programs have to be written in a 'multi-threaded' fashion, so there's something better to do in the 'other thread' in the CPU.



    This is one of those areas where there's been a lot of room for improvement. A single-threaded app on a dual-CPU Mac doesn't get anywhere near the improvement of an app that takes full advantage of the second CPU.



    So the Mac-v-PC benchmarks can show _wild_ variances depending on if the app is multi-threaded. We should be delighted if major PC apps/games/whatnot are redesigned to work well in a multi-threaded environment.



    The other nice thing is that the _tough_ step is getting things to go from 1-CPU to 2-CPU. It's a lot easier to take an app designed for a dual & make it run well on a dual-cored, hyperthreaded, dual CPU box at full speed, than it is to go from single-threaded to multi-threaded.



    IOW, I think there should be a net increase in apps that run _well_ on Macs.</strong><hr></blockquote>





    This is a good point -- multi-threaded software has been a chore to get people to write because most of the machines out there run slower if you have multiple threads... so why write the code that way? The way of the future, however, is multi-threaded, and this has been apparent for a long time now to anybody paying attention. The PC world is finally getting common hardware which benefits from multiple threads, and that will finally start encouraging developers to do the work. Since Apple has been shipping multiprocessor machines for quite a while now, this will benefit them more -- especially when IBM brings multiple cores and SMT hardware to the PowerPC lineup.
  • Reply 31 of 33
    os10geek Posts: 413 member
    Since the Intel Pentium 4 HT, or whatever it is called, was recently released, I would expect that more multi-threaded apps will be developed on the PC side, stimulating development of good Mac software out of formerly PC only developers. For example, will a Mac version of Discreet 3DS come?
  • Reply 32 of 33
    cowerd Posts: 579 member
    [quote]For example, will a mac version of Discreet 3DS come?<hr></blockquote>There is a better chance of Bush and Saddam buying a timeshare in Key West than any 3D software from Autodesk showing up on the Mac platform.
  • Reply 33 of 33
    os10geek Posts: 413 member
    [Laughing]