benracicot

About

Username: benracicot
Joined:
Visits: 1
Last Active:
Roles: member
Points: 1
Badges: 0
Posts: 2
  • M2 and beyond: What to expect from the M2 Pro, M2 Max, and M2 Ultra

    I'm sorry, but this article is a serious failure due to ignorance of some of the basic underlying technologies.

    For example, the guesses about memory are completely off base. There is literally no chance at all that they're even close, based on the article's assumptions.

    The M1 has a bandwidth of ~68GB/s because it has a 128-bit memory bus and uses LPDDR4 memory at 4.266GT/s. The M1 Pro has a higher bandwidth of ~200GB/s because it uses LPDDR5 memory at 6.4GT/s, and ALSO because it uses a double-wide bus (256 bits).

    The M2 has the same memory bus width (128 bits) as the M1, but it's already using LPDDR5 at 6.4GT/s. If there's an M2 Pro based on the same doubling the M1 Pro used, it won't get any further benefit from LPDDR5 (since the M2 already has it). It will have the same ~200GB/s bandwidth as the M1 Pro.
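
    To put rough numbers on that (treating these as theoretical peak figures, and assuming a hypothetical "M2 Pro" that just doubles the M2's 128-bit bus the way the M1 Pro doubled the M1's):

        # Peak bandwidth in GB/s = bus width in bytes x transfer rate in GT/s
        def peak_bw(bus_bits, gt_per_s):
            return bus_bits / 8 * gt_per_s

        print(peak_bw(128, 4.266))  # M1:     ~68 GB/s  (128-bit LPDDR4 @ 4.266GT/s)
        print(peak_bw(256, 6.4))    # M1 Pro: ~205 GB/s (256-bit LPDDR5 @ 6.4GT/s)
        print(peak_bw(128, 6.4))    # M2:     ~102 GB/s (128-bit LPDDR5 @ 6.4GT/s)
        print(peak_bw(256, 6.4))    # hypothetical "M2 Pro": still ~205 GB/s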

    Of course this all depends on timing - if the M2 Pro came out a year from now, higher-performance LPDDR5 might be common/cheap enough for the M2 Pro to use it, in which case you'd see additional benefits from that. But it DEFINITELY wouldn't get you to 300GB/s. LPDDR5 will never be that fast (that would require 9.6GT/s, which is not happening in the DDR5 timeframe - unless DDR6 is horribly delayed, years from now).
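
    For reference, that 9.6GT/s figure is just the 300GB/s target divided across a 256-bit (32-byte) bus, using the same peak-bandwidth math as above:

        print(300 / (256 / 8))  # = 9.375 GT/s needed for 300 GB/s on a 256-bit bus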

    You're also assuming Apple won't go with HBM, which is not at all a safe assumption. If they do, they might well do better than 300GB/s for the "M2 Pro", if such a thing were built.
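
    As a rough illustration of why HBM changes the picture (these are typical published HBM2e figures, not anything Apple has announced): each stack has a 1024-bit interface, so even a single stack at a modest per-pin rate clears 300GB/s on its own.

        print(1024 / 8 * 3.2)  # one HBM2e stack at ~3.2GT/s: ~410 GB/s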

    Your entire article could have been written something like this:
    M1 Ultra = 2x M1 Max = 4x M1 Pro ~= 6x-8x M1, so expect the same with the M2 series.
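
    Spelled out with the M2's published 10-core GPU and ~100GB/s of bandwidth as the baseline, that naive extrapolation is just:

        # purely the "boring" linear-scaling guess, not a prediction;
        # assumes the clean 2x/4x/8x doubling the M1 family's GPU and bandwidth followed
        m2 = {"gpu_cores": 10, "bandwidth_gb_s": 100}
        for name, factor in [("M2 Pro", 2), ("M2 Max", 4), ("M2 Ultra", 8)]:
            print(name, {k: v * factor for k, v in m2.items()})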

    It's a really bad bet though.

    There are much more interesting things to speculate about! What are they doing for an interconnect between CPU cores, GPU cores, Neural Engine, etc.? Improvements there are *critical* to better performance - the Pro, Max, and Ultra are great at some things but extremely disappointing at others, and that's mostly down to the interconnect, though software may also play some part in it (especially with the GPU).

    Similarly, the chip-to-chip interconnect for the Ultra is a *huge* advance in the state of the art, unmatched by any other vendor right now... and yet it's not delivering the expected performance in some (many) cases. What have they learned from this, what can they do better, and when will they do it?

    (Edit to add) Most of all, will desktop versions of the M2 run at significantly higher clocks? I speculated about this here when the A15 came out - that core looked a lot like something built to run at higher clocks than earlier Ax cores. I'd like to think that I was right, and that that's been their game all along. But... Apple's performance chart (from the keynote) for the M2, if accurate, suggests that I was wrong and that they don't scale clocks any better than the M1 did. That might still be down to the interconnect, though it seems unlikely. It's also possible that they're holding back on purpose, underestimating performance at the highest clocks, though that too seems unlikely (why would they?).

    For this reason, I suspect that the M2 is a short-lived interim architecture, as someone else already guessed. Though in terms of branding, they may retain the "M2" name even if they improve the cores further for the "M2 Pro" or whatever. That would go against all past behavior, but they don't seem terribly bound by tradition.

    This comment contains many of my thoughts as well!
    If you were to just “extrapolate”, that's pretty boring, actually. Yeah, a linear-scaled guess is barely an article; you could do the same for the M4, M5, M10, etc. It's kind of useless, and the article didn't even go that far.

    At the same time, the article ignores the most exciting questions, like the ones mentioned above. What can Apple and TSMC do to improve bandwidth on Armv9? Fabric stacking? HBM over LPDDR5?

    Also, at 3nm, the next Pro and Max chips are not going to be the linear improvement the article suggests. Why was it written like that?