AMD's Radeon Pro Vega II and Duo offer Mac Pro up to 28Tflops of GPU performance

AMD has detailed the Radeon Pro Vega II, the graphics processing unit included in the new Mac Pro in standard single and "Duo" configurations, with the 14.2-teraflop 7-nanometer chip providing considerable performance for the creative workhorse.

AMD's Radeon Pro Vega II Duo with the Infinity Fabric Link interconnect highlighted


As revealed on Monday, the Radeon Pro Vega II is produced using a 7-nanometer manufacturing process, the same as AMD's other 2019 GPU launches such as the Radeon VII. To further its performance, the Pro Vega II is paired with 32 gigabytes of second-generation high-bandwidth memory (HBM2), which is faster and more efficient than the more typical GDDR5 and delivers 1 terabyte per second of memory bandwidth.

AMD uses a GPU interconnect technology called Infinity Fabric Link to connect the GPU with other components at up to 84 gigabytes per second per direction, five times faster than PCIe Gen 3. This all helps enable the GPU's 14.2 teraflops of single-precision floating-point (FP32) performance, rising to 28 teraflops for half-precision floating-point (FP16) performance.
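
As a rough sanity check on the "five times" figure, the sketch below compares the quoted 84 gigabytes per second with the theoretical per-direction bandwidth of a PCIe Gen 3 slot. The x16 link width is an assumption, since the article does not say which PCIe configuration serves as the baseline; the 8 GT/s lane rate and 128b/130b encoding are the standard PCIe 3.0 values.

```python
# Rough comparison of Infinity Fabric Link with PCIe Gen 3.
# Assumption: the baseline is a x16 slot (not stated in the article).

INFINITY_FABRIC_GB_PER_S = 84.0   # per direction, as quoted in the article

PCIE3_TRANSFERS_PER_SEC = 8e9     # 8 GT/s per lane (PCIe 3.0)
PCIE3_ENCODING = 128 / 130        # 128b/130b line encoding overhead
LANES = 16                        # assumed x16 link


def pcie3_bandwidth_gb_per_s(lanes: int = LANES) -> float:
    """Theoretical one-direction PCIe 3.0 bandwidth in GB/s."""
    bits_per_sec = PCIE3_TRANSFERS_PER_SEC * PCIE3_ENCODING * lanes
    return bits_per_sec / 8 / 1e9


pcie_x16 = pcie3_bandwidth_gb_per_s()
ratio = INFINITY_FABRIC_GB_PER_S / pcie_x16

print(f"PCIe 3.0 x16: {pcie_x16:.2f} GB/s per direction")
print(f"Infinity Fabric Link is about {ratio:.1f}x faster")
```

Under these assumptions the ratio comes out a little above five, which is consistent with the rounded claim.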

According to AMD's specifications, the GPU has 64 compute units, 4,096 stream processors, a 4,096-bit memory interface, and a peak engine clock of 1.7GHz. It is also optimized to support Metal, Apple's graphics API, to enable "seamless content creation."
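
The headline numbers can be approximately reproduced from these specifications. The sketch below assumes each stream processor completes one fused multiply-add per clock, counted as two floating-point operations, which is the usual convention for peak GPU figures. It also assumes an HBM2 data rate of 2 gigabits per second per pin. Neither value is stated in the article, and the FP16 rate is simply taken as double the FP32 rate, as the article describes.

```python
# Back-of-the-envelope check of the quoted peak figures.
# Values from the article:
STREAM_PROCESSORS = 4096
PEAK_CLOCK_GHZ = 1.7
MEMORY_BUS_BITS = 4096

# Assumptions (not stated in the article):
FLOPS_PER_CLOCK = 2          # one fused multiply-add = 2 FLOPs
HBM2_GBIT_PER_PIN = 2.0      # per-pin data rate in Gbit/s


def peak_tflops(stream_processors: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in teraflops."""
    gflops = stream_processors * clock_ghz * FLOPS_PER_CLOCK
    return gflops / 1000


def memory_bandwidth_gb_per_s(bus_bits: int) -> float:
    """Peak memory bandwidth in GB/s for the given bus width."""
    return bus_bits * HBM2_GBIT_PER_PIN / 8


fp32 = peak_tflops(STREAM_PROCESSORS, PEAK_CLOCK_GHZ)
fp16 = 2 * fp32  # half precision runs at twice the FP32 rate
bandwidth = memory_bandwidth_gb_per_s(MEMORY_BUS_BITS)

print(f"FP32: {fp32:.1f} TFLOPS, FP16: {fp16:.1f} TFLOPS")
print(f"Memory bandwidth: {bandwidth:.0f} GB/s")
```

With these inputs the FP32 estimate lands at roughly 13.9 teraflops, slightly under the quoted 14.2, which suggests the 1.7GHz clock is a rounded figure. The bandwidth works out to 1,024 GB/s, in line with the 1 terabyte per second claim.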

As the name suggests, the Radeon Pro Vega II Duo takes full advantage of the Infinity Fabric Link to put two GPUs right next to each other. In effect, this doubles the compute units and stream processors to 128 and 8,192 respectively, allows up to 64GB of HBM2 memory to be used, and raises FP32 performance to as much as 28.3 teraflops.

In the Mac Pro, the graphics options start with the Radeon Pro 580X, as offered in other Macs, but can be upgraded to the single-GPU Radeon Pro Vega II or the Radeon Pro Vega II Duo as MPX Modules, Apple's proprietary form factor that enables quiet cooling by taking advantage of the Mac Pro's construction. Because a Mac Pro can hold two MPX Modules, fitting two Radeon Pro Vega II Duo modules puts four of the high-performance GPUs in one Mac.
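
To make the scaling explicit, here is a small sketch that totals the article's per-GPU figures across the Vega II configurations. It assumes peak figures scale linearly with the number of GPUs, which is how such numbers are quoted but not necessarily how real workloads behave. The configuration labels and the `totals` helper are illustrative names, not anything from AMD or Apple.

```python
# Totals for each Vega II option, built from the article's per-GPU figures.
# Assumption: peak figures scale linearly with the number of GPUs.

PER_GPU = {
    "compute_units": 64,
    "stream_processors": 4096,
    "hbm2_gb": 32,
    "fp32_tflops": 28.3 / 2,  # half of the quoted Duo figure
}

# Illustrative configuration labels mapped to GPU counts.
CONFIGS = {
    "Radeon Pro Vega II": 1,
    "Radeon Pro Vega II Duo": 2,
    "2x Radeon Pro Vega II Duo": 4,
}


def totals(gpu_count: int) -> dict:
    """Scale every per-GPU figure by the number of GPUs."""
    return {name: value * gpu_count for name, value in PER_GPU.items()}


for label, count in CONFIGS.items():
    t = totals(count)
    print(
        f"{label}: {t['compute_units']} CUs, "
        f"{t['stream_processors']} stream processors, "
        f"{t['hbm2_gb']}GB HBM2, {t['fp32_tflops']:.1f} TFLOPS FP32"
    )
```

With two Duo modules this gives 256 compute units, 128GB of HBM2, and about 56.6 teraflops of peak FP32 performance, on paper.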

Pricing for the Radeon Pro Vega II and its Duo version has yet to be announced, but they are almost certainly going to be offered alongside the Mac Pro at its release this fall.

Comments

  • Reply 1 of 12
    Instead of a five (or six?) digit number, geekbench's score for this top-end machine will just be a series of emoji: 😳🙌💥🔥

    🏆

  • Reply 2 of 12
    MacPro Posts: 19,727 member
    Talking of AMD.  It's interesting Apple chose 'Catalyst' for Marzipan's release name given AMD's long-standing use of that for their dual GPU system.
  • Reply 3 of 12
    tipoo Posts: 1,142 member
    I believe this is the same die as the VII, but without one CU disabled, and far more memory/the extra PCI-E/infinity fabric. For anyone wondering if "Vega II" was any different than Vega on 7nm, or a new successor architecture. 
  • Reply 4 of 12
    mdriftmeyer Posts: 7,503 member
    tipoo said:
    I believe this is the same die as the VII, but without one CU disabled, and far more memory/the extra PCI-E/infinity fabric. For anyone wondering if "Vega II" was any different than Vega on 7nm, or a new successor architecture. 
    Vega was 14nm and Vega II is an improved GCN version over Vega on 14nm.
  • Reply 5 of 12
    A Titan RTX has 16.2 TFLOPS 32 bit float performance or on their Tensor core, 130 TFLOPS 16 bit float performance, just to put things in perspective.
  • Reply 6 of 12
    tipoo Posts: 1,142 member
    mdriftmeyer said:
    tipoo said:
    I believe this is the same die as the VII, but without one CU disabled, and far more memory/the extra PCI-E/infinity fabric. For anyone wondering if "Vega II" was any different than Vega on 7nm, or a new successor architecture. 
    Vega was 14nm and Vega II is an improved GCN version over Vega on 14nm.


    VII was already a 7nm shrink of Vega. II here uses the same 7nm process die, but fully enabled without the disabled CU (for yields in the VII) here. 

    https://www.amd.com/en/products/graphics/amd-radeon-vii
    https://www.amd.com/en/graphics/workstations-radeon-pro-vega-ii

    AMD has said there's no core architectural changes from Vega - VII, just a pretty straight die shrink. Certainly not a new numbered GCN version. 
    edited June 2019
  • Reply 7 of 12
    A Titan RTX has 16.2 TFLOPS 32 bit float performance or on their Tensor core, 130 TFLOPS 16 bit float performance, just to put things in perspective.
    I wonder why Titan RTX has 4x the performance at half-precision, but AMD only doubles.
  • Reply 8 of 12
    frantisek Posts: 756 member
    Just for curiosity. 56 TFLOPS was somehow performance of top supercomputer in 2003.
  • Reply 9 of 12
    1st Posts: 443 member
    more zero than my finger can count after number? wow boy, Am I in trouble for future math? Fabric link? cross interconnect to route? did it go z-as well? too many questions and too little info... interesting. Nice stuff.
  • Reply 10 of 12
    tipoo Posts: 1,142 member
    A Titan RTX has 16.2 TFLOPS 32 bit float performance or on their Tensor core, 130 TFLOPS 16 bit float performance, just to put things in perspective.
    I wonder why Titan RTX has 4x the performance at half-precision, but AMD only doubles.

    Those are on the Tensor cores, where the AMD figure is from double packed math on their regular GPU cores. Tensor cores are further tuned for low precision performance over regular GPU cores. Also, I think at least until Volta, Nvidia hadn't used Rapid Packed Math for double the FP16 performance on the main GPU cores at all, instead pushing Tensor for low precision performance (also goes down to INT8)
  • Reply 11 of 12
    tht Posts: 5,447 member
    frantisek said:
    Just for curiosity. 56 TFLOPS was somehow performance of top supercomputer in 2003.
    All FLOPS aren’t the same. That supercomputer in 2003 likely outperforms these GPUs in double precision math and anything that requires anything more than a 32 bit multiply+add. 

    Like, say, if a square root function is used... yowsers. And if an “if” statement was in there...
  • Reply 12 of 12
    22july2013 Posts: 3,572 member
    MacPro said:
    Talking of AMD.  It's interesting Apple chose 'Catalyst' for Marzipan's release name given AMD's long-standing use of that for their dual GPU system.
    Curious observation. It's possible that Apple bought or licensed the trademark, if it was trademarked.