Intel details Larrabee multi-core graphics processor

As a prelude to the SIGGRAPH expo, Intel has revealed some key details of its first dedicated 3D chipset and its potential for expanding what developers can do with 3D graphics.



Still going under its Larrabee codename, Intel's design is the first graphics architecture built from many x86 cores derived from the original Pentium, rather than the proprietary designs of graphics specialists such as AMD's ATI Radeon HD or NVIDIA's GeForce GTX series.



Using a more universal architecture would give developers of games and application programming interfaces (APIs) much more freedom, Intel says. Where current-day video chipsets force these software creators to work with largely pre-existing tools, the familiarity of an x86 design would purportedly give developers a "blank canvas" to add new effects or otherwise extend what Larrabee could do without changing the hardware itself.



The implementation would be different enough from Intel's normal central processing units to be optimized for graphics, however. Each core would contain logic to handle graphics as well as many simultaneous code threads, and would have a dedicated vector processing unit to greatly improve the chip's potential power in its intended role.
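


As a rough illustration (not from Intel's disclosure), loops like the one below are what such a vector unit accelerates; the function and names are hypothetical, and a vectorizing compiler would map groups of iterations onto single wide SIMD instructions:

    /* Hypothetical example: one multiply per pixel element. On a wide
       vector unit, a single 16-wide SIMD instruction could retire 16
       of these element operations at once. */
    void scale_pixels(float *dst, const float *src, float gain, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] = src[i] * gain;
    }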



Larrabee would also reduce some of the traffic problems that can bog down existing multi-core chips. The design will use an ultra-wide ring bus between the numerous cores to cut the lag of core-to-core communication, even when many tasks are being processed at the same time.



Each core would also have extensions to handle 64-bit data in addition to multithreading, further expanding the number and complexity of tasks Larrabee can handle at once.



The new architecture isn't due to appear until 2009 or 2010, but will first be launched directly into the mainstream instead of the high-end workstation and server markets Intel often uses as testbeds for its central processors. It also wouldn't require a substantial break for developers: it would support the DirectX and OpenGL graphics libraries used to make existing software, including games and other 3D apps.

Comments

  • Reply 1 of 16
    This sounds pretty interesting. I'm still wondering what Apple will do with their graphics cards and chipsets. (I admit, I'm not that experienced in GPU technology.) According to this article, they could easily go with either company for their chipsets. After all, NVIDIA and Intel still work together really well.



    This development from Intel looks like it will definitely complement both the graphics cards in Apple's computers and the technologies, especially OpenCL, used in Mac OS X. I already know NVIDIA is excellent at graphics and that Apple might really partner with them for their chipsets. However, Intel looks like they are really coming along with their technology. Assuming this Larrabee chip sees the light of day, let alone makes its way into Apple computers, Apple might do well to stay with Intel.
  • Reply 2 of 16
    Does Larrabee even need OpenCL support? They are basically x86 cores so Grand Central should be able to manage them directly without having to go through OpenCL. Should be less overhead and translation.
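


    To illustrate the point, here is a hypothetical sketch, assuming Grand Central exposes a plain-C parallel-for along the lines of libdispatch's dispatch_apply_f, and assuming Larrabee's cores showed up to the OS as ordinary x86 CPUs (the worker function and buffer are made up):

        #include <dispatch/dispatch.h>

        /* Hypothetical worker: handles one element of a shared buffer. */
        static void process_slice(void *ctx, size_t i)
        {
            float *data = ctx;
            data[i] *= 2.0f; /* stand-in for real per-element work */
        }

        int main(void)
        {
            static float data[1024];
            /* Fan the iterations out across every available core; if
               Larrabee's cores were visible as plain x86 CPUs, no OpenCL
               translation layer would sit in between. */
            dispatch_apply_f(1024,
                             dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
                             data, process_slice);
            return 0;
        }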
  • Reply 3 of 16
    retroneo Posts: 240 member
    The great thing is:



    OpenCL has the backing of Intel (Larrabee), AMD (Radeon), Nvidia (GeForce), Imagination (the incredible PowerVR in the iPhone) and ARM (Mali, as yet unused by Apple)



    OpenCL gives a single API that will be supported from the iPhone to the Mac Pro, and gives Apple a wide variety of choices for graphics processors.



    It's great to not be locked in.
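


    For a sense of what that single API looks like, here's a minimal OpenCL C kernel sketch (the kernel name and arguments are illustrative). Each work-item handles one element, so the runtime can spread the work across whichever processor it targets, be it Radeon, GeForce, PowerVR or Larrabee:

        /* saxpy.cl: y = a*x + y, one element per work-item. The same
           source is compiled at runtime for whatever device is present. */
        __kernel void saxpy(const float a,
                            __global const float *x,
                            __global float *y)
        {
            size_t i = get_global_id(0);
            y[i] = a * x[i] + y[i];
        }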
  • Reply 4 of 16
    Quote:
    Originally Posted by ltcommander.data View Post


    Does Larrabee even need OpenCL support? They are basically x86 cores so Grand Central should be able to manage them directly without having to go through OpenCL. Should be less overhead and translation.



    Indeed. It's not a GPU, it's a CPU, something many people don't understand.
  • Reply 5 of 16
    1337_5l4xx0r Posts: 1,558 member
    Am I the only one who feels Larrabee is doomed to fail?
  • Reply 6 of 16
    retroneo Posts: 240 member
    Quote:
    Originally Posted by FuturePastNow View Post


    Indeed. It's not a GPU, it's a CPU, something many people don't understand.



    I wouldn't say that's accurate. It doesn't replace a CPU in a computer, and it is being marketed as a GPU initially.



    It won't offer any decent performance running today's x86 code. Using OpenCL/OpenGL, however, makes sense.



    This is much closer:



    Larrabee is to an x86 CPU what the Toshiba SpursEngine GPU is to a PowerPC CPU.



    For more detail, check out:



    http://arstechnica.com/news.ars/post...ntium-pro.html
  • Reply 7 of 16
    retroneo Posts: 240 member
    Quote:
    Originally Posted by 1337_5L4Xx0R View Post


    Am I the only one who feels Larrabee is doomed to fail?



    Fortunately, with hardware-abstracted APIs like OpenGL and OpenCL, it doesn't matter whether it fails or succeeds.



    There are plenty of choices for Apple. Mac OS X currently supports PowerVR, Intel GMA, Radeon and GeForce.



    Mali, Larrabee, GoForce and Imageon are some of the options for the future, in addition to the above.
  • Reply 8 of 16
    retroneo Posts: 240 member
    Quote:
    Originally Posted by ltcommander.data View Post


    Does Larrabee even need OpenCL support? They are basically x86 cores so Grand Central should be able to manage them directly without having to go through OpenCL. Should be less overhead and translation.



    OpenCL is designed exactly for a Larrabee-type processor. You could code directly for it, but it won't be portable to another stream processor like a Radeon or GeForce.



    Running traditional x86 code won't use Larrabee anywhere near its potential. They are slow in-order x86 cores like Atom and the original Pentium.



    Look up the Toshiba SpursEngine for a PowerPC equivalent of Larrabee. It's a GPU: an IBM Cell processor without the main PowerPC core, just the simple vector cores.



    OpenGL will also be supported for graphics apps.



    Both of these APIs mean code written for OpenCL or OpenGL will automatically take advantage of the available hardware, be it Radeon, GeForce, Larrabee or PowerVR. Applications will not need to be rewritten for each hardware type.
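


    A hedged sketch of the host side (standard OpenCL 1.0 calls, error handling omitted) shows why: the application asks for whatever GPU device is available rather than naming a vendor.

        #include <CL/cl.h> /* <OpenCL/opencl.h> on Mac OS X */

        /* Minimal device discovery: the same binary picks up a Radeon,
           GeForce, or any other vendor's OpenCL device without being
           rewritten. Error checks omitted for brevity. */
        int main(void)
        {
            cl_platform_id platform;
            cl_device_id device;
            clGetPlatformIDs(1, &platform, NULL);
            clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
            cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
            /* ...compile kernel source and enqueue work here... */
            clReleaseContext(ctx);
            return 0;
        }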
  • Reply 9 of 16
    1337_5l4xx0r Posts: 1,558 member
    I agree OpenCL makes the 'what parallelized vector engine are you running' question moot. But specific to Larrabee, I've read they lack power management of any sort, suck down 300+ watts, and get trounced thoroughly by Nvidia and Radeon GPUs. I just don't see the market. A GPU in a high-end workstation that renders blazing fast, a la NVIDIA's Gelato renderer? OK, I can see that. But a half-assed GPU from Intel? I love Intel's processors, but in the realm of graphics, they need a solid beating with a clue stick.



    edit: I say that as a guy who uses Gelato for renders. A GPU that does double-duty, I'm all for...
  • Reply 10 of 16
    Quote:
    Originally Posted by retroneo View Post


    I wouldn't say that's accurate. It doesn't replace a CPU in a computer, and it is being marketed as a GPU initially.



    It's being marketed as a GPU to get it into more computers, to increase volume and get prices down. I bet you Intel will offer a socketed version of it for servers.
  • Reply 11 of 16
    Quote:
    Originally Posted by retroneo View Post


    OpenCL is designed exactly for a Larrabee-type processor. You could code directly for it, but it won't be portable to another stream processor like a Radeon or GeForce.



    Running traditional x86 code won't use Larrabee anywhere near its potential. They are slow in-order x86 cores like Atom and the original Pentium.



    The whole point of Larrabee is the ability to take existing x86 code and run it in a highly parallelized, vectorized fashion. The impediment to CUDA, and similarly OpenCL, is that porting is involved, while the hope for Larrabee is that existing x86 code can simply be recompiled for it with no major modification.



    Of course, we're not talking about common office programs. From Intel's SIGGRAPH paper, the target is HPC and numerically intensive computing environments, where presumably the software is already well-threaded. Obviously, greater performance will be extracted from code specifically written for Larrabee, but presumably the compiler will try its best to automatically optimize code.



    And of course the speedup is not just for multithreaded code. If the code is floating-point or vector intensive, the autovectorization ability in Intel's compilers, along with Larrabee's 512-bit vector unit, should be able to crunch through things very efficiently even if not all of Larrabee's cores are utilized.



    Larrabee will probably eventually support OpenCL for newer programs (although the SIGGRAPH paper doesn't seem to mention OpenCL), but Larrabee's key niche is the ability to accelerate existing x86 programs with little effort.
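


    To make the autovectorization point concrete, a loop like the following (names hypothetical) is the easy case: the iterations are independent, so a compiler can map them straight onto a 512-bit vector unit, 16 floats per instruction.

        /* Independent iterations: safe to autovectorize. "restrict"
           promises the arrays don't overlap, which is what licenses
           the transformation into wide vector multiply-adds. */
        void fma_loop(float *restrict out, const float *restrict a,
                      const float *restrict b, float c, int n)
        {
            for (int i = 0; i < n; i++)
                out[i] = a[i] * b[i] + c;
        }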
  • Reply 12 of 16
    programmer Posts: 3,457 member
    Quote:
    Originally Posted by ltcommander.data View Post


    The whole point of Larrabee is the ability to take existing x86 code and run it in a highly parallelized, vectorized fashion ... but Larrabee's key niche is the ability to accelerate existing x86 programs with little effort.



    Nope, sorry, try again. This is not the whole point, and it is not actually possible either. Existing x86 code might run on a Larrabee core (maybe, and only if it doesn't use SSE/MMX), but that is unlikely, and the code would have to be modified to get there. This thing uses Pentium cores because they are simple and small, and the design has been carefully refined and debugged for about 15 years now. The original Pentium is a short-pipeline, in-order core, which is precisely what all the many-core designs are using (e.g. the SPUs in the Cell).



    Quote:

    The impediment to CUDA, and similarly OpenCL, is that porting is involved, while the hope for Larrabee is that existing x86 code can simply be recompiled for it with no major modification.



    This is a false hope. Parallelism cannot be automatically injected into non-parallel code. And most of the approaches to parallelism in existing code aren't tuned for a single chip with dozens of cores; they either target a handful of scalar cores or a network of machines. OpenCL is just one project that is attempting to create a way to write code that is highly parallel and can be retargeted automatically to whatever hardware you happen to have in your machine (it isn't really intended for network parallelism). This is an extremely important technology because it affords hardware designers more flexibility in how they design their products. This is similar to how OpenGL/DirectX allow greater diversity in GPU design than exists in CPU design -- compare the ATI, nVidia, Intel integrated, and Intel Larrabee approaches to implementing a GPU. These vary far more than x86 chip designs from Intel and AMD do, and even more than the differences between x86, MIPS, SPARC, PowerPC, etc.
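


    For a concrete illustration of that point (code is illustrative, not from Intel's paper): in the loop below, each iteration needs the previous iteration's result, so no recompile can legally spread it across cores or vector lanes; the algorithm itself has to be restructured.

        /* A loop-carried dependence: y[i] needs y[i-1], so the
           iterations form a serial chain that cannot simply be
           vectorized or parallelized as written. */
        void iir_filter(float *y, const float *x, float a, int n)
        {
            for (int i = 1; i < n; i++)
                y[i] = a * y[i - 1] + x[i];
        }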
  • Reply 13 of 16
    programmer Posts: 3,457 member
    Quote:
    Originally Posted by 1337_5L4Xx0R View Post


    But specific to Larrabee, I've read they lack power management of any sort, suck down 300+ watts, and get trounced thoroughly by Nvidia and Radeon GPUs.



    While I have no doubt that somebody wrote such things (nVidia or their fanboys, probably), it doesn't seem likely to be based on much reality, given that the hardware doesn't exist yet and Intel kept it super-secret until very recently.
  • Reply 14 of 16
    olternaut Posts: 1,376 member
    Quote:
    Originally Posted by 1337_5L4Xx0R View Post


    Am I the only one who feels Larrabee is doomed to fail?



    .................yes
  • Reply 15 of 16
    retroneo Posts: 240 member
    Quote:
    Originally Posted by ltcommander.data View Post


    The whole point of Larrabee is the ability to take existing x86 code and run it in a highly parallelized, vectorized fashion.



    Unfortunately, that isn't how it works. By your reasoning, we could take existing PowerPC code and run it amazingly fast on the Toshiba SpursEngine GPU or the many-cored IBM Cell in a PlayStation.
  • Reply 16 of 16
    Quote:
    Originally Posted by ltcommander.data View Post


    The whole point of Larrabee is the ability to take existing x86 code and run it in a highly parallelized, vectorized fashion. The impediment to CUDA, and similarly OpenCL, is that porting is involved, while the hope for Larrabee is that existing x86 code can simply be recompiled for it with no major modification. ... but Larrabee's key niche is the ability to accelerate existing x86 programs with little effort.



    I'm going to have to agree with the others here, and say that your post sounds like it was written by a naive Intel marketing type and not someone with extensive experience in concurrent programming techniques. I don't know enough about the low-level architecture yet to really understand the possibilities, but I would be VERY skeptical that there will be more than a handful of cases where existing threaded, optimized x86 code will efficiently scale onto a ~32-core Larrabee processor with nothing but a recompile. As has been said, most parallel processing today is either optimized for a small number of x86 cores (often using an SSE variant) or for fully distributed network computing, with little in between.



    Having said that, I'm not an expert and have only dabbled with nVidia's CUDA SDK. I'm sure Intel is betting that Larrabee will offer more flexibility in parallel processing than the likes of Nvidia's CUDA and ATI's CTM (or whatever ATI has now), but with so many concurrent/GPGPU abstraction layers in the works (OpenCL, Grand Central, DirectX "compute" shaders, etc.), and the ever-present fear of vendor lock-in, I'm not sure it will become much of an advantage. If an efficient, vendor- and technology-neutral abstraction layer is created that can use CUDA, CTM, Larrabee, and so on, then the seeming primary benefit of Larrabee becomes inconsequential.