Benchmarks show that Intel's Alder Lake chips aren't M1 Max killers


Comments

  • Reply 41 of 56
    No longer relevant, since we won't get Macs with Alder Lake.
    williamlondon
  • Reply 42 of 56
    danox Posts: 3,266 member
    hucom2000 said:
    People who are paying attention to speed claims only might fall for Intel’s marketing…

    They “caught up” in less than a year. Impressive, as long as you leave power consumption out of the equation, which is the EXACT reason Apple ditched Intel. 

    Apple’s ARM chips are way superior, no doubt. The question is if anyone, except us geeks, will understand and care?

    The Intel geeks care and their minds are blown, but not in a good way… they are crying.
    watto_cobra
  • Reply 43 of 56
    blastdoor Posts: 3,524 member
    Serious question:

    What does an Intel Core i9 do that requires it to be as power inefficient in the same processing circumstances as an AS M1 Max?

    Presumably there's a reason why it draws so much more current to achieve the same ends? Are there features in it that are not replicated in the M1 Max? 

    I'm assuming the architecture is radically different, but what stops Intel from changing to that architecture?
    A serious question deserves a serious answer. I'm not seeing any so far, so here's my attempt.

    Short version: To achieve higher single thread performance, Intel runs Alder Lake at a very high clock speed which requires higher voltage. Increasing voltage has a nonlinear effect on power consumption. So Intel has simply chosen to occupy a region of the power-performance curve that is entirely disjoint from the part Apple occupies. This is why Alder Lake was introduced first on the desktop and why laptops with Alder Lake run hot and have short battery life. Intel cannot compete in the power-performance space where Apple rules, for reasons which require the 'long version.' 

    Long version: 

    There are multiple ways to get more performance out of a CPU, some of which involve very tricky tradeoffs. 

    The "easy" way to get more performance out of a CPU is to improve the manufacturing process, allowing more transistors and higher clock speeds for the same cost. This used to be the main way to improve performance. It's still important today, but it's harder to achieve AND cost reductions aren't always guaranteed. This issue is relevant to your question, though, because Apple's manufacturing partner, TSMC, uses a more advanced manufacturing process than Intel. Apple uses TSMC's "5nm" process while Intel uses a process that is more similar to TSMC's "7nm" process. This is an important advantage for Apple, but it's far from the only advantage. 

    Holding manufacturing process constant, there are essentially two ways to improve performance -- higher clock speeds or more transistors. Holding voltage constant, increasing the number of transistors and increasing the clock speed both increase power linearly. But at some point, it's impossible to increase the clock speed without increasing voltage, and increasing voltage increases power use non-linearly. This is really the crux of the matter. Apple's M1 processors max out in the low to mid 3 GHz range. Alder Lake hits 5 GHz, which is where its better single thread performance comes from. But to hit that 5 GHz clock speed, especially on Intel's older manufacturing process, Intel has to run at a higher voltage. Thanks to dynamic voltage scaling (https://en.wikipedia.org/wiki/Dynamic_frequency_scaling), Intel can lower the clock and voltage to lower power usage when needed, but of course performance suffers when they do, and they don't report that lower performance in their marketing. 
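
    To put rough numbers on that nonlinearity, here is a back-of-envelope sketch using the standard CMOS dynamic-power approximation P ≈ C · V^2 · f. The capacitance constant and the voltage/frequency operating points below are made-up illustrative values, not measured figures for any real Apple or Intel chip:

        # Back-of-envelope CMOS dynamic power: P ~ C_eff * V^2 * f
        # All constants below are illustrative, not measured chip data.

        def dynamic_power_watts(c_eff_farads, volts, freq_hz):
            # Classic switching-power approximation: P = C_eff * V^2 * f
            return c_eff_farads * volts**2 * freq_hz

        C_EFF = 3e-9  # effective switched capacitance (made-up constant)

        # Hypothetical operating points: (label, volts, GHz)
        points = [
            ("low clock / low voltage", 0.90, 3.2),
            ("high clock / high voltage", 1.30, 5.0),
        ]

        baseline = None
        for label, volts, ghz in points:
            p = dynamic_power_watts(C_EFF, volts, ghz * 1e9)
            baseline = baseline or p
            print(f"{label}: {p:5.1f} W ({p / baseline:.1f}x the first point)")

        # The clock rose ~1.6x but power rose ~3.3x ((1.30/0.90)^2 * 1.6), which
        # is the nonlinear penalty described above.

    The exact numbers don't matter; the point is that the voltage term is squared, so chasing the last GHz is disproportionately expensive.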

    Instead of increasing performance through clock speeds, Apple increases it through more transistors. The tradeoff, though, is that this is much harder to do from a design standpoint. It's hard because to take advantage of more transistors, you need to execute instructions in parallel rather than in sequence (a toy sketch of this appears after the list below). One way to do that is to have more cores and push the complexity onto software developers. Another way to do it is to hide the complexity from the software developer and use a combination of a smart compiler and a cleverly designed CPU to take a 'regular' program and make instructions run in parallel (even when the developer didn't think to write it that way). All modern processors do this, but they differ in how well they do it. Apple does it better than just about everybody else. They do it better than everybody else for multiple reasons:

    1. they use a RISC processor which makes it easier to schedule multiple instructions to execute in parallel
    2. they control the compiler 
    3. they have some of the most talented people
    4. they have time, money, and smart management

    Many companies have one or two of these; almost nobody else has all of them. 
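
    Here is the toy sketch mentioned above: a deliberately simplified scheduler (not a model of any real Apple or Intel core) that counts how many cycles a list of one-cycle "instructions" needs on a 1-wide versus a 4-wide machine, assuming an instruction can issue once its inputs are ready. A wide machine only helps when instructions are independent of each other:

        # Toy illustration of instruction-level parallelism (ILP).
        # This is a cartoon scheduler, not how any real CPU works.

        def schedule(instrs, width):
            """instrs: list of (name, [names of instructions it depends on])."""
            finish = {}                      # name -> cycle in which it completes
            cycle, pending = 0, list(instrs)
            while pending:
                cycle += 1
                issued = 0
                for name, deps in list(pending):
                    ready = all(finish.get(d, cycle + 1) < cycle for d in deps)
                    if ready and issued < width:
                        finish[name] = cycle
                        pending.remove((name, deps))
                        issued += 1
            return cycle

        # A serial dependency chain: each add needs the previous result.
        chain = [(f"a{i}", [f"a{i-1}"] if i else []) for i in range(8)]

        # The same eight adds rewritten as independent work (no chain).
        independent = [(f"b{i}", []) for i in range(8)]

        print("1-wide, dependent chain :", schedule(chain, 1), "cycles")
        print("4-wide, dependent chain :", schedule(chain, 4), "cycles")  # no faster
        print("4-wide, independent work:", schedule(independent, 4), "cycles")

    The dependent chain takes 8 cycles no matter how wide the machine is; the independent version finishes in 2 on the 4-wide machine. Extracting that independence automatically is exactly the compiler and out-of-order hardware work described above.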

    Intel has never had #1, but they typically had 2-4 (Intel's compilers are very good, fyi, as are Microsoft's). Where Intel stumbled badly for many years was lack of smart management. Gelsinger appears to be fixing that, but fixing #1 will be harder. 

    So for now, Intel's processors cannot directly compete with Apple's. I give Intel credit for being clever -- they have chosen to not compete directly, but to simply move to a different part of the battlefield where Apple currently has no troops. Heaven help Intel if Apple chooses to deploy their superior firepower to that part of the power-performance battlefield, though! 
    tenthousandthings muthuk_vanalingam pscooter63 watto_cobra h2p
  • Reply 44 of 56
    blastdoor said:
    Short version: To achieve higher single thread performance, Intel runs Alder Lake at a very high clock speed which requires higher voltage. […] Heaven help Intel if Apple chooses to deploy their superior firepower to that part of the power-performance battlefield, though!
    Thanks for writing this. It’s very helpful.

    Seems pretty likely we’ll see Apple’s first salvo in the desktop space soon, like March or April. Sounds like it will still be a single-die configuration, basically a desktop tier for the M1 Max. Then a second salvo later in the year, with dual-die configurations in the Apple workstation(s). I think it’s unrealistic to expect quad-die in this initial round, or even for M2, but of course I don’t know anything about the science.

    So here’s another serious question — What do you make of these rumors about multiple-die configurations? What does that even mean? Is that something Intel and/or AMD does? 
    edited January 2022
  • Reply 45 of 56
    blastdoor Posts: 3,524 member
    blastdoor said:
    Short version: To achieve higher single thread performance, Intel runs Alder Lake at a very high clock speed which requires higher voltage. […]
    Thanks for writing this. It’s very helpful.

    Seems pretty likely we’ll see Apple’s first salvo in the desktop space soon, like March or April. Sounds like it will still be a single-die configuration, basically a desktop tier for the M1 Max. Then a second salvo later in the year, with dual-die configurations in the Apple workstation(s). I think it’s unrealistic to expect quad-die in this initial round, or even for M2, but of course I don’t know anything about the science. It just seems to me that a quad-die configuration would have to be planned from the very beginning with TSMC, on the process level — so the first process where Apple’s Macintosh Silicon would be in position to do that would be N3P (i.e., the M3 Max). 

    So here’s another serious question — What do you make of these rumors about multiple-die configurations? What does that even mean? Is that something Intel and/or AMD does? 
    Happy to help and thanks for the kind words :-) 

    Intel and AMD have done multiple-die configurations for many years. A particularly old example was the Pentium D, which was Intel's desperation response to AMD's first dual-core Athlon. The Pentium D consisted of two single-core Pentium 4 dies in the same package. This is called a multi-chip module (MCM). Historically, an MCM approach was more expensive than a single die, which is why for decades only companies like IBM would use MCM designs in big, expensive computers. 

    More recently, AMD resurrected the MCM approach in mainstream products with Ryzen. I have a Threadripper 2990WX next to my desk right now which consists of four complete CPU dies in an MCM. Today, AMD uses a 'chiplet' approach, in which the individual chips are not complete CPUs by themselves. Instead, the multi-core chiplets are paired with a central I/O die that handles communication among the chiplets and with the outside world (like RAM and PCIe). 

    There are a lot of ways to do MCM, and improving the implementation of the MCM concept is a major focus for all the big guys today (Intel, AMD, TSMC, Apple). Intel will be introducing an approach they call 'tiles'. 

    If the rumors about multiple M1 Pro/Max dies are correct, then that will also be an MCM implementation. If the rumors are literally true, then it would be an implementation similar to the 2990WX or Pentium D, because it would involve multiple complete CPUs (each of which could stand alone) as opposed to a chiplet design, in which the chips can only work together. The difference is that Apple will be using much more advanced packaging technology than those older chips. 

    The trick with MCM is balancing the need for all the chips to communicate with each other quickly with the increased power cost that comes from sending information at high speeds off-chip. So the interesting thing about Apple's design will be how they manage that tradeoff. 
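
    To see why that off-package signaling cost matters, here is a hedged back-of-envelope. The bandwidth and energy-per-bit figures are illustrative assumptions (dense on-package links are commonly described as roughly an order of magnitude cheaper per bit than long conventional links), not Apple's actual numbers:

        # Rough interconnect power: watts = (bits per second) * (joules per bit).
        # The constants below are illustrative assumptions, not measured figures.

        def link_power_watts(bandwidth_gbytes_per_s, picojoules_per_bit):
            bits_per_second = bandwidth_gbytes_per_s * 1e9 * 8
            return bits_per_second * picojoules_per_bit * 1e-12

        for label, pj_per_bit in [("dense on-package link", 0.5),
                                  ("conventional off-package link", 5.0)]:
            watts = link_power_watts(1000, pj_per_bit)   # 1 TB/s of die-to-die traffic
            print(f"{label}: ~{watts:.0f} W to sustain 1 TB/s")

    In other words, the same traffic that costs a few watts over a short, dense on-package interconnect can cost tens of watts over a conventional link, which is exactly the tradeoff Apple's packaging has to manage.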
    edited January 2022 muthuk_vanalingam tenthousandthings
  • Reply 46 of 56
    So I'll go ahead and start the flame war with the cliche, as far as price goes you are *not* comparing apples-to-apples (every pun intended). The MSI has a 17" screen so let's go ahead and actually compare the price of the MSI Raider with 17" screen as seen here (https://www.bhphotovideo.com/c/product/1639843-REG/msi_ge76_raider_11ug_054_17_3_ge76_raider_gaming.html?SID=s1643237475129c3nda52417) which costs $2485 to the 16" MacBook Pro (https://www.apple.com/shop/buy-mac/macbook-pro/16-inch) which costs $3499. Both have 32GB of RAM and 1TB NVMe.

    So maybe a correction or, god forbid, a little honesty in your reporting. Here come the bullet points:
    * Been using macOS since 1988.
    * Also, the 14" M1 Pro Max MacBook Pro with 32GB Ram and 1TB NVMe drive is $2800. So where are you getting your price data from?
    * I don't mind the Apple Fan-boy stuff. Just try to keep it honest.
    * We all know that the M1 is way more power efficient. Period. No arguments from anybody, not even Intel. That isn't my complaint about this article.
    You are comparing Apples to Oranges. The MSI's 17" screen has far inferior resolution to the 16" MacBook Pro, which has a 5K screen. Further, with far fewer pixels, the MSI's games have far fewer pixels to compute than on the MBP. 
    What? As much as I love mine, the 16" MBP doesn't even have a 4K screen — it's 3456x2234 — much less 5K.
    That works out to pretty much 4k. A bit less horizontal. A bit more vertical.

    Everyone who has seen it believes it is the best screen currently in a laptop.

    Point is, it is WAY more resolution - 4 times more - than the MSI has to push. 
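
    For what it's worth, the raw pixel arithmetic backs that up, assuming the MSI config in question uses the common 1920x1080 panel (the later reply in this thread also treats it as 1080p):

        # Pixel counts: 16" MacBook Pro panel vs. an assumed 1920x1080 MSI panel.
        mbp = 3456 * 2234      # 7,720,704 pixels
        msi = 1920 * 1080      # 2,073,600 pixels
        print(f"MacBook Pro pushes ~{mbp / msi:.1f}x as many pixels")   # ~3.7x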
    patchythepirate
  • Reply 47 of 56
    darkvader said:

    Apple is now able to produce their own SoCs specifically designed for each piece of hardware; they can build custom silicon for every product, from AirPods to AppleCar, something that no other company will be able to do.

    Nope.  Apple has ZERO chip production capability.  Apple builds virtually nothing.  The only actual Apple 'factories' are a final assembly facility in Cork, Ireland and a final assembly facility in Texas.  ALL the components those facilities use are imported, mostly from China.  Everything else Apple 'builds' is actually manufactured by other companies, mostly Foxconn in China.

    JFC. They obviously didn't mean literally build with their own hands. But they design it and TSMC manufactures it. Nobody said anything about Apple chip factories, troll.
    Great point. That's also why Apple products say "DESIGNED in California." They are designed by Apple and manufactured by third parties. It is the same with the SoCs.
    edited January 2022
  • Reply 48 of 56
    dewme Posts: 5,667 member
    blastdoor said:
    […] They do it better than everybody else for multiple reasons: 1. they use a RISC processor which makes it easier to schedule multiple instructions to execute in parallel […] Intel has never had #1, but they typically had 2-4 (Intel's compilers are very good, fyi, as are Microsoft's). […]
    Hmmm.

    Ever heard of Intel i860? Or i960? Both are RISC chips. The i860/i960 was actually the first chip that Windows NT prototypes ran on in the late 1980s before being moved to MIPS R3000 RISC and finally Intel x86 CISC chips. There are i860/i960 based systems still in use today in critical national defense systems.  

    Thanks for the links.  
    edited January 2022
  • Reply 49 of 56
    tht Posts: 5,608 member
    dewme said:
    blastdoor said:
    […] 1. they use a RISC processor which makes it easier to schedule multiple instructions to execute in parallel […] Intel has never had #1, but they typically had 2-4. […]
    Hmmm.

    Ever heard of Intel i860? Or i960? Both are RISC chips. The i860/i960 was actually the first chip that Windows NT prototypes ran on in the late 1980s before being moved to MIPS R3000 RISC and finally Intel x86 CISC chips. There are i860/i960 based systems still in use today in critical national defense systems.  

    Thanks for the links.  
    Yes, the ISA is basically immaterial in the grand scheme of things. There's been so much talk about RISC vs CISC, but it's really immaterial. What matters is transistor density and how cheap the transistors are to make. Intel being stuck on 14nm for 5 years meant they couldn't really add more transistors to their chips, not the doubling that a full node improvement would give. Their only recourse was to increase clock rates, and they were running on the fumes of fumes on 14nm. That's how "thermal velocity boost" became a marketable feature for them.

    A huge intangible is going to be product design decisions. The chip processor team has to make the right design decisions and have the right predictions of where fab tech and software are going to be. This is really part of the "magic" of processor teams that continually ship improving designs. It doesn't take much to fail with this.

    Given the amount of software compiled for x86, with a rather large percentage of that software not actively maintained or poorly maintained, Intel and AMD have basically an infinite amount of time before non-x86 systems truly hurt them. Legacy software is such a huge and inherent driver for customers to keep buying x86 hardware, even when the chips suck. Customers won't switch unless it is zero-effort seamless. So there is a lot of room for mistakes for Intel and AMD.
    dewme
  • Reply 50 of 56
    tht Posts: 5,608 member
    tht said:
    Serious question:

    What does an Intel Core i9 do that requires it to be as power inefficient in the same processing circumstances as an AS M1 Max?

    Presumably there's a reason why it draws so much more current to achieve the same ends? Are there features in it that are not replicated in the M1 Max? 

    I'm assuming the architecture is radically different, but what stops Intel from changing to that architecture?
    Power is consumed when a transistor switches from 0 to 1 or 1 to 0. Switching is driven by clock cycles; the more switching, the more power is consumed. 
    Well, that is presumably a given. And possibly at a slightly lower level than I was alluding to. More specifically, is there some set of processing or overall design feature that Intel does wrong? Or does it do more 'stuff' that the M1 doesn't do? Is it required to support legacy ways of doing stuff that the M1 is free from? 
    It basically all comes down to economics. There's a lot of hoo-ha about instruction set architecture (RISC vs CISC), but it's not a big deal imo. It comes down to the economics of how many transistors you can have in a chip, what power budgets the OEM is willing to design for, and whether it is profitable in the end.

    First and foremost, Intel's fabrication technology - how small they can make the transistors - was effectively broken for close to 4 years. Two CEOs and executive teams were sacked because of this. The smaller the transistors, the more transistors you can put in a chip and the less power it will take to power them. Intel was the pre-eminent manufacturer of computer chips for the better part of 40 years, with 75 to 80% marketshare for most of those years. It took a lot of mistakes for them to lose their fab lead.

    Two things enabled TSMC, Apple's chip manufacturer, to catch and lap Intel in terms of how small a transistor they could make. The smartphone market became the biggest chip market, both in terms of units and money. It's bigger than PCs and servers. This allowed TSMC to make money, a lot of it Apple fronted, and invest in making smaller and smaller transistors. The other thing was Intel fucked up, on both ends. They decided not to get into the smartphone market (they tried when it became obvious, but failed). They then made certain decisions about the design of their 10nm fab that ended up not working and hence a 4 year delay, allowing TSMC to lap them. Even Samsung caught up and lapped them a little.

    More transistors mean more performance. Apple's chips have a lot more transistors than Intel's, probably by a factor of 2. If it isn't economical to use that many transistors, you can instead increase performance through higher clock rates. Higher clock rates mean more power, and it's not a linear relationship; power rises much faster than the clock does. So Apple's transistors are roughly 2x smaller than Intel's, there are more of them, and Apple can consequently design its chips to run at relatively lower clock rates while consuming less power.

    Intel can theoretically design chips with the same number of transistors as Apple, but the chips will be 2x as large. They will not be profitable doing this. Well, really, they will not enjoy their traditional 60% margins if they do it this way, i.e., not profitable "enough". So smallish chips with higher power consumption is their way. Apple hates high-power-consumption chips and does it the opposite way (big chips, lower power consumption), and you end up with an M1 Pro having about the same performance as an Alder Lake i9-12900H, but with the M1 Pro needing 30 W and the i9 needing 80 to 110 W. And Apple has a 2x to 3x more performant on-chip GPU than Intel has. They could not do this if TSMC hadn't become the leader in chip manufacturing.
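
    Using the wattage figures above (treat them as the rough numbers quoted in this thread, not lab measurements), the performance-per-watt gap works out to roughly 3x:

        # Rough perf-per-watt from the numbers quoted above: roughly equal
        # benchmark scores, ~30 W for the M1 Pro vs. 80-110 W for the i9.
        score = 1.0          # normalized performance, assumed equal for both chips
        m1_watts = 30
        for i9_watts in (80, 110):
            advantage = (score / m1_watts) / (score / i9_watts)
            print(f"vs. i9 at {i9_watts} W: M1 Pro does ~{advantage:.1f}x the work per watt")
        # Prints roughly 2.7x and 3.7x.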

    Intel has plans to regain the fab lead, be able to fab the smallest transistors, but we will see about that. They might, or not.
    Intel has lost the PC performance war. It can still enjoy over 50% profit margin primarily due to servers. The same is true for Microsoft. The growth market of hi-tech now is cloud computing which requires a lot of servers with tremendous amount of memory. 
    The AppleInsider article is saying that they just regained the performance lead. Intel has regained the performance lead in both PC desktops and laptops. They need to use 2x to 3x more power to do it, but they do in fact have the performance lead. All Intel really needs is performance parity, or to trail by only a little bit, and they will be able to maintain their PC share. The vast majority of buyers don't care about power consumption. They care about upfront costs, but power consumption? No. There isn't any, or much, growth in the PC market, but having 70% of it is a rather large chunk of money that Intel really should try to maintain, and obviously that's what they are trying to do.

    The server market is rather interesting. It seems to be getting more diverse in hardware, not commoditizing around x86 like the PC market did. Nvidia has a ridiculous stock market valuation right now, all because of GPU compute and cryptocurrency. All major cloud vendors seem to have projects for designing their own hardware, and not just your generic CPU or GPU. Lots of tensor units, crypto units, along with ARM upstarts. Server vendors care about power consumption, and power efficiency may be enough to get them to switch because electricity savings could be big. And, the market still has a lot of growth left.
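
    To make "electricity savings could be big" concrete, here is a hedged back-of-envelope; the fleet size, per-server savings, electricity price, and cooling multiplier are all assumptions picked purely for illustration:

        # Illustrative datacenter savings from more power-efficient server chips.
        # Every constant below is an assumption made up for this example.
        servers            = 10_000
        watts_saved_each   = 100        # CPU package savings per server
        cooling_multiplier = 1.5        # extra power spent removing that heat
        price_per_kwh      = 0.10       # USD
        hours_per_year     = 24 * 365

        kwh_saved = servers * watts_saved_each * cooling_multiplier * hours_per_year / 1000
        print(f"~${kwh_saved * price_per_kwh:,.0f} saved per year")   # ~$1.3M

    At that scale even a modest efficiency edge is real money, which is why the server market pays attention to perf-per-watt in a way most laptop buyers don't.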

    Apple could make a play here, but it's not a business they want to be in. An Apple 64+8 server chip running at about 300 W would be pretty competitive, especially since it could idle at very low power. Not their thing.
  • Reply 51 of 56
    dewme Posts: 5,667 member
    tht said:
    dewme said:
    blastdoor said:
    […]
    Hmmm. Ever heard of Intel i860? Or i960? Both are RISC chips. […]
    Yes, the ISA is basically immaterial in the grand scheme of things. […] Given the amount of software compiled for x86, with a rather large percentage of that software not actively maintained or poorly maintained, Intel and AMD have basically an infinite amount of time before non-x86 systems truly hurt them. Legacy software is such a huge and inherent driver for customers to keep buying x86 hardware, even when the chips suck. Customers won't switch unless it is zero-effort seamless. So there is a lot of room for mistakes for Intel and AMD.
    I agree. One of the other reasons I don’t get excited about these Apple Silicon vs Intel “benchmark dog & pony shows” is that there is a vast number of people who are stuck with Intel (or AMD) because they are running legacy apps that will never be ported to more modern architectures. I have some of these, which is why I keep an x86 Windows machine around. To me it’s no big deal because these legacy apps don’t need an expensive platform to run on and I can peruse the bargain bin to meet my needs. This allows me to put my real money on modern systems and modern software that have far fewer compromises and more directly consumable tangible benefits, like being able to run fast and unplugged for 15-16 hours versus 3-5 hours. Today the most modern and optimized HW and SW combination with the greatest performance headroom that’s generally available is Apple Silicon based platforms running Apple Silicon native apps. 

    Intel is obviously happy to see these dog & pony benchmarking performances against Apple’s chips to try to convince themselves that they are able to run with the big dogs, even when the really big dog that they want to be seen running with is not really competing with them at a tactical level, but more at a social media and blogger level.

    Gaming on computers (vs consoles) is kind of a niche that I’m totally unfamiliar with, having never worked in that space. I’m assuming that gaming is an area where the market share advantage of PCs over Macs is enough of a big deal for game developers to stay comfortably committed to the PC platform. The open architecture of the PC platform allows for all kinds of ugliness and inefficiency of the base platform to be spackled over with the addition of a massive add-on GPU board that’s far more powerful and sophisticated than the platform it plugs into. It’s the ultimate tail wagging the dog situation.

    edited January 2022
  • Reply 52 of 56
    […] You are comparing Apples to Oranges. The MSI's 17" screen has far inferior resolution to the 16" MacBook Pro, which has a 5K screen. […]
    What? As much as I love mine, the 16" MBP doesn't even have a 4K screen — it's 3456x2234 — much less 5K.
    That works out to pretty much 4k. A bit less horizontal. A bit more vertical.

    Everyone who has seen it believes it is the best screen currently in a laptop.

    Point is, it is WAY more resolution - 4 times more - than the MSI has to push. 
    Yes, I get that it's closeish to 4K and far higher than 1080p, but it's a far cry from 5K which is what I was replying to.
    9secondkox2
  • Reply 53 of 56
    blastdoor said:
    Short version: To achieve higher single thread performance, Intel runs Alder Lake at a very high clock speed which requires higher voltage. […]
    Thanks for writing this. It’s very helpful. Seems pretty likely we’ll see Apple’s first salvo in the desktop space soon, like March or April. […] I think it’s unrealistic to expect quad-die in this initial round, or even for M2, but of course I don’t know anything about the science.
    Why would you say it's unrealistic when you say you don't know anything about the science and the rumors say otherwise, that there will be a Jade 2C-Die and Jade 4C-Die?
    9secondkox2
  • Reply 54 of 56
    blastdoor said:
    Short version: To achieve higher single thread performance, Intel runs Alder Lake at a very high clock speed which requires higher voltage. […]
    Thanks for writing this. It’s very helpful. […] I think it’s unrealistic to expect quad-die in this initial round, or even for M2, but of course I don’t know anything about the science.
    Why would you say it's unrealistic when you say you don't know anything about the science and the rumors say otherwise, that there will be a Jade 2C-Die and Jade 4C-Die?
    Well, maybe I said that because that’s what I think? Are you suggesting I don’t think that? Or that I’m not entitled to think that? Do you really think science is the only factor in these decisions? Regardless, I quickly deleted the explanation I gave for that because I didn’t want responses to focus on it, but you can see it above, in the quoted section of Blastdoor’s immediate response to my question. 

    There is more than one rumor about the quad — one of them is that it will be on the 3nm process.
    edited January 2022
  • Reply 55 of 56
    blastdoorblastdoor Posts: 3,524member
    tht said:
    dewme said:
    blastdoor said:
    Serious question:

    What does an Intel Core i9 do that requires it to be as power inefficient in the same processing circumstances as an AS M1 Max?

    Presumably there's a reason why it draws so much more current to achieve the same ends? Are there features in it that are not replicated in the M1 Max? 

    I'm assuming the architecture is radically different, but what stops Intel from changing to that architecture?
    A serious question deserves a serious answer. I'm not seeing any so far, so here's my attempt.

    Short version: To achieve higher single thread performance, Intel runs Alder Lake at a very high clock speed which requires higher voltage. Increasing voltage has a nonlinear effect on power consumption. So Intel has simply chosen to occupy a region of the power-performance curve that is entirely disjoint from the part Apple occupies. This is why Alder Lake was introduced first on the desktop and why laptops with Alder Lake run hot and have short battery life. Intel cannot compete in the power-performance space where Apple rules, for reasons which require the 'long version.' 

    Long version: 

    There are multiple ways to get more performance out of a CPU, some of which involve very tricky tradeoffs. 

    The "easy" way to get more performance out of a CPU is to improve the manufacturing process, allowing more transistors and higher clock speeds for the same cost. This used to be the main way to improve performance. It's still important today, but it's harder to achieve AND cost reductions aren't always guaranteed. This issue is relevant to your question, though, because Apple's manufacturing partner, TSMC, uses a more advanced manufacturing process than Intel. Apple uses TSMC's "5nm" process while Intel uses a process that is more similar to TSMC's "7nm" process. This is an important advantage for Apple, but it's far from the only advantage. 

    Holding manufacturing process constant, there are essentially two ways to improve performance -- higher clock speeds or more transistors. Holding voltage constant, increasing the number of transistors and increasing the clock speed both increase power linearly. But at some point, it's impossible to increase the clock speed without increasing voltage, and increasing voltage increases power use non-linearly. This is really the crux of the matter. Apple's M1 processors max out in the low to mid 3 GHz range. Alder Lake hits 5 GHz, which is where it's better single thread performance comes from. But to hit that 5 GHz clock speed, especially on Intel's older manufacturing process, Intel has to run at a higher voltage. Thanks to dynamic voltage scaling (https://en.wikipedia.org/wiki/Dynamic_frequency_scaling), Intel can lower the clock and voltage to lower power usage when needed, but of course performance suffers when they do, and they don't report that lower performance in their marketing. 

    Instead of increasing performance through clock speeds, Apple increases it through more transistors. The tradeoff, though, is that this is much harder to do from a design standpoint. It's hard because to take advantage of more transistors, you need to execute instructions in parallel rather than in sequence. One way to do that is to have more cores and push the complexity onto software developers. Another way to do it is to hide the complexity from the software developer and use a combination of a smart compiler and a cleverly designed CPU to take a 'regular' program and make instructions run in parallel (even when the developer didn't think to write it that way). All modern processors do this, but they differ in how well they do it. Apple does it better than just about everybody else. They do it better than everybody else for multiple reasons:

    1. they use a RISC processor which makes it easier to schedule multiple instructions to execute in parallel
    2. they control the compiler 
    3. they have some of the most talented people
    4. they have time, money, and smart management

    Many companies have one or two of these, almost nobody else has all of them. 

    Intel has never had #1, but they typically had 2-4 (Intel's compilers are very good, fyi, as are Microsoft's). Where Intel stumbled badly for many years was lack of smart management. Gelsinger appears to be fixing that, but fixing #1 will be harder. 

    So for now, Intel's processors cannot directly compete with Apple's. I give Intel credit for being clever -- they have chosen to not compete directly, but to simply move to a different part of the battlefield where Apple currently has no troops. Heaven help Intel if Apple chooses to deploy their superior firepower to that part of the power-performance battlefield, though! 
    Hmmm.

    Ever heard of the Intel i860? Or i960? Both are RISC chips. The i860 was actually the first chip that Windows NT prototypes ran on in the late 1980s, before the port moved to the MIPS R3000 (RISC) and finally to Intel's x86 CISC chips. There are i860/i960-based systems still in use today in critical national defense systems.  

    Thanks for the links.  
    Yes, the ISA is basically immaterial in the grand scheme of things. There's been so much talk about RISC vs CISC, but it's really immaterial. What matters is transistor density and how cheaply those transistors can be made. Intel being stuck on 14nm for 5 years meant they couldn't really add more transistors to their chips, certainly not the doubling that a full node improvement would give. Their only recourse was to increase clock rates, and they were running on the fumes of fumes on 14nm. That's how "thermal velocity boost" became a marketable feature for them.

    A huge intangible is product design decisions. The processor team has to make the right design choices and correctly predict where fab technology and software are headed. This is really part of the "magic" of teams that continually ship improving designs. It doesn't take much to get this wrong.

    Given the amount of software compiled for x86, with a rather large percentage of it poorly maintained or not maintained at all, Intel and AMD have basically an infinite amount of time before non-x86 systems truly hurt them. Legacy software is such a huge and inherent driver for customers to keep buying x86 hardware, even when those chips aren't great. Customers won't switch unless it's zero-effort seamless. So there is a lot of room for mistakes for Intel and AMD.
    ISA matters much less for multithreaded total throughput, which is what matters in a server context. 

    For single thread performance/watt it matters more. 
  • Reply 56 of 56
    blastdoorblastdoor Posts: 3,524member
    tht said:
    tht said:
    Serious question:

    What does an Intel Core i9 do that requires it to be as power inefficient in the same processing circumstances as an AS M1 Max?

    Presumably there's a reason why it draws so much more current to achieve the same ends? Are there features in it that are not replicated in the M1 Max? 

    I'm assuming the architecture is radically different, but what stops Intel from changing to that architecture?
    Power is consumed when a transistor switches from 0 to 1 or 1 to 0. Switching is driven by clock cycles. The more switching, the more power is consumed. 
    Well, that is presumably a given. And possibly at a slightly lower level than I was alluding to. More specifically, is there some set of processing or overall design feature that Intel does wrong? Or does it do more 'stuff' that the M1 doesn't do? Is it required to support legacy ways of doing stuff that the M1 is free from? 
    It basically all comes down to economics. There's a lot of hoo-ha about instruction set architecture (RISC vs CISC), but it's not a big deal imo. It comes down to the economics of how many transistors you can have in a chip, what power budgets the OEM is willing to design for, and whether it is profitable in the end.

    First and foremost, Intel's fabrication technology - how small they can make the transistors - was effectively broken for close to 4 years. Two CEOs and their executive teams were sacked because of this. The smaller the transistors, the more of them you can put in a chip and the less power it takes to run them. Intel was the pre-eminent manufacturer of computer chips for the better part of 40 years, with 75 to 80% market share for most of those years. It took a lot of mistakes for them to lose their fab lead.

    Two things enabled TSMC, Apple's chip manufacturer, to catch and lap Intel in how small a transistor it could make. First, the smartphone market became the biggest chip market, in both units and dollars; it's bigger than PCs and servers. That let TSMC make money, a lot of it fronted by Apple, and invest in ever-smaller transistors. The other thing was that Intel fucked up, on both ends. They decided not to get into the smartphone market (they tried once it became obvious, but failed). They then made design decisions for their 10nm fab that didn't work out, leading to a roughly 4-year delay and allowing TSMC to lap them. Even Samsung caught up and lapped them a little.

    More transistors mean more performance. Apple's chips have a lot more transistors than Intel's, probably by a factor of 2. If it isn't economical to have a lot of transistors, you can instead increase performance with higher clock rates. Higher clock rates mean more power, and the relationship is not linear: power climbs much faster than the clock once voltage has to go up. So Apple's transistors are about 2x smaller than Intel's, there are more of them, and Apple can consequently design its chips to run at relatively low clock rates while consuming less power.
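
    As a rough illustration of why higher clocks are so expensive, here is a minimal back-of-envelope sketch in Python. It uses the standard CMOS dynamic-power relation (power scales roughly with C·V²·f); the specific voltage and frequency figures are assumptions chosen only to show the shape of the curve, not measured values for any real chip.

    ```python
    # Back-of-envelope sketch, not measured data. CMOS dynamic power scales
    # roughly as P ~ alpha * C * V^2 * f. The voltages and frequencies below
    # are illustrative assumptions, not published specs for any real chip.

    def relative_dynamic_power(volts, freq_ghz, ref_volts=1.0, ref_freq_ghz=3.2):
        """Power relative to a reference design point, holding alpha*C fixed."""
        return (volts / ref_volts) ** 2 * (freq_ghz / ref_freq_ghz)

    # Hypothetical "wide and slow" design point: ~3.2 GHz at ~0.9 V.
    wide_and_slow = relative_dynamic_power(0.9, 3.2)

    # Hypothetical "narrow and fast" design point: ~5.0 GHz, which needs
    # extra voltage headroom, say ~1.3 V.
    narrow_and_fast = relative_dynamic_power(1.3, 5.0)

    print(f"3.2 GHz @ 0.9 V: {wide_and_slow:.2f}x reference power")    # ~0.81x
    print(f"5.0 GHz @ 1.3 V: {narrow_and_fast:.2f}x reference power")  # ~2.64x
    print(f"Power cost of the ~1.6x clock bump: {narrow_and_fast / wide_and_slow:.1f}x")  # ~3.3x
    ```

    The exact numbers are invented, but the quadratic voltage term is why a modest clock bump can roughly triple power draw, which is the nonlinear effect described above.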

    Intel can theoretically design chips with the same number of transistors as Apple, but the chips would be 2x as large, and Intel would not be profitable doing that. More precisely, it would not enjoy its traditional ~60% margins that way, i.e., it would not be profitable "enough". So smallish chips with higher power consumption are Intel's way. Apple hates high-power chips and does it the opposite way (big chips, lower power consumption), and you end up with an M1 Pro having about the same performance as an Alder Lake i9-12900H, but with the M1 Pro needing 30 W where the i9 needs 80 to 110 W. Apple also has an on-chip GPU that is 2x to 3x more performant than Intel's. None of this would be possible if TSMC hadn't become the leader in chip manufacturing.
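
    Taking the quoted figures at face value, here is a quick sketch of what they imply for performance per watt. The wattages are the approximate package-power numbers from the paragraph above, treated as rough values at similar multi-core scores.

    ```python
    # Rough performance-per-watt comparison using the wattage figures quoted
    # above, treated as approximate package power at similar multi-core scores.
    m1_pro_watts = 30
    i9_12900h_watts_low, i9_12900h_watts_high = 80, 110

    # If the two parts land at roughly the same benchmark score, the efficiency
    # ratio is just the inverse of the power ratio.
    advantage_low = i9_12900h_watts_low / m1_pro_watts    # ~2.7x
    advantage_high = i9_12900h_watts_high / m1_pro_watts  # ~3.7x

    print(f"Approximate M1 Pro perf/watt advantage: {advantage_low:.1f}x to {advantage_high:.1f}x")
    ```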

    Intel has plans to regain the fab lead, i.e., to be able to fab the smallest transistors, but we will see about that. They might, or they might not.
    Intel has lost the PC performance war. It can still enjoy over a 50% profit margin, primarily thanks to servers. The same is true for Microsoft. The growth market in high tech now is cloud computing, which requires a lot of servers with tremendous amounts of memory. 
    The AI article is saying that they just regained the performance lead. Intel has regained the performance lead in both PC desktops and laptops. It needs 2x to 3x more power to do it, but it does in fact have the lead. All Intel really needs is performance parity, or to trail by only a little bit, and it will be able to maintain its PC share. The vast majority of buyers don't care about power consumption; they care about upfront cost. There isn't much growth, if any, in the PC market, but having 70% of it is a rather large chunk of money that Intel really should try to maintain, and obviously that's what it is trying to do.

    The server market is rather interesting. It seems to be getting more diverse in hardware, not commoditizing around x86 the way the PC market did. Nvidia has a ridiculous stock market valuation right now, all because of GPU compute and cryptocurrency. All major cloud vendors seem to have projects for designing their own hardware, and not just generic CPUs or GPUs: lots of tensor units and crypto units, along with ARM upstarts. Server vendors do care about power consumption, and better efficiency may be enough to get them to switch, because the electricity savings could be big. And the market still has a lot of growth left.

    Apple could make a play here, but it's not a business they want to be in. An Apple 64+8 server chip running at about 300 W would be pretty competitive, especially since it could idle at very low power. But it's not their thing.
    It's more accurate to say that PC gamers don’t care about power consumption. Performance per watt is very important for mobile and the data center.

    It was superior performance per watt that brought Apple to Intel, and it's Intel’s loss of that superiority that has taken Apple away from Intel. 

    I agree that it would be interesting to see Apple make a data center play. I don’t think it’s totally out of the realm of possibility, especially if we expand the definition of ‘data center play’ a bit.
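
    For the hypothetical 64+8 server chip at "about 300 W" quoted above, here is the rough arithmetic that gets you to a number in that range. The per-core and uncore figures are assumed ballpark values in line with third-party estimates of sustained M1-class core power, not anything Apple has published.

    ```python
    # Purely speculative scaling estimate for a hypothetical 64+8 server chip.
    # Per-core and uncore figures are assumed ballpark values, not Apple specs.
    p_core_watts = 4.0    # assumed sustained power per performance core
    e_core_watts = 0.8    # assumed sustained power per efficiency core
    uncore_watts = 40.0   # assumed fabric, memory controllers, I/O, etc.

    total_watts = 64 * p_core_watts + 8 * e_core_watts + uncore_watts
    print(f"Rough package estimate: {total_watts:.0f} W")  # ~302 W, in line with the figure above
    ```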