Apple may shun Intel for custom A-series chips in new Macs within 1-2 years


Comments

  • Reply 121 of 183



    Intel's latest designs only have to do µop translation when dealing with *very* old software.  Anything that is x86-64 native (OS X is) isn't using any of that, and is running at full speed with full register use.  Intel solved that back with the Pentium Pro in 1995, and re-solved it more recently with the Core microarchitecture when they moved away from the garbage Pentium 4 "NetBurst" design.

     

    More to the point, anyone thinking that you can just throw cores at it to make up the difference is falling prey to the biggest modern fallacy of general-purpose computing - multiprocessing absolutely does not scale linearly with core count anywhere except benchmarks.  Plus, unless the app is written to spin off shloads of threads and execute out of order, you're just creating wait-states while the result of the one thread you actually need comes back after all the other ones are done.  Oh, and if that result changes the work of the other ones?  Whoops - re-do all that work.
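
    (To put rough numbers on that scaling claim: a minimal Swift sketch of Amdahl's law. The 90%-parallel workload below is an illustrative assumption, not a measurement.)

        // Amdahl's law: the speedup from N cores is capped by the fraction
        // of the work that is inherently serial.
        func amdahlSpeedup(parallelFraction p: Double, cores n: Double) -> Double {
            return 1.0 / ((1.0 - p) + p / n)
        }

        // Even a workload that is 90% parallel tops out quickly:
        print(amdahlSpeedup(parallelFraction: 0.9, cores: 4))   // ~3.08x, not 4x
        print(amdahlSpeedup(parallelFraction: 0.9, cores: 32))  // ~7.80x, not 32x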

     

    There's a reason why multiprocessing lived in the domain of servers for years before making its way to endpoints.  Servers have much more predictable workloads, so multithreading makes far more sense there, and the performance gains from x-way CPU designs are far easier to realize.

  • Reply 122 of 183
    zoolook Posts: 657 member
    Quote:

    Originally Posted by brlawyer View Post

     



    I am still waiting for those apps to appear - GCD currently seems like great vaporware.


     

    Not really - but people seem to think it would magically transform single-threaded applications into perfectly balanced multi-threaded applications. It doesn't. What it does is balance thread load more evenly depending on demand, so threads are distributed more dynamically. Before GCD, it was possible (and often happened) that multiple threads or instances would end up on one core (see countless Logic Pro threads about 5 instruments sitting on one core choking it, while 4 cores did nothing) - a problem now mostly solved. However, one particularly hungry single-threaded application can still saturate an entire core. GCD won't automatically break that into two or more threads/cores; that simply isn't possible to do at the system level.
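
    (To make the distinction concrete: a minimal GCD sketch in Swift, with hypothetical render functions standing in for real instrument plugins. Dispatched work items spread across cores; one big item stays on one core.)

        import Dispatch

        // Stand-ins for real work (hypothetical names):
        func renderInstrument(_ track: Int) { /* per-track DSP work */ }
        func renderEntireMix() { /* one big, indivisible render */ }

        let queue = DispatchQueue.global(qos: .userInitiated)
        let group = DispatchGroup()

        // GCD balances *independent* work items across whatever cores are
        // free, so five instruments no longer pile onto one core:
        for track in 0..<5 {
            queue.async(group: group) { renderInstrument(track) }
        }
        group.wait()

        // But a single hungry closure is still one thread on one core;
        // GCD cannot split its internals automatically:
        queue.async { renderEntireMix() }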

  • Reply 123 of 183
    solipsismy wrote: »
    Idle thoughts: I wonder if Apple could have already rewritten a good part of Mac OS X in Swift at this point.

    Based on some of the bugs reported, anomalies in Xcode/Swift/APIs, rapid evolution of Swift, and [Apple Engineer] comments on the developer forums -- I think you are right on (for iOS too)!

    Surely there are parts we should expect to be rewritten for iOS and Mac OS X as major updates appear each year.

    The IBM/Apple partnership will drive this effort for mobile (iOS) -- and IT (OS X) to support mobile. I would SWAG that the rewrites are 50% done and will be 90%-100% done by WWDC 2015. I think there are significant code safety and speed advantages to this.

    I suspect that many planned products and services (from Apple, IBM, and some major IT and third-party developers) depend on this.

    It's happening right in front of our eyes ... if we will only look and understand what we see!

    I'd think we should also assume that Apple ported Mac OS X to ARM64 a long time ago.

    Yep! You don't have to have an ARM64 chip to run the ARM64 instruction set (and develop variations).

    I don't know what Apple uses internally for hardware/software development today -- back in the 1980s it was a Cray. When Steve returned, I suspect they used NeXT software running on Intel boxes over LAN/WLAN. IDK if they need a mainframe in today's development world.
  • Reply 124 of 183
    Quote:

    Originally Posted by pdq2 View Post

     

    Kinda funny to see folks so worked up over this. The A8x already has a considerably higher transistor count than many Intel mobile chips, and at least some of the transistors in the Intel chips are to (essentially, at this date) emulate the old x86 architecture/instruction set.

     

    I have no doubt Apple, controlling both the software and the hardware, could get better performance (or performance per watt, or performance per dollar) than Intel, which has to support 30-year-old software cruft. As a result, I think (at least some) Macs probably _will_ move to ARM. Whether it makes a clean break from the past, or retains some Windows compatibility, is a legitimate question that I'm sure Cook will give full consideration to.


    http://www.extremetech.com/computing/190946-stop-obsessing-over-transistor-counts-theyre-a-terrible-way-of-comparing-chips

     

    Saying a higher transistor count means it's a better chip is like saying that a V8 will always have more power than a 6-cylinder because it's got two extra cylinders.  Hint:  forced induction makes this completely incorrect.

     

    Look at the whole picture, not just how many gates someone can shove into their die.

  • Reply 125 of 183
    Marvin Posts: 15,322 moderator
    There's no reason they can't have both ARM and x86 laptops/desktops. The people who run Windows or VMs can continue to buy the more expensive models. There will be people who only use the Mac for:

    Webcam, email, Safari/Facebook/Twitter, Word, GarageBand, iTunes, Xcode or web coding, movies & Netflix, Photos, syncing and backing up iPad and iPhone, basic or no games, no Thunderbolt, no Windows.

    For those people, x86 actually doesn't make sense because they are paying more than necessary for hardware capability they're not using. These can just be entry models like a $699 Retina laptop and a $799 21.5" iMac.

    Mac software developers would just target two architectures to stay compatible.

    It could potentially drive average selling prices way down if too many current x86 buyers migrate down, but $300 cheaper would also attract more PC buyers, businesses and schools.
  • Reply 126 of 183
    staticx57 wrote: »
    The whole industry doesn't make both the hardware and the OS.
    So Apple has not figured out how to run OSX on the limited number of Intel chips for the past 8-9 years?

    Not at all! Not what I meant. If anyone could solve software problems by throwing hardware [more cores] at it, it would be Apple because they have the expertise and control of the entire stack -- in the case of ARM and iOS/OSX ... and significant leverage to get special treatment from Intel, etc.

    I posted earlier that:

    A current Intel CPU, in real time, decodes x86 CISC instructions and translates/executes them as RISC-like (ARM-style) micro-ops

    How much of the power of, say, an i5 or i7 is spent [wasted] doing that?

    Couldn't Apple dedicate an A8X or A9X chip (or 2) to the same task -- translating x86 CISC into ARM RISC ... at equal or better performance, 1/2 the power and 1/3 the cost?
  • Reply 127 of 183
    zoolook wrote: »
    Much of iOS's apparent speed is an exceptionally slick operating system. I love OS X as much as the next person here, but it's MUCH more bloated than iOS.

    But that's changing ... Swiftly.
  • Reply 128 of 183
    Quote:

    Originally Posted by Durandal1707 View Post





    Translation: I've got nothing, so I'll just make a lame attack with no substance instead and hope nobody notices.



    It is a well-known fact that the transition to Intel increased Apple's Mac sales dramatically. Also, Parallels and VMware are high-selling products on the Mac platform. I don't think Apple will drop the Intel architecture on their main lineup any time soon. They could conceivably do it on an ultra-low-end Chromebook-type e-mail reader, if that's what this new 12" one-USB thing turns out to be, but even that is fraught with danger — just look at how Microsoft's low-end Surface model, being slow and incompatible with everything, destroyed the perception of that line.



    I'm not arguing with you.  But that chart means squat and does not improve your argument one iota.  In fact, it weakens it because it's such a transparent rhetorical move.

     

    The best part about well-known facts is that you don't have to cite sources!

  • Reply 129 of 183
    pdq2 wrote: »
    Kinda funny to see folks so worked up over this. The A8x already has a considerably higher transistor count than many Intel mobile chips, and at least some of the transistors in the Intel chips are to (essentially, at this date) emulate the old x86 architecture/instruction set.

    I have no doubt Apple, controlling both the software and the hardware, could get better performance (or performance per watt, or performance per dollar) than Intel, which has to support 30-year-old software cruft. As a result, I think (at least some) Macs probably _will_ move to ARM. Whether it makes a clean break from the past, or retains some Windows compatibility, is a legitimate question that I'm sure Cook will give full consideration to.

    Well said and succinct!
    :D
  • Reply 130 of 183
    melgross Posts: 33,510 member

    Supposedly, but many apps are just not that parallelizable to begin with. With those, not much can be done.

    I suggest that you watch your cores when running some apps, and see what happens. If you only have two cores, you won't learn much, but if you have four or more, you'd be surprised.
  • Reply 131 of 183
    Marvin Posts: 15,322 moderator
    pdq2 wrote: »
    Kinda funny to see folks so worked up over this. The A8x already has a considerably higher transistor count than many Intel mobile chips, and at least some of the transistors in the Intel chips are to (essentially, at this date) emulate the old x86 architecture/instruction set.

    I have no doubt Apple, controlling both the software and the hardware, could get better performance (or performance per watt, or performance per dollar) than Intel, which has to support 30-year-old software cruft.

    That's another point: in the transition between PPC and x86, Apple didn't make either chip, so they had to use software binary translation in the form of Rosetta. Since they build the ARM chips themselves, they could potentially implement some sort of hardware translation layer that lets x86 instructions run on the ARM chip without the performance hit of Rosetta. This might not cover Windows compatibility, but it should cover x86 Mac software, and if the chips are faster anyway, the performance hit would just even out against Intel.
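
    (Speculative, but to make concrete what any such translation layer -- software like Rosetta, or hardware-assisted -- has to do: translate each x86 block once, cache it, and reuse it. A minimal Swift sketch; every name in it is hypothetical.)

        typealias GuestAddress = UInt64
        // A translated block of x86 code, recompiled to run natively;
        // it returns the guest address of the next block to execute.
        typealias NativeBlock = () -> GuestAddress

        var translationCache: [GuestAddress: NativeBlock] = [:]

        // Decode the x86 basic block at `pc` and emit equivalent native
        // code. The decoder/emitter is elided; this expensive step runs
        // only once per block.
        func translateBlock(at pc: GuestAddress) -> NativeBlock {
            fatalError("decoder/emitter elided in this sketch")
        }

        func run(from entry: GuestAddress) {
            var pc = entry
            while true {  // exit handling elided
                if translationCache[pc] == nil {
                    translationCache[pc] = translateBlock(at: pc)
                }
                pc = translationCache[pc]!()  // hot blocks never pay translation twice
            }
        }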
  • Reply 132 of 183
    mstone Posts: 11,510 member

    If Apple creates a two-tier system whereby cheaper Macs run ARM and higher-end Macs run x86, it will be like the Rosetta situation back in the PPC -> x86 transition -- a brilliant solution on Apple's part, but sort of a pain in the ass, even if only temporary.  I can imagine people having a Retina iMac and a MacBook Air and needing two different versions of an application, or some sort of virtual environment.

     

    That could be a lot of extra work for some developers, but if Apple added a drop-down selection in Xcode to target one architecture or the other, it might not be so bad, except for really complex apps like Adobe CC or 3D games that use their own custom libraries. For average App Store apps it shouldn't be too difficult to manage.
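
    (For what it's worth, most source would be shared between the two targets, and only low-level corners would branch on architecture. A sketch using Swift's conditional compilation; the tuned kernels are hypothetical stand-ins.)

        // Stand-ins for hand-tuned per-architecture kernels (hypothetical):
        func neonDot(_ a: [Float], _ b: [Float]) -> Float { zip(a, b).map { $0 * $1 }.reduce(0, +) }
        func avxDot(_ a: [Float], _ b: [Float]) -> Float { zip(a, b).map { $0 * $1 }.reduce(0, +) }

        // One source tree, two slices: the compiler keeps whichever branch
        // matches the architecture it is currently building for.
        func dotProduct(_ a: [Float], _ b: [Float]) -> Float {
            #if arch(arm64)
            return neonDot(a, b)
            #elseif arch(x86_64)
            return avxDot(a, b)
            #else
            return zip(a, b).map { $0 * $1 }.reduce(0, +)  // portable fallback
            #endif
        }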

  • Reply 133 of 183
    joelsalt wrote: »

    I'm not arguing with you.  But that chart means squat and does not improve your argument one iota.  In fact, it weakens it because it's such a transparent rhetorical move.
    The chart shows very clearly how Apple's sales improved after switching to Intel processors, which bolsters the argument.
    The best part about well-known facts is that you don't have to cite sources!
    So this is you, basically:

    http://thecolbertreport.cc.com/videos/63ite2/the-word---truthiness
  • Reply 135 of 183
    melgross Posts: 33,510 member
    That sounds reasonable!

    Here's a what if:

    What if Apple decides it needs, say, 32 cores for a so-called power-user machine ... because the normal power user needs 4-8 cores ... but Apple needs the other cores to provide Intel virtualization in a very specific VM customized to the ARM hardware?

    Right now, an Intel chip translates x86 CISC instructions in real time and then runs them as RISC-like code. Do you think that 24 A9X ARM cores could do the equivalent?

    It's an interesting idea. We know that a processor emulating another generally has to be about five times as powerful as the one it's emulating. That's a hard thing. It's why Macs were so slow with Virtual PC and other emulation software. So I suppose Apple could use a bunch of cores for that emulation.

    But I have a better idea that I've been flinging about, here and in other places. It's also known that just a handful of instructions accounts for about 80% of that slowdown from emulation. Since those instructions are documented and available to everyone, Apple could add them to their ARM chips. Whenever an app requires those specific instructions, rather than emulating them, the system could switch to the native versions. Bingo! Very little emulation loss. This can be done. I've spoken to some people in the industry about it, and they agree. I would hope that Apple is thinking about doing that.
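
    (A rough Swift sketch of that 80-20 dispatch, purely illustrative -- the opcode subset and the choice of natively supported instructions are made up.)

        // The x86 opcodes the emulator encounters (tiny illustrative subset):
        enum X86Op: Hashable { case mov, add, idiv, cpuid }

        // The handful of expensive instructions that would be added directly
        // to the ARM core (hypothetical choice):
        let nativelySupported: Set<X86Op> = [.idiv, .cpuid]

        func executeOnHardware(_ op: X86Op) { /* would compile to the added instruction */ }
        func interpret(_ op: X86Op) { /* ordinary decode-and-simulate software path */ }

        // Most instructions take the interpreted path; the hot few run
        // natively, eliminating most of the emulation penalty:
        func execute(_ op: X86Op) {
            if nativelySupported.contains(op) {
                executeOnHardware(op)
            } else {
                interpret(op)
            }
        }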

    Right now, despite what the article says, the A8X is about as powerful as a lower-end, low-power i3. That i3 has two cores. If Apple used four cores in the A9 and doubled the graphics performance again, they would have a chip that competes with a higher-end i3, with slightly better graphics.

    But Apple doesn't use an i3 in their machines. They start with a medium level low power i5. So we're not there yet. The Surface Pro does use that i3 in its lowest priced model for $799, sans the required keyboard. No one talks about that model, and I've never seen it reviewed either. But performance ain't so hot, I would imagine.

    If Apple could get per-core performance up about 25% in a four-core model, and increase graphics maybe another 75%, it would be about where a lower-level i5 is now, with close to Iris Pro graphics. So that would be usable for a low-end MacBook Air - but only if Apple did what I suggest with instructions. Moving to eight cores would offer no material performance advantage whatsoever outside of benchmarking software and those very few apps that could use them.

    One difficulty Apple would have is that ALL software must run on this machine. They can't say that Photoshop won't run properly, or Excel, or FCP, or anything else. It all has to run reasonably well for a machine in this price range, regardless of what chip is inside. Apple can't have two lines of OS X machines out there, one running all apps and the other not. This is a very serious thing for them to consider.

    The other thing is the question of how high they can take ARM. Their current wide-but-low-GHz designs can't be run at speeds much higher, even in a large machine with good cooling. I've seen people write that Apple could run these at 2.5 to 2.8 GHz, but that's not correct. Chips have overall heating issues that can be alleviated with cooling systems, but as chips get smaller there are spot-heating issues that conventional cooling can't eliminate. Run the clock too high, and this becomes a major problem. In addition, raising the speed requires raising the voltage, which increases wattage significantly.

    I'd like to see this happen, even if it's just for lower prices and smaller machines, but I suspect there's a lot of work yet to be done.
  • Reply 136 of 183
    melgross Posts: 33,510 member
    chadbag wrote: »
    32 or 64 smaller cores may be the same as, say, 8 of your high-powered cores in terms of CPU cycles.  We'll see, but I foresee that lots of small cores may be a future way to achieve better results than high-powering your cores and having just a few.  Context switching wastes your cache, time, etc. in general terms.  With lots of smaller cores your CPU scheduler needs to be more efficient in some ways and can be less efficient in others: it needs to keep the cores busy, but it doesn't have to deal with all the context switching.

    Anyway, we'll see.  And 32 cores is only 8 quad-core chips.

    For the usage model we're talking about, it still doesn't work. No matter how you slice all those cores, many will be doing nothing most of the time.
  • Reply 137 of 183
    melgross Posts: 33,510 member
    http://www.extremetech.com/computing/190946-stop-obsessing-over-transistor-counts-theyre-a-terrible-way-of-comparing-chips

    Saying a higher transistor count means it's a better chip is like saying that a V8 will always have more power than a 6-cylinder because it's got two extra cylinders.  Hint:  forced induction makes this completely incorrect.

    Look at the whole picture, not just how many gates someone can shove into their die.

    If you're comparing two specific implementations, then, yes, that's correct. But generally, more transistors do indicate something in terms of performance and versatility. It's like comparing brain size: generally, larger brains from one species to another do indicate intelligence advances. But it's also what that brain consists of, and brain size relative to body size. Whales can have much larger brains than humans, but much lower intelligence.

    Apple's chips also run at much lower clocks because of their design. Like everything else, it's a complex issue. But ARM is coming from behind. Just a few years ago, no one thought that ARM had even the slightest chance of rivaling x86, even on the lowest end. Nvidia's latest announced but not yet delivered Tegra, the X1 I think, looks to be an even more powerful monster than the A8X. It seems that Apple and Nvidia will be duking it out for a while.

    The other thing, of course, is that Apple's chips are SoCs, so they have much more inside than comparable Intel chips. We also don't know what approximately 35% of the space in the chip is being used for.
  • Reply 138 of 183
    melgross Posts: 33,510 member
    Marvin wrote: »
    That's another point: in the transition between PPC and x86, Apple didn't make either chip, so they had to use software binary translation in the form of Rosetta. Since they build the ARM chips themselves, they could potentially implement some sort of hardware translation layer that lets x86 instructions run on the ARM chip without the performance hit of Rosetta. This might not cover Windows compatibility, but it should cover x86 Mac software, and if the chips are faster anyway, the performance hit would just even out against Intel.

    That's somewhat similar to what I've been saying.
  • Reply 139 of 183
    nolamacguy Posts: 4,758 member
    Quote:

    Originally Posted by nhughes View Post

     



    This analyst has broken some of the biggest Apple-related scoops of the last few years. It's a big story, and it broke this morning, and that's why we used the breaking tag.


     

    you keep using that word... i do not think it means what you think it means. nothing has "broken" here. it's an analyst's research note -- which may be right, may be wrong. there's nothing breaking because there's nothing happening. it's just a rumor, not an event.

  • Reply 140 of 183
    melgross wrote: »
    That sounds reasonable!

    Here's a what if:

    What if Apple decides it needs, say, 32 cores for a so-called power-user machine ... because the normal power user needs 4-8 cores ... but Apple needs the other cores to provide Intel virtualization in a very specific VM customized to the ARM hardware?

    Right now, an Intel chip translates x86 CISC instructions in real time and then runs them as RISC-like code. Do you think that 24 A9X ARM cores could do the equivalent?

    It's an interesting idea. We know that a processor emulating another generally has to be about five times as powerful as the one it's emulating. That's a hard thing. It's why Macs were so slow with Virtual PC and other emulation software. So I suppose Apple could use a bunch of cores for that emulation.

    Virtual PC was slow -- but better than the alternative, IMO! It is interesting that MS bought VPC -- presumably to kill it :D

    But I have a better idea that I've been flinging about, here and in other places. It's also known that just a handful of instructions accounts for about 80% of that slowdown from emulation. Since those instructions are documented and available to everyone, Apple could add them to their ARM chips. Whenever an app requires those specific instructions, rather than emulating them, the system could switch to the native versions. Bingo! Very little emulation loss. This can be done. I've spoken to some people in the industry about it, and they agree. I would hope that Apple is thinking about doing that.

    Now, that is interesting ... the old 80-20 rule ... So, Apple's ARM chip would include the equivalent of the x86 CISC-->RISC instruction translations in hardware, with some system software to handle the switching, out-of-order execution, etc.?

    I suspect others could do this too, but they are several years off -- and who would write the OS/system software, and who would they sell it to?

    Advantage Apple!

    Right now, despite what the article says, the A8X is about as powerful as a lower-end, low-power i3. That i3 has two cores. If Apple used four cores in the A9 and doubled the graphics performance again, they would have a chip that competes with a higher-end i3, with slightly better graphics.

    But Apple doesn't use an i3 in their machines. They start with a medium level low power i5. So we're not there yet. The Surface Pro does use that i3 in its lowest priced model for $799, sans the required keyboard. No one talks about that model, and I've never seen it reviewed either. But performance ain't so hot, I would imagine.

    If Apple could get per-core performance up about 25% in a four-core model, and increase graphics maybe another 75%, it would be about where a lower-level i5 is now, with close to Iris Pro graphics. So that would be usable for a low-end MacBook Air - but only if Apple did what I suggest with instructions. Moving to eight cores would offer no material performance advantage whatsoever outside of benchmarking software and those very few apps that could use them.

    One difficulty Apple would have is that ALL software must run on this machine. They can't say that Photoshop won't run properly, or Excel, or FCP, or anything else. It all has to run reasonably well for a machine in this price range, regardless of what chip is inside. Apple can't have two lines of OS X machines out there, one running all apps and the other not. This is a very serious thing for them to consider.

    IDK, the software publishers are tending toward online subscriptions, so Photoshop et al. are less of a problem than they were several years ago. Office already runs on ARM ... I suspect that FCPX already runs on ARM ... the A8X accesses external DRAM, so that need not be an issue ... I'd love to see Optical Flow renders on multiple A8X GPUs

    The other thing is the question of how high they can take ARM. Their current wide-but-low-GHz designs can't be run at speeds much higher, even in a large machine with good cooling. I've seen people write that Apple could run these at 2.5 to 2.8 GHz, but that's not correct. Chips have overall heating issues that can be alleviated with cooling systems, but as chips get smaller there are spot-heating issues that conventional cooling can't eliminate. Run the clock too high, and this becomes a major problem. In addition, raising the speed requires raising the voltage, which increases wattage significantly.

    I did some surfing and posted in other threads about Silicon on Insulator (SOI) and SoS :D ... no, not that SOS -- rather, Silicon on Sapphire. Sapphire is a better insulator and better heat dissipator than silicon. If you deposit the silicon on sapphire, the traces can be closer together, have less leakage, and dissipate more heat -- basically, you can run them at higher speeds with fewer problems ... Maybe this could help! Also, Micron, for one, is experimenting with next-gen stacked DRAM using silicon -- fast and efficient.


    I'd like to see this happen, even if it's just for lower prices and smaller machines, but I suspect there's a lot of work yet to be done.

    Agree!