My prediction of computers going back to more dedicated/custom chips seems to be holding true...
however: how does adding a chip reduce battery drain?
Simple. Instead of taxing the main CPU with, for example, graphics-intensive tasks that require a lot of clock cycles, those tasks are offloaded to a GPU designed and optimized for image and video manipulation. That work can be done in a significantly different, much more parallel manner because an image file is a discrete and known entity: processing can be done on multiple portions of an image at the same time, without affecting work being done on other portions. In this manner a GPU can be more energy efficient at the same task that a CPU would otherwise need to crunch through sequentially. The same likely holds true for an AI chip, which would likely take the form of a neural network accelerator, optimized for quickly crunching a certain type of task and therefore much more energy efficient at performing it.
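The tile-level independence described above can be sketched in a few lines; `invert_tile` is a hypothetical stand-in for any per-pixel operation, and a thread pool stands in (structurally, not in performance) for a GPU's thousands of cores:

```python
from concurrent.futures import ThreadPoolExecutor

def invert_tile(tile):
    """Per-tile work: invert 8-bit grayscale values."""
    return [[255 - px for px in row] for row in tile]

def invert_parallel(tiles):
    # Tiles share no state, so they can be handed to separate workers
    # with no locking -- the independence the post describes. A real
    # GPU runs thousands of such workers in hardware; the pool here
    # just makes the structure of the work visible.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(invert_tile, tiles))

tiles = [[[0, 128], [255, 64]], [[10, 20], [30, 40]]]
print(invert_parallel(tiles))  # -> [[[255, 127], [0, 191]], [[245, 235], [225, 215]]]
```

Because no tile depends on any other, the result is identical however the tiles are scheduled, which is precisely what makes the workload cheap to parallelize.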
A GPU handles in its sleep workloads that CPUs were never designed for. Likewise, the custom DSP that an AI chip would likely be based on has a very low-power design specifically meant to handle the highly vectorized operations that are big energy wasters on a CPU. A DSP is better optimized to handle FFTs and Lagrange multipliers, at a fraction of the power and with far fewer branches than a CPU needs.
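The vector math a DSP accelerates can be sketched in pure Python (a naive O(n²) DFT; real DSPs run an O(n log n) FFT in dedicated hardware): the kernel is nothing but fixed-length multiply-accumulate loops with no data-dependent branching, which is exactly the shape of work that burns power on a general-purpose CPU but pipelines cheaply on a DSP.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform -- pure multiply-accumulate."""
    n = len(signal)
    out = []
    for k in range(n):                      # one output bin per iteration
        acc = 0j
        for t, x in enumerate(signal):      # fixed-length MAC loop, no branches
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        out.append(acc)
    return out

# Sanity check: a unit impulse has a flat spectrum (every bin has magnitude 1).
spectrum = dft([1, 0, 0, 0])
print([round(abs(b), 6) for b in spectrum])   # -> [1.0, 1.0, 1.0, 1.0]
```

A DSP's hardware MAC units and zero-overhead loops execute this pattern with almost none of the fetch/decode/branch machinery a CPU drags along, which is where the power savings come from.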
It wouldn't surprise me in the slightest if Apple has been working on such a chip for years. One of the advantages of doing some of your own chip design in-house is that you do these kinds of projects. Whether or not it ever sees the light of day is an entirely other matter...
Awesome. I much prefer that AI processing is done locally than in the cloud. Apple has the resources, expertise, motivation, and ecosystem to leapfrog everyone in this area.
I hope this means that this chip shows up in the Siri speaker. We've heard about the speakers on it, but not whether it has far-field microphones. Hopefully it has all this, so that Siri finally understands me better than Alexa does. Hopefully this leak means that it's going into production. It may be that the speaker won't be released until months after WWDC.
Following the invention of the integrated circuit by Jack Kilby, while he was working at TI, and by other investigators, the question of functional partitioning has evolved from 'yeah, that's kinda interesting' to a critical matter today. What do you stuff into a single die? Or, more accurately, how do you dispose the suite of desired/necessary functions across X semiconductors?
Process technology: certain device functionality is best fabricated using a particular semiconductor process technology. For example, digital functions typically are fabricated using digital processes, as opposed to analog processes that are used for many RF and other analog functions.
In addition, within digital, logic functions are typically best made using a logic process (think Intel or Samsung or TSMC logic processes) whereas memory (think DRAM) is best fabbed using specialized memory processes (think Micron et al. memory chips). Including memory in an otherwise predominantly logic chip, like a CPU, manufactured using a logic process, illustrates this point: Level X cache is measured in MB, not GB -- it's a little 'unnatural'.
Relevant to this thread, for logic functions which might be fabbed using the same or similar logic processes, functional partitioning -- including some functions in one chip and some in another (not integrating everything in a single die) may be indicated if the architectures are dissimilar. Setting aside the fact that more or less OK but not brilliant GPU functionality is built into mainline CPUs/SOCs (Intel et al.), the killer GPU devices are discrete and have very different architectures (compared to CPUs) that include up to several thousand cores driven by the need for graphics parallel processing.
It's also true that partitioning is driven by die cost, which is a function of die size (square mm) and wafer yield (% good die per wafer), among other factors, which are themselves driven by complexity (including the number of metal and other layers) and process node, where shrinking line widths result in exponentially increasing costs, including for lithography (think about the likely cost of production Extreme Ultra Violet [EUV] lithography tools). Thus, even though more compatible functionality could be crammed onto a single chip, it may be less economically efficient. I note that higher end commercially available GPUs exceed 10 B transistors today.
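The die-size/yield economics above can be put in a toy cost model; the wafer cost, defect density, and Poisson yield formula below are illustrative assumptions, not industry data:

```python
import math

WAFER_COST = 9000.0               # assumed cost of one processed 300 mm wafer, USD
WAFER_AREA = math.pi * 150.0**2   # area of a 300 mm wafer, mm^2 (edge loss ignored)
DEFECT_DENSITY = 0.001            # assumed random defects per mm^2

def cost_per_good_die(die_area_mm2):
    # Cost per *good* die = wafer cost / (gross die per wafer * yield),
    # with yield falling exponentially in die area (simple Poisson model).
    gross_die = WAFER_AREA / die_area_mm2
    yield_frac = math.exp(-DEFECT_DENSITY * die_area_mm2)
    return WAFER_COST / (gross_die * yield_frac)

for area in (50, 100, 200, 400):
    print(area, "mm^2:", round(cost_per_good_die(area), 2), "USD")
```

The punchline is that cost grows faster than area: under this model, two 100 mm² dies cost less than one 200 mm² die, which is one of the economic pressures toward partitioning rather than cramming everything onto a single chip.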
Other factors of merit:
(i) What functionality may be reasonably integrated into an SOC with interconnected stacked die?
(ii) What clock frequency is really required to perform particular functions? Lower clock speed --> lower power dissipation --> longer battery life; good for mobile applications -- as long as overall system performance is not degraded. If I remember correctly, one of the motivations for Apple's Motion processor is precisely this.
(iii) On the other hand, pushing a signal off chip, through the die's package and onto, across and off a PCB into another discrete device takes power. This negative ramification (more discrete semiconductors usually means more power dissipation) promotes greater die complexity (to maintain or improve power efficiency). This power factor, in conjunction with optimal device complexity (stuffing as much compatible functionality as possible in the same chip subject to maintaining the best overall balance of system performance, bill of materials cost and, for mobile applications, power consumption -- at any particular point in time [optimal device complexity generally increases over time as the semi industry migrates to finer process geometries]) helps drive what is and is not slammed into an IC.
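Point (iii) can be put in rough numbers using the standard CMOS dynamic-power relation P ≈ αCV²f; the capacitance and voltage figures below are assumed, order-of-magnitude values, not measurements:

```python
def dynamic_power_mw(activity, cap_farads, volts, hertz):
    # Dynamic CMOS switching power: P = a * C * V^2 * f, returned in mW.
    return activity * cap_farads * volts**2 * hertz * 1e3

# Assumed, order-of-magnitude values:
ON_DIE_CAP = 10e-15    # ~10 fF for a short on-die wire
OFF_CHIP_CAP = 10e-12  # ~10 pF for pad + package + PCB trace
ON_DIE_V, OFF_CHIP_V = 0.9, 1.8   # off-chip I/O also swings higher voltage

on_die = dynamic_power_mw(0.5, ON_DIE_CAP, ON_DIE_V, 500e6)
off_chip = dynamic_power_mw(0.5, OFF_CHIP_CAP, OFF_CHIP_V, 500e6)
print(round(off_chip / on_die))   # -> 4000: same bit rate, ~4000x the power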
Functionality partitioning is more science than art; think linear programming. So Apple might say, "How do we optimize the user experience on, say, an iPhone, subject to a suite of constraints that include the iPhone's actual physical envelope, max TDP budget, min battery life, max BOM cost, min delivered AI/AR performance, etc.?" We as users may have issues with thinness, etc., but Apple's semiconductor and system engineers know how to perform that optimization in a way consistent with the Company's requirements. And the semiconductor component of this process -- what functionality is included and how those functions are disposed across X chips -- is a big part of it.
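That constrained optimization can be sketched as a toy search over hypothetical design points; every block, performance number, and budget below is invented for illustration (a real partitioning study would use far richer models, and an LP/ILP solver rather than brute force):

```python
from itertools import product

# Hypothetical candidate blocks: (name, perf, power_mw, bom_usd)
ai_blocks = [("none", 0, 0, 0.0), ("on_soc", 5, 300, 1.5), ("discrete", 9, 450, 4.0)]
gpu_blocks = [("small_gpu", 3, 500, 2.0), ("big_gpu", 6, 900, 5.0)]

POWER_BUDGET_MW = 1200   # invented power budget
BOM_BUDGET_USD = 6.0     # invented bill-of-materials budget

# Maximize delivered perf subject to power and BOM constraints.
best = None
for ai, gpu in product(ai_blocks, gpu_blocks):
    power = ai[2] + gpu[2]
    cost = ai[3] + gpu[3]
    if power <= POWER_BUDGET_MW and cost <= BOM_BUDGET_USD:
        perf = ai[1] + gpu[1]
        if best is None or perf > best[0]:
            best = (perf, ai[0], gpu[0])

print(best)   # -> (12, 'discrete', 'small_gpu')
```

Note the character of the answer: the budgets, not any single component's merit, decide the partitioning, which is the point the post makes about system-level optimization.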
So, whether AI/AR functionality is included in a discrete device or integrated locally on another semiconductor in the same iPhone/Pad or Mac is much less interesting than whether that functionality is efficiently made available *system wide* on your Apple hardware product.
On the surface, this seems to be a direct response to Google's TensorFlow announcements. In reality, Apple has been working on such a chip for a few years now and has been forced to reveal something on stage at WWDC so they don't appear to be falling behind.
But the reality is that AI is still like a game of Monopoly and each player only holds one property so far. There are so many moves to go that comparing Siri to Echo or Google or Cortana right now is pointless - they all suck.
It actually sounds more like Apple is trying their hand at something like Google's Tango (see the Asus ZenFone AR https://www.asus.com/us/Phone/ZenFone-AR-ZS571KL/). The recent announcements about their 2nd-gen TPU and TensorFlow are something else entirely and FAR more powerful, certainly not smartphone-bound.
That Apple would be looking at this makes perfect sense of course and if anyone can jump-start augmented reality apps it would be Apple.
No, it's more like TensorFlow. Google has TensorFlow Lite which is scaled down and designed for mobile. Google even worked with Qualcomm to have TensorFlow Lite work with the Hexagon DSP that's in the 835 processor.
So if Apple makes a mobile "neural engine" it will very likely be similar to TensorFlow Lite. Except Apple's "neural engine" will most certainly be superior to Qualcomm's, just like their processor cores already are.
With Google already shipping a product for mobile use that targets the same AI (not AR) use cases that Apple is rumored to be interested in, I still think they're looking at something closer to Tango than TensorFlow, which seems more appropriate for larger computing devices and servers. Why do you think Tango isn't in the same market Apple will be looking at?
Why are you bringing up TensorFlow when I specifically mentioned TensorFlow Lite (which is the scaled-down mobile version)? This announcement is about Apple supposedly developing a mobile "AI" processor, not a large power-hungry processor used in data centres. So it's just like TensorFlow Lite, except Apple is making a dedicated processor instead of an SDK for a DSP (if the rumor is true).
I don't think Apple will limit their AI ambitions simply to AR when there's so much more it can do on a mobile device.
Thanks. I do understand the point you made.
Google's Tango on smartphones uses a dedicated processor now, perhaps similar to what you think Apple will develop. The AppleInsider article also alludes to many of the same uses that Google's Tango is already trying to address. But yeah, maybe they'll mimic TensorFlow Lite's software/programming too in some ways. Point taken. Who knows where the rumors will lead.
I don't see Apple announcing something just because some tech forum members or tech writers claim that Apple is falling behind. I can think of innumerable products and technologies where they were years to decades behind other companies in the same category and said nothing of their R&D efforts, rather than coming out with an unfinished "me too" product because of a perceived corporate insecurity.
I wonder if this doesn't imply that the future of computing will be machines that are built around a CPU, GPU, AIPU combo.
We can reasonably infer that future architects will increasingly use multiple (>=3) processing units, and that dedicated A.I. will eventually infiltrate each of them. After that point, expect emergent behaviour from these systems — and a reactionary public paranoia about robots taking over.
Drop the Artificial part (it's either intelligence or it isn't), and let's call it the iPU. Hmm, on second thought...
artificial |ˌärdəˈfiSHəl|
adjective - made or produced by human beings rather than occurring naturally
Excellent summary. I'll add that breaking a function out into a separate component also allows easier use of that component in multiple devices, say an iPhone using an A11 CPU and an iPad using an A11x, without having to map that component into the design of each processor. This allows the separated component to be produced in higher volumes, and also to have its own development roadmap somewhat separate from the CPU's.
Tech journalists from around the world collectively have heart attacks thinking about all the childish stories they can write based on this acronym. Samsung scrambles to launch a new campaign combining the personalization of their devices–specifically the colors in which they’re available–and their status in the market. The “[Your Name Here] Blue Up” campaign was regarded by historians as the first step toward their final bankruptcy.
They already started moving the neural net onto the iPhone last year, but I haven't seen any evidence that the iPhone will process data locally without going through Siri via iCloud. I've tried even simple tasks, like disabling internet data and then trying to make a call and start a Music playlist, which are local capabilities we had prior to Siri.
Apple benefits by selling you a more expensive phone and reducing the demand on the data centers. Perhaps the split will be to recognize speech locally, then do the natural language processing on their servers. That makes a big reduction in the amount of data to be transmitted, with consequent improvement to battery life.
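The data-reduction claim is easy to sanity-check with rough arithmetic; the audio format and speech-rate figures below are assumptions (16 kHz / 16-bit mono capture, roughly 150 spoken words per minute, about 6 bytes per word of text):

```python
def audio_bytes(seconds, sample_rate=16_000, bytes_per_sample=2):
    # Raw mono audio you'd otherwise stream to the server.
    return seconds * sample_rate * bytes_per_sample

def text_bytes(seconds, words_per_min=150, bytes_per_word=6):
    # Transcribed text sent instead, if recognition happens on-device.
    return int(seconds * words_per_min / 60 * bytes_per_word)

ten_sec_audio = audio_bytes(10)   # 320,000 bytes of raw audio
ten_sec_text = text_bytes(10)     # 150 bytes of transcribed text
print(ten_sec_audio // ten_sec_text)   # -> 2133: roughly 2000x less data sent
```

Even with compressed audio the gap stays large, which is why a local-recognition/server-NLP split would plausibly help both radio power and server load.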
Is it a legitimate hardware development strategy to build and test a processor separately, then integrate it into an SOC?
Siri was recently improved without any fanfare, at least for Australians anyway! Previously it would only recognise my speech if I adopted an American accent, which I'm not very good at and it's hard to keep up, but a few months ago I did an Apple TV search and Siri recognised me word-perfect. Last week I bought "Blade Runner - Final Cut" on iTunes and my girl and I settled down to watch it last night, but it was dubbed in French. I hit pause and thought for a while, with Leon's ugly, agitated mug filling the screen. On the older Apple TV remotes you could hold select and options came up, but that doesn't work on the new remote. Idea! I hit play, held the Siri button and said, in my normal voice, "Change language to English" and a millisecond later Leon was speaking in English - almost magic! But it made me wonder: when Siri sleeps, does she dream of electric sheep?
So, curious: all processors have schedulers to work out the timing and flow of information around the chip, right? Which process type makes the best scheduler?

Also, if Apple is going toward a small-core/big-core design, would it make any sense to have a common chip for the small core types, then package in the large cores, each made on a better-suited fabrication process?
No, Androids dream of electric sheep, Siri is iOS.
From what I can see, this is the way forward for security too. Build a brain chip: put all the capability in a discrete unit and allow the OS and apps to utilize it, ideally with minimal or no need for the internet. This is the leapfrog approach.
Everything inside a computer is artificial. Should we be referring to artificial memory as well? Artificial storage? Artificial logic?
"Brain-like", as it markets well, while failing to mention the human brain is likely simultaneously both quantum and organic, something you can't build in a microprocessor lab from silicon. A discussion for another time.
One of the problems I have with the term AI is that it really hasn't become an intelligence at this point. Current AI techniques are just another way to process data that mimic some operations in the brain. This really has nothing to do with intelligence in the same way that a normal computer program solving a problem for you does not represent intelligence.
As for the quantum world that is a very real concern in modern semiconductor processes. It wouldn't be impossible to leverage quantum realities to produce an AI chip that performs with unique capabilities. I'm still not sure that would mean "intelligence" in the sense of a human being.
Yeah, I don't think it would. I used to be a concrete-materialist kind of guy, but with an interest in physics I've been looking into it for a number of years now and have become basically convinced there's more to it than materialism. In turn I've gravitated, quite naturally I would submit, to a sense that complexity by itself won't bring awareness to a system. So, no, I don't believe it would be intelligent in the way we think of intelligence. It may be a system that learns for itself from how it was set up. It would perform complex pattern matching, super-quickly analysing truly huge amounts of data, comparing and contrasting to give off an illusion of awareness. It will solve problems and provide solutions, but won't be alive and kicking, essentially.
Process technology: certain device functionality is best fabricated using a particular semiconductor process technology. For example, digital functions typically are fabricated using digital processes, as opposed to analog processes that are used for many RF and other analog functions.
In addition, within digital, logic functions are typically best made using a logic process (think Intel or Samsung or TSMC logic processes) whereas memory (think DRAM) is best fabbed using specialized memory processes (think Micron et al. memory chips). Including memory in an otherwise predominantly logic chip, like a CPU, manufactured using a logic process, illustrates this point: Level X cache is measured in MB, not GB -- it's a little 'unnatural'.
Relevant to this thread, for logic functions which might be fabbed using the same or similar logic processes, functional partitioning -- including some functions in one chip and some in another (not integrating everything in a single die) may be indicated if the architectures are dissimilar. Setting aside the fact that more or less OK but not brilliant GPU functionality is built into mainline CPUs/SOCs (Intel et al.), the killer GPU devices are discrete and have very different architectures (compared to CPUs) that include up to several thousand cores driven by the need for graphics parallel processing.
It's also true that partitioning is driven by die cost, which is a function of die size (square mm) and wafer yield (% good die per wafer), among other factors, which are themselves driven by complexity (including the number of metal and other layers) and process node, where shrinking line widths result in exponentially increasing costs, including for lithography (think about the likely cost of production Extreme Ultra Violet [EUV] lithography tools). Thus, even though more compatible functionality could be crammed onto a single chip, it may be less economically efficient. I note that higher end commercially available GPUs exceed 10 B transistors today.
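As a rough illustration of that die-cost tradeoff, here is a minimal sketch using a common geometric dies-per-wafer approximation and the simple Poisson yield model. All of the numbers (wafer cost, defect density, die areas) are invented for illustration, not real foundry data:

```python
# First-order die-cost model: cost per good die rises superlinearly
# with die area because yield falls as area grows. Illustrative only.
import math

def dies_per_wafer(wafer_diam_mm: float, die_area_mm2: float) -> int:
    """Approximate gross dies on a circular wafer (edge-loss corrected)."""
    r = wafer_diam_mm / 2
    return int(math.pi * r**2 / die_area_mm2
               - math.pi * wafer_diam_mm / math.sqrt(2 * die_area_mm2))

def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of good dies under a simple Poisson defect model."""
    return math.exp(-die_area_mm2 * defects_per_mm2)

def cost_per_good_die(wafer_cost: float, wafer_diam_mm: float,
                      die_area_mm2: float, defects_per_mm2: float) -> float:
    gross = dies_per_wafer(wafer_diam_mm, die_area_mm2)
    return wafer_cost / (gross * poisson_yield(die_area_mm2, defects_per_mm2))

# Doubling die area more than doubles per-die cost, since yield drops too.
small = cost_per_good_die(10_000, 300, 100, 0.002)
large = cost_per_good_die(10_000, 300, 200, 0.002)
print(f"100 mm^2 die: ${small:.2f}, 200 mm^2 die: ${large:.2f}")
```

This is one reason splitting functionality across two smaller dies can beat one giant die on cost, even before process-node differences are considered.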
Other factors of merit:
(i) What functionality may be reasonably integrated into an SOC with interconnected stacked die?
(ii) What clock frequency is really required to perform particular functions? Lower clock speed --> lower power dissipation --> longer battery life; good for mobile applications -- as long as overall system performance is not degraded. If I remember correctly, one of the motivations for Apple's Motion processor is precisely this.
(iii) On the other hand, pushing a signal off chip, through the die's package, and onto, across, and off a PCB into another discrete device takes power. This negative ramification (more discrete semiconductors usually means more power dissipation) promotes greater die complexity, to maintain or improve power efficiency. This power factor, together with optimal device complexity -- stuffing as much compatible functionality as possible into the same chip while maintaining the best overall balance of system performance, bill-of-materials cost and, for mobile applications, power consumption at any given point in time (optimal device complexity generally increases over time as the semiconductor industry migrates to finer process geometries) -- helps drive what is and is not slammed into an IC.
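Point (ii) follows from the standard CMOS dynamic-power relation P ≈ α·C·V²·f, and since supply voltage can usually be lowered along with frequency, power often drops faster than linearly. A minimal sketch with made-up numbers (not measurements of any real part):

```python
# CMOS switching power: P_dyn ≈ alpha * C * V^2 * f.
# All values below are illustrative, not real silicon figures.
def dynamic_power(alpha: float, c_farads: float, v_volts: float, f_hz: float) -> float:
    """Dynamic (switching) power in watts for activity factor alpha."""
    return alpha * c_farads * v_volts**2 * f_hz

fast = dynamic_power(0.2, 1e-9, 1.0, 2.0e9)  # 2 GHz at 1.0 V
slow = dynamic_power(0.2, 1e-9, 0.8, 1.0e9)  # 1 GHz, with voltage scaled down

# Halving f while also dropping V cuts power by ~3x, not just 2x,
# because power scales with the square of voltage.
print(fast, slow)
```

This superlinear saving is why a slow, wide, specialized accelerator can do the same work for far less energy than a fast general-purpose core.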
Functionality partitioning is more science than art; think linear programming. So Apple might ask, "How do we optimize the user experience on, say, an iPhone, subject to a suite of constraints that include the iPhone's actual physical envelope, max TDP budget, min battery life, max BOM cost, min delivered AI/AR performance, etc.?" We as users may have issues with thinness, etc., but Apple's semiconductor and system engineers know how to perform that optimization in a way consistent with the Company's requirements. And the semiconductor component of this process -- which functionality is included and how those functions are disposed across X chips -- is a big part of it.
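That optimization framing can be sketched as a toy constrained search. Every configuration, score, and budget below is invented purely to illustrate the shape of the problem, not to describe any real Apple design:

```python
# Toy partitioning optimization: maximize a user-experience score
# subject to power, cost, and area budgets. All numbers are fictional.
configs = [
    # (name, ux_score, tdp_watts, bom_cost_usd, area_mm2)
    ("all-in-one SoC",        90, 4.5, 38, 120),
    ("SoC + discrete AI die", 97, 5.5, 44, 140),
    ("SoC + AI in package",   95, 5.0, 41, 130),
]

MAX_TDP, MAX_COST, MAX_AREA = 5.2, 45, 135

feasible = [c for c in configs
            if c[2] <= MAX_TDP and c[3] <= MAX_COST and c[4] <= MAX_AREA]
best = max(feasible, key=lambda c: c[1])
print(best[0])  # "SoC + AI in package" under these made-up budgets
```

The highest-scoring option loses here because it busts the power budget, which is exactly the kind of tradeoff the constraint-driven framing captures.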
So, whether AI/AR functionality is included in a discrete device or integrated locally on another semiconductor in the same iPhone/Pad or Mac is much less interesting than whether that functionality is efficiently made available *system wide* on your Apple hardware product.
Why are you bringing up TensorFlow when I specifically mentioned TensorFlow Lite (which is the scaled-down mobile version)? This announcement is about Apple supposedly developing a mobile "AI" processor, not a large power-hungry processor used in data centres. So it's just like TensorFlow Lite, except Apple is making a dedicated processor instead of an SDK for a DSP (if the rumor is true).
I don't think Apple will limit their AI ambitions simply to AR when there's so much more it can do on a mobile device.
Google's Tango on smartphones uses a dedicated processor now, perhaps similar to what you think Apple will develop. The AppleInsider article also alludes to much the same uses that Google's Tango is already trying to address. But yeah, maybe they'll mimic TensorFlow Lite's software/programming in some ways too. Point taken. Who knows what the rumors will lead to.
- made or produced by human beings rather than occurring naturally
Tech journalists from around the world collectively have heart attacks thinking about all the childish stories they can write based on this acronym. Samsung scrambles to launch a new campaign combining the personalization of their devices–specifically the colors in which they’re available–and their status in the market. The “[Your Name Here] Blue Up” campaign was regarded by historians as the first step toward their final bankruptcy.
Is it a legitimate hardware development strategy to build and test a processor separately, then integrate it into an SOC?
Also, if Apple is going toward a small-core/big-core design, would it make any sense to have a common chip for the small-core types and then package in the large cores, each made on a better-suited fabrication process?
Hopefully, it won't become self aware too soon!