Originally posted by Tomb of the Unknown
I'm not saying it can't be done. I'm saying it doesn't make sense to do it. The 440 was expressly designed to be extensible. A good SoC solution needs to be extensible because in the embedded market, one size rarely fits all, and an extensible design will have longer legs as far as product family lifecycle goes. (You don't have to retool for every new fad in communications technology.)
The very fact that the 440 is extensible is what makes the concept plausible. As to whether or not it makes sense, that would depend on the goals of the developers. If you're going the SoC route, the 440 would be either a good base for prototyping or the real hardware. Again, it is a matter of what your goals are.
So yes, you can add VMX and 440 FPU2 units to the 440 quite handily.
Well, he said two MCMs with two 440s each. Which is nonsensical at the outset, since an MCM describes the Power4 and Power5 packaging and IBM does not use the terminology elsewhere, to my knowledge. But leaving that aside, there is the issue that what an MCM provides is interchip communications buses. Its purpose is to allow the cores to communicate, so separating them into pairs with only two per MCM is counterproductive if you know you'll need four cores. So I assumed that all four would be on one MCM.
The technology that produces an MCM has been around for a long time. It is not a POWER technology. In the past it hasn't been cheap either, often being used only for military and space-based projects. I understood it to be pairs of SoC processors mounted on an MCM. It really doesn't matter, though, because the minute you move a signal off a chip you change things.
Well, if you want to create imaginary architectures to make Nr9's scenario more plausible, go right ahead.
Not so much my imagination as information picked off the web indicating that Apple and IBM are working together on a new laptop chip. That info could very well mean a mainstream chip, or it could mean they are pushing laptop technology in a new direction.
There would also have to be support in the core for thread locking, etc. (Means more silicon, more heat.)
What thread locking?
This is actually a self-contradictory two-part objection. Please rest assured that (as Nr9 points out) a distributed architecture would require an entirely new programmatic model despite Mac OS X's Unix foundations. It would mean rewriting the OS from the ground up. You would not be able to use the Mach kernel, for instance.
Nope, I don't believe that Apple would produce a machine that would require a new operating system. They may add to the operating system, and some functionality may be implemented differently on the system side, but user apps should not be impacted.
Look at it this way: PowerMacs have been used for cluster computing for some time. Just because support is added to them to enable cluster computing does not mean that ordinary Mac applications do not run on them. In fact, you can download at least a couple of different message passing libraries for UNIX/Linux off the internet.
It's not self-contradictory at all; it just reflects what is available now in the way of message passing technology.
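To make the point concrete, here is a toy sketch of the message passing style those libraries provide, using only Python's standard multiprocessing module rather than a real MPI implementation (the function names are mine, purely for illustration):

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # Each "node" receives its chunk of work over the link,
    # computes locally, and sends the result back as a message.
    chunk = conn.recv()
    conn.send(sum(chunk))  # stand-in for real per-node computation
    conn.close()

def scatter_and_gather(data):
    # The coordinating node hands out work and collects results,
    # MPI-style; the OS and user apps underneath are unchanged.
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send(data)
    result = parent_conn.recv()
    p.join()
    return result

if __name__ == "__main__":
    print(scatter_and_gather(list(range(100))))  # prints 4950
```

The point is that this layers on top of an ordinary OS: nothing about the kernel or existing applications has to change for a message passing library to exist.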
As to the second part of your objection, yes an SMP implementation is possible, just not with the 440 core.
I'm not willing to go so far as to say it is impossible. To an extent, the cache system can be modified, along with the bus interface.
Whether he said it or not, it becomes a fundamental requirement, as the only alternative is to run it on a slow core, effectively undoing any expected performance benefit.
You are trying to say that something that is currently impossible is required for this to be successful. This is very convenient but is really not a very good argument. Yes, some applications may not take advantage of a threaded environment to the extent others do, but there is no reason to focus on them. Even then, some single-threaded applications will see very good performance due to the rest of the system being threaded.
Well, in some applications, the advantages of distributed architectures are overwhelming. But that is not to say that this is true in all cases. Yes, it may be true that there are limits to SMP systems, but can you tell me what those limits might be? IBM is heavily invested in the Power5 architecture, which is carrying SMP down to the thread level (SMT).
Kinda depends on the application and hardware implementation, doesn't it? What Nr9 has described sounds almost like a cluster of SMP machines; this may or may not be the case. What I do know is that at some point it does pay to look at arrangements other than SMP. It would be very difficult to nail down any one size of SMP machine as being the best choice; there are too many variables.
As to SMT and SMP, the Power5 does not carry the abstraction down to the thread level. SMT allows the processor to work on two threads, from the hardware point of view, at the same time. These hardware threads could be application threads, or they could be completely different processes. Generally, what is run in the two contexts is up to the scheduler. It is not quite the same thing as multithreaded programming.
Sigh. You really can't just jumble up terms like MCM, SoC, SMP and "cluster". Each has a specific meaning in the context of this discussion, so you can't posit the integration of seemingly antipodal or contrasting technologies without a great deal more explanation as to how it would be achieved.
That is complete garbage. You can have a cluster of SMP machines just like VT built. If you have the money, you can also build a cluster on an MCM; if things are small enough, you can even build a cluster on a piece of silicon, that being one variant of an SoC. There is no contrasting technology here; it is a matter of integration and the goals you have set. Sure, you won't get the entire VT cluster on one MCM, but you could very well put four machines on one if you really wanted, and two would not be a problem. Each step down in size also allows you to do things differently, so instead of networking you have interchip connects. What becomes a problem is our old friend heat; thus nothing suggested at the start of this thread is even possible without low-power parts.
You miss my point. Right now, the only kinds of applications (software) that use the kind of cellular programming model described are high end, highly parallel, high performance applications such as climatological modeling software run by government agencies and academic institutions.
OK, so what you're saying is that all of those massively clustered SMP systems that corporations have installed in the last few years have no real application. You should pass this information on to the shareholders. Even Formula 1 race teams have their own clusters.
You'd think so, wouldn't you? But then, you'd be wrong. It would be extremely difficult for any number of reasons, not the least of which is that "ain't no one been there yet".
It kinda amazes me that you continue to believe that software that uses message passing doesn't exist. Further, I find it funny that you believe this can't exist on top of OS X and maintain backward compatibility. History is not on your side.
No, it's more like taking the APU of the PPC 970 and replacing it with a 440 core. Then repeat with the FPUs. And so on.
I stand by my original statement: modifying the 440 core a bit does not imply that you now have a different family, just as minor modifications to the 970 do not imply a completely different family. That doesn't mean that IBM and Apple can't come up with something that is totally different and thus send the boat down a different branch in the river. It just means that the 440, being a core, can be modified and still be considered a 440.
I'm trying to figure out if you have a clue or not. Apparently you don't, as you have continued to confuse what multithreading is. The point is that multithreaded applications are tied neither to the processor arrangement on the motherboard (SMP) nor to the type of processor. A multithreaded application can be run on an 8-bit processor or on a single-processor 970. Multithreading, as a programming concept, has nothing to do with the processor. What modern operating systems, along with SMP and SMT, do is provide a way to enhance what a multithreaded program can offer. The gotcha is that multithreaded applications are not dependent on features like SMP and SMT being there.
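A tiny sketch of that independence: the threaded program below is written the same way, and produces the same result, whether the scheduler runs its threads on one core or interleaves them across many. Nothing in the source refers to the processor at all.

```python
import threading

counter = 0
lock = threading.Lock()

def bump(n):
    # Each thread increments a shared counter under a lock. Correctness
    # depends on the lock, not on how many processors the OS provides.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=bump, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000, on a uniprocessor or an SMP box alike
```

SMP and SMT only change how much true parallelism the scheduler can extract from those four threads, not whether the program is valid.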
As described, the architecture of the laptop in question is as foreign to Mac OS X applications as the P4 is. (Actually, the P4 is a kissing cousin compared to the implementation described.) So, if Apple were to ship this next week, there would be no software for it.
Again, garbage! You would just run your old software but would not get full benefit from the system, much in the same way as someone running Excel on VT's cluster gets no benefit from all of the other nodes available.
That, of course, is the thrust of my argument.
Yep, these are all rumors, but you have not provided one bit of compelling evidence that they are not at least possible.
Nope, sorry. Not buying it.
I believe you're not buying it because you are not aware of the possibilities. You really need to take a serious look at what is already available, and then add a bit of imagination.
No, at least some of us have been discussing the Cell architecture under development by the STI group. Whether it is implemented starting from PPC or x86 designs won't matter that much. You could begin with Transmeta's architecture, but in the end you will have a bunch of instructions that make no sense to any other architecture. The problem space is too divergent. Hence, a new instruction set.
There is no new instruction set!!! This thread has been about a PPC implementation that could be new to laptops. That others have added the distractions of other systems does not mean those systems are applicable to the base of this discussion. While I can see adding hardware support for certain sorts of functions in a machine of the type discussed, you will still have the same base instruction set, just as the 440 is a PPC with an extended instruction set that adds DSP support, for example.
Even if we did end up extending the PPC instruction set, there is very little likelihood that the user would ever see these new capabilities; they would be obscured by operating system and library abstractions. There is no doubt in my mind that the PPC instruction set will continue to evolve, as it has in the past, and this evolution will not break user applications.
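That is how such extensions already surface in practice: user code calls an abstraction, and whether the runtime or library underneath was built to use vector instructions, new message passing hardware, or plain scalar loops is invisible at this level. A hedged sketch of the idea (the `dot` function here is mine, not any particular library's API):

```python
import array

def dot(a, b):
    # The user-visible contract is just "dot product". An optimized
    # build of the library behind an interface like this might use
    # AltiVec or other ISA extensions; a plain build would not.
    # The calling code never changes either way.
    return sum(x * y for x, y in zip(a, b))

u = array.array("d", [1.0, 2.0, 3.0])
v = array.array("d", [4.0, 5.0, 6.0])
print(dot(u, v))  # 32.0
```

The hardware underneath can evolve freely as long as the abstraction's contract holds, which is exactly why ISA evolution need not break user applications.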
Now, you might be able to convince me it could be done as an extension to an existing ISA, but you'd have to work pretty hard at it.
Just look around at all of the clusters operating in the world using standard off-the-shelf processors. I don't have to do the convincing; prior art exists. Even then, if they do extend the instruction set, it doesn't mean anything to the user at all.
Did all of the user apps suddenly fail to work when AltiVec was introduced? That one addition added more capability and instructions to the PPC than we are ever likely to see from hardware additions to support message passing. On top of that is the reality that all of the above could be done without adding any new instructions at all to the PPC base. This has been done again and again to the PPC programming model without breaking user apps, with DSP and vector instructions.