I am aware of the difference. Try not to be so literal in your reading of my words.
I was suggesting that Apple might be working on something unexpected for the next or replacement Mac Pro.
Sorry, I thought I wrote more of a response than that. I was going to mention the N-core scaling thing. Many applications have either a hard limit or diminishing returns with core scaling, and ARM setups like this tend to leverage a great many cores, so they require tasks that run well in a distributed fashion without IO traffic jams. I don't know how they'd fare against Intel's QPI for tasks meant to run in perceptually real time. Was the Dell designed to be a low-cost, power-efficient server or something of that sort?
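To put a rough number on the diminishing-returns point: Amdahl's law is the usual back-of-the-envelope model. The sketch below is only illustrative; the 90% parallel fraction is an assumed figure, not a measurement of any real application, and the model ignores the IO and interconnect contention mentioned above.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workload: 90% of the run time parallelizes perfectly.
PARALLEL_FRACTION = 0.90

for cores in (6, 12, 50, 64):
    speedup = amdahl_speedup(PARALLEL_FRACTION, cores)
    print(f"{cores:>3} cores -> {speedup:.2f}x")
```

Under those assumptions, going from 6 to 64 cores roughly doubles throughput rather than multiplying it by ten, which is why the workload matters more than the core count.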
This supercomputer is due late 2012 and will have 3200 machines. $2.5m is to be spent on Knights Corner for 8 petaflops of DP computation. That would mean 8000 Knights Corner chips, between 2 and 3 per machine. This means $312 for each individual Knights Corner chip.
A 6-core E5 processor will be around 150GFLOPs DP. Knights Corner would be about 6.5x faster.
8" Cube
Xeon E5-2630 (150GFLOPs) - $612
Knights Corner (1000GFLOPs) - $312
AMD 7970 (947GFLOPs) - $400
The look on the faces of old-school Mac Pro buyers persuaded to buy a Thunderbolt-chainable $2999 toy box with over 2 TFLOPs of double precision computation - priceless
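A quick check of the arithmetic in this post, using only the figures quoted above (the $2.5m budget, 8 petaflops, 3200 machines, and the per-part GFLOPs and prices in the cube list). The 1000 GFLOPs-per-chip figure is the post's own assumption, and none of these are confirmed prices.

```python
# Figures as quoted in the post; the per-chip throughput is the poster's assumption.
BUDGET_USD = 2_500_000
TARGET_GFLOPS = 8_000_000   # 8 petaflops of double precision
GFLOPS_PER_CHIP = 1000      # assumed ~1 TFLOP DP per Knights Corner chip
MACHINES = 3200

chips = TARGET_GFLOPS // GFLOPS_PER_CHIP
usd_per_chip = BUDGET_USD / chips
chips_per_machine = chips / MACHINES

# Hypothetical cube parts list: name -> (GFLOPs DP, price in USD)
cube_parts = {
    "Xeon E5-2630": (150, 612),
    "Knights Corner": (1000, 312),
    "AMD 7970": (947, 400),
}
total_gflops = sum(gflops for gflops, _ in cube_parts.values())
total_usd = sum(price for _, price in cube_parts.values())
knc_vs_xeon = cube_parts["Knights Corner"][0] / cube_parts["Xeon E5-2630"][0]

print(f"{chips} chips, ${usd_per_chip:.2f} each, {chips_per_machine} per machine")
print(f"cube: {total_gflops} GFLOPs DP for ${total_usd} in compute parts")
print(f"Knights Corner vs 6-core E5: {knc_vs_xeon:.1f}x")
```

The parts total comes out just over 2 TFLOPs, matching the claim, and the Knights Corner to E5 ratio lands between 6.5x and 7x.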
The look on their faces will be one of disappointment over the slow connections and lack of a high speed backplane so all that horsepower is stuck inside each cube.
The interconnects in the supercomputer are 4 lanes of FDR infiniband at 14 Gbps each for an aggregate of 56Gbps on a switch, not chained.
On the Mac Pro with PCIe 3.0 the 16 lane slot will be 128 GT/s (or 16 GB/s) dedicated vs 10 Gbps chained.
The Red Rocket wants to be in the PCIe x 16 slot with at least 8 lanes of PCIe 2.0. It'll work over Thunderbolt as seen in the video...but if it really wants 32 Gbps bandwidth and is sharing 10 Gbps it's going to get starved at the high end of its performance band.
Latency is also an issue.
So an 8" cube with a double slot and nothing but TB is not so great as a Mac Pro replacement at the high end because in the end TB will be the bottleneck...even the 20 Gbps version.
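The bandwidth gap described above can be sketched numerically. The PCIe line rates and encodings (5 GT/s with 8b/10b for 2.0, 8 GT/s with 128b/130b for 3.0) are the published per-lane figures; the InfiniBand and Thunderbolt numbers are simply the ones quoted in this post, and protocol overhead beyond line encoding is ignored.

```python
def pcie_gbps(lanes, gt_per_s, encoding_efficiency):
    """Usable Gbps for a PCIe link after line encoding (no protocol overhead)."""
    return lanes * gt_per_s * encoding_efficiency


links_gbps = {
    "PCIe 3.0 x16": pcie_gbps(16, 8.0, 128 / 130),
    "PCIe 2.0 x8 (Red Rocket minimum)": pcie_gbps(8, 5.0, 8 / 10),
    "FDR InfiniBand 4x (as quoted)": 4 * 14.0,
    "Thunderbolt channel (as quoted)": 10.0,
}

tb = links_gbps["Thunderbolt channel (as quoted)"]
for name, gbps in links_gbps.items():
    print(f"{name:<34} {gbps:7.1f} Gbps  {gbps / 8:6.2f} GB/s  {gbps / tb:5.1f}x TB")
```

On these numbers a single Thunderbolt channel carries less than a third of what the Red Rocket asks for, before any sharing with other devices on the chain.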
The flip side of this is that Intel is going all in with Infiniband. That is, they are building it into some of their high performance chips, so the possibility is there for high performance clustering I/O. Further, with Infiniband built into the chip, PCIe is bypassed.
The look on their faces will be one of disappointment over the slow connections and lack of a high speed backplane so all that horsepower is stuck inside each cube.
The problem here is that I don't see Apple going after large cluster installations, but rather trying to lock up the small work station cluster market.
The interconnects in the supercomputer are 4 lanes of FDR infiniband at 14 Gbps each for an aggregate of 56Gbps on a switch, not chained.
So? Really we need to consider what is practical for Apple and its customers.
On the Mac Pro with PCIe 3.0 the 16 lane slot will be 128 GT/s (or 16 GB/s) dedicated vs 10 Gbps chained.
The Red Rocket wants to be in the PCIe x 16 slot with at least 8 lanes of PCIe 2.0. It'll work over Thunderbolt as seen in the video...but if it really wants 32 Gbps bandwidth and is sharing 10 Gbps it's going to get starved at the high end of its performance band.
Latency is also an issue.
So an 8" cube with a double slot and nothing but TB is not so great as a Mac Pro replacement at the high end because in the end TB will be the bottleneck...even the 20 Gbps version.
Think about what might be possible on a future machine, not about what is common these days. Intel, Micron, NVidia, AMD and others have some very interesting solutions coming to market in the near future. Knights Corner is just one part of the puzzle; 3D memory might be ready soon and other technologies are coming. I honestly believe that if Apple wasn't waiting on these technologies (one or more) we would have seen a far different Mac Pro update.
I agree, I figured it was a possibility for WWDC but someone said they won't be making them in volume. I found this article: http://www.hpcwire.com/hpcwire/2011-09-22/dell_to_build_10-petaflop_supercomputer_for_science.html
This supercomputer is due late 2012 and will have 3200 machines. $2.5m is to be spent on Knights Corner for 8 petaflops of DP computation. That would mean 8000 Knights Corner chips, between 2 and 3 per machine. This means $312 for each individual Knights Corner chip.
Neat huh?
My understanding is that Knights Corner operates more or less as a coprocessor with a local OS. This should work very well with Apple's current infrastructure, that is, Grand Central Dispatch and OpenCL.
A 6-core E5 processor will be around 150GFLOPs DP. Knights Corner would be about 6.5x faster.
8" Cube
Xeon E5-2630 (150GFLOPs) - $612
Knights Corner (1000GFLOPs) - $312
AMD 7970 (947GFLOPs) - $400
The potential is there for people to be stunned. Sadly, if Apple doesn't come up with something impressive the Mac Pro will be cooked.
The look on the faces of old-school Mac Pro buyers persuaded to buy a Thunderbolt-chainable $2999 toy box with over 2 TFLOPs of double precision computation - priceless
It is interesting that the MBPR now has two TB ports. The obvious implication is that Apple sees more intense usage of TB. Even if we don't see clustering over TB, I suspect that we will be seeing 4 or more ports on the Mac Pro.
I still have a lot of interest in what is up with the Mini and to a lesser extent the iMac. Still hoping to see the same innovation effort put into the MBPR go into these machines. As such I hope they don't castrate the Mini again. With a Cluster capable Mac Pro there should be no reason to underpower the Mini.
The problem here is that I don't see Apple going after large cluster installations, but rather trying to lock up the small work station cluster market.
My point is that the supercomputer nodes work because there is a high bandwidth interconnect between the nodes. Far higher bandwidth than TB.
Quote:
So? Really we need to consider what is practical for Apple and its customers.
What's "practical" for Mac Pro users is having one PCIe 3.0 x16, one PCIe 3.0 x8 and two PCIe 3.0 x4. That allows you to have 1 higher end GPU, 2 high throughput cards and a slot for an expansion chassis if need be.
What's not "practical" is expecting two 10 Gbps TB links to carry up to around 32 lanes worth of PCIe 3.0 bandwidth. That's around 32 GB/s worth vs 2.5 GB/s on the two TB.
Quote:
Think about what might be possible on a future machine, not about what is common these days. Intel, Micron, NVidia, AMD and others have some very interesting solutions coming to market in the near future. Knights Corner is just one part of the puzzle; 3D memory might be ready soon and other technologies are coming. I honestly believe that if Apple wasn't waiting on these technologies (one or more) we would have seen a far different Mac Pro update.
It is interesting that the MBPR now has two TB ports. The obvious implication is that Apple sees more intense usage of TB. Even if we don't see clustering over TB, I suspect that we will be seeing 4 or more ports on the Mac Pro.
I still have a lot of interest in what is up with the Mini and to a lesser extent the iMac. Still hoping to see the same innovation effort put into the MBPR go into these machines. As such I hope they don't castrate the Mini again. With a Cluster capable Mac Pro there should be no reason to underpower the Mini.
Again, TB is great for laptops, AIO and SFF computers. Not so great in comparison to having real slots in the Mac Pro. Each one burns 4 PCIe 2.0 lanes...which the MBPR had available but the Mac Pro is somewhat short. For example it would have been nice to have been able to configure it into two PCIe x16 slots...
4 TB ports means 8ish PCIe 3.0 lanes gone, which leaves fewer for the internal slots, and it's less flexible than being able to configure the slots the way you need them.
Meh...2 seems more likely. Remember also that if you're running a display on them the available bandwidth drops to 5.something Gbps.
A solid Mac Pro update to SB that is rack mountable but good looking would make many folks happy. A cube sized MP with esoteric hardware most software can't utilize not so much. If Apple wants to dabble with TB based clusters the mini is a far better platform to tinker with than to cripple the Mac Pro.
Also isn't Knights Corner kind of a response to NVidia's Tesla cards? I will have to read more about them, but general computing is a bit different from HPCs running highly customized code.
There's some work involved but developers are already doing it. Apple is using OpenCL in their software and it should run on Knights Corner. You'd still get one multi-core general CPU but the Knights Corner and high-end GPU are the only ways to properly tackle high resource workflows in a way that a second CPU just can't come close to.
CERN ported some code over to it for a benchmark:
Some people can port code over in a matter of hours:
No matter the effort required to port code, the benefits are worthwhile. For a community that requires the fastest performance no matter the cost, it's the best way to go.
Thunderbolt latency is 9ns, FDR is 160ns. The interconnects won't be an issue for performance at all - loads of supercomputers are using Gig-E interconnects.
I don't see Apple going after large cluster installations, but rather trying to lock up the small work station cluster market.
I think they have a chance at both but primarily the workstation environment. With optical Thunderbolt, a creative office can connect every Mac Pro and even iMac and Mini across large distances to use the compute power in every machine. Submit a render job on one machine and it can be distributed across 10 other machines with a mix of i7, Xeon, Knights Corner, GPUs etc. If a GPU can be run over a Thunderbolt box, Apple can have the computers all work like a co-processor so it's transparent. Think of an office with 20TFLOPs of double precision compute for under $30k.
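As a rough illustration of what distributing a render job across mixed hardware would involve, here is a minimal sketch that splits frames in proportion to each node's throughput. The function name `split_frames`, the node names, and the i7 figure are hypothetical; the other GFLOPs numbers are the ones quoted earlier in the thread. Real schedulers also have to account for transfer time over the interconnect, which this ignores.

```python
def split_frames(total_frames, node_gflops):
    """Assign frames to nodes in proportion to their (integer) GFLOPs rating."""
    total = sum(node_gflops.values())
    # Floor of each node's proportional share.
    shares = {name: total_frames * g // total for name, g in node_gflops.items()}
    # Flooring can leave a few frames unassigned; give them to the fastest nodes.
    leftover = total_frames - sum(shares.values())
    fastest_first = sorted(node_gflops, key=node_gflops.get, reverse=True)
    for name in fastest_first[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical office: node name -> double precision GFLOPs.
nodes = {
    "imac-i7": 100,        # assumed figure, not from the thread
    "macpro-xeon": 150,
    "cube-knc": 1000,
    "cube-7970": 947,
}

plan = split_frames(1000, nodes)
for name, frames in plan.items():
    print(f"{name:<12} {frames:4d} frames")
```

A proportional split like this only pays off if each node can be fed its frames fast enough, which is where the interconnect argument elsewhere in this thread comes in.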
This would be amazing for marketing. It has a great identity being a Cube, they can call it 'your own personal supercomputer' and it actually means something.
The Red Rocket wants to be in the PCIe x 16 slot with at least 8 lanes of PCIe 2.0.
Would you need a Red Rocket with Knights Corner? The idea is that the massive compute performance will allow people to bypass a lot of custom hardware.
There's some work involved but developers are already doing it. Apple is using OpenCL in their software and it should run on Knights Corner. You'd still get one multi-core general CPU but the Knights Corner and high-end GPU are the only ways to properly tackle high resource workflows in a way that a second CPU just can't come close to.
Sorta. You still need to overcome the bottlenecks associated with high parallelism for the 50-64 x86 cores in Knights Corner. To get peak performance you need to optimize your code for the 512 bit SIMD. Arguably easier for KNC than CUDA but there's quite a bit of work involved anyway. It's work that HPC programs are typically reasonably well suited for and general purpose computing not as much. How long did it take Adobe to handle multiple cores?
KNC I understand is pretty power hungry in comparison to GPUs. There are advantages and disadvantages between the shallow 6 stage pipeline in the KNC and the longer pipelines in GPUs (20+). It can be nice but there are always tradeoffs.
Quote:
No matter the effort required to port code, the benefits are worthwhile. For a community that requires the fastest performance no matter the cost, it's the best way to go.
Thunderbolt latency is 9ns, FDR is 160ns. The interconnects won't be an issue for performance at all - loads of supercomputers are using Gig-E interconnects.
The 160ns FDR latency is including the switch. What is the latency to the TB device at the end of the chain?
Interconnects are an issue for supercomputers, but the issue is mitigated by the way the parallelism is structured. Communicating between compute nodes is also very different from talking to your GPU or other high speed PCIe cards.
Quote:
I think they have a chance at both but primarily the workstation environment. With optical Thunderbolt, a creative office can connect every Mac Pro and even iMac and Mini across large distances to use the compute power in every machine. Submit a render job on one machine and it can be distributed across 10 other machines with a mix of i7, Xeon, Knights Corner, GPUs etc. If a GPU can be run over a Thunderbolt box, Apple can have the computers all work like a co-processor so it's transparent. Think of an office with 20TFLOPs of double precision compute for under $30k.
This would be amazing for marketing. It has a great identity being a Cube, they can call it 'your own personal supercomputer' and it actually means something.
They could do that today over GigE but the software complexity is large in comparison to a dedicated render farm.
A GPU can run over a Thunderbolt box just very slowly in comparison to inside a Mac Pro on a x16 slot.
What you are suggesting really hoses the workstation environment by removing the flexibility of 4 PCIe slots...which was none too many to begin with.
Again, if they really wanted to do this then the Mac Mini server is a better form factor to start with. Make it a cube, put two TB ports on it, a top end Core i7, a decent GPU and a couple extra RAM slots. Then use 3rd party expansion chassis for GPGPUs and KNCs boards and provide the software framework to make it usable.
That's a hell of a lot better than crippling the Mac Pro unless you're going to provide dedicated 16 lanes PCIe 3.0 for an offboard chassis like an uber Cubix.
Many Pros need more slots and what PCIe 3.0 brings, not fewer. You use one slot and 16 lanes for your primary GPU, then another slot for a good SAS or FC card, then maybe one for a Decklink card. Then you want another GPU since CS6 and Resolve can use multiple GPUs but you also want a Rocket. Doh. Even if you HAD another slot you're likely out of PCIe 2.0 lanes without downgrading one of the other cards. Like using only 8 lanes for your primary GPU. Ugh.
If you're building a dump truck, build a dump truck. Not try to replicate it with a bunch of modular pickups.
Quote:
Would you need a Red Rocket with Knights Corner? The idea is that the massive compute performance will allow people to bypass a lot of custom hardware.
Asking this is like asking would you need a GPU with that massive compute performance from KNC...that was the whole idea of Larrabee in the first place.
In theory you could do the wavelet decompression and debayering R3Ds in GPGPU software since without a Rocket you're falling back to the CPU anyway but given that 3D, HDRx and multiple streams require multiple Rockets my guess is that Red looked at it and concluded that a software implementation in OpenCL or CUDA wasn't going to be good enough. It's not like high end video workstations aren't sporting Teslas today.
I am very much in the "MBPR and iMac are good enough for 90% of Pro use cases" camp. But I'm also very much in the "don't dumb down the Mac Pro into a fancy cube" camp. Those slots are highly critical for many high end use cases. TB is NOT a viable alternative. The case size doesn't f-ing matter except that it would be nice if it fit into a 3 or 4U slot in a rack.
My point is that the supercomputer nodes work because there is a high bandwidth interconnect between the nodes. Far higher bandwidth than TB.
Yes for a super computer you need state of the art bandwidth. We aren't talking super computer here but a small cluster of machines.
Beyond that, how many generations and varieties of supercomputer interconnects have there been? There have been more than a few, some of which would never be practical for Apple.
What's "practical" for Mac Pro users is having one PCIe 3.0 x16, one PCIe 3.0 x8 and two PCIe 3.0 x4. That allows you to have 1 higher end GPU, 2 high throughput cards and a slot for an expansion chassis if need be.
You seem to imply that TB ports and even an Infiniband port would use up all available lanes in a new Mac Pro. That is very unlikely especially with Intel supporting 40 lanes of PCI Express on its new chips.
What's not "practical" is expecting two 10 Gbps TB links to carry up to around 32 lanes worth of PCIe 3.0 bandwidth. That's around 32 GB/s worth vs 2.5 GB/s on the two TB.
OK who suggested such a thing?
We're looking at the 2013 time frame.
Yes 2013 is in the future. Intel's multi core technology is slated to arrive by the end of the year. NVidia's new compute engines arrive in that time frame too. So yeah 2013 is the future and in that time frame new hardware will be available to make a radically different work station.
Again, TB is great for laptops, AIO and SFF computers. Not so great in comparison to having real slots in the Mac Pro. Each one burns 4 PCIe 2.0 lanes...which the MBPR had available but the Mac Pro is somewhat short. For example it would have been nice to have been able to configure it into two PCIe x16 slots...
Reading things into my posts that I never said isn't helping your position. Supporting TB does not mean that slots have to disappear. There are plenty of PCI Express lanes to go around in a work station. In any event looking at today's Mac Pro to determine what is or should be available on a future machine is foolish.
4 TB ports means 8ish PCIe 3.0 lanes gone, which leaves fewer for the internal slots, and it's less flexible than being able to configure the slots the way you need them.
How many slots you can manage alongside those TB ports is a function of the chipset selected. Without knowing the specifics you can't say for sure if you are losing or gaining.
Meh...2 seems more likely. Remember also that if you're running a display on them the available bandwidth drops to 5.something Gbps.
The serial nature of the ports means that more than one of anything can lead to contention and bandwidth sharing. Thus it makes even more sense to support 4 or more TB ports on a Mac Pro.
A solid Mac Pro update to SB that is rack mountable but good looking would make many folks happy.
This is certainly true. However such a machine will not move the industry forward. Nor will such a machine support modern software workloads optimally. In some cases apps wouldn't even be possible without OpenCL support, at least not in a realtime sense.
A cube sized MP with esoteric hardware most software can't utilize not so much. If Apple wants to dabble with TB based clusters the mini is a far better platform to tinker with than to cripple the Mac Pro.
You won't be fitting such hardware into a Mini anytime soon.
As to software, there are already apps out there that strain current systems even with GPU acceleration and dual socket CPUs. Going to Sandy Bridge E alone would not have a serious impact on those apps. New processor technology and clustering might change the equation favorably. Further if an app doesn't need clustering support it will still run fine on one of these machines.
Well, great resolution is great, but I personally (and a few other people) prefer the larger screen size, because what we work with requires it to be bigger and I like to work at a desk, leaned back where my face is 2-3 feet from the screen. If you want more real estate, just have Apple increase the resolution of the 17-inch as well.
But once again, I understand their business move
A leak is unsanctioned. This came directly from Tim. Doesn't get more official.
What about the Mac G4 Cube? Nope, still dead.
Mac Pro: still alive.
I never said the Mac Pro or the MacBook Pro were dead, just the 17" flavor. If it's dead, how about an official announcement? I don't mind piecing up the There's no excuse for not being upfront about products. They're usually closed-mouth about when product will be upgraded so it doesn't disrupt sales but the 17" was eliminated without explanation. It's one thing to hold off on an update or not announcing a coming update in an effort to continue sales but it's another thing entirely pulling a product without explanation.
You seem to imply that TB ports and even an Infiniband port would use up all available lanes in a new Mac Pro. That is very unlikely especially with Intel supporting 40 lanes of PCI Express on its new chips.
Ideally you want to have two x16 3.0 slots for two GPUs running full speed. That's 32 lanes by themselves leaving 8 for everything else. Seems too tight.
So assume you have one x16 slot and a desire for two x8 cards and one x4 card. That's 36 lanes out of 40.
You want 4 TB ports which is 8 lanes.
So even with PCIe 3.0 lanes are a precious commodity. 2 TB ports using 4 lanes seems like the max for the Mac Pro if you don't want to compromise what you can do internally with the 4 slots.
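The lane budget above is simple enough to tabulate. This sketch assumes the 40-lane figure for Sandy Bridge-E mentioned in the thread and the rough estimate of 2 PCIe 3.0 lanes per Thunderbolt port; both constants and the `lanes_left` helper are illustrative, not a description of any actual Mac Pro design.

```python
TOTAL_LANES = 40        # PCIe 3.0 lanes per CPU, as cited in the thread
LANES_PER_TB_PORT = 2   # rough estimate: "4 TB ports means 8ish PCIe 3.0 lanes"


def lanes_left(slot_widths, tb_ports):
    """Lanes remaining after the given slots and Thunderbolt ports (negative = over budget)."""
    return TOTAL_LANES - sum(slot_widths) - tb_ports * LANES_PER_TB_PORT


configs = [
    ("two x16 GPUs, no TB", [16, 16], 0),
    ("x16 + x8 + x8 + x4, no TB", [16, 8, 8, 4], 0),
    ("x16 + x8 + x8 + x4, 2 TB ports", [16, 8, 8, 4], 2),
    ("x16 + x8 + x8 + x4, 4 TB ports", [16, 8, 8, 4], 4),
]

for label, slots, tb_ports in configs:
    print(f"{label:<32} {lanes_left(slots, tb_ports):+d} lanes")
```

With these assumptions the four-slot layout fits with two Thunderbolt ports and goes over budget with four, which is the trade-off being argued here.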
Quote:
OK who suggested such a thing?
Marvin. A 8" cube with a single slot for GPU and everything else via TB.
Quote:
Reading things into my posts that I never said isn't helping your position. Supporting TB does not mean that slots have to disappear. There are plenty of PCI Express lanes to go around in a work station. In any event looking at today's Mac Pro to determine what is or should be available on a future machine is foolish.
Yes, it does. The math above shows that. The current Mac Pro is anemic when it comes to lanes and slots. Moving to PCIe 3.0 helps that but not if you burn them on the TB ports.
The 8" form factor also forces the slots to disappear.
Quote:
How many slots you can manage alongside those TB ports is a function of the chipset selected. Without knowing the specifics you can't say for sure if you are losing or gaining.
If you want full speed x16 PCI-e 3.0 slots then the math is pretty simple for the SB Xeon E5-2600. 2 max. With 1 you end up with 24 left for 3 more slots. Whatever you take from them to implement TB means fewer x8 slot configurations with the primary x16 GPU.
Quote:
The serial nature of the ports means that more than one of anything can lead to contention and bandwidth sharing. Thus it makes even more sense to support 4 or more TB ports on a Mac Pro.
Again, it is far better in a tower to leave those lanes available to use in your slots. In a laptop, AIO or SFF there are no slots.
Quote:
This is certainly true. However such a machine will not move the industry forward. Nor will such a machine support modern software workloads optimally. In some cases apps wouldn't even be possible without OpenCL support, at least not in a realtime sense.
Modern WORKSTATION workloads are currently handled using PCIe cards like the Rocket, GPUs like Tesla, etc. These are all bandwidth hungry cards that state that they want more bandwidth than is available even if it had the entire TB link.
What you and Marvin favor is a machine that is less capable and more rigid. Not everyone will benefit from KNC and might want something else BUT you can drop a KNC card into a x8 slot. 2-3 if you want if you have the bandwidth. Which you won't if you have 4+ TB ports.
Quote:
You won't be fitting such hardware into a Mini anytime soon.
Sure you can. It just has to be bigger to fit the KNC board and not fry itself from the heat. Something the size of the old mini might work. It's not like a 8" cube is that much bigger.
Quote:
As to software, there are already apps out there that strain current systems even with GPU acceleration and dual socket CPUs. Going to Sandy Bridge E alone would not have a serious impact on those apps. New processor technology and clustering might change the equation favorably. Further if an app doesn't need clustering support it will still run fine on one of these machines.
Except the box Marvin suggests can't have the same amount of GPU and dedicated hardware that the current Mac Pro can support running at the same speeds that they do today. TB would be a huge bottleneck in comparison to native PCIe 3.0 and I don't believe that bonding is in the spec.
You guys still want a freaking xMac...fine but don't go advocating wrecking the Mac Pro to get it. A single slot Mac Pro cube is only marginally better than the iMac.
There's some work involved but developers are already doing it. Apple is using OpenCL in their software and it should run on Knights Corner. You'd still get one multi-core general CPU but the Knights Corner and high-end GPU are the only ways to properly tackle high resource workflows in a way that a second CPU just can't come close to.
The fact that this hardware fits well into Apple's existing infrastructure seems to be a big win from what I can see. The cores in the Knights series seem to complement the GPU approach in that they can run more code effectively. This means each Knights processor can handle more types of code than a GPU can, which means more types of code can be sped up. The one thing that people can't seem to grasp about GPU acceleration is that GPUs are effectively limited to SIMD type acceleration.
CERN ported some code over to it for a benchmark:
Some people can port code over in a matter of hours:
No matter the effort required to port code, the benefits are worthwhile. For a community that requires the fastest performance no matter the cost, it's the best way to go.
The cores in Knights Corner aren't perfect, they are in order machines so single threads may or may not execute well. However cores can be a huge advantage for many users.
Thunderbolt latency is 9ns, FDR is 160ns. The interconnects won't be an issue for performance at all - loads of supercomputers are using Gig-E interconnects.
Yeah I'm not sure what the problem with the interconnect is. TB is bidirectional so it will perform well, but even over alternative interconnects small clusters can work well. Of course if one searches hard enough I'm sure workloads can be found that will suffer, but that is life.
I think they have a chance at both but primarily the workstation environment. With optical Thunderbolt, a creative office can connect every Mac Pro and even iMac and Mini across large distances to use the compute power in every machine. Submit a render job on one machine and it can be distributed across 10 other machines with a mix of i7, Xeon, Knights Corner, GPUs etc. If a GPU can be run over a Thunderbolt box, Apple can have the computers all work like a co-processor so it's transparent. Think of an office with 20TFLOPs of double precision compute for under $30k.
That capability already exists but isn't used widely.
This would be amazing for marketing. It has a great identity being a Cube, they can call it 'your own personal supercomputer' and it actually means something.
Would you need a Red Rocket with Knights Corner? The idea is that the massive compute performance will allow people to bypass a lot of custom hardware.
Many industries would change drastically with so much power available on the desktop. This is especially the case if they can control costs on the machine. Controlling costs is why I expect a drastically overhauled physical design.
I get it, more importantly I don't live in the past.
Ideally you want to have two x16 3.0 slots for two GPUs running full speed. That's 32 lanes by themselves leaving 8 for everything else. Seems too tight.
PCI Express has enough bandwidth to drive GPUs over 8 lanes. However, you assume that unreleased products will be limited to X number of PCI lanes. I'm just not convinced that we know what is being released and with what chip sets.
So assume you have one x16 slot and a desire for two x8 cards and one x4 card. That's 36 lanes out of 40.
You want 4 TB ports which is 8 lanes.
So even with PCIe 3.0 lanes are a precious commodity. 2 TB ports using 4 lanes seems like the max for the Mac Pro if you don't want to compromise what you can do internally with the 4 slots.
All engineering is a task in finding a balance between available materials and the goals you want to achieve. Apple could easily tackle this issue with three eight lane slots, which would leave lanes for TB and onboard devices.
Marvin. A 8" cube with a single slot for GPU and everything else via TB.
Actually even though Marvin and I have similar ideas we are not actually looking at the same vision. If I was designing the next Mac Pro the GPU would be planted right on the motherboard as would be Knights Corner.
Yes, it does. The math above shows that. The current Mac Pro is anemic when it comes to lanes and slots. Moving to PCIe 3.0 helps that but not if you burn them on the TB ports.
Only in your world view. Besides, your position is asinine: you are looking at old technology to inform you of what is possible on new technology. The current Mac Pro is a very old design to be doing that with. Sandy Bridge E comes with 40 lanes of PCI Express; I hardly think there will be a problem finding the right distribution of those lanes.
The 8" form factor also forces the slots to disappear.
Well no it doesn't but that presupposes that the machine will be 8" in size.
If you want full speed x16 PCI-e 3.0 slots then the math is pretty simple for the SB Xeon E5-2600. 2 max. With 1 you end up with 24 left for 3 more slots. Whatever you take from them to implement TB means fewer x8 slot configurations with the primary x16 GPU.
yep
This is no big deal though.
Again, it is far better in a tower to leave those lanes available to use in your slots. In a laptop, AIO or SFF there are no slots.
Well let's just say the tower has gone the way of the dodo. Even so you still have lanes to make up a number of slot arrangements.
Modern WORKSTATION workloads are currently handled using PCIe cards like the Rocket, GPUs like Tesla, etc. These are all bandwidth hungry cards that state that they want more bandwidth than is available even if it had the entire TB link.
In your world view! If the GPU accelerator and Knights Corner are already on board your need for slots is minimized. That is not to say slots aren't needed, just that you change the dynamics by shipping standard hardware with high performance features baked in.
What you and Marvin favor is a machine that is less capable and more rigid. Not everyone will benefit from KNC and might want something else BUT you can drop a KNC card into a x8 slot. 2-3 if you want if you have the bandwidth. Which you won't if you have 4+ TB ports.
Not everyone benefits from a GPU. Your argument makes no sense. You build such platforms for running tomorrow's software, not today's.
As to the TB ports I really don't get what the emotion is all about. The hardware implements a crossbar switch so the lanes can be routed as needed. Further, it would not be unheard of for one or two of those ports to end up being dedicated to driving video displays.
Sure you can. It just has to be bigger to fit the KNC board and not fry itself from the heat. Something the size of the old mini might work. It's not like a 8" cube is that much bigger.
First you say you can't, then you can, then you can't. I really don't follow what your problem is. Please don't get hung up on a specific cube size either.
Except the box Marvin suggests can't have the same amount of GPU and dedicated hardware that the current Mac Pro can support running at the same speeds that they do today. TB would be a huge bottleneck in comparison to native PCIe 3.0 and I don't believe that bonding is in the spec.
TB has plenty of bandwidth where it is needed. Plus with PCI Express 3 you have even more bandwidth to devices that need it.
You guys still want a freaking xMac...fine but don't go advocating wrecking the Mac Pro to get it. A single slot Mac Pro cube is only marginally better than the iMac.
Anybody with a reasonable understanding of workstation technology would realize that the current Mac Pro is a wreck! You are hung up on the box and frankly I see the current box as a negative when it comes to implementing a modern machine with a wide appeal to all the different high performance computer users out there.
In any event let's say that cube has two slots with a GPU and a Knights Corner on the motherboard, how is that not a high performance workstation? It would be for most definitions and frankly they could move the processors out to cards if they wanted but the reality is that cards are compromises. If you want to ship a block of computing power it is best to integrate it onto one board.
I like the fact that you completely ignored his question, and thereby proved his point. Most people bitching about the Mac Pro don't even need one based on what they do, and are bitching for the hell of it. I know people who do nothing but watch porn and needless customization to their computers and call themselves 'power users'. His question is what the current Mac Pro can't do, that you need it to do.
I ignored the question because it's irrelevant. Who is he (or you) to tell other users what they "need"? The point is that Apple telling me that I'm important to them while charging me full price for just-updated, "new" hardware that is already one to three years out of date is insulting. When I got my last Mac Pro, it came out a month or two before anyone else in the industry shipped a machine with those chips. THAT was making your customers feel important. Now, they upgrade a machine to year-old processors, when brand new processors are shipping from other vendors RIGHT NOW. That just makes your customers feel s**t on.
As for me personally, I agree with you. I DON'T need a Mac Pro. I never have, but I got one anyway. Why? Because I could afford it, and I do need more than an iMac can provide at a reasonable cost. Yes, I'm aware you can get a pile of thunderbolt accessories.. but they cost a small fortune. Not to mention buying a whole new display every time you upgrade the machine.
That's why I'm switching to Windows on my next upgrade cycle. While I'm not a big fan of Windows, I'm not an Apple fanboy, either, and I'm fluent in both. Both have their strengths and both have their warts. It was already hard enough justifying buying a Mac Pro when all you need is a souped-up iMac, but there's just no way I'm paying Apple's price for a Mac Pro that is already a half a cycle old compared to hardware I can get from their competitors. And after the recent Final Cut Pro debacle, I don't see the situation improving any time soon.
The 160ns FDR latency is including the switch. What is the latency to the TB device at the end of the chain?
It's supposed to be 9ns across 7 devices but real-world scenarios may prove otherwise. It certainly should be under 9ns end-to-end across one device though.
In theory you could do the wavelet decompression and debayering R3Ds in GPGPU software since without a Rocket you're falling back to the CPU anyway but given that 3D, HDRx and multiple streams require multiple Rockets my guess is that Red looked at it and concluded that a software implementation in OpenCL or CUDA wasn't going to be good enough.
Knights Corner can use more general purpose code though.
Marvin. A 8" cube with a single slot for GPU and everything else via TB.
My Cube would have 6 Thunderbolt ports with 16 lanes for the GPU. Although they are lower bandwidth, I'd say 6 TB ports is better than 3x PCI slots. Some cards will prevent you even using all 3 slots.
Actually even though Marvin and I have similar ideas we are not actually looking at the same vision. If I was designing the next Mac Pro the GPU would be planted right on the motherboard as would be Knights Corner.
I think the Knights Corner (or more likely Knights Landing if late 2013) would be on the motherboard but Apple put the iMac GPU in a slot so I figure they'd do the same with the Mac Pro. Either way is good though.
Except the box Marvin suggests can't have the same amount of GPU and dedicated hardware that the current Mac Pro can support running at the same speeds that they do today. TB would be a huge bottleneck in comparison to native PCIe 3.0 and I don't believe that bonding is in the spec.
I don't see there being a huge bottleneck. Thunderbolt is a multi-protocol connection. If you need channel bonding, it can be implemented just like it is on a fibre channel PCI board. Thunderbolt is external PCI. Whatever you use internal PCI for can be used for external PCI. Here is a 10Gbps ethernet Thunderbolt box with link aggregation:
You guys still want a freaking xMac...fine but don't go advocating wrecking the Mac Pro to get it. A single slot Mac Pro cube is only marginally better than the iMac.
Nah, the xMac was always the cheaper i7 box that could never happen. This is rethinking what a workstation-class machine should represent. It shouldn't be hacking together ugly PCI cards and only having Xeons that take forever to improve in performance.
6 Thunderbolt ports = lots of ports for expansion if you need it and/or up to 6 displays.
powerful GPU that is the best in class and well-supported
Knights Landing - 1.5-2 TFLOPs of double precision semi-general purpose computing for high resource workloads like video encoding/decoding and rendering
6-10 core Xeon for reliable general purpose tasks that can't be or don't need to be accelerated by the co-processor.
Affordable - $2999.
Scalable - just hook more together. No matter if Intel screw up their rollout again, the performance can be scaled linearly.
However it appears that they recognize that they screwed up thus the "leaks" about a new Mac Pro coming in 2013. If the advance is as strong as rumored it might not be a bad idea to put off the platform change.
I ignored the question because it's irrelevant. Who is he (or you) to tell other users what they "need"? The point is that Apple telling me that I'm important to them while charging me full price for just-updated, "new" hardware that is already one to three years out of date is insulting. When I got my last Mac Pro, it came out a month or two before anyone else in the industry shipped a machine with those chips. THAT was making your customers feel important. Now, they upgrade a machine to year-old processors, when brand new processors are shipping from other vendors RIGHT NOW. That just makes your customers feel s**t on.
It certainly was an incredibly stupid move on Apple's part. However maybe the decision was made to encourage people to stay away from the Mac Pro until they have deliverable replacement hardware.
The problem isn't so much the hardware as the lack of communication about why it was so poorly updated. This is an example of Apple's secrecy working against them.
As for me personally, I agree with you. I DON'T need a Mac Pro. I never have, but I got one anyway. Why? Because I could afford it, and I do need more than an iMac can provide at a reasonable cost. Yes, I'm aware you can get a pile of thunderbolt accessories.. but they cost a small fortune. Not to mention buying a whole new display every time you upgrade the machine.
Few of us actually need any of Apple's hardware. In my case one of the reasons I support the XMac concept is that we do need a desktop computer that is better than the iMac instead of being forced to choose between Mini, Mac Pro and the laptops.
That's why I'm switching to Windows on my next upgrade cycle. While I'm not a big fan of Windows, I'm not an Apple fanboy, either, and I'm fluent in both. Both have their strengths and both have their warts. It was already hard enough justifying buying a Mac Pro when all you need is a souped-up iMac, but there's just no way I'm paying Apple's price for a Mac Pro that is already a half a cycle old compared to hardware I can get from their competitors. And after the recent Final Cut Pro debacle, I don't see the situation improving any time soon.
The third alternative is Linux.
In any event I'd suggest not leaving the fold too quickly. It appears that Apple realizes that they screwed up with this micro update to the Mac Pro. At this point their only choice is to ride out the hostility until they get the next generation hardware on the market. Speaking of which I'm rather put off by their excessive focus on the laptop segment with little effort put into innovating on the desktop. Apple's desktop lineup right now is pure crap from the Mini on up, it is no wonder sales suck.
Comments
Quote:
Originally Posted by hmm
There's a difference between a workstation and server.
Quote:
Originally Posted by WelshDog
I am aware of the difference. Try not to be so literal in your reading of my words.
I was suggesting that Apple might be working on something unexpected for the next or replacement Mac Pro.
Sorry I thought I wrote more of a response than that. I was going to mention the N-core scaling thing. Many applications have either a hard limit or diminishing returns with core scaling, and ARM setups like this tend to leverage many many cores, so they require tasks that run well in a distributed fashion without IO traffic jams. I don't know how they'd fare against Intel's QPI for tasks meant to run in perceptually real time. Was the Dell designed to be a low cost power efficient server or something of that sort?
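To put a rough number on the diminishing returns with core scaling, here is a minimal sketch using Amdahl's law. The parallel fractions (90% and 99%) and the core counts are illustrative assumptions, not measurements of any real application or chip.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction is the share of the work that can be spread across
    cores (an assumed value here); the rest stays serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical core counts: a 6-core CPU, a dual-socket box, a many-core part.
    for cores in (1, 6, 12, 50, 64):
        s90 = amdahl_speedup(0.90, cores)
        s99 = amdahl_speedup(0.99, cores)
        print(f"{cores:3d} cores: 90% parallel -> {s90:5.2f}x, 99% parallel -> {s99:5.2f}x")
```

With a 90% parallel workload the speedup stays below 10x no matter how many cores are added, which is the kind of hard limit mentioned above.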
Quote:
Originally Posted by Marvin
I agree, I figured it was a possibility for WWDC but someone said they won't be making them in volume. I found this article:
http://www.hpcwire.com/hpcwire/2011-09-22/dell_to_build_10-petaflop_supercomputer_for_science.html
This supercomputer is due late 2012 and will have 3200 machines. $2.5m is to be spent on Knights Corner for 8 petaflops of DP computation. That would mean 8000 Knights Corner chips, between 2 and 3 per machine. This means $312 for each individual Knights Corner chip.
A 6-core E5 processor will be around 150GFLOPs DP. Knights Corner would be about 6.5x faster.
8" Cube
Xeon E5-2630 (150GFLOPs) - $612
Knights Corner (1000GFLOPs) - $312
AMD 7970 (947GFLOPs) - $400
The look on the faces of old-school Mac Pro buyers persuaded to buy a Thunderbolt-chainable $2999 toy box with over 2 TFLOPs of double precision computation - priceless
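The arithmetic behind those figures can be checked with a short sketch. All inputs are the numbers quoted in the post (budget, petaflops, machine count, per-part GFLOPs and prices); the 1000 GFLOPs per Knights Corner chip is the post's own estimate, not a published spec.

```python
# Figures quoted in the post above (assumptions, not vendor specs).
KNC_BUDGET_USD = 2_500_000
KNC_TOTAL_PFLOPS = 8
KNC_GFLOPS_PER_CHIP = 1000
MACHINES = 3200

# Hypothetical cube parts list from the post: name -> (DP GFLOPs, price in USD).
CUBE_PARTS = {
    "Xeon E5-2630": (150, 612),
    "Knights Corner": (1000, 312),
    "AMD 7970": (947, 400),
}


def knc_chip_count() -> float:
    """Chips needed to reach the quoted total at the assumed per-chip rate."""
    total_gflops = KNC_TOTAL_PFLOPS * 1_000_000  # 1 PFLOPS = 1,000,000 GFLOPS
    return total_gflops / KNC_GFLOPS_PER_CHIP


def knc_price_per_chip() -> float:
    return KNC_BUDGET_USD / knc_chip_count()


def cube_total_gflops() -> int:
    return sum(gflops for gflops, _ in CUBE_PARTS.values())


def cube_parts_cost() -> int:
    return sum(price for _, price in CUBE_PARTS.values())


if __name__ == "__main__":
    print("chips:", knc_chip_count())
    print("chips per machine:", knc_chip_count() / MACHINES)
    print("price per chip:", knc_price_per_chip())
    print("KNC vs E5 ratio:", round(KNC_GFLOPS_PER_CHIP / 150, 2))
    print("cube GFLOPs:", cube_total_gflops(), "parts cost:", cube_parts_cost())
```

On these inputs the three compute parts add up to a little over 2 TFLOPs for about $1,300, leaving the rest of the $2999 for everything else in the box.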
The look on their faces will be one of disappointment over the slow connections and lack of a high speed backplane so all that horsepower is stuck inside each cube.
The interconnects in the supercomputer are 4 lanes of FDR infiniband at 14 Gbps each for an aggregate of 56Gbps on a switch, not chained.
On the Mac Pro with PCIe 3.0 the 16 lane slot will run at 8 GT/s per lane (roughly 128 Gbps, or 16 GB/s) dedicated vs 10 Gbps chained.
The Red Rocket wants to be in the PCIe x16 slot with at least 8 lanes of PCIe 2.0. It'll work over thunderbolt as seen in the video...but if it really wants 32 Gbps bandwidth and is sharing 10 Gbps it's going to get starved at the high end of its performance band.
Latency is also an issue.
So an 8" cube with a double slot and nothing but TB is not so great as a Mac Pro replacement at the high end because in the end TB will be the bottleneck...even the 20 Gbps version.
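A rough sketch of the bandwidth comparison, for anyone who wants to check the numbers. The per-lane rates and encoding overheads are the standard PCIe 2.0 (5 GT/s, 8b/10b) and PCIe 3.0 (8 GT/s, 128b/130b) figures; the 10 Gbps Thunderbolt figure is the one used in this thread, and protocol overhead beyond line encoding is ignored.

```python
# PCIe generation -> (transfer rate in GT/s per lane, encoding efficiency).
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}

THUNDERBOLT_GBPS = 10.0  # per-channel figure quoted in the thread


def pcie_gbps(generation: str, lanes: int) -> float:
    """Usable one-direction bandwidth in Gbps after line encoding."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt_s * efficiency * lanes


def share_of_demand(available_gbps: float, wanted_gbps: float) -> float:
    """Fraction of a card's wanted bandwidth that a link can supply."""
    return min(1.0, available_gbps / wanted_gbps)


if __name__ == "__main__":
    x16_gen3 = pcie_gbps("3.0", 16)
    x8_gen2 = pcie_gbps("2.0", 8)
    print(f"PCIe 3.0 x16: {x16_gen3:.1f} Gbps ({x16_gen3 / 8:.2f} GB/s)")
    print(f"PCIe 2.0 x8:  {x8_gen2:.1f} Gbps")
    print(f"TB covers {share_of_demand(THUNDERBOLT_GBPS, x8_gen2):.0%} of a PCIe 2.0 x8 card")
```

Under these assumptions a card that can actually fill 8 lanes of PCIe 2.0 gets less than a third of that over a single 10 Gbps link, before any sharing with other devices on the chain.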
Quote:
Originally Posted by Dick Applebaum
I agree!
How about modular boxes daisy chained together with thunderbolt and/or fiber optics...
Separate boxes with: RAM/CPUs; SSDs; GPUs; HDD RAIDS... mix or match these as needed to address current needs.
Apple already has software to manage this distributed computing system
Yes, this is what I've been seeing as well. Plug and play cluster computing. A stackable, gangable cube of raw power.
The problem here is that I don't see Apple going after large cluster installations, but rather trying to lock up the small workstation cluster market. So? Really we need to consider what is practical for Apple and its customers.
Think about what might be possible on a future machine, not about what is common these days. Intel, Micron, NVidia, AMD and other have some very interesting solutions coming to market in the near future. Knights Corner is just one part of the puzzle, 3D memory might be ready soon and other technologies are coming. I honestly believe that if Apple wasn't waiting on these technologies (one or more) we would have seen a far different Mac Pro update.
My understanding is that Knights Corner operates more or less as a coprocessor with a local OS. This should work very well with Apple's current infrastructure, that is, Grand Central Dispatch and OpenCL. The potential is there for people to be stunned. Sadly if Apple doesn't come up with something impressive the Mac Pro will be cooked.
It is interesting that the MBPR now has two TB ports. The obvious implication is that Apple sees more intense usage of TB. Even if we don't see clustering over TB, I suspect that we will be seeing 4 or more ports on the Mac Pro.
I still have a lot of interest in what is up with the Mini and to a lesser extent the iMac. Still hoping to see the same innovation effort put into the MBPR go into these machines. As such I hope they don't castrate the Mini again. With a Cluster capable Mac Pro there should be no reason to underpower the Mini.
Quote:
Originally Posted by wizard69
The problem here is that I don't see Apple going after large cluster installations, but rather trying to lock up the small work station cluster market.
My point is that the supercomputer nodes work because there is a high bandwidth interconnect between the nodes. Far higher bandwidth than TB.
Quote:
So? Really we need to consider what is practical for Apple and its customers.
What's "practical" for Mac Pro users is having one PCIe 3.0 x16, one PCIe 3.0 x8 and two PCIe 3.0 x4. That allows you to have 1 higher end GPU, 2 high throughput cards and a slot for an expansion chassis if need be.
What's not "practical" is expecting two 10 Gbps TB links to carry up to around 32 lanes worth of PCIe 3.0 bandwidth. That's around 32 GB/s worth vs 2.5 GB/s on the two TB.
Quote:
Think about what might be possible on a future machine, not about what is common these days. Intel, Micron, NVidia, AMD and other have some very interesting solutions coming to market in the near future. Knights Corner is just one part of the puzzle, 3D memory might be ready soon and other technologies are coming. I honestly believe that if Apple wasn't waiting on these technologies (one or more) we would have seen a far different Mac Pro update.
We're looking at the 2013 time frame.
Quote:
Originally Posted by wizard69
It is interesting that the MBPR now has two TB ports. The obvious implication is that Apple sees more intense usage of TB. Even if we don't see clustering over TB, I suspect that we will be seeing 4 or more ports on the Mac Pro.
I still have a lot of interest in what is up with the Mini and to a lesser extent the iMac. Still hoping to see the same innovation effort put into the MBPR go into these machines. As such I hope they don't castrate the Mini again. With a Cluster capable Mac Pro there should be no reason to underpower the Mini.
Again, TB is great for laptops, AIO and SFF computers. Not so great in comparison to having real slots in the Mac Pro. Each one burns 4 PCIe 2.0 lanes...which the MBPR had available but the Mac Pro is somewhat short. For example it would have been nice to have been able to configure it into two PCIe x16 slots...
4 TB ports means 8ish PCIe 3.0 lanes gone, which leaves fewer for the internal slots, and it's less flexible than being able to configure the slots the way you need them.
Meh...2 seems more likely. Remember also that if you're running a display on them the available bandwidth drops to 5.something Gbps.
A solid Mac Pro update to SB that is rack mountable but good looking would make many folks happy. A cube sized MP with esoteric hardware most software can't utilize not so much. If Apple wants to dabble with TB based clusters the mini is a far better platform to tinker with than to cripple the Mac Pro.
There's some work involved but developers are already doing it. Apple is using OpenCL in their software and it should run on Knights Corner. You'd still get one multi-core general CPU but the Knights Corner and high-end GPU are the only ways to properly tackle high resource workflows in a way that a second CPU just can't come close to.
CERN ported some code over to it for a benchmark:
Some people can port code over in a matter of hours:
No matter the effort required to port code, the benefits are worthwhile. For a community that requires the fastest performance no matter the cost, it's the best way to go.
Thunderbolt latency is 9ns, FDR is 160ns. The interconnects won't be an issue for performance at all - loads of supercomputers are using Gig-E interconnects.
I think they have a chance at both but primarily the workstation environment. With optical Thunderbolt, a creative office can connect every Mac Pro and even iMac and Mini across large distances to use the compute power in every machine. Submit a render job on one machine and it can be distributed across 10 other machines with a mix of i7, Xeon, Knights Corner, GPUs etc. If a GPU can be run over a Thunderbolt box, Apple can have the computers all work like a co-processor so it's transparent. Think of an office with 20TFLOPs of double precision compute for under $30k.
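As a back-of-the-envelope check on the "20 TFLOPs for under $30k" figure, here is a small sketch. It assumes the hypothetical $2999 cube from earlier in the thread with roughly 2.097 TFLOPs of double precision each (Xeon + Knights Corner + 7970), and it assumes perfectly linear scaling across machines, which real distributed jobs will not achieve.

```python
CUBE_PRICE_USD = 2999  # hypothetical price from the thread
CUBE_DP_TFLOPS = 2.097  # 150 + 1000 + 947 GFLOPs, as estimated in the thread


def cubes_within_budget(budget_usd: int) -> int:
    """How many whole cubes fit inside a budget."""
    return budget_usd // CUBE_PRICE_USD


def cluster_totals(cube_count: int) -> tuple[int, float]:
    """Total cost and peak DP TFLOPs, assuming ideal linear scaling."""
    return cube_count * CUBE_PRICE_USD, cube_count * CUBE_DP_TFLOPS


if __name__ == "__main__":
    count = cubes_within_budget(30_000)
    cost, tflops = cluster_totals(count)
    print(f"{count} cubes: ${cost}, about {tflops:.1f} peak DP TFLOPs")
```

So the figure holds as a peak number for ten cubes; how much of it a render job would actually see depends on the interconnect and the software.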
This would be amazing for marketing. It has a great identity being a Cube, they can call it 'your own personal supercomputer' and it actually means something.
Would you need a Red Rocket with Knights Corner? The idea is that the massive compute performance will allow people to bypass a lot of custom hardware.
Quote:
Originally Posted by Marvin
There's some work involved but developers are already doing it. Apple is using OpenCL in their software and it should run on Knights Corner. You'd still get one multi-core general CPU but the Knights Corner and high-end GPU are the only ways to properly tackle high resource workflows in a way that a second CPU just can't come close to.
Sorta. You still need to overcome the bottlenecks associated with high parallelism for the 50-64 x86 cores in Knights Corner. To get peak performance you need to optimize your code for the 512 bit SIMD. Arguably easier for KNC than CUDA but there's quite a bit of work involved anyway. It's work that HPC programs are typically reasonably well suited for and that general purpose computing is not so much. How long did it take Adobe to handle multiple cores?
KNC I understand is pretty power hungry in comparison to GPUs. There are advantages and disadvantages between the shallow 6 stage pipeline in the KNC and the longer pipelines in GPUs (20+). It can be nice but there are always tradeoffs.
Quote:
No matter the effort required to port code, the benefits are worthwhile. For a community that requires the fastest performance no matter the cost, it's the best way to go.
Thunderbolt latency is 9ns, FDR is 160ns. The interconnects won't be an issue for performance at all - loads of supercomputers are using Gig-E interconnects.
The 160ns FDR latency is including the switch. What is the latency to the TB device at the end of the chain?
Interconnects are an issue for supercomputers but this is mitigated by the way the parallelism is structured. Communicating between compute nodes is also very different than talking to your GPU or other high speed PCIe cards.
Quote:
I think they have a chance at both but primarily the workstation environment. With optical Thunderbolt, a creative office can connect every Mac Pro and even iMac and Mini across large distances to use the compute power in every machine. Submit a render job on one machine and it can be distributed across 10 other machines with a mix of i7, Xeon, Knights Corner, GPUs etc. If a GPU can be run over a Thunderbolt box, Apple can have the computers all work like a co-processor so it's transparent. Think of an office with 20TFLOPs of double precision compute for under $30k.
This would be amazing for marketing. It has a great identity being a Cube, they can call it 'your own personal supercomputer' and it actually means something.
They could do that today over GigE but the software complexity is large in comparison to a dedicated render farm.
A GPU can run over a Thunderbolt box just very slowly in comparison to inside a Mac Pro on a x16 slot.
What you are suggesting really hoses the workstation environment by removing the flexibility of 4 PCIe slots...which was none too many to begin with.
Again, if they really wanted to do this then the Mac Mini server is a better form factor to start with. Make it a cube, put two TB ports on it, a top end Core i7, a decent GPU and a couple extra RAM slots. Then use 3rd party expansion chassis for GPGPUs and KNCs boards and provide the software framework to make it usable.
That's a hell of a lot better than crippling the Mac Pro unless you're going to provide dedicated 16 lanes PCIe 3.0 for an offboard chassis like an uber Cubix.
Many Pros need more slots and what PCIe 3.0 brings, not fewer. You use one slot and 16 lanes for your primary GPU, then another slot for a good SAS or FC card, then maybe one for a Decklink card. Then you want another GPU since CS6 and Resolve can use multiple GPUs but you also want a Rocket. Doh. Even if you HAD another slot you're likely out of PCIe 2.0 lanes without downgrading one of the other cards. Like using only 8 lanes for your primary GPU. Ugh.
If you're building a dump truck, build a dump truck. Not try to replicate it with a bunch of modular pickups.
Quote:
Would you need a Red Rocket with Knights Corner? The idea is that the massive compute performance will allow people to bypass a lot of custom hardware.
Asking this is like asking would you need a GPU with that massive compute performance from KNC...that was the whole idea of Larrabee in the first place.
In theory you could do the wavelet decompression and debayering R3Ds in GPGPU software since without a Rocket you're falling back to the CPU anyway but given that 3D, HDRx and multiple streams require multiple Rockets my guess is that Red looked at it and concluded that a software implementation in OpenCL or CUDA wasn't going to be good enough. It's not like high end video workstations aren't sporting Teslas today.
I am very much in the "MBPR and iMac are good enough for 90% of Pro use cases" camp. But I'm also very much in the "don't dumb down the Mac Pro into a fancy cube" camp. Those slots are highly critical for many high end use cases. TB is NOT a viable alternative. The case size doesn't f-ing matter except that it would be nice if it fit into a 3 or 4U slot in a rack.
Yes for a super computer you need state of the art bandwidth. We aren't talking super computer here but a small cluster of machines.
Beyond that how many generations and varieties of supercomputer interconnects have there been? There have been more than a few, some of which would never be practical for Apple. You seem to imply that TB ports and even an Infiniband port would use up all available lanes in a new Mac Pro. That is very unlikely especially with Intel supporting 40 lanes of PCI Express on its new chips. OK who suggested such a thing?
Yes 2013 is in the future. Intel's multi core technology is slated to arrive by the end of the year. NVidia's new compute engines arrive in that time frame too. So yeah 2013 is the future and in that time frame new hardware will be available to make a radically different workstation.
You won't be fitting such hardware into a Mini anytime soon.
As to software, there are already apps out there that strain current systems even with GPU acceleration and dual socket CPUs. Going to Sandy Bridge E alone would not have a serious impact on those apps. New processor technology and clustering might change the equation favorably. Further if an app doesn't need clustering support it will still run fine on one of these machines.
So basically, you don't understand.
Have fun using Windows, in that case.
Quote:
Originally Posted by Suddenly Newton
A leak is unsanctioned. This came directly from Tim. Doesn't get more official.
What about the Mac G4 Cube? Nope, still dead.
Mac Pro: still alive.
I never said the Mac Pro or the MacBook Pro were dead, just the 17" flavor. If it's dead, how about an official announcement? I don't mind piecing up the There's no excuse for not being upfront about products. They're usually closed-mouth about when product will be upgraded so it doesn't disrupt sales but the 17" was eliminated without explanation. It's one thing to hold off on an update or not announcing a coming update in an effort to continue sales but it's another thing entirely pulling a product without explanation.
Quote:
Originally Posted by wizard69
Most of your points are meaningless.
You think so because you don't get it.
Quote:
You seem to imply that TB ports and even an Infiniband port would use up all available lanes in a new Mac Pro. That is very unlikely especially with Intel supporting 40 lanes of PCI Express on its new chips.
Ideally you want to have two x16 3.0 lanes for two GPUs running full speed. That's 32 lanes by themselves leaving 8 for everything else. Seems too tight.
So assume you have one x16 lane and a desire for two x8 cards and one x4 card. That's 36 lanes out of 40.
You want 4 TB ports which is 8 lanes.
So even with PCIe 3.0 lanes are a precious commodity. 2 TB ports using 4 lanes seems like the max for the Mac Pro if you don't want to compromise what you can do internally with the 4 slots.
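The lane math above can be written out as a small sketch. It assumes 40 PCIe 3.0 lanes in total (the figure cited for the new Intel chips) and 2 PCIe 3.0 lanes per TB port, which is what "4 TB ports which is 8 lanes" implies; chipset-provided lanes are ignored.

```python
CPU_LANES = 40         # PCIe 3.0 lanes cited in the thread
LANES_PER_TB_PORT = 2  # assumption implied by "4 TB ports which is 8 lanes"


def lanes_needed(slot_widths: list[int], tb_ports: int) -> int:
    """Total PCIe 3.0 lanes consumed by the slots plus the TB ports."""
    return sum(slot_widths) + tb_ports * LANES_PER_TB_PORT


def fits(slot_widths: list[int], tb_ports: int) -> bool:
    return lanes_needed(slot_widths, tb_ports) <= CPU_LANES


if __name__ == "__main__":
    four_slots = [16, 8, 8, 4]  # one x16, two x8, one x4
    for ports in (0, 2, 4):
        used = lanes_needed(four_slots, ports)
        verdict = "fits" if fits(four_slots, ports) else "over budget"
        print(f"{four_slots} + {ports} TB ports: {used}/{CPU_LANES} lanes ({verdict})")
    # Two full-speed x16 GPUs leave only 8 lanes for everything else.
    print("left after two x16 slots:", CPU_LANES - lanes_needed([16, 16], 0))
```

On these assumptions the four-slot layout plus 2 TB ports uses exactly 40 lanes, while 4 TB ports pushes it to 44.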
Quote:
OK who suggested such a thing?
Marvin. A 8" cube with a single slot for GPU and everything else via TB.
Quote:
Reading things into my posts that I never said isn't helping your position. Supporting TB does not mean that slots have to disappear. There are plenty of PCI Express lanes to go around in a workstation. In any event looking at today's Mac Pro to determine what is or should be available on a future machine is foolish.
Yes, it does. The math above shows that. The current Mac Pro is anemic when it comes to lanes and slots. Moving to PCIe 3.0 helps that but not if you burn them on the TB ports.
The 8" form factor also forces the slots to disappear.
Quote:
How many slots you can manage alongside those TB ports is a function of the chipset selected. Without knowing the specifics you can't say for sure if you are losing or gaining.
If you want full speed x16 PCI-e 3.0 slots then the math is pretty simple for the SB Xeon E5-2600. 2 max. With 1 you end up with 24 left for 3 more slots. Whatever you take from them to implement TB means fewer x8 slot configurations with the primary x16 GPU.
Quote:
The serial nature of the ports means that more than one of anything can lead to contention and bandwidth sharing. Thus it makes even more sense to support 4 or more TB ports on a Mac Pro.
Again, it is far better in a tower to leave those lanes available to use in your slots. In a laptop, AIO or SFF there are no slots.
Quote:
This is certainly true. However such a machine will not move the industry forward. Nor will such a machine support modern software workloads optimally. In some cases apps wouldn't even be possible without OpenCL support, at least not in a realtime sense.
Modern WORKSTATION workloads are currently handled using PCIe cards like the Rocket, GPUs like Tesla, etc. These are all bandwidth hungry cards that state that they want more bandwidth than is available even if it had the entire TB link.
What you and Marvin favor is a machine that is less capable and more rigid. Not everyone will benefit from KNC and some might want something else, BUT you can drop a KNC card into an x8 slot. 2-3 if you want, if you have the bandwidth. Which you won't if you have 4+ TB ports.
Quote:
You won't be fitting such hardware into a Mini anytime soon.
Sure you can. It just has to be bigger to fit the KNC board and not fry itself from the heat. Something the size of the old mini might work. It's not like a 8" cube is that much bigger.
Quote:
As to software, there are already apps out there that strain current systems even with GPU acceleration and dual socket CPUs. Going to Sandy Bridge E alone would not have a serious impact on those apps. New processor technology and clustering might change the equation favorably. Further if an app doesn't need clustering support it will still run fine on one of these machines.
Except the box Marvin suggests can't have the same amount of GPU and dedicated hardware that the current Mac Pro can support running at the same speeds that they do today. TB would be a huge bottleneck in comparison to native PCIe 3.0 and I don't believe that bonding is in the spec.
You guys still want a freaking xMac...fine but don't go advocating wrecking the Mac Pro to get it. A single slot Mac Pro cube is only marginally better than the iMac.
Many industries would change drastically with so much power available on the desktop. This is especially the case if they can control costs on the machine. Controlling costs is why I expect a drastically overhauled physical design.
PCI Express has enough bandwidth to drive GPUs over 8 lanes. However you assume that unreleased products will be limited to X number of PCI lanes. I'm just not convinced that we know what is being released and with what chipsets. All engineering is a task in finding a balance between available materials and the goals you want to achieve. Apple could easily tackle this issue with three eight lane slots, which would leave lanes for TB and onboard devices.
Actually even though Marvin and I have similar ideas we are not actually looking at the same vision. If I was designing the next Mac Pro the GPU would be planted right on the motherboard as would be Knights Corner.
Only in your world view. Besides, your position is asinine: you are looking at old technology to inform you of what is possible on new technology. The current Mac Pro is a very old design to be doing that with. Sandy Bridge E comes with 40 lanes of PCI Express, I hardly think there will be a problem finding the right distribution of those lanes.
Well no it doesn't but that presupposes that the machine will be 8" in size.
yep
This is no big deal though. Well let's just say the tower has gone the way of the DoDo. Even so you still have lanes to make up a number of slot arrangements.
In your world view! If the GPU accelerator and Knights Corner are already on board your need for slots is minimized. That is not to say slots aren't needed, just that you change the dynamics by shipping standard hardware with high performance features baked in.
Not everyone benefits from a GPU. Your argument makes no sense. You build such platforms for running tomorrow's software, not today's.
As to the TB ports I really don't get what the emotion is all about. The hardware implements a crossbar switch so the lanes can be routed as needed. Further it would not be unheard of for one or two of those ports to end up being dedicated to driving video displays.
First you say you can't, then you can, then you can't. I really don't follow what your problem is. Please don't get hung up on a specific cube size either.
TB has plenty of bandwidth where it is needed. Plus with PCI Express 3 you have even more bandwidth to devices that need it.
Anybody with a reasonable understanding of workstation technology would realize that the current Mac Pro is a wreck! You are hung up on the box and frankly I see the current box as a negative when it comes to implementing a modern machine with a wide appeal to all the different high performance computer users out there.
In any event let's say that cube has two slots with a GPU and a Knights Corner on the motherboard, how is that not a high performance workstation? It would be for most definitions and frankly they could move the processors out to cards if they wanted but the reality is that cards are compromises. If you want to ship a block of computing power it is best to integrate it onto one board.
Quote:
Originally Posted by Slurpy
I like the fact that you completely ignored his question, and thereby proved his point. Most people bitching about the Mac Pro don't even need one based on what they do, and are bitching for the hell of it. I know people who do nothing but watch porn and needless customization to their computers and call themselves 'power users'. His question is what the current Mac Pro can't do, that you need it to do.
I ignored the question because it's irrelevant. Who is he (or you) to tell other users what they "need"? The point is that Apple telling me that I'm important to them while charging me full price for just-updated, "new" hardware that is already one to three years out of date is insulting. When I got my last Mac Pro, it came out a month or two before anyone else in the industry shipped a machine with those chips. THAT was making your customers feel important. Now, they upgrade a machine to year-old processors, when brand new processors are shipping from other vendors RIGHT NOW. That just makes your customers feel s**t on.
As for me personally, I agree with you. I DON'T need a Mac Pro. I never have, but I got one anyway. Why? Because I could afford it, and I do need more than an iMac can provide at a reasonable cost. Yes, I'm aware you can get a pile of Thunderbolt accessories, but they cost a small fortune. Not to mention buying a whole new display every time you upgrade the machine.
That's why I'm switching to Windows on my next upgrade cycle. While I'm not a big fan of Windows, I'm not an Apple fanboy, either, and I'm fluent in both. Both have their strengths and both have their warts. It was already hard enough justifying buying a Mac Pro when all you need is a souped-up iMac, but there's just no way I'm paying Apple's price for a Mac Pro that is already a half a cycle old compared to hardware I can get from their competitors. And after the recent Final Cut Pro debacle, I don't see the situation improving any time soon.
It's supposed to be 9ns across 7 devices but real-world scenarios may prove otherwise. It certainly should be under 9ns end-to-end across one device though.
Not very slowly. A GTX 680 and 7970 (the fastest consumer GPUs in the world) run at 73% and 86% of full performance respectively in a PCIe 1.x x4 slot:
http://www.overclock.net/t/1253914/tpu-ivy-bridge-pci-express-scaling-with-hd-7970-and-gtx-680
That performance drop is pretty much unnoticeable.
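To put rough numbers on the slot comparison, here is a minimal sketch of the raw link arithmetic. It assumes the published per-lane signalling rates and line encodings for each PCIe generation and the 10 Gbps raw rate of one first-generation Thunderbolt channel. It ignores packet and protocol overhead, so real throughput will be lower. The function and constant names are made up for illustration.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}

# Raw rate of one first-generation Thunderbolt channel (assumption: no overhead counted).
THUNDERBOLT_CHANNEL_GBPS = 10.0


def pcie_gbps(generation, lanes):
    """Usable one-direction bandwidth in Gbit/s, ignoring protocol overhead."""
    rate, efficiency = PCIE_GENERATIONS[generation]
    return rate * efficiency * lanes


for gen, lanes in [(1, 4), (2, 8), (3, 16)]:
    gbps = pcie_gbps(gen, lanes)
    ratio = gbps / THUNDERBOLT_CHANNEL_GBPS
    print(f"PCIe gen {gen} x{lanes}: {gbps:.1f} Gbit/s ({gbps / 8:.2f} GB/s), "
          f"{ratio:.1f}x one Thunderbolt channel")
```

By this estimate a gen 1 x4 link carries about 8 Gbit/s, which is the same ballpark as a single Thunderbolt channel, so the benchmark above is a fair stand-in for a GPU hanging off Thunderbolt.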
Knights Corner can use more general purpose code though.
My Cube would have 6 Thunderbolt ports with 16 lanes for the GPU. Although they are lower bandwidth, I'd say 6 TB ports is better than 3x PCI slots. Some cards will prevent you even using all 3 slots.
I think the Knights Corner (or more likely Knights Landing if late 2013) would be on the motherboard but Apple put the iMac GPU in a slot so I figure they'd do the same with the Mac Pro. Either way is good though.
I don't see there being a huge bottleneck. Thunderbolt is a multi-protocol connection. If you need channel bonding, it can be implemented just like it is on a fibre channel PCI board. Thunderbolt is external PCI. Whatever you use internal PCI for can be used for external PCI. Here is a 10Gbps ethernet Thunderbolt box with link aggregation:
http://attotech.com/products/product.php?scat=32&sku=TLNT-1102-D00
Nah, the xMac was always the cheaper i7 box that could never happen. This is rethinking what a workstation-class machine should represent. It shouldn't be hacking together ugly PCI cards and only having Xeons that take forever to improve in performance.
6 Thunderbolt ports = lots of ports for expansion if you need it and/or up to 6 displays.
powerful GPU that is the best in class and well-supported
Knights Landing - 1.5-2 TFLOPs of double precision semi-general purpose computing for high resource workloads like video encoding/decoding and rendering
6-10 core Xeon for reliable general purpose tasks that can't be or don't need to be accelerated by the co-processor.
Affordable - $2999.
Scalable - just hook more together. No matter if Intel screw up their rollout again, the performance can be scaled linearly.
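As a back-of-envelope check on the scaling claim, the sketch below multiplies out the figures from the list above. It assumes ideal linear scaling, counts only the co-processor's double precision throughput (not the GPU or Xeon), and ignores any interconnect overhead between cubes, so treat the result as an upper bound. The names are hypothetical.

```python
CUBE_PRICE_USD = 2999            # price proposed in the list above
COPROCESSOR_TFLOPS = (1.5, 2.0)  # Knights Landing DP range quoted in the list above


def cluster_estimate(cubes):
    """Return (cost, low TFLOPs, high TFLOPs) assuming ideal linear scaling."""
    low, high = COPROCESSOR_TFLOPS
    return cubes * CUBE_PRICE_USD, cubes * low, cubes * high


for n in (1, 2, 4):
    cost, low, high = cluster_estimate(n)
    print(f"{n} cube(s): ${cost:,} for {low:.1f}-{high:.1f} DP TFLOPs peak")
```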
However, it appears that they recognize that they screwed up, thus the "leaks" about a new Mac Pro coming in 2013. If the advance is as strong as rumored, it might not be a bad idea to put off the platform change. It certainly was an incredibly stupid move on Apple's part. However, maybe the decision was made to encourage people to stay away from the Mac Pro until they have deliverable replacement hardware.
The problem isn't so much the hardware as the lack of communication about why it was so poorly updated. This is an example of Apple's secrecy working against them. Few of us actually need any of Apple's hardware. In my case, one of the reasons I support the xMac concept is that we do need a desktop computer that is better than the iMac instead of being forced to choose between the Mini, the Mac Pro and the laptops.
The third alternative is Linux.
In any event, I'd suggest not leaving the fold too quickly. It appears that Apple realizes that they screwed up with this micro update to the Mac Pro. At this point their only choice is to ride out the hostility until they get the next generation hardware on the market. Speaking of which, I'm rather put off by their excessive focus on the laptop segment with little effort put into innovating on the desktop. Apple's desktop lineup right now is pure crap from the Mini on up, so it is no wonder sales suck.