Um, didn't Apple buy a Cray supercomputer back during the wacky 90's when it was thinking about doing its own chip design? As I recall, didn't turn out to be the best investment...
Here is some new cluster news that I thought was pretty interesting:
I don't think Apple needs to make their own cluster. But they really should come out with G5 Xserve and then a scalable package for Universities to repeat the big mac thing. But with Racks.
With a standard cooling and layout system I bet they could significantly cut the cost from the Big Mac level.
Comments
Originally posted by Henriok
The etc. part, yes, but AltiVec, no, AFAIK. There might be some cool stuff in IBM's compilers for autovectorizing some things in Linpack, but I think the numbers we see from the VT cluster are mostly the double FPUs talking.
Does anyone have an estimate of how much more could be added to the final score if AltiVec were factored in?
[9:00 AM CST] TMO Reports - Big Mac Passes 10 TFlops, #3 Ranking Not Expected To Change
by Misha Sakellaropoulo
Virginia Tech's "Big Mac" Power Mac G5 cluster has secured its place as the third fastest supercomputer in the world. According to the latest preliminary numbers from the Top 500 Supercomputer Sites, Big Mac is hitting 10.28 TFlop/s on the LINPACK benchmark, giving it a 19 percent performance lead over the fourth-place supercomputer. Final numbers will be released November 17 at the Supercomputer Conference in Phoenix.
"I don't expect the top five to change [from the November 2 numbers]," Dr. Jack Dongarra, Director of the Innovative Computing Laboratory at the University of Tennessee, told The Mac Observer. Dongarra is one of the authors who maintain the Top 500 list.
Just last week, Big Mac was hitting 9.56 TFlop/s, and system architect Srinidhi Varadarajan told Wired he was hoping for another 10 percent boost "shortly." With the November 2 numbers, the team has almost achieved that goal, and there could still be time for more improvement.
Theoretical performance of 17.6 TFlop/s
"They're hitting 58 percent of the peak performance, which is fairly good," Dongarra said.
The RPeak value, or theoretical maximum performance, is calculated by adding together the theoretical performance of all the processors involved. In Big Mac's case, the cluster comprises 1,100 Power Mac G5s, each with two G5 processors. Each G5 processor features two floating point units, and each floating point unit can perform one fused multiply-add, counted as two floating point operations, per cycle. With each processor running at 2 billion cycles per second, the result is theoretically 8 GFlop/s per processor. This is, interestingly enough, identical to the theoretical performance of each processor used in NEC's Earth Simulator, the world's fastest supercomputer.
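The peak-performance arithmetic in the paragraph above can be reproduced with a short Python sketch. The constant and function names are illustrative only, and the figures are the ones quoted in the article (two floating point operations per FPU per cycle, 10.28 TFlop/s measured).

```python
# Sketch of the Rpeak arithmetic, using only figures quoted in the article.
FPUS_PER_CPU = 2            # floating point units per G5 processor
FLOPS_PER_FPU_CYCLE = 2     # operations per FPU per cycle, as the article counts them
CLOCK_HZ = 2_000_000_000    # 2 billion cycles per second
NODES = 1_100               # Power Mac G5s in the cluster
CPUS_PER_NODE = 2


def rpeak_per_cpu_gflops() -> float:
    """Theoretical peak of one processor, in GFlop/s."""
    return FPUS_PER_CPU * FLOPS_PER_FPU_CYCLE * CLOCK_HZ / 1e9


def rpeak_cluster_tflops() -> float:
    """Theoretical peak of the whole cluster, in TFlop/s."""
    return rpeak_per_cpu_gflops() * NODES * CPUS_PER_NODE / 1e3


def efficiency(rmax_tflops: float) -> float:
    """Fraction of the theoretical peak reached by a measured LINPACK score."""
    return rmax_tflops / rpeak_cluster_tflops()


if __name__ == "__main__":
    print(f"per processor: {rpeak_per_cpu_gflops():.1f} GFlop/s")
    print(f"cluster Rpeak: {rpeak_cluster_tflops():.1f} TFlop/s")
    print(f"efficiency at 10.28 TFlop/s: {efficiency(10.28):.0%}")
```

With the article's inputs this gives 8 GFlop/s per processor, 17.6 TFlop/s for the cluster, and about 58 percent efficiency, matching the quoted numbers.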
Thanks to its supercomputer-specific architecture, however, the Earth Simulator is able to hit 86 percent of its theoretical peak, or 35.86 TFlop/s. The Earth Simulator, which went live March 11, 2002, cost an estimated $350 million ($9.77 million per TFlop/s). Compare that with Big Mac's $5.2 million price tag (or about $500,000 per TFlop/s), however, and it's clear the G5 cluster is the value leader.
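As a rough check of the price comparison above, here is a small Python sketch that divides each system's quoted cost by its quoted LINPACK score. The function name is hypothetical; the inputs are the article's estimates, so the results are only as good as those estimates.

```python
def cost_per_tflops(cost_million_usd: float, rmax_tflops: float) -> float:
    """Cost in millions of US dollars per measured TFlop/s."""
    return cost_million_usd / rmax_tflops


# Figures quoted in the article.
earth_simulator = cost_per_tflops(350.0, 35.86)
big_mac = cost_per_tflops(5.2, 10.28)

if __name__ == "__main__":
    print(f"Earth Simulator: ${earth_simulator:.2f} million per TFlop/s")
    print(f"Big Mac:         ${big_mac:.2f} million per TFlop/s")
    print(f"ratio:           {earth_simulator / big_mac:.1f}x")
```

Straight division gives roughly $9.76 million per TFlop/s for the Earth Simulator, a hair under the article's $9.77 million, and roughly $0.51 million for Big Mac, so on these figures the G5 cluster comes out around 19 times cheaper per TFlop/s.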
The fact that Virginia Tech has managed to put together such a cluster in a matter of months with off-the-shelf parts is also impressive. "What's interesting is they're using a new system with a new processor and new interconnects, and the system is based on a processor not designed for scientific computing," Dongarra noted.
Much of the code for the cluster also had to be written from scratch by Varadarajan and his team at Virginia Tech, while other pieces had to be ported over to Mac OS X.
While the performance of Big Mac could still be tweaked, don't expect a G5 cluster--or really any kind of cluster--to surpass the Earth Simulator any time soon. "If you look at the Top 5 on LINPACK, they each have different processors, and four use commercial processors. These are processors not specifically designed for scientific research," Dongarra said.
A good deal of the Earth Simulator's ability to hit 86 percent of its theoretical peak can be attributed to its supercomputer architecture, which allows it to move data around its processors more quickly than commercial processors and interconnects can. There's also a law of diminishing returns at play when you add processors, and the Earth Simulator features 5,120 of them -- almost three thousand more than Big Mac.
Just a benchmark, but an important one
"An important thing to remember is that this benchmark measures just one problem," Dongarra cautions. "Real applications cover many things, so you need to be careful in making one concrete statement about performance."
Concrete or not, Big Mac's No. 3 rank on the Top 500 will garner Virginia Tech, Apple, and IBM plenty of publicity. For its part, Virginia Tech expects to make back many times its investment in the cluster through research that companies will pay the school to perform. The school also plans to release full details of its cluster and the software it uses, which should pave the way for plenty more G5 clusters to be assembled around the world.
Originally posted by Rhumgod
Configured with dual drives, a must:
Item: Xserve Dual 1.33GHz
Part No: Z09P
Est Ship: 3-5 bus.days
$ Each: $4,049.00
Qty: 999 (the most you can enter)
Total: $4,044,951.00
Yeah, baby!
Apple's share price jumped up for a split second there until you hit 'cancel'.
Originally posted by Powerdoc
Apple won't create its own cluster: it's a waste of money, and it would be seen as a megalomaniac demonstration.
How can Apple develop and test clustering technologies/Xgrid if it doesn't have a cluster of its own? It is quite possible that it has such a cluster locked up deep in the bowels of 1 Infinite Loop.
Originally posted by McCrab
How can Apple develop and test clustering technologies/Xgrid if it doesn't have a cluster of its own? It is quite possible that it has such a cluster locked up deep in the bowels of 1 Infinite Loop.
It can have a small cluster of, let's say, 10 G5s: that's enough to develop clustering technologies. But I don't think the initiator of this topic was referring to such a small cluster.
However, the Virginia Tech G5 cluster will bring a lot of free advertising for Apple and promote G5 clustering all around the world. Imagine that Apple built its own cluster and it was less efficient than Varadarajan's: what a joke.
An Apple-built cluster would have made sense only if the Virginia Tech one did not exist. And if Apple built its own cluster, you would see billions of people around the web claiming that the benchmarks were tweaked by Cupertino, as always. I can hear PC aficionados claiming it's another demonstration of Apple's distortion field.
Originally posted by McCrab
How can Apple develop and test clustering technologies/Xgrid if it doesn't have a cluster of its own? It is quite possible that it has such a cluster locked up deep in the bowels of 1 Infinite Loop.
I am fairly sure that Hardware design is in building 4.
IBM to release first PowerPC 970 blades
By Nick dePlume, Publisher and Editor in Chief
November 10, 2003 - Sources said that IBM is on the verge of releasing the first of its PowerPC 970-based "blades."
While sources couldn't confirm all details, the first product, reportedly called the JS20 BladeCenter module, is said to include a dual-1.6 GHz PowerPC 970 with 256MB RAM in its standard configuration.
It will carry a price tag of $2700. Sources said the form factor will be similar to that of IBM's Xeon-based blades.
While IBM plans to offer support for the AIX operating system down the road, the first blades will run only Linux. AIX support will reportedly be offered next year.
In the past, IBM has projected PowerPC 970 blades at speeds ranging from 1.8 GHz to 2.5 GHz.
Over the summer, eWEEK reported on the forthcoming blades, noting that IBM has developed a three-year plan for producing the 64-bit blades for the "enterprise entry level."
Originally posted by mello
Just read this from Think Secret:
I did some quick calculations:
One rack filled with Xserves:
1x Xserve = $3799
41x Xserve Cluster Node (41 * $2799)
Sum: $ 118 558
84 1.33 GHz G4-processors
One rack filled with JS20 blade modules:
6x BladeCenter chassis (6x $2789)
84x JS20 blade module (84x $2699)
Sum: $ 243 450
For twice the price of a filled rack, you get twice as many processors, and they are quite a bit more powerful too.
168x PowerMac G5 1.6 (168x $1999)
Sum: $ 335 832
That's if you want the same number of processors from Apple. The IBM solution is clearly cheaper.
But compared to the VT cluster it's another story. 2 GHz is 25% more than 1.6 GHz, so I compensated for that.
1375 blades (with 2 processors each)
16 full racks + 2 full chassis + 1 chassis with 3 modules:
(16 * $243 450) + (2 * $40 575) + $2 789 + (3 * $2 699)
Sum: $ 3 987 236
1100 Power Mac G5
1100 * $2999
Sum: $ 3 298 900
I did not buy any additional RAM, nor did I count any discounts. This is the off-the-shelf price. The IBM solution is quite a lot more expensive. But it'll take just 17 standard racks, compared to the VT cluster, which took 122 extra-wide racks. If density is an issue, IBM might be the way to go.
Think Secret also mentions earlier statements from IBM regarding four-way 2U servers. That might be the best solution if density is an issue, but I guess that'll cost you an arm and a leg. IBM isn't cheap.
Which brings me to Apple's future offerings.
It's quite clear that if IBM can put two 1.6 GHz G5 processors in a blade module, Apple can surely do the same, or even better, in a 1U case. As I've said in other threads, this isn't something that's impossible to do. Apple will do it, and probably cheaper than IBM. Expect some quite impressive Xserve G5s from Cupertino in the future.
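The rack and cluster sums in the post above can be re-derived with a short Python sketch. The prices and the 14-blades-per-chassis, 6-chassis-per-rack layout are taken from the post and the quoted articles; the function names are made up for illustration, and the clock-speed scaling is the same simple 2.0/1.6 adjustment the poster used, not a real performance model.

```python
# List prices in US dollars, as quoted in the thread.
XSERVE = 3799
XSERVE_CLUSTER_NODE = 2799
BLADE_CHASSIS = 2789
JS20_BLADE = 2699
POWER_MAC_DUAL_2GHZ = 2999

BLADES_PER_CHASSIS = 14
CHASSIS_PER_RACK = 6


def xserve_rack_cost() -> int:
    """One full Xserve plus 41 cluster nodes in a rack."""
    return XSERVE + 41 * XSERVE_CLUSTER_NODE


def chassis_needed(blades: int) -> int:
    """Chassis count, rounding up for a partly filled chassis."""
    return -(-blades // BLADES_PER_CHASSIS)


def blade_cost(blades: int) -> int:
    """Blades plus however many chassis are needed to hold them."""
    return chassis_needed(blades) * BLADE_CHASSIS + blades * JS20_BLADE


def racks_needed(blades: int) -> int:
    """Rack count, rounding up for a partly filled rack."""
    return -(-chassis_needed(blades) // CHASSIS_PER_RACK)


# 1,100 dual 2.0 GHz machines scaled to dual 1.6 GHz blades (integer MHz math).
equivalent_blades = 1100 * 2000 // 1600

if __name__ == "__main__":
    print("Xserve rack:     ", xserve_rack_cost())
    print("JS20 rack:       ", blade_cost(84))
    print("blades needed:   ", equivalent_blades)
    print("JS20 cluster:    ", blade_cost(equivalent_blades))
    print("racks:           ", racks_needed(equivalent_blades))
    print("Power Mac G5s:   ", 1100 * POWER_MAC_DUAL_2GHZ)
```

Run as written, this reproduces the post's totals: $118,558 and $243,450 per rack, 1,375 blades in 17 racks for $3,987,236, against $3,298,900 for 1,100 Power Mac G5s.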
By Robert McMillan, IDG News Service MacCentral
IBM Corp. on Tuesday will unveil a new line of low-power blade servers based on the same 64-bit PowerPC 970 processor that Apple Computer Inc. uses in its Power Mac G5 computers.
The new system, called the eServer BladeCenter JS20, will be IBM's first 64-bit blade offering, joining the 32-bit Xeon HS20 systems IBM is already shipping.
The JS20 will come with a lower price tag than its 32-bit Xeon relatives, IBM said. Dual-processor systems will be priced starting at US$2,699, one dollar less than the starting price of the HS20, said Jeff Benck, IBM's vice president of eServer BladeCenter.
"The processor was designed to be a cost-effective, high-performance processor," he said. "I won't say it was developed uniquely for blades, but it's extremely well suited for blades."
IBM is targeting the new blades at the high-performance computing space, and when the systems ship in the first quarter of 2004, they will ship with the Linux operating system that is coming to dominate high-performance computing, Benck said. "Because of the 64-bit capability and the strong floating-point performance we see it as a natural for Linux clusters and the high-performance space," he said.
The PowerPC 970 is already a proven commodity in high-performance computing. In September, Virginia Polytechnic Institute and State University announced plans to build a $5.2 million 1,100-node G5 cluster for scientific research at the University, that it expects to be the third most powerful supercomputer in the world.
The JS20s will fit into the same 14-blade chassis as the HS20 blades. They will ship with 1.6GHz processors and a standard configuration will have 512M bytes of memory, dual Gigabit Ethernet connections, and an ATA-100 IDE (Integrated Drive Electronics) controller that will support up to two 40G-byte hard drives.
IBM plans to boost the 970's clock speed to 2.4GHz in mid-2004, around the same time that the blades will support the AIX operating system. The company is also planning to add new 10G bit per second (bps) Ethernet connections and 4G bps Fibre Channel interfaces later in 2004, Benck said.
And also:
IBM Builds Supercomputer Based on Gaming Chip
Fri Nov 14,12:01 AM ET
SAN FRANCISCO (Reuters) - IBM Corp. (NYSE:IBM) said on Friday that it has built a supercomputer the size of a television based on microchip technology to be used in gaming consoles due out next year.
IBM said the supercomputer, which can perform two trillion calculations per second, is a small-scale prototype of the Blue Gene/L supercomputer that it is building for the Lawrence Livermore National Laboratory in California.
The computer made it onto the Top 500 supercomputer list, which is compiled by a member of the University of Tennessee's computer science department.
IBM vice president of technology and strategy Irving Wladawsky-Berger said that the supercomputer used 1,000 microprocessors that are based on PowerPC microchip technology. The PowerPC chip is currently used in Apple Computer Inc. (Nasdaq:AAPL) computers.
It is also the technology that will be the foundation of the next generation of gaming consoles from Nintendo Co. (7974.OS) and Sony Corp. (6758.T), which IBM is working on, he said.
He said the chips were less expensive and consumed less power than traditional microprocessors, making it possible to pack the same amount of computing power into a smaller space. Producing the chips in volume for gaming will help offset the costs of building supercomputers, he said.
Originally posted by sCreeD
I'm sure VT got a discount on such a huge order.
Screed
From a Wired interview with Varadarajan.
"Varadarajan said Apple provided significant technical help and gave Virginia Tech some of the first G5s off the production line, but the college paid full price for the machines, which cost $3,000 apiece."
That said, Apple would be foolish not to offer discounts to future Cluster Builders.
-zip
With a standard cooling and layout system I bet they could significantly cut the cost from the Big Mac level.
Scientists would love it.
Everyone is quoting the sub-US$3,000 price for a dual PPC970 blade...
But no one is taking into account the US$3,000 cost of the actual blade chassis that these blade servers slide into!
So, if you want to get into a blade server, the starting price is actually about US$6,000 or so...
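To put a number on the chassis overhead, here is a minimal Python sketch of the effective per-blade price as a chassis fills up. It assumes the round US$3,000 chassis figure from the post above, the US$2,699 blade price, and the 14-blade chassis mentioned in the IDG article; the function name is hypothetical.

```python
CHASSIS_PRICE = 3000      # round figure from the post above (assumption)
BLADE_PRICE = 2699        # starting price quoted for a dual-processor JS20
BLADES_PER_CHASSIS = 14   # chassis capacity quoted in the IDG article


def effective_price_per_blade(blades: int) -> float:
    """Blade price plus an equal share of one chassis."""
    if not 1 <= blades <= BLADES_PER_CHASSIS:
        raise ValueError("a single chassis holds 1 to 14 blades")
    return BLADE_PRICE + CHASSIS_PRICE / blades


if __name__ == "__main__":
    for n in (1, 2, 7, 14):
        print(f"{n:2d} blade(s): ${effective_price_per_blade(n):,.0f} each")
```

On these assumptions the first blade effectively costs US$5,699, but the overhead shrinks to a little over US$200 per blade once the chassis is full.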