Project Wolf/XGrid
With the long-anticipated release of the 970, the 64-bit goodness, and 10.3, let's talk about computing grids, clusters, and distributed computing, and in particular the rumored Wolf project. The last I heard of it was several months ago, in a thread that was itself simply a rehash of information from allenmcjones (?).
It seems completely reasonable for Apple to take added advantage of the 970 by advertising it as a plug-and-play clustering computer, especially once it reaches the Xserves. What does anyone else anticipate with Wolf? I've spent some time getting myself familiar with the terms and technologies behind some clustering/grid systems. I'll post later with some of that info. What do you expect?
I thought of this while I was posting: apparently Apple has trademarked XGrid.
Comments
(But neurons get replaced without the help of techies.)
Imagine rendering a giant 3D file and having a unit crash. The whole rendering becomes flawed. But with Xgrid, it would instantly detect that the unit is down, alert administrators, and delegate the work to another unit. Failproof.
Interesting work on Grid Computing.
OS10geek- From what I read, 3D doesn't lend itself to distributed computing unless you're talking about individual frames. But then your point stands.
I mean, your uses are great, but uncommon. Apple is consumer oriented, with a reach for Hollywood and strong ties to publishing. That said, the information from "jones" stated that the team all had PhDs and were considered to have written the book on clustering. So I think your uses will be feasible.
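To make the per-frame idea concrete, here is a minimal sketch of handing frames out to units and re-delegating a frame when its unit turns out to be down. Nothing here is Apple's design: the function names, the worker names, and the use of `ConnectionError` to signal a dead unit are all assumptions made for illustration.

```python
from collections import deque


def render_frames(frames, workers, render):
    """Hand each frame to a worker; if the worker fails, drop it and requeue the frame.

    `render(worker, frame)` is a hypothetical callable that raises
    ConnectionError when the unit is down.
    """
    pending = deque(frames)
    alive = list(workers)
    results = {}
    turn = 0
    while pending:
        if not alive:
            raise RuntimeError("no workers left to finish the job")
        frame = pending.popleft()
        worker = alive[turn % len(alive)]  # simple round-robin
        try:
            results[frame] = render(worker, frame)
            turn += 1
        except ConnectionError:
            alive.remove(worker)   # treat the unit as down
            pending.append(frame)  # delegate the frame to another unit
    return results, alive


def fake_render(worker, frame):
    # Stand-in for a real renderer: one unit is permanently down.
    if worker == "xserve-2":
        raise ConnectionError(f"{worker} is down")
    return f"frame {frame} rendered on {worker}"


results, alive = render_frames(range(6), ["xserve-1", "xserve-2", "xserve-3"], fake_render)
```

Every frame still gets rendered, and the dead unit simply drops out of the rotation. A real system would also need timeouts and the administrator alert, which this sketch leaves out.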
macserverx, you do realize that the network overhead for 200 Macs randomly clustered for various user tasks will be huge, right? I mean, even Gigabit Ethernet would be slow. Talk about packet collisions.
gigawire has still not been explained
This illustrates the difference between distributed computing and clustering: In clustering, you basically have a dedicated high-speed network for the various nodes to exploit. With a distributed solution, the chunks of data are larger, and the cooperation is looser. An ad hoc, Rendezvous network would be best served by a distributed model, with clustering reserved for racks of Xserves and the like. And, of course, the two aren't mutually exclusive: a cluster of Xserves could be a node in a distributed network.
In my own defense, I specifically stated that the network must be switched. A properly designed switched network will have zero, zilch, nada collisions. A switch creates a virtual circuit between the two nodes for the duration of the transmission. Now, if someone puts a hub somewhere, or we have AirPort, the computers on the hub and on AirPort will cause collisions and corruption, plus a broadcast to clear the network of traffic. But in seconds it would be back to normal. I'm preparing for my CCNA (Cisco Certified Network Associate), so I have a good idea what I'm talking about.
-----
Now to my promised contribution. I expect to see a System Preference for Wolf/XGrid. By default, computers will be configured as both Managers and Workers. A Manager sends out processes to Workers for processing; a Worker processes tasks and sends the results back to the Manager to use. Rendezvous is fully integrated into Wolf to provide dynamic resource discovery and feedback. There would be an option for, maybe, 128-bit encryption when transferring processes across insecure or untrusted networks. It would also allow the user to specify the times during which a Worker can be used for Work.
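As a rough sketch of the Manager/Worker split described here (purely my own guess at the shape, not anything Apple has announced), each machine can carry two flags, and a Manager just farms tasks out to whichever machines are flagged as Workers. The `Node` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    manager: bool = True  # assumed default: every Mac is both
    worker: bool = True

    def process(self, task):
        # A Worker runs the task and sends the result back.
        func, arg = task
        return func(arg)

    def dispatch(self, tasks, nodes):
        # A Manager sends tasks out to Workers and collects the results.
        if not self.manager:
            raise PermissionError(f"{self.name} is not configured as a Manager")
        workers = [n for n in nodes if n.worker and n is not self]
        if not workers and self.worker:
            workers = [self]  # fall back to doing the work locally
        if not workers:
            raise RuntimeError("no Workers available")
        return [workers[i % len(workers)].process(t) for i, t in enumerate(tasks)]


lab = [Node("imac-1"), Node("imac-2"), Node("xserve", manager=False)]
squares = lab[0].dispatch([(lambda x: x * x, n) for n in range(5)], lab)
```

Here `imac-1` acts as the Manager and spreads five small tasks across `imac-2` and `xserve`. Real tasks would travel over the network, optionally encrypted, rather than being called in-process.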
When a Worker becomes available to the network (either option just enabled or put on the network), Rendezvous sends out an advertisement, providing the configuration details. For computers set to be available only at certain times, Rendezvous will advertise every time the Worker becomes available, with configuration, start and end times. If the end time is changed during this period a new advertisement will be sent stating the original availability and the new Working times.
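The availability rules above could be modeled roughly like this. It is only a sketch: the `Advertisement` record, its fields, and `change_end` are names I made up, and a real Rendezvous advertisement would be carried in DNS-SD records rather than Python objects.

```python
from dataclasses import dataclass, replace
from datetime import time


@dataclass(frozen=True)
class Advertisement:
    worker: str
    cpu_mhz: int  # stand-in for "configuration details"
    start: time
    end: time

    def available_at(self, now: time) -> bool:
        if self.start <= self.end:
            return self.start <= now < self.end
        # Window wraps past midnight, e.g. 22:00 to 06:00.
        return now >= self.start or now < self.end


def change_end(ad: Advertisement, new_end: time) -> dict:
    """Build the follow-up advertisement: original window plus the new one."""
    updated = replace(ad, end=new_end)
    return {
        "worker": ad.worker,
        "original": (ad.start, ad.end),
        "new": (updated.start, updated.end),
    }


ad = Advertisement("lab-g4", 1250, time(22, 0), time(6, 0))
notice = change_end(ad, time(7, 30))
```

With this window, `ad.available_at(time(23, 30))` is true and `ad.available_at(time(12, 0))` is false, so a lab machine set up this way would only take Work overnight.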
For new Managers (just brought online or option just enabled), an advertisement for Workers will be made. To prevent hundreds or thousands of computers from responding, the computer will choose an address range of maybe 16 addresses, and these computers will send Worker information from their own tables, up to a certain limit to prevent huge files from transferring.
The Condor project (preparing for open-source release) has what they call Flocking. This would allow a computer to use a computer it cannot itself see via Rendezvous: other subnets, basically, if I understand enough about the current implementation of Rendezvous. (According to the ZeroConf website, Apple's implementation is not complete, and I don't know whether a full implementation will make non-local network discovery possible.) Other networks or subnets would be configured in the Preferences as well, in a table listing the available networks (pools) of computers. I'm thinking of a table because of the passwords, etc., needed to access resources in that pool. I'll see if I can cook up a little ASStudio interface for it.
The Echelon Grid provides a very good philosophy for grid management. They want user accessibility and manageability, automatic discovery of resources, and minimal user intervention, all of them Apple hallmarks.
This technology along with NetBoot will allow for the Tablet and absolutely dumb IQ 10 terminals (aka a Tablet standing up, with mouse and keyboard).
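One way to picture the "16 addresses" idea, under my own assumptions: the new Manager asks only the 16-address block (a /28 in IPv4 terms) that contains its own address, and each responder caps how many table entries it sends. The function names and the cap of 32 entries are made up for the example.

```python
import ipaddress


def responder_block(subnet: str, manager_ip: str):
    """Return the 16-address (/28) block of `subnet` that contains the Manager."""
    net = ipaddress.ip_network(subnet)
    me = ipaddress.ip_address(manager_ip)
    for block in net.subnets(new_prefix=28):  # IPv4 only; each /28 holds 16 addresses
        if me in block:
            return block
    raise ValueError(f"{manager_ip} is not inside {subnet}")


def share_table(worker_table: list, limit: int = 32) -> list:
    """A responder sends at most `limit` entries from its own Worker table."""
    return worker_table[:limit]


block = responder_block("10.0.1.0/24", "10.0.1.37")
reply = share_table([f"worker-{i}" for i in range(100)], limit=32)
```

So a Manager at 10.0.1.37 would only hear from machines in 10.0.1.32/28, and each of those would send at most 32 entries no matter how large its table has grown.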
links from this post
http://www.geocities.com/echelongrid/
http://www.condorproject.org/ - redirected URL
-----
There are of course lots of logistical problems for which I have not had the time to thoroughly evaluate solutions. But I believe I've added some fuel to the fire.
http://www.apple.com/shake/specs.html
Shake Qmaster
* Network render management solution for Mac OS X, maximizes utility of existing resources, increasing ROI of Mac equipment
* Distributed rendering provides uninterrupted workflow for artists by offloading processor intensive tasks to other computers
* Qmaster can create multiple "clusters" of Apple G4 computers in order to create pools of processing power dedicated to specific jobs, artists or applications.
* Fault-tolerant architecture ensures successful job completion and accurate results, even in the event of resource deallocation
* Optimized usage of network resources through load-balancing algorithms ensures efficient operation and timely job completion.
* Compatible with third party command line rendering applications running on Mac OS X.
This explains the XServe Cluster nodes.
What I think is significant here is the fact that it is compatible with third party renderers. This could make it possible to use existing Sun and Intel (Pixar) renderfarms until the budget gets a boost. Something else that is important for maximizing processing power is Darwin-based rendering: you could boot your XServes using Darwin and X11, cutting out the overhead of Aqua and allowing remote administration through X11.app (or a framework, since it should be built in and a System Preference).
edit:
I watched the Tour Movie on the rendering. According to it, the support for command line rendering applications means Maya will be able to take advantage of the distribution manager.
And Rendezvous is used to discover licensed PowerMac G4s or XServes. No G3s or even iMac G4s are ever mentioned. But we do know Apple has the technology, now it just needs to be moved to a consumer market.
How do distributed.net, SETI, or any of those distributed protein folding apps work? I didn't claim to be an expert, just that I had done my research. Also, look at my profile. High school student.
Well, if you go on the model of how I assume d.net and SETI work, packets from Maya, Shake, FCP, or Compressor would be handed out the same way. This would of course be an unconventional use, as I have no cluster in my basement. Only experts would know how to set it up and only experts would use it.
Those tour movies always give a lot of tantalizing details! I wonder how much of it requires AltiVec? Of course in schools, there are lots of licensed PMacs just sitting around.
"Licensed" doesn't really say licensed how. Copies of Shake, or registered Apple computers? We have no copies of Shake, nor do we intend to get any, I would bet. But if I bought my PowerBook (I'm planning to get one next year, right before I tell the government how much money I have) and had the urge to buy Shake and use it at school, would I be able to take advantage of those computers?
If this was moved to a consumer setup, I would be able to use PShop, and Compressor (new MPEG-2 coder) with no setup procedure except perhaps to enable it.
As for AltiVec, that probably has more to do with Shake than with the distribution. Using Shake on a G3 is prolly like a snail pulling a train. And the only G3 still available is the iBook, right?