Project Wolf/XGrid

Posted:
in macOS edited January 2014
With the long-anticipated release of the 970, along with its 64-bit goodness and 10.3, let's talk about computing grids, clusters, and distributed computing, and in particular the rumored Wolf project. The last I heard of it was several months ago, in a thread that was itself simply a rehash of information from allenmcjones (?).

It seems completely reasonable for Apple to take added advantage of the 970 by advertising it as a plug-and-play clustering computer, especially once it reaches the Xserves. What does anyone else anticipate with Wolf? I've spent some time getting familiar with the terms and technologies behind some clustering/grid systems, and I'll post later with some of that info. What do you expect?



I thought of this while I was posting: apparently Apple has XGrid trademarked.

Comments

  • Reply 1 of 18
    os10geekos10geek Posts: 413member
    I am seeing a Mac-Unix supercomputer, infinitely adaptive. Have one unit fail? No problem. Like a neuron in a human brain, it will be replaced without downtime to the rest of the system.

    (But neurons get replaced without the help of techies.)

    Imagine rendering a giant 3D file, and having a unit crash. The whole rendering becomes flawed. But with Xgrid, it would instantly detect that the unit is down, alert administrators, and delegate the work to another unit. Failproof.
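The detect, alert, and delegate behaviour described above could be sketched roughly like this. It is only a toy under stated assumptions: `render_with_failover`, `fake_render`, and the node names are made up for illustration, a crash is modelled as a `RuntimeError`, and nothing here reflects how Xgrid (which had not shipped when this was posted) would really do it.

```python
def render_with_failover(frames, units, render):
    """Render every frame; when a unit fails, record an alert and retry elsewhere."""
    alive = list(units)   # units still believed to be up
    done = {}             # frame -> result
    alerts = []           # messages an administrator would receive
    for frame in frames:
        while alive:
            unit = alive[0]
            try:
                done[frame] = render(unit, frame)
                alive.append(alive.pop(0))  # rotate so work is spread around
                break
            except RuntimeError:
                alerts.append(f"{unit} is down")
                alive.pop(0)                # stop sending work to the dead unit
        else:
            raise RuntimeError("no units left to render on")
    return done, alerts


def fake_render(unit, frame):
    # Hypothetical renderer: node2 always crashes.
    if unit == "node2":
        raise RuntimeError("crashed")
    return f"frame {frame} by {unit}"


done, alerts = render_with_failover(range(1, 7), ["node1", "node2", "node3"], fake_render)
```

The frame that landed on the crashed unit is simply retried on the next one, so the job as a whole still completes.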
  • Reply 2 of 18
    hmurchisonhmurchison Posts: 12,440member
    http://maccentral.macworld.com/news/0303/13.gridiron.php



    Interesting work on Grid Computing.



    os10geek: From what I've read, 3D doesn't lend itself to distributed computing unless you're talking about individual frames. But then your point stands.
  • Reply 3 of 18
    os10geekos10geek Posts: 413member
    Yes. I meant animating in 3D. But couldn't you divvy up a 3D scene into a "grid", and then have each node in a 10 node grid do 10% of the image?
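A minimal sketch of the "divvy the image into a grid" idea, assuming a plain rectangular split. `split_into_tiles` and the node names are hypothetical, and the 1920x1080 frame size is just an example.

```python
def split_into_tiles(width, height, cols, rows):
    """Return (x, y, w, h) rectangles that cover the image exactly once."""
    tiles = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            tiles.append((x0, y0, x1 - x0, y1 - y0))
    return tiles


# Ten nodes, ten tiles: a 5 x 2 grid over one frame.
tiles = split_into_tiles(1920, 1080, cols=5, rows=2)
assignment = {f"node{i}": tile for i, tile in enumerate(tiles)}
```

Each node gets 10% of the pixels, but not necessarily 10% of the work, since some tiles are usually far more expensive to render than others.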
  • Reply 4 of 18
    airslufairsluf Posts: 1,861member
  • Reply 5 of 18
    I was thinking more along the lines of a workplace supercomputer. An office of 200 Macs on a switched network (must have), all sharing processes. The graphic artist doing PShop work uses processes from the pampered exec's quad-970 while he's typing up his latest report in Document.



    I mean, your uses are great, but uncommon. Apple is consumer oriented, with a reach for Hollywood, and strong ties to publishing. Although, the information from "jones" stated that the team all had PhDs and were considered to have written the book on clustering. So I think your uses will be feasible.
  • Reply 6 of 18
    1337_5l4xx0r1337_5l4xx0r Posts: 1,558member
    macserverx,



    you do realize that the network overhead for 200 macs randomly clustered for various user tasks will be huge, right? I mean, even Gigabit ethernet would be slow. Talk about packet collision.
  • Reply 7 of 18
    paulpaul Posts: 5,278member
    Quote:

    Originally posted by 1337_5L4Xx0R:

    macserverx,

    you do realize that the network overhead for 200 macs randomly clustered for various user tasks will be huge, right? I mean, even Gigabit ethernet would be slow. Talk about packet collision.



    gigawire has still not been explained
  • Reply 8 of 18
    airslufairsluf Posts: 1,861member
  • Reply 9 of 18
    amorphamorph Posts: 7,112member
    AirSluf is right: If you distribute relatively large amounts of work (say, one frame of cinema-quality 3D animation) to various machines, you can keep network traffic low while reaping the benefits of having a lot of machines tackle the problem (because there are a lot of frames!).



    This illustrates the difference between distributed computing and clustering: In clustering, you basically have a dedicated high-speed network for the various nodes to exploit. With a distributed solution, the chunks of data are larger, and the cooperation is looser. An ad hoc, Rendezvous network would be best served by a distributed model, with clustering reserved for racks of Xserves and the like. And, of course, the two aren't mutually exclusive: a cluster of Xserves could be a node in a distributed network.
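To make the frame-per-machine idea concrete, here is a rough sketch that hands out whole frames round-robin. `assign_frames` and the machine names are invented for illustration; a real scheduler would presumably also weigh machine speed and current load.

```python
def assign_frames(first_frame, last_frame, machines):
    """Deal whole frames out to machines in turn; return machine -> frame list."""
    plan = {m: [] for m in machines}
    for i, frame in enumerate(range(first_frame, last_frame + 1)):
        plan[machines[i % len(machines)]].append(frame)
    return plan


# Ten seconds of 24 fps animation spread over three machines.
plan = assign_frames(1, 240, ["mac-a", "mac-b", "mac-c"])
```

Only the scene data and the finished frames cross the network, which is why the traffic stays low compared with machines cooperating on a single frame.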
  • Reply 10 of 18
    Quote:

    on a switched network (must have)



    In my own defense, I specifically stated that the network must be switched. A properly designed switched network will have zero, zilch, nada collisions. A switch creates a virtual circuit between the two nodes for the duration of the transmission. Now, if someone puts a hub somewhere, or we have AirPort, the computers on the hub and on AirPort will cause collisions and corruption, and a broadcast to clear the network of traffic. But in seconds it would be back to normal. I'm preparing for my CCNA (Cisco Certified Network Associate), so I have a good idea what I'm talking about.

    -----

    Now to my promised contribution. I expect to see a System Preference for Wolf/XGrid. By default, computers will be configured as both Managers and Workers. A Manager sends out processes to Workers for processing; a Worker processes tasks and sends them back to the Manager to use. Rendezvous is fully integrated into Wolf to provide dynamic resource discovery and feedback. There would be an encryption option, maybe 128-bit, for transferring processes across insecure or untrusted networks. The user could also specify the times during which a Worker can be used for Work.
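The Manager/Worker split described above might look something like this in miniature. It is a sketch only: threads on one machine stand in for Workers on the network, `manager_run` and `worker_process` are hypothetical names, and summing squares stands in for real work.

```python
from concurrent.futures import ThreadPoolExecutor


def worker_process(task):
    """Stand-in for a Worker: do the job and send the result back."""
    task_id, numbers = task
    return task_id, sum(n * n for n in numbers)


def manager_run(tasks, worker_count=4):
    """Stand-in for a Manager: hand tasks to Workers and collect the results."""
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        return dict(pool.map(worker_process, tasks))


# Eight independent tasks, each a slice of a larger job.
tasks = [(i, range(i * 100, (i + 1) * 100)) for i in range(8)]
results = manager_run(tasks)
```

Because every task carries its own id, the Manager can put results back together no matter which Worker finished first.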



    When a Worker becomes available to the network (either option just enabled or put on the network), Rendezvous sends out an advertisement, providing the configuration details. For computers set to be available only at certain times, Rendezvous will advertise every time the Worker becomes available, with configuration, start and end times. If the end time is changed during this period a new advertisement will be sent stating the original availability and the new Working times.
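As a sketch of what such an advertisement might carry, assuming it is just a small record with a daily availability window. `WorkerAdvertisement`, `revise_end`, and the host name are all made up, and none of this is the Rendezvous wire format.

```python
from dataclasses import dataclass, replace
from datetime import time


@dataclass(frozen=True)
class WorkerAdvertisement:
    host: str
    cpu_count: int
    start: time   # when the Worker becomes available
    end: time     # when it stops accepting Work

    def is_available(self, now: time) -> bool:
        if self.start <= self.end:
            return self.start <= now < self.end
        # Window crosses midnight, e.g. 18:00 to 08:00.
        return now >= self.start or now < self.end


def revise_end(ad, new_end):
    """Follow-up advertisement: the original window plus the new one."""
    return {"original": ad, "revised": replace(ad, end=new_end)}


ad = WorkerAdvertisement("lab-mac-07", 2, time(18, 0), time(8, 0))
update = revise_end(ad, time(6, 0))
```

Sending both the original and the revised window lets a Manager notice that work it scheduled for the tail end of the old window now needs a new home.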



    For new Managers (just brought online or option just enabled), an advertisement for Workers will be made. To prevent hundreds or thousands of computers from responding, the computer will choose an address range of maybe 16 addresses and these computers will send Worker information from their own tables up to a certain limit to prevent huge files from transferring.
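One possible reading of the 16-address idea, sketched with Python's standard `ipaddress` module. The choice of "the next 16 hosts after my own address" and the table limit of 50 are my assumptions, and `pick_peers` and `trimmed_table` are hypothetical helpers.

```python
import ipaddress


def pick_peers(network, own_address, count=16):
    """Pick up to `count` host addresses following our own, wrapping around."""
    hosts = list(ipaddress.ip_network(network).hosts())
    own = ipaddress.ip_address(own_address)
    start = hosts.index(own) + 1
    count = min(count, len(hosts) - 1)   # never ask ourselves
    return [hosts[(start + i) % len(hosts)] for i in range(count)]


def trimmed_table(worker_table, limit=50):
    """What a peer sends back: its known Workers, capped at `limit` entries."""
    return worker_table[:limit]


peers = pick_peers("192.168.1.0/24", "192.168.1.250")
reply = trimmed_table([f"worker-{n}" for n in range(200)])
```

Capping both the number of peers asked and the size of each reply keeps the startup traffic bounded however large the network is.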



    The Condor project (preparing for open-source release) has what they call Flocking. This would allow a computer to use a computer it cannot itself see via Rendezvous: basically other subnets, if I understand enough about the current implementation of Rendezvous. (According to the ZeroConf website, Apple's implementation is not complete and I don't know whether a full implementation will make non-local network discovery possible.) Other networks or subnets would be configured in the Preferences as well, in a table listing the available networks (pools) of computers. I'm thinking table because of passwords, etc. to access resources in that pool. I'll see if I can cook up a little ASStudio interface for it.



    The Echelon Grid provides a very good philosophy for grid management. They want user accessibility and manageability, automatic discovery of resources, and minimal user intervention: all Apple trademarks.



    This technology, along with NetBoot, will allow for the Tablet and absolutely dumb, IQ 10 terminals (aka a Tablet standing up, with mouse and keyboard).



    links from this post

    http://www.geocities.com/echelongrid/

    http://www.condorproject.org/ - redirected URL

    -----

    There are of course lots of logistical problems whose solutions I have not had time to thoroughly evaluate. But I believe I've added some fuel to the fire.
  • Reply 11 of 18
    curiousuburbcuriousuburb Posts: 3,325member
    Details on JPL's use of the Pooch clustering software to connect 33 Xserves and crank out 217 gigaflops (in Nov 2002) can be found here
  • Reply 12 of 18
    Well, Apple just released new stripped-down cluster Xserves. I think we will see a custom clustering solution from Apple that incorporates Rendezvous.
  • Reply 13 of 18
    macserverxmacserverx Posts: 217member
    QMaster

    http://www.apple.com/shake/specs.html

    Quote:

    Shake Qmaster



    * Network render management solution for Mac OS X, maximizes utility of existing resources, increasing ROI of Mac equipment

    * Distributed rendering allows provides uninterrupted workflow for artists by offloading processor intensive tasks to other computers

    * Qmaster can create multiple "clusters" of Apple G4 computers in order to create pools of processing power dedicated to specific jobs, artists or applications.

    * Fault-tolerant architecture ensures successful job completion and accurate results, even in the event of resource deallocation

    * Optimized usage of network resources through load-balancing algorithms ensures efficient operation and timely job completion.

    * Compatible with third party command line rendering applications running on Mac OS X.



    This explains the XServe Cluster nodes.

    What I think is significant here is the fact that it is compatible with third party renderers. This could make it possible to use existing SUN and Intel (Pixar) renderfarms until the budget gets a boost. And something that is important for maximizing processing power, Darwin-based rendering so you can boot your XServes using Darwin and X11, cutting out the overhead of Aqua and allowing remote administration through X11.app (or framework, since it should be built in and a System Preference).



    edit:

    I watched the Tour Movie on the rendering. According to it, the command line rendering applications means Maya will be able to take advantage of the distribution manager.

    And Rendezvous is used to discover licensed PowerMac G4s or XServes. No G3s or even iMac G4s are ever mentioned. But we do know Apple has the technology, now it just needs to be moved to a consumer market.
  • Reply 14 of 18
    macgregormacgregor Posts: 1,434member
    So macserverX, how does it work on a SUN system if it needs Mac OSX? or is that only for third party apps?
  • Reply 15 of 18
    This was kinda sprung on me, so this may not entirely fit together yet.



    How does distributed.net, SETI, or any of those distributed protein folding apps work? I didn't claim to be an expert, just that I had done my research. Also, look at my profile. High school student.



    Well, if you go on the model of how I assume d.net and SETI work, packets from Maya, Shake, FCP, or Compressor would be handed out to the other machines. This would of course be an unconventional use, as I have no cluster in my basement. Only experts would know how to set it up and only experts would use it.
  • Reply 16 of 18
    macgregormacgregor Posts: 1,434member
    macserverX: Sorry to be so inquisitive of you. I'm just glad someone else is interested in the subject.



    Quote:

    I watched the Tour Movie on the rendering. According to it, the command line rendering applications means Maya will be able to take advantage of the distribution manager. And Rendezvous is used to discover licensed PowerMac G4s or XServes. No G3s or even iMac G4s are ever mentioned. But we do know Apple has the technology, now it just needs to be moved to a consumer market.



    Those tutor movies always give a lot of tantalizing details! I wonder how much of it requires AltiVec? Of course in schools, there are lots of licensed PMacs just sitting around.
  • Reply 17 of 18
    Hence why it needs to be moved to consumer. My school has one Mac lab, with I'd guess about 18 Quicksilvers set up and 6 MDDs in their boxes waiting for new tables to arrive.



    "Licensed" doesn't really say how licensed. Copies of Shake or registered Apple computers? We have no copies of Shake, nor intend to get any I would bet, but if I bought my PowerBook (planning to get one next year right before I tell the government how much money I have) and had the urge to buy Shake and used it at school, would I be able to take advantage of those computers?



    If this was moved to a consumer setup, I would be able to use PShop, and Compressor (new MPEG-2 coder) with no setup procedure except perhaps to enable it.



    As for AltiVec, that probably has more to do with Shake than with the distribution. Using Shake on a G3 is prolly like a snail pulling a train. And the only G3 still available is the iBook, right?
  • Reply 18 of 18
    myahmacmyahmac Posts: 222member
    In regards to how it does renders, I believe it is frame based, like the After Effects Render Engine. With the Pro Bundle you can set up a final folder and a start folder. Then every engine watching the start folder will just pick the next frame in the sequence. Unfortunately, it is a way to use the full power of dual procs: set up a render, then run the engine on the other proc. That yields about 95% per CPU, vs. 65.
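The watch-folder approach could be sketched like this, assuming each pending frame is a small job file and an engine claims one by renaming it into its own folder. That mechanism is my assumption rather than how After Effects actually does it, and `claim_next_frame` and the file names are hypothetical.

```python
import os
import tempfile


def claim_next_frame(start_dir, engine_dir):
    """Move the lowest-numbered job file into engine_dir; return its path or None."""
    for name in sorted(os.listdir(start_dir)):
        src = os.path.join(start_dir, name)
        dst = os.path.join(engine_dir, name)
        try:
            os.rename(src, dst)   # fails if another engine took it first
        except FileNotFoundError:
            continue
        return dst
    return None


# Demo: one start folder, two engines, three frames waiting.
root = tempfile.mkdtemp()
start = os.path.join(root, "start")
engine_a = os.path.join(root, "engine_a")
engine_b = os.path.join(root, "engine_b")
for d in (start, engine_a, engine_b):
    os.makedirs(d)
for n in range(1, 4):
    open(os.path.join(start, f"frame_{n:04d}.job"), "w").close()

first = claim_next_frame(start, engine_a)
second = claim_next_frame(start, engine_b)
```

Because an engine only takes a new frame when it has finished the last one, faster machines end up doing more frames without any central scheduler.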