I posted <a href="http://www.xsorbit.com/users/flamingo/index.cgi?board=Future_Hardware&action=display&num
=1000321459" target="_blank">a thread about this</a> over at BadFlamingo a while back, which links to <a href="http://www.techreview.com/magazine/oct01/tristram.asp
" target="_blank">this more informative article on techreview.com</a>. I'm not familiar enough with chip tech to offer a 14 page explanation, but I can offer an overview, which is based on the above-linked article.
In any processor, the time it takes to perform a given calculation is determined by the number of logic gates the signal has to pass through, the physical distance across the chip it has to travel, and the strength of the current driving it (one reason the x86 desktop chips consume so much power). Different operations take different amounts of time to resolve. Because of the nature of the technology, the result register holds a "value" while the calculation is in progress, but it fluctuates unpredictably; only when it stabilizes is the calculation finished.
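The latency factors above can be put into a toy back-of-the-envelope model. Everything here is illustrative - the gate and wire delay constants are made-up numbers, not real silicon figures:

```python
# Toy model of the latency factors described above: gate depth and wire
# distance both contribute to how long a result takes to settle.
# The delay constants are illustrative assumptions, not real silicon data.

GATE_DELAY_PS = 20        # assumed switching delay per logic gate (picoseconds)
WIRE_DELAY_PS_PER_MM = 7  # assumed propagation delay per mm of wire

def settle_time_ps(gate_depth, wire_length_mm):
    """Time until the result register stops fluctuating and holds the
    final answer: gates in the critical path plus wire travel time."""
    return gate_depth * GATE_DELAY_PS + wire_length_mm * WIRE_DELAY_PS_PER_MM

# A shallow operation settles much sooner than a deep one:
print(settle_time_ps(gate_depth=12, wire_length_mm=2))   # → 254 (fast op)
print(settle_time_ps(gate_depth=60, wire_length_mm=10))  # → 1270 (slow op)
```

The point is just that different operations have wildly different natural settling times, which is what the clocked-versus-clockless trade-off below is about.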
The traditional solution to determining when calculations are finished is to make the whole chip march to a clock. Every operation then has a fixed amount of time to settle on its final answer, and when that time is up, every part of the chip that depends on the answer can be certain that the value it reads is correct. Obviously, the clock has to be set slow enough to give the slowest operation on the chip time to complete. (There are workarounds: for instance, on the 7450 G4 the vector permute instruction takes two clock cycles.) The flip side is that a really fast operation finishes its calculation long before the next clock tick and then sits idle. "Clockless" chips are designed to eliminate that idleness.
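A quick sketch of that trade-off, using made-up latencies in picoseconds (the operation names and numbers are illustrative, not measurements of any real chip):

```python
import math

# Assumed settling times for a few operations, in picoseconds.
latencies = {"int_add": 250, "shift": 180, "vector_permute": 1270}

# Naive clocking: the slowest operation sets the period, so faster
# operations finish early and idle until the next tick.
period = max(latencies.values())
for op, t in latencies.items():
    print(f"{op}: busy {t} ps, idle {period - t} ps per cycle")

# The multi-cycle workaround mentioned above: pick a shorter period and
# let slow operations take several ticks (like the 7450's two-cycle permute).
period = 700
for op, t in latencies.items():
    cycles = math.ceil(t / period)
    print(f"{op}: {cycles} cycle(s), {cycles * period - t} ps wasted")
```

Even with the multi-cycle trick, every operation still wastes the gap between its own settling time and the next tick - that gap is what a clockless design tries to reclaim.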
The idea is dead simple: you let everything in the chip run at its natural speed. The implementation is tricky, because if you have, say, an instruction in the pipeline waiting on an integer add (fast) and a vector permute (slow), there's no clock to tell it when those values are readable - and if it reads a value too soon, it gets random garbage. So there has to be some way for an instruction to raise a flag or send a message saying "I'm done," and possibly also to say "wait, I'm not done." Once you've decided which scheme to implement, you have to make sure it's efficient enough that it doesn't squander the speed you've gained by freeing faster instructions from the tyranny of the clock.
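The "I'm done" flag scheme can be sketched with threads standing in for functional units - a loose analogy, with sleeps standing in for settling time and made-up operation names:

```python
import threading
import time

# Sketch of completion signaling: each operation raises an "I'm done" flag
# when its result is stable, and a dependent instruction waits on exactly
# the flags it needs instead of a global clock tick. Delays are illustrative.

def operation(name, delay_s, results, done):
    time.sleep(delay_s)        # simulate the settling time of the logic
    results[name] = 42         # result is now stable and safe to read
    done.set()                 # raise the "I'm done" flag

results = {}
add_done = threading.Event()
permute_done = threading.Event()

threading.Thread(target=operation, args=("int_add", 0.01, results, add_done)).start()
threading.Thread(target=operation, args=("vector_permute", 0.05, results, permute_done)).start()

# The consumer reads each value the moment its flag goes up:
add_done.wait()
print("int_add ready:", results["int_add"])       # available well before the permute
permute_done.wait()
print("vector_permute ready:", results["vector_permute"])
```

Note how the fast result becomes usable as soon as it settles - nothing waits for the slow operation unless it actually depends on it.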
If you're implementing a complex instruction that can be broken into many steps, you can also adjust the lengths of the connecting wires so that the timing works out naturally - one step never needs the result of another before it's ready. Doing this for millions upon millions of logic gates and wires is delicate work. And so, even though the concept of a clockless chip is simpler than that of a clocked one, the implementation is more difficult. This is why almost all chips are clocked.
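The delay-matching idea reduces to a simple invariant, sketched below. (In the asynchronous-design literature this style is called "bundled data": a control pulse travels on a deliberately slower matched path alongside the data wires, so its arrival itself means "the data is valid." The numbers are illustrative.)

```python
# Sketch of delay matching ("bundled data"): the receiver may read the data
# as soon as the control pulse arrives, provided the pulse can never outrun
# the data. Getting this margin right for every path is the delicate part.

def is_safe(data_worst_case_ps, matched_delay_ps):
    """True if the matched delay line is at least as slow as the data
    path's worst-case settling time, so the data is valid on arrival."""
    return matched_delay_ps >= data_worst_case_ps

print(is_safe(840, 900))  # True: pulse arrives after the data has settled
print(is_safe(840, 800))  # False: the receiver could latch random garbage
```

The check is trivial for one path; the engineering problem is that it has to hold for every path, across manufacturing variation and temperature, which is why this is delicate work at chip scale.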
The other advantage of clockless designs is that they consume only the power needed for the calculations they're currently performing. On a clocked chip, every part of the processor draws power on every tick whether or not it's doing useful work. So clockless designs are more efficient as well as faster.